Headless support for Cog Cocoa VM

Hi guys.

As you may know, I finished my PhD in Computer Science in France and I am now back in my country, Argentina. I have started working as a freelancer/consultant/contractor/independent. If you are interested in discussing a project with me, please send me a private email.

For a long time, Pharoers and Squeakers have been asking for headless support in the Cocoa VMs, just as we have with the Carbon VMs. Carbon is becoming a legacy framework, so people needed this.

I want to take this opportunity to thank Square [i] International for sponsoring me to implement such support. Not only have they sponsored the development, but they have also agreed to release it under the MIT license for the community. This headless support will be included in the official Pharo VM and will, therefore, be accessible to everybody. You can read more details in the ANN email.

So… thanks Square [i] International for letting me work on something so much fun and so needed.


LZ4 binding for Pharo

Hi guys. In the last few days I wrote a Pharo binding for the LZ4 compressor (thanks to Camillo Bruni for pointing it out), and I wanted to share it. The main goal of LZ4 is to be really fast at compressing and decompressing, not to obtain the biggest compression ratio possible.

The main reason why I wrote this binding is for the Fuel serializer, with the idea of compressing/decompressing the serialization (a ByteArray) of a graph. Hopefully, for a little bit of overhead (compressing and decompressing), we gain a lot when writing to the stream (mostly with files and the network). However, the binding is not coupled with Fuel at all.

I have documented all the steps to install and run LZ4 in Pharo here.  Please, if you give it a try, let me know if it worked or if you had problems.
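For the record, the basic round trip looks something like the following sketch. Note that the class and selector names here (LZ4, #compress: and #uncompress:) are my shorthand and may not match the binding's actual API, so check the installation instructions for the real entry points:

```smalltalk
| data compressed restored |
"Build a highly repetitive 10000-byte input; LZ4 should shrink it a lot."
data := (String new: 10000 withAll: $a) asByteArray.
"Compress and decompress the ByteArray (hypothetical selectors)."
compressed := LZ4 compress: data.
restored := LZ4 uncompress: compressed.
"The round trip must be lossless, and the compressed form much smaller."
self assert: restored = data.
self assert: compressed size < data size.
```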

I would also like to do some more benchmarks with it, because so far I have only done a few. So if you have benchmarks to share with me, please do.

So far LZ4 does not provide a streaming-like API. Camillo and I tried to build a streaming API in Pharo (like ZLibWriteStream, GZipWriteStream, etc.) but the results were not good enough. So we are still analyzing this.

Ahhh yes, for the binding I used NativeBoost FFI, so I guess I will write a post soon explaining how to wrap a very simple library with NB.

See you,


Dr. Mariano Martinez Peck :)

Hi guys. Last Monday, the 29th of October, I did my PhD defense and everything went well (mention très honorable!), so I am now officially a doctor :) My presentation was 45 minutes long and I liked how it went. Have you ever wondered why I was involved in the Fuel serializer, Ghost proxies, VM hacking, Moose's DistributionMaps, databases, etc.? If so, you can see the slides of my presentation here. Notice that there are lots of slides; this is because I had several animations and each intermediate step is a new slide in the PDF.

After my presentation, the jury had time to ask me any questions they had and to give feedback. Lots of interesting questions and discussions came from there. After a private discussion among the members of the jury, the president read my defense report and we followed with a cocktail with drinks and snacks.

The presentation was recorded (thanks Santi and Anthony for taking care of that), but I am still processing it… I will let you know when it is ready.

The jury was composed of 8 people, 4 of whom were my supervisors:

Rapporteurs:
-Pr. Christophe Dony, Lirmm, Univ. Montpellier, France.
-Pr. Robert Hirschfeld, HPI, Potsdam, Germany.
Examinateurs:
-Dr. Jean-Bernard Stéfani, DR Equipe SARDES, INRIA Grenoble-Rhone-Alpes, France.
-Dr. Roel Wuyts, Principal Scientist at IMEC and Professor at the Catholic University of Leuven, Belgium.
Directeur:
-Dr. Stéphane Ducasse, DR Equipe RMod, INRIA Lille Nord Europe, France.
Co-Encadrants:
-Dr. Marcus Denker, CR Equipe RMod, INRIA Lille Nord Europe, France.
-Dr. Luc Fabresse, Ecole des Mines de Douai, Université de Lille Nord de France.
-Dr. Noury Bouraqadi, Ecole des Mines de Douai, Université de Lille Nord de France.

So, the PhD has reached its end. Now it is time to move to a different stage.

See you,

Dr. Mariano Martinez Peck :)


My PhD defense: “Application-Level Virtual Memory for Object-Oriented Systems”

Hi all. After 3 years of hard work, my “PhD journey” is coming to an end (which means, among other things, that it is now time to search for a job again hahaha). The defense will take place on Monday, October 29, 2012 at Mines de Douai, site Lahure, room “Espace Somme”, Douai, France.

After the defense there will be a kind of cocktail with some food and drinks. If you are reading this and you are interested, you are more than invited to come :) Just send me a private email for further details.

The following is the title and abstract of the thesis:

Application-Level Virtual Memory for Object-Oriented Systems

During the execution of object-oriented applications, several millions of objects are created, used and then collected if they are no longer referenced. Problems appear when objects are unused but cannot be garbage-collected because they are still referenced from other objects. This is an issue because those objects waste primary memory and applications use more primary memory than they actually need. We claim that relying on the operating system's (OS) virtual memory is not always enough, since it is completely transparent to applications. The OS cannot take into account the domain and structure of applications. At the same time, applications have no easy way to control or influence memory management.

In this dissertation, we present Marea, an efficient application-level virtual memory for object-oriented programming languages. Its main goal is to offer the programmer a novel solution to handle application-level memory. Developers can instruct our system to release primary memory by swapping out unused yet referenced objects to secondary memory.

Marea is designed to: 1) save as much memory as possible i.e., the memory used by its infrastructure is minimal compared to the amount of memory released by swapping out unused objects, 2) minimize the runtime overhead i.e., the swapping process is fast enough to avoid slowing down primary computations of applications, and 3) allow the programmer to control or influence the objects to swap.

Besides describing the model and the algorithms behind Marea, we also present our implementation in the Pharo programming language. Our approach has been qualitatively and quantitatively validated. Our experiments and benchmarks on real-world applications show that Marea can reduce the memory footprint by between 25% and 40%.


Halting when VM sends a particular message or on assertion failures

This post is mostly a reminder for myself, because each time I need to do it, I forget how it was :)

There are usually 2 cases where I want the VM to halt (breakpoint):

1) When a particular message is being processed.

2) When there is an assertion failure. CogVM has some kind of assertions (conditions) that, when they evaluate to false, mean something probably went wrong. When this happens, the condition is printed in the console together with the line number. For example, if we get this in the console:

(getfp() & STACK_ALIGN_MASK) == STACK_FP_ALIGN_BYTES 41946

It means that the condition evaluated to false, and 41946 is the line number. Great. But how can I put a breakpoint there so that the VM halts?

So… what can we do with CogVM? Of course, we first need to build the VM in “debug mode”. Here you can see how to build the VM, and here how to do it in debug mode. Then we can do something like this (taken from one of Eliot's emails):

McStalker.macbuild$ gdb Debug.app
GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ................ done
(gdb) break warning
Breakpoint 1 at 0x105e2b: file /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c, line 39.
(gdb) run -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image
Starting program: /Users/eliot/Cog/oscogvm/macbuild/Debug.app/Contents/MacOS/Croquet -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image
Reading symbols for shared libraries .+++++++++++++++..................................................................................... done
Reading symbols for shared libraries . done

Breakpoint 1, warning (s=0x16487c "send breakpoint (heartbeat suppressed)") at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:39
39              printf("\n%s\n", s);
(gdb) where 5
#0  warning (s=0x16487c "send breakpoint (heartbeat suppressed)") at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:39
#1  0x0010b490 in interpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:4747
#2  0x0011d521 in enterSmalltalkExecutiveImplementation () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:14103
#3  0x00124bc7 in initStackPagesAndInterpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:17731
#4  0x00105ec9 in interpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:1933
(More stack frames follow...)

So the magic line here is “(gdb) break warning”, which puts a breakpoint in the warning() function. Automagically, the assertion failures end up calling this function, and therefore the VM halts there. With this line we achieve 2).

To achieve 1), the key line is “Starting program: /Users/eliot/Cog/oscogvm/macbuild/Debug.app/Contents/MacOS/Croquet -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image”. With “-breaksel” we can pass a selector as a parameter (#initialize in this case). So each time the message #initialize is sent, the VM will also halt in the warning function, so you have the whole stack there to analyze whatever you want.

I am not 100% sure but the following is what I understood about how it works from Eliot:

So, if I understood correctly, I can put a breakpoint in the function warning() with “break warning”. With the -breaksel parameter you set an instVar with the selector name and size. Then, anywhere in the VM, I can send #compilationBreak: selectorOop point: selectorLength, and that will magically check whether the selectorOop is the one I passed with -breaksel; if so, it will call warning(), which has a breakpoint, hence I can debug :) AWESOME!!!! Now with CMake I can even generate an Xcode project and debug it :)

That was all. Maybe this was helpful for someone else too.


“New” Tanker – current status

Both ESUG 2012 and the GSoC have finished. At ESUG I gave a presentation about Fuel/Tanker (sorry, no slides because it was all demo). I also presented Tanker at the ESUG Awards… we got 4th place, only 8 points behind 3rd place :) Anyway… I wanted to make public what exactly we finally managed to do with Tanker, a package exporter and importer that uses the Fuel serializer. As I said in a previous post, we have changed a lot of how Tanker works internally. In that post I also mentioned that Tanker didn't support source code export/import, but this new version does support method source code, class comments, timestamps and everything that is currently being stored in the .changes or .sources file.

Exporting packages

The following figure shows how the export works. The input is a package of code (classes + extension methods), in this example “MyPackage”. The first step is to traverse the classes and methods of the package and do 2 things: a) write the source code (method sources, class comments, timestamps, etc.) to a text-based chunk-format sources file (MyPackage.tankst); b) create “definition” objects. These objects represent a kind of object model that stores all the necessary information about the code entities to be able to recreate them in another image. So we have, for example, TAClass, TATrait, TAMethod, TABinding, TAAdditionalMethodState, TATraitAlias, TATraitComposition, etc.

Something key here is that, besides all that information, we also store the relative position of the source code of each entity in “MyPackage.tankst”. So, for example, TAMethod instances will have a number representing that relative offset (similar to the pointer that CompiledMethods have in their trailer to .sources/.changes) where their source code is.

Another important detail is that in TAMethod we are storing the bytecodes, because even if we have the source code we want to avoid needing to compile during the import.

Once we have finished writing the sources file and creating the definitions, we just serialize them into a Fuel binary, say “MyPackage.tank”.

Importing packages

The following diagram shows how the import works. The input is the sources file (MyPackage.tankst) and the Fuel binary file (MyPackage.tank). The first step is basically to read the sources file and append all its contents to the .changes file (so that we get the same behavior as if we were installing a package with Monticello). But before doing that, we temporarily store the current “end” position of the .changes file ;) (have you already guessed why?).

The second step is to materialize the definitions using Fuel. Something very nice is that the serialized definitions also understand the necessary messages to install themselves in the system (#installUsing: aSourcesFile environment: anEnvironment). So we basically tell each class and each extension method to get installed in the system. Those objects will delegate to the rest of the definitions to complete the task (all definitions know what to do when being installed in the system).

Classes and traits will use the class builder to recreate themselves. The source code was already installed in the .changes file, so now we need to fix the “pointers” from our newly created classes/methods so that they point to the correct place of their source in the .changes file. And this is very easy, because their position is just the “end” position of the .changes before installing our package (that value I told you we were temporarily storing) plus the relative position that was stored in the definition itself :)
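In code, that fix-up is just an addition. Here is a sketch with made-up variable and accessor names (this is not Tanker's actual implementation, just the arithmetic it performs):

```smalltalk
"changesEndBeforeInstall was captured before appending MyPackage.tankst
 to the .changes file; relativeSourceOffset was stored in the TAMethod definition."
absolutePosition := changesEndBeforeInstall + aMethodDefinition relativeSourceOffset.
"Point the recreated CompiledMethod's trailer at its source (hypothetical setter)."
aCompiledMethod setSourcePosition: absolutePosition.
```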

So what is the biggest difference with Monticello, for example? The key point is that we export bytecodes and thus avoid having to compile during import. This makes things faster while also allowing us to install packages without a compiler (very useful for bootstrapping, for example).

How to install it and use it

Tanker only works in Pharo 2.0 (because we rely on the new class builder and layouts), so you first need to grab an image. Then, you can evaluate:

Gofer it
	smalltalkhubUser: 'marianopeck' project: 'Tanker';
	package: 'ConfigurationOfTanker';
	load.
(Smalltalk at: #ConfigurationOfTanker) perform: #loadDevelopment.

Yes, there isn't a stable version yet; we are waiting for the new class builder. Be careful, because the above will install the new class builder (and change the system to use it) and it may still have some bugs… so take care :)
The export and import look like:

aPackage := TAPackage
	behaviors: {MyClass. MySecondClass}
	extensionMethods: {String>>#myExtensionMethod1. Object>>#myExtensionMethod2}.
(TAExport
 package: aPackage
 binariesOn: aBinaryWriteStream
 sourcesOn: aTextSourcesWriteStream)
 run.

(TAImport
 binariesFrom: aBinaryReadStream
 sourcesFrom: aSourcesReadStream)
 run.

However, we do not expect the final user to provide the full list of classes and extension methods. Therefore, we have helper methods to export and import RPackages and PackageInfos. Example:

TAExport exportRPackageNamed: 'MyProjectCore'.
TAExport exportRPackageNamed: 'MyProjectTests'.

TAImport importPackageNamed: 'MyProjectCore'.
TAImport importPackageNamed: 'MyProjectTests'.

Conclusion

So that was all for today. Probably I will do another blog post where I show how we can query Metacello to know which packages to export and in which order, some details about the new class builder, some benchmarks from exporting all of Seaside/Pier/Magritte (10 seconds to export and 20 to import), and so on. The conclusion we drew from this project is that Fuel can indeed be successfully used in yet another completely different domain. If you want to help us, please test it with your own packages. Right now we have only one open issue (the need to recompile if superclasses present in the image have been reshaped). I guess that soon we will fix this last issue and release the first stable version. In the future, we plan to analyze how we can really integrate Tanker with Monticello/Metacello.


ESUG 2012: when the time to pay for the beers has arrived!

Hi guys. In a blog post of last year, I said:

“What is really great is to meet people. Have you ever sent 1 million emails to someone without even knowing his face?. Is he 70 years old? 20 ? What language does he speak?  Well, ESUG is the perfect conference to meet people by real, face to face. The best part of ESUG happens in the “corridors”, I mean, talking with people between talks, after the conference, in the social event, etc. There will be people who will ask you about your stuff, they will give you feedback and ideas. You will find yourself wanting to give feedback to others. It is a nice circle.”

This year, we have yet another motivation. How many times have you promised someone a beer if they did XXX? How many times have people promised you a beer for doing YYY? What better place than Belgium to balance the accounts?

I have already done my homework. I owe beers to Alain Plantec and Henrik Sperre Johansen.

Anyway guys, hope to see you there!


Tanker screencast and image ready for testing

Hi guys. In the last few days, we submitted Tanker to the ESUG Innovation Technology Awards. As part of that submission, I have created a screencast and an image with the examples.

The screencast gives you an introduction to Tanker and shows you how to export and import packages. It starts with a simple example, then exports a real library, and ends up exporting all of Seaside, Magritte and Pier :) Ahhh yes, sorry for my voice, I know it is ugly hahaha. You can also watch it here:

You can also get the image I used for the screencast, with which you can try things out and experiment.

Have fun,


Tanker: transporting packages with Fuel

Hi all. You may have noticed that Tanker is starting to appear in some mails and in the Pharo issue tracker. Tanker is a project that Martin and I have been developing for a while, and we are going to submit it this year to the ESUG Innovation Technology Award. Therefore, I thought it would be interesting to explain what it is, its current status, its goals, etc.

What is “Tanker” and what was “FuelPackageLoader”?

Right now the common way to export and import packages in Pharo is by using Monticello (or doing a fileOut, which is almost the same). This ends up exporting the source code and then compiling it during the import. Tanker is a tool to export and import packages of code in a binary way using the Fuel serializer. Using Fuel enables us to avoid having to compile from sources during the import. Tanker understands the concept of “packages of code” and how to correctly integrate them into the system. For example, it initializes classes, sends notifications, etc.

Tanker was first a prototype called “FuelPackageLoader”, which was what I used for the example of exporting and importing Seaside packages. In the last months, we have renamed the project to “Tanker”. Why? Because we do not want people to think that it is a Fuel project. In fact, Tanker is simply a USER of Fuel, just like any other code that uses Fuel. This is why we have also moved it to its own repository.

Fuel has a package called “FuelMetalevel”. This package gives Fuel the knowledge of how to correctly serialize and materialize classes, metaclasses, traits, method dictionaries, compiled methods and closures: in other words, all the entities related to code and runtime infrastructure. It only knows how to serialize and materialize them correctly. Nothing else. It does not initialize classes, it does not notify the system about materialized classes, it does not install classes in Smalltalk globals, etc.

Current features, design and missing things

Right now, Tanker provides the following features:

  • It is able to export a package to a .tank file and import it in another image. The input for the export is a TAPackage which basically contains a list of classes and a list of extension methods. We are completely decoupled from the “package representation” (PackageInfo, RPackage, MCPackage, etc). However, we provide an API if you want to directly export from those types of packages.
  • Classes are initialized and installed in Smalltalk globals, events are sent, etc.
  • It has the ability to add additional user-defined objects to the package being exported (this is used, for example, for the Pharo generation from a kernel to store large/heavy class variables, tables and fonts).
  • It supports pre and post load actions represented as closures.

From the design point of view, Tanker:

  • Fully serializes classes and traits (not their “definitions”)
  • Does not use the ClassBuilder during materialization. Tanker itself materializes the “class objects” and sets the data.

So far, we are missing:

  • The possibility to export source code (right now classes and methods do not have source code) and also to install it in the .changes file during import.
  • Some validations during import. For example, the superclass of a class being installed may have changed its shape; therefore, the classes to install need recompilation (because the offsets in the bytecodes accessing instVars may be shifted or wrong). Or, if a class already exists in the image and its shape has changed, we need to update the existing instances.
  • Integration with other tools like Monticello and Metacello.

Results with the “current status”

There are so far 3 real examples of Tanker:

How to install it and use it

Tanker will only work in the bleeding edge of Pharo 2.0, so I first recommend getting an image from Jenkins. Then, you can install Tanker this way:

Gofer it
	url: 'http://smalltalkhub.com/mc/marianopeck/Tanker/main';
	package: 'ConfigurationOfTanker';
	load.
(Smalltalk at: #ConfigurationOfTanker) load.

To export a package, providing the classes and extension methods yourself, you can:

| aPackage aStream |
"Export"
aStream := 'demo.tank' asFileReference writeStream binary.
aPackage := TAPackage behaviors: {TestCase. TestLocalVariable. } extensionMethods: #().
TAPackageStore new storePackage: aPackage on: aStream.

"Import"
aStream := 'demo.tank' asFileReference readStream binary.
TAPackageLoader new loadFrom: aStream.

Then you can also use the API that provides helper methods to RPackage and PackageInfo:

aPackage := TAPackage fromPackageInfoNamed: 'MyPackage'

You also have #fromPackagesInfoNames:, #fromRPackageNamed: and #fromRPackagesNames:. Of course, there are more use cases, API and scenarios, but so far that is the simplest usage. For more examples, browse the class-side methods of TankerExamples.

GSoC and “new status”

The results so far are quite promising and no longer a “proof of concept”. However, we still need to support source code management as well as the already mentioned pending features. For this reason, Martin submitted Tanker to the GSoC and, fortunately, it was accepted. So right now we are moving to a different design to satisfy these requirements. The idea now is NOT to serialize classes and traits, but instead to serialize their “definitions”. Think of a “definition” as the string used to create the entity. Then, during import, instead of just materializing class objects, we take the definitions and, using the ClassBuilder or similar, we “evaluate” them to get the new classes.

At the same time, the idea is to export the source code of a package in a file (myPackage.tank.st or something like that) and the binary representation in another file (say myPackage.tank). Then, during import, you should be able to import with or without sources.

Side-effect projects of Tanker

You may be wondering why we didn't start from the very beginning with the “definitions” approach. Well, to be honest, the ClassBuilder is a mess: difficult to understand, maintain and extend. It was really hard trying to use it for our purpose. So the first “side-effect project” of Tanker is to continue pushing the “new ClassBuilder” started by Toon Verwaest, based on “slots”. Martin Dias, Guillermo Polito and Camillo Bruni are pushing it and writing tests. I think it could soon be integrated in Pharo and replace the old one. The idea is that Tanker will use this ClassBuilder, for example, to evaluate the definitions.

When we are importing a class, it may happen that the superclass (present in the image where we are importing) has changed its shape (added or removed instVars, changed superclass, etc.). If this is true, we have to recompile, because the bytecodes accessing instVars will have shifted offsets. However, recompiling is slow and we don't want that. Therefore, Tanker will use IR (intermediate representation), which was developed by Marcus Denker and the team working on the new Opal compiler. IR is just a nice model generated from a CompiledMethod. The idea is that we can generate the IR, modify it (the bytecodes for instVar access, for example) using this nice abstraction and API, and then generate a new CompiledMethod. This is way faster than recompiling. Furthermore, IR is decoupled from Opal, so we don't need the whole of Opal.

Conclusion

Tanker started as an experiment to see whether Fuel could be used to export and import packages in a binary way. The proof of concept was quite good, so we are now going forward with source code management and related features. It is important to notice that Tanker just “uses” Fuel; Fuel is completely decoupled from Tanker. We think Fuel was well received by the community, and we are doing our best so that Tanker gets positive feedback as well.


Reviving CI test failures in local machine

The problem

These days, most serious software developments include a Continuous Integration server which runs tests. A problem appears when tests fail on the server but do not fail locally. There can be differences in the operating system used, the virtual machine, the configuration, etc. Let's take as an example the Jenkins server of Pharo. We use such a server not only to build and test the Pharo images but also the VMs. There are 3 slaves (one each for Windows, Linux and Mac OS X) and tests are run on all of them. Still, it is common to have test failures that we cannot reproduce locally. Why?

  • Random failures: tests that fail randomly. Of course, we would prefer not to have these tests, but sometimes we do.
  • Tests that fail as a side effect of other tests.
  • The OS of the server, or even its configuration/infrastructure, is different.
  • The virtual machine used can be different (for example, Jenkins uses the VM it builds to test the other jobs).

What do we have now?

So, we have a failure on the server that we cannot reproduce locally. How can we understand what happened? So far, the only thing we have is a piece of a text-based stack trace. For example, let's take this test failure:

Error Message
Assertion failed
Stacktrace
SocketStreamTest(TestCase)>>signalFailure:
SocketStreamTest(TestCase)>>assert:
SocketStreamTest(TestCase)>>should:raise:
SocketStreamTest>>testUpToAfterCloseSignaling
SocketStreamTest(TestCase)>>performTest

As you can see, this is not that helpful and you may still not know what has happened. Something really useful would be to at least know the values of the instance variables involved in that stack… Here is where Camillo Bruni had a nice idea :)

Fuelizing test failures

In Pharo, the stack of the running system is also reified on the language side and we can access it! (We can even modify it.) We have instances of MethodContext which hold an instVar ‘sender’ that refers to the next sender in the stack (another MethodContext, or nil if it is the last one). Apart from ‘sender’, a context also includes the receiver, the method that caused its activation, the arguments and the temporary variables. The Fuel serializer can serialize any type of object, including MethodContext. If we can serialize a MethodContext (and closures and methods), we can serialize a stack, right? And what does this mean? Well, it means that we can serialize a debugger with its current state. I have already shown several times (at the ESUG Innovation Technology Award and at PharoConf) how we can use Fuel to serialize a debugger (from image X) in the middle of its execution, materialize it in image Y and continue debugging.

Pharo provides ‘exception’ objects and, in the end, test failures are exceptions (TestFailure). We can always ask an exception for its “signaler context”, in other words, the MethodContext that signaled it. Once we have that MethodContext, we have the whole stack (because that object has a sender, and the sender context has a sender, and…). So, how do we serialize that?

context := testFailure signalerContext.
FLSerializer newFull
	serialize: context
	toFileNamed: 'context.fuel'.

So that piece of code will serialize the whole stack of contexts, including the transitive closure: receivers, arguments, temporary variables, etc.

Reviving test failures

So we have serialized our test failure to a file. Now we want to revive it on our local machine. The first obvious step is to materialize the original stack from the file. But then, what do we do with the stack? How can we do something useful with it? Well, Pharo allows us to open a debugger on a particular stack :). This means we can just open a debugger on the stack of the test failure! To do that:

| aContext |
aContext := FLMaterializer materializeFromFileNamed: 'context.fuel'.
Debugger
	openContext: aContext
	label: 'This is the new debugger!'
	contents: nil

And that opens our nice debugger. Much better than a text-based stack trace, isn’t it?

Caveats when serializing a stack

When you serialize the whole stack, you may find some problems:
  1. The object graph that you serialize and, therefore, the resulting stream size can be really large depending on what the contexts reference. Sometimes a context ends up in the UI, so you end up serializing lots of morphs, colors, forms, etc. If everything is fine, the file should be a couple of hundred or thousand KB. If the file size is in MB… then you may be serializing too much.
  2. Not only is the graph too big, but it also incorporates objects that CHANGE while being serialized (mostly when these are objects from the UI). This will cause Fuel to throw an error saying the graph has changed during serialization.
  3. If 2) happens, then depending on where you trigger the Fuel serialization, you may end up in a loop. For example, say you want to serialize each error with Fuel, so you change SmalltalkImage>>logError:inContext: to write the context with Fuel. Now, if 2) happens and Fuel throws an error, you will try to log that again, triggering the serialization again… an infinite loop.
  4. Apart from the previous points, there are still more problems. You can read the section “Limitation and known problems” in this post.

So… some workarounds are (still, not sure if they will help in all cases):
  • Deep-copy the context before serializing it.
  • If you want to serialize particular contexts (for example, particular domain exceptions), then you may know WHERE to hook in to make some instVars transient and, therefore, avoid serializing things you don't want and that may cause 2).
  • Serialize only PART of the stack.
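For the second workaround, Fuel offers hooks to make instVars transient. If I remember correctly, there is a class-side hook along the lines of the sketch below, but check Fuel's documentation for the exact selector in your version; the class and instVar names here are illustrative:

```smalltalk
"Class-side method on your domain exception (or any class in the graph).
 Fuel will skip the named instVars when serializing instances, so the
 UI-dependent cache never enters the graph."
MyDomainError class >> fuelIgnoredInstanceVariableNames
	^ #('uiCache')
```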

Jenkins integration

Thanks to Camillo and to Sean P. DeNigris, Jenkins now serializes (for some jobs) each test failure into a file (here you can see how to set up your own Jenkins for Pharo). For example, we have the job “pharo-2.0-tests”. If you select the OS and then a particular build number, you will have an artifact called “Pharo-2.0-AfterRunningTests.zip”. For example, this one: https://ci.lille.inria.fr/pharo/view/Pharo%202.0/job/pharo-2.0-tests/Architecture=32,OS=mac/lastSuccessfulBuild/artifact/Pharo-2.0-AfterRunningTests.zip. This zip contains all the .fuel files of all the test failures. Each file is named ClassXXX-testYYY.fuel.

To work around the problems mentioned in the previous section (“Caveats when serializing a stack”), we serialize only a part of the stack: from the context that signals the failure up to the test method. Example:

  ...
  performTest
"Start context slice"
> testMyFeatureBla
> ...
> ...
> assert: foo equals: bar
"end context slice"
  assert:
  Exception signal

The idea is to serialize the smallest number of stack frames possible while still giving decent debugging feedback. To do that, our Jenkins code (HDTestReport>>serializeError:of:) is:

serializeError: error of: aTestCase
	"We got an error from a test, let's serialize it so we can properly debug it later on..."
	| context testCaseMethodContext |
	context := error signalerContext.
	testCaseMethodContext := context findContextSuchThat: [ :ctx |
		ctx receiver == aTestCase and: [ ctx methodSelector == #performTest ] ].
	context := context copyTo: testCaseMethodContext.
	[ FLSerializer newFull
		"use the sender context; generally the current context is not interesting"
		serialize: context sender
		toFileNamed: aTestCase class name asString , '-' , aTestCase selector , '.fuel' ]
			on: Error
			do: [ :err | "simply continue..." ]
During serialization, the graph can somehow reach classes of the Jenkins code (like HDTestReport). If you materialize in an image where such a class is not present, you will get a Fuel error. For this reason, besides the .fuel files, the same Pharo-2.0-AfterRunningTests.zip also contains a Pharo-2.0-AfterRunningTests.image which, as its name says, was saved after having run all the tests (meaning it has the Jenkins code installed). This means we can directly use that image to materialize and it will work. The other option is to take another image and install the following before materializing:

Gofer new
	url: 'http://ss3.gemstone.com/ss/CISupport';
	package: 'HudsonBuildTools20';
	load.

This is temporary, because the Jenkins support code will soon be directly integrated into Pharo. Anyway, I recommend using the same version of the image that was used during serialization, so I think that directly using Pharo-2.0-AfterRunningTests.image is more reliable.

Conclusion

It is clear that there are several caveats. However, I do believe this is yet another step forward in CI and development. It is just one more tool you have at hand when something is failing on the server and you cannot reproduce it locally. In the worst case, it won't help, but it won't hurt either. If you are lucky, you may find the cause :) It is incredible all the things you can do when the stack is reified and visible from the language while also being serializable. For me, asking for a text-based stack trace in Smalltalk is like going to a cabaret and asking for a hug. We have so much power that we should take advantage of it. In the end, using a debugger is way better. Anyway, I do not recommend removing the stack trace information, just also adding the Fuel possibility.
