These days, most serious software projects include a Continuous Integration server that runs the tests. A problem appears when tests fail on the server but not locally. There can be differences in the operating system, the virtual machine, the configuration, etc. Let’s take the Jenkins server of Pharo as an example. We use this server to build and test not only the Pharo images but also the VMs. There are 3 slaves (one each for Windows, Linux and Mac OS X) and the tests run on all of them. Still, it is common to have test failures that we cannot reproduce locally. Why?
- Random failures: tests that fail randomly. Of course, we would prefer not to have such tests, but sometimes we do.
- Tests that fail as a side effect of other tests.
- The OS of the server, or even its configuration/infrastructure, is different.
- The virtual machine can be different (for example, Jenkins uses the VM it builds to test the other jobs).
What do we have now?
So, we have a failure on the server that we cannot reproduce locally. How can we understand what happened? So far, the only thing we have is a piece of a text-based stack trace. For example, let’s take this test failure:
    Error Message
    Assertion failed

    Stacktrace
    SocketStreamTest(TestCase)>>signalFailure:
    SocketStreamTest(TestCase)>>assert:
    SocketStreamTest(TestCase)>>should:raise:
    SocketStreamTest>>testUpToAfterCloseSignaling
    SocketStreamTest(TestCase)>>performTest
As you can see, this is not that helpful and you may still not know what happened. It would be really useful to know at least the values of the instance variables involved in that stack… Here is where Camillo Bruni had a nice idea 🙂
Fuelizing test failures
In Pharo, the stack of the running system is reified at the language level, so we can access it (and even modify it). The stack is made of instances of MethodContext, each holding an instVar ‘sender’ that refers to the next context in the stack (another MethodContext, or nil if it is the last one). Apart from ‘sender’, a context also holds the receiver, the method that caused its activation, the arguments and the temporary variables. The Fuel serializer can serialize any type of object, including MethodContext. If we can serialize a MethodContext (and closures and methods), we can serialize a stack, right? And what does this mean? Well, it means that we can serialize a debugger with its current state. I have already shown several times (at the ESUG Innovation Technology Award and at PharoConf) how we can use Fuel to serialize a debugger (from image X) in the middle of its execution, materialize it in image Y and continue debugging.
Pharo provides ‘exception’ objects and, in the end, test failures are exceptions (TestFailure). We can always ask an exception for its “signaler context”, in other words, the MethodContext that signaled it. Once we have that MethodContext, we have the whole stack (because that object has a sender, and the sender context has a sender, and so on). So, how do we serialize that?
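The sender chain described above can be walked directly from a workspace. This is only a minimal sketch: it starts at `thisContext` (the context of the code being evaluated) and follows `sender` until it reaches nil, collecting a printable description of each frame. The variable names `frames` and `ctx` are made up for the illustration.

```smalltalk
"Walk the current stack by following 'sender' until it is nil."
| frames ctx |
frames := OrderedCollection new.
ctx := thisContext.
[ ctx isNil ] whileFalse: [
	"printString shows the receiver class and the selector of the frame"
	frames add: ctx printString.
	ctx := ctx sender ].
frames do: [ :each | Transcript show: each; cr ].
```

Each line printed on the Transcript corresponds to one context, in the same order a text-based stack trace would list them.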
    context := testFailure signalerContext.
    FLSerializer newFull serialize: context toFileNamed: 'context.fuel'.

That piece of code will serialize the whole stack of contexts, including all the transitive closure: receiver, arguments, temporary variables, etc.
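To try this outside of a test run, one way is to catch an exception and serialize its signaler context inside the handler. The sketch below assumes the same Fuel calls used above (they may differ in other Fuel versions), uses a ZeroDivide as a stand-in for a real TestFailure, and writes to an arbitrary file name. When evaluated from a workspace the stack reaches into the UI, so the caveats discussed later in this post apply.

```smalltalk
"Stand-in for a failing test: force an exception and serialize its stack."
[ 1 / 0 ]
	on: ZeroDivide
	do: [ :ex |
		"signalerContext is the context that signaled the exception;
		 its senders are reachable from it, so the whole stack is written."
		FLSerializer newFull
			serialize: ex signalerContext
			toFileNamed: 'context.fuel' ].
```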
Reviving test failures
So we have serialized our test failure to a file. Now we want to revive it on our local machine. The first obvious step is to materialize the original stack from the file. But then, what do we do with the stack? How can we do something useful with it? Well, Pharo allows us to open a debugger on a particular stack 🙂. This means we can just open a debugger with the stack of the test failure! To do that:

    | aContext |
    aContext := FLMaterializer materializeFromFileNamed: 'context.fuel'.
    Debugger
        openContext: aContext
        label: 'This is the new debugger!'
        contents: nil
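A debugger is not the only way to look at the revived stack. As a rough sketch, the materialized contexts can also be walked programmatically, for example to print each frame together with its receiver. It assumes a file `context.fuel` produced as shown earlier; the variable `ctx` is just a name chosen for the example.

```smalltalk
"Print every frame of a materialized stack along with its receiver."
| aContext ctx |
aContext := FLMaterializer materializeFromFileNamed: 'context.fuel'.
ctx := aContext.
[ ctx isNil ] whileFalse: [
	Transcript
		show: ctx printString;
		show: ' receiver: ';
		show: ctx receiver printString;
		cr.
	ctx := ctx sender ].
```

From here, `ctx receiver inspect` on any frame of interest gives access to the instance variables that the text-based stack trace could not show.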
Caveats when serializing a stack
1. The object graph that you serialize and, therefore, the resulting stream size can be really large depending on what the contexts reference. Sometimes a context ends up reaching the UI, so you end up serializing lots of morphs, colors, forms, etc. If everything is fine, the file should be a few hundred or a few thousand KB. If the file size is in MB… then you may be serializing too much.
2. Not only can the graph be too big, it can also include objects that CHANGE while being serialized (mostly objects from the UI). This will cause Fuel to throw an error saying the graph has changed during serialization.
3. If problem 2 happens, then depending on where you trigger the Fuel serialization, you may end up in a loop. For example, say you want to serialize each error with Fuel, so you change SmalltalkImage>>logError:inContext: to write the context with Fuel. Now, if problem 2 happens and Fuel throws an error, you will try to log that error too, triggering the serialization again… an infinite loop.
4. Apart from the previous points, there are still more problems. See the section “Limitation and known problems” in this post.
There are a few ways to work around these problems:
- Deep copy the context before serializing it.
- If you want to serialize particular contexts (for example, particular domain exceptions), then you may know WHERE to hook to make some instVars transient and, therefore, avoid serializing things you don’t want and that may cause the second problem above (objects changing during serialization).
- Serialize only a PART of the stack.
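The logging loop described in the caveats can also be guarded against explicitly. The following is only a sketch of one possible approach, not something taken from the Jenkins code: a re-entrancy flag ensures that a serialization triggered while another one is still running is skipped, and any Fuel error is swallowed instead of being logged again. The names `serializing` and `safeSerialize` are hypothetical, and the Fuel calls are the same ones used earlier in this post.

```smalltalk
"Hypothetical guard: never start a serialization from within a serialization."
| serializing safeSerialize |
serializing := false.
safeSerialize := [ :context :fileName |
	serializing ifFalse: [
		serializing := true.
		[ [ FLSerializer newFull serialize: context toFileNamed: fileName ]
			on: Error
			do: [ :err | "give up on this context instead of logging again" ] ]
				ensure: [ serializing := false ] ] ].
```

It would be used as `safeSerialize value: someContext value: 'failure.fuel'`. In a real hook the flag would have to live somewhere longer-lived than a temporary variable, for example in a class variable.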
Thanks to Camillo and to Sean P. DeNigris, now Jenkins serializes (for some jobs) each test failure into a file (here you can see how to set up your own Jenkins for Pharo). For example, we have the job “pharo-2.0-tests”. If you select the OS and then a particular build number, you will have an artifact called “Pharo-2.0-AfterRunningTests.zip”. For example, this one: https://ci.lille.inria.fr/pharo/view/Pharo%202.0/job/pharo-2.0-tests/Architecture=32,OS=mac/lastSuccessfulBuild/artifact/Pharo-2.0-AfterRunningTests.zip. This zip contains all the .fuel files of all the test failures. Each file is named ClassXXX-testYYY.fuel.
To work around the problems mentioned in “Caveats when serializing a stack”, we serialize just a part of the stack: from the context that signals the failure up to the test method. Example:

    ...
    performTest
    "Start context slice"
    > testMyFeatureBla
    > ...
    > ...
    > assert: foo equals: bar
    "end context slice"
    assert:
    Exception signal
The idea is to serialize the fewest stack frames possible while still giving decent debug feedback. To do that, our Jenkins code (HDTestReport>>serializeError: error of: aTestCase) is:

    serializeError: error of: aTestCase
        "We got an error from a test, let's serialize it so we can properly debug it later on..."
        | context testCaseMethodContext |

        context := error signalerContext.
        testCaseMethodContext := context findContextSuchThat: [ :ctx |
            ctx receiver == aTestCase and: [ ctx methodSelector == #performTest ] ].
        context := context copyTo: testCaseMethodContext.

        [ FLSerializer newFull
            " use the sender context, generally the current context is not interesting"
            serialize: context sender
            toFileNamed: aTestCase class name asString, '-', aTestCase selector, '.fuel'.
        ] on: Error do: [ :err | "simply continue..." ]

    Gofer new
        url: 'http://ss3.gemstone.com/ss/CISupport';
        package: 'HudsonBuildTools20';
        load.