
Halting when VM sends a particular message or on assertion failures

This post is mostly a reminder for myself, because each time I need to do it, I forget how it was done 🙂

There are usually two cases where I want the VM to halt (hit a breakpoint):

1) When a particular message is being processed.

2) When there is an assertion failure. CogVM has some kind of assertions (conditions) that when evaluated to false, mean something probably went wrong. When this happens, the condition is printed in the console and the line number is shown. For example, if we get this in the console:

(getfp() & STACK_ALIGN_MASK) == STACK_FP_ALIGN_BYTES 41946

It means that the condition evaluated to false. And 41946 is the line number. Great. But how can I put a breakpoint here so that the VM halts?

So… what can we do with CogVM? Of course, we first need to build the VM in “debug mode”. Here you can see how to build the VM, and here how to do it in debug mode. Then we can do something like this (taken from an email from Eliot):

McStalker.macbuild$ gdb Debug.app
GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ................ done
(gdb) break warning
Breakpoint 1 at 0x105e2b: file /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c, line 39.
(gdb) run -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image
Starting program: /Users/eliot/Cog/oscogvm/macbuild/Debug.app/Contents/MacOS/Croquet -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image
Reading symbols for shared libraries .+++++++++++++++..................................................................................... done
Reading symbols for shared libraries . done

Breakpoint 1, warning (s=0x16487c "send breakpoint (heartbeat suppressed)") at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:39
39              printf("\n%s\n", s);
(gdb) where 5
#0  warning (s=0x16487c "send breakpoint (heartbeat suppressed)") at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:39
#1  0x0010b490 in interpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:4747
#2  0x0011d521 in enterSmalltalkExecutiveImplementation () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:14103
#3  0x00124bc7 in initStackPagesAndInterpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:17731
#4  0x00105ec9 in interpret () at /Users/eliot/Cog/oscogvm/macbuild/../src/vm/gcc3x-cointerp.c:1933
(More stack frames follow...)

So the magic line here is “(gdb) break warning”, which puts a breakpoint in the warning() function. Automagically, assertion failures end up calling this function, and therefore the VM halts. With this line we achieve 2).
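
To make the mechanism concrete, here is a minimal sketch of how an assertion can be routed through a single warning() function. Only the name warning() and its printf line come from the gdb session above; the macro name SKETCH_ASSERT, the helper macros and the warning_count counter are made up for illustration and are not claimed to be the VM's actual code.

```c
#include <stdio.h>

/* Hypothetical counter, only here so the effect is easy to observe. */
static int warning_count = 0;

/* Single choke point for diagnostics: "break warning" in gdb stops here. */
void warning(const char *s)
{
    warning_count++;
    printf("\n%s\n", s);
}

/* Two-step stringification so that __LINE__ expands to its number. */
#define SKETCH_STR2(x) #x
#define SKETCH_STR(x)  SKETCH_STR2(x)

/* Hypothetical assertion macro: when the condition is false, report the
 * condition text and the line number through warning(), then carry on
 * instead of aborting. Evaluates to 1 on success and 0 on failure. */
#define SKETCH_ASSERT(cond) \
    ((cond) ? 1 : (warning(#cond " " SKETCH_STR(__LINE__)), 0))
```

Because every failed SKETCH_ASSERT goes through warning(), one breakpoint on that function catches all of them, and the printed string has the same “condition, then line number” shape as the console output shown earlier.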

To achieve 1), the key line is “(gdb) run -breaksel initialize ~/Squeak/Squeak4.2/trunk4.2.image”. Here, with “-breaksel”, we pass a selector as a parameter (#initialize in this case). Each time the message #initialize is sent, the VM also halts in the warning function, so you have the whole stack there to analyze whatever you want.

I am not 100% sure but the following is what I understood about how it works from Eliot:

So, if I understood correctly, I can put a breakpoint in the function warning() with “break warning”. With the -breaksel parameter you set an instVar with the selector name and size. After that, I can send #compilationBreak: selectorOop point: selectorLength anywhere, and it will magically check whether selectorOop is the one I passed with -breaksel; if so, it will call warning, which has a breakpoint, hence I can debug 🙂 AWESOME!!!! Now with CMake I can even generate an Xcode project and debug it 🙂
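
As a rough sketch of that description (and nothing more than that), the check could look like the C below. All names here (set_break_selector, compilation_break, the two static variables) are hypothetical; only warning() and the idea of comparing a selector and its length come from the explanation above.

```c
#include <stdio.h>
#include <string.h>

static int warning_count = 0;

/* Same role as in the gdb session: the function we put the breakpoint on. */
void warning(const char *s)
{
    warning_count++;
    printf("\n%s\n", s);
}

/* Selector to break on, as it might be recorded from a -breaksel argument. */
static const char *break_selector = NULL;
static size_t break_selector_length = 0;

/* Hypothetical setter: remember the selector name and its size. */
void set_break_selector(const char *selector)
{
    break_selector = selector;
    break_selector_length = (selector != NULL) ? strlen(selector) : 0;
}

/* Hypothetical check performed when a message is sent. The selector is
 * passed with an explicit length, so it need not be NUL-terminated. */
void compilation_break(const char *selector, size_t length)
{
    if (break_selector != NULL
        && length == break_selector_length
        && memcmp(selector, break_selector, length) == 0) {
        warning("send breakpoint");
    }
}
```

With a breakpoint on warning(), calling set_break_selector("initialize") once and then compilation_break(...) on every send would stop the debugger only for the matching selector.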

That was all. Maybe this was helpful for someone else too.


Reviving CI test failures in local machine

The problem

These days, most serious software projects include a Continuous Integration server that runs tests. A problem appears when tests fail on the server but not locally. There can be differences in the operating system, virtual machine, configuration, etc. Let’s take as an example the Jenkins server of Pharo. We use this server not only to build and test the Pharo images but also the VMs. There are 3 slaves (one each for Windows, Linux and Mac OS X) and tests are run on all of them. Still, it is common to have test failures that we cannot reproduce locally. Why?

  • Random failures: tests that fail randomly. Of course, we would prefer not having these tests but sometimes we do.
  • Tests that fail as a side effect of other tests.
  • The OS of the server or even its configuration/infrastructure is different.
  • The used virtual machine can be different (for example, Jenkins uses the VM it builds to test the other jobs).

What do we have now?

So, we have a failure in the server that we cannot reproduce locally. How can we understand what happened? So far, the only thing we have is a piece of a text-based stack trace. For example, let’s take this test failure:

Error Message
Assertion failed
Stacktrace
SocketStreamTest(TestCase)>>signalFailure:
SocketStreamTest(TestCase)>>assert:
SocketStreamTest(TestCase)>>should:raise:
SocketStreamTest>>testUpToAfterCloseSignaling
SocketStreamTest(TestCase)>>performTest

As you can see, this is not that helpful and you may still not know what has happened. Something really useful would be to at least know what the values of the instance variables involved in that stack were… Here is where Camillo Bruni had a nice idea 🙂

Fuelizing test failures

In Pharo, the stack of the running system is also reified on the language side, and we can access it! (We can even modify it.) We have instances of MethodContext, which hold an instVar ‘sender’ that refers to the next sender in the stack (another MethodContext, or nil if it is the last). Apart from ‘sender’, a context also includes the receiver, the method that caused its activation, the arguments and the temporary variables. The Fuel serializer can serialize any type of object, including MethodContext. If we can serialize a MethodContext (and closures and methods), we can serialize a stack, right? And what does this mean? Well, it means that we can serialize a debugger with its current state. I have already shown several times (at the ESUG Innovation Technology Award and at PharoConf) how we can use Fuel to serialize a debugger (from image X) in the middle of its execution, materialize it in image Y and continue debugging.

Pharo provides ‘exception’ objects and, in the end, test failures are exceptions (TestFailure). We can always ask an exception for its “signaler context”, in other words, the MethodContext that signaled it. Once we have that MethodContext, we have the whole stack (because that object has a sender, and the sender context has a sender, and so on). So, how do we serialize that?

context := testFailure signalerContext.
FLSerializer newFull
 serialize: context
 toFileNamed: 'context.fuel'.

So that piece of code will serialize the whole stack of contexts, including its transitive closure: receivers, arguments, temporary variables, etc.

Reviving test failures

So we have serialized our test failure to a file. Now we want to revive it on our local machine. The first obvious thing is to materialize the original stack from the file. But then, what do we do with the stack? How can we do something useful with it? Well, Pharo allows us to open a debugger on a particular stack 🙂. This means we can just open a debugger with the stack of the test failure! To do that:

| aContext |
aContext := FLMaterializer materializeFromFileNamed: 'context.fuel'.
Debugger
 openContext: aContext
 label: 'This is the new debugger!'
 contents: nil

And that opens our nice debugger. Much better than a text-based stack trace, isn’t it?

Caveats when serializing a stack

When you serialize the whole stack, you may find some problems:
  1. The object graph that you serialize and, therefore, the resulting stream size can be really large, depending on what the contexts hold. Sometimes a context ends up in the UI, so you end up serializing lots of morphs, colors, forms, etc. If everything is fine, the file should be a couple of hundred or a few thousand KB. If the file size is in MB… then you may be serializing too much.
  2. Not only can the graph be too big, it can also include objects that CHANGE while being serialized (mostly objects from the UI). This will cause Fuel to throw an error saying the graph has changed during serialization.
  3. If 2) happens, then depending on where you trigger the Fuel serialization, you may end up in a loop. For example, say you want to serialize each error with Fuel, so you change SmalltalkImage>>logError:inContext: to write the context with Fuel. Now, if 2) happens and Fuel throws an error, you will try to log that too, triggering the serialization again… an infinite loop.
  4. Apart from the previous points, there are still more problems. You can read the title “Limitation and known problems” in this post.
So… some workarounds are (still, not sure if they will help in all cases):
  • Deep copy the context before serializing it.
  • If you want to serialize particular contexts (for example, particular domain exceptions), then you may know WHERE to hook to make some instVars transient and, therefore, avoid serializing things you don’t want and that may cause 2).
  • Serialize a PART of the stack.

Jenkins integration

Thanks to Camillo and to Sean P. DeNigris, Jenkins now serializes (for some jobs) each test failure into a file (here you can see how to set up your own Jenkins for Pharo). For example, we have the job “pharo-2.0-tests”. If you select the OS and then a particular build number, you will find an artifact called “Pharo-2.0-AfterRunningTests.zip”. For example, this one: https://ci.lille.inria.fr/pharo/view/Pharo%202.0/job/pharo-2.0-tests/Architecture=32,OS=mac/lastSuccessfulBuild/artifact/Pharo-2.0-AfterRunningTests.zip. This zip contains the .fuel files of all the test failures. Each file is named ClassXXX-testYYY.fuel.

To work around the problems mentioned in the previous section (“Caveats when serializing a stack”), we serialize just a part of the stack: from the context that signals the failure up to the test method. Example:

  ...
  performTest
"Start context slice"
> testMyFeatureBla
> ...
> ...
> assert: foo equals: bar
"end context slice"
  assert:
  Exception signal

The idea is to serialize the least number of stack-frames possible while still giving decent debug feedback. To do that, our Jenkins code (HDTestReport>>serializeError: error of: aTestCase) is:

serializeError: error of: aTestCase
 "We got an error from a test, let's serialize it so we can properly debug it later on..."
 | context testCaseMethodContext |

 context := error signalerContext.
 testCaseMethodContext := context findContextSuchThat: [ :ctx |
   ctx receiver == aTestCase and: [ ctx methodSelector == #performTest ] ].
 context := context copyTo: testCaseMethodContext.

 [
   FLSerializer newFull
     " use the sender context, generally the current context is not interesting"
     serialize: context sender
     toFileNamed: aTestCase class name asString, '-', aTestCase selector, '.fuel'.
 ] on: Error do: [ :err | "simply continue..." ]
During serialization, the graph can somehow reach classes of the Jenkins code (like HDTestReport). If you materialize in an image where such a class is not present, you will get a Fuel error. For this reason, besides the .fuel files, the same Pharo-2.0-AfterRunningTests.zip also contains a Pharo-2.0-AfterRunningTests.image which, as its name says, was saved after running all tests (meaning it has the Jenkins code installed). This means we can use that image directly to materialize, and it will work. The other option is to take another image and install the following before materializing:
Gofer new
 url: 'http://ss3.gemstone.com/ss/CISupport';
 package: 'HudsonBuildTools20';
 load.
This is temporary, because soon the Jenkins support code will be integrated directly in Pharo.
Anyway, I recommend using the same image version that was used during serialization. So I think that using Pharo-2.0-AfterRunningTests.image directly is more reliable.

Conclusion

It is clear that there are several caveats. However, I do believe this is yet another step forward in CI and development. It is just one more tool you have at hand when something is failing on the server and you cannot reproduce it locally. In the worst case, it won’t help, but it won’t hurt either. If you are lucky, you may find out the cause 🙂 It is incredible how many things you can do when the stack is reified and visible from the language while also being serializable. For me, asking for a text-based stack trace in Smalltalk is like going to a cabaret and asking for a hug. We have so much power that we should take advantage of it. In the end, using a debugger is way better. Anyway, I do not recommend removing the stack trace information, just adding the Fuel option as well.

Moving contexts and debuggers between images with Fuel

Hi guys. During ESUG 2011, at the Awards, I was showing Fuel. The week before the event I was thinking about what I could show to people. This was a challenge, because showing a serializer can be plain boring. I was working at home that afternoon, and suddenly I thought: “What happens if I try to serialize a living debugger and materialize it in another image?” After 5 minutes (really, you will see it takes only 5 minutes), I noticed that this crazy idea was working OUT OF THE BOX. Even though I knew Fuel supported serialization of methods, contexts, closures, classes, etc., I was surprised that it worked on the first try. I was so happy that I tried to explain to my poor wife what I had just done hahahah. Unfortunately, she told me it was too abstract and that understanding the garbage collector was easier (I promise she really understands what the garbage collector does hahhahaha).

Well… several months have passed, but I would like to show you how to do it because I think it may be of help for real systems 😉 So… the idea is the following: whenever there is an error, you can get the context from it, and that context is what is usually written to a log file (in Pharo this is PharoDebug.log). I will show you two things: 1) how to serialize a debugger in one image and materialize it in another one; and 2) how to write the context to a Fuel file when there is an error, so that you can materialize it later in another image.

Installing Fuel

The first step is, of course, to install Fuel. The latest stable release is 1.7, but to get better results with this example I would recommend 1.8. Fuel 1.8 is not released yet because we plan to write some material for the website first. The code is almost finished, so you should load Fuel 1.8 beta1. In my case I am using a normal Pharo 1.3 image:

Gofer it
url: 'http://ss3.gemstone.com/ss/Fuel';
package: 'ConfigurationOfFuel';
load.
((Smalltalk at: #ConfigurationOfFuel) project version: '1.8-beta1') load.

Once you have finished loading Fuel, save the image. Let’s call it Fuel.image.

Serializing and materializing a debugger

Now it’s time to do something hacky in the image in order to open a debugger. Or you can just take a piece of code and debug it. In my example, I opened a workspace and wrote the following:

| a |
a := 'Hello Smalltalk hackers. The universal answer is '.
a := a , '42!'.
Transcript show: a.

Then I select the whole code, right click -> “debug it”. Then I do one “step over” and I stop there before the concatenation with ’42!’.

I am sure there are better ways, but the simplest way I found to get the debugger instance for this example is to evaluate Debugger allInstances first 😉 so… be sure not to have another debugger open hahaha. Now… let’s serialize the debugger:

Smalltalk garbageCollect.
FLSerializer
serialize: Debugger allInstances first
toFileNamed: 'debugger.fuel'.

After that, you should have a ‘debugger.fuel’ file in the same directory as the image. Now close your image (without saving it) and reopen it. If everything is fine, we should be able to materialize our debugger and continue debugging. So, let’s try it:

| newDebugger |
newDebugger := FLMaterializer materializeFromFileNamed: 'debugger.fuel'.
newDebugger openFullMorphicLabel: 'Materialized debugger ;)'.

So???? Did it work?? Are you as happy as I was when I first saw it? 🙂 If you check the newly opened debugger, you will see its state is correct. For example, the temporary variable ‘a’ has the correct value. You can now open a Transcript and continue with the debugger as if it were the original one.

Of course, even though this simple example works, there are a lot of problems. I will explain them at the end of the post.

Serializing and materializing errors

In the previous example we serialized the debugger manually. But imagine the following: you have a production application running. There is an error, and PharoDebug.log is written with the whole stack. The user/client sends you the .log by email and you open your favorite text editor to try to understand what happened. Now imagine the following instead: you have a production application running. There is an error, and a PharoDebug.fuel is written with the whole stack. The user/client sends you the file by email, you open an image, and then you materialize it and open a debugger. How does that sound? 🙂 Magical?

For this example, we will just change the place where Pharo writes PharoDebug.log when there is an error. That method is #logError:inContext:. What we will do is to add just 2 lines at the beginning to serialize the context:

logError: errMsg inContext: aContext

" we should think about integrating a toothpick here someday"
FLSerializer
serialize: aContext
toFileNamed: 'PharoDebug.fuel'.

self logDuring: [:logger |
logger
nextPutAll: 'THERE_BE_DRAGONS_HERE'; cr;
nextPutAll: errMsg; cr.

aContext errorReportOn: logger.

"wks 9-09 - write some type of separator"
logger nextPutAll: (String new: 60 withAll: $- ); cr; cr.
]

Now, let’s execute something that causes an error. What I did was evaluate 1/0. After that, you should see the file PharoDebug.fuel in the same directory as the image. You can now close the image and reopen it. And then, let’s reopen the debugger:

| aContext |
aContext := FLMaterializer materializeFromFileNamed: 'PharoDebug.fuel'.
Debugger openContext: aContext label:  'This is the new debugger!' contents: nil

Et voilà! Hopefully that worked 🙂 Notice that in this example and the previous one, there is nothing special about the Fuel serialization. You are using the normal API, and all you do is serialize a debugger or a context as if you were serializing any normal object. Of course, you can also apply this idea in other places. For example, in Seaside you have an error handler. You may want to serialize the error with Fuel there.

Limitation and known problems

  • Even though Fuel can fully serialize methods, classes, traits, etc., it is recommended that the images where the contexts/debuggers are serialized and materialized be the same. If you are doing this in a production application, you can have the same image running locally. The idea is that both images have the same classes and methods installed. This is because, by default, if the object graph to serialize includes compiled methods, classes, class variables, etc., they are all considered “globals”, which means that we only serialize their global names, and during materialization they are looked up in Smalltalk globals. Hence, the classes and methods have to be present. Otherwise you have to use Fuel in a way that serializes classes as well, but that’s more complicated.
  • There may be things that affect the debugger which are part of the image and not of the serialization, and they may have changed. Imagine, for example, a class variable whose value has changed in the image where you serialize. It will then have a different value in the image where you materialize. Most of these problems also happen even if you open the debugger later in the same image… some state may have changed…
  • The graph reachable from the contexts can be very big. For example, Esteban Lorenzano was doing this for an application, and one of the problems was that the whole UI was reachable from the context… which means lots and lots of objects. In such a case, you can always use Fuel hooks to prune the object graph to serialize.
  • Be sure to use exactly the same version of Fuel in both images 😉

Conclusion

All in all, I think that as a very first step, it is very nice that we can serialize things like contexts and debuggers out of the box with Fuel. This could be the infrastructure for a lot of fancy stuff. I don’t think that debugging a materialized debugger can be as reliable as debugging in the original image. I don’t think either that it should replace PharoDebug.log. However, what I do think is that you can add the Fuel serialization as just another way of getting more information about the problem. It’s one more tool you can add to your Smalltalk toolbox 🙂


How to debug the VM?

Hi. Whether you are experimenting and hacking in the VM (where it is likely that some things will go wrong) or you are running an application in production, it is always useful to know how to debug the VM. In this post, we will see how to compile the VM with all the debug information, how to run the VM from GDB (the GNU Project Debugger) and how to debug it using an IDE’s debugger.

Before going further, I would like to tell you something I did last weekend (apart from sleeping quite a lot). Laurent Laffont, creator of PharoCasts, started a new series of screencasts called “PharoCasts with Experts”. The idea is to do a roughly 1 hr “interview” about a certain topic. Laurent calls the “expert” on Skype and they can talk, ask questions, etc. Using TeamViewer, the “expert” shares his desktop with Laurent, who also records the screencast. The screencast is finally edited and uploaded to http://www.pharocasts.com . OK, I am not a VM expert at all, but last week we did a 1:30 screencast about compiling and debugging the VM 🙂 This screencast covers everything we have learned in the previous posts and what you will learn today.

So…let’s start the show.

Prepared image for you

In previous posts, I told you that the Pharo guys recommend using the latest PharoCore for building the CogVMs. There are several reasons for this, but the basic idea is that they want the VM to be buildable in the latest versions and, if it is not, to fix it as soon as possible. But we have a much better tester than you, and it is called Hudson. So… I don’t want to force you to use the latest unstable PharoCore (which, in fact, is sometimes unstable). Even more, if you are just starting in the VM world, I don’t want to add even more problems. So… from now on, I have prepared an image for you to use. At the time of this writing there is no PharoDev 1.3 yet, and the VM tools do not load directly in Pharo 1.2. Three little changes are needed. So… you can grab a PharoDev 1.2.1 and file in these changes, or directly use this image that I have prepared for you. This image has only those 3 changes. It does not contain any Cog or CMakeVMMaker package; that’s your homework 😉 The good news is that since we now use a Dev image, we have all the development tools like refactorings, code completion, code highlighting, etc.

When do we need to debug the VM?

I think it is a good idea to start by thinking about when we need to debug the VM. In my limited experience, I have found the following reasons:

  1. When there is a crash and you cannot find out why it happened.
  2. When you are developing something on the VM.
  3. For learning purposes.
  4. When you are just a hacky geek 🙂

I think most people will need the first one.

Debugging and optimizing a C program

If you are reading this post, you are probably a Smalltalker. And if you are a Smalltalker, I know how much you love the debugger. We all do. But there are guys that are not as lucky as we are 😉 Now, seriously, how do we debug a program written in C? I mean, the VM is, in the end, a program written in C, compiled, and executed like any other program. I am not a C expert, so I will just give a quick introduction.

When you normally compile a C program, the binary you get does not contain the C source code used to generate it. In addition, the names of the variables or functions may be lost. Hence, it is really hard to debug such a program. This is why C compilers (like GCC) usually provide a way to compile a C program with “Debugging Symbols”, which add the information needed for debugging to the binaries. Normally, as we will see today, these debugging symbols are included directly in the binary. However, they can sometimes be kept in a separate file. The GCC flag for debugging symbols is -g, where -g3 is the maximum level and -g0 the minimum (none). There are many more flags related to debugging, but that is not the main topic of this post.

In addition to debugging support, a C compiler usually provides what are called “Compiler Optimizations”. This means that the compiler tries to optimize the resulting binary in different ways: speed, memory, size, etc. When compiling for debugging it is common to disable optimizations. Why? Because, as the GCC documentation says, some variables you declared may not exist at all; flow of control may briefly move where you did not expect it; some statements may not be executed because they compute constant results or their values were already at hand; some statements may execute in different places because they were moved out of loops. The GCC flag for optimization is -O, where -O3 is the maximum and -O0 means no optimization (the default).
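
To try these flags on something much smaller than the VM, a tiny C file is enough. The sketch below is purely illustrative: the file name sum.c, the function sum_of_squares and the SUM_SKETCH_MAIN macro are made up for this example; only the -g and -O flags come from the text above (-D is the usual GCC option for defining a macro on the command line).

```c
#include <stdio.h>

/* Hypothetical file name: sum.c
 * Debug build, using the flags discussed above:
 *     gcc -g3 -O0 -DSUM_SKETCH_MAIN sum.c -o sum
 * Optimized build without debugging symbols, for comparison:
 *     gcc -g0 -O3 -DSUM_SKETCH_MAIN sum.c -o sum
 */

/* Sum of the squares 1*1 + 2*2 + ... + n*n. */
long sum_of_squares(long n)
{
    long total = 0;
    for (long i = 1; i <= n; i++) {
        /* In an unoptimized build this local can typically be inspected
         * in a debugger; with optimization it may not exist at all. */
        long square = i * i;
        total += square;
    }
    return total;
}

#ifdef SUM_SKETCH_MAIN
int main(void)
{
    printf("%ld\n", sum_of_squares(10));
    return 0;
}
#endif
```

Stepping through the loop of the debug build in a debugger, and then trying the same with the optimized build, is a quick way to see the effects described in the quoted GCC documentation.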

But we are Smalltalkers, so much better if someone can take care about all these low-level details. In this case, CMakeVMMaker is that guy.

Building a Debug VM

As I told you in the previous post, there are several CMakeVMMaker configuration classes, and some of them are for compiling a debug VM. Examples: MTCocoaIOSCogJitDebugConfig, StackInterpreterDebugUnixConfig, CogDebugUnixConfig, etc. What is the difference between those configurations and the ones we have used so far? The only difference is the compiler flags. These “Debug” VMs use special flags for debugging. Normally, these flags include things like -g3 (maximum debugging information), -O0 (no optimization), etc. For more details, check the implementors of #compilerFlagsDebug. But… since there were a couple of fixes in the latest commit, I recommend you load version “CMakeVMMaker-MarianoMartinezPeck.94” of the CMakeVMMaker package.

So… how do we build the debug VM? Exactly the same way as the regular (“deploy”) VMs (I have explained that here and here), with only one difference. Guess which one? Yes, of course: use the debug configuration instead of the normal one. That means that on Mac, for example, instead of doing “MTCocoaIOSCogJitConfig generateWithSources” we do “MTCocoaIOSCogJitDebugConfig generateWithSources”. And that’s all. Once you finish all the build steps, you have a debug VM.

What are you waiting for? Fire up your image and create a debug VM 😉 Now… how do you know whether you really succeeded in creating a debug VM? Easy: it will be much slower and the VM binary will be bigger (because of the added debugging symbols). In a regular CogMTVM, if you evaluate “1 tinyBenchmarks” you may get something around ‘713091922 bytecodes/sec; 103597027 sends/sec’, whereas with a debug VM you may get something around ‘223580786 bytecodes/sec; 104885548 sends/sec’. So… as you can see, in this run the debug VM is about 3 times slower in bytecodes/sec, while sends/sec are about the same.

Debugging the VM with GDB

Disclaimer: I am not a GDB expert, since most of the time I debug from the IDE (Xcode). I will give you a quick introduction. So… we have compiled our VM with all the debug flags, which means that the VM binaries now have extra information so that we can debug them. GDB is a command-line debugger from the GNU Project and it is the “standard” for debugging C programs. In addition, it works on most operating systems. Notice that for Windows there are no CMakeVMMaker debug configurations. I think this is just because nobody has tried it. As far as I know, gdb works with MSYS, so it may just be a matter of implementing #compilerFlagsDebug for the Windows configurations and trying to make it work. If you give it a try and it works, please let me know! So far, I did a little test with some compiler flags like -g3 and -O0, and it compiles and runs. But (as we will see later) when I do CTRL+C, instead of getting the gdb prompt, I kill GDB heheheh. From googling, this seems to be a known problem.

So….open a console/terminal. The following are the minimal steps to run the VM under GDB in Mac OS:

cd blessed/results/CogMTVM.app/Contents/MacOS/
#If you compiled a StackVM or CogVM instead of a CogMTVM, then the .app and the executable will have another name
gdb CogMTVM

In Linux/Windows, since you don’t have the .app directory of Mac OS, it should be something like this:

cd blessed/results
#If you compiled a StackVM or CogVM instead of a CogMTVM, then the executable will have those names instead
gdb CogMTVM

After the previous step, you should have arrived at a gdb prompt that looks like this:

(gdb)

So now we need to start the VM. In Mac, the VM automatically raises the File Selection popup that lets you choose which image to run. So the following line is enough:

(gdb) run

Choose the image and continue. On Linux, you have to pass the .image to run to the VM as a parameter. So, you have to do something like this:

(gdb) run  /home/mariano/Pharo/images/Pharo1.3.image

The idea is that you pass the .image you want to run as a parameter. So… at this point you have launched your image, and it’s time to play a bit. Open a Transcript. Open a workspace and evaluate:

9999 timesRepeat: [Transcript show: 'The universal answer is 42.']

Now… come back to the console and press CTRL+C. This sends a SIGINT (interrupt) signal, and in the GDB console you should get something like:

Program received signal SIGINT, Interrupt.
0x9964a046 in __semwait_signal ()
(gdb)

What happened? I am not an expert, but with such an interruption we can “pause” the VM execution, and gdb takes control. If you look at your image at this moment, you will see that you cannot do anything in it. It is like “frozen”. If you now do:

(gdb) continue

You can also type “c” instead of “continue”. The image should continue running, and your Transcript should continue printing the universal answer 😉 So… what is cool about being able to interrupt the VM? We can, for example:

  • Inspect the value of some variables and stack. Type “help data” and “help stack” in the gdb prompt.
  • Put breakpoints in the code and then step through it.
  • Invoke functions from our VM (this is really cool!!)
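
To illustrate the last point without relying on any specific VM function, here is a small sketch of a debug helper written so that it can be invoked from the debugger. Everything in it (the sketch_ names, the tiny stack) is hypothetical and unrelated to the VM sources; the only gdb feature assumed is its “call” command, which runs a function of the paused program.

```c
#include <stdio.h>

/* Hypothetical program state that we would like to inspect while paused. */
#define SKETCH_STACK_DEPTH 8
static long sketch_stack[SKETCH_STACK_DEPTH];
static int sketch_stack_top = 0;   /* number of live entries */

/* Push a value; returns 1 on success, 0 if the stack is full. */
int sketch_push(long value)
{
    if (sketch_stack_top >= SKETCH_STACK_DEPTH)
        return 0;
    sketch_stack[sketch_stack_top++] = value;
    return 1;
}

/* Debug helper meant to be invoked from the debugger prompt, e.g.
 *     (gdb) call sketch_print_stack()
 * Prints the entries from top to bottom and returns how many there are. */
int sketch_print_stack(void)
{
    for (int i = sketch_stack_top - 1; i >= 0; i--)
        printf("[%d] %ld\n", i, sketch_stack[i]);
    fflush(stdout);   /* make sure the output shows up while paused */
    return sketch_stack_top;
}
```

The design point is simply that a helper like this takes no arguments (or only simple ones) and prints to the console, so it is easy to type at the gdb prompt while the program is interrupted.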

Let’s see a couple of examples… hit CTRL+C again to get back to the gdb prompt and do:

(gdb) bt

Which should look similar to this (why similar? Because it depends on where you stop it):

(gdb) bt
#0  0x0002dbd0 in longAtPointerput (ptr=0x22efa6e4 "\017??\024?\001", val=349999375) at sqMemoryAccess.h:82
#1  0x0002daf2 in longAtput (oop=586131172, val=349999375) at sqMemoryAccess.h:99
#2  0x0006a2ae in sweepPhase () at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:44356
#3  0x0003d8b4 in incrementalGC () at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:17846
#4  0x00069f61 in sufficientSpaceAfterGC (minFree=0) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:44214
#5  0x00032e56 in checkForEventsMayContextSwitch (mayContextSwitch=1) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:11398
#6  0x0003ccad in handleStackOverflowOrEventAllowContextSwitch (mayContextSwitch=1) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:17346
#7  0x000326c4 in ceStackOverflow (contextSwitchIfNotNil=525336580) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:11136
#8  0x1f40032e in ?? ()
#9  0x0006ab4c in threadSchedulingLoop (vmThread=0x828a00) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:44714
#10 0x0003dd96 in initialEnterSmalltalkExecutive () at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:17970
#11 0x0003ebdc in initStackPagesAndInterpret () at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:18435
#12 0x00022211 in interpret () at /Users/mariano/Pharo/vm/git/cogVM2/blessed/src/vm/gcc3x-cointerpmt.c:2037
#13 0x00072142 in -[sqSqueakMainApplication runSqueak] (self=0x211090, _cmd=0x121c80) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/platforms/iOS/vm/Common/Classes/sqSqueakMainApplication.m:172
#14 0x96339cbc in __NSFirePerformWithOrder ()
#15 0x91b9ee02 in __CFRunLoopDoObservers ()
#16 0x91b5ad8d in __CFRunLoopRun ()
#17 0x91b5a464 in CFRunLoopRunSpecific ()
#18 0x91b5a291 in CFRunLoopRunInMode ()
#19 0x920bde04 in RunCurrentEventLoopInMode ()
#20 0x920bdaf5 in ReceiveNextEventCommon ()
#21 0x920bda3e in BlockUntilNextEventMatchingListInMode ()
#22 0x9307378d in _DPSNextEvent ()
#23 0x93072fce in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#24 0x93035247 in -[NSApplication run] ()
#25 0x9302d2d9 in NSApplicationMain ()
#26 0x0006fca5 in main (argc=2, argv=0xbffff864, envp=0xbffff870) at /Users/mariano/Pharo/vm/git/cogVM2/blessed/platforms/iOS/vm/Common/main.m:52
(gdb)

Guess what it was doing? Sure…an incremental GC 😉 Now let’s call a function of the VM:

(gdb) call printAllStacks()

Which should look similar to the following (notice that it was not taken at the same place as the previous example; the output in the middle of the incremental garbage collection was too long to put in the post, so I stopped the VM at another moment):

(gdb) call printAllStacks()
Process 0x2197b5f8 priority 10
0xbff650b0 I ProcessorScheduler class>idleProcess 526237088: a(n) ProcessorScheduler class
0xbff650d0 I [] in ProcessorScheduler class>startUp 526237088: a(n) ProcessorScheduler class
0xbff650f0 I [] in BlockClosure>newProcess 563590428: a(n) BlockClosure

Process 0x219edf1c priority 50
0xbff5f0b0 I WeakArray class>finalizationProcess 526237628: a(n) WeakArray class
0xbff5f0d0 I [] in WeakArray class>restartFinalizationProcess 526237628: a(n) WeakArray class
0xbff5f0f0 I [] in BlockClosure>newProcess 564059712: a(n) BlockClosure

Process 0x2067e180 priority 80
0xbff600d0 M Delay class>handleTimerEvent 526243268: a(n) Delay class
0xbff600f0 I Delay class>runTimerEventLoop 526243268: a(n) Delay class
543678476 s [] in Delay class>startTimerEventLoop
543678752 s [] in BlockClosure>newProcess

Process 0x2197b430 priority 60
0xbff610b0 I SmalltalkImage>lowSpaceWatcher 527603592: a(n) SmalltalkImage
0xbff610d0 I [] in SmalltalkImage>installLowSpaceWatcher 527603592: a(n) SmalltalkImage
0xbff610f0 I [] in BlockClosure>newProcess 563589972: a(n) BlockClosure

Process 0x219eba78 priority 60
0xbff62030 M [] in Delay>wait 564050624: a(n) Delay
0xbff62050 M BlockClosure>ifCurtailed: 565737144: a(n) BlockClosure
0xbff6206c M Delay>wait 564050624: a(n) Delay
0xbff62084 M InputEventPollingFetcher>waitForInput 526769836: a(n) InputEventPollingFetcher
0xbff620b0 I InputEventPollingFetcher(InputEventFetcher)>eventLoop 526769836: a(n) InputEventPollingFetcher
0xbff620d0 I [] in InputEventPollingFetcher(InputEventFetcher)>installEventLoop 526769836: a(n) InputEventPollingFetcher
0xbff620f0 I [] in BlockClosure>newProcess 564050332: a(n) BlockClosure

Process 0x212b6170 priority 40
0xbff5e03c M [] in Delay>wait 565740688: a(n) Delay
0xbff5e05c M BlockClosure>ifCurtailed: 565741260: a(n) BlockClosure
0xbff5e078 M Delay>wait 565740688: a(n) Delay
0xbff5e098 M WorldState>interCyclePause: 529148776: a(n) WorldState
0xbff5e0b4 M WorldState>doOneCycleFor: 529148776: a(n) WorldState
0xbff5e0d0 M PasteUpMorph>doOneCycle 527791904: a(n) PasteUpMorph
0xbff5e0f0 I [] in Project class>spawnNewProcess 528638248: a(n) Project class
556491024 s [] in BlockClosure>newProcess
(gdb)

Notice that in this case we are calling an exported function of the VM from gdb. Where does it come from? For the moment, take a look at #printAllStacks. Now it is time to kill the executable and finally quit gdb:

(gdb) kill
(gdb) quit

Notice that after doing the “kill” you can call “run” again and start everything again. Hopefully, you got an idea of how to debug the VM. However, using gdb from the command line is not the only option.
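
Before moving on: the “bt” and “call printAllStacks()” steps can also be replayed without typing them, by attaching gdb to a VM that is already running. This is a sketch under assumptions: the file name vm-snapshot.gdb is made up, you have to supply the process id of your own debug VM, and on some Linux systems attaching requires extra permissions. Also note that printAllStacks writes to the VM's own output, which is not necessarily the terminal where gdb runs.

```shell
#!/bin/sh
# Write the commands typed interactively in this post into a file
# that gdb can replay. The file name is a hypothetical choice.
CMD_FILE="vm-snapshot.gdb"
cat > "$CMD_FILE" <<'EOF'
# C-level backtrace of the stopped VM
bt
# Smalltalk-level stacks of every process, printed by the VM itself
call printAllStacks()
EOF

# Usage: sh snapshot.sh <pid of the running debug VM>
VM_PID="${1:-}"
if [ -n "$VM_PID" ] && command -v gdb >/dev/null 2>&1; then
    # -p attaches (which pauses the VM); -batch exits after the
    # commands, which detaches and lets the VM continue.
    gdb -batch -p "$VM_PID" -x "$CMD_FILE"
else
    echo "wrote $CMD_FILE; run: gdb -batch -p <pid> -x $CMD_FILE"
fi
```

The attach plays the role of CTRL+C, and leaving batch mode plays the role of “continue”.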

Debugging the VM with XCode

In the previous post we saw that CMake provides “generators”, i.e., a way to generate specific build files for different IDEs. Hence, we can generate the project for our IDE and debug the VM from there. Remember that executing “cmake --help” in a console shows the help, and at the end there is a list of the available generators. In this particular example, I will use XCode on Mac OS. To do so, we need to follow the same steps as for compiling a debug VM (that is, the same as building a release VM but using a debug CMakeVMMaker conf) but using the XCode generator. That means that instead of doing “cmake .” we do “cmake -G Xcode”. Notice that if you previously built another VM you will need to remove the CMake cache: /blessed/build/CMakeCache.txt. So..

cd blessed/build
cmake -G Xcode
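
If you switch generators often, the cache removal and the project generation can be put together in a small script. This is only a sketch of the two steps described above: the remove_stale_cache helper is a name I made up, it removes only the CMakeCache.txt file mentioned in this post, and it passes the source directory “.” explicitly to cmake.

```shell
#!/bin/sh
# BUILD_DIR defaults to the directory used in this post.
BUILD_DIR="${BUILD_DIR:-blessed/build}"

# Remove the cache left by an earlier configuration. CMake refuses to
# change generators while a cache from another generator is present.
remove_stale_cache() {
    rm -f "$1/CMakeCache.txt"
}

if [ -d "$BUILD_DIR" ] && command -v cmake >/dev/null 2>&1; then
    remove_stale_cache "$BUILD_DIR"
    # Same command as in the post, with the source directory spelled out.
    (cd "$BUILD_DIR" && cmake -G Xcode .)
else
    echo "skipping: $BUILD_DIR or cmake not found"
fi
```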

Notice that there is no need to run “make”, since we will build from inside XCode. “cmake -G Xcode” generates a /blessed/build/CogMTVM.xcodeproj (the name will change if you generate a CogVM or a StackVM) that we open directly with XCode. Now…normally we should be able to select the target Debug (on the top left) and then from the menu “Build -> Build and Debug – Breakpoints On”. That should compile the VM and automatically run it with GDB. However, I found two problems:

  • External plugins are not correctly placed. They are placed in blessed/results/CogMTVM.app/Contents/Resources/Debug but they should be in blessed/results/Debug/CogMTVM.app/Contents/Resources. The problem is that XCode adds a “Debug” to the output directory when compiling with the “debug target” (and the same for “release”). I spent two full days trying to solve it from the CMakeVMMaker and I couldn’t. If you know how to do it, please let me know. Anyway, the workaround is to copy all those .dylib files from the first directory to the second one.
  • No matter what settings you change in the project (compiler flags, gcc version, etc.), it will always use the values from the CMake generated makefiles. This is a pity because I would like to change settings from XCode…
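
The copy workaround for the first problem can be scripted so that it is easy to repeat after each build. This is a minimal sketch: the two directories are the ones named above, while the RESULTS variable and the copy_plugins helper are names I introduced for illustration.

```shell
#!/bin/sh
# Copy the external plugins next to the VM bundle that XCode runs.
# RESULTS is an assumed root; the two sub-paths come from the post.
RESULTS="${RESULTS:-blessed/results}"
SRC="$RESULTS/CogMTVM.app/Contents/Resources/Debug"
DST="$RESULTS/Debug/CogMTVM.app/Contents/Resources"

copy_plugins() {
    src="$1"; dst="$2"
    mkdir -p "$dst"
    for lib in "$src"/*.dylib; do
        # With no matches the pattern stays literal; skip it.
        [ -e "$lib" ] || continue
        cp "$lib" "$dst/"
    done
}

if [ -d "$SRC" ]; then
    copy_plugins "$SRC" "$DST"
fi
```

Only files ending in .dylib are copied; anything else in the source directory is left alone.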

Now you are able to “Build -> Build and Debug – Breakpoints On”. If everything goes well, you should be able to open an image and the VM will be running with gdb. You will notice there is a little gdb button that opens a gdb console where you can do exactly the same as from the command line. In addition, there is another button for the debugger. I attach a screenshot.

So…there you are, you are running the VM in debug mode from gdb and inside XCode. An interesting thing about the gdb console is that it has buttons for “Pause” (what we did from the command line with CTRL+C) and for “Continue” (what we typed at the gdb prompt). There is even a “Restart”.

Now…the nice thing about being able to debug the VM with XCode is the ability to easily put breakpoints in the code. But where? In which file? Without giving many details (because that’s the topic of another post), we will say that the “VM Core” (basically the Interpreter classes + ObjectMemory classes) is translated into a big .c file called gcc3x-cointerpmt.c (in the case of CogMT). For a regular CogVM the file is called gcc3x-cointerp.c, the same as for StackVM. You can check these files yourself. You should know where they are. Remember? Yes, they are (if you used the default configuration) in /blessed/src/vm (for Cocoa configurations it may be /blessed/stacksrc/vm). In that folder you will see that there is another file, cointerpmt.c (or cointerp.c). What is the difference between the one that has the “gcc3x” prefix and the one that does not? For the moment let’s just say that the one with “gcc3x” has some optimizations (done by the Gnuifier) that were automatically applied to the C code with the intention of compiling those sources with a gcc3x compiler.

We already know which file to debug “most of the time”. Maybe you need to debug another C file, like the plugins or the machine code generator, but most of the time you will debug the “VM core”, which is the file gcc3x-cointerpmt.c or its variant. Now what you have to do is open it and put a breakpoint. On the top part of XCode you have a list of the files included in the project. Search for that one and select it. Be careful, because XCode is really slow with this big file. Let’s put a breakpoint in the method lookup, at the point where #doesNotUnderstand is thrown. This is done in the function “static sqInt lookupMethodInClass(sqInt class)” and here is a screenshot:

Once you have set breakpoints you can “Build and Debug”. Now, instead of testing by executing something in a workspace, we will create a simple method anywhere (why this does not work directly from a workspace I do not know right now, sorry). So…I create the method foo in TestCase like this:

TestCase >> foo
self methodNonExistent.

And then I executed (this can be done in a workspace):

TestCase new foo

Once this is done, the VM should be paused and you should have the gdb prompt available. You can now open the debugger or the GDB console and do whatever you want. With the Debugger you can do “Step Over”, “Step Into”, “Step Out”, etc. And with the GDB console you can do exactly the same as if you were executing gdb from a terminal. Here is the screenshot:

There are many things I would like to talk about regarding debugging a VM, but the post is already too long and I think this is enough to start. I recommend that you watch the screencast. Later in this blog series, we will do a second part. Now, it is time to understand some internal parts of the Squeak/Pharo VM.