Building the VM – Second Part – Mariano Martinez Peck

Hi folks. I guess some readers don’t care for the building part and want to go straight to the VM internals. But it is really important that you understand how to change the VM, compile it and even debug it. Otherwise, you’ll be very limited.

This post is mostly about a couple of things that I wanted to mention in the previous post but couldn’t, because it was already too long. If you read that post, you may think that compiling the VM from scratch takes a lot of work and many steps. But the post was long because of my explanations and my efforts to make it reproducible. This is why I would like to summarize how to compile the VM.

Summary of VM build

Assuming that you have already installed Git + CMake + GCC, then the following are the needed steps to compile the Cog VM:

mkdir newCog
cd newCog
git clone --depth 1 git://gitorious.org/cogvm/blessed.git
cd blessed/image
wget --no-check-certificate http://www.pharo-project.org/pharo-download/unstable-core
# Or manually download the latest PharoCore image from that URL with a
# web browser and put it in blessed/image

Then we open the image with a Cog VM (which we can get from here or here) and we evaluate:

Deprecation raiseWarning: false.
 Gofer new
 squeaksource: 'MetacelloRepository';
 package: 'ConfigurationOfCog';
 load.
(Smalltalk at: #ConfigurationOfCog) project latestVersion load.
"Notice that loading CMakeVMMaker separately is no longer necessary,
since it is included as just another dependency of ConfigurationOfCog"
MTCocoaIOSCogJitConfig generateWithSources.
"Replace this CMakeVMMaker configuration class with the one that suits your OS,
like CogUnixConfig or CogMsWindowsConfig"

Now, come back to the terminal and do:

cd newCog/blessed/build
cmake .
# Or  cmake . -G"MSYS Makefiles"  if you are in Windows
make

And that’s all. In “blessed/results” (on Windows it should be under “blessed/build/results”) you should have the CogVM binary. I know you may feel lazy, but if you really want to get the most out of my posts, I strongly recommend that you follow those steps. Throughout this sequence of posts we will debug and modify the VM (change the GC, the method lookup, create our own primitives and plugins, etc). Once you have Git and CMake, I promise the process takes less than 5 minutes.
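Before starting, it can help to confirm that the three prerequisites are actually on your PATH. What follows is only a sketch of mine and not part of the Cog build: the helper name missing_tools is made up for illustration, and it checks nothing beyond whether each command can be found (it says nothing about versions).

```shell
#!/bin/sh
# Hypothetical helper (not part of Cog or CMakeVMMaker):
# print the name of every tool in the argument list that is not on PATH.
# Returns 0 when all tools are found, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# The three prerequisites named in the build summary.
if missing=$(missing_tools git cmake gcc); then
    echo "All build prerequisites found"
else
    echo "Missing:" $missing
fi
```

If something is reported as missing, install it first and only then go on with the git clone step.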

Available CogVMs

Remember that all these posts are what I called “Journey through the VM”, so we will probably go back and forth between different posts 🙂 In the first post, under the title “CogVM and current status”, I explained the different flavors of CogVMs and their main features:

1) Real and optimized block closure implementation. This is why from the image side blocks are now instances of BlockClosure instead of BlockContext.
2) Context-to-stack mapping.
3) JIT (just in time compiler) that translates Smalltalk compiled methods to machine code.
4) PIC (polymorphic inline caching).
5) Multi-threading.

What is the big difference between StackVM and CogVM? Well, Stack VM implements 1) and 2). And Cog VM is on top of the Stack VM and adds 3) and 4). Finally, there is CogMTVM which is on top of Cog VM and adds multi-threading support for external calls (like FFI for example).

In addition, Cog also brings some refactorings. For example, in the Interpreter VM, Interpreter was a subclass of ObjectMemory. That was necessary in order to translate easily to C. In Cog, there are new classes like CoInterpreter and NewObjectMemory. But the good news is that we can have composition!! The CoInterpreter has an instance variable that holds the object memory (in this case an instance of NewObjectMemory). This was awesome and required changes in the Slang-to-C translator.

As said, in the VMMaker part of the VM, what we called the “core”, there are mainly two important classes: Interpreter and ObjectMemory. Read the first post for details of their responsibilities. In Cog, there are a couple of differences:

As said, the Cog Interpreter class does not subclass ObjectMemory; instead, the object memory is held in an instance variable.
In Cog there isn’t only one Interpreter class like in the old VM. In fact, each of the Cog VMs I mentioned (StackVM, CogVM, CogMTVM) has its own Interpreter class (StackInterpreter, CoInterpreter and CoInterpreterMT). Come on!! Don’t be lazy, take your image and browse them 🙂
Besides those Interpreter classes, Cog has several more that exist purely for design reasons, i.e., they are not Interpreter classes that should be used for compiling the VM. They exist, for example, to reuse code or to make simulation easier. Examples: CoInterpreterPrimitives, StackInterpreterPrimitives, InterpreterPrimitives, etc. And then, of course, we have the Interpreter simulators, but that’s another story for another post.

So…if you are paying attention to this blog you may be asking yourself which Interpreter class you should use. My advice, and this is only my advice, is that you should normally use the “last one”. In this case, the CogMTVM. The few reasons I find not to use the last one are:

If you are running on hardware where the Cog JIT is not supported. For example, for the iPhone the StackVM is usually used.
When you are doing hacky things with the VM and you want to be sure there are no problems with the JIT, PIC, etc. This is my case…
Maybe for learning purposes the CogVM or CogMTVM is far more complicated than the StackVM or the Interpreter VM.
The “last one” may not be the most stable one. So if you are running a production application you may want to deploy with a CogVM rather than a CogMTVM that has only just been released.

But apart from that, you will probably use the “last one” available. To finish this little section, I leave you with a screenshot of part of the Cog VMs hierarchy.

CMakeVMMaker available configurations

In the previous post we saw what the CMakeVMMaker configurations do: 1) generate the VM sources from VMMaker and 2) generate the CMake files. 1) depends on which Cog (StackVM, CogVM or CogMTVM) we want to build, which plugins, etc. And 2) depends not only on which CogVM but also on the OS (the CMake files are not the same for each operating system) and on other things, like whether we are compiling in debug mode or not, whether we are using the Carbon or the Cocoa library on Mac, etc. So…imagine the combinations of: which CogVM, which OS, and whether debug mode or not. That gives us a lot of possibilities 🙂

The design decision to solve this in the CMakeVMMaker project was to create specific “configuration” classes. To summarize, there is at least one class per VM/OS combination. So you have, for example, CogUnixConfig (which is a CogVM, for Unix, in “release” mode), CogDebugUnixConfig, MTCogUnixConfig, StackInterpreterUnixConfig, StackInterpreterDebugUnixConfig. And it is the same for the rest of the OSes: CogMsWindowsConfig, StackInterpreterMsWindowsConfig, MTCogMsWindowsConfig, etc. So, your homework: browse the categories ‘CMakeVMMaker-Windows’, ‘CMakeVMMaker-Unix’ and ‘CMakeVMMaker-IOS’. Look at the available classes. To learn, check the implementors of #compilerFlags, #defaultInternalPlugins, #interpreterClass, etc. To test, take the debug variant, follow the same procedure as always, and you will compile a debug VM with all the debugging symbols and no optimization 🙂

Which one should you use? I have already answered that, but suppose you want the “last one”: then they are MTCocoaIOSCogJitConfig, MTCogUnixConfig and MTCogMsWindowsConfig. It doesn’t matter which configuration you choose; all you normally need to do is send it #generateWithSources.
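If you script your builds, the choice of the “last one” per OS can be written down as a small lookup. This is just a sketch under my own assumptions: the function name config_for_os is hypothetical, the uname patterns are my guess at what each platform reports, and everything that is neither Mac nor Windows falls back to the Unix configuration (so a BSD user would still have to pick CogFreeBSDConfig by hand).

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -s` string to the multi-threaded
# configuration class named in this post for that OS.
# The uname patterns below are assumptions, not taken from CMakeVMMaker.
config_for_os() {
    case "$1" in
        Darwin)        echo "MTCocoaIOSCogJitConfig" ;;
        MINGW*|MSYS*)  echo "MTCogMsWindowsConfig" ;;
        *)             echo "MTCogUnixConfig" ;;
    esac
}

# Print the expression to evaluate in the Pharo image for this machine.
config=$(config_for_os "$(uname -s)")
echo "Evaluate in the image: $config generateWithSources."
```

The script only prints the Smalltalk expression; you still evaluate it in the image yourself.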

This design decision has a couple of advantages from my point of view:

It is extremely easy to customize. In fact, there are already examples: CogUnixNoGLConfig (which doesn’t link against OpenGL, so it works perfectly unless you use the Balloon3D or Croquet plugins), CogFreeBSDConfig (specifically for BSD, since it has a couple of differences in the compiler flags), etc.
YOU can subclass and change what you want: default internal or external plugins, compiler flags, etc.
It is easy for a continuous integration server like Hudson to build different targets.

Customizing CMakeVMMaker configurations

I told you that you can subclass a specific class and override the compiler flags, the default plugins, whether they should be internal or external, etc. However, CMakeVMMaker configurations can also be parametrized in several ways when you use them. In the building instructions at the beginning of this post, I told you to move your Pharo image to blessed/image. As I explained in the previous post, that was in order to let CMakeVMMaker take the default directories and make it work out of the box. But in fact, it is not necessary at all to move your image. You can download the “platforms code” to one place and keep the image elsewhere. Notice that these changes (the ability to customize each directory) have been committed in new versions of the CMakeVMMaker package. So, if you really want to try the following code, make sure you have CMakeVMMaker-MarianoMartinezPeck.94. You can get it using the Monticello Browser or Gofer.

So, you can do something like this:

"The image where this code is being run can be in ANY place"
MTCocoaIOSCogJitConfig new
srcDir: '/Users/mariano/Pharo/generateCode/src';
platformsDir: '/Users/mariano/Pharo/vm/git/cogVM2/blessed/platforms';
buildDir: '/Users/mariano/Pharo/vms/build';
"The resources directory is only needed for Mac"
resourcesDir: '/Users/mariano/Pharo/vm/git/cogVM2/blessed/macbuild/resources';
outputDir: '/Users/mariano/binaries/results';
generateSources;
generate.

The “platformsDir” must point to the “platforms” directory that we downloaded with Git; it cannot be chosen arbitrarily. The same goes for the “resourcesDir” (which in fact is only for Mac). The rest of the directories (src, build and output) are created neither by VMMaker nor by Git. They are just directories that I created myself and want to use instead of the defaults.

And I’ve created this shortcut also:

"The image where this code is being run can be in ANY place"
MTCocoaIOSCogJitDebugConfig new
defaultDirectoriesFromGitDir: '/Users/mariano/Pharo/vm/git/cogVM1/blessed';
generateSources;
generate.

That way, I don’t need to move my image to blessed/image. BTW, don’t try this with the Windows configurations because there is still a problem. Anyway, apart from that we can also customize things using #internalPlugins:, #externalPlugins, etc.

Synchronization between platform code (Git) and VMMaker

In this post, I told you about the problems I have seen so far with the “process” of the Interpreter VM + SVN for platform code. I also told you how the new process (CMake + Git) helps with some of those problems. From my point of view, a couple of things have improved the process:

Platform code and VMMaker are kept in sync: when people (Igor, Esteban, etc) commit a new version to Git, they make sure that the VMMaker part is working.
Documentation of that synchronization: in the previous post, I told you to load version ‘1.5’ of ConfigurationOfCog. Suppose I hadn’t told you that: how would you know, for a certain Git version, which version of ConfigurationOfCog you should use? Check blessed/codegen-scripts/LoadVMMaker.st and you have exactly the piece of code you should execute to get the VMMaker that works with that specific version of the Git repository. So…this means that when someone commits to the Git repository and the changes require a new VMMaker version, that developer needs to create a new version of ConfigurationOfCog and modify LoadVMMaker.st. Now that you know this, the steps I told you at the beginning of this post can be automated, can’t they? Did someone say Uncle Hudson? Yes, of course!!
Git makes things easier in that people can fork, hack, experiment, test, and then push changes into the blessed repository.

Hudson for building VMs

Pharo has a continuous integration server running Hudson: http://ci.pharo-project.org/. And as you can see here, there are a lot of targets for CogVMs. Basically, for every single commit in Git, Hudson builds all those targets. How? Following nearly the same steps I told you at the beginning of this post. It creates StackVMs, CogVMs and CogMTVMs for every OS. In fact, there are no Windows builds yet because the Windows slave is only arriving this week. But the configurations and the procedure are working…so it is just a matter of getting the Windows box.

Conclusion: you don’t need to wait a year and a half to get a VM with a bug fix, nor do you need to compile it yourself. With Hudson, VMs are built for every commit.

Hudson traceability

We saw how we can trace from platform code to VMMaker. Now, how do we know how each Hudson VM was built? Easy:

Go to http://ci.pharo-project.org
Choose a target in the “Cog” tab. For example, I choose “Mac Cog Cocoa”
Follow the link, for example Cog Unix, and there you can see two artifacts:

a built VM
a source code tarball, which is used to build that VM (in this example, CocoaIOSCogJitConfig-sources.tar.gz)

If you download the source code archive and unpack it into a local directory, what would you expect?? Of course, a copy of the Git directory plus the Pharo image used to build that VM. The image is in the build/ subdirectory, is called generator.image, and is the one that was used to generate the source code (located in the src/ subdirectory) and the CMake configuration files (located in the build/ subdirectory). Isn’t this cool?
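To double check an unpacked archive, a few lines of shell are enough. This is a minimal sketch that assumes only the layout described in the paragraph above (a src/ directory and build/generator.image); the function name check_sources_layout is mine, and the directory you pass is wherever you unpacked the tarball.

```shell
#!/bin/sh
# Hypothetical helper: verify that a directory looks like an unpacked
# Hudson source archive as described above, i.e. it contains src/
# (the generated sources) and build/generator.image (the image that
# generated them). Prints "layout ok" and returns 0 on success.
check_sources_layout() {
    dir=$1
    [ -d "$dir/src" ] || { echo "missing $dir/src"; return 1; }
    [ -f "$dir/build/generator.image" ] || {
        echo "missing $dir/build/generator.image"
        return 1
    }
    echo "layout ok"
}

# Usage (the path is a placeholder for your own unpack directory):
#   check_sources_layout /path/to/unpacked/archive
```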

CMake generators

Did I already tell you that I am also a CMake newbie? Ok…just in case 😉 Anyway, think of CMake as a tool where we set things (parameters, variables, directories, etc) in some files, which in our case are auto-generated by CMakeVMMaker, and then from those “general” files we generate specific and different makefiles. So, from the same CMake files we can generate different kinds of makefiles, i.e., we can generate makefiles the way a given IDE expects them to be. CMake calls this ability “generators”. And the way to create makefiles with a specific generator is like this:

cmake -G "Generator Name"

Does that sound familiar?? Of course! We have already used them for MSYS in Windows. The cool thing is that there are generators for several IDEs. And this is just GREAT. For example, I can create makefiles and a project for XCode (the C IDE for MacOS). Just doing:


cmake -G Xcode

creates an Xcode project for the CogVM in /blessed/build/CogMTVM.xcodeproj. You have no idea how cool this is. It means you can open Xcode and everything is set up and working out of the box for the CogVM. You can put breakpoints, inspect C code, compile, debug, everything… Before, this was much more complicated because the .xcodeproj file was versioned in SVN, and this file usually stores some file locations or things like that; in my experience, it was always a pain to make it work.

When you use a generator for an IDE (like Xcode, Eclipse, KDevelop, Visual Studio, etc.), you usually don’t run “make” by hand. So, after invoking cmake, you won’t need to do a make. Instead, you compile from the IDE itself (which should have the correct makefiles).

How do you know which are the available generators? just type:

cmake --help

and at the end you’ll find a section that says “The following generators are available on this platform:” and each of them has a name and a description. What you need to pass to the -G parameter is the name. Notice that as the help says, it automatically shows the generators available in YOUR platform (OS). Some examples:

cmake -G KDevelop3
cmake -G "Eclipse CDT4 - Unix Makefiles"
cmake -G "Visual Studio 10"
cmake -G "Borland Makefiles"

When the name includes more than one word you must use the double quotes.
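Since the generator list sits at the very end of a long help text, a tiny filter makes it easier to read. The following is a sketch of mine, assuming only that the help output contains the heading line quoted above; the function name generators_section is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `cmake --help` on stdin and print only the
# part starting at the "The following generators are available on this
# platform" heading, i.e. the generator names and descriptions.
generators_section() {
    sed -n '/^The following generators are available on this platform/,$p'
}

# Only call cmake when it is actually installed.
if command -v cmake >/dev/null 2>&1; then
    cmake --help | generators_section
fi
```

Whatever name you pick from that list is what goes after -G, quoted if it contains spaces.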

So…the 2 main advantages I see that CMake brings to our process are: cross compiling, and being able to automatically create makefiles for IDEs. Sorry, I couldn’t try any IDE other than Xcode. If you try one and it works, let me know 🙂

In the next post we will see how to debug the VM and some related tricks. After that post, we will probably start looking at the VM internals, since you will already have all the needed tools.

8 thoughts on “Building the VM – Second Part”

Igor Stasenko says:

May 6, 2011 at 3:58 am

Mariano you did really good job, that you spent time documenting a process, and also throughoutly checked every step and gave feedback & fixes to what me & Esteban did so far.
And after your refactorings i feel that CMakeVMMaker even better than before 🙂


1. marianopeck says:
  
  May 6, 2011 at 11:01 am
  
  Thanks Igor for your nice words. And I have to thank you not only for developing all the CMakeVMMakers stuff, gitorious port, hudson builds, etc, but also for answering all my questions in the mailing list 🙂
  Yes, now with some refactorings CMakeVMMaker looks a bit better!
  