Hi folks. I guess some readers do not enjoy the whole building part and would rather jump straight to the VM internals. But it is really important that you understand how to change the VM, compile it, and even debug it. Otherwise, you’ll be very limited.
This post is mostly about a couple of things that I wanted to mention in the previous post but couldn’t, because it was already too long. If you read that post, you may think that compiling the VM from scratch takes a lot of work and steps. But the post was long because of my explanations and my efforts to make it reproducible. This is why I would like to summarize how to compile the VM.
Summary of VM build
mkdir newCog
cd newCog
git clone --depth 1 git://gitorious.org/cogvm/blessed.git
cd blessed/image
wget --no-check-certificate http://www.pharo-project.org/pharo-download/unstable-core
# Or we manually download with an Internet Browser the latest PharoCore image from that URL and we put it in blessed/image
Then open the image and evaluate:
Deprecation raiseWarning: false.
Gofer new
    squeaksource: 'MetacelloRepository';
    package: 'ConfigurationOfCog';
    load.
(Smalltalk at: #ConfigurationOfCog) project latestVersion load.
"Notice that even loading CMakeVMMaker is not necessary anymore since it is included just as another dependency in ConfigurationOfCog"
MTCocoaIOSCogJitConfig generateWithSources.
"Replace this CMakeVMMaker configuration class with the one that suits your OS, like CogUnixConfig or CogMsWindowsConfig"
Now, come back to the terminal and do:
cd newCog/blessed/build
cmake .    # Or cmake . -G"MSYS Makefiles" if you are in Windows
make
And that’s all. In “blessed/results” (on Windows it should be under “blessed/build/results”) you should have the CogVM binary. I know you are probably lazy, but if you really want to get the most out of my posts, I strongly recommend that you follow those steps. Throughout this series of posts, we will debug and modify the VM (change the GC, the method lookup, create our own primitives and plugins, etc). Once you have Git and CMake, I promise the process takes less than 5 minutes.
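Before starting, it can save some time to check that the command-line tools used in the summary are installed. The following is only a minimal sketch and not part of the Cog sources: `missing_tools` is a hypothetical helper name, and the tool list in the usage comment is simply what the steps above invoke.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Cog sources): print every tool
# from the argument list that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only when the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage: report what still has to be installed before following the steps.
# missing_tools git wget cmake make
```

If the function prints nothing, every tool was found.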
Remember that all these posts are part of what I called “Journey through the VM”, so we will probably go back and forth between different posts 🙂 In the first post, under the title “CogVM and current status”, I explained the different flavors of CogVMs and their main features:
1. Real and optimized block closure implementation. This is why from the image side blocks are now instances of BlockClosure instead of BlockContext.
2. Context-to-stack mapping.
3. JIT (just-in-time compiler) that translates Smalltalk compiled methods to machine code.
4. PIC (polymorphic inline caching).
What is the big difference between StackVM and CogVM? Well, StackVM implements 1) and 2). CogVM is built on top of StackVM and adds 3) and 4). Finally, there is CogMTVM, which is built on top of CogVM and adds multi-threading support for external calls (FFI, for example).
In addition, Cog also brings some refactorings. For example, in the Interpreter VM, Interpreter was a subclass of ObjectMemory. That was necessary in order to translate easily to C. In Cog, there are new classes like CoInterpreter and NewObjectMemory. But the good news is that we can have composition!! The CoInterpreter (which is a new class from Cog) has an instance variable that holds the object memory (in this case an instance of NewObjectMemory). This was awesome and required changes in the Slang to C translator.
As said, in the VMMaker part of the VM, what we called the “core”, there are mainly two important classes: Interpreter and ObjectMemory. Read the first post for details of their responsibilities. In Cog, there are a couple of differences:
- As said, the Cog Interpreter class does not subclass ObjectMemory; instead, the object memory is an instance variable.
- In Cog there isn’t only one Interpreter class like in the old VM. In fact, each of the Cog VMs I told you about (StackVM, CogVM, CogMTVM) has its own Interpreter class (StackInterpreter, CoInterpreter and CoInterpreterMT). Come on!! Don’t be lazy, take your image and browse them 🙂
- In Cog, there are not only the Interpreter classes I have already mentioned, but also several more that exist for design reasons, i.e., they are not Interpreter classes that should be used for compiling the VM. They exist, for example, to reuse code or to simulate them better. Examples: CoInterpreterPrimitives, StackInterpreterPrimitives, InterpreterPrimitives, etc. And then, of course, we have the Interpreter simulators, but that’s another story for another post.
So…if you are paying attention to this blog you may be asking yourself which Interpreter class you should use. My advice, and this is only my advice, is that you should normally use the “last one”. In this case, the CogMTVM. The few reasons I find not to use the last one are:
- If you are running on hardware where the Cog JIT is not supported. For example, on the iPhone the StackVM is usually used.
- When you are doing hacky things with the VM and you want to be sure there are no problems with the JIT, PIC, etc. This is my case…
- Maybe for learning purposes the CogVM or CogMTVM is far more complicated than the StackVM or InterpreterVM.
- The “last one” may not be the most stable one. So if you have a production application you may want to deploy with a CogVM rather than a CogMTVM that has been released just now.
But apart from that, you will probably use the “last one” available. Just to finish this little section, I leave you with a screenshot of part of the Cog VMs hierarchy.
CMakeVMMaker available configurations
In the previous post we saw what CMakeVMMaker configurations do: 1) generate VM sources from VMMaker and 2) generate CMake files. 1) depends on which Cog (StackVM, CogVM or CogMTVM) we want to build, which plugins, etc. And 2) depends not only on which CogVM but also on the OS (the CMake files are not the same for each operating system) and on other things, like whether we are compiling in debug mode or not, whether we are using the Carbon or the Cocoa library on Mac, etc. So…imagine the combinations of: which CogVM, which OS, and whether debug mode or not. It gives us a lot of possibilities 🙂
The design decision to solve this in the CMakeVMMaker project was to create specific “configuration” classes. To summarize, there is at least one class per VM/OS combination. So you have, for example, CogUnixConfig (which is a CogVM, for Unix and “release”), CogDebugUnixConfig, MTCogUnixConfig, StackInterpreterUnixConfig, StackInterpreterDebugUnixConfig. And it is the same for the rest of the operating systems: CogMsWindowsConfig, StackInterpreterMsWindowsConfig, MTCogMsWindowsConfig, etc. So, your homework: browse the categories ‘CMakeVMMaker-Windows’, ‘CMakeVMMaker-Unix’ and ‘CMakeVMMaker-IOS’. Look at the available classes. To learn, check the implementors of #compilerFlags, #defaultInternalPlugins, #interpreterClass, etc. To test, take the debug variant, follow the same procedure as always, and you will compile a debug VM with all the debugging symbols and no optimization 🙂
Which one should you use? I have already answered, but imagine you want the “last one”: then they are MTCocoaIOSCogJitConfig, MTCogUnixConfig and MTCogMsWindowsConfig. It doesn’t matter which configuration you choose; normally all you need to do is send it #generateWithSources.
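If you switch between machines, a tiny shell helper can remind you which “last one” configuration class goes with which OS. This is just a sketch under assumptions: `latest_config_for` and `generate_expression_for` are hypothetical names, the argument is expected to be the output of `uname -s`, and anything that is neither Mac nor an MSYS/MinGW shell is treated as Unix (which ignores special cases such as CogFreeBSDConfig).

```shell
# Hypothetical helper: map the output of `uname -s` to the multi-threaded
# ("last one") configuration class named in this post.
latest_config_for() {
    case "$1" in
        Darwin)        echo MTCocoaIOSCogJitConfig ;;
        MINGW*|MSYS*)  echo MTCogMsWindowsConfig ;;
        *)             echo MTCogUnixConfig ;;   # assumption: everything else is Unix
    esac
}

# Print the Smalltalk expression to evaluate in the image for a given OS.
generate_expression_for() {
    printf '%s generateWithSources.\n' "$(latest_config_for "$1")"
}

# Usage: generate_expression_for "$(uname -s)"
```

The printed line is what you would then paste into a workspace in the image.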
This design decision has a couple of advantages from my point of view:
- It is extremely easy to customize. In fact, there are already examples: CogUnixNoGLConfig (which doesn’t link against OpenGL, so it works perfectly unless you use the Balloon3D or Croquet plugins), CogFreeBSDConfig (made specifically for BSD, since it has a couple of differences in the compiler flags), etc.
- YOU can subclass and change what you want: default internal or external plugins, compiler flags, etc.
- It is easy for a continuous integration server like Hudson to build different targets.
Customizing CMakeVMMaker configurations
I told you that you can subclass a specific class and override the compiler flags, the default plugins, whether they should be internal or external, etc. However, CMakeVMMaker configurations can also be parametrized in several ways when you use them. In the building instructions at the beginning of this post, I told you to move your Pharo image to blessed/image. And as I explained in the previous post, that was in order to let CMakeVMMaker use the default directories and work out of the box. But in fact, it is not necessary at all to move your image. You can download the “platforms code” to one place and keep the image elsewhere. Notice that these changes (the ability to customize each directory) have been committed in new versions of the CMakeVMMaker package. So, if you really want to try the following code, make sure you have CMakeVMMaker-MarianoMartinezPeck.94. You can get it using Monticello Browser or Gofer.
So, you can do something like this:
"The image where this code is being run can be in ANY place"
MTCocoaIOSCogJitConfig new
    srcDir: '/Users/mariano/Pharo/generateCode/src';
    platformsDir: '/Users/mariano/Pharo/vm/git/cogVM2/blessed/platforms';
    buildDir: '/Users/mariano/Pharo/vms/build';
    "The resources directory is only needed for Mac"
    resourcesDir: '/Users/mariano/Pharo/vm/git/cogVM2/blessed/macbuild/resources';
    outputDir: '/Users/mariano/binaries/results';
    generateSources;
    generate.
The “platformsDir” must point to the “platforms” directory that we downloaded with Git; it cannot be chosen arbitrarily. The same goes for the “resourcesDir” (which in fact is only for Mac). The rest of the directories (src, build and output) are created neither by VMMaker nor by Git. They are just directories that I created on my own and that I want to use instead of the defaults.
And I’ve also created this shortcut:
"The image where this code is being run can be in ANY place"
MTCocoaIOSCogJitDebugConfig new
    defaultDirectoriesFromGitDir: '/Users/mariano/Pharo/vm/git/cogVM1/blessed';
    generateSources;
    generate.
That way, I don’t need to move my image to blessed/image. BTW, don’t try this with the Windows configurations because there is still a problem. Anyway, apart from that, we can also customize things using #internalPlugins:, #externalPlugins:, etc.
Synchronization between platform code (Git) and VMMaker
In this post, I told you about the problems I have seen so far with the “process” of the Interpreter VM + SVN for platform code. And I also told you how this new process (CMake + Git) helps a bit with some of those problems. From my point of view there are a couple of things that have improved the process:
- Platform code and VMMaker are in sync: when people (Igor, Esteban, etc) commit a new version to Git, they make sure that the VMMaker part is working.
- Documentation of that synchronization: in the previous post, I told you to load version ‘1.5’ of ConfigurationOfCog. Suppose I hadn’t told you that: how would you know, for a certain Git version, which version of ConfigurationOfCog you should use? Check blessed/codegen-scripts/LoadVMMaker.st and you have exactly the piece of code you should execute to get the working VMMaker for that specific Git version. So…this means that when someone commits to the Git repository and the changes require a new VMMaker version, that developer needs to create a new version of ConfigurationOfCog and modify LoadVMMaker.st. Now that you know this, the steps I told you at the beginning of this post can be automated, can’t they? Did someone say uncle Hudson? Yes, of course!!
- Git makes it easier for people to fork, hack, experiment, test, and then push changes into the blessed repository.
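To see which loading code goes with the checkout you have on disk, you only need to print that file. The sketch below wraps this in a small shell function; `show_load_script` is a hypothetical name of my own, and the only thing taken from the repository is the codegen-scripts/LoadVMMaker.st path mentioned above.

```shell
# Hypothetical helper: print the VMMaker loading script that ships with a
# given checkout of the blessed repository.
show_load_script() {
    script="$1/codegen-scripts/LoadVMMaker.st"
    if [ -f "$script" ]; then
        cat "$script"
    else
        # Not a blessed checkout, or the file lives elsewhere in this version
        echo "no LoadVMMaker.st under $1" >&2
        return 1
    fi
}

# Usage: show_load_script newCog/blessed
```

The output is the Smalltalk code to evaluate in the image before generating the sources.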
Hudson for building VMs
Pharo has a continuous integration server with Hudson: http://ci.pharo-project.org/. And as you can see here, there are a lot of targets for CogVMs. Basically, for every single commit in Git, Hudson builds all those targets. How? Following nearly the same steps I told you at the beginning of this post. It creates StackVMs, CogVMs and CogMTVMs for every OS. In fact, there are no Windows builds yet because they are getting the Windows slave this week. But the configurations and the procedure are working…So it is just a matter of getting the Windows box.
Conclusion: you don’t need to wait a year and a half to get a VM with a bug fix, nor do you need to compile it yourself. With Hudson, VMs are built for every commit.
We saw how we can trace from platform code to VMMaker. Now, how do we know how each Hudson VM was built? Easy:
- Go to http://ci.pharo-project.org
- Choose a target in the “Cog” tab. For example, I choose “Mac Cog Cocoa”
- Follow the link, for example Cog Unix, and there you can see two artifacts:
- a built VM
- a source code tarball, which is used to build that VM (in this example, CocoaIOSCogJitConfig-sources.tar.gz)
If you download the source code archive and unpack it into a local directory, what would you expect?? Of course, a copy of the Git directory plus the Pharo image used to build that VM. That image is in the build/ subdirectory, it is called generator.image, and it was the one used to generate the source code (located in the src/ subdirectory) and the CMake configuration files (located in the build/ subdirectory). Isn’t this cool?
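After unpacking one of those archives, a quick check can confirm that you got the layout just described. This is a minimal sketch with assumptions: `check_unpacked_layout` is a hypothetical helper, and the usage comment assumes that the tarball unpacks into the current directory and that the build then works the same way as in the summary (cmake followed by make inside build/).

```shell
# Hypothetical helper: check that an unpacked Hudson source archive has the
# layout described above (generator image, generated sources, CMake files).
check_unpacked_layout() {
    root="$1"
    status=0
    for path in build/generator.image src build; do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, assuming the tarball unpacks into the current directory:
# tar xzf CocoaIOSCogJitConfig-sources.tar.gz
# check_unpacked_layout . && (cd build && cmake . && make)
```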
Did I already tell you that I am also a CMake newbie? Ok…just in case 😉 Anyway, think of CMake as a tool where we set things (parameters, variables, directories, etc) in some files, which in our case are auto-generated by CMakeVMMaker, and then from those “general” files we can generate specific and different makefiles. So, from the same CMake files we can generate different kinds of makefiles, i.e., we can generate makefiles the way some IDE expects them to be. CMake calls this ability “generators”. And the way to create makefiles with a specific generator is like this:
cmake -G "Generator Name"
Does that sound familiar?? Of course! We have already used one for MSYS on Windows. The cool thing is that there are generators for several IDEs. And this is just GREAT. For example, I can create makefiles and a project for Xcode (the C IDE for Mac OS) just by doing:
cmake -G Xcode
This creates an Xcode project for CogVM, which is in /blessed/build/CogMTVM.xcodeproj. You have no idea how cool this is. It means you can open Xcode and everything is set up and working out of the box for the CogVM. You can put breakpoints, inspect C code, compile, debug, everything…Before, this was much more complicated because the .xcodeproj file was versioned in SVN, and this file usually keeps some file locations or things like that, so in my experience it was always a pain to make it work.
When you use a particular generator for an IDE (like Xcode, Eclipse, KDevelop, Visual Studio, etc), you usually don’t run “make” by hand. So, after invoking cmake, you won’t need to do a make. Instead, you compile from the IDE itself (which should have the correct makefiles).
How do you know which generators are available? Just type:
cmake --help
and at the end you’ll find a section that says “The following generators are available on this platform:” and each of them has a name and a description. What you need to pass to the -G parameter is the name. Notice that as the help says, it automatically shows the generators available in YOUR platform (OS). Some examples:
cmake -G KDevelop3
cmake -G "Eclipse CDT4 - Unix Makefiles"
cmake -G "Visual Studio 10"
cmake -G "Borland Makefiles"
When the name includes more than one word you must use the double quotes.
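The two things above (finding the generator list and passing a quoted name to -G) can be wrapped in a couple of shell functions. Take this as a sketch only: both function names are hypothetical, the first one just filters the help text from standard input, and the second one assumes that `uname -s` starts with MINGW or MSYS inside an MSYS shell and falls back to CMake’s "Unix Makefiles" generator everywhere else.

```shell
# Hypothetical helper: keep only the generator section of the text that
# `cmake --help` prints, reading that text from standard input.
generators_section() {
    sed -n '/generators are available/,$p'
}

# Hypothetical helper: pick a makefile generator from `uname -s` output.
# "MSYS Makefiles" is the one used for Windows in this post; everything else
# falls back to "Unix Makefiles".
makefile_generator_for() {
    case "$1" in
        MINGW*|MSYS*) echo "MSYS Makefiles" ;;
        *)            echo "Unix Makefiles" ;;
    esac
}

# Usage:
# cmake --help | generators_section
# cmake . -G "$(makefile_generator_for "$(uname -s)")"
```

Note the double quotes around the command substitution in the last line: they keep a multi-word generator name together as a single argument.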
So…the 2 main advantages I see that CMake brings to our process are: cross compiling, and being able to automatically create makefiles for IDEs. Sorry I couldn’t try any IDE other than Xcode. If you try one and it works, let me know 🙂
In the next post we will see how to debug the VM and some related tricks. After that post, we will probably start to look at the VM internals, since you will already have all the needed tools.