Monthly Archives: August 2011

Boosting SandstoneDB with Fuel

Hi guys. I continue to find nice Fuel examples to show at ESUG, and while doing so I post my results 😉 . They may be useful for someone other than me. SandstoneDB is a simple object database, similar in style to Ruby’s ActiveRecord, that uses (by default) SmartRefStream to serialize clusters of objects to disk. It seems that there is currently also a GOODS database backend.

For more details about SandstoneDB please read: http://onsmalltalk.com/sandstonedb-simple-activerecord-style-persistence-in-squeak. There is also an excellent screencast about it in PharoCasts.

In summary, SandstoneDB is a simple database which I think may be useful for small/medium applications. Anyway… the goal of this short post is to show you how to speed up SandstoneDB by using the Fuel serializer instead of SmartRefStream. Ramon Leon, the author of SandstoneDB, did the integration between the two. It was pretty straightforward and worked out of the box in less than one hour. It was only necessary to create the class SDFuelSerializer with two methods: #serialize: anObject toFile: aFile and #materializeFromFile: aFile.
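To give an idea of how small such a serializer can be, here is a minimal sketch of a class with those two methods. It is not Ramon's actual code: the class name SketchFuelSerializer, its superclass and its category are hypothetical, and it assumes that aFile is a file name (a String). The "Class >> selector" headers are just the usual way of showing which class a method belongs to; in the image you would add each method through the browser.

```smalltalk
"Hypothetical stand-in, NOT the real SDFuelSerializer (that one ships in the SandstoneDbFuel package)."
Object subclass: #SketchFuelSerializer
	instanceVariableNames: ''
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Sketch-SandstoneDbFuel'.

SketchFuelSerializer >> serialize: anObject toFile: aFile
	"Assumption: aFile is a file name. Serialize the graph to bytes, then write them to disk."
	| bytes |
	bytes := FLSerializer serializeInMemory: anObject.
	FileStream forceNewFileNamed: aFile do: [ :stream |
		stream binary.
		stream nextPutAll: bytes ]

SketchFuelSerializer >> materializeFromFile: aFile
	"Read the bytes back from disk and rebuild the object graph."
	| bytes |
	bytes := FileStream readOnlyFileNamed: aFile do: [ :stream |
		stream binary.
		stream contents ].
	^ FLMaterializer materializeFromByteArray: bytes
```

Going through an in-memory ByteArray keeps the sketch limited to the two Fuel messages used elsewhere on this blog; a real implementation would more likely let Fuel write to the file stream directly.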

So….how to install and use SandstoneDB with Fuel? Quite easy in fact:

1) Grab a Pharo image and install Fuel. Alternatively, you can download a ready-made image from Jenkins: https://ci.lille.inria.fr/pharo/job/Fuel/. Anyway, if you want to take a Pharo image and install Fuel in it, just execute:

Gofer new
	squeaksource: 'MetacelloRepository';
	package: 'ConfigurationOfFuel';
	load.

((Smalltalk at: #ConfigurationOfFuel) project version: '1.6') load.

2) Install SandstoneDB and its integration with Fuel:

Gofer new
	squeaksource: 'SandstoneDb';
	package: 'SandstoneDb';
	package: 'SandstoneDbTests';
	package: 'SandstoneDbFuel';
	load.

3) If you run the SandstoneDB tests now, you will still be using SmartRefStream instead of Fuel, even though we have loaded the package “SandstoneDbFuel”. So if you want to use Fuel as the default serializer you have to evaluate:

SDFileStore serializer: SDFuelSerializer new.

4) That’s all 🙂 You can now run all the SandstoneDB tests; they should all be using Fuel and they should all be green 🙂

Now, how much speed-up can you expect? Well… that depends on the graph you are serializing. The bigger the graph to serialize, the bigger Fuel's advantage over SmartRefStream. If the graph is large, Fuel can be around 10x faster than SmartRefStream. If it is small, much less. The following is a benchmark where we serialize 500 persons individually. Each person is small, hence the difference is not that big. Nevertheless, in this test, Fuel is 2.5 times faster in serialization and 1.5 times faster in materialization.


| commitTime people lookupTime loadTime |
"Pick the serializer to benchmark: uncomment the next line (and comment the one after it) to measure SmartRefStream instead."
"SDFileStore serializer: SDSmartRefStreamSerializer new."
SDFileStore serializer: SDFuelSerializer new.
SDActiveRecord
	setStore: SDFileStore new;
	warmUpAllClasses.
"only want to warm up test models, not anything else that might be in this image"
SDFooObject warmUp.
SDPersonMock withAllSubclasses do: [ :each | each warmUp ].

"Serialization: save 500 small persons one by one."
people := (1 to: 500) collect: [ :it | SDPersonMock testPerson ].
commitTime := [ people do: [ :each | each save ] ] timeToRun.
lookupTime := [ people do: [ :each | SDPersonMock atId: each id ] ] timeToRun.
"Materialization: reset the store and load everything back from disk."
loadTime := [
	SDActiveRecord resetStoreForLoad.
	SDPersonMock
		withAllSubclassesDo: [ :each | SDActiveRecord store ensureForClass: each ];
		withAllSubclassesDo: [ :each | each warmUp ].
	SDActiveRecord store ensureForClass: SDFooObject.
	SDFooObject warmUp ] timeToRun.
Transcript
	show: 'Serialization time: ', commitTime asString;
	cr;
	show: 'Materialization time: ', loadTime asString;
	cr;
	cr.
"Clean up the records created by the benchmark."
SDPersonMock do: [ :each | [ each delete ] on: Error do: [ ] ].
SDPersonMock coolDown.
SDFooObject do: [ :each | [ each delete ] on: Error do: [ ] ].
SDPersonMock allSubclassesDo: [ :each | each coolDown ].
Smalltalk garbageCollect
 

If we take the same example but, instead of serializing each person individually, we serialize a single object that contains all those persons, Fuel is 12x faster in serialization and 6x faster in materialization.
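To get a feeling for the difference between "many small graphs" and "one big graph" without setting up SandstoneDB, you can try something like the following in a workspace. It is only a rough sketch: it serializes in memory (no disk, no SandstoneDB), and the "persons" are hypothetical stand-ins built from plain Associations, so the numbers will not match the ones above.

```smalltalk
| people oneByOne asOneGraph |
"Hypothetical stand-ins for persons: 500 small name -> id associations."
people := (1 to: 500) collect: [ :i | ('person-', i printString) -> i ].

"Serialize each small object on its own, as in the first benchmark."
oneByOne := [ people do: [ :each | FLSerializer serializeInMemory: each ] ] timeToRun.

"Serialize the whole collection as a single object graph."
asOneGraph := [ FLSerializer serializeInMemory: people ] timeToRun.

Transcript
	show: 'One by one: ', oneByOne printString;
	cr;
	show: 'As one graph: ', asOneGraph printString;
	cr.
```

With only 500 tiny objects the timings may be too small to compare reliably, so consider increasing the count or repeating each block several times.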

Anyway… all this was just a proof of concept. All I can tell you is that all tests are green and that Fuel seems to speed up SandstoneDB. If you want to give it a try, please go ahead. And if you already have an application running with SandstoneDB and SmartRefStream, I would appreciate it if you could benchmark both, since it would be interesting to see a real case.

See you.


When Fuel meets Riak NoSQL database

Hi guys. I was fed up with writing a paper, so I started to think about what to show with Fuel at the ESUG Awards. As explained in my previous post, Fuel is a new, fast, general-purpose object graph serializer for Pharo.

The first thing that came to my mind was the following: debug something from the workspace and, in the middle of the debugging session, serialize the debugger into a file. Then open another image, materialize the debugger, and continue debugging. The good news is that it worked on the first try! Nice! I will show it at ESUG.

Now, the second thing was something that had been in my head for a couple of months: use Fuel to get a byte array representation of an object graph and store/retrieve it from a NoSQL database. Of course, there are several NoSQL databases, and even several wrappers for Pharo. For my experiments I chose Riak because I read it may give the best results. Runar Jordahl did the binding for this database and, as in many cases, the API uses the HTTP protocol. Thanks to Sven Van Caekenberghe, we now have a wonderful HTTP library in Pharo Smalltalk: Zinc. Obviously, the Riak client for Pharo uses Zinc.

Why is Riak important and how is it related to Fuel? Most web apps are read-intensive. Fuel is designed from scratch to be extremely fast during materialization. And Fuel fits nicely with a NoSQL database like Riak, since Riak reads fast as well. Minimizing latency on web pages is really important. With Riak (commercial version) you can host your database in many datacenters, so you can have datacenters close to all your customers. So… reading from Riak is fast, and materializing with Fuel is also fast. And now in Pharo we have a fast VM as well. 🙂

Ok….I should stop talking and show you the code. I will show you how to store and read a simple object graph into a Riak database using Fuel. Steps:

1) Install Riak database server. You can download it from here: http://downloads.basho.com/riak/CURRENT/. For more information about how to install and setup a Riak database you can read: http://wiki.basho.com/Installation-and-Setup.html

On my Mac, I just unzipped the file and that was all. Let’s say the directory where I unzipped it is /home/mariano/riak.

2) Start the server. On my Mac I went to the directory /home/mariano/riak/bin. There you will find the executable riak. So, just run “./riak start”.

3) Check that the server is running. Just execute “./riak ping” and you should get a “pong”. Another possibility is to open http://localhost:8098/riak/ or http://localhost:8098/ in your web browser, where you should see a response from the server.

4) Ok… cool. Now, grab a Pharo image and install Fuel. Of course, if you are lazy like me, you can download an image from our greatest tester/slave, Jenkins: https://ci.lille.inria.fr/pharo/job/Fuel/. Anyway, if you want to take a Pharo image and install Fuel in it, just execute:

Gofer new
	squeaksource: 'Fuel';
	package: 'ConfigurationOfFuel';
	load.
((Smalltalk at: #ConfigurationOfFuel) project latestVersion) load.

5) Install Riak client by executing:

Gofer new
	squeaksource: 'EpigentRiakInterface';
	package: 'ConfigurationOfEpigentRiakInterface';
	load.

((Smalltalk at: #ConfigurationOfEpigentRiakInterface) project version: '0.2') load.

6) Since in step 2) we started our server locally, our tests should use the local server. To make this work, you have two options: add a mapping between ‘riaktest’ and localhost in your “hosts” file or, even simpler, just modify the method #riakTestUrlString to something like:

riakTestUrlString
	^ 'http://localhost:8098/riak'

7) Now you should be able to run Riak tests and they should be green.  Of course, Fuel tests should be green as well.

Ok. Once you have followed all those steps, you should have a Riak database running locally and a Riak client in Pharo that uses Zinc and is able to talk to the database. So… how do you store stuff in a NoSQL database? Basically, data is stored as key/value pairs. The key is usually a String and the value can be a string or something binary. You may have heard that NoSQL databases can be used as a persistence strategy for enterprise applications. But how can that be possible if Riak only accepts strings or bytes? Well, here is where the serializer plays its role. One possible solution is to use a JSON or XML serializer, in which case you take your domain-specific object graph and serialize it to text, which is then stored in the NoSQL database.

But with Fuel, we can serialize an object graph in a binary way, and we are able to serialize and materialize it really fast. Here is an example of how you can use Fuel with Riak:

testFuelWithRiak
	| bucket storedObject materializedObject aRectangle client |
	client := EpigentRiakClient newForRestConnection: 'http://localhost:8098/riak'.
	aRectangle := Rectangle origin: (4@2) corner: (7@1).
	storedObject := FLSerializer serializeInMemory: aRectangle.
	bucket := client bucketNamed: 'epigenttest'.
	bucket atKey: 'test' put: storedObject type: 'application/binary'.
	self assert: (bucket atKey: 'test') = storedObject.
	materializedObject := FLMaterializer materializeFromByteArray: storedObject.
	self assert: materializedObject origin = (4@2).
	self assert: materializedObject corner = (7@1).

Basically, we use Fuel to serialize an object graph into a ByteArray. Then we store that ByteArray as a blob in the database. And of course, we can then read it back and materialize it. What do we gain with this? Now you are able to take any object graph and store/retrieve it from a Riak database. Just replace “aRectangle” with whatever object you want. If you are developing a real application, this object may be your “root”, which is usually a dictionary.
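As an illustration of the “root” idea, here is a workspace sketch that stores a small dictionary instead of a rectangle and then rebuilds it from the bytes read back from Riak. It only uses the messages from the test above; the key 'root' and the contents of the dictionary are made up for the example, and it assumes the local server from step 2) is running.

```smalltalk
| client bucket root bytes loaded |
client := EpigentRiakClient newForRestConnection: 'http://localhost:8098/riak'.
bucket := client bucketNamed: 'epigenttest'.

"Hypothetical application root: a dictionary that holds the domain objects."
root := Dictionary new.
root at: #customers put: (OrderedCollection with: 'Alice' with: 'Bob').
root at: #bounds put: (Rectangle origin: (0@0) corner: (10@10)).

"Serialize the whole graph reachable from the root and store it under one key."
bytes := FLSerializer serializeInMemory: root.
bucket atKey: 'root' put: bytes type: 'application/binary'.

"Read the bytes back from Riak and rebuild the graph."
loaded := FLMaterializer materializeFromByteArray: (bucket atKey: 'root').
(loaded at: #customers) size
```

Note that this stores the whole graph under a single key, so every save rewrites everything. For a real application you would probably want to split the graph over several keys.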

So….since I am a simple PhD student, I’ve already done my job: experiment. Now it is your time to get something real from this 😉

See you


My talks at ESUG 2011

Hi guys. This is a short post to just let you know what I’ll be talking about at ESUG this year in Edinburgh, Scotland. I’ll give a total of 3 talks which are the following:

1) Fuel: Boosting your serialization.

Fuel is an open-source general-purpose framework to serialize and materialize object graphs, based on a pickling algorithm. Fuel is implemented in the Pharo Smalltalk environment, and we demonstrate that we can build a really fast serializer without specific VM support, with a clean object-oriented design, while providing most of the features required of a serializer.

For more details please visit Fuel website: http://rmod.lille.inria.fr/web/pier/software/Fuel

I will give an introduction to Fuel, explaining its pickle format and showing some examples of how to use the serializer. Ahhh, at the same time, I am submitting Fuel to the Innovation Technology Awards 🙂

I participate in Fuel in several ways:

– Fuel is being sponsored by ESUG SummerTalk and I am the mentor. The student and main developer is Martin Dias.

– I contribute code and tests.

– For my research/PhD prototype, Marea, I use Fuel as the serializer.

2) Ghost:  A Uniform, Light-weight and Stratified Proxy Implementation

This talk is about the paper we wrote about a novel proxy implementation. I’ve developed Ghost proxies in Pharo Smalltalk and it is part of my PhD.

This toolbox provides low memory consuming proxies for regular objects as well as for classes and methods. Ghost proxies let us intercept all messages sent to a proxy, with clear separation between the layers of intercepting and handling interceptions. Ghost is stratified and does not use the common yet problematic #doesNotUnderstand: for implementing proxies.
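For contrast, the traditional #doesNotUnderstand: approach that Ghost avoids can be sketched like this. To be clear, this is not Ghost code: the class SketchDnuProxy and its methods are hypothetical, written only to show the common technique and why it is problematic. The "Class >> selector" headers just indicate where each method lives.

```smalltalk
"Hypothetical DNU-based proxy, shown only for contrast; this is NOT how Ghost works."
ProtoObject subclass: #SketchDnuProxy
	instanceVariableNames: 'target'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Sketch-Proxies'.

SketchDnuProxy class >> on: anObject
	^ self basicNew setTarget: anObject

SketchDnuProxy >> setTarget: anObject
	target := anObject

SketchDnuProxy >> doesNotUnderstand: aMessage
	"Every message the proxy does not implement ends up here and is forwarded."
	^ target perform: aMessage selector withArguments: aMessage arguments

"Usage in a workspace:"
| proxy |
proxy := SketchDnuProxy on: 3.
proxy + 4.	"not understood by the proxy, so it is forwarded to 3"
proxy isNil.	"ProtoObject implements isNil, so the proxy answers it and the target never sees it"
```

The last line shows the weakness: only the messages the proxy does not understand are intercepted, so anything implemented in ProtoObject silently bypasses the interception.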

For more info please read: http://rmod.lille.inria.fr/web/pier/software/Marea/GhostProxies

3) DBXTalk: Argentinian Connection.

DBXTalk is a complete, open-source solution for relational database access. It has been in development for 4 years already and it includes the following tools:

  • OpenDBXDriver: this is the database driver and it wraps the C library OpenDBX. This subset of DBXTalk was formerly known as SqueakDBX.
  • GlorpDBX: this is a port of VisualWorks Glorp plus some changes to make it work with different database drivers such as OpenDBXDriver.
  • DBXTalkDescriptions: a tool to generate classes from tables and tables from classes, based on Glorp types. It can also generate a GlorpConfiguration from Magritte descriptions, and it comes with a UI to make it easy to use.
  • DBXPlus: a UI to visualize and browse databases without leaving the image. You can execute SQL queries and review the results, browse the defined tables and views, and inspect both the structure and the stored data.

Guillermo Polito, Santiago Bragagnolo and I (all part of the team) will give you an introduction to all the tools of the DBXTalk suite while showing real examples.

For DBXTalk, well… I am one of the authors. I have been actively developing SqueakDBX/DBXTalk, and it is the open-source project I have spent the most time on apart from Pharo. DBXTalk is used in several production applications and it already has its own list of users.

For more info, check the links:

http://www.esug.org/wiki/pier/Conferences/2011/Schedule-And-Talks/DBXTalk

http://www.squeakdbx.org/

The full schedule of the talks is on the ESUG 2011 conference site.

That’s all for today.