What is a primitive?
Do you want to know the answer? Just do what we always do in Smalltalk: browse code 🙂 Open your image and browse the method #whatIsAPrimitive. You can read the following information there:
“Some messages in the system are responded to primitively. A primitive response is performed directly by the interpreter rather than by evaluating expressions in a method. The methods for these messages indicate the presence of a primitive response by including <primitive: xx> before the first expression in the method.
Primitives exist for several reasons. Certain basic or ‘primitive’ operations cannot be performed in any other way. Smalltalk without primitives can move values from one variable to another, but cannot add two SmallIntegers together. Many methods for arithmetic and comparison between numbers are primitives. Some primitives allow Smalltalk to communicate with I/O devices such as the disk, the display, and the keyboard. Some primitives exist only to make the system run faster; each does the same thing as a certain Smalltalk method, and its implementation as a primitive is optional.
When the Smalltalk interpreter begins to execute a method which specifies a primitive response, it tries to perform the primitive action and to return a result. If the routine in the interpreter for this primitive is successful, it will return a value and the expressions in the method will not be evaluated. If the primitive routine is not successful, the primitive ‘fails’, and the Smalltalk expressions in the method are executed instead. These expressions are evaluated as though the primitive routine had not been called.
The Smalltalk code that is evaluated when a primitive fails usually anticipates why that primitive might fail. If the primitive is optional, the expressions in the method do exactly what the primitive would have done (See Number @). If the primitive only works on certain classes of arguments, the Smalltalk code tries to coerce the argument or appeals to a superclass to find a more general way of doing the operation (see SmallInteger +). If the primitive is never supposed to fail, the expressions signal an error (see SmallInteger asFloat).
Each method that specifies a primitive has a comment in it. If the primitive is optional, the comment will say ‘Optional’. An optional primitive that is not implemented always fails, and the Smalltalk expressions do the work instead. If a primitive is not optional, the comment will say, ‘Essential’. Some methods will have the comment, ‘No Lookup’. See Object >> #howToModifyPrimitives for an explanation of special selectors which are not looked up.
For the primitives for +, -, *, and bitShift: in SmallInteger, and truncated in Float, the primitive constructs and returns a 16-bit LargePositiveInteger when the result warrants it. Returning 16-bit LargePositiveIntegers from these primitives instead of failing is optional in the same sense that the LargePositiveInteger arithmetic primitives are optional. The comments in the SmallInteger primitives say, ‘Fails if result is not a SmallInteger’, even though the implementor has the option to construct a LargePositiveInteger. For further information on primitives, see the ‘Primitive Methods’ part of the chapter on the formal specification of the interpreter in the Smalltalk book.”
As we will see later in another post, the object header of every object (except instances of compact classes) contains a pointer to its class (another object). Hence, accessing that pointer in the object header has to be done by a primitive:
Object >> class
    "Primitive. Answer the object which is the receiver's class. Essential.
    See Object documentation whatIsAPrimitive."

    <primitive: 111>
    self primitiveFailed
We read in the comment of the method #whatIsAPrimitive that the code after <primitive: XXX> is executed ONLY when the primitive fails. In this case, when that happens, the code “self primitiveFailed” will be executed: there is nothing we can do from the image side if this primitive fails. Notice that the declaration of <primitive: XXX> has to come first in the method. Only comments and temporary variable declarations may appear before it: it is not possible to write code before it. So, this is not possible:
Object >> class
    "Primitive. Answer the object which is the receiver's class. Essential.
    See Object documentation whatIsAPrimitive."

    Transcript show: '#class was called!!'.
    <primitive: 111>
    self primitiveFailed
Another example of a primitive:
SmallInteger >> bitOr: arg
    "Primitive. Answer an Integer whose bits are the logical OR of the
    receiver's bits and those of the argument, arg.
    Numbers are interpreted as having 2's-complement representation.
    Essential. See Object documentation whatIsAPrimitive."

    <primitive: 15>
    self >= 0 ifTrue: [^ arg bitOr: self].
    ^ arg < 0
        ifTrue: [(self bitInvert bitAnd: arg bitInvert) bitInvert]
        ifFalse: [(self bitInvert bitClear: arg) bitInvert]
In this case, if the primitive fails, the method tries to do its job in Smalltalk code. Sometimes this works, which means the primitive is there to improve performance but is not mandatory (unlike #class, where the primitive is the only way to get the answer). In other cases, the Smalltalk code after the primitive will fail for sure if the primitive has already failed. Such code is still written in Smalltalk for documentation purposes: by reading it, you can imagine what the primitive does on the VM side (Slang/C) and why it could fail.
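To watch the fallback code of #bitOr: do real work, one option is to pass an argument the primitive is unlikely to accept. The workspace sketch below assumes that primitive 15 only handles arguments that fit in a machine word, so a 100-bit integer makes it fail; the variable names are just for illustration.

```smalltalk
| big result |
"An argument far too large for a machine word: 2^100 is a LargePositiveInteger."
big := 2 raisedTo: 100.

"The receiver is a SmallInteger, so SmallInteger >> bitOr: is the method found.
 Assumption: primitive 15 fails for this argument, so the Smalltalk code after
 the pragma runs and, because 3 >= 0, re-sends the message as 'big bitOr: 3'."
result := 3 bitOr: big.

"Bits 0 and 1 are clear in 2^100, so OR-ing with 3 is the same as adding 3."
result = (big + 3)
```

Printing the last expression should answer true whichever path is taken; the point is that the answer is still correct when the primitive gives up.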
Two important literals
When we talked about CompiledMethod and literals, I forgot to mention that two literals in every CompiledMethod are really important. CompiledMethod responds to the messages #methodClass (which answers the class where the CompiledMethod is installed) and #selector (which answers the method’s selector). How can both methods be implemented in CompiledMethod if it doesn’t hold such information? Well, it does hold it, as literals. The LAST literal of every CompiledMethod is an Association where the key is the class name and the value is the class object. The penultimate literal stores the selector. So if we explore “Date >> #month”:
So you can now understand the methods:
CompiledMethod >> methodClass
    "answer the class that I am installed in"

    ^self numLiterals > 0
        ifTrue: [ (self literalAt: self numLiterals) value ]
        ifFalse: [ nil ]

CompiledMethod >> selector
    "Answer a method's selector. This is either the penultimate literal,
    or, if the method has any properties or pragmas, the selector of
    the MethodProperties stored in the penultimate literal."

    | penultimateLiteral |
    ^(penultimateLiteral := self penultimateLiteral) isMethodProperties
        ifTrue: [penultimateLiteral selector]
        ifFalse: [penultimateLiteral]
Forget for the moment the #isMethodProperties.
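If you prefer a workspace to the explorer, the two special literals can be checked with a few expressions. This is only a sketch using the messages shown above (#numLiterals, #literalAt:, #methodClass, #selector); it assumes your image still has Date >> #month.

```smalltalk
| method lastLiteral |
method := Date >> #month.

"The last literal is expected to be an Association: class name -> class object."
lastLiteral := method literalAt: method numLiterals.
lastLiteral value == Date.

"Both answers are read from the literals, not from instance variables."
method methodClass.
method selector.
```

Print each of the last three expressions separately: the first should answer true, and the other two should answer the class Date and the selector #month.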
Pragmas and CompiledMethods
Now…when we talk about <primitive: XXX>, what is that? It is not a regular message send. How can it be compiled by the Compiler? Well, these are called “method tags” and their goal is to store metadata about the method. If you are a Java developer, method tags are somewhat “similar” to Java annotations. In Pharo Smalltalk, one implementation of method tags is called “Pragmas”. I won’t discuss the advantages or disadvantages of Pragmas against other method tag implementations, or whether to use pragmas or regular subclassing, etc.
For more information about Pragmas, check the class comment of Pragma class and the tests like PragmaTest, MethodPragmaTest, etc. Nowadays, Pragmas are used in Pharo in several places like the new settings framework, the world menu, Metacello, HelpSystem, etc.
Ok…nice. But how are they really stored in a CompiledMethod? Let’s explore “SmallInteger >> #bitOr:”.
So….as we can see in the explorer, at compile time the Compiler creates an instance of AdditionalMethodState and places it in the penultimate literal. The class comment of AdditionalMethodState says: “I am class holding state for compiled methods. All my instance variables should be actually part of the CompiledMethod itself, but the current implementation of the VM doesn’t allow this. Currently I hold the selector and any pragmas or properties the compiled method has. Pragmas and properties are stored in indexable fields; pragmas as instances of Pragma, properties as instances of Association.”
AdditionalMethodState has two named instance variables: ‘method’ and ‘selector’. No explanation needed here. But since the class format is variable (do you remember class formats from my old post?) it can also store indexable fields. In this case, pragmas are stored that way. Hence, an instance of Pragma is stored in AdditionalMethodState and that’s what we can see in the explorer. A Pragma instance has 3 instance variables: ‘method keyword arguments’.
But the AdditionalMethodState instance is put in the penultimate literal, and that’s where the “selector” should be found. How can “CompiledMethod >> #selector” work in that case? If we take another look at that method (see above), you will notice the “isMethodProperties ifTrue: [penultimateLiteral selector]”. Of course, AdditionalMethodState answers true to isMethodProperties, and hence the selector is asked of the AdditionalMethodState itself (which holds it in an instance variable).
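The same structure can be walked from a workspace. The sketch below assumes the accessors match the names used in this post (#penultimateLiteral, #isMethodProperties, and a Pragma that answers #keyword and #arguments); other Pharo versions may name some of them differently.

```smalltalk
| method state pragma |
method := SmallInteger >> #bitOr:.

"Because the method has a pragma, the penultimate literal is expected to be
 an AdditionalMethodState rather than a plain selector."
state := method penultimateLiteral.
state isMethodProperties.
state selector.

"The primitive declaration itself is kept as a Pragma instance."
pragma := method pragmas first.
pragma keyword.
pragma arguments.
```

If the image behaves as described above, the expressions should answer, in order, true, #bitOr:, #primitive: and an array holding 15.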
Primitives and their impact on CompiledMethod
Since primitives use pragmas, the first effect is to have an AdditionalMethodState in the penultimate literal instead of a selector. The second effect is that the primitive number is stored in the CompiledMethod header. You can send the message #primitive to get the value. For example, “(SmallInteger >> #bitOr:) primitive” -> 15. If the method has no primitive, zero is answered. For example, “(TestCase >> #assert:) primitive” -> 0.
When the VM executes a CompiledMethod, it checks whether it is a primitive method or not (by checking whether the primitive number stored in the method header is zero or greater than zero). If it is, the VM looks the number up in a table and dispatches to the primitive associated with it.
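The check-then-dispatch idea, together with the fallback to Smalltalk code on failure, can be imitated in a workspace. This is a toy model only: the real table lives inside the VM and its entries are Slang routines, not blocks. The names primitiveTable and runMethod are hypothetical, and nil is used here as a stand-in for “the primitive failed”.

```smalltalk
| primitiveTable runMethod |
primitiveTable := Dictionary new.

"Hypothetical entry for number 15: succeeds only for two SmallIntegers,
 otherwise answers nil to signal failure."
primitiveTable at: 15 put: [:receiver :argument |
    (receiver class == SmallInteger and: [argument class == SmallInteger])
        ifTrue: [receiver bitOr: argument]
        ifFalse: [nil]].

"Number 0 means 'no primitive'. Otherwise try the table entry first and run
 the fallback code (the method body) only if the entry failed."
runMethod := [:primitiveNumber :receiver :argument :fallbackCode | | result |
    result := primitiveNumber > 0
        ifTrue: [(primitiveTable at: primitiveNumber) value: receiver value: argument]
        ifFalse: [nil].
    result isNil
        ifTrue: [fallbackCode value]
        ifFalse: [result]].

runMethod value: 15 value: 5 value: 3 value: [#fallback].
runMethod value: 15 value: 5 value: 'x' value: [#fallback].
runMethod value: 0 value: 5 value: 3 value: [#fallback].
```

The first call answers 7 because the toy primitive succeeds; the other two answer #fallback, one because the primitive fails and one because there is no primitive at all.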
How are primitives mapped to the VM side?
Continuing with SmallInteger >> #bitOr:, the primitive number is 15. How can we find the code of that primitive on the VM side? Time to open an image with VMMaker (if you don’t know how to do it, read the section “Prepared image for you” in this post). The VM keeps a table that maps primitive numbers to selectors implemented in the interpreter class. The most useful advice here is to check the method that initializes that table: #initializePrimitiveTable. So we can take a look:
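As far as I know, the table in #initializePrimitiveTable is written as a long literal array of (number selector) pairs. The sketch below is not the real method: it keeps only the two entries discussed in this post and adds a hypothetical lookup block, selectorFor, to show how a number leads to a selector.

```smalltalk
| tableExcerpt selectorFor |
"Only the two pairs used in this post; the real table has hundreds of entries."
tableExcerpt := #( (15 primitiveBitOr) (111 primitiveClass) ).

"Hypothetical helper: answer the selector paired with a number, or nil."
selectorFor := [:number | | entry |
    entry := tableExcerpt
        detect: [:each | each first = number]
        ifNone: [nil].
    entry isNil ifTrue: [nil] ifFalse: [entry last]].

selectorFor value: 111.
selectorFor value: 15.
```

The two lookups answer #primitiveClass and #primitiveBitOr, which are the method names to browse in the interpreter class.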
For our example of #class, the primitive number was 111. In the table, 111 maps to #primitiveClass, so we can browse its code. Remember that this code is written in Slang and is part of the VMMaker package (check my previous posts for details).
primitiveClass
    | instance |
    instance := self stackTop.
    self pop: argumentCount+1 thenPush: (objectMemory fetchClassOf: instance)
(SmallInteger >> #bitOr:) has primitive number 15, which maps to #primitiveBitOr, whose code is:
primitiveBitOr
    | integerReceiver integerArgument |
    integerArgument := self popPos32BitInteger.
    integerReceiver := self popPos32BitInteger.
    self successful
        ifTrue: [self push: (self positive32BitIntegerFor:
                    (integerReceiver bitOr: integerArgument))]
        ifFalse: [self unPop: 2]
So..you have learnt how to map primitive numbers to methods on the VM side 🙂 You now know how to do that for both primitives and bytecodes. Congrats!!!
Browse the method #primDeleteFileNamed: and you will see something like:
primDeleteFileNamed: aFileName
    "Delete the file of the given name. Return self if the primitive succeeds, nil otherwise."

    <primitive: 'primitiveFileDelete' module: 'FilePlugin'>
    ^ nil
What’s that primitive? Where is the number? Can I create my own primitive? Sure! We will see how to do that in a future post 🙂