Class in the Front, Proto in the Back
↩ ↪August 01, 2010
Inspired by what I learned and saw at the Emerging Languages Camp, I’ve started working on Magpie again. It’s been more or less dormant while I figured out what kind of language I wanted it to be.
One big change is that Magpie Redux is going to be object-oriented where the original Magpie was procedural. Magpie was an experiment to see if a procedural language can be as nice to use as an OOP one if you give it memory management and a decent type system. For my purposes, the answer was that it’s not too bad, but objects are really nice to have.
So I started working on a fresh Magpie interpreter in Java that’s class-based and dynamically-typed. (As a nice bonus, Magpie going forward won’t be as tied to the MS stack as the old C# one is.) This post is about one implementation detail I think is kind of interesting: even though Magpie-the-language is class-based, under the hood, the interpreter uses prototypes.
What’s in a class?
Before we get too far, let’s make sure we’re all talking about the same thing. In Magpie, all objects are instances of a class. The class determines what methods an object has. The pseudocode for invoking a method is:
invoke(obj, methodName, arg)
classObj = obj.getClass
method = classObj.findMethod(methodName)
method.invoke(obj, arg)
end
Under the hood then, a class is a bag of methods. Something like:
class ClassObj {
Map<String, Method> methods;
}
And an object is just a reference to its class (and some data, but we won’t worry about that here):
class Obj {
ClassObj theClass;
}
However, there are actually two kinds of methods. There are regular instance methods, like we’ve seen and “static” or “class” methods that you invoke directly on the class. These include things like constructors (which are like Ruby’s new methods). For example:
def foo = Foo.new // Call a class method on Foo.
foo.toString // Call an instance method on a Foo.
So our class object really needs two dictionaries, one for class and one for instance methods:
class ClassObj {
Map<String, Method> classMethods;
Map<String, Method> instanceMethods;
}
It also ends up needing to duplicate the API for using them: separate methods for defining a method, looking one up, invoking one, etc. I hate redundancy so this rubbed me the wrong way.
Split-aparts
Instead of having a single class object that does two things, I thought why not split that up into two separate objects? One represents the class itself and has the static or class methods like constructors. The other represents the set of methods you can call on an instance of the class.
My ClassObj
type disappeared, to just be:
class Obj {
Map<String, Method> methods;
}
The only missing piece was that an instance of some object needed to have a reference to the other object that contains its methods, like so:
class Obj {
Obj parent;
Map<String, Method> methods;
}
Now our pseudo-code for invoking a method is:
invoke(obj, methodName, arg)
thisObj = obj
loop
method = thisObj.findMethod(methodName)
if method != null then
return method.invoke(obj, arg)
end
// Walk up the parent chain.
thisObj = obj.parent
end
end
You’ll note that this pseudocode handles both instance and class methods. With
instance methods, obj
will be the class object itself (since classes are
first-class objects) and it will find the method and immediately invoke it.
With instance methods, the object’s parent will be the special object that holds all of the instance methods. So we’ll fail to find the method on the object itself, walk up to its parent, find it and be done.
Familiar?
This started to look pretty familiar. I had an object that represented a class, and another object that represented the methods all objects of that class have in common. There’s a name for that second object: a prototype. In the name of getting rid of duplicate code, I basically stumbled backwards onto the same type system that Javascript and Finch use.
The interesting part is that this is hidden completely from the user. Magpie code is still written using classes and instances. If you have this in Magpie:
class Foo
def sayHi () print "hi"
end
def a = Foo.new
def b = Foo.new
a.sayHi
b.sayHi
At runtime, the interpreter will build this object hierarchy:
Every box represents an object and is of the same Java class in the interpreter
(Obj
). Objects a
and b
both have Foo proto
as their parent, which is the
prototype object holding the instance methods. The class methods like new
are
held in a separate object Foo class
representing the class itself and bound to
the variable Foo
.
The class and proto objects don’t have each other as parents because they aren’t directly related—there’s no “is-a” relationship between them. Instead, the class stores a reference to the prototype in a “proto” field so that it can find it when it needs to create new instances. Conversely, the prototype stores a reference to the class so that it can find it when you ask an instance what its class is.
Doing things this way let me simplify a lot of the underlying code while still having the convenience of classes at the user level. This is the best of both worlds: prototypes are easier for the language implementer to code and classes are easier for the user to use.
Of course, I still have a ton of work to do before Magpie is anywhere near usable again and this may all be nullified by something I run into along the way, but so far it seems to be working.