Monday, March 26, 2012

Code gen redo preview

Rewriting the code generation phase of a compiler is not for the faint of heart. Nor, perhaps, for the sound of mind.

I've nearly completed a rewrite of the code gen code of the ClojureCLR compiler.  There are still a few things on my punch list (see below), but all the clojure.test-clojure.* tests run now.  I hope an intrepid few will give it a spin before I push the changes to master.  The new code can be found in the nodlr branch in the github repo.

When I wrote the ClojureCLR compiler, I was interested in seeing what kind of mileage I could get out of the Dynamic Language Runtime, specifically the DLR's expression tree mechanism.  The DLR's ETs extended the ETs used in Linq by providing enhanced control flow capabilities.  They are central to other dynamic language implementations on the CLR such as IronPython and IronRuby.

The first version of the ClojureCLR compiler mimicked the JVM compiler through its initial phases.  The Lisp reader translates source code to Lisp data structures that are parsed to generate abstract syntax trees.  The ClojureJVM compiler traverses the ASTs to produce JVM bytecodes.  The ClojureCLR compiler instead generates DLR ETs from the ASTs.   Those ETs are then asked to generate MSIL code.

I got a lot of mileage out of using the DLR for code generation.  I got to avoid some of the hairier aspects of MSIL and the CLR-- things like value types, generic types, nullable types, for example, are handled nicely by ETs.  I also found it easier  to experiment.  However, using ETs  had at least two drawbacks. One was that going from ASTs to MSIL through ETs likely nearly doubles the work of MSIL generation. Another was that ETs were restricted to producing static methods only.  Working around this restriction introduced several inefficiencies in the resulting code.

The Clojure model for functions maps each defined function to a class.  For example, compiling

(defn f
  ([x] ... )
  ([x y] ... ))

yields a class named something like user$f__1295 that implements clojure.lang.IFn, with overrides for virtual methods invoke(Object) and invoke(Object,Object).  (The actual value to which f would be bound would be an instance of this class.)

Note that the invoke overrides of necessity are instance methods.  Recall from above that DLR ETs cannot produce instance methods.  Toss in a another little problem referring to unfinished types.  Shake and stir.  You end up with the following abomination:  Where Clojure/JVM generates one class and two methods for the example above, ClojureCLR would have to generate two classes and four methods.  An invoke override method is just a passthrough to a DLR-generated static method taking the function object as a first paramenter.

For several years I hoped that the DLR team would get around to looking into class and instance method generation.  This now seems unlikely.  So I finally decided to rewrite the code generation phase to eliminate most uses of the DLR.

The new code gen code yields significant improvements in compilation time and code size.  Compiling the bootstrap clojure core environment is roughly twice as fast. The generated assemblies are about 20% smaller.  Startup time (non-NGEN'd) is 11% faster.  A few benchmarks I've run show speedups ranging from 4% to 16%.  This is in line with my best hopes.

One other benefit: with code generation more closely modeled after the JVM version, future maintainers will need less knowledge of the DLR.

There are drawbacks to this move.  The DLR guys know a lot more about about generating MSIL code than I do.  Some wonderful goodness with names like Expression.Convert and Expression.Call were my best friends    They are (mostly) gone now.   And, oh, the beauty of DebugView for ETs for debugging code gen--this will be missed.  My new best friends are peverify and .Net Reflector, the caped duo for rooting out bad MSIL.   Wonderful in their own way, but I have a sense of loss.

So, where are we?  I have a little more work to do before putting this on the master branch.    I plan to make one last traversal of the old code looking at all occurrences of my former best friends  to make sure I've been consistent in handling the complexities they hid.  I also plan to reimplement a 'light compile' variation to be used during evaluation.  The current version has it.  (What this is and why it matters I leave to another time.)  Neither task will take long.

In the meantime, the nodlr branch is ready for a workout by those who care and dare. Have at it.


P.S.  The DLR is still being used to provide CLR interop and polymorphic inline caching.  Another topic-for-another-day.