Monday 21 July 2014

Writing Assembler using Standard ML Functors

Here is something important that is seldom mentioned in functional programming circles: one can use Standard ML functors to produce type-checked, efficient assembler code. For example, here is a functor implementing assembler primitives for CAML byte-code machines. The same functor implements code for three different ABI representations of "machine words," which are instances of the Standard ML Basis signature WORD. The functor AbstractMachineWord effectively composes the assembler code from JIT instruction-generating routines, which are in turn composed by other functors. In this instance, for example, the parameter WordEnc (another structure with a defined interface, like this one) is something like the structure produced by the functor in VectorSliceWordEnc.sml. That functor uses other assembler code composed from the code-generating primitives defined in PrimEnc.sml which, you might be pleased to hear, is a structure in its own right, also with a defined interface. This structure is fixed in terms of the primitives implemented by the 'foreign function interface' binding to the GNU lightning JIT code-generating library, and it implements only the CAML ABI.

But VectorSliceWordEnc.sml is just one of the possible representations of machine words in memory; another is ArraySliceWordEnc.sml. It uses the same primitives but results in a different set of assembler functions.
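To make the shape of this composition concrete, here is a minimal sketch. It is not the code from the repository: the names WORD_ENC, ToyVectorEnc, ToyArrayEnc and ToyMachineWord are invented for illustration, and the "instructions" are just strings in a made-up syntax rather than calls into GNU lightning. It only shows how one functor body can yield different instruction sequences depending on the word-encoding structure it is applied to.

```sml
(* Hypothetical interface: how a machine word is reached in memory. *)
signature WORD_ENC =
sig
  type insn = string             (* toy instruction: just its text *)
  val load  : int -> insn list   (* load the word at an index into r0 *)
  val store : int -> insn list   (* store r0 into the word at an index *)
end

(* Toy representation 1: words addressed directly off a base register. *)
structure ToyVectorEnc : WORD_ENC =
struct
  type insn = string
  fun off i = Int.toString (i * 4)
  fun load i  = ["ldr r0, [vec, #" ^ off i ^ "]"]
  fun store i = ["str r0, [vec, #" ^ off i ^ "]"]
end

(* Toy representation 2: an 8-byte header must be skipped first. *)
structure ToyArrayEnc : WORD_ENC =
struct
  type insn = string
  fun off i = Int.toString (i * 4)
  fun load i  = ["add r1, arr, #8", "ldr r0, [r1, #" ^ off i ^ "]"]
  fun store i = ["add r1, arr, #8", "str r0, [r1, #" ^ off i ^ "]"]
end

(* One functor body, written once against the interface only. *)
functor ToyMachineWord (Enc : WORD_ENC) =
struct
  (* Copy the word at index src to index dst. *)
  fun copy (src, dst) = Enc.load src @ Enc.store dst
end

structure VecWord = ToyMachineWord (ToyVectorEnc)
structure ArrWord = ToyMachineWord (ToyArrayEnc)
```

Here VecWord.copy (0, 1) yields two toy instructions and ArrWord.copy (0, 1) yields four, although copy itself was written only once. The type checker rejects any argument structure that does not match WORD_ENC, which is where the reliability comes from.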

There are also other possible representations of the ABI calling conventions, so we could use the same AbstractMachineWord functor to generate primitives for GNU Guile, Python, or OCaml, just by writing a different version of PrimEnc.sml. Provided the new version conforms to the interface signature defined in PrimEnc.sig, it will work.
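As a rough illustration of swapping the primitive layer, the sketch below uses an invented signature TOY_PRIM as a stand-in for an interface like PrimEnc.sig; the real signature is certainly richer. ToyOCamlPrim uses the familiar OCaml convention of representing the integer n as 2n + 1, while ToyPlainPrim is a purely hypothetical runtime with untagged integers. The functor ToyLoader and the "movi" instruction text are likewise assumptions made for the example.

```sml
(* Hypothetical stand-in for an interface like PrimEnc.sig. *)
signature TOY_PRIM =
sig
  val name   : string
  val boxInt : int -> int   (* host integer -> bits the runtime expects *)
end

(* OCaml-style tagging: integer n is represented as 2n + 1. *)
structure ToyOCamlPrim : TOY_PRIM =
struct
  val name = "ocaml"
  fun boxInt n = 2 * n + 1
end

(* A made-up runtime that stores integers untagged. *)
structure ToyPlainPrim : TOY_PRIM =
struct
  val name = "plain"
  fun boxInt n = n
end

(* Client code sees only TOY_PRIM, so either structure will do. *)
functor ToyLoader (P : TOY_PRIM) =
struct
  (* Emit a toy "load immediate" instruction for a host integer. *)
  fun loadInt n = "movi r0, " ^ Int.toString (P.boxInt n)
end

structure OCamlLoader = ToyLoader (ToyOCamlPrim)
structure PlainLoader = ToyLoader (ToyPlainPrim)

val forOCaml = OCamlLoader.loadInt 5   (* "movi r0, 11" *)
val forPlain = PlainLoader.loadInt 5   (* "movi r0, 5"  *)
```

Nothing in ToyLoader mentions a particular runtime. Retargeting means writing one new structure that matches the signature, and the compiler checks the match before any code is generated.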

The remarkable thing about these functors is their reliability. It is actually easier to write assembler this way than it is to write it ad hoc. You don't have to take my word for it; you can try it yourself. The whole repository is available for download and/or forking.
