This article was published on 2012-07-02

Five ways to execute mruby

Ruby code is usually interpreted. Even in the latest 1.9 release of CRuby, where the code may be transformed into bytecode during interpretation, the user always has to pass a plain-text Ruby file to the ruby program. (JRuby actually has an interesting compilation feature, but that isn’t the topic here.)

With mruby there are now five ways to execute Ruby code. In the following lines I want to present each of these methods.

As a trivial example I’m using the following small hello world Ruby code inside of the test_program.rb file:

$ cat test_program.rb 
puts "hello world"

Interpret (.rb)

Probably the most common way to use mruby is to invoke the mruby program and pass it a plain-text Ruby file. This interprets the Ruby code, similar to many other Ruby implementations:

$ mruby/bin/mruby test_program.rb
hello world

Advantage

  • Provides a quick development cycle. Programming -> Test -> Programming.

Disadvantage

  • You have to provide the source code to the user, which is sometimes not acceptable. I don’t want to discuss code obfuscation here (I personally don’t believe in it), but many people consider it unacceptable to hand out their source code.
  • Overhead. You need the actual mruby program and you need a file system to store your Ruby code. In addition, your Ruby code is parsed and transformed to bytecode before it is executed. All of this increases the footprint and the execution time.

Interactive mruby Shell (mirb)

Although it isn’t really meant for running whole programs, the Embeddable Interactive Ruby Shell can evaluate Ruby code as well. It is started with the mirb program:

$ ./mruby/bin/mirb 
mirb - Embeddable Interactive Ruby Shell

This is a very early version, please test and report errors.
Thanks :)

> puts "hello world"
hello world
 => nil

Advantage

  • Direct feedback without any indirection such as a source code file or a separate compiler or interpreter run.

Disadvantage

  • Isn’t suitable for production use.
  • Overhead. The shell actually needs to parse the Ruby input twice. The first pass checks whether the Ruby code is complete. The second pass transforms the Ruby code into bytecode, which can then be executed.
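The two passes can be illustrated with a small sketch. It is not how mirb is implemented: it runs on CRuby and uses the Ripper library from CRuby's standard library as a stand-in for the completeness check. The method names complete? and run_lines are made up for this illustration.

```ruby
require "ripper"

# Pass 1: does the buffered text parse as a complete program?
# Ripper.sexp returns nil when the source cannot be parsed, which is
# also the case for input that simply isn't finished yet.
def complete?(source)
  !Ripper.sexp(source).nil?
end

# Feed lines one at a time and evaluate only once the buffer parses.
def run_lines(lines)
  buffer = +""
  results = []
  lines.each do |line|
    buffer << line << "\n"
    next unless complete?(buffer)              # pass 1: completeness check
    results << eval(buffer, TOPLEVEL_BINDING)  # pass 2: parse again and execute
    buffer = +""
  end
  results
end

results = run_lines(["def double(x)", "  x * 2", "end", "double(21)"])
p results
```

The first three lines are only buffered, because the method definition is incomplete until the end keyword arrives. Every evaluated chunk has therefore been parsed at least twice, which is the overhead described above.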

Bytecode (.mrb)

mruby provides a Java-like execution style: the code is first compiled to an intermediate representation, which is then executed.

The first step is to produce bytecode with the mrbc program:

$ mruby/bin/mrbc test_program.rb

This will produce a file called test_program.mrb which contains the bytecode representation of the previously given Ruby code:

$ cat test_program.mrb 
RITE0009000000090000MATZ    000900000000007700010000        D0A700000077SC0002000400046F2800000005010000060180003D02000005010000A00000004AF507000000010F000Bhello world3304000000010004puts248900000000
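The dump starts with the ASCII identifier RITE, followed by the digits 0009. As a minimal sketch, assuming that the first eight bytes of a .mrb file always have this shape and that the four digits are a format version (the dump suggests this, but it is an assumption here), a few lines of Ruby can reject files that are obviously not mruby bytecode. The helper name rite_header is hypothetical and not part of mruby.

```ruby
# Hypothetical helper (not part of mruby): split the first eight bytes of a
# .mrb file into the "RITE" identifier and the four characters that follow.
# Assumption: those four characters are a format version, as the dump suggests.
def rite_header(data)
  return nil if data.nil? || data.bytesize < 8
  ident   = data.byteslice(0, 4)
  version = data.byteslice(4, 4)
  return nil unless ident == "RITE"
  { ident: ident, version: version }
end

# The first bytes of the dump shown above.
sample = "RITE0009000000090000MATZ    "
header = rite_header(sample)
puts "identifier=#{header[:ident]} version=#{header[:version]}"

# For a real file, read only the first eight bytes in binary mode:
#   rite_header(File.binread("test_program.mrb", 8))
```

Returning nil for anything that doesn't start with RITE keeps the check cheap: a plain-text .rb file passed by mistake is caught before it ever reaches mruby -b.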

This bytecode can be executed by the mruby program. You need to add the -b parameter to tell the program that your file isn’t plain Ruby source but pre-compiled mruby bytecode:

$ mruby/bin/mruby -b test_program.mrb
hello world

Advantage

  • You don’t need to provide the source code to the user.
  • Reduction of overhead during runtime. The Ruby code doesn’t need to be parsed.

Disadvantage

  • Overhead. You still need the actual mruby program and you also need a file system to store your mruby bytecode.
  • Your development cycle gets more complex. It now takes two steps and two different programs. Programming -> Compile (mrbc) -> Test (mruby) -> Programming.

Readable C Code (.c)

This variant is interesting for everybody who wants to integrate Ruby code directly into their C code. For this purpose the mrbc program can compile Ruby code down to C code and wrap it in a C function whose name the user can choose. You do this with the -C parameter. In my case I named the C function init_tester:

$ mruby/bin/mrbc -Cinit_tester test_program.rb

This produces the C file test_program.c with the following content:

#include "mruby.h"
#include "mruby/irep.h"
#include "mruby/string.h"
#include "mruby/proc.h"

static mrb_code iseq_0[] = {
  0x01000006,
  0x0180003d,
  0x02000005,
  0x010000a0,
  0x0000004a,
};

void
init_tester(mrb_state *mrb)
{
  int n = mrb->irep_len;
  int idx = n;
  mrb_irep *irep;

  mrb_add_irep(mrb, idx+1);

  irep = mrb->irep[idx] = mrb_malloc(mrb, sizeof(mrb_irep));
  irep->idx = idx++;
  irep->flags = 0 | MRB_ISEQ_NOFREE;
  irep->nlocals = 2;
  irep->nregs = 4;
  irep->ilen = 5;
  irep->iseq = iseq_0;
  irep->slen = 1;
  irep->syms = mrb_malloc(mrb, sizeof(mrb_sym)*1);
  irep->syms[0] = mrb_intern(mrb, "puts");
  irep->plen = 1;
  irep->pool = mrb_malloc(mrb, sizeof(mrb_value)*1);
  irep->pool[0] = mrb_str_new(mrb, "hello world", 11);

  mrb->irep_len = idx;

  mrb_run(mrb, mrb_proc_new(mrb, mrb->irep[n]), mrb_top_self(mrb));
}
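The five words in iseq_0 are the instructions of the compiled program. As a rough sketch, they can be decoded with a few lines of Ruby, assuming the instruction layout of mruby's opcode.h from that time: the opcode in the lowest 7 bits, operand A in the top 9 bits, then a 9-bit B and a 7-bit C. The opcode numbers in the table are assumptions as well and should be checked against the opcode.h of your own checkout. The names OPCODES and decode are invented for this example.

```ruby
# Assumed opcode numbers (only those used by this program); verify them
# against src/opcode.h of the mruby version that produced the bytecode.
OPCODES = {
  0x05 => "LOADNIL",
  0x06 => "LOADSELF",
  0x20 => "SEND",
  0x3d => "STRING",
  0x4a => "STOP"
}.freeze

# Assumed layout of a 32-bit instruction word: A(9) | B(9) | C(7) | opcode(7).
def decode(word)
  {
    op: OPCODES.fetch(word & 0x7f, "UNKNOWN"),
    a:  (word >> 23) & 0x1ff,
    b:  (word >> 14) & 0x1ff,
    c:  (word >> 7) & 0x7f
  }
end

# The instruction words copied from iseq_0 in the generated file.
iseq = [0x01000006, 0x0180003d, 0x02000005, 0x010000a0, 0x0000004a]
iseq.each do |word|
  insn = decode(word)
  puts format("%08x  %-8s A=%d B=%d C=%d", word, insn[:op], insn[:a], insn[:b], insn[:c])
end
```

Under these assumptions the sequence reads as: load self, load the string from pool[0], load nil, send the symbol syms[0] (puts) with one argument (C=1), and stop. That matches what the one-line Ruby program does.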

As you can see, we get an Array with the bytecode and a C function named init_tester, which sets up and runs the compiled form of my Ruby code. To test this I need a little bit of boilerplate code:

int
main(void)
{
  /* new interpreter instance */
  mrb_state *mrb;
  mrb = mrb_open();

  /* execute C compiled Ruby code */
  init_tester(mrb);

  mrb_close(mrb);

  return 0;
}

Now I can compile and link everything:

$ gcc -Imruby/src -Imruby/include -c test_program.c -o test_program.o
$ gcc -o test_program test_program.o mruby/lib/libmruby.a

This will create the executable test_program which I can execute fully standalone:

$ ./test_program
hello world

Advantage

  • You don’t need to provide the source code to the user.
  • Reduction of overhead during runtime. The Ruby code doesn’t need to be parsed. Also it isn’t necessary to have a file system on the target system.
  • The program is fully standalone. No other application is necessary for execution.
  • The C integration is really easy. You just need an mrb_state instance which you pass to the generated function.
  • The generated code is still readable and could be checked and optimized by hand.

Disadvantage

  • Your development cycle is more complex. Now there are four steps necessary. Programming -> Compile (mrbc) -> Integrate C code -> Compile (gcc) -> Test -> Programming.
  • The generator still has quite a few bugs.

Binary C Code (.c)

The last variant is similar to the previous one: it again compiles Ruby code to C code. This time, however, instead of generating a C function with API calls, it only creates an Array of bytecode, which you have to load and execute yourself.

The first step is again to compile your Ruby program, this time with the -B parameter of mrbc. You also have to pass a name, which the generated code uses for the bytecode Array:

$ mruby/bin/mrbc -Btest_symbol test_program.rb

This will create a C file called test_program.c with an Array called test_symbol:

const char test_symbol[] = {
0x52,0x49,0x54,0x45,0x30,0x30,0x30,0x39,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x39,
0x30,0x30,0x30,0x30,0x4d,0x41,0x54,0x5a,0x20,0x20,0x20,0x20,0x30,0x30,0x30,0x39,
0x30,0x30,0x30,0x30,0x00,0x00,0x00,0x82,0x00,0x01,0x00,0x00,0x20,0x20,0x20,0x20,
0x20,0x20,0x20,0x20,0x6b,0x91,0x00,0x00,0x00,0x44,0x53,0x43,0x00,0x02,0x00,0x04,
0x00,0x02,0x6f,0x28,0x00,0x00,0x00,0x05,0x01,0x00,0x00,0x06,0x01,0x80,0x00,0x3d,
0x02,0x00,0x00,0x05,0x01,0x00,0x00,0xa0,0x00,0x00,0x00,0x4a,0xf5,0x07,0x00,0x00,
0x00,0x01,0x0f,0x00,0x0b,0x68,0x65,0x6c,0x6c,0x6f,0x20,0x77,0x6f,0x72,0x6c,0x64,
0x33,0x04,0x00,0x00,0x00,0x01,0x00,0x04,0x70,0x75,0x74,0x73,0x24,0x89,0x00,0x00,
0x00,0x00,
};
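The Array is easier to relate to the program once a few of its bytes are read as ASCII. The following Ruby sketch copies three byte runs from the listing above (the first eight bytes and two runs from the last lines) and decodes them. Nothing beyond these copied bytes is assumed about the format, and the helper name ascii is made up for this example.

```ruby
# Byte runs copied verbatim from the generated test_symbol array.
header_bytes = [0x52, 0x49, 0x54, 0x45, 0x30, 0x30, 0x30, 0x39]
string_bytes = [0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64]
symbol_bytes = [0x70, 0x75, 0x74, 0x73]

# Array#pack("C*") turns a list of byte values into a binary string.
def ascii(bytes)
  bytes.pack("C*").force_encoding("US-ASCII")
end

puts ascii(header_bytes)  # file identifier and the digits that follow it
puts ascii(string_bytes)  # the string literal from test_program.rb
puts ascii(symbol_bytes)  # the name of the method that is called
```

The Array therefore carries the same RITE0009 header as the .mrb file, plus the string literal and the method name of the program. In the listing, the two bytes directly in front of each run (0x00,0x0b before the string and 0x00,0x04 before the method name) equal the run's length, which suggests a length prefix.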

To execute this bytecode I wrote the following boilerplate which takes the Array, loads the bytecode and executes it immediately:

#include "mruby.h"
#include "mruby/irep.h"
#include "mruby/proc.h"

int
main(void)
{
  /* new interpreter instance */
  mrb_state *mrb;
  mrb = mrb_open();

  /* read and execute compiled symbols */
  int n = mrb_read_irep(mrb, test_symbol);
  mrb_run(mrb, mrb_proc_new(mrb, mrb->irep[n]), mrb_top_self(mrb));

  mrb_close(mrb);

  return 0;
}

The compile and link procedure is the same as in the previous example:

$ gcc -Imruby/src -Imruby/include -c test_program.c -o test_program.o
$ gcc -o test_program test_program.o mruby/lib/libmruby.a

The resulting executable test_program is again standalone and can be executed like this:

$ ./test_program
hello world

Advantage

  • You don’t need to provide the source code to the user.
  • Reduction of overhead during runtime. The Ruby code doesn’t need to be parsed. Also it isn’t necessary to have a file system on the target system.
  • The program is fully standalone. No other application is necessary for the execution.
  • The C code is more compact because it is just an Array.

Disadvantage

  • Your development cycle is more complex. Now there are four steps necessary. Programming -> Compile (mrbc) -> Integrate C code -> Compile (gcc) -> Test -> Programming.
  • You need more boilerplate to get the program up and running.
  • The generated bytecode is practically impossible to check and optimize by hand.

Conclusion

For the development phase, the interpreter mode of mruby together with the interactive mirb shell clearly gives the fastest development cycle.

After development, the choice depends on the requirements of the target. If you are using mruby for scripting, the mruby program is still fine for execution.

If you want to provide your program to a customer without handing over readable source code, bytecode compilation with the mrbc program should be the way to go. In this scenario the customer needs the actual mruby program on their system to execute the program.

If you want to integrate your Ruby code into a C program, the mrbc options -C and -B are the way to go. At the moment the bytecode Array generator (-B) seems to be more stable. The -C variant had several bugs in the past, and during my tests I found another one. The C generator is far more complex than the byte array output, which might explain this observation. On the other hand, the C code generated by -C is readable, so you can check the compiled code and optimize it by hand. According to Matz, this code is also a little bit more efficient.