YARV’s Internal Stack and Your Ruby Stack
I've started working on a new edition of Ruby Under a Microscope that covers Ruby 3.x. I'm working on this in my spare time, so it will take a while. Leave a comment or drop me a line and I'll email you when it's finished.
The content of Chapter 3, about the YARV virtual machine, hasn't changed much since 2014. However, I did update all of the diagrams to account for some new values YARV now saves inside of each stack frame. And some of the common YARV instructions were renamed as well. I also moved some content that was previously part of Chapter 4 here into Chapter 3. Right now I'm rewriting Chapter 4 from scratch, describing Ruby's new JIT compilers.
Chapter 3: How Ruby Executes Your Code
| YARV’s Internal Stack and Your Ruby Stack |
| Stepping Through How Ruby Executes a Simple Script |
| Executing a Call to a Block |
| Taking a Close Look at a YARV Instruction |
| Local and Dynamic Access of Ruby Variables |
| Local Variable Access |
| Method Arguments Are Treated Like Local Variables |
| Dynamic Variable Access |
| Climbing The Environment Pointer Ladder In C |
| Experiment 3-1: Exploring Special Variables |
| Controlling the Flow of Execution |
| How Ruby Executes an if Statement |
| Jumping from One Scope to Another |
| Catch Tables |
| Other Uses for Catch Tables |
| Experiment 3-2: Testing How Ruby Implements For Loops Internally |
| Summary |
YARV’s Internal Stack and Your Ruby Stack
As we’ll see in a moment, YARV uses a stack internally to track intermediate values, arguments, and return values. YARV is a stack-oriented virtual machine.
In addition to its own internal stack, YARV keeps track of your Ruby program’s call stack, recording which methods call which other methods, functions, blocks, lambdas, and so on. In fact, YARV is not just a stack machine—it’s a double-stack machine! It has to track the arguments and return values not only for its own internal instructions but also for your Ruby program.
Figure 3-1 shows YARV’s basic registers and internal stack.
Figure 3-1: Some of YARV’s internal registers, including the program counter and stack pointer
YARV’s internal stack is on the left. The SP label is the stack pointer, or the location of the top of the stack. On the right are the instructions that YARV is executing. PC is the program counter, or the location of the current instruction.
You can see the YARV instructions that Ruby compiled from the puts 2+2 example on the right side of Figure 3-1. YARV stores both the SP and PC registers in a C structure called rb_control_frame_t, along with the current value of Ruby’s self variable and some other values not shown here.
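If you'd like to see these instructions for yourself, here is a minimal sketch using RubyVM::InstructionSequence. This class is specific to standard Ruby (CRuby), and the exact listing it prints, including offsets and call data, varies between Ruby versions, so treat the output as illustrative only.

```ruby
# Compile the one-line program without running it, then ask for a
# human-readable listing of its YARV instructions.
iseq = RubyVM::InstructionSequence.compile("puts 2+2")
listing = iseq.disasm

puts listing
```

On the Ruby 3.x versions I'd expect you to be using, the listing should include putself, two putobject instructions, opt_plus, a send instruction for puts, and leave.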
At the same time, YARV maintains another stack of these rb_control_frame_t structures, as shown in Figure 3-2.
Figure 3-2: YARV keeps track of your Ruby call stack using a series of rb_control_frame_t structures.
This second stack of rb_control_frame_t structures represents the path that YARV has taken through your Ruby program, and YARV’s current location. In other words, this is your Ruby call stack—what you would see if you ran puts caller.
CFP is the current frame pointer. Each stack frame in your Ruby call stack contains, in turn, a different value for the self, PC, and SP registers shown in Figure 3-1. Ruby also keeps track of the type of code running at each level in your Ruby call stack, indicated by the “[BLOCK]” and “[METHOD]” notation in Figure 3-2.
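To get a feel for the Ruby call stack from inside a program, here is a small sketch that calls a method from within a block and prints what caller returns. The method names inner and outer are made up for this example, and the exact wording of each line differs between Ruby versions.

```ruby
# Each method call and each block call adds a frame to the Ruby call stack.
def inner
  caller                       # the frames above this one, as strings
end

def outer
  frames = nil
  1.times { frames = inner }   # call inner from inside a block
  frames
end

frames = outer
puts frames
```

The output should contain one line for the block inside outer and another for outer itself, loosely mirroring the “[BLOCK]” and “[METHOD]” levels in Figure 3-2.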
Stepping Through How Ruby Executes a Simple Script
In order to help you understand this a bit better, here are a couple of examples. I’ll begin with the simple 2+2 example from Chapters 1 and 2, shown again in Listing 3-1.
puts 2+2
This one-line Ruby script doesn’t have a Ruby call stack, so I’ll focus on the internal YARV stack for now. Figure 3-3 shows how YARV will execute this script, beginning with the first instruction, putself.
Figure 3-3: On the left is YARV’s internal stack, and on the right is the compiled version of my puts 2+2 program.
As you can see in Figure 3-3, YARV starts the program counter (PC) at the first instruction, and initially the stack is empty. Now YARV executes the putself instruction, and pushes the current value of self onto the stack, as shown in Figure 3-4.
Figure 3-4: putself pushes the top self value onto the stack
Because this simple script contains no Ruby objects or classes, the self pointer is set to the default top self object. This is an instance of the Object class that Ruby automatically creates when YARV starts. It serves as the receiver for method calls and the container for instance variables in the top-level scope. The top self object contains a single, predefined to_s method, which returns the string “main.” You can call this method by running the following command in the console:
$ ruby -e 'puts self'
 => main
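You can also inspect the top self object from inside a script. This short sketch uses TOPLEVEL_BINDING.receiver to get hold of it; the variable name top_self is just my choice for the example.

```ruby
# TOPLEVEL_BINDING.receiver returns the top self object, no matter where
# this code runs.
top_self = TOPLEVEL_BINDING.receiver

puts top_self.to_s    # prints "main"
puts top_self.class   # prints "Object"
```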
YARV will use this self value on the stack when it executes the opt_send_without_block instruction: self is the receiver of the puts method because I didn’t specify a receiver for this method call.
Next, YARV executes putobject 2. It pushes the numeric value 2 onto the stack and increments the PC again, as shown in Figure 3-5.
Figure 3-5: Ruby pushes the value 2 onto the stack, the receiver of the + method.
This is the first step of the receiver (arguments) operation pattern described in “How Ruby Compiles a Simple Script” on page 34. First, Ruby pushes the receiver onto the internal YARV stack. In this example, the Integer object 2 is the receiver of the message/method +, which takes a single argument, also a 2. Next, Ruby pushes the argument 2, as shown in Figure 3-6.
Figure 3-6: Ruby pushes another value 2 onto the stack, the argument of the + method.
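The idea that 2 is a receiver and + is a method may look odd at first. Here is a quick sketch showing three ways to write the same call in Ruby; the variable names are only for illustration.

```ruby
# Three ways of writing the same call: receiver 2, method +, argument 2.
with_operator = 2 + 2
with_dot      = 2.+(2)
with_send     = 2.send(:+, 2)

puts with_operator, with_dot, with_send
```

Each line sends the + message to the receiver 2 with a single argument 2, which is why Ruby pushes the receiver first and the argument second.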
Finally, Ruby executes the + operation. In this case, opt_plus is an optimized instruction that will add two values: the receiver and the argument, as shown in Figure 3-7.
Figure 3-7: The opt_plus instruction calculates 2 + 2 = 4.
As you can see in Figure 3-7, the opt_plus instruction leaves the result, 4, at the top of the stack. Now Ruby is perfectly positioned to execute the puts method call: The receiver self is first on the stack, and the single argument, 4, is at the top of the stack. (I’ll describe how method lookup works in Chapter 6.)
Next, Figure 3-8 shows what happens when Ruby executes the puts method call. As you can see, the opt_send_without_block instruction leaves the return value, nil, at the top of the stack. Finally, Ruby executes the last instruction, leave, which finishes the execution of our simple, one-line Ruby program. Of course, when Ruby executes the puts call, the C code implementing the puts method will actually display the value 4 in the console output.
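To tie these steps together, here is a toy model of the internal stack written in Ruby. This is not how YARV is implemented (YARV is written in C), and run_toy_vm and the array-based instruction format are inventions of mine for this sketch. It only mimics the pushes and pops shown in Figures 3-3 through 3-8.

```ruby
# A toy model of the internal stack. run_toy_vm and the instruction
# format are hypothetical; real YARV instructions are implemented in C.
def run_toy_vm(program, self_object)
  stack = []   # internal stack; the last element is the top of the stack
  pc = 0       # program counter: index of the next instruction
  trace = []   # the stack contents after each instruction

  loop do
    name, operand, argc = program[pc]
    pc += 1

    case name
    when :putself
      stack.push(self_object)
    when :putobject
      stack.push(operand)
    when :opt_plus
      argument = stack.pop
      receiver = stack.pop
      stack.push(receiver + argument)
    when :opt_send_without_block
      arguments = stack.pop(argc)   # arguments sit above the receiver
      receiver = stack.pop
      stack.push(receiver.send(operand, *arguments))
    when :leave
      break
    end

    trace << [name, stack.dup]
  end

  trace
end

program = [
  [:putself],
  [:putobject, 2],
  [:putobject, 2],
  [:opt_plus],
  [:opt_send_without_block, :puts, 1],
  [:leave]
]

trace = run_toy_vm(program, TOPLEVEL_BINDING.receiver)
trace.each { |name, stack| puts "#{name}: #{stack.inspect}" }
```

Running the sketch prints 4 when the toy send step calls puts, and the trace shows the stack growing to hold self and both 2 values, shrinking to self and 4 after opt_plus, and ending with only nil.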
Figure 3-8: Ruby calls the puts method on the top self object.

