Compiling a Call to a Block
I've started working on a new edition of Ruby Under a Microscope that covers Ruby 3.x. I'm working on this in my spare time, so it will take a while. Leave a comment or drop me a line and I'll email you when it's finished.
This week's excerpt is from Chapter 2, about Ruby's compiler. Whenever I think about it, I'm always surprised that Ruby has a compiler, just like C, Java, or any other programming language. The only difference is that we don't normally interact with Ruby's compiler directly.
The developers who contributed Ruby's new parser, Prism, also had to rewrite the Ruby compiler because Prism now produces a completely different, redesigned abstract syntax tree (AST). Chapter 2's outline is more or less the same as it was in 2014, but I redrew all of the diagrams and updated much of the text to match the new AST nodes and other changes for Prism.
Chapter 2: Compilation
The Ruby Compiler
Ruby 3.2 Introduces a Just-In-Time (JIT) Compiler
How Ruby Compiles a Simple Script
Scope AST Nodes
Compiling a Simple AST
Compiling a Call to a Block
How Ruby Iterates Through the AST
Experiment 2-1: Displaying YARV Instructions
The Local Table
Compiling Optional Arguments
Compiling Keyword Arguments
Unnamed Local Variables
Experiment 2-2: Displaying the Local Table
Summary
Compiling a Call to a Block
Next, let’s compile my 10.times do example from Listing 1-1 in Chapter 1 (see Listing 2-2).
10.times do |n|
  puts n
end
Notice that this example passes a block to the times method. This is interesting because it will give us a chance to see how the Ruby compiler handles blocks. Figure 2-13 shows the AST for the 10.times do example again.
Figure 2-13: A simplified view of the AST for the call to 10.times, passing a block
The left side of Figure 2-13 shows the AST for the 10.times method call: the call node and the receiver 10, represented by an integer node. On the right, Figure 2-13 shows the beginning of the AST for the block, do |n| puts n end, represented by the block node. You can see Ruby has added a scope node on both sides, since there are two lexical scopes in Listing 2-2: the top level and the block.
Let’s break down how Ruby compiles the main portion of the script shown on the left of Figure 2-13. As before, Ruby starts with the first PM_SCOPE_NODE and creates a new snippet of YARV instructions, as shown in Figure 2-14.
Figure 2-14: Each PM_SCOPE_NODE is compiled into a new snippet of YARV instructions.
Next, Ruby steps down the AST nodes to PM_CALL_NODE, as shown in Figure 2-15.
Figure 2-15: Ruby stepping through an AST
At this point, there is still no code generated, but notice in Figure 2-13 that two arrows lead from PM_CALL_NODE: one to PM_INTEGER_NODE, which represents the 10 in the 10.times call, and another to the inner block. Ruby will first continue down the AST to the integer node and compile the 10.times method call. The resulting YARV code, following the same receiver-arguments-message pattern we saw in Figures 2-7 through 2-11, is shown in Figure 2-16.
Figure 2-16: Ruby compiles the 10.times method call.
Notice that the new YARV instructions shown in Figure 2-16 push the receiver (the integer object 10) onto the stack first, after which Ruby generates an instruction to execute the times method call. But notice, too, the block in <main> argument in the send instruction. This indicates that the method call also contains a block argument: do |n| puts n end. In this example, the arrow from PM_CALL_NODE to the second PM_SCOPE_NODE has caused the Ruby compiler to include this block argument. Ruby continues by compiling the inner block, beginning with the second PM_CALL_NODE shown at right in Figure 2-13. Figure 2-17 shows what the AST for that inner block looks like.
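One way to check this on your own machine is to ask Ruby for the YARV instructions directly. The following is a minimal sketch, assuming CRuby 3.x, where RubyVM::InstructionSequence is available. The exact disassembly text varies between Ruby versions, and because the code is compiled from a string here, the top-level scope is labeled <compiled> rather than <main>.

```ruby
# Sketch: compile the example from a string and print its YARV instructions.
# Assumes CRuby; RubyVM::InstructionSequence is not available on other Rubies.
source = "10.times do |n|\n  puts n\nend\n"

iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm

# The listing contains the top-level snippet followed by the block's snippet.
puts listing
```

In the printed listing, look for the instruction that pushes 10 and for the send instruction that follows it; the block argument appears as an extra operand on that send line.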
Figure 2-17: The branch of the AST for the contents of the block
Notice that Ruby also inserted a scope node at the top of this branch of the AST. Figure 2-17 shows that the scope node contains two values: argc=1 and locals: [n]. These values were empty in the parent scope node, but Ruby set them here to indicate the presence of the block parameter n. From a relatively high level, Figure 2-18 shows how Ruby compiles the inner block.
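The same information surfaces on the compiled block. As a rough sketch, assuming CRuby 3.x, the array returned by RubyVM::InstructionSequence#to_a carries each snippet's type, local variable names, and a parameter summary. The index positions used here (9, 10, and 11) follow the order of that array as I understand it, and the variable names are only illustrative.

```ruby
# Sketch: inspect the locals recorded for the top level and for the block.
# Assumes CRuby; the to_a layout is an internal format and may change.
source = "10.times do |n|\n  puts n\nend\n"
top = RubyVM::InstructionSequence.compile(source)

# Collect the child snippets of the top-level scope; here that is the block.
children = []
top.each_child { |child| children << child }
block = children.first

top_locals   = top.to_a[10]    # local variable names for the top level
block_type   = block.to_a[9]   # kind of snippet
block_locals = block.to_a[10]  # local variable names for the block
block_params = block.to_a[11]  # parameter summary for the block

puts "top-level locals: #{top_locals.inspect}"
puts "block type:       #{block_type.inspect}"
puts "block locals:     #{block_locals.inspect}"
puts "block params:     #{block_params.inspect}"
```

The top-level list comes back empty while the block's list contains n, which mirrors the difference between the two scope nodes in the figures.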
Figure 2-18: How Ruby compiles a call to a block
You can see the parent PM_SCOPE_NODE at the top, along with the YARV code from Figure 2-16. Below that, Figure 2-18 shows the inner scope node for the block, along with the YARV instructions for the block’s call to puts n. Later in this chapter we’ll learn how Ruby handles parameters and local variables, like n in this example, and why Ruby generates these instructions for puts n. The key point for now is that Ruby compiles each distinct scope in your Ruby program (a method, block, class, or module, for example) into a separate snippet of YARV instructions.
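To see the one-snippet-per-scope idea in action, here is a small sketch, again assuming CRuby 3.x. It compiles a script containing a method, a class, and a block, then walks the resulting tree of instruction sequences with each_child and collects their labels. The greet method, the Counter class, and the collect_labels helper are hypothetical names made up for this illustration.

```ruby
# Sketch: list the separate YARV snippets Ruby creates for each scope.
# Assumes CRuby; the source is only compiled, never executed.
source = <<~RUBY
  def greet
    puts "hi"
  end

  class Counter
  end

  10.times do |n|
    puts n
  end
RUBY

# Recursively record the label of a snippet and of every snippet nested in it.
def collect_labels(iseq, labels = [])
  labels << iseq.label
  iseq.each_child { |child| collect_labels(child, labels) }
  labels
end

labels = collect_labels(RubyVM::InstructionSequence.compile(source))
puts labels
```

Each label printed corresponds to one scope in the script: the top level, the method, the class body, and the block.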

