Meta Circular Evaluators

This page is completely based on the 4th chapter of the famous "Abelsson and Sussman" course (book: Abelson & Sussman: Structure and Interpretation of Computer Programs - SICP).

The evaluator, which determines the meaning of expressions in a programming language, is just another program.

Meta Circularity

A meta circular evaluator for Pico is an interpreter for Pico that is written in Pico itself. How is this possible? In order to explain this, we just realize that in fact there are 2 Pico's: the first Pico is the 'real' Pico (whose interpreter was written in C, but that's not essential at all) and the 'meta circular Pico' (whose interpreter was written in 'real' Pico). The 'real' Pico is called the meta level and the 'false' Pico is called the base level Pico. Notice that 'false' is not really a well chosen word here as that evaluator is a real as the first one: it runs Pico programs. We will sometimes also talk about the implementing Pico (i.e. the 'real' Pico) and the implemented Pico (i.e. the 'other' Pico).

According to SICP, an evaluator that is written in the same language that it evaluates is said to be meta circular.

Studying meta circular evaluators for any language L is a very useful activity because:

By studying this volume of L code, you get to know L much better than by merely studying small code excerpts and standard algorithms.
By studying the code, we get a a precise definition of the semantics of L. In fact, the code for the evaluator is the only precise definition of the semantics. The only drawback of a metacircular definition is that it cannot be used in practice because the resulting evaluator is too slow.
Although the evaluator studied is written specifically for L, it contains the essential structure of an evaluator for any computer language. By studying and experimenting with one metacircular evaluator, students are acquainted to the entire discipline of interpreter building is a very accessible way.
The essence of meta circularity is that one has to understand precisely one program written in L in order to understand all programs written in L!

Structure of the Meta Circular Evaluator

The structure of a meta circular evaluator is no different from the structure of any other evaluator. The only thing special about it is that it is written in the same language it evaluates. The evaluator consists of a so called 'Read-Eval-Print'-Loop (or REPL). This is a loop that is started and terminated by the programming environment. As you might expect, the REPL consists of three parts:

Read When the user types in a program, this program is just a text, a meaningless sequence of characters. Before the core of the evaluator can do something reasonable with it, it has to check whether that sequence of characters indeed constitutes a valid program. This because many texts can be rejected as being wrong programs (e.g. because parentheses don't match). Instead of detecting this during the execution of the wrong program, it is better to detect it before beginning any execution. This checking phase is called the reading phase of the REPL. The reader takes a piece of text that is expected to meet the rules of the syntax of the language. If the text does not meet the definition of the syntax, an error is generated. Otherwise, the reader transforms the text meeting the concrete syntax into a tree data structure, the so-called parse tree. This tree represents the structure of the program and no longer contains surface syntax like parentheses. The reader for pico consists of two files representing the scanner (or lexical analyzer) and the parser (or structural analyzer). The scanner transformes the program text into a stream of 'tokens' such that all irrelevant information such as whitespaces and carriage returns have been removed. The parser consumes these tokens and tries to find the structure linking the tokens in order to build up the parse tree.
Eval After reading a string, we obtain a parse tree which is of course much easier to process than the original string (after all, that's the reason why we parse in the first place). Therefore, evaluating the program boils down to performing a tree recursive process on the parse tree. Hence, the evaluator is a tree recursive process that consumes a parse tree and returns a Pico value. Each step in this recursive process takes one node in the parse tree, dispatches over the kind of node (is it a definition, an assignment,...) and undertakes the appropriate steps to process that particular kind of node. In many cases, this will require recursive calls to the evaluator. The result of the recursive evaluation procedure is called a Pico value.
Print After calling the evaluator, the printer is called. The printer can be considered the inverse of the reader. It takes a Pico value (a tree) and prints the value on the screen in a form that makes sense to the Pico programmer.

Back to the beginning !

This page was made (with lots of hard work!) by Wolfgang De Meuter