Stack-C

Stack-C is a stack-based language. No clear design philosophy was followed: it was developed alongside the C compiler and was shaped by a number of pragmatic decisions. Initially it was thought of as a (rather Spartan) low-level programming language, and it contains some constructs that are not generated by the C compiler. It is therefore rather arbitrary that it supports the short-circuit logical 'or' and 'and', which can easily be emulated with an if statement, while it has no support for for and switch statements.

Keywords

The following C keywords are used:

Comments

Lines starting with the # character are considered comments and copied verbatim to the output.

Values

An integer value (following C syntax) will push that value on the top of the stack.

A single quoted character will push the value of the character on the stack. The following escape sequences are recognized: \0, \n, \r, and \t.
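
As a minimal sketch of the character rule above, the following Python function maps a single-quoted token to the value that would be pushed. The name char_value and the choice to reject every other escape sequence are assumptions made for illustration; the code in stack_c.c may behave differently.

```python
# Hypothetical helper, not taken from stack_c.c.
ESCAPES = {"0": 0, "n": 10, "r": 13, "t": 9}  # \0, \n, \r and \t


def char_value(token: str) -> int:
    """Return the value pushed for a token such as 'a' or '\\n'."""
    if len(token) < 3 or token[0] != "'" or token[-1] != "'":
        raise ValueError("not a single-quoted character: " + token)
    body = token[1:-1]
    if len(body) == 1 and body != "\\":
        return ord(body)                      # plain character
    if len(body) == 2 and body[0] == "\\" and body[1] in ESCAPES:
        return ESCAPES[body[1]]               # one of the four escapes
    raise ValueError("unsupported character literal: " + token)
```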

A double quoted string will push the address of the string constant on the stack. Repeated string constants will point to the same address. The same escape sequences as for a character are recognized.

An identifier (not used in another context) will push the value associated with the identifier on the top of the stack. For variables the value is the address of the variable and for functions it is the start address of the function.

Stack operators

The following operators work like the C operators: they take the top two values from the stack and push the result on the stack (leaving the stack one value shorter). They presume that the values represent unsigned integers.

+ - * / % & | ^ << >> == != < <= > >=
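
The stack effect of these operators can be sketched in Python as follows. This is only an illustration: the 32-bit word size, the operand order (the value pushed first is the left operand) and the name apply_binary are assumptions, and division by zero is not handled.

```python
import operator

MASK = 0xFFFFFFFF  # assumption: values are 32-bit words

BINARY = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.floordiv, "%": operator.mod,
    "&": operator.and_, "|": operator.or_, "^": operator.xor,
    "<<": operator.lshift, ">>": operator.rshift,
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def apply_binary(stack, op):
    """Pop two values, apply op to them as unsigned words, push the result."""
    right = stack.pop()
    left = stack.pop()
    if op in ("<<", ">>") and right >= 32:
        raise ValueError("shift counts of 32 or more are not modelled here")
    result = BINARY[op](left, right)
    # Comparisons yield True/False; int() turns them into 1/0 as in C.
    stack.append(int(result) & MASK)
```

With this sketch, a stack holding 3 and 4 ends up holding 0xFFFFFFFF after "-", because the result wraps around as an unsigned value.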

The following operators are similar to the above, except that they presume that the values represent signed integers.

/s <s <=s >s >=s
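
The difference between the unsigned and the signed variants only shows when the highest bit of a value is set. The Python sketch below assumes 32-bit two's-complement words and a signed division that truncates toward zero (as C does); the function names are hypothetical.

```python
MASK = 0xFFFFFFFF  # assumption: values are 32-bit words


def to_signed(value):
    """Reinterpret a 32-bit word as a two's-complement signed integer."""
    value &= MASK
    return value - 0x100000000 if value & 0x80000000 else value


def less_unsigned(left, right):
    """Model of <: compare the words as unsigned integers."""
    return int((left & MASK) < (right & MASK))


def less_signed(left, right):
    """Model of <s: compare the words as signed integers."""
    return int(to_signed(left) < to_signed(right))


def div_signed(left, right):
    """Model of /s, assuming truncation toward zero."""
    a, b = to_signed(left), to_signed(right)
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return quotient & MASK
```

For example, the word 0xFFFFFFFF is larger than 1 when compared with <, but smaller than 1 (it represents -1) when compared with <s.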

The following operators are similar to the C operators but operate on the top value only (replacing it).

~ !

The following special operators are defined:

Statements

Constant definition

A constant definition starts with the keyword const followed by an identifier and an integer constant.

Function definition

A function definition starts with the keyword void followed by an identifier. It is followed either by a ; character, to indicate a forward declaration of the function, or by a code block. The curly brackets { and } are used for block definitions.

Global or local variable definitions

A global or local variable definition starts with the keyword int followed by an optional positive integer and an identifier. The optional integer specifies the size in multiples of four bytes. When a variable definition occurs within a block it is considered to be local to that block.

Static variable definitions

These are like local variable definitions except that they start with the keyword static.

If statement

The if statement starts with the if keyword and uses the top value on the stack to determine whether the following block will be executed or, in case that block is followed by the else keyword, the block following that keyword.

Do statement

The do statement starts with the do keyword followed by a block. Within a do statement, the keywords break and continue may be used to indicate a jump out of the loop and a jump to the start of the loop respectively.

Logic and and or statements

The logic and and or statements start with && and || respectively, followed by a code block (in order to implement the short-circuit functionality). Whether the block will be executed depends on the top value of the stack, which will be popped in case the following block is executed. For this reason && { is equivalent to $ if { ; and || { to $ ! if { ;.
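
The short-circuit behaviour described above can be modelled in Python as follows. This is a sketch under assumptions: the stack is a Python list, the block is a hypothetical callable that pushes its own result, and any non-zero value counts as true.

```python
def logical_and(stack, block):
    """Model of `&& { block }`: run the block only if the top value is non-zero."""
    if stack[-1] != 0:
        stack.pop()     # the left-hand value is discarded ...
        block(stack)    # ... and the block leaves the final result
    # otherwise the zero stays on the stack as the result


def logical_or(stack, block):
    """Model of `|| { block }`: run the block only if the top value is zero."""
    if stack[-1] == 0:
        stack.pop()
        block(stack)
    # otherwise the non-zero value stays on the stack as the result
```

In both cases exactly one value is left on the stack, whether or not the block was executed.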

Return statements

The return statement consists of the return keyword. The stack is not affected.

Goto statement

The goto statement starts with the goto keyword followed by an identifier representing a label. Labels are defined by : followed by an identifier. The label definition may occur before or after the goto statement.

Stack-C compiler

The Stack-C compiler is implemented in stack_c.c. It produces output for the M1 assembler. The contents of the file stack_c_intro.M1 are copied to the start of the output; this introduces the labels ELF_text, _start, f_sys_int80, f_sys_malloc, and SYS_MALLOC. The compiler also generates some new labels, such as ELF_end and those described below. For each global variable the compiler produces a label prefixed with l_, and for each function a label prefixed with f_.

For labels used inside functions the compiler produces labels of the form l_%s_%s, where the first %s is replaced with the function name and the second with the label name. (This could lead to a problem, if, for example, there is a function with the name func_a_x with a label b and a function with the name func_a with a label x_b.)
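
The collision mentioned above is easy to reproduce. The following Python lines apply the l_%s_%s pattern to the two cases from the example; function_label is a hypothetical helper name, not a function from stack_c.c.

```python
def function_label(function_name, label_name):
    """Build a label following the l_%s_%s pattern."""
    return "l_%s_%s" % (function_name, label_name)


first = function_label("func_a_x", "b")
second = function_label("func_a", "x_b")

# Both calls produce l_func_a_x_b, so the two labels clash.
print(first, second, first == second)
```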

For implementing the various language constructs, it will introduce labels of the following forms, where %s is replaced by the function name and %d by a unique integer value (for the function):

For each static variable, it will introduce a label of the form static_%d_%s, where %d is replaced by a unique number and %s by the name of the static variable. (Note that it is possible to have several static variables with the same name in a single function.)

For each constant string, it will introduce a label of the form string_%d, where the %d is replaced by a unique number.
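
Together with the rule that repeated string constants point to the same address, this suggests a table that hands out one label per distinct string. The Python sketch below illustrates the idea; the class name StringTable and the numbering starting at zero are assumptions, not details taken from stack_c.c.

```python
class StringTable:
    """Hands out one string_%d label per distinct string constant."""

    def __init__(self):
        self._labels = {}  # string contents -> label

    def label_for(self, text):
        # A repeated constant gets the label assigned the first time.
        if text not in self._labels:
            self._labels[text] = "string_%d" % len(self._labels)
        return self._labels[text]
```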

x86 implementation

The implementation makes use of two stacks. The first stack contains the temporary values used for the evaluation of expressions, including arguments passed to functions and results returned by functions. For this the normal stack is used. The second stack is used for local variables and also for the return addresses of the functions being called. Its stack pointer is stored in the ebp register. All local variables are given a positive index with respect to this second stack pointer, which is moved on function call and exit. Before main is called, 100,000 bytes are allocated for the second stack.
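
The division of work between the two stacks can be illustrated with a small Python model. It is only a sketch: the frame layout (return address in slot 0, locals from index 1), the word-based addressing and all names are assumptions made for the example and are not taken from the generated x86 code.

```python
class TwoStackMachine:
    """Toy model of a value stack plus a separate stack for locals."""

    def __init__(self, frame_words=25000):   # 100,000 bytes / 4 bytes per word
        self.values = []                      # expression values, arguments, results
        self.frames = [0] * frame_words       # locals and return addresses
        self.base = 0                         # plays the role of ebp

    def call(self, return_address, caller_frame_size):
        # Move the second stack pointer past the caller's frame and
        # store the return address in the first slot of the new frame.
        self.base += caller_frame_size
        self.frames[self.base] = return_address

    def ret(self, caller_frame_size):
        # Fetch the return address and move the pointer back.
        return_address = self.frames[self.base]
        self.base -= caller_frame_size
        return return_address

    def store_local(self, index, value):
        self.frames[self.base + index] = value   # positive index from the pointer

    def load_local(self, index):
        return self.frames[self.base + index]
```

Because arguments and results travel on the value stack, a call only has to move the second pointer; the locals of the caller stay untouched below it.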

Stack-C interpreter

The Stack-C interpreter is implemented in stack_c_interpreter.c. This interpreter was primarily developed for debugging purposes, not for being fast. It keeps track of what kind of data is stored in a memory location: whether it is a value or a pointer to a memory location, to a function, or to a constant string. It also performs range checking by representing each pointer as a pair of a pointer to the start of a memory range (or string constant) and an index. It generates errors when 'illegal' operations are performed and warnings when tricky operations are performed, such as comparing pointers to two different memory locations, because their order could depend on the implementation of the memory allocation function. When an error is reported, the contents of the stack and the call stack are printed. Comment lines that consist of a file name followed by a number are interpreted as references to source files. Information from these is used in the call stack.
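
The pointer representation described above can be sketched in Python as follows. The names Pointer, load and pointer_less are hypothetical, memory ranges are modelled as Python lists, and the result returned for pointers into different ranges is an arbitrary choice made for the example.

```python
import warnings
from dataclasses import dataclass


@dataclass
class Pointer:
    region: list  # the memory range (or string constant) pointed into
    index: int    # offset within that range


def load(pointer):
    """Read through a pointer, refusing accesses outside its range."""
    if not 0 <= pointer.index < len(pointer.region):
        raise IndexError("pointer outside its memory range")
    return pointer.region[pointer.index]


def pointer_less(left, right):
    """Compare two pointers, warning when they belong to different ranges."""
    if left.region is not right.region:
        warnings.warn("comparing pointers into different memory ranges")
        # The order of separate ranges depends on the allocator;
        # this sketch falls back on Python object identity.
        return id(left.region) < id(right.region)
    return left.index < right.index
```

Keeping the range next to the index is what makes both checks possible: the bounds test needs the length of the range, and the comparison needs to know whether both pointers share a range.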

The interpreter assumes that all pointers are stored at 'word' (four byte) offsets.

The interpreter only has support for a limited number of system calls, namely:

There have been plans to incorporate a debugger to allow inspection of values on the stack and the values of variables.

