iA: intermediate Assembly programming language

WARNING: this library is in the design phase and is not even remotely useable.

iA (intermediate Assembly) is a programming language for writing (and disassembling) straightforward cross-platform assembly. It exists as a kind of "typed assembly": something more useful and powerful than traditional assembly but less powerful than a language like C. Its primary design goal is to map directly and straightforwardly to assembly while still helping the programmer avoid most footguns.

It does this with the following features:

Core Concepts

Registers and values

iA operates directly on registers, which are named with the letters A through T (20 registers). It can also operate on local values (V), which are stored on the return stack, and "universal" values (U), which are on the heap. W, X, Y and Z are reserved for future use (probably thread-local values and flags).

Any value can be declared "corruptable" (aka mutable) with $, which means that it can be modified within the scope declared. If it is ever declared "uncorruptable" (without $) then it cannot be modified within that scope.

You can declare a named variable and its type with:

const answer: Int = 42  \ a constant, not stored anywhere.

 A val1: Int = 41     \ register A, pre-initialized with 41, non-mutable
$B val2: Int = answer \ register B, pre-initialized with const, mutable
$V val3: Int = val1   \ mutable local value, increases return stack size
$U val4: Int = 0      \ mutable universal value, stored on heap
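
The corruptable rule above can be modeled outside of iA. The following is a minimal Python sketch, not part of iA: the names Scope and CorruptionError are hypothetical, and the corruptable flag stands in for the $ prefix.

```python
class CorruptionError(Exception):
    """Raised when a non-corruptable value is modified (hypothetical name)."""


class Scope:
    """Toy model of one iA scope: tracks which names were declared with `$`."""

    def __init__(self):
        self._values = {}
        self._corruptable = {}

    def declare(self, name, value, corruptable=False):
        # `corruptable=True` stands in for the `$` prefix in a declaration.
        self._values[name] = value
        self._corruptable[name] = corruptable

    def assign(self, name, value):
        # Only names declared corruptable may be modified in this scope.
        if not self._corruptable[name]:
            raise CorruptionError(f"{name} is not corruptable in this scope")
        self._values[name] = value

    def get(self, name):
        return self._values[name]


scope = Scope()
scope.declare("val1", 41)                    # like:  A val1: Int = 41
scope.declare("val2", 42, corruptable=True)  # like: $B val2: Int = answer
scope.assign("val2", scope.get("val1") + 1)  # allowed: val2 was declared with $
```

Calling scope.assign("val1", 0) in this model raises CorruptionError, which mirrors the rule that a value declared without $ cannot be modified within that scope.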

Functions

iA supports defining functions. Unlike in many languages, function arguments must specify exactly which registers (or V) the function operates on and which registers should be considered "return" values of the function. Functions cannot return local values (but can modify passed-in pointers or arrays).

The simplest way to specify a function is via the #auto attribute.

\        name  inputs      outputs
#auto fn fib  (V i: Int) -> Int    do
  ... expr1 or multi-assignment statements ...
end

This function declares the following:

If the fn is #auto then the whole function must never specify specific registers (only use V inputs and locals). iA will automatically convert these to iA-reg-common standards and may replace other V used with registers for optimizations.

Some notes:

Without #auto, you may specify the specific registers to use and whether they are corruptable:

\  name  inputs        outputs   corrupts
fn fib  ($A i: Int) -> C: Int    $$BD     do
  ... expr1 or multi-assignment statements ...
end

A few notes:

Expressions

There are 3 types of what are called "single expressions" (expr1):

expr1's effectively "return" their leftmost value, so you can write something like below. Note that you can be explicit about the registers like below, or they can be inferred.

All expressions are evaluated from right to left (TODO: I think this will be changed to left->right and an optional >> left-to-right assignment operator will be added, stay tuned). Once a variable is assigned as a function input it is considered "locked" and attempts to mutate it will result in a corruption error.

(a=10) += (C c: Int = foo($A a=5, $B b = bar($A a=6)))

In the example above the A register is modified twice but is not locked until a=5, so the above code compiles. The above code is very explicit with what registers are modified and could be rewritten as:

(a=10) += foo(5, bar(a))

An example of a corruption error is below. Register B is locked with b=3 (remember, right-to-left evaluation) and then is corrupted by bar(a=1, b=2).

foo(a=bar(a=1, b=2), b=3)

From the above you can see the general rule when calling functions in iA: the functions called must be progressively simpler as they go from right to left, since more registers will be locked.
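
To make the locking rule concrete, here is a minimal Python sketch that models it. It is not part of iA: Lit, Call, check and CorruptionError are hypothetical names, arguments are keyed by register letter rather than by variable name, and it assumes that the locks taken for a call are released once that call has run.

```python
from dataclasses import dataclass, field


class CorruptionError(Exception):
    """Raised when a locked register would be overwritten (hypothetical name)."""


@dataclass
class Lit:
    value: int


@dataclass
class Call:
    name: str
    # (register, expression) pairs in source order, e.g. [("A", Lit(5))]
    args: list = field(default_factory=list)


def check(expr, locked=frozenset()):
    """Walk expr right to left and raise if a locked register is assigned."""
    if isinstance(expr, Lit):
        return
    held = set(locked)
    for reg, sub in reversed(expr.args):  # right-to-left evaluation
        check(sub, frozenset(held))       # nested calls see the current locks
        if reg in held:
            raise CorruptionError(f"{expr.name}: register {reg} is locked")
        held.add(reg)                     # reg stays locked until this call runs
    # Assumption: locks taken for this call are released once it has run,
    # so nothing is propagated back to the caller.


# Models foo(a=5, b=bar(a=6)): A is written by bar first, then locked at a=5.
ok = Call("foo", [("A", Lit(5)), ("B", Call("bar", [("A", Lit(6))]))])
check(ok)

# Models foo(a=bar(a=1, b=2), b=3): B is locked by b=3 before bar writes it.
bad = Call("foo", [("A", Call("bar", [("A", Lit(1)), ("B", Lit(2))])), ("B", Lit(3))])
```

In this model check(ok) passes, while check(bad) raises CorruptionError for register B, matching the two examples above.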

Types

iA supports the following native types:

Note that a conventional Str is just [U1].

In addition, users can define enums (named integer values) and structs composed of known-size typed fields.

Lua Metaprogramming

iA is implemented as a Lua library and it can be extended by other Lua libraries.

iA code can be defined in one of two types of files:

Symbols prefixed with # such as #auto are considered "macros". They are defined in Lua and are passed a single expression (which can be (inside, parens)). They are executed from right -> left, so #mymacro #auto fn myFn() would first execute #auto and then execute #mymacro.

Macros modify the AST directly, including: adding flags or debugging info, reorganizing nodes, changing operations, reorganizing registers, etc. A common use for macros is to inline code. Macro expansion happens before type or corruption checking, but all macros must be valid code.
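
The right-to-left macro order can be illustrated with a small Python sketch. This only models the ordering: real iA macros are defined in Lua, and apply_macros, the dict-based node and both macro bodies below are hypothetical.

```python
def apply_macros(macros, node):
    """Apply macros to node in right-to-left order and return the result."""
    for macro in reversed(macros):
        node = macro(node)
    return node


# Two hypothetical macros that only record the order in which they ran.
def auto(node):
    return {**node, "applied": node["applied"] + ["auto"]}


def mymacro(node):
    return {**node, "applied": node["applied"] + ["mymacro"]}


# Models `#mymacro #auto fn myFn()`: macros are listed in source order.
fn_node = {"kind": "fn", "name": "myFn", "applied": []}
result = apply_macros([mymacro, auto], fn_node)
print(result["applied"])  # ['auto', 'mymacro']
```

Each macro receives the node produced by the macro to its right, which is why #auto runs before #mymacro.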

iA package API

Use local iA = require"iA" to get the iA Lua package. It has the following methods and values:

It also has the following types:

Building

When building your iA project, you typically specify the dependencies you need, which will include both Lua and iA dependencies. On the Lua side, the architecture is roughly thus:

Basically, iA is built by calling the top-most pkg:iA(), which recursively calls the rest. This is handed to the build-script, which converts the intermediate assembly to actual machine bytecode and packages it into a binary depending on the configuration.

Code

Overview of writing iA code.

Operations

iA supports the following builtin assignment operations which modify only val. All names are expr1.

See also iA-reg-common for the integer multiply and division operations.

iA supports the following compare operations, which evaluate two expressions as a boolean, which can be assigned to a register or used for control flow:

Control Flow

These are used in control flow structures. For each, the last statement is evaluated for a non-zero value to determine when to jump.


::my_location:: \ define a location

\ jump to my_location
goto my_location;

\ jump to my_location if a < b
if a < b goto my_location;

\ if-elif-else blocks
if   a == 4 do
  ...
elif B b = foo(1, 2); c <= bar(b) do
  ...
else
  ...
end

\ Similar to a C++ for loop, loops from 0 - 9
\    init,  op before end,  loop condition
loop $I i=0 then i+=1       until i<10 do
  ...
end

\ Similar to C++ while loop.
until i<10 do ... end

\ infinite loop
loop do ... end


\ similar to C switch-case.
switch i
case 0..15 do
  ...
  goto next; \ explicit fallthrough
case 16    do
  ...
else
  ...
end

Register Common Practice

Below are the registers and their common usage. Although iA allows specifying any register as corruptable or saved, sticking to the conventions below for your public API will help most code run faster and behave more cleanly.

Input/output corruptable registers A B C D. By convention these registers are used for both inputs and outputs of functions, and will therefore be corrupted.

A B D: inputs should use these registers, in this order, then non-corruptable registers or local values.

Outputs should use C A B D (in that order). These are the only registers that can be function outputs (unless the platform is constrained); additional outputs must be represented as mutable pointer inputs.

Additionally, these registers should be chosen to work with the following:

Note: on some supported architectures, like the Z80, there are only 4 general-purpose registers, so the rest of these will be converted to V.

E F G H: corruptable registers, commonly used as additional function inputs after A B D.

I J K L M N O P: non-corruptable registers for general use. I J K are very common for loop registers.

Q R S T: corruptable registers, typically used for temporary values.
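
The input and output ordering above can be sketched as a small assignment routine. This Python sketch is one reading of the convention, not iA's allocator: it assumes inputs go to A B D, then E F G H, then local values (V), and the function names and the available parameter are hypothetical.

```python
INPUT_ORDER = ["A", "B", "D", "E", "F", "G", "H"]
OUTPUT_ORDER = ["C", "A", "B", "D"]


def assign_inputs(names, available=INPUT_ORDER):
    """Map input names to registers in convention order, then to V (locals)."""
    slots = {}
    for index, name in enumerate(names):
        slots[name] = available[index] if index < len(available) else "V"
    return slots


def assign_outputs(names):
    """Map output names to C A B D; more than four outputs is an error."""
    if len(names) > len(OUTPUT_ORDER):
        raise ValueError("extra outputs must be passed as mutable pointer inputs")
    return dict(zip(names, OUTPUT_ORDER))


print(assign_inputs(["x", "y", "z", "w"]))
# {'x': 'A', 'y': 'B', 'z': 'D', 'w': 'E'}

# Assumption: a constrained target only offers A B D for inputs.
print(assign_inputs(["x", "y", "z", "w"], available=["A", "B", "D"]))
# {'x': 'A', 'y': 'B', 'z': 'D', 'w': 'V'}
```

Passing a shorter available list models a constrained platform, where inputs that do not fit in registers fall back to V.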

Other registers: the following are special registers and cannot be assigned to a variable name. However, they can be accessed directly.

Mod iA

intermediate Assembly

Types: mod core array Ty Var Literal Expr1 Assign Cmp CondBlock If Loc Goto Switch While

Functions

Mod iA.mod

iA submodule containing all modules (both user-defined and native).

Modules contain their own types.

Types: array core

Mod iA.array

iA array module, containing all array types used in code.

Get or fetch an array type by calling it with the inner type.

Mod iA.core

iA default core module, containing core types and functions.

Functions

Record Ty

iA Type, either user-defined (i.e. struct, enum) or native.

Fields:

Record Var

Named register or memory location and its type.

Fields:

Record Literal

A literal value.

Fields:

Record Expr1

An expression which returns a single value or register. The number of items in the list will depend on the kind:

Fields:

Record Assign

Multi assignment statement. a, b, c = myfn() where:

Fields:

Record Cmp

Cmp operation between left and right

Fields:

Record CondBlock

A block of statements gated by a condition, used in If and Loop.

Note: a normal block is just a list (no type)

Fields:

Record If

if-elif-else statement (not if-goto). The list is a series of CondBlocks with an optional else block.

Fields:

Record Loc

A local #loc location statement.

Fields:

Record Goto

if cond goto to

Fields:

Record Switch

switch of do case 0 do ... default ... end

Type is map[int, list[stmt]]. 0 - highest MUST be filled out.

Fields:

Record While

while cond [atend] do block end

The block is stored in the While list.

Fields:

Mod iA.parse

parse intermediate assembly

Functions