Do you know “Human Resource Machine”?
If not: HRM is a puzzle game about writing assembler code. Have a look at it, it is definetely worth a try! πŸ™‚
In this post I’ll show you the compiler I made for it.

Table of contents

The Game

In HRM, you have control over a tiny office worker that is processing input values which are transported to the office by a band conveyor. However, you do not control the worker directly. Instead, you specify it’s behavior by assembling action tiles to an algorithm.
Your task as the player is to make the worker do the desired actions. Also, you can unlock achievements if your code does not consist of more instructions than desired or if the number of process cycles needed does not exceed a critical value.

That’s how the game looks like.

While the game mechanics are all drag ‘n’ drop and there is no real “writing” of code by default, it is possible to copy-and-paste your assembler code from within the game to an editor of your choice, where all the instruction tiles are converted to text.
It also works vice versa: it is possible to write your code in an editor and copy it to the game that transforms the text into instruction tiles.

The Idea

While I enjoyed optimizing the assembler code, at some point I realized that there were many operations (such as multiplying two numbers) that couldn’t get any better but were required over and over again, which made the last “puzzles” of HRM a little boring to me.
In addition to that, the in-game “editor” does not show many lines of code which makes it very hard to stay on top of things, especially in combination with the assembler code, which is confusing by design πŸ˜€

When reaching the last level – a sorting algorithm – I decided to write a compiler that would generate assembler code that I then could copy into the game.

Choice Of The Language

At first, I thought about writing a C compiler, but I quickly dropped the idea of using a language that already existed.
The main reason for that was the fact that HRM does not support the usage of literals in the code. If you really need a reference value for certain task, it will be given to you in form of a value that lies on the floor and can be used by the worker.

I could of course have generated reference values by subtracting the first input value from itself and incrementing the resulting zero a defined amount of times, but that seemed to be a little too ambitious to me.

So, after considering that, I decided that I would have to create my own programming language that would fit exactly the needs of HRM.

The Architecture

The architecture of the HRM worker and it’s office is quite simple.

The floor of the office consists of an array of tiles with unique numbers, which represent the RAM of the office.
The worker can carry one value (represented by little tiles/boxes) at a time making his hands his accumulator.

The worker’s instruction set is also pretty small, as the following table shows:

INBOX Load the next value from the inbox to the accumulator
OUTBOX Write the value of the accumulator to the outbox, clear the accumulator
COPYFROM [tile] Load the value from [tile] to the accumulator
COPYTO [tile] Write the value of the accumulator to tile [tile]
BUMPUP [tile] Increment tile [tile], load new value to accumulator
BUMPDN [tile] Decrement tile [tile], load new value to accumulator
ADD [tile] Add the value of [tile] to the accumulator
SUB [tile] Subtract the value of [tile] from the accumulator
JUMP [label] Jump to [label]
JUMPZ [label] Jump to [label] if the value of the accumulator is zero
JUMPN [label] Jump to [label] if the value of the accumulator is negative
[tile] within [ ] Refers to the tile that identifier matches the value of [tile]

Language Specification

I wanted the language to fit the needs of HRM. Because the tasks keep relatively simple, it would generate a lot of unnecessary overhead to make the language very abstract. Therefore, I wanted to keep some primitive instructions like “put a value to a specific field”.

What I really wanted to have was a way to compare the values of two tiles and use that information for branching, like if-statements or the break condition of loops.

Another thing that really grinded my gears was multiplication:
The multiplication of two values is an often needed task, but it needs 3 slots of memory and eleven lines of assembler code. So this needed to be shortened.

The full list of requirements then looked like this:

  • loops
  • if/then/else – structures
  • boolean evaluations (<, >, =)
  • arithmetic operations (multiplication, addition, subtraction)
  • some basic instructions
    • increment
    • decrement
    • put a value to a field
    • read a value from a field
    • read from input
    • write to output
  • comments

An interesting aspect of the language design was the fact that I would not enable the use of literals.
Without literals, numbers were no longer reseved for that. Instead, I could just use them to indicate the reference to a certain floor tile.
So, a statement like 16 = 7 + 1 does not contain any of the values 16, 7 or 1; what this line means is that the result of the addition of the values at the tiles 7 and 1 shall be written to tile 16.
Analogously, if 12 < 7 is not necessarily always false. πŸ˜‰

That is what the code looks like

If you are interested in the remaining details of the language, you can read the full specification at the GitHub-repository.

Results And Limitations

The compiler actually works pretty well and the benefits become quite clear if you compare the length and complexity of the resulting assembler code with the original source code.
Yet, there are a couple of limitations.

An example of bad optimization

First of all, from an optimization point of view the compiler output is quite scary.
Resulting from the fact, that if/then/else statements always have to have then and else, there are a couple of sequences where a label is surrounded by two jump instructions, making the label absolutely unnecessary.

Also, at the last level (the sorting algorithm), the compiler generated about twice as many instructions as needed to unlock the achievement and it took also about twice as long to execute the code. Of course, this could also be due to my poor algorithm (I just implemented a bubble sort algorithm), but I don’t think that’s all.

Another limitation of the compiler is the fact that it can barely be used for one of the earlier levels, since it might need instructions that are not unlocked for those levels.

A third problem are alphanumerical inputs, since not all instructions work on letters like they would work on numbers.
The SUB-instruction for example does not work on mixed values(you cannot do A - 7 or similar tasks).
In my particular case, when scanning the input stream for zeros to find the end of a string, I could not use my If 23 = 24 structure (where tile 23 contained the last input value and 24 contained a zero), because that would be compiled to a structure that subtracted 24 of 23, followed by a JUMPZ. Since this does not work on letters, I had to dig into the compiler output and remove the SUB 24-statement.
In this particular case, the code will still work, because JUMPZ also compares with zero AND can handle letters, but that hack only works if the array is terminated by zeros. So this is definitely a limitation of the compiler.

Another annoying thing is the lack of variable names. Since there are no literals, this could be done relatively easy by an search-and-replace preprocessor or a similar step, but I did not implement it yet, so I had to keep in mind what tile contains what value, which also is an obstacle when trying to focus on the code.

The Implementation

I implemented the compiler in python, the code can be found on GitHub.