Writing a Chip 8 Emulator (Part 2)

In a previous post, I wrote about the basic CPU structure needed to emulate a Chip 8. I also wrote about memory access, and how to define bytes and words. In this post, I will talk about how to interpret Chip 8 instructions, as well as how to set up a main emulator loop.

The Execution Cycle

Every computer executes a simple loop over and over again while it is powered up. This loop consists of the following steps:

  • Fetch
  • Decode
  • Execute

During the fetch portion of the loop, the CPU will grab the contents of memory where the Program Counter is currently pointing. This memory content indicates the instruction that the CPU should execute. For example, let’s say that the program counter points at memory location $0200, and that the data $1100 is stored there (technically, $11 is stored in location $0200, and $00 is stored in location $0201). The fetch portion of the cycle will load the CPU with the instruction $1100 (more on what this instruction actually does a little later).

The decode portion of the loop translates the instruction into a form that the CPU understands. Each instruction has a specific number of operands that are associated with it. These are essentially parameters that describe what registers or memory locations the instruction should actually be executed on. In the case of the Chip 8, the operands are all part of the 16-bit word that is read from the Program Counter. For example, with the instruction $1100, the $1 indicates that the instruction is a JUMP instruction, while the 100 is the operand that relates to the address that the CPU needs to jump to.

The execute portion of the loop actually executes the instruction on the operands that were previously decoded. In the case of the decoded JUMP instruction $1100, the CPU will advance the program counter to the memory location $100.

The computer will continue performing this loop over and over again until the power is turned off. Now that we understand what we need to do in the main loop, let’s look at how we decode the different instructions.

Anatomy of a Chip 8 Instruction

Chip 8 instructions are actually really nice to work with. Each instruction that the CPU can execute is exactly two bytes long, making them a 16-bit word in length. All of the data needed to execute the instruction is stored within this 16-bit word. Let’s take a look at a simple instruction and dissect how the operation works.

The Jump Instruction

The Chip 8 JUMP instruction tells the CPU to move the Program Counter to the address specified. The JUMP instruction is encoded as follows:

1nnn

The n characters are used to denote 4-bit numbers. When combined, the numbers form a 12-bit address that the CPU should jump to. For example, if the CPU encountered the following instruction:

1234

The CPU would set the program counter to $234. During the next loop of the CPU, the next instruction would be fetched from location $234.

The Load Operation

Let’s look at another example. The LOAD instruction tells the CPU to load a specific number into a register. The LOAD instruction is encoded as follows:

6snn

The s character is used to denote a 4-bit value that specifies a register to store the number in. The N characters denote 4-bit numbers that, when combined, indicate the 8-bit value that should be loaded into that register. For example, if the CPU encountered the instruction:

6248

The CPU would load the value $48 into register 2.

Chip 8 Operands

We can start to see patterns in the way a Chip 8 instruction is encoded. The first hex digit of the instruction usually provides a hint at what major operation is about to occur. The next three hex digits encode numeric information, or the registers that the operations work on. Here is a mostly complete set of Chip 8 instructions:

Opcode Operands Description
00E0 0 Clear the screen
00EE 0 Return from subroutine
1nnn 1 Jump to address nnn
2nnn 1 Call routine at address nnn
3snn 2 Skip next instruction if register s equals nn
4snn 2 Do not skip next instruction if register s equals nn
5st0 2 Skip if register s equals register t
6snn 2 Load register s with value nn
7snn 2 Add value nn to register s
8st0 2 Move value from register s to register t
8st1 2 Perform logical OR on register s and t and store in t
8st2 2 Perform logical AND on register s and t and store in t
8st3 2 Perform logical XOR on register s and t and store in t
8st4 2 Add s to t and store in s – register F set on carry
8st5 2 Subtract s from t and store in s – register F set on !borrow
8s06 1 Shift bits in register s 1 bit to the right – bit 0 shifts to register F
8s0E 1 Shift bits in register s 1 bit to the left – bit 7 shifts to register F
9st0 2 Skip next instruction if register s not equal register t
Annn 1 Load index with value nnn
Bnnn 1 Jump to address nnn + index
Ctnn 2 Generate random number between 0 and nn and store in t
Dstn 3 Draw n byte sprite at x location reg s, y location reg t
Ft07 1 Move delay timer value into register t
Ft0A 1 Wait for keypress and store in register t
Fs15 1 Load delay timer with value in register s
Fs18 1 Load sound timer with value in register s
Fs1E 1 Add value in register s to index
Fs29 1 Load index with sprite from register s
Fs33 1 Store the binary coded decimal value of register s at index
Fs55 1 Store the values of register s registers at index
Fs65 1 Read back the stored values at index into registers

Bit Shifting and Bit Masks

So, when writing an emulator, how do you separate out the various operands? The simple way is through bit masks and bit shifting. A bit mask is a binary pattern, usually used to turn certain bits on or off, or used to determine if a particular bit pattern is activated. Bit shifting is where the bits in a pattern are moved left or right by a certain number of spaces. Let’s work through a simple mask and shift example.

Let’s say that we had the hex number $23. In binary, this would be the following:

0010 0011

The 0010 corresponds to the 2, while the 0011 corresponds to the 3. Now let’s say you want to get just the hex value of the first digit. What we do is apply a bit mask with a logical AND operation, and shift the bits.

To break this down into a few steps, first what we want to do is isolate the high 4-bits in the pattern, and blank everything else out. We do that by applying a bit mask that has all ones in the high 4-bits like so:

1111 0000

That number in hex is $F0. Now that we know the mask value we need to apply, the next thing we do is apply a logical AND operation to $23 and $F0. The way an AND works is if both digits are a 1, then a 1 is the result, otherwise, the value is a 0. When we apply the AND:

0010 0011
1111 0000
--------- &
0010 0000

The result we are left with is 0010 0000, which is the hex value of $20. Now that we have blanked everything else out, we perform the second step, which is the logical shift. We want to move all of the bits of the result 4 places to the left. In a language like C, this is done with the >> operator.

0010 0000
--------- >> 4
0000 0010

This leaves us with 0000 0010 – the hex value of $2, which is exactly what we wanted!

Shifting and Masking Chip 8 Instructions

Let’s take a look at the Chip 8 instruction to move a value of one register into another. This corresponds to the instruction 8st0. This says to move the value in register s into the register t. An example of this would be 8620, which would make the Chip 8 CPU move the value in register 6 into register 2.

The first thing we need is the value for s. Using a bit mask, we want the following:

1000 0110 0010 0000
0000 1111 0000 0000
------------------- &
0000 0110 0000 0000

Once again, the magic hex value is $F for the 4-bit pattern 1111. The same operation in hex gives us:

$ 8 6 2 0
$ 0 F 0 0
  ------- &
$ 0 6 0 0

This gives us the result $0600. Now we need to shift the bits 8 places to the left this time:

0000 0110 0000 0000
------------------- >> 8
0000 0000 0000 0110

The same operation in hex gives us:

$ 0 6 0 0
  ------- >> 8
$ 0 0 0 6

So now we know that the s value should be $6! We do the same thing for the t register, except this time we only need to perform a shift of 4 places to the left. The hex operations are as follows:

$ 8 6 2 0
$ 0 0 F 0 
  ------- &
$ 0 0 2 0
  ------- >> 4
$ 0 0 0 2

This tells us that the t value should be $2!

The key points to take away here are:

  • Groups of 4-bits correspond to a single hex number. For example, 1001 corresponds to the hex digit $9.
  • Bit masks allow you to figure out specific bit patterns in a larger bit sequence.
  • When performing bit shifting hex values, always shift in multiples of 4. So if you want to get the $6 in $8620, you need to shift 2 hex digits to the left, which is 8 binary bits to the left (2 x 4).
  • If you are decoding the JUMP instruction, which is 1nnn, you don’t need to shift anything! Just apply the bit mask of $0FFF and you have the address you need to jump to.

Conclusion

In this post, we looked at the execution loop, and examined how Chip 8 instructions are encoded. We also looked at bit shifting and bit masks, and how those two simple tools can be used during the decoding phase. In a future post, I’ll look at some of the Chip 8 instructions in more detail, as well as how to code the execution loop in a language like C.

OpenStreetMap and PostGIS

Nearly all of the mobile devices that are available for purchase these days have GPS location services baked into them. In particular, running apps are interesting, since they can plot your position over time, calculate your pace, and show you where you were fast or slow along your route. I’m also particularly fond of what was once called Google Local. Being able to find restaurants near my location has been quite a handy feature when out travelling in new locations.

I have a few ideas surrounding mobile applications and GPS services, so I wanted to check out what kind of data is freely available. While the Google Location API is available for development on the Android platform, I wanted to know more about freely available alternatives, as well as get to know more about GIS in general.

Advertisements

OpenStreetMap

The OpenStreetMap project is a community driven project that attempts to provide location data all over the world. Contributors with GPS devices tag various locations, and can add them to the database. The database is released under the Open Data Commons Open Database License. Warning: I am by no means a lawyer, so you may want to read the text of the license yourself so that you understand it. The license for the OpenStreetMap database allows you to use it for commercial and personal use, but requires you to attribute the source, and release any modified version of the database back to the community. For me this is perfect, since it lets me experiment with various applications and GIS data without having to think about commercial licenses, closed APIs, or commercial data sources.

Enter PostGIS

The first question was: what database system is required to load and access all of that data? Some form of SQL database was favorable, since I know how to run SQL queries, and configuring PostgreSQL or MySQL is in my wheelhouse. As it turns out, there is an extension for PostgreSQL called PostGIS which is designed specifically for storing geographical and spatial objects within a database. Perfect.

Note: most of the installation instructions below were culled from a variety of sources such as here, here, and here. These sources detail how to set up a fully-fledged tile server, which will allow you to draw maps from the data contained within the database. In this post, I’m just interested in the PostGIS database, and loading OpenStreetMap data. I’ll talk about tile servers in a future post.

As it turns out installing PostGIS is incredibly easy under Ubuntu 14.04. Assuming that you don’t have a PostgreSQL server installed yet:

sudo apt-get install postgresql-9.3 postgresql-server-dev-9.3 postgresql-contrib-9.3 postgresql-9.3-postgis-2.1

Once installed, you will need to set up a database, and load some extensions into that database. To do that:

sudo -u postgres -i
createuser gis
createdb -E UTF8 -O gis gis
psql -f /usr/share/postgresql/9.3/contrib/postgis-2.1/postgis.sql -d gis
psql -f /usr/share/postgresql/9.3/contrib/postgis-2.1/spatial_ref_sys.sql -d gis
psql -f /usr/share/postgresql/9.3/contrib/postgis-2.1/postgis_comments.sql -d gis
echo "alter table geometry_columns owner to gis; alter table spatial_ref_sys owner to gis;" | psql -d gis
echo "create extension hstore;" | psql -d gis

The sudo command switches users to the postgres user so that we can run the database commands. The createuser command will create a gis user in the database, while the createdb command will create a database named gis and add the gis user to it. The commands that follow set up the necessary extensions to store GIS information in the database. It will also enable the hstore extension, which I’ll talk about a little later on in this post.

Importing OpenStreetMap Data

With the database ready to go, we need to load it with some actual data. You can download the entire world map at a whopping 35 GB from the Planet OSM site. However, if you are like me, you just want to experiment a little with GIS data, and probably only want a small portion of the database to start with. Geofabrik maintains a site with much smaller chunks of the world data. For example, I was able to download British Columbia map data separately, at the cost of 500 MB. You will want to download the data in the .osm.bz2 format.

While you are downloading the data for your region, you will also need to download and install the osm2pgsql importer. This open source tool is responsible for importing from the downloaded file format into your new PostGIS database. You can either build the tool from source, or if you are like me, simply install the package:

sudo apt-get install osm2pgsql

With everything downloaded, it’s time to actually run the import:

sudo -u postgres -i
osm2pgsql -c -d gis -U gis -W -C 2048 --hstore --slim ~/british-columbia-latest.osm.bz2

This runs the actual import. There are several options specified on the osm2pgsql command – let’s break down what is happening:

  • -c tells the tool to clear out the database before running the import.
  • -d lets us specify the database we want to load into, in this case the database gis.
  • -U specifies what user we should use during the import, in this case, we’re using the gis user we created before.
  • -C tells the system to use 2 GB of RAM for caching purposes. If you’re importing bigger data sets, you may need to make this larger.
  • --hstore tells the import to create an additional tags column (I’ll talk more about this below).
  • --slim tells the import to use database tables to store temporary data. This slows down the import, but is good for machines with little RAM.
  • ~/british-columbia-latest.osm.bz2 is the path to the actual data to load.

The tool will probably take some time to finish. On my Intel Dual Core i5 with 4 GB of RAM and a slow disk, this took about 64 minutes to complete. Once the process is over, I strongly recommend creating indexes for the tags on each of the tables. You can do that with the following commands:

sudo -u postgres -i psql gis
CREATE INDEX idx_planet_osm_point_tags ON planet_osm_point USING gist(tags);
CREATE INDEX idx_planet_osm_polygon_tags ON planet_osm_polygon USING gist(tags);
CREATE INDEX idx_planet_osm_line_tags ON planet_osm_line USING gist(tags);
\q

The command simply logs into the gis database, and creates indexes for the tags column on the tables planet_osm_point, planet_osm_polygon and planet_osm_line.

Querying the Database

Alright, with data in the database, now it’s time to actually run some queries on the OpenStreetMap data. The planet_osm_point table contains useful information relating to various points of interest. The schema for the planet_osm_point table contains quite a few fields such as:

  • amenity
  • barrier
  • bicycle
  • foot
  • horse
  • highway

Each of the fields can have a wide variety of different values associated with it. For example, the amenity field can take on values such as:

  • restaurant
  • school
  • library
  • ferry_terminal
  • fuel
  • dentist

Luckily, the OpenStreetMap project maintains an online wiki with all of this information described in it. For example, they have a description of the amenity field which defines what values will be found on it. We can use this information to write some simple SQL statements. For example, say you wanted to find libraries in the database:

SELECT name, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point
WHERE amenity='library';

Or maybe you wanted to find restaurants:

SELECT name, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point
WHERE amenity='restaurant';

This will return records that are considered to be restaurants, showing the name (if it has one), and the coordinates where it can be found. That’s good, but let’s say that you are only interested in finding pizzerias. Trudging through all of the restaurant data looking for what seems to be a pizzeria would take too long. This is where additional tags come into play.

Tags and the HSTORE Column Type

Additional information for each point is stored in the tags column. Each point may have one or more tags associated with it, describing things such as address, hours of operation, phone number, etc. The tags of each point are stored as an HSTORE column. The HSTORE column type is simply a set of keys and their associated values normally called a key, value pair. A single record may have only one entry with the specified key, but may have many key, value pairs. A typical key, value pair is specified as follows:

KEY=>VALUE

There are several helpful functions in PostgreSQL that help you deal with HSTOREs. For example, to check if the HSTORE column tags contains a phone key, regardless of its value:

tags ? 'phone'

To check if the HSTORE column tags has a key of cuisine and value of fast_food:

tags @> 'cuisine=>fast_food'

Queries with Tags

Putting together the tags column and some common PostgreSQL syntax results in some more selective queries. For example, to find all the restaurants that are pizzerias:

SELECT name, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point
WHERE amenity='restaurant' AND tags @> 'cuisine=>pizza';

You have to be careful however, since the restaurant classification is only one of many food related classifications in the amenity ontology. For example, there is also:

  • pub
  • ice_cream
  • cafe
  • food_court

I was surprised that Starbucks was under a different amenity type – cafe. At first, I was restricting my search to those elements that had the cuisine key, with coffee_shop as the value:

SELECT name, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point 
WHERE amenity='cafe' AND tags @> 'cuisine=>coffee_shop' AND name='Starbucks';

Note however, that not all Starbucks have a cuisine tag associated with them, so sometimes you may have to modify your query to be less specific. For example, you can see here the difference in number of records returned when we search just for the name, and when we start to add more restrictions on the query:

gis=# SELECT COUNT(*) 
FROM planet_osm_point 
WHERE name='Starbucks';
 COUNT
-------
   122
(1 ROW) 
 
gis=# SELECT COUNT(*) 
FROM planet_osm_point 
WHERE name='Starbucks' AND amenity='cafe';
 COUNT
-------
   121
(1 ROW)
 
gis=# SELECT COUNT(*) 
FROM planet_osm_point 
WHERE name='Starbucks' AND amenity='cafe' AND tags @> 'cuisine=>coffee_shop';
 COUNT
-------
    25
(1 ROW)

There is quite a bit of information available as tags. For example, perhaps you’re interested in finding restaurants that have phone numbers associated with them in the database:

SELECT name, tags, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point
WHERE amenity='restaurant' AND tags ? 'phone';

Or maybe you are looking for a particular restaurant along Yates Street:

SELECT name, tags, ST_AsText(ST_Transform(way,4326)) AS pt_lonlattext
FROM planet_osm_point
WHERE amenity='restaurant' AND tags @> 'addr:street=>Yates\ Street';

As you can see, it’s really simple to perform some basic queries.

Conclusion

In this post, we looked at how to set up a PostGIS database to store OpenStreetMap data. We also looked at how to perform some basic queries using the provided fields, as well as tags. In a future post, I will discuss tile servers, and how to serve actual map images.