I need to take a quick detour here to talk about the toolchain I’ve been using to write the GPU code for Veronica. As I bring up the GPU to an ever-more-useful state, it’s starting to need large quantities of fairly complex code. Since any kind of high-level language is out of the question in this situation, I need to use the next best thing: macro-based assembly. Macro assemblers have been around almost as long as computers, because macros are an easy layer to add on top of a normal assembler, and they multiply programmer efficiency. Macros can be very powerful indeed, and at times can make it feel like you’re coding in C or another high-level language. Of course, you have to keep in mind what you’re really doing (copying and pasting code around), and proceed with care at all times.
Atmel’s official development environment for AVRs includes a macro assembler, and I’m sure it’s swell. However, it’s also Windows-only. If, like me, you’re using all open-source cross-platform tools, that means using avr-gcc. The GNU toolchain for AVR isn’t too shabby all things considered, but macros in the assembler (avr-as) are a weak spot. The assembler has a basic .macro directive of its own, but the GNU build process mostly relies on the C Preprocessor (CPP) for this sort of thing, and it’s fine to a point. It works great for defining constants, including other files, that sort of thing. However, CPP is not designed to respect whitespace, since it’s intended for a language (C) which doesn’t care about it. In particular, a CPP macro always expands onto a single line. Assemblers generally do care about whitespace and line breaks, so we need a macro system that will preserve them and do whatever else is needed for the output to be assemble-able by avr-as.
So, let’s start with some sample code to illustrate the problem I’m trying to solve. One of the things I do a lot of in the GPU code is enable and disable access to the VRAM data bus. This involves flipping a few bits in registers, and needs to be done carefully. However, it’s also the same process everywhere, so it’s a perfect candidate for a macro. Here’s a GPU function to plot a red pixel:
Those three oddball lines of code (EnableVRAMWrite, etc.) are macros that expand into several lines of assembly which set all the register bits and control states needed for the GPU to do the task in question. This makes the code much more readable, and greatly reduces the chance of error. Furthermore, if I change the control signals or something else that affects this process, I can just update the macro instead of having to find all the places the GPU touches VRAM. It’s like getting the maintenance benefits of a high-level language function call, although every use pastes in a full copy of the macro body, so the code can get bloated if you’re not careful.
Clearly, the code above would not assemble as-is under avr-gcc or avr-as, so how am I doing this?
The secret is to step up from CPP to GNU’s “m4”, which is the 800-pound gorilla of code preprocessors. If m4 can’t do the preprocessing you need, then it can’t be done. It can easily be configured to preserve whitespace during macro substitution, and has many crazy features (loops, anyone?) that CPP can only dream of. Even better, if you have a un*x system of any flavor, chances are you have m4 already. All I needed to do was set up a makefile that would incorporate m4 into the build, so that it processes all the files before the GCC assembler sees them.
A quick caveat: I am far from an expert on Make, which is a very sophisticated tool in its own right. So forgive me if there are better ways to implement what I’m doing here. However, this works for me.
Here’s a sample makefile for my process:
As you can see (or not, depending on your ability to read makefiles 🙂 ), I’ve introduced a new intermediate file format called .m4S, which is a .S assembly file that has been run through m4. The m4S files are then fed to GCC for assembly. Conveniently, GCC still runs the files through CPP first, so I can still use #defines and other convenient CPP constructs.
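For anyone in the “or not” camp, the same two-step pipeline can be written out as plain shell commands. This is only a minimal sketch, not the makefile itself: the MCU name, the macro file name, and the helper function names are all placeholders I’ve made up for illustration. One detail that is an assumption on my part: gcc doesn’t recognize the .m4S suffix, so the sketch passes -x assembler-with-cpp to tell it to run CPP and then the assembler.

```shell
#!/bin/sh
# Sketch of the two build steps as plain commands.
# MCU and MACROS are placeholders; substitute your own part and file names.
MCU=${MCU:-atmega328p}
MACROS=${MACROS:-macros.m4}

# Map a source name (foo.S) to its intermediate (foo.m4S) and object (foo.o).
m4s_name() { printf '%s\n' "${1%.S}.m4S"; }
obj_name() { printf '%s\n' "${1%.S}.o"; }

# Step 1: expand the m4 macros, producing the .m4S intermediate file.
expand() { m4 "$MACROS" "$1" > "$(m4s_name "$1")"; }

# Step 2: assemble the intermediate. -x assembler-with-cpp forces gcc to
# treat the unfamiliar .m4S suffix as assembly that still needs CPP.
assemble() {
    avr-gcc -mmcu="$MCU" -x assembler-with-cpp \
        -c "$(m4s_name "$1")" -o "$(obj_name "$1")"
}

# Both steps for one file, e.g.:  build_one gpu.S   (gpu.m4S, then gpu.o)
build_one() { expand "$1" && assemble "$1"; }
```

A makefile expresses the same thing as two pattern rules (.S to .m4S, then .m4S to .o), with the bonus that Make only reruns the steps whose inputs have changed.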
Here’s the m4 file that is parsed to create the macros shown in the sample code above:
m4 code can be a bit cryptic, but if you’ve used any shell scripting languages, or something like YACC or Bison, it will feel somewhat familiar. I had never used it before, but got started by skimming the well-written manual.
With everything set up as you see here, I can use CPP constructs (such as #defining “accum” as register r16 in the code sample above), and mix in m4 macros transparently. I’m very happy with this setup, and it has greatly accelerated my GPU code development. I can build from the command line, using these commands:
I can also build, install, and run from within Xcode (my preferred development environment), which supports makefiles rather well.
When I got started with AVR development, I had a hard time finding information on setting up a good build pipeline using cross-platform tools. In particular, good makefiles for avr-gcc are scarce online. I hope this information helps you out a bit!
Stay tuned for more exciting action from Veronica!