Modular Monitor: Dynamic Dispatch and Code Cleanup

Jumping Through Tables For Your Petty Amusement

One of the beautiful things about modern computers is that they just work. You can mix and match hardware or software. Any reasonably modern OS will do the hard work of ensuring everything works well together- or at least keep the system from crashing too hard. Wouldn’t it be great to have some of that functionality on a tiny, bare-bones development system?

Modular Monitor was explicitly intended to be expandable- but I also intended this to be dynamic. The idea is you could add new hardware to the system, which would register its own handler module automatically. Software can do similar things, so you could upload a new routine that would be instantly available.

My first crude version of Modular Monitor can’t do this- but I made sure to leave enough structure to add the functionality on some later day. While I’m making significant adjustments to Modular Monitor’s internals, I might as well do some general cleanup. Various parts of Modular Monitor were hacked together pretty quickly- and it shows.

Overall the new dispatch system makes Modular Monitor much easier to use. I already have a few extra modules ready to go- all I have to do is put a new pointer in the jump table!

Dynamic Dispatch

Part of the challenge is that 6502s have a very short, stunted set of addressing modes. As is sadly the norm for CISC CPUs, you get a bunch of addressing modes, none of which are actually useful for your program. Conspicuously absent are good relative and indirect modes. That makes run-time adjustments considerably harder.

Today we’ll be exploring how to implement indirect jumps, which allow program flow to be rewritten at runtime without resorting to self-modifying code. This covers the majority of complex instruction addressing you might want (relative addressing still being conspicuously absent).

Do note that the indexed indirect jump instruction was added to the 65C02; it isn’t in the NMOS 6502s. It’s possible to implement a similar routine using the stack and RTS. I won’t discuss that here, since I have a follow-up project that uses that technique. If you want to know more about the 65C02’s unique quirks, look over here.

Jump Tables

Assembly programmers know well the concept of a jump table. This is exactly what it sounds like- a table of jump addresses. In 6502 assembly, it might look something like this:

;Jump target in X
LDX #JMP_SELECT
JMP (TABLE,X)

;Up to 128 addresses
TABLE:
.addr HERE
.addr THERE
.addr NOWHERE
.addr EVERYWHERE

You select the jump target by putting the associated index into X. Because each table entry is two bytes wide, the index has to be even- the least significant bit must be zero. The 6502 won’t scale the index for you, so you have to do that manually. With an 8 bit index, up to 128 targets are possible without resorting to tricks. I feel that is more than sufficient for most applications.

Jump tables are already super useful. They can be used to implement state machines, n-way branches, and shared libraries. C switch statements usually turn into jump tables when compiled.

Things get more interesting when the jump table itself is writable. We can add, remove, or modify the entries at runtime. It would be possible to put basic I/O drivers in ROM, but override them if a new device is plugged in. Older routines can be bypassed, patched, or simply dummied out, all without any ROM burning. Considering MM is supposed to be a development aid, this allows me to test things without touching the ROM. Very handy when you consider EEPROMs have limited write endurance.

New Command Structure

In order to use a jump table, I have to restructure Modular Monitor’s command interpreter. This is a chance to change everything about how MM’s commands are laid out.

After much hemming, hawing, hamming, and yes even some hewing, I chose to keep the one letter ID. Full strings and two-letter IDs were considered, but they oversolve the problem. Modular Monitor is more of a development tool than an operating system. Better to keep things small and simple.

Command addresses are stored in a table. Incoming ASCII characters are stripped down to a numeric index. An indexed indirect jump loads the appropriate handling routine from the table.

This manages to remain very simple but gives me all the features I wanted. In particular, command aliasing is possible- two entries can point to the same jump target. Admittedly, this isn’t as useful as it would be if I chose to handle full strings. Still, it simplifies modifying the command table. Remember, this table is supposed to be editable at runtime- I won’t have an assembler to watch over the process.

Modular Monitor Code Cleanup

I left Modular Monitor in a pretty sorry state. Quite a few things were hacked together in a rush. Lots of redundant code, a few serious bugs, that sort of thing. In other words, perfectly ordinary software.

I won’t be cleaning up everything, but there are a few bits and pieces I’d like to fix up now so they don’t cause problems later.

HEXTOBIN Cleanup

I was not happy with my implementation of HEXTOBIN. I know there’s a way to eliminate some excess code. BINTOHEX used a cute jump-forward trick to eliminate common code; why can’t HEXTOBIN?

After some research into 6502 code I have finally seen the little bit of logic I missed.

;Modified to use common subroutine
HEX2BIN: ;Low digit (ASCII) in A, high digit (ASCII) in X
	PHA ;Backup A
	TXA ;A=X
	JSR HEXTONIB  ;Send hex in A, return partial binary in A
	ASL	      ;ASL shifts left, pads with zeroes
	ASL
	ASL
	ASL
	STA TMP1      ;Store partial conversion
	PLA
	JSR HEXTONIB
	ORA TMP1      ;Combine partial results
	RTS
	
HEXTONIB:	
	CMP #'A'
	BCC @NUM      ;Less than 'A'?
	SBC #7        ;Diff between 'A' and ':'
@NUM:
	AND #$0F
	RTS

It really was as simple as that!
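As a quick sanity check, here is a sketch of calling the routine on the two characters "3F". It assumes the routine above is assembled as-is and that TMP1 is a free scratch byte. The trace in the comments is a hand walk-through, not captured monitor output.

```asm
;Convert the ASCII pair "3F" to a single byte
	LDX #'3'      ;High digit goes in X
	LDA #'F'      ;Low digit goes in A
	JSR HEX2BIN
;Inside HEX2BIN:
;  '3' = $33, below 'A', so AND #$0F gives $03, shifted up to $30
;  'F' = $46, minus 7 is $3F, AND #$0F gives $0F
;  $30 ORA $0F = $3F, returned in A
```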

16 Bit Increment Bugfix

A curious bug in Modular Monitor 1.0 was that if WRITE had an entry with a trailing hex value, it would end prematurely. Obviously HEXTOBIN was incorrectly passing a flag that caused early termination.

If only it was that simple.

After a few hours of poking around with MM’s own commands, the assembly equivalent of printf() debugging, and rewriting modules to use unique ZP locations, I concluded the following:

All entries were in fact being duly scanned
All entries were correctly converted
When a hex value was converted, both halves of the pointer were always incremented
When a decimal value was converted, only the lower half of the pointer was incremented
READ also had issues with values that required the upper half of the pointer to be adjusted

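Turns out INC does not touch the carry flag; it only updates N and Z. No idea why, but it means that the BCC instructions were using whatever junk was left over from the last operation that did set carry. Switching to BNE fixes the issue, since the low byte rolling over to zero is exactly when the high byte needs a bump.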

;Correct way to increment 16b value
;BCC doesn't work, because INC does not affect C (only N and Z)
;BNE skips the high byte unless the low byte rolled over to zero
	INC PTR
	BNE @NOINC
	INC PTR+1
@NOINC:

The key debug idea was checking the ZP pointers. This required giving WRITE a unique pointer, so I could use READ to check it. I don’t know how long it would have taken me to find the bug if I didn’t try looking at the pointers.

Lesson learned, and now I have a parameterized INC16 macro to do this for me in the future.
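A sketch of what such a macro could look like in ca65 syntax follows. The parameter name addr and the label SKIP are placeholders chosen for illustration; .local keeps the label private to each expansion, so the macro can be used more than once in the same scope.

```asm
;Increment the 16 bit value stored at addr (low byte first)
.macro INC16 addr
	.local SKIP
	INC addr      ;Bump the low byte
	BNE SKIP      ;No rollover to zero, high byte stays put
	INC addr+1    ;Low byte wrapped, so bump the high byte
SKIP:
.endmacro

;Usage, assuming PTR is a two byte pointer
	INC16 PTR
```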

New Features

Getting Modular Monitor to work with the new dispatch mechanism requires some significant modifications. The actual jump table turns out to be the simplest overall part!

With most of the code executing out of ROM, the jump table must be moved to RAM before it can be modified. This introduces a new requirement for copying large blocks of memory. Then the table itself is set up, for the new dispatch system to use.

COPY

COPY does exactly what you expect it to: copy one set of memory locations to another.

COPY takes three parameters: source pointer, destination pointer, and byte count. For practical purposes, COPY only works with runs of up to 256 bytes, since the index and count are 8 bits wide (a count of zero copies a full 256). Expanding it to a full 16 bits is a potential future project.

;Copy COUNT bytes from SRC to DST
;A COUNT of 0 copies 256 bytes
COPY:
	LDY #00
@LOOP:
	LDA (SRC), Y
	STA (DST), Y
	INY           ;Count the byte just copied
	CPY COUNT
	BNE @LOOP     ;Stop once exactly COUNT bytes have moved
	RTS

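COPY can be used not only to do 1:1 transfers, but also to “decompress” repeating patterns into RAM. Let’s say we want to fill RAM with a repeating 16 bit pattern. Because COPY works from low addresses to high, we can seed the pattern once and point DST just past it, so the copy keeps re-reading bytes it has just written: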

;Copy a pattern of bytes using the COPY routine
;Even addresses will have $12, while odd addresses will have $34
$1003  xx
$1002  xx  <- DST points here
$1001  $34
$1000  $12  <- SRC points here

It’s occasionally handy. On the 65816 there are block move instructions (MVP, MVN) which can be used in a similar way.
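Spelled out as code, the pattern fill might look like the sketch below. The addresses come from the diagram above; the count of 64 is an arbitrary value picked for illustration.

```asm
;Seed the first two bytes of the pattern
	LDA #$12
	STA $1000
	LDA #$34
	STA $1001

;SRC = $1000, DST = $1002 (one pattern length further on)
	LDA #<$1000
	STA SRC
	LDA #>$1000
	STA SRC+1
	LDA #<$1002
	STA DST
	LDA #>$1002
	STA DST+1

	LDA #64       ;Arbitrary length for this example
	STA COUNT
	JSR COPY      ;Memory from $1002 onward now repeats $12, $34
```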

It occurs to me that it would be slightly more efficient to pass the count in Y. Then we could count Y down to zero, copying from high to low. That eliminates the count variable, LDY #00, and CPY COUNT. While not a particularly useful change to implement now, it’s worth considering for later.
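A rough sketch of that variant follows. COPYDN is a made-up name and the routine is untested on hardware. It still needs a compare against zero, because LDA overwrites the flags that DEY set. One catch: since it runs from high addresses to low, the overlapping pattern fill trick no longer works with it.

```asm
;Copy Y bytes from SRC to DST, highest offset first
;Y = byte count on entry (0 copies nothing), Y = 0 on exit
COPYDN:
	CPY #0
	BEQ @EXIT     ;Nothing to do
@LOOP:
	DEY           ;Step down to the next offset
	LDA (SRC), Y
	STA (DST), Y
	CPY #0        ;LDA changed the flags, so test Y again
	BNE @LOOP
@EXIT:
	RTS
```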

Jump Table

The good news is that there is nothing special about the jump table itself. All you need to do is assemble using the symbolic labels for table entries, which ensures they remain consistent between revisions. I don’t think the 6502 cares about alignment here, but I’m keeping things on even addresses.

.RODATA
;Jump table of all commands
;28 possible commands, starts at ?, ends at Z
COMJMP:	.addr BADCOM ;?
	.addr BADCOM ;@
	.addr BADCOM ;A
	.addr BADCOM ;B
	.addr BADCOM ;C
	.addr BADCOM ;D
	.addr BADCOM ;E
	.addr BADCOM ;F
	.addr BADCOM ;G
	.addr BADCOM ;H
	.addr BADCOM ;I
	.addr BADCOM ;J
	.addr BADCOM ;K
	.addr BADCOM ;L
	.addr BADCOM ;M
	.addr BADCOM ;N
	.addr BADCOM ;O
	.addr BADCOM ;P
	.addr BADCOM ;Q
	.addr READ   ;R
	.addr BADCOM ;S
	.addr BADCOM ;T
	.addr CALL   ;U- User called routine
	.addr BADCOM ;V
	.addr WRITE  ;W
	.addr EXEC   ;X
	.addr BADCOM ;Y
	.addr INIT   ;Z
	.addr BADCOM ;Default for all other results
	.addr $0000  ;NULL terminator to guard table

Where the default jump table is stored is mostly irrelevant. As long as the assembler knows where it is, that’s good enough. I’m deferring to the cc65 linker to figure that stuff out. I chose to load it into RAM starting at $0200, which is a fixed, easy to remember location.

Note the NULL terminator. In the event I decide to add code to insert new modules at run time, this serves as a handy “end of table” marker. $0000 will never be a valid instruction pointer.

Modular Monitor Updates

Now that we have all the pieces together, we can copy the ROM based table into RAM. This only has to be done once, at the very beginning. Because we need to know the exact location of the RAM table, I chose the fixed address of $0200. It’s sufficiently out of the way for my purposes.

JMPTAB  = $0200 ;RAM based jump table

;Load SRC/DST pointers, then copy jump table to RAM
	LDA #<JMPTAB
	STA DST
	LDA #>JMPTAB
	STA DST+1
	
	LDA #<COMJMP
	STA SRC
	LDA #>COMJMP
	STA SRC+1
	
	LDA #60 ;30 table entries x 2 bytes each
	STA COUNT
	JSR COPY

With this address known, we can rewrite the dispatch routine to use an indexed indirect jump. It’s pretty easy:

	;New table based dispatch. Use ASCII value as index
	LDA INPBUFF
	SEC               ;Set carry to ensure proper subtraction
	SBC #'?'          ;Get down to proper range ('?' becomes 0)
	CMP #'Z'-'?'+1    ;Past Z (or wrapped from below ?), it's not valid!
	BCC @VALID
	LDA #'Z'-'?'+1    ;Select the default BADCOM entry
@VALID:
	ASL               ;Multiply by two to align with addresses
	TAX               ;Put A in X
	JMP (JMPTAB, X)

After a couple of sanity checks, we simply load the command offset into X then call the jump. Much simpler compared to the “chain of branches” method.
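To make the index arithmetic concrete, the sketch below lets the assembler work out a few table offsets. The symbol names are invented for illustration, and the .assert lines are there only to document the expected values.

```asm
;Offset into the table = (ASCII code - '?') * 2
OFS_QMARK = ('?' - '?') * 2     ;First entry
OFS_R     = ('R' - '?') * 2     ;READ
OFS_Z     = ('Z' - '?') * 2     ;INIT, last real command
OFS_DEF   = ('Z' - '?' + 1) * 2 ;Default BADCOM entry

.assert OFS_QMARK = $00, error, "? should be the first entry"
.assert OFS_R     = $26, error, "R should be at JMPTAB+$26"
.assert OFS_Z     = $36, error, "Z should be at JMPTAB+$36"
.assert OFS_DEF   = $38, error, "default should follow Z"
```

So with the table at $0200, typing R ends up jumping through the pointer stored at $0226 and $0227.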

Because the jump table is currently sparsely populated, most entries point to BADCOM. All this routine does is print an error string.

;Print out the invalid command error, then go back to MAIN
BADCOM:
	LDA #<NOCOM
	STA PTR1
	LDA #>NOCOM
	STA PTR1+1
	JSR SENDSTR
	BRA MAIN

NOCOM: 	.byte "Invalid command", LF, BEL, NULL

With these three things changed, Modular Monitor is now able to be hot-patched using its own editing routines.

Testing Dynamic Dispatch

Our first test is simply to check if the jump table even exists. No table, no work. Since we know where the table is supposed to be, all we have to do is read it out from $0200. It will be very obvious because it has a known set of patterns. All those BADCOM entries stick out!

Modular Monitor output, showing the command jump table has been successfully copied to RAM.

I realize this means nothing to you, but I know my ROM starts at $E000 so it’s clear these are pointers to the monitor routines. Using my list file I can figure out what address each module is loaded at.

Of course, the more logical readers will realize the table must already exist since the READ command uses it! Spare me a bit of leniency; I checked this before implementing the new dispatch routine. Once I could confirm that COPY worked, I could rewrite the dispatch and go from there.

Next, we test if the table is properly rewritable. I don’t feel much like coming up with a totally new test program, so I’ll just redirect the existing ones. I chose to make ‘@’ point to INIT, which resets the whole system.

You can see the second address in the table is now changed. When ‘@’ is entered, the system does indeed re-initialize.
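The same patch could also be made from code rather than typed in by hand. A minimal sketch, assuming the table layout shown earlier (AT_SLOT is a hypothetical name):

```asm
;Point the '@' entry of the RAM table at INIT
AT_SLOT = JMPTAB + ('@' - '?') * 2   ;Second entry, $0202

	LDA #<INIT
	STA AT_SLOT
	LDA #>INIT
	STA AT_SLOT+1
```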

Performing this editing by hand is tiresome, but it may be the only reasonable way to test out software that was written using Modular Monitor. Automatic registration is relatively simple- just pick any BADCOM pointer and replace it. Without a definitive API though, that’ll be a pain to organize. Something I’ll have to consider later on, I suppose.
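For what it's worth, a first stab at automatic registration might look like the sketch below. Nothing here exists in Modular Monitor yet: REGISTER, NEWVEC, and DEFSLOT are hypothetical names, and the routine is untested. It deliberately stops before the default entry so that slot keeps pointing at BADCOM.

```asm
DEFSLOT = ('Z' - '?' + 1) * 2   ;Offset of the default entry

;Put the address held in NEWVEC/NEWVEC+1 into the first free slot
;Returns C clear and X = slot offset, or C set if no slot was free
REGISTER:
	LDX #0
@SCAN:
	LDA JMPTAB, X
	CMP #<BADCOM
	BNE @NEXT         ;Low byte differs, slot is in use
	LDA JMPTAB+1, X
	CMP #>BADCOM
	BEQ @FOUND        ;Both bytes match BADCOM, slot is free
@NEXT:
	INX
	INX
	CPX #DEFSLOT
	BCC @SCAN         ;Keep going while X is below the default entry
	SEC               ;Ran out of slots
	RTS
@FOUND:
	LDA NEWVEC
	STA JMPTAB, X
	LDA NEWVEC+1
	STA JMPTAB+1, X
	CLC
	RTS
```

The caller can turn the returned offset back into a command letter by halving X and adding '?'.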

Finishing Up

Modular Monitor was designed to be modified so the user could add their own modules without recompiling the ROM image. While I can’t realistically self-host because I don’t have an assembler that runs on the 6502 (…yet), the new dispatch is a strong step towards that goal.

This project turned out to be nice and easy. Outside of fixing up some of MM’s existing bugs, everything went smoothly from start to finish. In a way, I’m kind of disappointed. Failure is a good teacher if you bother to listen. I guess it just means I’m getting better at 6502 programming.

A caveat with this kind of jump table is that it only covers jumps, NOT branches or calls. Branches are so short (127 bytes either way) this usually doesn’t matter- they probably wouldn’t be able to branch over the branch table! Subroutine calls, on the other hand, are not indexable. There are ways around this, which I’ll explore at a later date. One advantage of jumps over subroutine calls is that they don’t use any stack space, so the choice isn’t purely a matter of convenience.
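One common workaround, sketched here under the assumption that the handlers end in RTS (the current MM commands return with BRA MAIN instead), is to JSR to a tiny trampoline that performs the indexed jump. CALLTAB and FUNCTAB are hypothetical names.

```asm
;Trampoline: jump to entry X of FUNCTAB
;The handler's RTS returns to whoever called CALLTAB
CALLTAB:
	JMP (FUNCTAB, X)

;Usage, somewhere else in the program
	LDX #4            ;Offset of the third entry (2 bytes each)
	JSR CALLTAB
	;Execution resumes here after the handler's RTS
```

This does cost two bytes of stack per call, which is the trade-off against a plain jump.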

Modular Monitor was intended primarily as a sort of learning tool. A low stakes way to get to grips with the 6502’s very hostile assembly process. In that way it has been successful. Much like the minimal development board, I expect to abandon it in the not-so-distant future. Until then, I have a few other modules to implement.

Working on an increasingly complex program is showing me that my existing development process of one text file plus a few backups is woefully inadequate. I seem to have no real choice but to get to grips with source control, makefiles, and development environments. That should make me a Real Programmer™ with the ability to start pointless arguments about computers that nobody actually cares about.

Summer has proven to be quite draining this year. I continue to make progress on various projects, but I just don’t have much to show right now. Switching back to 6502 projects gives me some much needed motivation. Between my personal health, family drama, and the existential stress of life, it’s easy to fall into a perpetual state of “meh”.

Scribble is my “big project” for summer 2023. I still expect to get it done by the end of August. I have some 6502 stuff that’s getting close to completion though, so I’ll split my effort between the two for a while. I’m planning out a new development board that should be ready around August, so I’ll start focusing on the 6502 again in autumn.

Some Disassembly Required