Retrochallenge 2019/10 #3

Booting FLEX


The FLEX operating system uses track 0 of each disk for its own housekeeping. Sectors 1 and 2 contain a boot program. Sector 3 holds the System Information Record (SIR), which tells FLEX the geometry of the disk, where the first empty sector is located and the number of empty sectors, among other things. The directory information is stored from sector 5 onward.

The boot process consists of three stages. First, the boot ROM loads sectors 1 and 2 of the disk into memory. These sectors contain a secondary boot loader. Second, the secondary boot loader loads the operating system, contained in the FLEX.SYS file. Third, the secondary boot loader jumps to the FLEX entry point at address $CD00.

The boot loader in ROM can be rather simple. All it has to do is load two raw sectors (512 bytes) into memory and jump to the correct entry point. There is some freedom here; the exact load address is not critical as long as the code doesn’t use the area where FLEX will be loaded later. In effect, memory located below $C000 is safe to use.

The purpose of the secondary boot loader is to load the FLEX.SYS file. This file is formatted like any other FLEX binary; it consists of records which hold a load address and a record size followed by up to 252 payload bytes.

Unlike the ROM boot loader, the secondary loader must adhere to the FLEX disk layout, as FLEX.SYS can be fragmented on the disk. To simplify things a little, the boot sectors hold a pointer to the starting track and sector where FLEX.SYS resides. The user can set this pointer using the LINK utility.

The track and sector of the FLEX.SYS file are written to track 0, sector 1 (the first sector on the disk), at offsets 5 and 6 respectively. This way, the secondary boot loader does not have to traverse the directory list, keeping the code small.
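To give an idea of how little work this leaves for the secondary boot loader, here is a rough Python sketch of the same steps. It assumes the usual FLEX sector layout as I understand it: bytes 0 and 1 of every file sector are the forward link (track, sector) to the next sector, bytes 2 and 3 are a record number, the remaining 252 bytes are data, and a link of 0/0 ends the chain. The `read_sector` callback and the function names are hypothetical, made up for this illustration.

```python
from typing import Callable, Iterator, Tuple

SECTOR_SIZE = 256
LINK_TRACK_OFFSET = 5    # offset in track 0, sector 1, written by LINK
LINK_SECTOR_OFFSET = 6


def flex_sys_start(boot_sector: bytes) -> Tuple[int, int]:
    """Return (track, sector) of the first FLEX.SYS sector."""
    return boot_sector[LINK_TRACK_OFFSET], boot_sector[LINK_SECTOR_OFFSET]


def file_data(read_sector: Callable[[int, int], bytes],
              track: int, sector: int) -> Iterator[bytes]:
    """Yield the 252 data bytes of each sector in the chain.

    Assumption: a forward link of track 0, sector 0 marks the last sector.
    """
    while (track, sector) != (0, 0):
        raw = read_sector(track, sector)
        yield raw[4:SECTOR_SIZE]          # skip link and record number
        track, sector = raw[0], raw[1]    # follow the forward link
```

Because every sector names its successor, the loader never needs the directory; it only needs the starting point that LINK stored in the boot sector.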

While loading FLEX.SYS, the secondary boot loader keeps an eye out for a transfer address record, which contains the jump address. Once FLEX.SYS is loaded, the boot loader jumps to this address and the operating system takes over.
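The record handling can be sketched in a few lines of Python. The marker bytes are my assumption based on the FLEX binary format as I know it ($02 starts a data record, $16 starts a transfer address record, anything else is skipped as padding); the post itself does not spell them out, and the function name is made up for this example.

```python
from typing import Dict, Iterable, Optional, Tuple

DATA_RECORD = 0x02       # assumed: then address (2 bytes), count (1 byte), data
TRANSFER_RECORD = 0x16   # assumed: then the transfer address (2 bytes)


def load_flex_binary(stream: Iterable[int]) -> Tuple[Dict[int, int], Optional[int]]:
    """Load a FLEX binary byte stream into a sparse memory map.

    Returns (memory, transfer_address); the address is None when the
    stream contains no transfer address record.
    """
    memory: Dict[int, int] = {}
    transfer: Optional[int] = None
    it = iter(stream)
    for marker in it:
        if marker == DATA_RECORD:
            address = (next(it) << 8) | next(it)   # big-endian, like the 6809
            count = next(it)
            for offset in range(count):
                memory[(address + offset) & 0xFFFF] = next(it)
        elif marker == TRANSFER_RECORD:
            transfer = (next(it) << 8) | next(it)
        # any other byte (for example zero padding) is ignored
    return memory, transfer
```

The stream fed to this function would be the concatenated sector payloads of FLEX.SYS. A real loader would jump to the transfer address at the end instead of returning it.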

The new boot ROM and secondary boot loader were tested in the simulator:

Booting FLEX

The code for the secondary boot loader can be found here.

Retrochallenge 2019/10 #2

Bootloader ROM

I’m glad to report the development of the new bootloader is coming along nicely. At this moment, it can view memory using the ‘M’ command, it can run a program using the ‘R’ command and it will accept Motorola s-record dumps via the ‘S’ command. This is a bit more fancy than the old one, which just prompted for an s-record dump.

The real benefit of the new bootloader is that it has an address table with ROM routine addresses, located at a fixed address. The entries of the table can change between different bootloader versions without affecting the operation of the user programs.

; ==================================================    
; ==================================================    

    ORG $FFA0
    .dw CMDINT    
    .dw INCHAR
    .dw INCHARE
    .dw INCHECK
    .dw OUTCHAR
    .dw PRINTSTRING
    .dw WRITEHEX

The address table is located at $FFA0 and contains the addresses of the following routines:

  • CMDINT: the command interpreter of the bootloader.
  • INCHAR: read character from UART and put in A register.
  • INCHARE: read character from UART with echo and put in A register.
  • INCHECK: set zero flag to 1 if there are characters to be read from UART.
  • OUTCHAR: write character in A register to the UART.
  • PRINTSTRING: write a 0-terminated string, address in X reg, to the UART.
  • WRITEHEX: write the value of A register as two-digit HEX to UART.
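
As an illustration of why a fixed table helps, the following Python sketch computes where each entry lives and looks up a routine address in a memory image. It assumes the table follows the order of the list above with two bytes per entry, stored high byte first as on the 6809; the helper names are made up for this example.

```python
TABLE_BASE = 0xFFA0
ENTRY_SIZE = 2  # each table entry is a 16-bit address

# Assumed order: the order of the routine list in this post.
ROUTINES = ["CMDINT", "INCHAR", "INCHARE", "INCHECK",
            "OUTCHAR", "PRINTSTRING", "WRITEHEX"]


def slot_address(name: str) -> int:
    """Address of the table entry for the given routine."""
    return TABLE_BASE + ENTRY_SIZE * ROUTINES.index(name)


def routine_address(memory: bytes, name: str) -> int:
    """Read the routine's current address from a 64 KiB memory image."""
    slot = slot_address(name)
    return (memory[slot] << 8) | memory[slot + 1]  # big-endian
```

On the 6809 itself a user program would call through the table with an indirect jump, for example JSR [$FFA8] for OUTCHAR, so the program keeps working even when the routine moves in a later ROM version.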

In the future, other ROM routines will be added.

Expansion port

During the 2017/04 Retrochallenge, I never got around to testing the I/O expansion port of the HD6309 computer. So I wrote a simple program to read or write to the expansion port.

A quick test with the oscilloscope revealed that, as expected, the I/O port did not work. There was a mistake in the output enable expression for the 74HCT541 bus driver, so the data never actually arrived at the expansion port pins.

The fix was simple and quick: I installed Atmel WinCUPL, the software that turns the PLD file into a bitstream file my TL866 GAL/EPROM programmer understands, and verified in simulation that my fix was correct:


And indeed, the EXP_OE_N line goes low when the I/O expansion port is addressed.

After updating the I/O decoder GAL and checking with the oscilloscope I could see data arriving at the pins of the expansion port. One less problem to worry about!

An operating system

The updates to the bootloader ROM were aimed at getting more software to run on the HD6309. Loading software over the UART in s-record format, however, is not that convenient, even by the standards of the time. The HD6309 really needs mass storage and an operating system.

There seem to have been two major operating systems for the 6809, namely OS-9 and FLEX. Of the two, OS-9 is the more complex, offering multi-user and multi-tasking support as standard. For this to work, the CPU needs to be interrupted several times a second via timer hardware to switch tasks and do housekeeping. The HD6309 computer does not have such a timer, so porting OS-9 to it is not feasible without additional hardware. So that leaves us with FLEX!

Finding the source code for FLEX turned out to be a lot more challenging than I had expected. After a couple of nights of scouring the web and turning up only binaries, I finally found the source code to a 6809 version of FLEX. This will be the basis for the HD6309’s operating system.

There are lots of documents on FLEX online. I found a user manual, a programmer’s manual and even a porting manual. Things are really well documented, from the memory layout to the floppy disk format. This makes life a lot easier.

Next time…

That’s it for now. Next time, more on the porting of FLEX and mass storage solutions!



Retrochallenge 2019/10 #1

I’ve entered the 2019/10 Retrochallenge. Two years ago I built an 8-bit computer based on the HD6309 processor from Hitachi.


Now it needs some software. The onboard 4k ROM contains a bootloader which, at the moment, only accepts Motorola SREC dumps so one can load software via the UART.


It could really use a classic monitor and elaborate I/O routines to support an operating system. So, for the first part of the Retrochallenge, I’m going to develop a new boot ROM.

I’m developing the boot ROM using Visual Studio Code and LWTOOLS. Rather than burning EEPROMs and testing it on the real system, I coded a simple simulator based on Ray Bellis’ 6809 simulation library. You can find it here.

For real-time updates, follow me on Twitter -> @trc_wm.

Retrochallenge 2017/10 #7: Final report

At the beginning of the Retrochallenge, I had the following goals:

  • Make a Verilog implementation of the SP0256-AL2.
  • Synthesize and test the design on a Terasic DE0 FPGA board.
  • Cut corners.
  • Learn Verilog.
  • Have fun!

I think I succeeded on all five points.

The platform-agnostic Verilog code for the Speech256 is available on GitHub. I also have a Quartus II project available to run a demonstration on the Terasic DE0 board (no, not Digilent, as I mentioned many times before... derp).

I cut quite a few corners by not implementing the compressed ROM format of the SP0256-AL2, but using my own encoding and controller.

I learned Verilog in the process, although Clifford Wolf did have a few pointers on my coding style regarding non-blocking assignments…

And finally, I synthesized and tested the design on the DE0 board:

… and I had fun doing this; I hope you liked it too!

A big shout-out to John W. Linville for running the Retrochallenge 2017/10.

Until next time — Retrocompute!

Retrochallenge 2017/10 #6


Over the last two days, I spent several hours debugging the filter engine. The filter didn’t want to behave: the output values were going all over the place.

The debugging process involved setting up the filter to the ‘EH’ allophone and going through the changing filter states, one by one, to find the differences between a known/good model, which I had made in MATLAB, and the Verilog code. Using this method, I finally tracked this down to a signed/unsigned problem in the serial/parallel multiplier.

Will it float or will it sink?

The source-filter model, controller and top-level Speech256 block were mostly complete. To test the final design, all I needed to do was feed a string of allophones to the Speech256 top-level block and capture the simulated output to the DAC.

I set up the following allophones: 0x1B, 0x07, 0x2D, 0x35, 0x03, 0x2E, 0x1E, 0x33, 0x2D, 0x15, 0x03.

This is the result:


So YES! it floats!


Next: actually get it synthesized and running on the DE0 FPGA board…

Retrochallenge 2017/10 #5

I’ve been working on the controller part of the Speech256. The current status is that it is mostly working: it sends pitch, duration, voiced/unvoiced and filter coefficient control signals to the source-filter model.


Coefficient translation

To save command ROM space, the designers of the SP0256 command ROM chose to encode the filter coefficients in an 8-bit sign-magnitude format instead of the 10-bit format needed by the filter engine. To be clear, the Speech256 project does not use a copy of the SP0256 ROM; it uses a custom format. However, the same bit-depth reduction technique is used to keep the control ROM small.

The SP0256 contains an expansion circuit that translates the 8-bit format into the 10-bit format. The SP0256 seems to use the same translation table as the SP0250, which is documented in its Applications Manual:


The table pertains to the (7-bit) magnitude part of the (8-bit) sign-magnitude only: positive and negative numbers are translated by the same table, keeping the sign untouched.

Although the manual claims it uses a lookup ROM, the content looks like a piece-wise linear curve which can therefore be implemented without a ROM. I found four lines (C1 .. C4) that represent the table quite accurately:

The following code snippet will translate a 7-bit magnitude into a 9-bit magnitude according to the table …

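/* Expand a 7-bit magnitude (0..127) into a 9-bit magnitude (0..511). */
int expand(int x)
{
    if (x < 38)
        return x * 8;
    if (x < 69)
        return 301 + (x - 38) * 4;
    if (x < 97)
        return 425 + (x - 69) * 2;
    return 481 + (x - 97);    /* 97 <= x < 128 */
}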

… except for x=2, where the result should be 12 instead of 16. Note that the C1 line also has a slight offset error but I think this translation function is most likely good enough.
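
For anyone who wants to play with the curve, here is a small Python sketch of the same translation, extended to the full 8-bit sign-magnitude code. It assumes the sign lives in bit 7 of the code, which the post does not state explicitly, and like the snippet above it ignores the x=2 exception. The multiplications are written as shifts to show how cheap the mapping would be in logic; the function names are made up.

```python
def expand_magnitude(x: int) -> int:
    """Translate a 7-bit magnitude (0..127) into a 9-bit magnitude (0..511)."""
    if not 0 <= x <= 127:
        raise ValueError("magnitude must fit in 7 bits")
    if x < 38:
        return x << 3                   # C1: slope 8
    if x < 69:
        return 301 + ((x - 38) << 2)    # C2: slope 4
    if x < 97:
        return 425 + ((x - 69) << 1)    # C3: slope 2
    return 481 + (x - 97)               # C4: slope 1


def expand_coefficient(code: int) -> int:
    """Translate an 8-bit sign-magnitude code into a signed integer.

    Assumption: bit 7 is the sign, bits 6..0 are the magnitude.
    """
    magnitude = expand_magnitude(code & 0x7F)
    return -magnitude if code & 0x80 else magnitude
```

The four segments meet without overlapping, so the result rises strictly from 0 to 511 and uses the full 9-bit range.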

The convenient power-of-two slopes make me suspect the function was meant to be implemented directly in logic, not in a ROM.

Time is running out, so I’d better get on implementing the above!


Retrochallenge 2017/10 #4

Another small update of the Retrochallenge 2017/10 Speech256 project. The source part is finished! It can produce both the pulse wave and noise:



Here’s the audio for the brave:

Noise generator

In case you were wondering, the linear feedback shift register used for the noise generator is a single Verilog line:

lfsr = {lfsr[15:0], lfsr[16] ^ lfsr[2]}
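
The same shift register is easy to model in software. The Python sketch below mirrors the Verilog line bit for bit: a 17-bit register shifted left by one, with the XOR of bits 16 and 2 fed into bit 0. The helper names are made up, and how the real design seeds the register or which bit it uses as the noise output is not shown here.

```python
WIDTH = 17
MASK = (1 << WIDTH) - 1


def lfsr_step(state: int) -> int:
    """One clock of lfsr = {lfsr[15:0], lfsr[16] ^ lfsr[2]}."""
    feedback = ((state >> 16) ^ (state >> 2)) & 1
    return ((state << 1) & MASK) | feedback


def period(seed: int) -> int:
    """Count the steps until the register returns to its seed value."""
    state = lfsr_step(seed)
    steps = 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps
```

Note that an all-zero register stays at zero forever, so the hardware needs a non-zero reset value. For a non-zero seed, `period` should report 131071 (2^17 - 1), which means the register visits every non-zero state before repeating.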

The updated Verilog source code is up on the GitHub page!


Up next will be integrating the filter engine and the source to complete the source-filter model of the synthesizer.

Retrochallenge 2017/10 #3

During the past few days I’ve managed to squeeze some time out of my schedule to make progress on the Verilog code for the Speech256 synthesizer.

Basically, I think I have the 12-pole filter working. This is the most complicated part of the project, with three shift registers holding the filter coefficients and the filter states:


The coefficient shift register is used as a roundabout, which avoids the use of an actual addressable structure, like a RAM. It also simplifies loading a new set of coefficients into the filter engine, as these can simply be served in a serial fashion, reducing the interface complexity to the Speech256 controller.

The multiplier shown is implemented using a shift-and-add technique, which has a reduced speed but a lower resource requirement compared to a fully parallel implementation.
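
For readers unfamiliar with the technique, here is a generic Python sketch of a shift-and-add multiplier that handles one multiplier bit per step, much like a serial hardware multiplier handles one bit per clock. It is not a transcription of the Speech256 Verilog; the 10-bit default width and the two's-complement handling of the multiplier are assumptions chosen for illustration.

```python
def shift_add_multiply(a: int, b: int, bits: int = 10) -> int:
    """Multiply a by b, consuming one bit of b per iteration.

    b is treated as a `bits`-wide two's-complement number.
    """
    if not -(1 << (bits - 1)) <= b < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")
    multiplier = b & ((1 << bits) - 1)   # raw two's-complement bit pattern
    accumulator = 0
    for i in range(bits):
        if (multiplier >> i) & 1:
            partial = a << i             # shifted copy of the multiplicand
            # The top bit of a two's-complement number has negative weight.
            accumulator += -partial if i == bits - 1 else partial
    return accumulator
```

Treating the top bit like any other bit is exactly the kind of signed/unsigned mix-up that gives wildly wrong products for negative inputs, which is why the sign handling is called out in the loop.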

The 12-pole filter control logic (not shown) was a bit tricky due to the serial nature of many of the components; see the meaningless screenshot here:


The most recent code is up on my Speech256 GitHub page.

Retrochallenge 2017/10 #2

Here is the first real RC2017 update in the form of a video on the inner workings of the SP0256-AL2 speech synthesis chip.

It took a lot longer to produce than I originally thought, partly because I had a severe cold and sounded like a wet newspaper, and partly because open-source video recording and editing is still a P.I.T.A, apparently.

Retrochallenge 2017/10 #1


For this season’s Retro Challenge I’m going to make an implementation of the SP0256-AL2 speech synthesis chip in a Field Programmable Gate Array (FPGA).


What is an SP0256-AL2?

In the late ’70s and early ’80s, many companies, such as Texas Instruments, Votrax and General Instrument, produced speech synthesis chips. These chips ended up in various toys (Speak & Spell) and expansion products for home computers. Back then it was hot tech!

The SP0256-AL2 is a speech synthesis chip that was made by General Instrument. It has a built-in ROM containing 59 allophones, which are small snippets of speech. By concatenating allophones, words can be formed. It has a relatively simple 8-bit parallel interface and a built-in digital-to-analogue converter.

The SP0256-AL2 was used in a number of products, such as the Tandy Speech/Sound cartridge, the Speech 64 module for the Commodore 64 and the Currah uSpeech interface for the ZX Spectrum:


So why make an (FPGA) implementation of the SP0256-AL2?

  • I find the obsolete speech technology interesting.
  • SP0256-AL2 NOS ICs are expensive.
  • Stocks are going to run out in the future.
  • Some being sold are fake.
  • If I succeed, FPGA-based retro computer remakes can include this speech core.
  • I like a challenge.


The goals for this Retro Challenge I set for myself are the following:

  • Produce a working equivalent of the SP0256-AL2 on a Terasic DE0 board.
  • Use Verilog as the hardware description language (I mostly know VHDL).
  • Make the Verilog and supporting code available on GitHub.
  • Avoid fully parallel multipliers in the implementation so it will fit into small FPGAs.
  • Cut corners wherever possible.
  • Blog about the process.