Monday 1 January 2024

2023 is a wrap!

No putting it off any longer, I need the graphics ROM to be read from SDRAM.

Initial implementation today didn't go well. Getting the data into the SDRAM was trivial - the ROM LOADER module was already designed to do that after changing a single constant - but reading out has proven less successful.

The graphics ROM is accessed at 12MHz by the YGV608 - twice the dot-clock as it interleaves Planes A & B. That's just 83.3ns to get the data back from SDRAM given access latency, arbitration with the CPU ROM access, and refresh. The SDRAM itself is running 8x faster, so 8 clocks.

The bandwidth isn't really a problem, but the latency may well be. I need to sit down and actually calculate what I can theoretically expect from the SDRAM.

I can halve the number of accesses using a relatively simple read cache as every other - interleaved - read is from the same 16 bits, so that will help a little with bandwidth. But I'm thinking I might need to cache a whole line at the end of VBLANK (or better yet, during the BORDER) to avoid latency issues.

Either way, I've never implemented my own cache, so I need to do a little research first. It will probably end up as a 2-level cache; one for the 16-bit access and one for the whole line. Will be a fun learning exercise! Well, fun until I can't get it to work properly...

Happy New Year from 2024!

Sunday 31 December 2023

I can see the attraction!

"Next (and last) unimplemented mode (as defined by the MD bits) is the 2-plane 8-bit (pattern name) mode which is used by the games themselves. That shouldn't be too difficult to implement (or more correctly, fix) but it does take a while for the simulation to get that far."

Famous last words! It has taken a few re-writes of a good chunk of the logic in the pattern generator, but I'm finally seeing at least Xevious and Mappy running in attract mode! Not much to see of Galaga except a few words on the screen, but that could be palette issues.

The aforementioned mode also requires both the page table and base address registers to be implemented and included in the pattern generator calculations. Not overly complex but a number of different combinations to account for in shifting values around relative to one another. It does pay to read - and re-read - the documentation as much of the logic that I ended up re-writing was summarised quite nicely in a few tables in the YGV608 Application Manual.

Anyway, here's some eye candy from the attract mode:

Xevious original and Arrangement versions (simulation)

Mappy original and Arrangement versions (simulation)

Not every mode is fully implemented yet either; only 8x8 & 16x16 tiles, and it assumes only Page 0 is in use on each plane. I will have to look at scrolling soon too!

One issue I have is that I currently have only a fraction of the graphics ROM loaded into the real MiSTer project, as it is residing in on-chip RAM. So I can only test that the correct tiles are being displayed under simulation. And under simulation, it takes 81 mins to get to the original Xevious attract mode, and a tad over 2hrs to get to the original Mappy attract mode!

Now that I've gotten this far, it's time to get the graphics ROM loaded into SDRAM. I have the option of using the 2nd SDRAM module on my development MiSTer, but haven't decided if that's necessary just yet...

Wednesday 27 December 2023

Bigger tiles!

Almost impossible to find the time to work on Namco Classics but managed to find a few minutes here and there. After a false start I've updated the core to handle tile sizes other than 8x8 (16x16, 32x32 & 64x64) which is used on the game select screen.

16x16-pixel tile mode (MiSTer)

The dual-plane is also working as can be seen on the per-game select screen.

Next (and last) unimplemented mode (as defined by the MD bits) is the 2-plane 8-bit (pattern name) mode which is used by the games themselves. That shouldn't be too difficult to implement (or more correctly, fix) but it does take a while for the simulation to get that far.

Tuesday 26 December 2023

Xmas progress!

Not surprisingly it's been a busy run-in to Xmas with work and home life, though I have done more on Namco Classics than my lack of blog posts would suggest. In fact, I've been quite pleased with my progress given the relative lack of time to work on it.

To continue where the last post left off, I've sorted the interrupt issue. That was due to me not responding with VPAn properly - interestingly that signal was absent on the TG68K core I used many years ago for the beginnings of my Neo Geo implementation. The software seems to be running properly now as far as I can ascertain from the far-from-complete video implementation.

Next step was to get the fx68k core executing code from SDRAM. That involved studying a bunch of different cores to find an SDRAM controller that was relatively 'new' and also suited my circumstances. Turns out the core written by Sorgelig is used in several different cores, and modified as required. After a false start using the Irem92 version of the core (see below), I finally settled on the version used in the SEGA Genesis core. FTR it has 3 channels, though I'm only using 1 atm.

Before I can use the SDRAM core however, I need to be able to download to the SDRAM. For this I found the Irem92 rom_loader module, which uses some magic in the .mra file to download data to both SDRAM and BRAM memory blocks. This was perfect for Namco Classics as I am still loading a subset of the graphics ROM into BRAM, although I soon discovered that I'd need to switch the SDRAM core to that of the Genesis core. Fortunately the rom_loader didn't require any changes.

Finally I was greeted with the SELF TEST screen that I had only seen on the simulator...

68K code running from SDRAM!

I decided to look at the dipswitches next, so I could get the game into the self-test menu. Quite straightforward with the MiSTer framework, so then I started looking at the inputs to the core. This was a little more confusing as the documentation is quite terse on this topic, and took some 'reverse-engineering' of another core to deduce how that actually works. Eventually I could navigate the menu and get into the SWITCH TEST screen.

SELF TEST (inputs) screen (MiSTer)

That just about completes the interfacing with the MiSTer framework, at least until I want to get more into the advanced graphics and sound options.

So back to completing the YGV608 implementation. It doesn't take long for the game to switch into an unimplemented graphics mode. The main title screen runs in 256-colour mode, so that was next. At the end of the day, all that was required was a few more lines of Verilog in the Pattern Generator module and it was displaying correctly!

Title screen - 256 colour mode (MiSTer)

And then the very next screen in attract mode switches to another unimplemented graphics mode... this time 16x16-pixel tiles instead of 8x8-pixel tiles. This is currently WIP, but I at least now see something resembling the proper graphics:

16x16-pixel tile mode (Verilator)

So far the new graphics modes have been trivial to implement, but I'm expecting it to get more complex before too long. Scrolling and paging in particular, and that's not even thinking about rotating and scaling (which the NAMCO intro screen uses!) But soon it may be time to implement the sprite layer!

Friday 8 December 2023

Is your RAM OK?

Another very productive day!

I started out trying to get the code a bit further to display more on the screen. I was frustrated by the inefficient process of tracing execution via the waveforms in gtkwave. I had already written a crude "breakpoint" mechanism that simply halted the execution at that point, but it required re-compiling the simulation to change the address. So next task was to implement something more convenient.

After undergoing a baptism by fire with ImGui I ended up with some GUI elements allowing me to type in a breakpoint address, display the current address on the bus (PC), and the data bus values. I also added a button to go to the next address that was doing an instruction fetch (it was too complex for now to work out how to step to the next actual instruction - I was in a hurry).

CPU 'debugger' in the Verilog simulator

Now back to tracing the program and working out where it was spinning. Again, going back to my original MAME driver to look at the H8/3002 hacks, I implemented the very crude CUSKEY hack that ultimately allowed the RAM test to PASS. Then it was more RE in IDAPro trying to get it to jump to the full SELF TEST menu screen - to no avail. 1MB of code is a *lot* to search through!!!

Eventually I found it was spinning on a RAM address, waiting for the value to change. It appears that value is being changed in one of the ISR's. I had written the code to generate the VBLANK interrupt from the YGV608 and hook it up to the 68K core, but I hadn't seen it actually working. So I hooked it up again and all hell broke loose. I'm yet to confirm but it appears that as soon as interrupts are enabled in the core, it tries to service the pending VBLANK and fetches the INITIAL STACK POINTER vector and jumps there... have to look further into it as I'm no 68K hardware expert.

Changing tack for the day I decided to clean up the YGV608 Pattern Generator (tile display engine), re-writing it to support all 4 pattern plane modes and various page sizes. That required work on the register side that writes the pattern name table RAM, and the pattern generator itself. But as I knew it would, the logic shrinks down when you optimise it properly, though I have opted at this stage to tend to keep it readable. Finally I could see 'RAM OK' instead of a bunch of zeroes.

Last task was to account for the pipeline delays in the pattern generator and color bus to fix the clipping of characters. After hand-drawing a timing diagram in my 'retro project' notebook, it was obvious where I had to pipeline some counters, and only two compiles to get it right!

The first really clean display - simulation output

So a fairly clean display. I haven't accounted for the pipeline delays between the border and the pattern planes yet, hence the extra lines either side of the border. I should also note that I've only really just scraped the surface of the YGV608 functionality thus far, even where the pattern planes are concerned. There's the virtual screen made of (repeating) pages of planes, flipping, reversing, scrolling regions, transparency, mosaic displays, display priorities, name table mapping modes.... not to mention sprites and rotation! Having said that, I didn't even implement every function in the MAME driver either!

So next... I think I'll have to get interrupts working to get much further in executing the code. No point implementing YGV608 features that can't be tested!

Wednesday 6 December 2023

"ND-1 SELF TEST / VER1.00"

Continuing on from where I left off yesterday, I've implemented the color (sic) palette interface and memory, and also got the code to the point where it displays text on the screen!

It hasn't been without some head-scratching and some 'doh' moments though.

A combination of quick hacks, some unfounded assumptions, and confusion over how the YGV608 was wired to the 68K data bus resulted in some copious head-scratching. I decided that it is wired to the upper [15:8] bits on the bus, but then I observed the 'clear screen' function writing alternate '$20' and '$00' to the pattern name table. Confused by the fact that $20 was always either on the upper or lower half of the bus, and the fact that the code was alternating between byte and word access to the YGV608, I thought I'd have to decode UDSn and LDSn to get the right half... and then I eventually realised that the YGV60 was in 16-bit pattern mode and it should be writing those two bytes!

Next was getting the execution to a point where it was actually writing something to the pattern name table (displaying characters on the screen). On a cold boot it shows the self-test screen, but before it gets there it checks a few things that aren't implemented, one of which is communications from the H8/3002 via shared RAM.

Now way back when I wrote the MAME driver, I was pretty sure there was no H8/3002 core and that I had done a very crude 'black box' emulation of the MCU functionality. I just had to find it. Fortunately mamedev keep archives of all their old releases, and using a 'binary chop' I found a release (0.53) with what is likely my original driver. And sure enough, there was the hack for the missing MCU!

I traced the execution of the simulation where it was spinning, waiting on a shared memory location, and implemented the hack for that address. Running again it got further, showing a few lines of '0000' on the otherwise clear display. My guess was that the pattern generator was picking up the wrong byte in my quickly-hacked 16-bit pattern mode - and I was right. After I fixed that, I got this:

NCV1 running under simulation - text output

Nowhere near perfect but good enough to verify the CPU is executing and displaying text via the YGV608. And FTR, the '0000000' string should actually be 'RAM NG', but it's still picking up the wrong byte on that line. It took me a minute to realise why the RAM test was failing when I'd implemented all of it - because of the shared-RAM hack I implemented!

Anyway very good progress today but it's probably a good time now to step back and actually implement the pattern generator properly, to handle all modes. It was originally a quick hack for 8-bit mode (only), then re-hacked for 16-bit mode (only). To do it properly will probably take a few days at least.

BTW the missing lines on the characters are due to pipeline delays on different signals in the core which I haven't even attempted to cater for at this point because I knew I'd have to rewrite most of it anyway.

Tuesday 5 December 2023

68k executing code and writing to YGV608

Significant progress today!

@Toya from the MiSTer dev-talk channel on Discord pointed me at jotego's fork of the fx68k core that is compatible with Verilator. After (finally) discovering that Verilator will randomly create new .cpp files when you add code, I was able to build the 68k CPU into the simulation.

That also required adding the CPU ROM files to the download, with some (more) emulated ROM in the simulation. A few debug cycles later and it was downloading the CPU ROM to the emulated memory!

Next step was to see if the CPU was executing the ROM code correctly. Have to admit this took a couple of hours to get right. At first I couldn't get the CPU to start out of reset; turned out that BGACKn was tied low and the core was spinning waiting for... something to happen.

Following the trace, it seemed to fetch the RESET vector and start executing, even the subsequent branch. Then it entered a long loop where the entire ROM contents are fed into the Namco CUSKEY chip as a protection check. This is where my brain stopped working for an hour or so...

I won't go into too many details, but rather fast-forward to where I'd added 'breakpoints' into the simulation to stop when the address bus had a certain value, so I could halt at the routine where it starts to write to the YGV608 registers. That appeared to work, but what was happening after that didn't match the code at that point?!? Fast forward an hour or two and I suddenly realised - it wasn't EXECUTING at the breakpoint address, it was feeding that address into the CUSKEY!!!

A few hacks later and I managed to halt it at the actual breakpoint. Now I could finally debug the YGV608 register interface and tweak (fudge) the pattern name table pointers for the mode I wanted. While I was at it, I also added the code to pick up the border colour from the register. And here is what is displayed in the simulation:

Clear screen and correct border colour - in simulation

It may look like I've gone backwards, but in fact code is running from the ROMs and has cleared the screen, removing all the '0' characters from the empty pattern name table memory. And note the border is black, as it should be.

I have a few options moving forward from here.

One is to add the palette register interface and allow the code to set the palette, rather than use the hard-coded (but correct, partial) palette I have now.

Another is to find out why nothing else happens on the screen - it should eventually display the self test screen, and then continue to the menu. It's possible that it requires (at least) the VBLANK interrupt to get that far. Or some other I/O location to be implemented/fudged.

UPDATE: It's spinning reading $40002 waiting for something to set bit 7...

Or I could try to get this running on MiSTer. That will require another attempt at getting things running from SDRAM - both Graphics ROM and CPU ROM. Something that could require a few days at least.

I think I'll attempt the above in the order I've listed.