WarpSE: 25 MHz 68HC000-based accelerator for Mac SE

Patrick · May 23, 2023

nah, more like the

(ok, that is the Thomson version. but better representative of what i had in mind)
i'm not really an electronics guy. so you say 68000, i think of what i'm used to when i open up my older macs

not that it matters. i was just surprised there were that many packages for it.

Trash80toG4 · May 23, 2023

What's that package called, looks like the DuoDock+ ROM?

Zane Kaminski · May 23, 2023

Patrick said:
nah, more like the

View attachment 12418 (ok, that is the Thomson version. but better representative of what i had in mind)
i'm not really an electronics guy. so you say 68000, i think of what i'm used to when i open up my older macs

not that it matters. i was just surprised there were that many packages for it.

Ah got it. Surface mount is so much easier for us than through hole since we can basically do it automatically. Only thing I’ve learned to not do as surface-mount is the PLCC sockets for the CPU and flash ROM. It’s better to use through hole sockets for those surface mount components. If there’s a soldering defect under the SMD socket they’re too hard to fix since the plastic of the socket blocks the pins. Some people cut out the center of the sockets for easier soldering but like… I had a friend in high school who would “fix” people’s cars. The conversation at the end of the repair would go like this:

“Hey thanks for fixing my car. Here’s the money.”
“No problem, here’s your keys back and here’s your extra parts.”
“Extra parts?”
“Yeah I managed to put the car together without requiring these parts so I just left them off. Y’know like weight savings”

So I don’t think you’re supposed to cut the middle of the PLCC socket out…

Trash80toG4 said:
What's that package called, looks like the DuoDock+ ROM?

I’d just call it a PQFP as in “plastic quad flat pack.” Sometimes PQFPs are square, sometimes they’re non-square rectangles. But the difference between them and the more modern TQFP “thin quad flat pack” is that TQFP chips are always much thinner and usually have smaller pitch. PQFP almost always have like 0.65mm pin spacing whereas TQFP usually has 0.5mm. PQFPs can be several millimeters tall so they’re dimensionally closer to CQFP “ceramic quad flat pack” chips like the surface mount 68030

Trash80toG4 · May 24, 2023

Zane Kaminski said:
So I don’t think you’re supposed to cut the middle of the PLCC socket out…

I don't think so either, form factors are finite and removing gobs of plastic that's meant to be there would seem to be counterproductive in terms of reliability.

Melkhior · Jun 6, 2023

Zane Kaminski said:
Actually I've redesigned this RAM controller

Is a controller necessary? I went down a rabbit hole looking at SRAM to maybe do a cache for a 68030 (I hate that modern SRAMs don't have a reset function, how do I implement the valid and/or tag bits?), and I realized - how much memory do you actually need for a Mac running an older OS? Chips like the IS61C5128AL-10TLI are 3.77€ each (qty 10), run at 5V, are very fast (10ns!), and don't need refresh. I don't know about the '000, but for the '030 and '040 the UMs document how to do an SRAM-based memory subsystem. You'd need 8 chips for a full complement of 4 MiB in an SE; that's still only about 30€ worth of SRAM (larger SRAMs are usually 3V3 now). Wouldn't that make the '000 even faster? And those chips are current production, so no need to source vintage DRAMs.

I wonder what the performance of a '040v running out of a bunch of IS61WV20488FBLL-10TLI (2 MiB each, so just use 8 to do the banking thing like in the UM and get 16 MiB for less than 100€) would be like...
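For anyone who wants to check the arithmetic, here is a quick sketch. The 8-bit-wide chip organization (512 KiB and 2 MiB per chip) is an assumption read off the figures in the post, and `sram_bank` is just a made-up helper name; the unit price is the one quoted above.

```python
from typing import Optional, Tuple

KIB = 1024
MIB = 1024 * KIB


def sram_bank(chip_bytes: int, chip_count: int,
              unit_price_eur: Optional[float] = None) -> Tuple[int, Optional[float]]:
    """Return (total bytes, total price or None) for a set of identical chips."""
    total_bytes = chip_bytes * chip_count
    total_price = None if unit_price_eur is None else unit_price_eur * chip_count
    return total_bytes, total_price


# Mac SE case: eight 512 KiB chips at 3.77 EUR each (the qty 10 price from the post).
se_bytes, se_price = sram_bank(512 * KIB, 8, 3.77)
print(f"SE: {se_bytes // MIB} MiB for about {se_price:.2f} EUR")

# '040 case: eight 2 MiB chips. No unit price was given, only "less than 100 EUR" total.
q_bytes, _ = sram_bank(2 * MIB, 8)
print(f"'040: {q_bytes // MIB} MiB")
```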

Zane Kaminski · Jun 6, 2023

Just do a small tagged SRAM cache in the FPGA and then use (S)DRAM for the main memory instead of SRAM. 30 EUR is a lot for RAM. It’d be better to spend that on a larger or faster FPGA. For reference, I just bought a few hundred RAM chips for the production batch of the WarpSE and paid less than $2 each. Next version will have a single 32 MB SDRAM chip costing $1. So it’s better to use 3.3V SDRAM with level shifting buffers and make up the latency difference with more advanced computer architecturey features. The refresh overhead is about 1% at most, so that’s not a major contributor to performance loss. In the next revision with SDRAM, I will be doing an SDRAM controller with 0 wait states at 25 MHz and completely hidden refresh but again, totally hiding the refresh gets you 1% speedup at best. Faster than 25 or 30 MHz requires wait states or a cache/prefetch buffer because SDRAM has a lot of latency. The good part is that SDRAM is burst-oriented so on an ‘030 or ‘040, you don’t have to do the banking thing and you can just use one bank of chips. Most of computer architecture is figuring out how to get performance as good as SRAM with DRAM main memory lol, so I would be abdicating some of my responsibility if I “cheated” and used SRAM.
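To put a rough number on the "about 1%" refresh figure, here is a back-of-the-envelope sketch. Every parameter is an illustrative assumption (a 25 MHz clock, commonly quoted row counts and retention times, and a guessed number of clocks lost per refresh), not a measurement from the WarpSE.

```python
def refresh_overhead(clock_hz: float, rows: int, retention_s: float,
                     cycles_per_refresh: int) -> float:
    """Fraction of bus time spent on refresh if refresh is never hidden."""
    refresh_interval_s = retention_s / rows          # one row refresh is due this often
    refresh_time_s = cycles_per_refresh / clock_hz   # bus time one refresh occupies
    return refresh_time_s / refresh_interval_s


# Assumed SDRAM-style case: 8192 rows per 64 ms, 2 clocks lost per refresh.
sdram_case = refresh_overhead(25e6, 8192, 64e-3, 2)

# Assumed older-DRAM-style case: 1024 rows per 16 ms, 4 clocks lost per refresh.
dram_case = refresh_overhead(25e6, 1024, 16e-3, 4)

print(f"SDRAM-style assumption: {sdram_case * 100:.2f}% of bus time")
print(f"DRAM-style assumption:  {dram_case * 100:.2f}% of bus time")
```

Under those assumptions both cases land near 1%, which is why fully hiding refresh buys so little.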

Zane Kaminski · Jun 7, 2023

Oh also @Melkhior, regarding flushing the cache, I suggest doing it differently. When a cache flush is initiated, the cache will be disabled for some time. During this time, the FPGA iterates through the cache memory and invalidates each line. Maybe you have to acquire the bus to do this, in which case you must spread out the flush so that the CPU is not starved of memory access for too long. Only after flushing is done can the cache be reenabled. Enabling the cache before the flush is finished sets the enable bits in whatever cache control register but doesn’t actually enable it until the flush is done. You can do this on boot too. It’s also possible to map the tag RAM into the CPU address space and do it with the CPU but that will be slower and the CPU will have to spend a lot of time in the cache disable/enable routine.

But anyway, I think it’s better to couple the L2 with the RAM controller so no cache flushing is necessary except on boot.
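The deferred-enable idea described above can be modeled in a few lines. This is only a toy software sketch of the behavior, not FPGA code and not anyone's actual design; the class name, line count and sweep rate are all hypothetical.

```python
class FlushingCache:
    """Enable bit is latched at once; the cache goes live only after the sweep ends."""

    def __init__(self, num_lines: int, lines_per_step: int = 1):
        self.valid = [False] * num_lines
        self.lines_per_step = lines_per_step  # kept small so the CPU is not starved
        self.enable_requested = False         # what a control register would read back
        self.flush_index = None               # None means no flush in progress

    def start_flush(self) -> None:
        """Begin a flush; the cache counts as disabled until it completes."""
        self.flush_index = 0

    def request_enable(self, on: bool) -> None:
        self.enable_requested = on

    @property
    def active(self) -> bool:
        """The cache is only consulted when enabled and no flush is pending."""
        return self.enable_requested and self.flush_index is None

    def step(self) -> None:
        """One slice of background work, e.g. one idle bus slot."""
        if self.flush_index is None:
            return
        end = min(self.flush_index + self.lines_per_step, len(self.valid))
        for i in range(self.flush_index, end):
            self.valid[i] = False
        self.flush_index = None if end == len(self.valid) else end


cache = FlushingCache(num_lines=8, lines_per_step=2)
cache.valid = [True] * 8      # pretend the tag RAM powered up with stale state
cache.start_flush()
cache.request_enable(True)    # accepted immediately, but not yet effective
steps = 0
while not cache.active:
    cache.step()
    steps += 1
print(f"cache went live after {steps} steps")
```

The same sweep covers the boot case: start a flush at reset and the cache cannot go live until every valid bit has been cleared.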

Melkhior · Jun 7, 2023

Zane Kaminski said:
Just do a small tagged SRAM cache in the FPGA

Current working theory for a cache would be to have the data in dedicated SRAM, and the control logic & tag/valid bits in the FPGA. And then looking for SRAM revealed the rabbit hole

Zane Kaminski said:
and then use (S)DRAM for the main memory instead of SRAM. 30 EUR is a lot for RAM. It’d be better to spend that on a larger or faster FPGA. For reference, I just bought a few hundred RAM chips for the production batch of the WarpSE and paid less than $2 each.

I thought that vintage DRAM would be more expensive than that. SRAM is indeed at least an order of magnitude more expensive.

Zane Kaminski said:
Next version will have a single 32 MB SDRAM chip costing $1. So it’s better to use 3.3V SDRAM (...) The good part is that SDRAM is burst-oriented so on an ‘030 or ‘040, (...)

I see SDRAM chips on Mouser like the IS42S32200L (2M x 32). They seem a lot more complicated to use than pure SRAM - and I don't understand how the burst thing really works. In particular if it does the 'wrap-around' required by '030 and '040...

Zane Kaminski said:
(...) But anyway, I think it’s better to couple the L2 with the RAM controller so no cache flushing is necessary except on boot.

You still need some way to reset the tag/valid at boot, and iterating can take a lot of time if you have a fairly large cache.

Also my current idea is to cache existing memory in an existing system with some level of support (e.g. IIsi), in addition to any extra memory that could be added. Of course on a IIsi one might just add enough memory on the PDS to make all on-board memory useless other than the first MiB for the vampire video...

Anyway, sorry for somewhat hijacking the discussion. Looking forward to all versions for 'inspiration'

Melkhior · Jun 7, 2023

Melkhior said:
I see SDRAM chips on Mouser like the IS42S32200L (2M x 32). They seem a lot more complicated to use than pure SRAM - and I don't understand how the burst thing really works. In particular if it does the 'wrap-around' required by '030 and '040...

It seems it's not in the datasheet because it's part of the JEDEC standard - and it does appear in some other datasheets, such as the cheap (and available from JLCPCB) W9825G6KH-6, 32 MiB in a 16-bits wide package for $1.2292 (qty 1). So SDRAM is definitely a better choice, at least for '030 and '040 using burst.
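A small sketch of how the wrap-around lines up, as I understand it. It assumes the SDRAM is set to sequential burst type with burst length 4 and sits on a 32-bit data path, so one column is one longword; both function names are made up for illustration.

```python
from typing import List


def sdram_sequential_burst(start_col: int, burst_len: int) -> List[int]:
    """Column order of a sequential burst: it wraps inside the aligned block."""
    if burst_len not in (1, 2, 4, 8):
        raise ValueError("burst length must be 1, 2, 4 or 8")
    base = start_col & ~(burst_len - 1)      # aligned start of the burst block
    offset = start_col & (burst_len - 1)     # position inside the block
    return [base | ((offset + i) % burst_len) for i in range(burst_len)]


def cpu_line_fill(addr: int) -> List[int]:
    """Longword addresses of a 16-byte line fill that wraps within the line."""
    line = addr & ~0xF
    first = (addr >> 2) & 0x3
    return [line | (((first + i) & 0x3) << 2) for i in range(4)]


miss_addr = 0x1008                                    # CPU misses on the third longword
cpu_order = cpu_line_fill(miss_addr)
sdram_order = sdram_sequential_burst(miss_addr >> 2, 4)

print([hex(a) for a in cpu_order])
print([hex(c) for c in sdram_order])
print("orders match:", [a >> 2 for a in cpu_order] == sdram_order)
```

With those assumptions the SDRAM hands back the longwords in the same order the CPU asks for them, so no reordering logic is needed in the controller.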

Edit: ... the downside is unlike SRAM, you can't connect the address lines directly from the CPU as you need the row/column stuff on the multiplexed address lines on the SDRAM. Which means a lot more pins from the FPGA :-( Maybe it's doable with external buffers? The trollbook is using dedicated pins from the FPGA.
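For reference, this is roughly what the "row/column stuff" amounts to. The geometry is an assumption (13 row bits, 9 column bits, 2 bank bits on a 16-bit part, which adds up to 32 MiB), and the bit ordering is just one arbitrary choice; a real controller might map the bits differently.

```python
ROW_BITS, COL_BITS, BANK_BITS = 13, 9, 2   # assumed geometry, see above


def split_address(byte_addr: int):
    """Split a CPU byte address into (bank, row, column) for a 16-bit-wide SDRAM."""
    word = byte_addr >> 1                                  # 16-bit words, so drop A0
    col = word & ((1 << COL_BITS) - 1)
    bank = (word >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (word >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return bank, row, col


def bus_phases(byte_addr: int):
    """The two phases of one access: the same address pins carry row, then column."""
    bank, row, col = split_address(byte_addr)
    return [("ACTIVATE", bank, row), ("READ", bank, col)]


capacity_bytes = (1 << (ROW_BITS + COL_BITS + BANK_BITS)) * 2
print(f"capacity: {capacity_bytes // (1024 * 1024)} MiB")
for command, bank, pins in bus_phases(0x0123456):
    print(f"{command:8s} bank={bank} address pins={pins:#06x}")
```

The point is that the FPGA (or some mux) has to sit between the CPU address bus and the SDRAM address pins, because the two halves go out on the same pins at different times.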

Zane Kaminski · Jun 7, 2023

Melkhior said:
unlike SRAM, you can't connect the address lines directly from the CPU as you need the row/column stuff on the multiplexed address lines on the SDRAM. Which means a lot more pins from the FPGA :-( Maybe it's doable with external buffers? The trollbook is using dedicated pins from the FPGA.

SDRAM is not that bad. 15 address pins plus CS, RAS, CAS, WE, CKE, and then the clock which might come from an oscillator or your FPGA. So 20-21 extra pins on the FPGA are required if it’s already connected to the full 68k address bus.

Melkhior · Jun 7, 2023

Zane Kaminski said:
SDRAM is not that bad. 15 address pins plus CS, RAS, CAS, WE, CKE, and then the clock which might come from an oscillator or your FPGA. So 20-21 extra pins on the FPGA are required if it’s already connected to the full 68k address bus.

That's a lot of pins

When all is said and done, I already need 92 pins to get most of the PDS pins connected to the FPGA (ignoring things like IPL*, RMW, TM*A, PFW, BUSLOCK, NUBUS and the non-cpu clocks, this is 68030 so already 64 for A+D). HDMI needs 8 (that's the primary purpose after all) and my board only has 100 I/O

SRAM is doable by dropping some features like bus-mastering and extra IRQs to reclaim 8 signals, but otherwise, I'm gonna need a bigger board ;-) Unfortunately, boards with more I/Os are usually a lot more expensive...
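Putting the numbers from the last few posts side by side, as a rough tally. The counts are the posters' own; treating the 15 "address pins" as including any bank selects is my assumption.

```python
BOARD_IO = 100

# Already committed, per the post above.
used = {"PDS bus (68030, including 64 for A+D)": 92, "HDMI": 8}

# SDRAM signals as listed earlier: 15 address pins plus five control signals,
# and optionally the clock if the FPGA has to drive it.
sdram = {"address": 15, "CS": 1, "RAS": 1, "CAS": 1, "WE": 1, "CKE": 1}
sdram_clock = 1

base = sum(used.values())
sdram_min = sum(sdram.values())
sdram_max = sdram_min + sdram_clock

print(f"committed: {base} of {BOARD_IO}")
print(f"SDRAM needs {sdram_min} to {sdram_max} more")
print(f"shortfall: {base + sdram_min - BOARD_IO} to {base + sdram_max - BOARD_IO} pins")
```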

Melkhior · Jul 31, 2023

Zane Kaminski said:
Then I’ll focus on another accelerator architecture applicable to the (...) LC/LCII/CC (...)

Having discussed the PDS slot in the LC for other reasons recently, dumb question: how do you plan to work around the missing address lines (A28, A29, A30) in the LC PDS ?

My potential use case was to bring the IIsiFPGA's ability to expand the memory in the IIsi to other systems. Easy for the SE/30, LCIII or IIci ("just" some ROM work). But without the ability to decode full addresses, it doesn't feel doable in the LC/LCII. It also feels like accelerators are screwed if they can't distinguish between the memory in $0xxx_xxxx and the I/O stuff in $5xxx_xxxx.

One could stick to 24-bits mode I suppose, but that's very restrictive (8 MiB memory max, up to 1 MiB only per expansion 'slot').
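To make the aliasing problem concrete, here is a small sketch. The missing lines (A28, A29, A30) and the two regions are taken from the post; the example addresses and the helper name are made up.

```python
MISSING_LINES = (28, 29, 30)
VISIBLE_MASK = 0xFFFFFFFF & ~sum(1 << bit for bit in MISSING_LINES)


def seen_by_card(cpu_addr: int) -> int:
    """Address as observed by a card that cannot see the missing lines."""
    return cpu_addr & VISIBLE_MASK


ram_addr = 0x00123456   # somewhere in $0xxx_xxxx (memory)
io_addr = 0x50123456    # somewhere in $5xxx_xxxx (I/O)

print(hex(VISIBLE_MASK))
print(hex(seen_by_card(ram_addr)), hex(seen_by_card(io_addr)))
print("aliased:", seen_by_card(ram_addr) == seen_by_card(io_addr))

# For comparison, a 24-bit address only spans 16 MiB in total.
print((1 << 24) // (1 << 20), "MiB addressable with 24 bits")
```

$0 and $5 differ only in A28 and A30, which are exactly the lines the card cannot see, so from the slot the two regions look identical.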

Ubik · Oct 1, 2023

This thread has been quiet. Is WarpSE still alive?

JDW · Oct 1, 2023

Ubik said:
This thread has been quiet. Is WarpSE still alive?

Sadly, it seems the answer is no. I sent numerous private emails to Zane and Garrett during the 7 weeks I spent making my ROM SIMM video, and I did not receive a reply.

Zane did send a single email to our ROM SIMM group at the very beginning, but nothing thereafter. Very disappointing because their 2x8MB ROM SIMM is the best ROM SIMM tech I’ve seen. It was announced several years ago and Zane very graciously sent me a sample, but it was never sold to the general public.

This is why I fear the otherwise amazing accelerator discussed in this thread has met the same fate.

jasa1063 · Oct 1, 2023

JDW said:
Sadly, it seems the answer is no. I sent numerous private emails to Zane and Garrett during the 7 weeks I spent making my ROM SIMM video, and I did not receive a reply.

Zane did send a single email to our ROM SIMM group at the very beginning, but nothing thereafter. Very disappointing because their 2x8MB ROM SIMM is the best ROM SIMM tech I’ve seen. It was announced several years ago and Zane very graciously sent me a sample, but it was never sold to the general public.

This is why I fear the otherwise amazing accelerator discussed in this thread has met the same fate.

I certainly hope that is not the case.

Ubik · Oct 1, 2023

JDW said:
Sadly, it seems the answer is no. I sent numerous private emails to Zane and Garrett during the 7 weeks I spent making my ROM SIMM video, and I did not receive a reply.

Zane did send a single email to our ROM SIMM group at the very beginning, but nothing thereafter. Very disappointing because their 2x8MB ROM SIMM is the best ROM SIMM tech I’ve seen. It was announced several years ago and Zane very graciously sent me a sample, but it was never sold to the general public.

This is why I fear the otherwise amazing accelerator discussed in this thread has met the same fate.

Thanks JDW. Seems like the best direction for these accelerator projects is more open-sourced like BlueSCSI. Then multiple sources could build, improve, and promote. There are so many super-smart people (Zane included) in the community that could contribute and benefit.

JDW · Oct 1, 2023

@Ubik
One of the last emails I sent to them was to suggest that they do something similar to what you suggest, or at least assign rights to someone that they trust, be that Steve Chamberlin of BMOW, Kay Koba, CayMac Vintage, whoever! I suggested maybe they could even ask for a royalty if they wanted some compensation for their efforts, which they deserve for coming up with some of the best engineering designs I've seen to date!

But again, I got zero replies. I don't think it's a situation where I am disliked and therefore the replies are not forthcoming, but if you folks want to send a PM to Zane, by all means!

I love their tech, but I am heartbroken it is left to languish. And while it could be that this accelerator project is not 100% complete and therefore could not be in a sellable form, the fact remains that their amazing 16MB ROM SIMM (two selectable 8MB SIMMs in one!) is complete and verified by myself to work great. Indeed, I worked with Doug Brown for a long time so his new SIMM Programmer app would work with the Garrett's Workshop ROM SIMM, eliminating the custom firmware their SIMM required in the past. Now you can flash it just like any other ROM SIMM.

Another possibility is that Steve C., Kay Koba, CayMac or other interested parties could approach Zane and/or Garrett and make an offer.

I really don't know what's going on, but the "life gets in the way" excuse is always just an excuse. Even if the unthinkable happened, you could at least shift to a different business model to ensure the tech gets out in the wild. You will lose some control over it, sure, but at least it will see the light of day. And that brings to mind this maxim...

Great Artists Ship

While my words may seem like a chastisement, sometimes we need a small kick in the booty. If Steve never did that, Apple wouldn't exist today. I really do want to see the accelerator and the ROM SIMM get into the hands of the Mac community as soon as possible, and I hope Zane and Garrett can benefit from that too. It's been YEARS since the ROM SIMM was announced, and "life hasn't got in the way" all those years. As with anything in life, you at some point just have to be decisive.

Zane, if you ever read this, I love you, buddy! We hope to see you and Garrett back in action very soon!

YMK · Oct 2, 2023

Not to pile on, though I'm reminded of this:

Also amusing is that his quip about biotech is no longer true.

Stephen · Oct 2, 2023

Many of these projects are done by enthusiasts in their spare time. Let’s be considerate and understanding that there’s zero obligation.

I’m really proud to have folks like Zane contribute and participate in our community. He has personally supported me on several occasions, including “hand holding” when I needed a little more help understanding something.

Drake · Oct 2, 2023
