Finally got all of the Garrett’s Workshop to-do list done up to the WarpSE! We had to change the lab/factory layout, fix the pick and place machine, release several new/updated products, and then build a thousand or so gizmos which were out of stock. Also whenever the landlord comes over to where we have the lab, we have to hide stuff like the massive conveyor reflow oven… And unfortunately the landlord is overly fastidious and is always trying to do things like A/C inspections, duct cleanings, upgrading the garbage disposal, changing the electrical outlet covers, etc. “I used to be a doctor and I worked on my patients. Now I am a landlord and I work on my houses,” he says. Anyway we have persevered through this (hahahahahah) and now we have a pretty big stock of all our Apple II and Mac products, more money in the company bank account, and nothing left to make but WarpSEs. So we’re just about ready for WarpSE beta testing. Once we finish making WarpSEs we will be outsourcing production of all of our stuff since making it ourselves is just too time-consuming, even though we have a great setup with a pick and place machine with vision, conveyor reflow oven, etc. We think outsourcing production will let us design new products much faster. Feels like only 10% of my time working on GW stuff is designing boards or programming and the other 90% of the time is spent moving chips into the desiccating oven, screening solder paste on boards, restocking the pick and place while it runs, fixing the pick and place when it inevitably gets hung up, putting boards through the oven, fixing soldering defects, cleaning boards, drying boards, etc.
Sorry about disappearing, especially if it affected anyone’s accelerator purchase decisions. Did I really say when the WarpSE was gonna be released? Hmm I guess schedule optimism runs very deep in me. I do agree, @jasa1063, I am representing a real company and I should have at least reiterated that the project was way down on our long to-do list.
Anyway I do believe the release is somewhat close. Should be in the next 3-5 months maybe? Monday night I made a batch of WarpSE prototypes with a new board revision:
Most of these are made with parts salvaged from the old prototypes and other boards we had lying around so some are not super pretty (lol look at that RAM chip that has “EDO” written on it) but they will suffice for internal use and beta testing. I still need to fix some soldering defects on the boards (mostly due to the use of used chips), solder the PDS connectors on, fix the Windows 7 PC (it runs the Xilinx software), recompile the CPLD firmware, “bundle” the firmware for the USB update system, flash the CPLDs, test the boards, and fix any issues in the firmware or soldering issues on the boards. Then I can send some WarpSEs to testers. If they work well we can go into mass production. I plan to make something like 230 WarpSE boards for starters and the price will be $150 including shipping within the United States. Shipping to places outside of the US will be fairly reasonable, like $17-20. 230 boards should last a while and we can maybe make another few hundred before ending production in our little factory for the foreseeable future.
I fixed a number of small issues in this latest version…
The old version required that modwire for the D0 bit on the accelerated CPU which was super annoying. Fixed that.
The old version had a 25 MHz crystal oscillator since the CPU runs at 25 MHz. Unfortunately most oscillators have a very loose duty cycle spec. The clock could be high for anywhere from 40% to 60% of each period. The MC68HC000 is only specced for 20 MHz and the specs demand perfect 50% duty cycle at that speed. Since the WarpSE will run at 25 MHz, we had better fix the duty cycle. Therefore this new version has a 50 MHz oscillator which is then divided by two to make the 25 MHz clock. The divided clock only toggles on one edge of the oscillator, so its high and low times are each exactly one oscillator period no matter how lopsided the oscillator is. This results in a near-perfect 50% duty cycle. An added bonus is that we can stop the 68k clock or only pulse it every once in a while. This will help if we need to slow down to cycle-accurate speeds.
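To make the divide-by-two trick a bit more concrete, here is a minimal Python sketch of the idea. It is only a model, not the actual CPLD logic: the function names, the sample counts, and the 40% example input are all made up for illustration. It feeds a lopsided square wave into a toggle flip-flop and measures the duty cycle before and after.

```python
def make_clock(cycles, period_steps, high_steps):
    """Build a sampled square wave that is high for high_steps out of every period_steps."""
    one_cycle = [1] * high_steps + [0] * (period_steps - high_steps)
    return one_cycle * cycles


def divide_by_two(input_levels):
    """Model a toggle flip-flop: the output flips on every rising edge of the input."""
    out = 0
    prev = 0
    result = []
    for level in input_levels:
        if prev == 0 and level == 1:  # rising edge detected
            out ^= 1
        result.append(out)
        prev = level
    return result


def duty_cycle(levels):
    """Fraction of samples during which the signal is high."""
    return sum(levels) / len(levels)


# Hypothetical lopsided oscillator: high for only 40% of each period.
osc = make_clock(cycles=100, period_steps=10, high_steps=4)
cpu_clk = divide_by_two(osc)

print(duty_cycle(osc))      # 0.4
print(duty_cycle(cpu_clk))  # 0.5
```

The same thing happens with a 60% input, since only the spacing between rising edges matters. Real hardware still has slightly different rise and fall delays, which is why the result is "near-perfect" rather than perfect.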
I also added an “overclocking header.” Near the clock oscillator, there’s a new "CLKIN" header where you can supply a faster clock signal. This way you can connect a little board with, say, a 66 MHz oscillator and then run at 33 MHz. The CPU and CPLD can probably do it but the RAM may need an additional wait state. This can be added via firmware update. There’s also a debug header with +5V power and 6 I/O pins connected to the CPLD. Maybe I’ll use it for debugging and maybe it can do something in the future, but the main use might just be to mount and power the oscillator board for overclocking.
Another change is that the CPLD produces the A23 and A22 address bits going to the PDS instead of these directly coming from the MC68HC000. This way we can map motherboard RAM in the accelerated CPU’s address space. It won’t be fast and it’s not contiguous with main RAM but maybe someone can figure out how to patch the memory manager to use it. If not then it would be great as a RAM disk too!
I figured that I should buffer the 8 MHz, 16 MHz, and E clocks coming from the Mac’s motherboard near the PDS connector. The clock traces were like 4+ inches long and Designing Cards and Drivers says not to make them longer than something like an inch or two. Probably doesn’t matter but no sense risking some kind of marginal instability.
Regarding the USB update capability, we have now shipped two products with our unique and very slow but extremely inexpensive USB firmware update system. Another product with USB update is in final qualification testing. Before those, we had not shipped anything with our implementation of USB firmware update and we were not sure if it was gonna work well on all Windows-based systems. We did discover a problem when running an update under certain hypervisors such as Parallels on Apple Silicon. Parallels has a curious bug where the update runs *faster* than expected when you plug in our card through a USB 2.0 hub. This caused the data received from the card to sometimes get corrupted on its way to the PC, since the program was basically reading data back before it had all been sent. One of our testers told us about a similar issue he helped the Raspberry Pi team with many years ago. Interesting! Basically, when communicating with USB 1.x (1.5/12 Mbps) devices (such as the chip used in our update system) over a USB 2.0 (480 Mbps) connection to a hub, a transaction with a device is split up into two portions. During the first part of the transfer, data is sent at 480 Mbps speed to a buffer in the hub. Then the hub slowly sends that data to the USB 1.x device at 1.5 or 12 Mbps speed. Once this is done, a completion message is sent across the USB bus. In between, data can be transferred to other devices. This way using a USB 1.x device doesn’t degrade the hub’s overall bandwidth. Apparently Parallels has a bug in their virtual USB subsystem which reports that the transaction is over after the first part of the split transaction but before the completion part. Therefore even when "overlapped" (asynchronous) I/O is disabled, writing to the device returns once the data is transferred to the hub but before the data transfer to the USB 1.x device completes.
We applied a workaround for this since right now the program is only available for Windows (so users of Macs will have to use Parallels or VMware) and now updating over USB works super reliably on every system we have tried. Also we changed the microUSB connector on the WarpSE to a nicer Amphenol brand connector since the previous prototype’s cheaper connector felt awful to plug in.
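For anyone who wants to see why the early return in the split-transaction bug matters, here is a rough timing model in Python. It is a sketch under big assumptions: it ignores all USB protocol overhead (tokens, handshakes, bit stuffing, scheduling), all the function names are hypothetical, and it does not show the workaround we actually applied. Time is counted in "ticks" where one tick is one bit time at 480 Mbps, so one bit at 1.5 Mbps takes 320 ticks.

```python
HS_TICKS_PER_BIT = 1    # 480 Mbps: one tick per bit
LS_TICKS_PER_BIT = 320  # 1.5 Mbps is 320x slower than 480 Mbps


def write_timeline(num_bytes):
    """Return (tick when the hub has buffered the data, tick when the device has it all)."""
    hub_done = num_bytes * 8 * HS_TICKS_PER_BIT
    device_done = hub_done + num_bytes * 8 * LS_TICKS_PER_BIT
    return hub_done, device_done


def write_returns_at(num_bytes, buggy_host):
    """Tick at which the host's write call returns to the program."""
    hub_done, device_done = write_timeline(num_bytes)
    # A correct host waits for the completion; the buggy one returns after the first part.
    return hub_done if buggy_host else device_done


def bytes_delivered(tick, num_bytes):
    """How many whole bytes the slow device has received by the given tick."""
    hub_done, _ = write_timeline(num_bytes)
    if tick <= hub_done:
        return 0
    return min(num_bytes, (tick - hub_done) // (8 * LS_TICKS_PER_BIT))


n = 8  # hypothetical 8-byte write
for buggy in (False, True):
    t = write_returns_at(n, buggy)
    print(buggy, t, bytes_delivered(t, n))
# False 20544 8
# True 64 0
```

In this model the correct host returns only after all 8 bytes have reached the device, while the buggy host returns more than 300 times sooner with nothing delivered yet, so an immediate read-back sees stale data.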
The DIP switches were removed too since like, why have options? There should just be one config that works for 99% of people and if you really need, you can apply a different one with the update system. Also we will be removing the CPU and ROM sockets from the production version. We feel these introduce more unreliability than there is risk of a soldered CPU breaking. Big PLCC sockets in particular tend to eject the chips over time. As for removing the sockets for the onboard ROMs, we can eventually have an app for flashing the ROMs. Good starting point would be BMoW’s ROMinator 1 app if he’s willing to release it with an open-source license. In case your ROM gets corrupted due to a power loss while flashing, we can make a special firmware that uses motherboard ROM instead of onboard fast ROM. That way you can boot up and use the utility to reflash the fast ROM.
Performance will be the same as the previous prototypes. In Speedometer 3 the scores were something like 4.1x speedup in the CPU test and 3.4x speedup in the graphics test. I think this is just shy of an SE/30 although FPU performance is of course much lower.
As I said before, the sound issue has been fully addressed albeit a bit sloppily. Basically when the CPU accesses sound RAM, the CPU slows down for a little while. The mechanism of the slowdown is that many wait states are inserted for every memory access. Eventually sound slowdown expires if sound RAM has not been accessed in a while and the accelerator goes back to full speed. This is sloppy in the sense that I just added wait states until the sound issue didn’t happen, then added another one or two for good measure. This is fine but it would be better if it was cycle-accurate. The new clock circuitry enables this and instead of just inserting an arbitrary number of wait states to slow the CPU, we can pulse the CPU clock one time per motherboard 7.8336 MHz clock. This is “cycle-accurate sound slowdown.” This is possible with the new hardware but not implemented in the logic yet. Right now many of the subsystems in the accelerator controller CPLD assume that the CPU will receive a particular signal and then everything is supposed to move on to whatever the next state is supposed to be. Obviously with the CPU’s clock stopped, the CPU is not gonna do anything. So all of the subsystems in the controller need to be updated to receive the “CPU clock stopped” signal and give it appropriate consideration. Should be easy enough to implement but the current solution works well, I think, so we can put this off and proceed with beta testing.
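Here is a small Python model of the current (wait-state based) sound slowdown, just to show the shape of the mechanism. This is not the CPLD firmware. The class name, the numbers, and the choice to count the expiry in memory accesses rather than in real time are all assumptions for illustration.

```python
class SoundSlowdown:
    """Toy model: touching sound RAM arms a countdown; while armed, accesses get wait states."""

    def __init__(self, hold_accesses, slow_wait_states):
        self.hold_accesses = hold_accesses        # hypothetical: how long the slowdown lasts
        self.slow_wait_states = slow_wait_states  # hypothetical: extra wait states while slow
        self.remaining = 0

    def access(self, is_sound_ram):
        """Return the number of wait states inserted for this memory access."""
        if is_sound_ram:
            self.remaining = self.hold_accesses   # re-arm the slowdown
        elif self.remaining > 0:
            self.remaining -= 1                   # count down toward full speed
        return self.slow_wait_states if self.remaining > 0 else 0


slowdown = SoundSlowdown(hold_accesses=3, slow_wait_states=10)
pattern = [False, True, False, False, False, False]  # True = sound RAM access
trace = [slowdown.access(hit) for hit in pattern]
print(trace)  # [0, 10, 10, 10, 0, 0]
```

The cycle-accurate version would replace the fixed `slow_wait_states` number with gating of the CPU clock itself, which is the part that is not implemented in the logic yet.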
As for the Xilinx XC9500XL discontinuation, we already own quite a few XC95144XL chips. We may buy some more from distributors soon though. Eventually we would like to reimplement the whole thing (68k and all) in an FPGA, but first we would like to ship a few hundred WarpSEs and maybe make something for the SE/30 or LC PDS. Then we will circle back to the WarpSE and try to eliminate all legacy parts from the design. I am eyeing the Suska 68K10 68010-compatible core which the creator has been developing in one form or another for something like 20 years. If this works well then we can eliminate the 68k and combine all the RAM and flash into an SDRAM chip plus a little 8-pin flash ROM. Alternatively, maybe we could do a design with a wider memory system and sockets for an ’030 and FPU. The card would come pre-flashed with the 68K10 core but you could install your own 68030 and 68882 and flash an alternate firmware that uses the ’030. That would be good.
I’m quite busy and right now all my activities (work and otherwise) are sort of revolving around supervising some construction at my mom's house but I should be able to find some time to fix the WarpSEs over the next few weekends and get them out to beta testers. This is all assuming I didn’t screw something up in the new board layout lol but I bet it will work. In a month or two, I should have much more free time to produce the WarpSE boards. Once we get a bunch made and if there are no issues found during testing, we will release the card. We have to do the release slowly at first. If there's a widespread issue (which we have never had at GW but you never know) and everyone wants a refund (which of course they would be entitled to) it could like, bankrupt the company. So we have to release slowly over a month or maybe even more.
I'll definitely post an update once the prototypes are finished and working. Should be within the next month unless there's some issue with the board.