Reimplementing the BBU has been a much-discussed topic on here. Whenever it comes up, I always suggest that we should be focusing not on a form-and-function replacement for the BBU but on a redesigned Mac SE motherboard that uses a modern FPGA and SDRAM to implement the BBU and RAM. Thanks to @Kai Robinson's Mac SE Reloaded, we have the mechanical design for an SE board; now all we need to do is swap out the RAM, BBU, and other supporting components for a new implementation using a modern FPGA and SDRAM.
Now I have to say right off the bat that I am not able to pursue this project to completion, but I have had the design for the RAM controller and video output subsystem of the BBU in my head for a while. So here's my attempt to write down my thoughts on how to implement this portion of the BBU. I want to emphasize that my aim here in reimplementing the BBU is to eliminate legacy/hard-to-find components from the design and instead replace them with new-manufacture (harder to solder) FPGAs and 74xx chips. The total number of chips and soldered pins on this implementation of the BBU will be higher than in the Mac SE, and it will use fine-pitch surface-mount components. That's the trade-off associated with using new chips! So we have to accept the downsides of more chips, surface-mount, and more pins if we wanna replace the BBU in the way I am describing. And I am just trying to replace the BBU and associated circuitry, not the SWIM (floppy chip), ADB microcontroller, RTC+PRAM chip, etc. We can however replace the Sony sound chip and GLU because they're really easy to reimplement, either in 74xx or in the FPGA (if we have enough pins). If someone helps complete this project then we can move on to redoing the SWIM and other I/O chips but I think first things first, we need to do the BBU.
Right away, one of the supporting components we wanna eliminate is the 15.6672 MHz oscillator since it's somewhat rare. To generate this clock without an oscillator of the exact same frequency, we need an FPGA with a PLL. I like Lattice MachXO2. The LCMXO2-1200HC-4TG144C is in TQFP-144 package and has plenty of logic and 104 I/O pins (excluding JTAG). We supply the FPGA with a 25 MHz clock (very common frequency) and we can generate 15.6672 MHz using the internal PLL. Great!
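A PLL built from integer dividers can only hit rational multiples of the reference, so it's worth checking how close 25 MHz can get to 15.6672 MHz. Here's a minimal Python sketch of that check. The divider limit `MAX_DIV` is an assumption picked for illustration, not a MachXO2 datasheet figure, and a real PLL adds further constraints (VCO range, separate input/feedback/output dividers) that this ignores.

```python
from fractions import Fraction

def best_ratio(f_in_hz, f_out_hz, max_div):
    """Best M/N with N <= max_div approximating f_out/f_in, plus the error in ppm."""
    exact = Fraction(f_out_hz, f_in_hz)
    approx = exact.limit_denominator(max_div)
    ppm = float((approx - exact) / exact) * 1e6
    return approx, ppm

F_IN = 25_000_000     # 25 MHz reference oscillator, from the post
F_C16M = 15_667_200   # 15.6672 MHz target, from the post
MAX_DIV = 128         # ASSUMED divider limit, not taken from any datasheet

ratio, ppm = best_ratio(F_IN, F_C16M, MAX_DIV)
achieved_hz = float(F_IN * ratio)
print(f"{ratio.numerator}/{ratio.denominator} -> {achieved_hz:.1f} Hz ({ppm:+.1f} ppm)")
```

Under these assumptions the search lands on 47/75, which is about 34 ppm low. That's in the same ballpark as ordinary crystal tolerance, but whether the real PLL can be configured this way needs checking against the Lattice documentation.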
Next on the list to eliminate is legacy DRAM. We can use a single SDRAM chip. 8 MB is a common size of SDRAM these days and prices before the chip shortage were under $0.50 each in 100ish unit quantity. Since the Mac SE's RAM size is only 4 MB, we can also eliminate costly parallel ROM chips, instead opting to copy the contents of a cheap serial flash ROM into RAM at startup and serving the ROM reads out of RAM.
We need to connect to every signal the BBU connects to and do functionally identical stuff on all of the output lines. Nominally I would draw a block diagram but I am just trying to get this basic thought out there so that someone else can maybe do a layout. So let me just list the pins the FPGA needs to connect to:
Code:
Clock (4)
    (1) CLK_IN (25 MHz)
    (3) C16M, C8M, C4M
68k (52)
    (23) A[23:1] (68k address bus)
    (16) D[15:0] (SDRAM data bus + 68k data bus)
    (2) /DOE, DDIR (data bus buffer control)
    (4) /AS, /LDS, /UDS, R/W (68k outputs)
    (3) /DTACK, /VPA, /BERR (68k cycle termination signals)
    (2) /RESETi, /RESETo (reset in and out)
    (2) /IPL[1:0] (68k interrupt signals)
PDS (2)
    (2) /PMCYC, /EXT.DTK
SDRAM (22)
    (14) BA[1:0], RA[11:0] (SDRAM address bus)
    (2) DQMH, DQML (SDRAM high/low byte mask)
    (5) /CS, CKE, /RAS, /CAS, /WE (SDRAM control signals)
    (1) CLK (SDRAM clock)
Video (3)
    (3) HSYNC, VSYNC, VID
Sound + Brightness/Disk Speed (2)
    (2) SND, PBL
I/O Select (7)
    (5) SELSCC, SELSWIM, SELSCSI, SELVIA, EXTPDS
    (2) /RD (SCC), /WR (SCC)
SCSI Pseudo-DMA (3)
    (3) /EOP, DACK, DRQ
VIA Interface (3)
    (2) PB7, PA6
    (1) IRQ
Flash (4)
    (4) /CS, SCK, MOSI, MISO
JTAG (4)
    (4) TCK, TDI, TMS, TDO
If it's not obvious from the pinout, we are going to connect to the entire 68k address bus, unlike the BBU, which has a multiplexed connection to the address bus. Regarding the data bus, we are sharing it between the SDRAM, FPGA, and 68k, just like the BBU does. All signals coming from the FPGA and connecting to the 68k bus need level-shifting buffers since the FPGA is not 5V-tolerant. 74AHC245 is a good choice for these.
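As a quick sanity check on the pin budget, here's a small Python tally that counts the signals actually named in the list (JTAG left out) and compares the total against the 104 I/O pins quoted earlier. The `bus` helper and the group names are just for this sketch.

```python
IO_BUDGET = 104  # usable I/O count quoted in the post, JTAG excluded

def bus(name, hi, lo):
    """Expand a bus such as A[23:1] into individual signal names."""
    return [f"{name}{i}" for i in range(hi, lo - 1, -1)]

PIN_GROUPS = {
    "Clock": ["CLK_IN", "C16M", "C8M", "C4M"],
    "68k": bus("A", 23, 1) + bus("D", 15, 0) + ["/DOE", "DDIR"]
           + ["/AS", "/LDS", "/UDS", "R/W"] + ["/DTACK", "/VPA", "/BERR"]
           + ["/RESETi", "/RESETo"] + bus("/IPL", 1, 0),
    "PDS": ["/PMCYC", "/EXT.DTK"],
    "SDRAM": bus("BA", 1, 0) + bus("RA", 11, 0) + ["DQMH", "DQML"]
             + ["/CS", "CKE", "/RAS", "/CAS", "/WE", "CLK"],
    "Video": ["HSYNC", "VSYNC", "VID"],
    "Sound": ["SND", "PBL"],
    "I/O select": ["SELSCC", "SELSWIM", "SELSCSI", "SELVIA", "EXTPDS", "/RD", "/WR"],
    "SCSI pseudo-DMA": ["/EOP", "DACK", "DRQ"],
    "VIA": ["PB7", "PA6", "IRQ"],
    "Flash": ["/CS", "SCK", "MOSI", "MISO"],
}

counts = {group: len(signals) for group, signals in PIN_GROUPS.items()}
total = sum(counts.values())
for group, n in counts.items():
    print(f"{group:16s} {n:3d}")
print(f"{'Total':16s} {total:3d}  (spare: {IO_BUDGET - total})")
```

Counted this way the design needs 102 I/O pins, so it fits with two to spare, assuming the 104 figure holds for the chosen package.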
Now let's start implementing. We are going to implement clock generation, the SDRAM command sequence, and then video, in that order.
Internal to the FPGA, we are going to generate a 31.3344 MHz clock from the 25 MHz clock. This "C32M" clock will be used to run the state machine and SDRAM. But then we need to output some clocks. Referring to the list of pins, we need to output 15.6672 MHz (C16M), 7.8336 MHz (C8M) and 3.9168 MHz (C4M) clocks. I see this "C3.7M" stuff in the Bomarc schematics. Huh? I thought it was actually a 3.9168 MHz division of the main clock.
Alright, my strategy for the clock is as follows. First we are going to make a 4-bit binary counter, called S[3:0]. It will count from 0-15. We are going to use this counter to create the clock signals. Now in the Mac SE, the C16M clock is the sort of "master clock" in the system and all of the other clocks are registered outputs clocked by C16M. That means all of the other clocks will be slightly delayed compared to C16M since their toggling is a consequence of the C16M rising edge. According to "Designing Cards and Drivers," this delay can be up to 30 nanoseconds. So no matter what, we need to make sure the C8M and C4M clocks' rising edges are delayed a good bit compared to C16M, but not more than 30 ns. Okay, check this timing diagram:
I've just expanded the S[3:0] count sequence. From this it's really easy to get the three clock signals. C16M is S[0] inverted, C8M is S[1] but delayed one half clock cycle, and C4M is S[2], again delayed half a clock. So:
Great, we have our clock signals. We just send these out of the FPGA as-is. They all have the right frequency and C4M/C8M are delayed half a clock so they will come out of the FPGA approximately 16 nanoseconds (half a 31.3344 MHz clock period) after C16M. This delay is accomplished by sending the S[2] and S[1] bits through a falling edge-triggered register.
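The Verilog for this is quite simple:
Code:
reg [3:0] S;
output reg C16M;
output reg C8M;
output reg C4M;
// S and C16M update on the rising edge of the internal 31.3344 MHz clock.
always @(posedge C32M) begin
    S <= S+1;
    C16M <= S[0]; // registered one clock late, so C16M equals S[0] inverted
end
// C8M and C4M update on the falling edge, half a C32M period after C16M.
always @(negedge C32M) begin
    C8M <= S[1];
    C4M <= S[2];
end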
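To double-check the phase relationships without firing up a simulator, here's a rough behavioral model in Python. It steps the counter in half-periods of C32M and mirrors the two always blocks (posedge, then negedge). The function names are made up for this sketch, and real output delays through the FPGA pads are ignored.

```python
C32M_HZ = 31_334_400
HALF_PERIOD_NS = 1e9 / C32M_HZ / 2   # time between a posedge and the following negedge

def simulate(n_cycles):
    """Return rows of (time in half-periods, C16M, C8M, C4M)."""
    s = c16 = c8 = c4 = 0
    trace = []
    for cycle in range(n_cycles):
        # posedge: both assignments use the old value of S, like nonblocking <=
        c16, s = s & 1, (s + 1) & 0xF
        trace.append((2 * cycle, c16, c8, c4))
        # negedge: sample the counter half a period later
        c8, c4 = (s >> 1) & 1, (s >> 2) & 1
        trace.append((2 * cycle + 1, c16, c8, c4))
    return trace

def rising_edges(trace, column):
    """Times (in half-periods) at which the given column goes 0 -> 1."""
    edges, prev = [], 0
    for row in trace:
        if row[column] and not prev:
            edges.append(row[0])
        prev = row[column]
    return edges

trace = simulate(64)
c16_edges = rising_edges(trace, 1)
c8_edges = rising_edges(trace, 2)
c4_edges = rising_edges(trace, 3)
print("C16M rises at", c16_edges[:4])
print("C8M  rises at", c8_edges[:4])
print("C4M  rises at", c4_edges[:4])
print(f"C8M/C4M lag C16M by {HALF_PERIOD_NS:.2f} ns")
```

In this model every C8M and C4M rising edge falls exactly one half-period after a C16M rising edge, which is just under 16 ns and inside the 30 ns limit.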
Moving on, we need to design the RAM controller. We have to sort of copy this diagram of the BBU functionality from "Designing Cards and Drivers":
So basically, the BBU runs at 15.6672 MHz and repeats two different 8-state patterns. First the BBU allows the video circuitry to read RAM, then the BBU allows the MC68k to read RAM three times. If MC68k misses the RAM access window, it has to wait until the next opportunity to access RAM. After one video access and then three CPU access windows, the cycle repeats. Also, during vertical and horizontal blanking, the video access window is given to the CPU.
Let me elaborate a little bit on the "RAM access window." The Apple diagram is a little bit misleading because it shows /AS falling specifically during S2. If each RAM access took four clocks, it would line up this way, but sometimes the MC68k takes a number of clock cycles between accesses that is not a multiple of 4. In actuality, the MC68k can assert /AS after any C8M rising edge, i.e. in any even-numbered state. What the diagram is supposed to mean is that if /AS falls in a state other than S2, the BBU doesn't respond to it until the next S2, so it would be as if /AS only falls during S2. That's subtly different though, and our controller logic should not assume that /AS only falls in our state corresponding to S2.
Okay, so the Apple diagram has 8 states per RAM access, but we have a 31.3344 MHz clock (twice as fast) so we need 16 states. We already have this with the S[3:0] clock counter! So just like in the above image of the BBU's functionality, we need to define the operation sequence for video reads, CPU reads, and CPU writes. This will get encapsulated into the RAM controller block of our system. One thing the RAM controller block is not responsible for is deciding whether the current RAM access cycle should be for CPU or video access. That will be left to the client of the RAM controller. Okay I tried writing the operating sequence for video access first:
To give some context, VID/CPU is a signal supplied by the client of the RAM controller indicating whether the current sequence should be for video or CPU access. VID/CPU should be valid from S5-S14 but can change between S15-S4. The client of the RAM controller also must provide the address for the video access, but I have not included addressing stuff because it's pretty much just a straightforward consequence of the command issuance.
So referring to the above diagram, RCMD and RCKE form the SDRAM command. CKE is "clock enable." Basically, if CKE is 0, the next clock pulse is ignored by the SDRAM. So if CKE has been 0, then before issuing a command we need to issue a NOP with the clock enabled ("NOP CKE") to wake up the SDRAM chip. We start in our S4 by issuing a NOP CKE command to wake the SDRAM up after its clock was disabled previously. Then in S5 we check VID/CPU. For this timing diagram, we are assuming it's 1, i.e. video access, so we issue an ACTivate command to open the correct row for the video access.
We are assuming a CAS latency of 2 for the SDRAM so read data comes out of the RAM one clock after the read command is executed. We need to read two 16-bit words for the video access, so we issue two RD commands at the end of S6 and S7, but we manipulate the CKE pin and issue some NOPs to slow the data rate down and make a large capture window so the FPGA can clock in the data. At the end of S9 and S11, the data has been stable for a clock cycle or so, and we shift in the video data coming out of the RAM. Then by S12, we have VD[31:0] corresponding to the 32 bits of video data just read from RAM.
Not shown explicitly is precharging. The second read in the sequence oughta have the auto-precharge bit set so that a precharge is executed in S10 ahead of the REFresh command issued in S11 (and executing in S12). The rest of the time is just NOP with the clock disabled ("NOP CKD") to turn off the SDRAM clock and save power.
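Since the addressing is left out above, here's a rough Python sketch of one way it could go, purely as an illustration. The pin list gives BA[1:0] and RA[11:0], so an 8 MB x16 part works out to 8 column bits. The bit ordering (column in the low bits, then bank, then row) is an assumption chosen so that the two consecutive words of a video fetch share one open row. A10 selecting auto-precharge on RD/WR is standard SDRAM behavior.

```python
ROW_BITS, COL_BITS, BANK_BITS = 12, 8, 2   # 4 banks x 4096 rows x 256 columns x 16 bits = 8 MB
A10_AUTO_PRECHARGE = 1 << 10               # RA[10] high on RD/WR requests auto-precharge

def split_word_address(word_addr):
    """Split a 22-bit address of a 16-bit word into (bank, row, column). Ordering is ASSUMED."""
    assert 0 <= word_addr < 1 << (ROW_BITS + COL_BITS + BANK_BITS)
    col = word_addr & ((1 << COL_BITS) - 1)
    bank = (word_addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = word_addr >> (COL_BITS + BANK_BITS)
    return bank, row, col

def command_addresses(byte_addr, auto_precharge=True):
    """Values to drive on BA/RA for the ACT and the RD/WR of one access."""
    # The 68k has no A0 pin; /UDS and /LDS pick the byte, so drop the lowest bit.
    bank, row, col = split_word_address(byte_addr >> 1)
    rdwr_ra = col | (A10_AUTO_PRECHARGE if auto_precharge else 0)
    return {"bank": bank, "act_ra": row, "rdwr_ra": rdwr_ra}

first = command_addresses(0x000100)
second = command_addresses(0x000102)   # next 16-bit word of the same 32-bit video fetch
print(first, second)
```

With this ordering, both reads of a 32-bit-aligned video fetch always hit the row opened by the single ACT, which is what the two-RD sequence relies on.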
Moving on, we have similar diagrams for read and write CPU accesses. Here's reading:
And writing:
There isn't much to add here other than that precharge isn't explicitly shown. The RD and WR commands oughta be of the auto-precharge type. Also notice the RDDIR and /RDOE signals. These control the '245 buffers between the 68k data bus and the FPGA+SDRAM data bus. We have to enable the output at the right times for reading and writing and also set the direction correctly. For the video operations previously discussed, /RDOE is always 1 to isolate the video read data from the 68k bus.
So the idea here is that we are checking /AS and VID/CPU at the end of S5. Since the above diagrams are for the CPU access case, we check /AS, and if it's 0, then we go into the RAM read/write sequence. Of course if /AS is 1 then we need an idle sequence. Something like this:
Just do nothing...
Notice in all four cases (read/write/idle/video) the slight difference in the assumption on /AS and VID/CPU. In all cases, VID/CPU should be stable between S5 and S14. But the assumption on /AS differs. For the idle case, /AS should be checked and saved during S5, since it could go low in the future and we would not want to execute the second half of the RAM access sequence but not the first. But for the read and write cases, if /AS is low in S5, it is assumed not to change until the end of the RAM access cycle.
Notice also how the RAM controller makes demands of its client (not allowing VID/CPU or /AS to change at the wrong times) but also provides certain guarantees. If VID/CPU and /AS are set for a CPU access at the right time, the operation will complete in a specified number of cycles. Therefore the RAM controller doesn't have to be bothered generating some kind of proto-DTACK signal, and the external circuitry can just generate it based on the guaranteed completion of the RAM operation.
Oh also the /AS sent to the RAM controller should be further gated with the RAM/ROM select signal such that /AS as seen by the RAM controller doesn't fall when I/O devices are being accessed.
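The four cases boil down to one decision taken at S5 and held for the rest of the cycle. Here's a small Python model of that idea, as a sketch only. The names `pick_sequence` and `SlotLatch` are invented for illustration, and `n_as_gated` stands for /AS after gating with the RAM/ROM select.

```python
def pick_sequence(vid_n_cpu, n_as_gated, r_nw):
    """Sequence to run, as sampled at S5. All inputs are 0/1; /AS is active low."""
    if vid_n_cpu:
        return "video"
    if n_as_gated:
        return "idle"
    return "read" if r_nw else "write"

class SlotLatch:
    """Holds the S5 decision until the next S5, whatever /AS does in between."""
    def __init__(self):
        self.kind = "idle"

    def step(self, s, vid_n_cpu, n_as_gated, r_nw):
        if s == 5:
            self.kind = pick_sequence(vid_n_cpu, n_as_gated, r_nw)
        return self.kind

latch = SlotLatch()
history = []
for t in range(32):                 # two 16-state cycles, both CPU slots
    n_as = 1 if t < 7 else 0        # /AS falls in S7 of the first cycle, too late for it
    history.append(latch.step(t % 16, 0, n_as, 1))
print(history)
```

The first cycle stays idle even though /AS goes low partway through, and the read only starts at S5 of the second cycle. That is the "checked and saved" behavior described for the idle case.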
I won't write all the code for the RAM controller since it's quite simple, basically just a big if statement based on S[3:0], VID/CPU, /AS, and R/W. Here's a rough sketch:
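Code:
always @(posedge C32M) begin
    case (S[3:0])
        4'd0: ; // issue NOP CKD
        4'd1: ; // ...
        4'd2: ; // ...
        4'd3: ; // ...
        4'd4: ; // issue NOP CKE
        4'd5: ; // Check /AS, VID/CPU and issue ACT or NOP accordingly
        4'd6: ; // issue RD or NOP accordingly
        4'd7: ; // ...
        // ...
        default: ;
    endcase
end
Obviously I have omitted the specifics but it's pretty easy to write the code looking at the timing diagrams. It's just a big if statement which issues the commands depending on the state number, /AS, VID/CPU, and R/W.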
Okay so we have basically implemented clocking and SDRAM control, now let's focus back on video.
The Mac SE's video pixel clock is the same C16M clock as we have generated. The machine has a 512x342 active screen resolution but the total including the retrace periods is 704x370. Notice how in the BBU timing diagram, there were some eight-state cycles repeated four times for a total of 32 states. And conveniently, 704 is divisible by 32. So basically the BBU's RAM access state counter runs off the low bits of the video horizontal counter. Now in our system, we are running from C32M so we need 1408 horizontal states (704x2) and the unit of repetition is 64 states long but it's basically the same thing.
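The numbers in this paragraph are easy to check with a few lines of Python. This is just the arithmetic from the figures above and makes no assumptions beyond them.

```python
PIXEL_CLOCK_HZ = 15_667_200          # C16M
H_TOTAL, V_TOTAL = 704, 370          # pixels per line and lines per frame, incl. retrace
H_ACTIVE, V_ACTIVE = 512, 342

line_rate_hz = PIXEL_CLOCK_HZ / H_TOTAL
frame_rate_hz = line_rate_hz / V_TOTAL

states_per_line = H_TOTAL * 2                      # C32M states, two per pixel
slots_per_line = states_per_line // 16             # 16-state RAM access cycles
video_slots_per_active_line = H_ACTIVE // 32       # one 32-bit fetch per 32 pixels

print(f"line rate  {line_rate_hz:.1f} Hz")
print(f"frame rate {frame_rate_hz:.3f} Hz")
print(f"{states_per_line} states/line = {slots_per_line} slots, "
      f"{video_slots_per_active_line} of them video on an active line")
```

So a line is 88 access slots long, and on an active line 16 of those go to video, which leaves 72 for the CPU.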
So we extend the S[3:0] counter into HS[10:0] which will count from 0-1407. Here's the Verilog:
Code:
reg [10:0] HS;
wire [3:0] S = HS[3:0];
always @(posedge C32M) begin
    if (HS==1407) HS <= 0;
    else HS <= HS+1;
end
Now we need the vertical counter which goes from 0-369:
Code:
reg [8:0] VS;
always @(posedge C32M) begin
    if (HS==1407) begin
        if (VS==369) VS <= 0;
        else VS <= VS+1;
    end
end
With these, we can generate the HSYNC and VSYNC signals. Just need to know the H and V coordinates where the sync pulses begin and end:
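Code:
// HSYNC_* and VSYNC_* are placeholders for coordinates still to be worked out.
reg nHSYNC, nVSYNC;
always @(posedge C32M) begin
    // HSYNC pulses on every line, so it depends only on the horizontal counter.
    if (HS==HSYNC_HSTART) nHSYNC <= 0;
    else if (HS==HSYNC_HEND) nHSYNC <= 1;
    // VSYNC starts and ends at a specific point in a specific line.
    if (HS==VSYNC_HSTART && VS==VSYNC_VSTART) nVSYNC <= 0;
    else if (HS==VSYNC_HEND && VS==VSYNC_VEND) nVSYNC <= 1;
end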
Now we are going to define the timing relationship between the horizontal/vertical counters and the RAM controller by deciding when to change the VID/CPU signal. Let's define VS states 0-341 as the active lines. During each line, let's fetch video data during HS0-15, HS64-79, HS128-143, ..., HS960-975. So during those states, VID/CPU will be 1, telling the RAM controller to do a video access. Otherwise, VID/CPU will be 0 and the MC68k can access the RAM. Now here's the tricky part. Video data becomes available in S12 of a video access cycle, so we will be putting out the active pixels of a line during the even states from HS12-1034. You'd think it would be HS0-1024 or something but it's actually a bit later since the video data is only available 3/4 of the way (S12) through the video read cycle. Here's the code:
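Code:
reg VIDnCPU;
always @(posedge C32M) begin
    if (S[3:0]==0) begin
        //FIXME: < operator is bad!!
        if (VS < 342 && HS[5:4]==0 && HS < 1024) VIDnCPU <= 1;
        else VIDnCPU <= 0;
    end
end
Note here that we have chosen the numbering of the active pixel states arbitrarily. This is okay but we need to make sure to set the HSYNC/VSYNC start/end coordinates accordingly since they will be relative to the active pixel coordinates HS12-1034 / VS0-341. Another issue in the code is the use of the "<" operator. Less-than is quite a heavy operation in a small FPGA! We should switch to something where range flags are set/cleared at certain HS/VS values. Equality checks require much less logic to implement than less-than and greater-than checks.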
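One way to get rid of the < comparisons flagged in the FIXME is a registered flag that is set at one HS value and cleared at another, using equality tests only. Here's a quick Python model of that idea to check it produces the same window as a range comparison. The function name and the example window (HS 12 through 1035) are just for illustration, and whether this really saves logic depends on the synthesis tool.

```python
H_STATES = 1408

def run_flag(set_at, clear_at, n_lines=2):
    """Registered flag driven by equality tests; returns (HS, flag seen during that state)."""
    flag, hs, seen = 0, 0, []
    for _ in range(H_STATES * n_lines):
        seen.append((hs, flag))
        # Like a nonblocking assignment: the new value shows up in the NEXT state.
        if hs == set_at:
            flag = 1
        elif hs == clear_at:
            flag = 0
        hs = 0 if hs == H_STATES - 1 else hs + 1
    return seen

# To be high during HS 12..1035, set one state early and clear in the last active state.
seen = run_flag(set_at=11, clear_at=1035)
mismatches = [(hs, f) for hs, f in seen if f != int(12 <= hs < 1036)]
print("mismatches vs. range comparison:", len(mismatches))
```

The one thing to watch is the one-state register delay: the set and clear values have to be one less than the first and last states of the window you actually want.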
Now that we have VID/CPU implemented, let's implement the 1-bit video pixel value output:
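Code:
reg [31:0] VD_shift;
wire VIDOUT = VD_shift[0];
always @(posedge C32M) begin
    //FIXME: < operator is bad!!
    if (VS < 342 && HS >= 12 && HS < 1036) begin
        // A new 32-bit word is ready in S12 of every fourth cycle (every 64 states).
        if (HS[5:0]==12) VD_shift[31:0] <= VD[31:0];
        // Each pixel lasts two C32M states, so shift only on even states.
        else if (HS[0]==0) VD_shift[31:0] <= {1'b0, VD_shift[31:1]};
    end else VD_shift[31:0] <= 32'hFFFFFFFF;
end
We could save some register bits in the FPGA by combining the VD_shift register with the VD register from the RAM controller. And also this code has the < operator problem I mentioned previously but we can correct that later.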
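To check the intended pixel timing (a fresh 32-bit word every 64 states, one pixel per two C32M states), here's a rough Python model of the shifter for a single active line. It's a sketch under assumptions: pixels go out LSB first, matching the VD_shift[0] output above, and `words` stands in for the 16 values the RAM controller would deliver in VD.

```python
def scan_line(words):
    """VIDOUT as seen during each of the 1408 C32M states of one active line."""
    shift, out = 0xFFFFFFFF, []
    for hs in range(1408):
        out.append(shift & 1)              # value visible during state hs
        if 12 <= hs < 1036:
            if hs % 64 == 12:
                shift = words[hs // 64]    # load the word whose fetch finished in S12
            elif hs % 2 == 0:
                shift >>= 1                # next pixel every second state
        else:
            shift = 0xFFFFFFFF             # level driven outside the active area
    return out

# Arbitrary test pattern: 16 words of 32 pixels each.
words = [(0x9E3779B9 * (i + 1)) & 0xFFFFFFFF for i in range(16)]
out = scan_line(words)

expected = [(w >> b) & 1 for w in words for b in range(32)]
first_half = [out[13 + 2 * k] for k in range(512)]
second_half = [out[14 + 2 * k] for k in range(512)]
print("pixels match:", first_half == expected and second_half == expected)
```

Because of the register delay, pixel k shows up during states 13+2k and 14+2k rather than starting at HS12, which matters when picking the HSYNC coordinates.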
Alright I'm tired so that's all for now. Hopefully this all makes enough sense. There are some more parts to the BBU such as /DTACK and /VPA generation, chip select generation, SCSI pseudo-DMA, audio DMA, etc. which I have not taken a stab at implementing. But the RAM controller and its relationship with the video circuitry is the most complex part and is sort of central to everything else in the BBU.
Next maybe I will write about implementing the chip select signals, /DTACK and /VPA generation, etc.
Anyone wanna do the layout? I will perfect the FPGA code I've written about here if someone designs a board.