OK, I'm making some progress. Instead of doing this:
addi #0x0020,source //bump source and dest by $20, which is 32 in decimal (not 512), next line
addi #0x0020,dest
I do this instead:
add.l #0x40,source //bump source and dest by 0x40, which is 64 in decimal (not 512), next line
add.l #0x40,dest
gives this...