The FPU runs asynchronously, though? Plus, the many clock cycles that floating point takes, even with an FPU, mean that memory shouldn't be the limiting factor, within reason. Also, this behavior doesn't match my experience with basically the same card, so it is either a clone issue or a setup issue I...