mmoy

11/24/05 9:37 AM

#66404 RE: dacaw #66369

The Intel MMX routine can handle four 16-bit integers per
instruction, while the MMX/3DNow! instructions only handle two
values per instruction, to my recollection. There is some cost in
doing the scaled integer math, though, so maybe that makes up the
difference.
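
To illustrate where the cost of scaled integer math comes from, here is a minimal sketch of a fixed-point multiply. The names (CONST_BITS, FIX_1_414213562, scaled_mul) and the choice of 8 fractional bits are assumptions for illustration, not taken from the routines being compared.

```c
#include <stdint.h>

/* Assumed scaling for this sketch: constants carry 8 fractional bits. */
#define CONST_BITS 8

/* 1.414213562 scaled by 2^8 and rounded: 1.414213562 * 256 = 362.04 */
#define FIX_1_414213562 362

/*
 * Multiply a sample by a fixed-point constant, then shift the
 * fractional bits back out. The extra shift after every multiply is
 * the overhead that a floating-point routine does not pay.
 * Note: right-shifting a negative value is implementation-defined in
 * C; mainstream compilers use an arithmetic shift.
 */
static int32_t scaled_mul(int32_t value, int32_t fixed_const)
{
    return (value * fixed_const) >> CONST_BITS;
}
```

With these assumed constants, scaled_mul(100, FIX_1_414213562) yields 141, against an exact product of about 141.42, so each multiply trades a little precision for staying in integer registers.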

As far as quality goes, I can't tell the difference between
ISLOW and IFAST on web pages so I generally favor IFAST and
haven't received any complaints.

The regular scalar IJG routine is really slow compared to the
SIMD integer routines that I've used. The SSE2 integer operations
do eight 16-bit operations in parallel and allow you to hold most
of the data in the SIMD registers (or all of it in the case of x64).
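
As a rough sketch of what "eight operations in parallel" means, the function below adds two rows of eight 16-bit coefficients with a single SSE2 instruction. The function name add_row8 and the scalar fallback are assumptions added for illustration; this is not code from any of the routines discussed. An 8x8 block of 16-bit coefficients fills eight 128-bit registers, which is why the sixteen XMM registers on x64 leave room to keep the whole block plus temporaries in registers.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Add two rows of eight 16-bit coefficients: out[i] = a[i] + b[i]. */
static void add_row8(const int16_t *a, const int16_t *b, int16_t *out)
{
#ifdef HAVE_SSE2
    /* One 128-bit register holds a full row of eight 16-bit values. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* A single instruction performs all eight additions. */
    _mm_storeu_si128((__m128i *)out, _mm_add_epi16(va, vb));
#else
    /* Scalar fallback for builds without SSE2: eight separate adds. */
    for (int i = 0; i < 8; i++) {
        out[i] = (int16_t)(a[i] + b[i]);
    }
#endif
}
```

A scalar routine needs eight add instructions for the same row, plus the loads and stores around each one.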

I'd be interested in getting the code, though, and running my usual benchmarks on it. I doubt that it can beat an SSE2 IFAST implementation (which is what I'm currently using), but I'm prepared to be proven wrong. If you want to send it to me, you can email it to movdqa@gmail.com or mmoy@yahoo.com.

Microsoft isn't supporting 3DNow! on x64, but I still do specialized builds and get requests for builds for older AMD processors from time to time.