
Multiprecision integer mult using FPU

To: [email protected]
Subject: Multiprecision integer mult using FPU
From: Eric Blossom <[email protected]>
Date: Fri, 12 Aug 1994 16:59:02 -0700
Cc: [email protected], [email protected], [email protected], [email protected]
In-Reply-To: Norman Hardy's message of Thu, 11 Aug 1994 22:54:02 -0700 <[email protected]>
Sender: [email protected]

Norm Hardy writes:
> The PowerPC floating point is even more impressive. The fmadd instruction
> can do "a <- b*c+d" every other clock or 30 per microsecond on the low end
> Power Mac. If we store 24 bits of a multiple precision number in successive
> elements of an array then the inner loop of a multiply is a routine such
> as:
>
> void m8(float * a, float * b, double * p)
> {p[0] = a[0]*b[0];
> p[1] = a[0]*b[1] + a[1]*b[0];
> p[2] = a[0]*b[2] + a[1]*b[1] + a[2]*b[0];
> p[3] = a[0]*b[3] + a[1]*b[2] + a[2]*b[1] + a[3]*b[0];
> p[4] = a[0]*b[4] + a[1]*b[3] + a[2]*b[2] + a[3]*b[1] + a[4]*b[0];
> p[5] = a[0]*b[5] + a[1]*b[4] + a[2]*b[3] + a[3]*b[2] + a[4]*b[1] + a[5]*b[0];
> ....
> p[13] = a[6]*b[7] + a[7]*b[6];
> p[14] = a[7]*b[7];}

Nice hack Norm.
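
For anyone who wants to try it, here is a minimal sketch of the same column sums written as a loop, plus the carry step that turns them back into 24-bit limbs. The names mul_columns, normalize, LIMBS and LIMB_BITS are mine, not Norm's, and the carry step is my assumption about how the p[] values would be consumed. The sums stay exact because each product of two 24-bit limbs is below 2^48 and at most 8 of them are added, so every column is below 2^51, inside a double's 53-bit significand. The casts are there because ISO C allows a float*float product to be evaluated in float precision.

```c
#include <assert.h>
#include <stdint.h>

#define LIMBS 8        /* limbs per operand, as in the quoted m8 */
#define LIMB_BITS 24   /* bits stored in each float limb */

/* Column sums of the schoolbook product: p[k] = sum over i of a[i]*b[k-i].
   p must have room for 2*LIMBS-1 doubles. */
static void mul_columns(const float *a, const float *b, double *p)
{
    for (int k = 0; k < 2 * LIMBS - 1; k++) {
        double sum = 0.0;
        for (int i = 0; i < LIMBS; i++) {
            int j = k - i;
            if (j >= 0 && j < LIMBS)
                sum += (double)a[i] * (double)b[j];  /* widen before multiplying */
        }
        p[k] = sum;
    }
}

/* Carry propagation (hypothetical helper): reduce the column sums to
   2*LIMBS limbs of LIMB_BITS bits each, least significant limb first. */
static void normalize(const double *p, uint32_t *out)
{
    uint64_t carry = 0;
    for (int k = 0; k < 2 * LIMBS - 1; k++) {
        /* each column must be an exactly representable non-negative integer */
        assert(p[k] >= 0.0 && p[k] < 9007199254740992.0);  /* 2^53 */
        uint64_t t = (uint64_t)p[k] + carry;
        out[k] = (uint32_t)(t & ((1u << LIMB_BITS) - 1));
        carry = t >> LIMB_BITS;
    }
    out[2 * LIMBS - 1] = (uint32_t)carry;  /* top limb of the product */
}
```

Call mul_columns(a, b, p) with two 8-element float arrays holding integer values below 2^24, then normalize(p, out) with a 16-element out array. Unrolling the loops by hand, as in the quoted routine, is what lets the multiply-adds issue back to back.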

This would appear to apply to any processor whose floating-point
multiply is substantially faster than its integer multiply.  That is
true of the Pentium too.

Floating point (clocks):
		latency/throughput
	FADD	3/1
	FMUL	3/1

	FLD	1/1
	FST	2/2	(1/1 if storing to FPU stack)

Integer (clocks):
	ADD	1
	MUL	10
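
As a rough back-of-envelope using only the figures above, the sketch below counts multiply clocks for a 192-bit by 192-bit schoolbook product. It assumes 24-bit limbs on the FPU (8 limbs, as in m8) against 32-bit limbs in the integer unit (6 limbs). The function names are made up for illustration, and the count ignores adds, loads, stores and carry handling, so treat it as an estimate of the multiply cost only, not a benchmark.

```c
/* Clock figures taken from the table above. */
enum { FMUL_THROUGHPUT = 1, IMUL_CLOCKS = 10 };

/* Limbs needed to hold `bits` bits at `limb_bits` bits per limb. */
static int limbs_for(int bits, int limb_bits)
{
    return (bits + limb_bits - 1) / limb_bits;
}

/* Multiply clocks for an n-by-n limb schoolbook product:
   n*n partial products at the given cost per multiply. */
static int multiply_clocks(int limbs, int clocks_per_multiply)
{
    return limbs * limbs * clocks_per_multiply;
}
```

With these numbers, multiply_clocks(limbs_for(192, 24), FMUL_THROUGHPUT) comes to 64 clocks and multiply_clocks(limbs_for(192, 32), IMUL_CLOCKS) comes to 360, so the FPU needs more partial products but still spends far fewer clocks multiplying.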

References:
- Re: IDEA vs DES
  - From: [email protected] (Norman Hardy)
