The residue number system (RNS) is a way to represent an integer by its residues modulo a set of pairwise coprime integers, such that some operations are much faster in the RNS than the corresponding operations in the radix (positional) representation.
The RNS allows for computations without any need to consider carries when performing additions, subtractions, or multiplications.
Moreover, big numbers can be represented by a set of much smaller numbers on which all calculations can be done at the same time. With parallelism, space is always a good trade-off against time.
The size of the maximum integer that can be represented by the RNS grows exponentially with the number of coprime moduli. For example, using three moduli each close to $2^{32}$ makes it possible to represent numbers as big as roughly $2^{96}$.
So it is not even a trade-off of space against time. The only loss of space is due to the growing number of moduli, all of which have to be stored on machine words bigger than their actual value.
Computing everything using the RNS looks, at first, like a good idea, and we will see it is not really the case.
Unfortunately, some operations are much more complicated in the RNS than in the radix representation: changing from the RNS to the radix representation is often needed for comparisons or divisions, and it is heavily dependent on the coprime integers used.
Yet it is fascinating to look at, and since all the information needed is already in the residues and the modulus $N$, I speculate there might still be some unknown links between the two systems, yet to be found, that allow for faster computations, even when considering those harder operations.
The RNS is a number representation system based on the Chinese Remainder Theorem (CRT).
The CRT states that if you know each remainder $x_i$ from the Euclidean division of a number $x$ by a set of pairwise coprime values $m_1, \ldots, m_k$, then the remainder of $x$ in the Euclidean division by the product $m_1 m_2 \cdots m_k$ can be uniquely determined.
We define $N = m_1 m_2 \cdots m_k$ for the rest of the text.
Here we use the term remainder as the value $x_i = x \bmod m_i$ taken in $[0, m_i - 1]$, but we could take any number in a range of the same size.
The RNS defines a number only by the set of "residues" $(x_1, \ldots, x_k)$. Each value between $0$ and $N - 1$ is uniquely defined by those residues. The reconstruction of the number in a radix representation is then seen only as a conversion operation. It is easier to convert from a radix representation to the RNS than the other way around.
Here is a quick example to show how the RNS works. We choose three moduli $m_1 = 3$, $m_2 = 5$ and $m_3 = 7$ as a set of coprime integers.
Their product defines $N = 3 \times 5 \times 7 = 105$.
The RNS can now uniquely define any number in $\{0, 1, \ldots, 104\}$.
Let's pick a random number $x$ (I threw a balanced die in the air a few times to choose it) and call its residues $x_1 = x \bmod 3$, $x_2 = x \bmod 5$ and $x_3 = x \bmod 7$.
The number $x$ is uniquely determined knowing only those three values and that it is smaller than $105$.
On the other side, from those residues, the number can be reconstructed. How?
Let's take $x$ as an unknown which verifies all the congruence relations that were just described. We know that $x \equiv x_1 \pmod 3$, so it might be possible to write $x$ as
$$x = x_1\,e_1 + x_2\,e_2 + x_3\,e_3$$
where $e_1 \equiv 1 \pmod 3$ and the two other terms are multiples of $3$. The same is true for the other moduli: we might also be able to impose $e_2 \equiv 1 \pmod 5$ and $e_3 \equiv 1 \pmod 7$ as well.
The congruence $e_1 \equiv 1 \pmod 3$ does not mean $e_1$ has to be equal to one: it could be equal to a multiple of $5$ and $7$. Because $5$ and $7$ are coprime with $3$, the inverse of $5 \times 7 = 35$ modulo $3$ exists.
If we take
$$e_1 = 35 \times [35^{-1}]_3,$$
with $[35^{-1}]_3$ the inverse of $35$ modulo $3$, then
$$e_1 \equiv 1 \pmod 3, \qquad e_1 \equiv 0 \pmod 5, \qquad e_1 \equiv 0 \pmod 7.$$
The same is true for $e_2$ and $e_3$. We can write $e_2$ as
$$e_2 = 21 \times [21^{-1}]_5,$$
then
$$e_2 \equiv 1 \pmod 5, \qquad e_2 \equiv 0 \pmod 3, \qquad e_2 \equiv 0 \pmod 7,$$
and
$$e_3 = 15 \times [15^{-1}]_7,$$
then
$$e_3 \equiv 1 \pmod 7, \qquad e_3 \equiv 0 \pmod 3, \qquad e_3 \equiv 0 \pmod 5.$$
The advantage of having
$$e_i \equiv 0 \pmod{m_j} \quad \text{for } j \neq i$$
is that each term $x_i\,e_i$ then only contributes to the residue modulo its own modulus. Such that, defining
$$y = x_1\,e_1 + x_2\,e_2 + x_3\,e_3,$$
we now have
$$y \equiv x_1 \pmod 3, \qquad y \equiv x_2 \pmod 5, \qquad y \equiv x_3 \pmod 7.$$
And so, $y$ satisfies the same congruences as $x$.
Since $35 \equiv 2 \pmod 3$, $21 \equiv 1 \pmod 5$ and $15 \equiv 1 \pmod 7$, their inverses can be computed easily: $[35^{-1}]_3 = 2$, $[21^{-1}]_5 = 1$ and $[15^{-1}]_7 = 1$.
They are their own inverses, and it is quite normal: over $\mathbb{Z}/3\mathbb{Z}$, all values not congruent to $0$ are either $1$ or $2 \equiv -1$, so that their square is always $1$.
So we can already write
$$e_1 = 35 \times 2 = 70, \qquad e_2 = 21, \qquad e_3 = 15,$$
such that
$$y = 70\,x_1 + 21\,x_2 + 15\,x_3.$$
Now, with the same type of reasoning, we can see that two numbers congruent modulo $3$, $5$ and $7$ are congruent modulo $105$. So
$$x = (70\,x_1 + 21\,x_2 + 15\,x_3) \bmod 105$$
verifies all the congruence relations: $y$ is not exactly equal to $x$, but it is congruent to $x$ modulo $105$.
If we had taken another number $x'$ instead of $x$, the reconstruction would likely have been completely different, and using the CRT the set of residues would have been $(x' \bmod 3,\ x' \bmod 5,\ x' \bmod 7)$.
Applying the same procedure as the previous one for $x'$ leads to
$$x' = (70\,x_1' + 21\,x_2' + 15\,x_3') \bmod 105.$$
The number given by the reconstruction can perfectly be smaller for $x'$ than for $x$ even when its residues look bigger, and at first it looks impossible, from the residues alone, to be able to tell which one is bigger.
It is also possible to reconstruct the number by parsing a lookup table containing all the residues. This is basically the method we will discuss later that allows comparing two numbers represented in the RNS.
In conclusion, given a number $x$ whose RNS decomposition is $(x_1, \ldots, x_k)$ for the coprime moduli $(m_1, \ldots, m_k)$, the general formula to find the number in a radix representation is
$$x = \left( \sum_{i=1}^{k} x_i \,\frac{N}{m_i}\, \left[\left(\frac{N}{m_i}\right)^{-1}\right]_{m_i} \right) \bmod N,$$
but other methods are possible.
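As a sanity check, here is a tiny toy helper (not part of the library we will build below) applying that formula to the $\{3, 5, 7\}$ example, with the inverses $2$, $1$ and $1$ computed above:

```rust
// Toy CRT reconstruction for the moduli {3, 5, 7}: N = 105, N/3 = 35, N/5 = 21, N/7 = 15,
// and the inverses of 35, 21, 15 modulo 3, 5, 7 are respectively 2, 1 and 1.
fn reconstruct_3_5_7(x1: u32, x2: u32, x3: u32) -> u32 {
    (35 * 2 * x1 + 21 * 1 * x2 + 15 * 1 * x3) % 105
}

fn main() {
    assert_eq!(reconstruct_3_5_7(17 % 3, 17 % 5, 17 % 7), 17);
}
```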
Addition, multiplication and subtraction can be performed very easily using the RNS. Modular arithmetic tells us that we only need to perform the same operations on the residues.
For example, performing an addition between two numbers in the RNS system can be achieved by adding all the residues. Abusing notation, this translates to:
$$x + y \;\leftrightarrow\; \big((x_1 + y_1) \bmod m_1,\ \ldots,\ (x_k + y_k) \bmod m_k\big).$$
The same is true for the multiplication and the subtraction:
$$x \times y \;\leftrightarrow\; \big((x_1\,y_1) \bmod m_1,\ \ldots\big), \qquad x - y \;\leftrightarrow\; \big((x_1 - y_1) \bmod m_1,\ \ldots\big).$$
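For instance, with the moduli $\{3, 5, 7\}$ of the example above, $8 \leftrightarrow (2, 3, 1)$ and $9 \leftrightarrow (0, 4, 2)$; adding residue by residue gives $(2, 2, 3)$, which is indeed the representation of $17$, and no carry ever propagates between the three components.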
A computer scientist's dream would be to have $N$ equal to a power of two, because machine numbers are powers of two. If $N$ were a power of two, the conversion between the RNS and the radix representation would be super easy. But $N$ being the product of several coprime integers (at most one of which can be even), it is obviously impossible.
Let's write the first algorithms for those operations.
First, we need to set up the residue number system. We want to emulate a 64-bit unsigned integer using a set of 32-bit unsigned integers, and we want to be able to check whether an addition or a subtraction overflowed $2^{64}$. We do not consider checking an overflow for multiplications here yet, because it would require either a modulus $N$ bigger than $2^{128}$ or some additional work.
We chose a set of 5 prime integers, all around $2^{13}$, represented by 32-bit unsigned integers, so that any product of two residues does not overflow the 32 bits. If they are prime, then they are necessarily pairwise coprime (and so are their partial products with the remaining moduli), and because they are all odd, it allows us to perform what seems complicated at first: comparison, and checking the overflow during an addition or a subtraction.
We define the set of coprime integers:
const P1:u32 = 7121; //first modulus
const P2:u32 = 7127; //second modulus
const P3:u32 = 7129; //third modulus
const P4:u32 = 7151; //fourth modulus
const P5:u32 = 7159; //fifth modulus
const N:u128 = (P1 as u128)*(P2 as u128)*(P3 as u128)*(P4 as u128)*(P5 as u128);
$N$ is slightly bigger than $2^{64}$. For now, operations are not computed modulo $2^{64}$ but modulo the native modulus $N$.
We define a struct that represents the number in the RNS:
#[derive(Copy, Clone)]
pub struct DoubleInteger {
    r_1: u32,
    r_2: u32,
    r_3: u32,
    r_4: u32,
    r_5: u32,
}
//constructors
impl DoubleInteger{
pub fn new(r_1: u32,r_2: u32,r_3: u32,r_4: u32,r_5: u32) -> Self {
Self { r_1,r_2,r_3,r_4,r_5 }
}
}
impl DoubleInteger{
pub fn new_from_double_integer(x:DoubleInteger) -> Self {
let r_1=x.r_1;
let r_2=x.r_2;
let r_3=x.r_3;
let r_4=x.r_4;
let r_5=x.r_5;
Self { r_1,r_2,r_3,r_4,r_5 }
}
}
impl DoubleInteger{
pub fn new_from_u64(x:u64) -> Self {
let u =decompose(x);
let r_1=u.r_1;
let r_2=u.r_2;
let r_3=u.r_3;
let r_4=u.r_4;
let r_5=u.r_5;
Self { r_1,r_2,r_3,r_4,r_5 }
}
}
impl DoubleInteger{
pub fn assign(r_1: u32,r_2: u32,r_3: u32,r_4: u32,r_5: u32) -> Self {
let r_1=r_1;
let r_2=r_2;
let r_3=r_3;
let r_4=r_4;
let r_5=r_5;
Self { r_1,r_2,r_3,r_4,r_5 }
}
}
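The constructor `new_from_u64` relies on a `decompose` helper that is not shown in this post. A minimal sketch of what it could look like, assuming it simply reduces the input modulo each modulus:

```rust
// Hypothetical sketch of the `decompose` helper used by `new_from_u64`:
// reduce the 64-bit input modulo each of the five moduli.
pub fn decompose(x: u64) -> DoubleInteger {
    DoubleInteger::new(
        (x % P1 as u64) as u32,
        (x % P2 as u64) as u32,
        (x % P3 as u64) as u32,
        (x % P4 as u64) as u32,
        (x % P5 as u64) as u32,
    )
}
```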
The addition function can easily be written:
pub fn add(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=(x.r_1+y.r_1)%P1;
    let r_2=(x.r_2+y.r_2)%P2;
    let r_3=(x.r_3+y.r_3)%P3;
    let r_4=(x.r_4+y.r_4)%P4;
    let r_5=(x.r_5+y.r_5)%P5;
    DoubleInteger::new(r_1,r_2,r_3,r_4,r_5)
}
The multiplication can be written:
pub fn mul(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=(x.r_1*y.r_1)%P1;
    let r_2=(x.r_2*y.r_2)%P2;
    let r_3=(x.r_3*y.r_3)%P3;
    let r_4=(x.r_4*y.r_4)%P4;
    let r_5=(x.r_5*y.r_5)%P5;
    DoubleInteger::new(r_1,r_2,r_3,r_4,r_5)
}
and the subtraction:
pub fn sub(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=((x.r_1+P1)-y.r_1)%P1;
    let r_2=((x.r_2+P2)-y.r_2)%P2;
    let r_3=((x.r_3+P3)-y.r_3)%P3;
    let r_4=((x.r_4+P4)-y.r_4)%P4;
    let r_5=((x.r_5+P5)-y.r_5)%P5;
    DoubleInteger::new(r_1,r_2,r_3,r_4,r_5)
}
Here, we added $P_i$ to each residue before subtracting so that the difference is never negative: we use unsigned 32-bit integers, which would not allow for negative values.
Now we need to reconstruct the number. In order to do that, we can use the formula from the CRT directly, or we can use a mixed radix representation. We will discuss the second method later.
const P1_128:u128 = 7121;
const P2_128:u128 = 7127;
const P3_128:u128 = 7129;
const P4_128:u128 = 7151;
const P5_128:u128 = 7159;
const PROD3451_128:u128 = P1_128*P3_128*P4_128*P5_128;
const PROD3452_128:u128 = P2_128*P3_128*P4_128*P5_128;
const PROD1254_128:u128 = P1_128*P2_128*P4_128*P5_128;
const PROD1253_128:u128 = P1_128*P2_128*P3_128*P5_128;
const PROD1234_128:u128 = P1_128*P2_128*P3_128*P4_128;
// INV1..INV5 are the precomputed inverses of N/P1..N/P5 modulo P1..P5
pub fn reconstruct(x:&DoubleInteger)->u128{
    let fact_1 =((x.r_1 as u128)*INV1 as u128)%P1_128;
    let fact_2 =((x.r_2 as u128)*INV2 as u128)%P2_128;
    let fact_3 =((x.r_3 as u128)*INV3 as u128)%P3_128;
    let fact_4 =((x.r_4 as u128)*INV4 as u128)%P4_128;
    let fact_5 =((x.r_5 as u128)*INV5 as u128)%P5_128;
    let mut r= PROD3452_128*fact_1+PROD3451_128*fact_2 + PROD1254_128*fact_3 + PROD1253_128*fact_4 + PROD1234_128*fact_5;
    r = r%N;
    return r as u128;
}
Defining INV1, ..., INV5 as the inverses of $N/P_1, \ldots, N/P_5$ modulo $P_1, \ldots, P_5$, we have
$$x = \left( \sum_{i=1}^{5} r_i \cdot \mathrm{INV}_i \cdot \frac{N}{P_i} \right) \bmod N.$$
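Assuming those INV1, ..., INV5 constants are defined accordingly, here is a quick hypothetical usage check of the pieces so far:

```rust
fn main() {
    let a = DoubleInteger::new_from_u64(123_456_789);
    let b = DoubleInteger::new_from_u64(987_654_321);
    // additions are carried out residue by residue, the reconstruction goes back to radix
    let s = add(&a, &b);
    assert_eq!(reconstruct(&s), 123_456_789u128 + 987_654_321u128);
}
```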
Can we perform some more sophisticated operations? Can we do a right shift and a left shift?
Depending on the machine, the system or the programming language, right and left shifts have no unique definition. We will call right shift an integer division by a power of two, and left shift a *wrapping* multiplication by a power of two. This is not a "conventional" left shift, where the bits pushed beyond the word size should be equal to zero. For now, shifts also come with the restriction that left shifts bigger than 63 are not allowed, as it would be for 64-bit unsigned integers; but since it is possible to perform several smaller left shifts in a row, it is possible to overflow and wrap around $2^{64}$. We will see a "true" left shift later.
Multiplying by two in the RNS is just like multiplying by two in the real world, but because we use only 32-bit unsigned integers to emulate 64-bit-or-so integers, we need to separate the calculation into two cases: the case where the desired left shift is smaller than 32 bits, and the case where it is at least 32 bits.
pub fn leftbitshift(x:&DoubleInteger, t:u32 )->DoubleInteger{
    if t<32
    {
        // 2^t fits in a u32: reduce it modulo each P_i and multiply
        let y = DoubleInteger::new((1<<t)%P1,(1<<t)%P2,(1<<t)%P3,(1<<t)%P4,(1<<t)%P5);
        let r=mul(x,&y);
        return r;
    }else if t < 64
    {
        // split the shift in two smaller shifts; t_2 can reach 32 (e.g. for t = 63),
        // so that second factor is computed with 64-bit arithmetic before reducing
        let t_1= t/2;
        let t_2 = t - t_1;
        let y = DoubleInteger::new((1<<t_1)%P1,(1<<t_1)%P2,(1<<t_1)%P3,(1<<t_1)%P4,(1<<t_1)%P5);
        let z = DoubleInteger::new(
            ((1u64<<t_2)%P1 as u64) as u32,
            ((1u64<<t_2)%P2 as u64) as u32,
            ((1u64<<t_2)%P3 as u64) as u32,
            ((1u64<<t_2)%P4 as u64) as u32,
            ((1u64<<t_2)%P5 as u64) as u32);
        let mut r=mul(x,&y);
        r=mul(&r,&z);
        return r;
    }else
    {
        panic!("shift is too big!");
    }
}
The right shift is a bigger piece of work. We need to know the parity of the number. Getting the parity of a number is simple if the RNS uses $2$ as one of its coprime integers. But taking $2$ as a coprime integer does not allow for a right shift, because then $2$ is no longer invertible modulo $N$.
The basic algorithm to make a 1-bit right shift would be:
- find the parity of the number,
- if the parity is even, multiply by the inverse of $2$,
- if the parity is odd, subtract $1$ and then multiply by the inverse of $2$.
Multiplying by the inverse of $2$ is exactly like dividing by $2$ if the number is a multiple of two. The same observation will allow for integer divisions by other numbers as well, but it is a much longer story.
It is quite easy to find the inverse of $2$ modulo an odd number $m$: we can prove that $[2^{-1}]_m = (m+1)/2$, which means we can multiply by the inverse of $2$ very quickly. Since $m$ is odd, $(m+1)/2$ is an integer, thus
$$2 \times \frac{m+1}{2} = m + 1 \equiv 1 \pmod m.$$
As we stated before, multiplying an even number by the inverse of $2$ is exactly like dividing it by two. For example, modulo $7$ the inverse of $2$ is $4$, and $6 \times 4 = 24 \equiv 3 \pmod 7$, which is indeed $6/2$.
Multiplying odd residues by the inverse of $2$ is not really more complicated: for an odd residue $r$ modulo $m$,
$$r \times \frac{m+1}{2} \equiv \frac{r+m}{2} = \frac{r-1}{2} + \frac{m+1}{2} \pmod m.$$
This will be used in the right-shift algorithm, where each residue has to be multiplied by the inverse of $2$ modulo $P_i$. If the residue is even we apply a simple right shift to it, otherwise we apply a right shift and add $[2^{-1}]_{P_i} = (P_i+1)/2$ (called INV_2_Pi below):

r_i = if r_i & 1 == 0 { r_i >> 1 } else { (r_i >> 1) + INV_2_Pi };
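Putting the pieces together, a one-bit right shift could look like the following sketch (the function name is mine, the inverses of $2$ are inlined as $(P_i+1)/2$, and it relies on the `parity_find` function defined further below):

```rust
// A possible 1-bit right shift (integer division by 2); not the post's final version.
pub fn rightbitshift_by_one(x: &DoubleInteger) -> DoubleInteger {
    // if x is odd, subtract 1 first so that the division is exact
    let one = DoubleInteger::new(1, 1, 1, 1, 1);
    let even = if parity_find(x).0 == 1 { sub(x, &one) } else { *x };
    // multiplying an even number by [2^-1] modulo P_i halves it exactly:
    // an even residue is simply shifted, an odd residue is shifted then gets (P_i + 1)/2
    DoubleInteger::new(
        if even.r_1 & 1 == 0 { even.r_1 >> 1 } else { (even.r_1 >> 1) + (P1 + 1) / 2 },
        if even.r_2 & 1 == 0 { even.r_2 >> 1 } else { (even.r_2 >> 1) + (P2 + 1) / 2 },
        if even.r_3 & 1 == 0 { even.r_3 >> 1 } else { (even.r_3 >> 1) + (P3 + 1) / 2 },
        if even.r_4 & 1 == 0 { even.r_4 >> 1 } else { (even.r_4 >> 1) + (P4 + 1) / 2 },
        if even.r_5 & 1 == 0 { even.r_5 >> 1 } else { (even.r_5 >> 1) + (P5 + 1) / 2 },
    )
}
```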
The RNS does not immediately allow us to tell whether a number is odd or even if we did not include a power of two in our coprime set. On the other side, we could easily find the parity of a number knowing its radix representation.
As mentioned earlier, there are several ways to change from the RNS to the radix representation. One way to do so, tedious at first sight, is by parsing a lookup table that stores all the possible RNS values in the range $[0, N-1]$. Let's go back to our simple example with the moduli $\{3, 5, 7\}$. We present the beginning of that lookup table, where $n$ is the radix representation of the number and $p$ is the parity of $n$:
| $x_1 = n \bmod 3$ | $x_2 = n \bmod 5$ | $x_3 = n \bmod 7$ | $n$ | $p$ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 2 | 0 |
| 0 | 3 | 3 | 3 | 1 |
| 1 | 4 | 4 | 4 | 0 |
| 2 | 0 | 5 | 5 | 1 |
| 0 | 1 | 6 | 6 | 0 |
| 1 | 2 | 0 | 7 | 1 |
| 2 | 3 | 1 | 8 | 0 |
| 0 | 4 | 2 | 9 | 1 |
Up until $n = 6$, the parity of $n$ is the parity of $x_3$. From $n = 7$ to $n = 13$, the parity of $n$ is the opposite of the parity of $x_3$. We also know that the residue $x_3$ will always be a value between $0$ and $6$ included.
Thus, to reconstruct a number and find its parity, one possibility is to:
- Define a number $z$ that starts at $z = x_3$ in the lookup table. If $7$ is added to that value, $z$ will always verify $z \equiv x_3 \pmod 7$.
- Repeatedly add $7$ to $z$ (which means jumping by $7$ rows in the lookup table) until $z \equiv x_2 \pmod 5$. The number of times $7$ is added is defined as $k_1$. Each time $7$ is added, the parity of $z$ changes, so the parity of $z$ is now the parity of $x_3 + k_1$.
- Once $z \equiv x_3 \pmod 7$ and $z \equiv x_2 \pmod 5$, repeatedly add $35$ to $z$ until $z \equiv x_1 \pmod 3$. We call $k_2$ the number of times $35$ is added.
Because $7$ is coprime with $5$ and $35$ is coprime with $3$, those searches always succeed, with $k_1 < 5$ and $k_2 < 3$.
We have two results here.
First, the parity of $n$ is the parity of $x_3 + k_1 + k_2$:
$$p = (x_3 + k_1 + k_2) \bmod 2.$$
Secondly,
$$n = x_3 + 7\,k_1 + 35\,k_2.$$
The second result is a mixed radix representation of $n$ and does not necessitate the reduction modulo $N$.
The method can be generalised to more than $3$ moduli.
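As an illustration, here is a brute-force version of that walk on the $\{3, 5, 7\}$ example (the function name is mine):

```rust
// Brute-force walk through the {3, 5, 7} lookup table: recover the parity and n
// from the residues (x1, x2, x3) = (n mod 3, n mod 5, n mod 7).
fn walk_3_5_7(x1: u32, x2: u32, x3: u32) -> (u32, u32) {
    let mut z = x3; // z is congruent to x3 modulo 7
    let mut k1 = 0;
    while z % 5 != x2 {
        // jump 7 rows at a time until z is also congruent to x2 modulo 5
        z += 7;
        k1 += 1;
    }
    let mut k2 = 0;
    while z % 3 != x1 {
        // then jump 35 rows at a time until z is congruent to x1 modulo 3
        z += 35;
        k2 += 1;
    }
    // z = x3 + 7*k1 + 35*k2 = n, and each jump by an odd step flipped the parity
    ((x3 + k1 + k2) & 1, z)
}

fn main() {
    assert_eq!(walk_3_5_7(17 % 3, 17 % 5, 17 % 7), (1, 17));
}
```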
We can actually deduce more from this mixed radix representation. Because
$$x = r_5 + P_5\,k_1 + P_5 P_4\,k_2 + P_5 P_4 P_3\,k_3 + P_5 P_4 P_3 P_2\,k_4,$$
we get that
$$x \bmod q = \big(r_5 + (P_5 \bmod q)\,k_1 + (P_5 P_4 \bmod q)\,k_2 + (P_5 P_4 P_3 \bmod q)\,k_3 + (P_5 P_4 P_3 P_2 \bmod q)\,k_4\big) \bmod q$$
for any modulus $q$. Thus, for a modulus $q$ small enough that the products above fit in 32 bits, we can find the residue of $x$ modulo $q$ by computing that sum.
We already know how to compute the $k_i$, and the constants $P_5 \bmod q$, $P_5 P_4 \bmod q$, $P_5 P_4 P_3 \bmod q$ and $P_5 P_4 P_3 P_2 \bmod q$ can be precomputed. Since we chose moduli around $2^{13}$, the calculation of the $k_i$ as well as the sum above cannot overflow an unsigned integer over 32 bits.
Therefore, to get the integer division of a number by such a $q$, the procedure will be similar to dividing by two:
- find the RNS representation of the residue $x \bmod q$,
- subtract it from $x$ in the RNS,
- multiply by the inverse of $q$ modulo $N$, if that inverse exists.
The inverse of $q$ modulo $N$ is uniquely defined only for integers $q$ that are not multiples of any $P_i$, so the fewer moduli there are, the better.
The question is now: how do we find those $k_i$ without having access to an actual lookup table storing all the values up to $N$? It's easy. Back to the five machine moduli $P_1, \ldots, P_5$.
We start with $z = r_5$, and at that point $z \equiv x \pmod{P_5}$. But because $x \equiv r_4 \pmod{P_4}$, we write $z$ as $z = r_5 + k_1 P_5$ and we want to find $k_1$ such that
$$r_5 + k_1 P_5 \equiv r_4 \pmod{P_4}, \qquad \text{that is} \qquad k_1 \equiv [P_5^{-1}]_{P_4}\,(r_4 - r_5) \pmod{P_4}.$$
A similar procedure is applied for $k_2$, $k_3$ and $k_4$. Starting again from $x \equiv r_3 \pmod{P_3}$ with $x = r_5 + k_1 P_5 + k_2 P_5 P_4 + \cdots$, we get
$$k_2 \equiv [(P_5 P_4)^{-1}]_{P_3}\,(r_3 - r_5 - k_1 P_5) \pmod{P_3},$$
and so on for $k_3$ modulo $P_2$ and $k_4$ modulo $P_1$.
OK, so now, here is the code to find the parity of a number in the RNS:
// inverses of P5, P5*P4, P5*P4*P3 and P5*P4*P3*P2 modulo P4, P3, P2 and P1
const INVP5MODP4:u32=894;
const INVP5P4MODP3:u32=1415;
const INVP5P4P3MODP2:u32=116;
const INVP5P4P3P2MODP1:u32=510;
// residues of the moduli and of their partial products modulo the smaller moduli
const P5_1:u32=P5%P1;
const P5_2:u32=P5%P2;
const P5_3:u32=P5%P3;
const P4_1:u32=P4%P1;
const P4_2:u32=P4%P2;
const P3_1:u32=P3%P1;
const P54_1:u32= (P5_1*P4_1)%P1;
const P54_2:u32= (P5_2*P4_2)%P2;
const P543_1:u32= (P54_1*P3_1)%P1;
const PRODUCT3:u32 = (INVP5P4MODP3*(P3-P5_3))%P3;
const PRODUCT2_1:u32 = (INVP5P4P3MODP2*(P2-P5_2))%P2;
const PRODUCT2_2:u32 = (INVP5P4P3MODP2*(P2-P54_2))%P2;
const PRODUCT1_1:u32 = (INVP5P4P3P2MODP1*(P1-P5_1))%P1;
const PRODUCT1_2:u32 = (INVP5P4P3P2MODP1*(P1-P54_1))%P1;
const PRODUCT1_3:u32 = (INVP5P4P3P2MODP1*(P1-P543_1))%P1;
const P4_D:u32 = P4<<1;
const P3_D:u32 = P3<<1;
const P2_D:u32 = P2<<1;
const P1_D:u32 = P1<<1;
pub fn parity_find( x: &DoubleInteger )->(u32,u32,u32,u32,u32)
{
    // mixed radix digits: x = r_5 + P5*k_1 + P5*P4*k_2 + P5*P4*P3*k_3 + P5*P4*P3*P2*k_4
    let k_1 = (INVP5MODP4*(P4_D + x.r_4 - x.r_5))%P4;
    let k_2 = (INVP5P4MODP3*(P3_D+ x.r_3 - x.r_5 )+k_1*PRODUCT3)%P3;
    let k_3 = (INVP5P4P3MODP2*( P2_D + x.r_2 - x.r_5 )+ k_1*PRODUCT2_1+k_2*PRODUCT2_2 )%P2;
    let k_4 = (INVP5P4P3P2MODP1*(P1_D + x.r_1 -x.r_5 ) + k_1*PRODUCT1_1+k_2*PRODUCT1_2 + k_3*PRODUCT1_3 )%P1;
    // all the P_i are odd, so the parity of x is the parity of r_5 + k_1 + k_2 + k_3 + k_4
    let parity= (x.r_5+k_1+k_2+k_3+k_4)&1;
    (parity,k_1,k_2,k_3,k_4)
}
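A quick hypothetical check of the parity function:

```rust
fn main() {
    let even = DoubleInteger::new_from_u64(123_456);
    let odd = DoubleInteger::new_from_u64(123_457);
    assert_eq!(parity_find(&even).0, 0);
    assert_eq!(parity_find(&odd).0, 1);
}
```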
And we can now create a new reconstruction function:
pub fn reconstruct_2(x:&DoubleInteger)->u128{
let parity = parity_find(x);
let k_1 = parity.1 as u128;
let k_2 = parity.2 as u128;
let k_3 = parity.3 as u128;
let k_4 = parity.4 as u128;
let r = x.r_5 as u128 + ( P5_128*(k_1 + P4_128*(k_2 + P3_128*(k_3 + k_4*P2_128) ) ) ) ;
return r as u128;
}
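For reference, the formula implemented by `reconstruct_2` is simply the Horner form of the mixed radix decomposition:
$$x = r_5 + P_5\Big(k_1 + P_4\big(k_2 + P_3\,(k_3 + P_2\,k_4)\big)\Big).$$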
From the benchmarks I ran, this method is surprisingly not really faster than the first reconstruction method.
reconstruct 118446744073709551615:18446744073709551615
Elapsed time for reconstruction 1: 1.05µs
reconstruct 118446744073709551615:18446744073709551615
Elapsed time for reconstruction 2: 3.637µs
reconstruct 118446744073709551613:18446744073709551613
Elapsed time for reconstruction 1: 775ns
reconstruct 218446744073709551613:18446744073709551613
Elapsed time for reconstruction 2: 875ns
Up until now, all operations were performed modulo the native modulus $N$, but we want to simulate 64-bit integers, i.e. work modulo $M = 2^{64}$.
The addition algorithm takes two numbers and adds them modulo $N$. Finding the value modulo $M$ while the native modulus is $N$ is a bit tedious. If we were to look at that sum over the integers, the addition would result in
$$x + y, \qquad \text{possibly as large as } 2M - 2,$$
but what the RNS hands us is
$$(x + y) \bmod N, \qquad \text{not } (x + y) \bmod M.$$
With the parity function we can actually determine whether the sum needs to be reduced. How can we do that?
If $x$ and $y$ have the same parity (either both are odd or both are even), then $x + y$ is even. Because $N$ is an odd number, if $x + y$ is even, then $(x + y) \bmod N$ should be even if $x + y < N$, and it should be odd if $x + y \geq N$.
The same is true if $x + y$ is odd: $(x + y) \bmod N$ is odd if $x + y < N$ and even if $x + y \geq N$.
Now, something similar happens when subtracting. If $x \geq m$, $(x - m) \bmod N$ keeps the same parity as $x - m$, but if $x < m$, the result wraps around the odd modulus $N$ and changes parity. This allows us to make comparisons between numbers represented in the RNS. We can thus apply that principle to $m = M = 2^{64}$: since $M$ is even, the expected parity of $x - M$ is simply the parity of $x$.
pub fn is_above_m_mod_n(x:& DoubleInteger, m: &DoubleInteger)-> bool
{
    let temp_x = DoubleInteger::new(
        ((x.r_1 + P1) - m.r_1)%P1,
        ((x.r_2 + P2) - m.r_2)%P2,
        ((x.r_3 + P3) - m.r_3)%P3,
        ((x.r_4 + P4) - m.r_4)%P4,
        ((x.r_5 + P5) - m.r_5)%P5
    );
    let parity_x = parity_find(x);
    let parity_temp_x = parity_find(&temp_x);
    parity_x.0 == parity_temp_x.0
}
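Note that the test compares against the parity of $x$ itself, so it behaves as a "greater than or equal" test only when $m$ is even, which is the case for $m = 2^{64}$ used below. A toy hypothetical check:

```rust
fn main() {
    let ten = DoubleInteger::new_from_u64(10);
    let four = DoubleInteger::new_from_u64(4);
    let twelve = DoubleInteger::new_from_u64(12);
    // the second argument is even in both cases, so the parity test acts as ">="
    assert!(is_above_m_mod_n(&ten, &four));
    assert!(!is_above_m_mod_n(&ten, &twelve));
}
```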
pub fn add_mod_m(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let max = DoubleInteger::new(MAX1,MAX2,MAX3,MAX4,MAX5);
    let parity_x = parity_find(x);
    let parity_y = parity_find(y);
    // sum modulo the native modulus N
    let mut temp_sum = DoubleInteger::new((x.r_1+y.r_1)%P1,(x.r_2+y.r_2)%P2,(x.r_3+y.r_3)%P3,(x.r_4+y.r_4)%P4,(x.r_5+y.r_5)%P5);
    let parity_sum = parity_find(&temp_sum);
    if is_above_m_mod_n(&temp_sum,&max)
    {
        // M <= sum < N: subtract M (adding NM = N - M is the same modulo N)
        temp_sum.r_1 = (temp_sum.r_1 + NM1)%P1;
        temp_sum.r_2 = (temp_sum.r_2 + NM2)%P2;
        temp_sum.r_3 = (temp_sum.r_3 + NM3)%P3;
        temp_sum.r_4 = (temp_sum.r_4 + NM4)%P4;
        temp_sum.r_5 = (temp_sum.r_5 + NM5)%P5;
    } else if parity_x.0 == parity_y.0
    {
        // x + y is even: an odd RNS sum means the addition wrapped around N, so x + y >= N > M
        if parity_sum.0 == 1
        {
            temp_sum.r_1 = (temp_sum.r_1 + NM1)%P1;
            temp_sum.r_2 = (temp_sum.r_2 + NM2)%P2;
            temp_sum.r_3 = (temp_sum.r_3 + NM3)%P3;
            temp_sum.r_4 = (temp_sum.r_4 + NM4)%P4;
            temp_sum.r_5 = (temp_sum.r_5 + NM5)%P5;
        }
    }else
    {
        // x + y is odd: an even RNS sum means the addition wrapped around N
        if parity_sum.0 == 0
        {
            temp_sum.r_1 = (temp_sum.r_1 + NM1)%P1;
            temp_sum.r_2 = (temp_sum.r_2 + NM2)%P2;
            temp_sum.r_3 = (temp_sum.r_3 + NM3)%P3;
            temp_sum.r_4 = (temp_sum.r_4 + NM4)%P4;
            temp_sum.r_5 = (temp_sum.r_5 + NM5)%P5;
        }
    }
    return temp_sum;
}
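A hypothetical check of the wrap-around behaviour, assuming the MAX and NM constants hold the residues of $2^{64}$ and $N - 2^{64}$:

```rust
fn main() {
    let a = DoubleInteger::new_from_u64(u64::MAX);
    let b = DoubleInteger::new_from_u64(2);
    // (2^64 - 1) + 2 wraps around 2^64 and must give 1
    assert_eq!(reconstruct_2(&add_mod_m(&a, &b)), 1);
}
```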
Similar problems happen for the subtraction, and the multiplication, but that post has already been too long, so here is just the code.
pub fn sub_mod_m(x:&DoubleInteger,y:&DoubleInteger)->DoubleInteger
{
    // difference modulo the native modulus N
    let r_1 = ((x.r_1+P1)-y.r_1)%P1;
    let r_2 = ((x.r_2+P2)-y.r_2)%P2;
    let r_3 = ((x.r_3+P3)-y.r_3)%P3;
    let r_4 = ((x.r_4+P4)-y.r_4)%P4;
    let r_5 = ((x.r_5+P5)-y.r_5)%P5;
    let max = DoubleInteger::new(MAX1,MAX2,MAX3,MAX4,MAX5);
    let mut res = DoubleInteger::new(r_1,r_2,r_3,r_4,r_5);
    let parity_x = parity_find(x).0;
    let parity_y = parity_find(y).0;
    let parity = parity_find(&res).0;
    if parity_x==parity_y
    {
        // x - y is even: an odd result means the subtraction wrapped around N,
        // so the RNS holds x - y + N; subtracting NM = N - M turns it into x - y + M
        if parity == 1
        {
            res.r_1 = (r_1+P1-NM1)%P1;
            res.r_2 = (r_2+P2-NM2)%P2;
            res.r_3 = (r_3+P3-NM3)%P3;
            res.r_4 = (r_4+P4-NM4)%P4;
            res.r_5 = (r_5+P5-NM5)%P5;
        }
    }else
    {
        // x - y is odd: an even result means the subtraction wrapped around N
        if parity == 0
        {
            res.r_1 = (r_1+P1-NM1)%P1;
            res.r_2 = (r_2+P2-NM2)%P2;
            res.r_3 = (r_3+P3-NM3)%P3;
            res.r_4 = (r_4+P4-NM4)%P4;
            res.r_5 = (r_5+P5-NM5)%P5;
        }
    }
    if res.r_1 == MAX1 && res.r_2 == MAX2 && res.r_3 == MAX3 && res.r_4 == MAX4 && res.r_5 == MAX5
    {
        // the result is exactly M, which is 0 modulo M
        res.r_1 = 0;
        res.r_2 = 0;
        res.r_3 = 0;
        res.r_4 = 0;
        res.r_5 = 0;
    }
    if is_above_m_mod_n(&res,&max)
    {
        // the result is still at least M: reduce it by M (adding N - M modulo N)
        res.r_1 = (res.r_1 + NM1)%P1;
        res.r_2 = (res.r_2 + NM2)%P2;
        res.r_3 = (res.r_3 + NM3)%P3;
        res.r_4 = (res.r_4 + NM4)%P4;
        res.r_5 = (res.r_5 + NM5)%P5;
    }
    return res;
}
pub fn mul_mod_m(x:&DoubleInteger,y:&DoubleInteger)->DoubleInteger{
    // separate x and y: x = xh*2^32 + xl, y = yh*2^32 + yl.
    // Modulo 2^64, x*y = (xh*yl + xl*yh)*2^32 + xl*yl (the xh*yh*2^64 term vanishes).
    let xh = rightbitshift(x,32);
    let xl_mask = leftbitshift_mod_m(&xh,32);
    let xl = sub_mod_m(x,&xl_mask);
    let yh = rightbitshift(y,32);
    let yl_mask = leftbitshift_mod_m(&yh,32);
    let yl = sub_mod_m(y,&yl_mask);
    let xl_yh = mul(&xl,&yh);
    let xh_yl = mul(&xh,&yl);
    let xl_yl = mul(&xl,&yl);
    let mut sum = add_mod_m(&xl_yh,&xh_yl);
    sum = leftbitshift_mod_m(&sum,32);
    sum = add_mod_m(&sum,&xl_yl);
    sum
}
It's a bit long, and it gives a multiplication about 22 times slower than with native 64-bit integers (with 5 numbers just "twice" as small). In the end, emulating a power-of-two modulus inside the natural modulus $N$ does not sound like a good idea for efficiency. Some areas might benefit from it, though.
As a bonus, here is a division-by-3 algorithm. There should be a check for a few edge cases; otherwise it should work all the time.
// inverses of 3 modulo each P_i
const INV3_P1:u32 = 2374;
const INV3_P2:u32 = 2376;
const INV3_P3:u32 = 4753;
const INV3_P4:u32 = 2384;
const INV3_P5:u32 = 4773;
// residues modulo 3 of the mixed radix weights (used by rest_mod_3)
const P5_mod_3:u32 = P5%3;
const P54_mod_3:u32 = (P5_mod_3*(P4%3))%3;
const P543_mod_3:u32 = (P54_mod_3*(P3%3))%3;
const P5432_mod_3:u32 = (P543_mod_3*(P2%3))%3;
pub fn find_k_i(x: &DoubleInteger)->(u32,u32,u32,u32)
{
let k_1 = (INVP5MODP4*(P4_D + x.r_4 - x.r_5))%P4;
let k_2 = (INVP5P4MODP3*(P3_D+ x.r_3 - x.r_5 )+k_1*PRODUCT3)%P3;
let k_3 = (INVP5P4P3MODP2*( P2_D + x.r_2 - x.r_5 )+ k_1*PRODUCT2_1+k_2*PRODUCT2_2 )%P2;
let k_4 = (INVP5P4P3P2MODP1*(P1_D + x.r_1 -x.r_5 ) + k_1*PRODUCT1_1+k_2*PRODUCT1_2 + k_3*PRODUCT1_3 )%P1;
    (k_1,k_2,k_3,k_4)
}
pub fn rest_mod_3(x:&DoubleInteger)->DoubleInteger{
let k = find_k_i(x);
//let r = (x.r_5%3 + P5_mod_3*(k.0%3) + P54_mod_3*(k.1%3) + P543_mod_3*(k.2%3) + P5432_mod_3*(k.3%3))%3;
let r = (x.r_5 + P5_mod_3*k.0 + P54_mod_3*k.1 + P543_mod_3*k.2 + P5432_mod_3*k.3)%3;
let rest = DoubleInteger::new(r,r,r,r,r);
    rest
}
pub fn divide_by_3(x:&DoubleInteger)->DoubleInteger{
let rest = rest_mod_3(x);
let int = sub_mod_m(x,&rest);
let div = DoubleInteger::new((int.r_1*INV3_P1)%P1,(int.r_2*INV3_P2)%P2,(int.r_3*INV3_P3)%P3,(int.r_4*INV3_P4)%P4,(int.r_5*INV3_P5)%P5);
    div
}
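A quick hypothetical check:

```rust
fn main() {
    let x = DoubleInteger::new_from_u64(1_000_000_007);
    // 1_000_000_007 = 3 * 333_333_335 + 2
    assert_eq!(reconstruct_2(&divide_by_3(&x)), 333_333_335);
}
```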
The next post will be about a reversed baby-step giant-step algorithm.