implementing 64 bits integers in RNS (residue number system)

The residue number system (RNS) is a way to represent integers with where , such that some operations are much faster in the RNS than the corresponding operations in the radix representation.

The RNS allows for computations without any need to consider carries when performing additions, subtractions, or multiplications .

Moreover big numbers can be represented by using a set of much smaller numbers on which all calculations can be done at the same time. With parallelism, space is always a good trade off against time.

The size of the maximum integer that can be represented by the RNS grows exponentially with the number of factors of . For example, using three moduli of size will be able to represent numbers as big as .

So it is not even a trade off of space against time. The only loss of space is due to the growing number of moduli, all of which have to be written on bigger machine numbers than their actual value.

Computing everything using the RNS looks, at first, like a good idea, and we will see it is not really the case.

Unfortunately, some operations are much more complicated in the RNS than in the radix representation, changing from the RNS to the radix representation is often needed for comparisons or divisions, and it is heavily dependent on the coprime integer used.

Yet it is fascinating to look at it, and since all the information needed is already in the residues and the Modulus, I speculate there might still be some unknown links between the two systems yet to be found that allows for faster computations, even when considering those harder operations.

The RNS is a number representation system based on the Chinese Remainder Theorem (CRT) .

The CRT states that if you know each remainder from the Euclidian division of a number by a set of pairwise coprimes values ,

the remainder

can be uniquely determined.

We define for the rest of the text.

Here we use the term remainder as but we could take any number in a range of the same size.

The RNS defines a number only by the set of "residues" . Each value between and is uniquely defined by those residues. The reconstruction of the number in a radix representation is then seen only as a conversion operation. It is easier to convert from a radix representation to the RNS than the other way around.

Here is a quick example to show how the RNS works. We choose three moduli as a set of coprime integers.

Their product defines .

The RNS can now uniquely define a number in

Let’s pick up a random number . I threw in the air a balanced dice and it landed on the face , so .

The number is uniquely determined knowing only those values and that it is smaller than .

On the other side, from those residues, the number can be reconstructed. How?

Let’s take as an unkown which verifies all the congruence relations that were just described. We know that , so it might be possible to write as

where .

The same is true for other moduli, we might also be abe to write and as well.

The congruence does not mean has to be equal to one, it could be equal to a multiple of and . Because and are coprime with their inverse exists .

If we take

with

then

.

The same is true for and . We can write as

then

.

and

then

.

The advantage of having

is that, then

Such that defining , we now have

And so,

Since and their inverse can be computed easily: and .

They are their own inverse, and it’s quite normal because over all values not congruent to are either or , so that .

So we can already write

such that

Now with the same type of reasonning we can find that if

So

verifies ( ). is not exactly equal to but it is congruent to modulo .

If we had taken instead of , the reconstruction would likely have been completely different and using the CRT the set of residues would have been

Applying the same procedure than the previous one for leads to

The number given by the reconstruction is actually smaller for than for and at first it looks impossible from the residues alone to be able to tell which one is bigger.

It is also possible to reconstruct the number by parsing a lookup table containing all residues. This is basically the method we will discuss later that allows to compare two numbers represented by the RNS.

In conclusion, given a number whose RNS decomposition is , the general formula to find a number in a radix representation is

but other methods are possible.

Addition, multiplication and subtraction can be performed very easily using the RNS. The modular arithmetic tells us that you only need to perform the same operations on the residues.

For example, performing an addition between two numbers in the RNS system can be achieved by adding all the residues. Abusing notations, this translates to:

the same is true for the multiplication and the subtraction

Computer scientists dream would be to have a because machine numbers are powers of two. If was a power of two, the conversion between RNS and radix representation woud be super easy. But being the product of coprime numbers, it is obviously impossible.

Let’s write the first algoritms for those operations.

First, we need to set the residue number system. We want to emulate a 64 bits unsigned integer using a set of 32 bits unsigned integers, and we want to be able to check whether an addition or a subtraction overflowed . We do not consider yet checking an overflow for multiplications here because it would require either to have bigger than a 128 bits integer or some additionnal work.

We chose a set of 5 prime integers all around represented by 32 its unsigned integers, so that any product of two residue does not overflow the 32 bits. If they are prime, then they are necessarily coprime integers (as well as their product) and because they are all odd, it allows us to perform what seems complicated at first, comparison and checking the overflow during an addition or a subtraction.

We define the set of coprime integers:

const P1:u32 = 7121; //first modulus
const P2:u32 = 7127; //second modulus
const P3:u32 = 7129; //third modulus
const P4:u32 = 7151; //fourth modulus
const P5:u32 = 7159; //fifth modulus
const N:u128 = (P1 as u128)*(P2 as u128)*(P3 as u128)*(P4 as u128)*(P5 as u128);

is slightly bigger than . For now, operations are not computed modulo but are computed modulo the native modulus .

We define a struct that represents the number in the RNS:

#[derive(Copy, Clone)]
pub struct DoubleInteger { 
    r_1: u32,
    r_2: u32,
    r_3: u32,
    r_4: u32,
    r_5: u32,
     
}

//constructors

impl DoubleInteger{
    pub fn new(r_1: u32,r_2: u32,r_3: u32,r_4: u32,r_5: u32) -> Self {
        Self { r_1,r_2,r_3,r_4,r_5 }
    }
}

impl DoubleInteger{
    pub fn new_from_double_integer(x:DoubleInteger) -> Self {
        let r_1=x.r_1;
        let r_2=x.r_2;
        let r_3=x.r_3;
        let r_4=x.r_4;
        let r_5=x.r_5;
        Self { r_1,r_2,r_3,r_4,r_5 }
    }
}

impl DoubleInteger{
    pub fn new_from_u64(x:u64) -> Self {
        let u =decompose(x);
        let r_1=u.r_1;
        let r_2=u.r_2;
        let r_3=u.r_3;
        let r_4=u.r_4;
        let r_5=u.r_5;
        Self { r_1,r_2,r_3,r_4,r_5 }
    }
}

impl DoubleInteger{
    pub fn assign(r_1: u32,r_2: u32,r_3: u32,r_4: u32,r_5: u32) -> Self {
        let r_1=r_1;
        let r_2=r_2;
        let r_3=r_3;
        let r_4=r_4;
        let r_5=r_5;
        
        Self { r_1,r_2,r_3,r_4,r_5 }
    }
}

The addition function can easily be written:


pub fn add(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=(x.r_1+y.r_1)%P1;
    let r_2=(x.r_2+y.r_2)%P2;
    let r_3=(x.r_3+y.r_3)%P3;
    let r_4=(x.r_4+y.r_4)%P4;
    let r_5=(x.r_5+y.r_5)%P5;
        

    let mut temp_sum = DoubleInteger::new(r_1,r_2,r_3,r_4,r_5);
    
    temp_sum
}

The multiplication can be written


pub fn mul(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=(x.r_1*y.r_1)%P1;
    let r_2=(x.r_2*y.r_2)%P2;
    let r_3=(x.r_3*y.r_3)%P3;
    let r_4=(x.r_4*y.r_4)%P4;
    let r_5=(x.r_5*y.r_5)%P5;
        
    let mut temp_prod = DoubleInteger::new(r_1,r_2,r_3,r_4,r_5);
    
    temp_prod
}

and the subtraction


pub fn sub(x:& DoubleInteger,y:& DoubleInteger)->DoubleInteger
{
    let r_1=((x.r_1+P1)-y.r_1)%P1;
    let r_2=((x.r_2+P2)-y.r_2)%P2;
    let r_3=((x.r_3+P3)-y.r_3)%P3;
    let r_4=((x.r_4+P4)-y.r_4)%P4;
    let r_5=((x.r_5+P5)-y.r_5)%P5;
        

    let mut temp_sub = DoubleInteger::new(r_1,r_2,r_3,r_4,r_5);
    temp_sub
}

Here, we added to each subtraction so that it is never negative. We used unsigned 32 bits integers that would not allow for negative values.

Now we need to reconstruct the number. In order to do that we can use immediately the formula from the CRT or we can use a mixed radix representation. We will discuss the second method later.


const P1_128:u128 = 7121; 
const P2_128:u128 = 7127; 
const P3_128:u128 = 7129; 
const P4_128:u128 = 7151; 
const P5_128:u128 = 7159; 

const PROD3451_128:u128 = P1_128*P3_128*P4_128*P5_128;
const PROD3452_128:u128 = P2_128*P3_128*P4_128*P5_128;
const PROD1254_128:u128 = P1_128*P2_128*P4_128*P5_128;
const PROD1253_128:u128 = P1_128*P2_128*P3_128*P5_128;
const PROD1234_128:u128 = P1_128*P2_128*P3_128*P4_128;

pub fn reconstruct(x:&DoubleInteger)->u128{
    
    let fact_1 =((x.r_1 as u128)*INV1 as u128)%P1_128;
    let fact_2 =((x.r_2 as u128)*INV2 as u128)%P2_128;
    let fact_3 =((x.r_3 as u128)*INV3 as u128)%P3_128;
    let fact_4 =((x.r_4 as u128)*INV4 as u128)%P4_128;
    let fact_5 =((x.r_5 as u128)*INV5 as u128)%P5_128;
    
    let mut r= PROD3452_128*fact_1+PROD3451_128*fact_2 + PROD1254_128*fact_3 + PROD1253_128*fact_4 + PROD1234_128*fact_5;
    r=r%N;
    
    
    return r as u128;    
}

Defining , we have

Can we make some more sophisticated operations? Can we make a right shift and a left shift?

Depending on the machine, the system or the programming language, right and left shifts have no unique definition. We will call right shift an integer division by and left shift a *wrapping* multiplication by . This is not a "conventionnal left shift where should be equal to zero. For now, shifts are also with the restriction that left shifts bigger than 63 are not allowed, as it would be for 64 bits unsigned integers, but since it is possible to perform several smaller leftshifts, it is possible to overflow and wrapp around 2^64. We will see a "true" left shift later.

Multiplying by two in the RNS is just like multiplying by two in the real world but because we use only 32 bits unsigned integers to emulate 64 bits or so integers, we need to separate our calculation into two cases. The case where the desired left shift is smaller than 32 bits, and the case where the left shift is bigger than 32 bits.

pub fn leftbitshift(x:&DoubleInteger, t:u32 )->DoubleInteger{
    if t<32
    {
        let y = DoubleInteger::new((1<<t)%P1,(1<<t)%P2,(1<<t)%P3,(1<<t)%P4,(1<<t)%P5);
        let r=mul(x,&y);
        return r;
    }else if t < 64
    {
        let t_1= t/2;
        let t_2 = t -t_1;

        let y = DoubleInteger::new((1<<t_1)%P1,(1<<t_1)%P2,(1<<t_1)%P3,(1<<t_1)%P4,(1<<t_1)%P5);
        let z = DoubleInteger::new((1<<t_2)%P1,(1<<t_2)%P2,(1<<t_2)%P3,(1<<t_2)%P4,(1<<t_2)%P5);
        
        let mut r=mul(x,&y);
        r=mul(&r,&z);
        return r;
    }else 
    {
        panic!("shift is too big!");
    }
    
}

The right shift is a bigger piece. We need to know the parity of the number. Getting the parity of a number is simple if the RNS uses as a coprime integer. But taking as a coprime integer does not allow for a right shift because then is no longer invertible.

The basic algorithm to make a 1 bit right shift would be:

  • find the parity of the number

  • if the parity is , multiply by the inverse of ,

  • if the parity is , subtract and then multiply by the inverse of

Multiplying by the inverse of is exactly like dividing by if the number is a multiple of two. The same observation will allow for integer divisions for other numbers as well, but it is a much longer story.

It is quite easy to find the inverse of . We can proove that . Which means we can multiply by the inverse of 2 very quickly. Since is odd,

thus

As we stated before multiplying by the inverse of an even number is exactly like dividing y two. For example

Multiplying by the inverse of odd numbers is not really more complicated

This will be used in the rightshift algorithm when each residue has to be multiplied by the inverse of . If the residue of a number is even we will apply a simple rightshift to it, otherwise, we apply a rightshift and add :


r_i = if r_i&1 == 0 { r_i>>1 } else { (r_i>>1) + INV_2_Pi };

The RNS does not allow immediatly to tell whether a number is odd or even if we did not include a power of two in our coprime system. On the other side we could easily find the parity of a number knowing its radix representation.

As mentionned earlier there are several ways to change from the RNS to the radix representation. One way to do so, tedious at first sight, is by parsing a lookup table that stores all the RNS possible values in the range . Let’s go back to our simple example with . We present the beginning of that lookup table where is the radix representation of the number, and is the parity of :

n p
0 0 0 0 0
1 1 1 1 1
2 2 2 2 0
0 3 3 3 1
1 4 4 4 0
2 0 5 5 1
0 1 6 6 0
1 2 0 7 1
2 3 1 8 0
0 4 2 9 1

Up until the parity of is the parity of . For , the parity of is the opposite of the parity of . We also know that the residue of will always be any value between and included.

Thus to reconstruct a number and find its parity, one ossibility is to:

  • Define a number that starts at in the lookup table. If is added to that value, will always verify .

  • repeatedly add to