Longest path search

Linear algebra over field extensions

A common occurrence in succinct proofs is the need to work in two different finite fields. In general, we work mostly over a base field 𝐅 (I will mostly call this the "little field") and over some extension field 𝐊 (ditto "big field"), which is a field that is also a superset of the base field, 𝐊 ⊇ 𝐅. (Technically, the extension 𝐊 need not be an actual superset of 𝐅; rather, it needs to have a subset that is equivalent, i.e., isomorphic, to 𝐅. We will make use of this later.)

Most of the operations in a given succinct proof that uses codes (e.g., in Ligero/ZODA/Ligerito) will be done over the base field 𝐅, and a few (mostly involving random linear combinations) will be over the "big field" 𝐊. As we will see, most of these operations take the following form: let G ∈ 𝐅^(m×n) be a matrix over the "little field" and x ∈ 𝐊^n a vector over the "big field"; we then want to compute Gx ∈ 𝐊^m.

The tl;dr is that it's all linear algebra in the end: we can simply compute Gy_i for some y_1, …, y_k ∈ 𝐅^n in the little field and pack the results back together to get Gx. More specifically, since 𝐊 is a degree-k field extension of 𝐅, we have 𝐊 ≅ 𝐅^k. Writing x ↔ (y_1, …, y_k), where y_1, …, y_k ∈ 𝐅^n, we then have Gx ↔ (Gy_1, …, Gy_k).

First attempt

One way of dealing with this is to first define the field extension 𝐊 and then construct a subset 𝐅 ⊆ 𝐊 of this field that (a) contains {0, 1} and (b) is closed under multiplication and addition; in other words, 𝐅 is itself a field. Since the little field is then literally a subset of the big field, 𝐅 ⊆ 𝐊, in the proper sense of the word "subset", the operations can easily be performed: any operation in the subfield 𝐅 is just the same as the corresponding operation in 𝐊.

This is slightly annoying for two reasons. Number one is that 𝐊 is often "large" in the sense that representing an element of the big field α ∈ 𝐊 often takes far more bits than representing an element of the little field β ∈ 𝐅. A concrete example from ZODA: 𝐅 has 2^16 elements, so an element can be represented by a 16-bit integer, while 𝐊 has 2^128 elements and so requires a 128-bit integer to represent. While asymptotically this doesn't matter, in practice it's very important: a standard 64-bit ARM register fits four "little field" elements but only half of a "big field" element, which kills performance if we represent every "little field" element using the "big field" representation.

Slight improvement

One obvious improvement is to "compress" the representation of 𝐅: simply represent elements of the little field 𝐅 using 16-bit integers and then "upcast" them on the fly to their 128-bit representation in 𝐊 during computation.

This is fine and, indeed, is what we do (for now) implicitly in CryptoUtilities.jl, but it's a little annoying: the conversion actually takes a nontrivial number of operations, since the mapping from the "compressed" representation of 𝐅 to the "big field" representation in 𝐊 requires 16 "big field" additions.
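To make the upcast concrete, here is a hedged Python sketch of the usual trick in a toy setting: embedding 𝐅 = GF(4) into 𝐊 = GF(16), standing in for GF(2^16) and GF(2^128). Since the embedding is additive, it suffices to precompute the images in 𝐊 of a basis of 𝐅 and XOR together the images of the set bits, i.e., up to k big-field additions. The field choices and basis images below are my own illustrative assumptions, not the ones used in CryptoUtilities.jl.

```python
# Toy analogue of the "upcast": F = GF(4) = GF(2)[Y]/(Y^2 + Y + 1),
# K = GF(16) = GF(2)[X]/(X^4 + X + 1). Elements are bit-packed ints
# (bit i = coefficient of the i-th power of the generator).

def gf16_mul(a, b):
    # carryless (polynomial) multiply, then reduce mod X^4 + X + 1
    p = 0
    for i in range(4):
        if (b >> i) & 1:
            p ^= a << i
    for i in range(6, 3, -1):
        if (p >> i) & 1:
            p ^= 0b10011 << (i - 4)
    return p

# Images in K of the GF(2)-basis {1, Y} of F. Y must map to an element
# of multiplicative order 3; X^5 = X^2 + X = 0b0110 works.
BASIS_IMAGE = [0b0001, 0b0110]

def upcast(b):
    # additive map: XOR the basis images of the set bits of b
    # (<= 2 additions here, <= 16 in the GF(2^16) -> GF(2^128) case)
    out = 0
    for i, img in enumerate(BASIS_IMAGE):
        if (b >> i) & 1:
            out ^= img
    return out
```

One can check that this map respects multiplication on the toy pair, e.g., the image of Y² = Y + 1 in GF(4) equals the square of the image of Y in GF(16).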

(There's also a secondary annoyance: we would like the representations to be "consistent", in that multiplication in the "compressed" representation should correspond to multiplication in the "big field" representation, but that's a different story for a later day. See, e.g., here for how we deal with this.)

General construction

Another way to construct this pair of fields 𝐅 and 𝐊 is in "reverse". In particular, we can define 𝐊 as a degree-k extension of 𝐅 by setting

𝐊 = 𝐅[X]/(f),

where f is an irreducible polynomial over 𝐅 of degree k. (Funnily enough, f itself won't be used anywhere else in this construction; it just matters that one exists, which we know a priori.)
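As a concrete, hedged illustration (my own toy parameters, not ones from any of the protocols above), take 𝐅 = GF(7) and f = X² + 1, which is irreducible over GF(7) since −1 is not a square mod 7. The following Python sketch implements arithmetic in 𝐊 = 𝐅[X]/(f) ≅ GF(49), representing an element c₀ + c₁X as the pair (c₀, c₁):

```python
P = 7  # the little field F = GF(7)

def k_add(a, b):
    # addition in K = F[X]/(X^2 + 1) is coefficient-wise addition in F
    return ((a[0] + b[0]) % P, (a[1] + b[1]) % P)

def k_mul(a, b):
    # (a0 + a1 X)(b0 + b1 X) = a0 b0 + (a0 b1 + a1 b0) X + a1 b1 X^2,
    # then reduce using X^2 = -1 (i.e., mod f = X^2 + 1)
    c0 = (a[0] * b[0] - a[1] * b[1]) % P
    c1 = (a[0] * b[1] + a[1] * b[0]) % P
    return (c0, c1)
```

That this really is a field (every nonzero element has a multiplicative inverse) can be confirmed by brute force over all 49 elements.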

This means that an element of the big field 𝐊 and a k-vector over the little field 𝐅 are equivalent, in that one can be mapped to the other and vice versa. That is,

𝐊 ≅ 𝐅^k

in the obvious way: x ∈ 𝐊 is (represented by) a polynomial of degree < k, so interpret its coefficients as a k-vector; conversely, a k-vector can be interpreted as a polynomial of degree less than k, which gives the bijective mapping. Note that 0 ∈ 𝐊 is the zero polynomial, or, equivalently, the zero vector in 𝐅^k. (Exercise: what should the element 1 be in 𝐊 and therefore in 𝐅^k under this map?) Most importantly, note that this map is linear over the little field 𝐅 and preserves that structure.

In general, given an element β ∈ 𝐊, we will write its "tilde" version β~ ∈ 𝐅^k for its representation as a k-vector.

Breaking it down

The first question to ask, then, is: given two elements, α ∈ 𝐅 in the little field and β ∈ 𝐊 in the big field, what is αβ? Since β ∈ 𝐊, let β~ ∈ 𝐅^k be the corresponding k-vector. It is not hard to show that (αβ)~ = αβ~; in other words, multiplication of a big field element by a little field element is just scalar multiplication of the underlying vector representation of the big field element.
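In a hedged toy setting (my own parameters: 𝐅 = GF(7) and 𝐊 = GF(7)[X]/(X² + 1), so k = 2, with big-field elements stored as coefficient pairs), the identity (αβ)~ = αβ~ can be checked directly: multiplying inside 𝐊, with α embedded as the constant polynomial, agrees with scaling the vector β~ entrywise in the little field.

```python
P = 7  # little field F = GF(7); K = F[X]/(X^2 + 1), elements as pairs

def k_mul(a, b):
    # multiplication in K, reducing with X^2 = -1
    return ((a[0] * b[0] - a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)

def embed(alpha):
    # the copy of F inside K: alpha as the constant polynomial alpha + 0*X
    return (alpha, 0)

def scale(alpha, beta_vec):
    # scalar multiplication of the vector representation beta~ by alpha
    return tuple((alpha * c) % P for c in beta_vec)
```

Exhaustively, k_mul(embed(α), β) == scale(α, β~) for every α ∈ 𝐅 and β ∈ 𝐊 in this toy pair.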

Now, we can ask the next question: given a vector in the little field x ∈ 𝐅^n and a scalar in the big field β ∈ 𝐊, what is the (big field) vector βx ∈ 𝐊^n? Well, from before, note that β~ ∈ 𝐅^k, and the ith entry of βx should be equal to βx_i, which we know how to compute from the previous paragraph. This, in turn, is equivalent, in the vector representation of the big field, to x_iβ~, which is a k-vector over 𝐅. Finally, define the matrix representation of a vector y ∈ 𝐊^n in the big field to be the matrix y~ ∈ 𝐅^(n×k) over the little field, where the ith row of y~ is the k-vector representation (y_i)~ ∈ 𝐅^k of the ith entry y_i. A little work shows that βx is then, in this matrix representation,

(βx)~ = xβ~^T.
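In the same style of hedged toy example (my own parameters: 𝐅 = GF(7), 𝐊 = GF(7)[X]/(X² + 1), so k = 2, big-field elements as pairs), the formula (βx)~ = xβ~^T is just an outer product over the little field, which we can check against the "honest" big-field computation:

```python
P = 7  # little field; K = F[X]/(X^2 + 1), big-field elements as pairs

def k_mul(a, b):
    # multiplication in K, reducing with X^2 = -1
    return ((a[0] * b[0] - a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)

def outer(x, beta_vec):
    # matrix representation of beta * x: the n x k outer product x beta~^T,
    # computed entirely in the little field
    return [[(xi * c) % P for c in beta_vec] for xi in x]

def honest(x, beta):
    # reference: multiply each (embedded) entry of x by beta inside K
    return [list(k_mul(beta, (xi, 0))) for xi in x]
```

Row i of the outer product is exactly x_i β~, matching the entrywise big-field products.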

Finally, we can put all of this together for the general case. Given a matrix G ∈ 𝐅^(m×n) in the little field and a vector x ∈ 𝐊^n in the big field, we'd like to compute Gx ∈ 𝐊^m (which is a vector in the big field, representable as an m×k matrix over the little field). Of course, we know that, by definition,

Gx = ∑_{i=1}^n x_i g_i,

where g_i ∈ 𝐅^m is the ith column of G (in the little field) and x_i ∈ 𝐊 is a scalar (in the big field). We can use our previous scalar-vector product representation to note that

(x_i g_i)~ = g_i (x_i)~^T

and add these together to get

(Gx)~ = ∑_{i=1}^n g_i (x_i)~^T = Gx~.

In other words, applying any linear operation whose elements are always in the little field to a vector in the big field is equivalent to applying the linear operation to each column of the matrix representation of the big-field vector.

This means that, if we have an efficient procedure to compute Gz for any vector z in the little field, we can simply use the efficient procedure as a black box to compute Gx when x is in the big field by applying it to each column of the matrix representation of the big-field vector x.
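The whole pipeline can be sketched in a hedged toy setting (my own parameters, not any protocol's: 𝐅 = GF(7), 𝐊 = GF(7)[X]/(X² + 1), k = 2). The sketch treats a little-field mat-vec as a black box, applies it once per column of x~, repacks the rows into big-field elements, and compares against the direct big-field computation of Gx:

```python
P = 7      # little field; K = F[X]/(X^2 + 1), big-field elements as pairs
K_DEG = 2  # k, the extension degree

def k_add(a, b):
    return ((a[0] + b[0]) % P, (a[1] + b[1]) % P)

def k_mul(a, b):
    return ((a[0] * b[0] - a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)

def little_matvec(G, y):
    # the "efficient procedure" used as a black box: G y over F
    return [sum(gij * yj for gij, yj in zip(row, y)) % P for row in G]

def big_matvec(G, x):
    # G over F, x a list of big-field pairs; compute Gx column by column
    cols = [little_matvec(G, [xi[j] for xi in x]) for j in range(K_DEG)]
    # repack the rows of the m x k result into big-field elements
    return [tuple(col[i] for col in cols) for i in range(len(G))]

def big_matvec_direct(G, x):
    # reference: sum_i x_i g_i, computed entirely in K
    m = len(G)
    out = [(0, 0)] * m
    for i, xi in enumerate(x):
        for r in range(m):
            out[r] = k_add(out[r], k_mul(xi, (G[r][i], 0)))
    return out
```

The black-box version does k little-field mat-vecs and no big-field multiplications at all, which is precisely the point of the construction.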