Linear algebra over field extensions
A common occurrence in succinct proofs is the need to work in two different finite fields. In general, we work mostly over a base field $F$ (I will mostly call this the "little field") and over some extension field $K$ (ditto "big field"), which is a field that is also a superset of the base field $F$. (Technically, the extension need not be an actual superset of $F$; rather, it needs to have a subset that is equivalent/isomorphic to $F$; we will make use of this later.)
Most of the operations in a given succinct proof that uses codes (e.g., in Ligero/ZODA/Ligerito) will be done over the base field $F$ and a few (mostly involving random linear combinations) will be over the "big field" $K$. As we will see, most of these operations will take the form of something like the following: let $A \in F^{m \times k}$ be a matrix over the "little field" and $x \in K^k$ a vector over the "big field"; then we want to compute $Ax \in K^m$.
The tl;dr will be that it's all linear algebra in the end and we can simply perform a few matrix-vector products $Av_j$ for some vectors $v_j$ over the little field and pack the resulting elements back together to get the result $Ax$. More specifically, since $K$ is a degree-$n$ field extension of $F$, we have $K \cong F^n$ as an $F$-vector space. Set $\tilde x \in F^{k \times n}$ to be the matrix whose $i$th row is the $F^n$ representation of the entry $x_i$; then $\widetilde{Ax} = A\tilde x$.
First attempt
One way of dealing with this is to first define the field extension $K$ and then construct a subset $F \subseteq K$ of this field that (a) contains $0$ and $1$ and (b) is closed under multiplication and addition—in other words, $F$ is itself a field. Since the little field $F$ is literally a subset of the big field $K$ in the proper sense of the word "subset", the operations can easily be performed, since any subfield operation in $F$ is just the same as operating over $K$.
This is slightly annoying for two reasons. Number one is that $K$ is often "large" in the sense that representing an element of the big field $K$ often takes far more bits than representing an element of the little field $F$. A concrete example from ZODA, e.g., is that $F = \mathbb{F}_{2^{16}}$ has $2^{16}$ elements and so an element can be represented by a 16-bit integer, while $K = \mathbb{F}_{2^{128}}$ has $2^{128}$ elements and so requires a 128-bit integer to represent. While asymptotically this doesn't matter, in practice it's very important: in a standard ARM register, which is 64 bits, we can fit 4 "little field" elements but only half of a "big field" element, which kills performance if we represent every "little field" element using the "big field" representation.
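To make the size gap concrete, here is a quick toy check in Julia (variable names made up for illustration; this is not CryptoUtilities.jl code): storing little-field elements in their own 16-bit words versus storing the same elements as 128-bit words is a factor-of-8 difference before any arithmetic even happens.

```julia
# Toy size comparison: 2^20 little-field elements stored natively vs. as 128-bit words.
v_small = zeros(UInt16, 1 << 20)   # 16 bits per element
v_big   = zeros(UInt128, 1 << 20)  # 128 bits per element
sizeof(v_small)   # 2 MiB of data
sizeof(v_big)     # 16 MiB of data, a factor-of-8 blowup
```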
Slight improvement
One obvious version of this is to "compress" the representation of $F$: simply represent elements of the little field using 16-bit integers and then "upcast" them on the fly to their 128-bit representation in $K$ during computation.
This is fine and, indeed, is what we do (for now) implicitly in CryptoUtilities.jl, but it's a little annoying: the conversion actually takes a nontrivial number of operations, since the mapping from the "compressed" representation of $F$ to the "big field" representation in $K$ requires 16 "big field" additions.
(There's also some secondary annoyance since we would like the representations to be "consistent" in that multiplication in the "compressed" representation should correspond to multiplication in the "big field" representation, but that's a different story for a later day. See, e.g., here for how we deal with this.)
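As a sketch of what the on-the-fly upcast looks like (hypothetical names, not the actual CryptoUtilities.jl API): assuming we have precomputed the big-field images of the 16 bit-basis elements of $F$ under an embedding $F \hookrightarrow K$, converting one little-field element costs at most 16 big-field additions, which for binary fields are just XORs.

```julia
# Sketch: embed a 16-bit little-field element into its 128-bit big-field
# representation. `basis_images[i]` is assumed to hold the (precomputed) image
# in K of the i-th bit-basis element of F under a fixed embedding F -> K.
function upcast(a::UInt16, basis_images::NTuple{16,UInt128})
    acc = zero(UInt128)
    for i in 0:15
        if (a >> i) & 0x1 == 0x1       # if bit i of a is set...
            acc ⊻= basis_images[i+1]   # ...add (XOR) that basis element's image
        end
    end
    return acc
end
```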
General construction
Another way to construct this pair of fields $F$ and $K$ is in "reverse". In particular, we can define $K$ as a degree-$n$ extension of $F$ by setting
$$
K = F[X]/(p(X)),
$$
where $p$ is an irreducible polynomial over $F$ of degree $n$. (Funnily enough, $p$ itself won't be used anywhere else in this construction; it just matters that one exists, which we know a priori.)
This means that an element of the big field $K$ and an $n$-vector over the little field $F$ are equivalent in that one can be mapped to the other and vice versa. That is,
$$
K \cong F^n
$$
in the obvious way: an element of $K$ is a polynomial of degree at most $n-1$, so interpret the coefficients as an $n$-vector over $F$, or, conversely, an $n$-vector over $F$ can be interpreted as a polynomial of degree less than $n$, which gives the bijective mapping. Note that $0 \in K$ is the zero polynomial in $F[X]$, or, equivalently, the zero vector in $F^n$. (Exercise: what should the 1 element be in $K$ and therefore in $F^n$ using this map?) Most importantly, note that this map is linear in the little field $F$ and preserves that structure.
In general, given an element $y \in K$ we will write its "tilde" version $\tilde y \in F^n$ to be its version as an $n$-vector over $F$.
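Here is a minimal sketch of this representation in Julia (made-up names, not the CryptoUtilities.jl types), using the ZODA-style parameters $F = \mathbb{F}_{2^{16}}$ stored as `UInt16` and $n = 8$, so that $K$ has $(2^{16})^8 = 2^{128}$ elements:

```julia
# Sketch: in the "reverse" construction, an element of K = F[X]/(p(X)) is stored
# directly as its coefficient vector in F^n. Here F-elements are UInt16 and n = 8.
const N = 8

struct BigFieldElem
    coeffs::NTuple{N,UInt16}   # coefficients of a polynomial of degree < N over F
end

# The "tilde" map K -> F^N and its inverse are just reinterpretations of the data.
tilde(y::BigFieldElem) = collect(y.coeffs)
untilde(v::AbstractVector{UInt16}) = BigFieldElem(ntuple(i -> v[i], N))

# The zero of K is the zero polynomial, i.e., the zero vector in F^N.
Base.zero(::Type{BigFieldElem}) = BigFieldElem(ntuple(_ -> zero(UInt16), N))
```

The `one` of $K$ is deliberately left out of the sketch, since that is exactly the exercise above.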
Breaking it down
The first question to ask then is, given two elements, $\alpha \in F$ in the little field and $y \in K$ in the big field, what is $\alpha y$? Since $y \in K$, set $\tilde y \in F^n$ to be the corresponding $n$-vector. It is not hard to show that $\widetilde{\alpha y} = \alpha \tilde y$; in other words, multiplication of a big-field element by a little-field element is just scalar multiplication of the underlying vector representation of the big-field element.
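To spell out the (short) argument: writing $y$ as a polynomial with coefficients $\tilde y_1, \ldots, \tilde y_n \in F$, multiplying by $\alpha \in F$ scales each coefficient and never raises the degree, so no reduction modulo $p$ is needed:
$$
\alpha y = \alpha\left(\tilde y_1 + \tilde y_2 X + \cdots + \tilde y_n X^{n-1}\right) = (\alpha \tilde y_1) + (\alpha \tilde y_2) X + \cdots + (\alpha \tilde y_n) X^{n-1},
$$
which is exactly the polynomial whose coefficient vector is $\alpha \tilde y$.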
Now, we can ask the next question: given a vector $a \in F^m$ over the little field and a scalar $y \in K$ in the big field, what is the (big-field) vector $ay \in K^m$? Well, from before, note that each entry $a_i \in F$, and we know the $i$th entry of $ay$ should be equal to $a_i y$, which we know from before. This, in turn, is equivalent in the vector representation of the big field to $a_i \tilde y$, which is an $n$-vector over $F$. Finally, define the matrix representation $\tilde z \in F^{m \times n}$ of a vector $z \in K^m$ in the big field to be the $m \times n$ matrix over the small field where the $i$th row of $\tilde z$ is the $n$-vector representation of the $i$th element $z_i$. A little work shows that $ay$ is then, in this matrix representation:
$$
\widetilde{ay} = \begin{bmatrix} a_1 \tilde y^T \\ \vdots \\ a_m \tilde y^T \end{bmatrix} = a \tilde y^T.
$$
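In code, this is just an outer product computed with little-field multiplications. A minimal sketch, where `mul_F` (a black-box $\mathbb{F}_{2^{16}}$ multiply) is assumed rather than implemented:

```julia
# Sketch: the matrix representation of a*y, for a ∈ F^m and ỹ ∈ F^n, is the outer
# product a * ỹᵀ, i.e., row i is a[i] times the coefficient vector of y.
# `mul_F` is a black-box little-field multiplication, e.g. in GF(2^16).
scalar_vec_rep(a::Vector{UInt16}, ytilde::Vector{UInt16}, mul_F) =
    [mul_F(a[i], ytilde[j]) for i in eachindex(a), j in eachindex(ytilde)]
```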
Finally, we can put all of this together for the general case. Given a matrix $A \in F^{m \times k}$ over the little field and a vector $x \in K^k$ over the big field, we'd like to compute $Ax$ (which is a vector over the big field, representable as an $m \times n$ matrix over the little field). Of course, we know that, by definition,
$$
Ax = \sum_{i=1}^{k} a_i x_i,
$$
where $a_i \in F^m$ is the $i$th column of $A$ (over the little field) and $x_i \in K$ is a scalar (in the big field). We can use our previous scalar-vector product representation to note that
$$
\widetilde{a_i x_i} = a_i \tilde x_i^T,
$$
and add these together to get
$$
\widetilde{Ax} = \sum_{i=1}^{k} a_i \tilde x_i^T = A \tilde x,
$$
where $\tilde x \in F^{k \times n}$ is the matrix representation of $x$ (whose $i$th row is $\tilde x_i^T$).
In other words, applying any linear operation whose elements are always in the little field to a vector in the big field is equivalent to applying the linear operation to each column of the matrix representation of the big-field vector.
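As a tiny worked example (with toy parameters chosen purely for illustration): take $F = \mathbb{F}_2$, $n = 2$, and $K = F[X]/(X^2 + X + 1)$, writing $\omega$ for the class of $X$, so that $\tilde{1} = (1, 0)$ and $\tilde{\omega} = (0, 1)$. Then for
$$
A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad
x = \begin{bmatrix} \omega \\ 1 \end{bmatrix}, \qquad
\tilde x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},
$$
we have $Ax = (\omega, \omega + 1)$ directly, while
$$
A \tilde x = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix},
$$
whose rows are exactly $\tilde{\omega}$ and $\widetilde{\omega + 1}$, as promised.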
This means that, if we have an efficient procedure to compute $Av$ for any vector $v \in F^k$ over the little field, we can simply use this efficient procedure as a black box to compute $Ax$ when $x \in K^k$ is over the big field, by applying it to each column of the matrix representation $\tilde x$ of the big-field vector $x$.
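A minimal sketch of that black-box pattern in Julia (hypothetical names; `matvec_F` stands in for whatever fast little-field routine is available, e.g. an encoding step):

```julia
# Sketch: compute (Ax)~ for A over F and x over K by running a black-box
# little-field mat-vec `matvec_F` (computing v -> A*v over F) on each column of
# the k×n matrix representation xtilde of x, then packing the columns back up.
function bigfield_matvec(matvec_F, xtilde::Matrix{UInt16})
    cols = [matvec_F(xtilde[:, j]) for j in 1:size(xtilde, 2)]  # n little-field mat-vecs
    return reduce(hcat, cols)                                   # the m×n matrix (Ax)~
end
```

Note that this costs exactly $n$ calls to the little-field procedure, with no big-field arithmetic anywhere in the inner loop.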