
Prover Implementation

Implementing the shift reduction in a sparsity-aware way

We now explain the implementation of our shift reduction's prover. We covered a lot of ground in the previous page, where we described the protocol's overall two-phase sumcheck structure.

Here, we describe more explicitly how the prover can precompute the various tables of values that it needs in order to carry out those sumchecks. The point is to exploit the sparsity of the constraint system, and to achieve prover work close to linear in the witness size.

There are a few ways to do this, all similar in spirit, with slightly different tradeoffs at the implementation level. We describe one which is conceptually simple and comes close to what we actually do.

The First Phase

We now describe the prover's implementation of the first phase of the shift reduction. The prover's inputs are the challenges $r_{\widehat{i}}$, $r'_x$ sampled during the univariate skip, the witness $w$, and the constraint array $z$.

Computation of h polynomials

We recall the definitions of the polynomials $h_\text{op}(J, S)$. For each $j \in \mathcal{B}_6$ and $s \in \mathcal{B}_6$:

\begin{equation*}h_\text{op}(j, s) = \sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, j, s).\end{equation*}

First, the prover should precompute $\left( \delta_D(r_{\widehat{i}}, \widehat{i}) \right)_{i \in \mathcal{B}_6}$. As we've already seen, the prover can do this efficiently using a technique related to Lagrange interpolation.
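For concreteness, here is one way this computation could look, as a naive sketch. We model $\mathbb{F}_{2^{128}}$-elements as Python integers (so that field addition is XOR), use the GHASH polynomial $x^{128} + x^7 + x^2 + x + 1$ as a stand-in modulus for field multiplication (the protocol's actual 128-bit binary field may differ), and assume the univariate-skip domain $D$ is simply given as a list of 64 field elements. The actual prover uses a more efficient variant of this computation; everything here is illustrative.

```python
# A naive sketch: field elements are ints, addition is XOR, and we use the
# GHASH polynomial x^128 + x^7 + x^2 + x + 1 as a stand-in modulus.
def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
        if a >> 128:
            a ^= (1 << 128) | 0x87
    return r

def gf_inv(a):
    # Fermat inversion: a^(2^128 - 2).
    r, e = 1, (1 << 128) - 2
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a, e = gf_mul(a, a), e >> 1
    return r

def lagrange_basis_at(domain, r):
    # delta[i] = prod_{k != i} (r - d_k) / (d_i - d_k); subtraction is XOR.
    out = []
    for i, d_i in enumerate(domain):
        num, den = 1, 1
        for k, d_k in enumerate(domain):
            if k != i:
                num = gf_mul(num, r ^ d_k)
                den = gf_mul(den, d_i ^ d_k)
        out.append(gf_mul(num, gf_inv(den)))
    return out
```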

We recall now what the shift indicators mean. We note that for each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}\}$, for each $j \in \mathcal{B}_6$ and each $s \in \mathcal{B}_6$, there is at most one $i \in \mathcal{B}_6$ for which $\widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) = 1$. We can thus set $h_\text{op}(j, s)$ equal to $\delta_D(r_{\widehat{i}}, \widehat{i})$ for precisely the appropriate value of $i$.

The case $\text{op} = \mathsf{sra}$ is slightly trickier: the case $j = 63$ is exceptional. There, for each $s \in \mathcal{B}_6$, in addition to the single index $i$ arising as for $\mathsf{srl}$, we have $\widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) = 1$ for each $i \in \mathcal{B}_6$ for which $\{i\} \geq 64 - \{s\}$ (the sign bit propagates into the top $\{s\}$ output positions).

Putting everything together, we get the following algorithm. We write $\mathsf{delta}$ for the length-64 flattening of $\left( \delta_D(r_{\widehat{i}}, \widehat{i}) \right)_{i \in \mathcal{B}_6}$.

  • allocate $64 \times 64$ all-zero arrays $\mathsf{h}_{\mathsf{sll}}$, $\mathsf{h}_{\mathsf{srl}}$ and $\mathsf{h}_{\mathsf{sra}}$, containing $\mathbb{F}_{2^{128}}$-elements.
  • for $j \in \{0, \ldots, 63\}$ do:
    • for $s \in \{0, \ldots, 63 - j\}$ do: set $\mathsf{h}_{\mathsf{sll}}[j][s] \coloneqq \mathsf{delta}[j + s]$.
    • for $s \in \{0, \ldots, j\}$ do: set $\mathsf{h}_{\mathsf{srl}}[j][s] \coloneqq \mathsf{delta}[j - s]$.
    • for $s \in \{0, \ldots, j\}$ do: set $\mathsf{h}_{\mathsf{sra}}[j][s] \coloneqq \mathsf{delta}[j - s]$.
  • for $s \in \{1, \ldots, 63\}$ do: update $\mathsf{h}_{\mathsf{sra}}[63][s] \mathrel{+}= \mathsf{h}_{\mathsf{sra}}[63][s - 1]$.
  • return $\mathsf{h}_{\mathsf{sll}}$, $\mathsf{h}_{\mathsf{srl}}$ and $\mathsf{h}_{\mathsf{sra}}$.

In the second-to-last step of the algorithm above, we use a "running prefix sum" trick to handle the exceptional row $j = 63$. This completes our description of the prover's computation of the polynomials $h_\text{op}(J, S)$.
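For concreteness, here is a minimal sketch of this algorithm in Python, assuming $\mathbb{F}_{2^{128}}$-elements are represented as integers (so that field addition is XOR) and that delta is the precomputed length-64 array from above.

```python
def compute_h_tables(delta):
    # 64 x 64 all-zero tables of field elements (ints; addition is XOR).
    h_sll = [[0] * 64 for _ in range(64)]
    h_srl = [[0] * 64 for _ in range(64)]
    h_sra = [[0] * 64 for _ in range(64)]
    for j in range(64):
        for s in range(64 - j):      # s in {0, ..., 63 - j}
            h_sll[j][s] = delta[j + s]
        for s in range(j + 1):       # s in {0, ..., j}
            h_srl[j][s] = delta[j - s]
            h_sra[j][s] = delta[j - s]
    # running prefix sum over the exceptional row j = 63 of sra.
    for s in range(1, 64):
        h_sra[63][s] ^= h_sra[63][s - 1]
    return h_sll, h_srl, h_sra
```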

Computation of g polynomials

We now turn to the polynomials $g_\text{op}(J, S)$, for $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$. We reproduce their definitions here. For each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, $j \in \mathcal{B}_6$ and $s \in \mathcal{B}_6$:

\begin{equation*}g_\text{op}(j, s) \coloneqq \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s) \cdot \widetilde{w}(j, y).\end{equation*}

Our prover's first goal will be to compute all of the values $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s)$ used above. Now, across all possible $y \in \mathcal{B}_{\ell_\text{words}}$, all three $\text{op}$, and all 64 values of $s \in \mathcal{B}_6$, these quantities number $n_\text{words} \cdot 3 \cdot 64$, which is 192 times as large as the witness, and so too much to compute and store naïvely. When we are assessing claims on multiple constraint arrays $z$ at once—as we are in practice—this problem becomes worse (linearly in the number of constraint arrays).

On the other hand, because of sparsity, very few of these combinations will actually be nonzero. Our goal is to compute and store only the ones that we need: that is, we will compute and store $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s)$ only for those triples $(y, \text{op}, s)$ for which that value is nonzero.

To do this, we will create a kind of dictionary data structure. We will keep a list, indexed by $y \in \{0, \ldots, n_\text{words} - 1\}$, called $\textsf{index}$. For each $y \in \{0, \ldots, n_\text{words} - 1\}$, $\textsf{index}[y]$ will itself be a dictionary, whose keys will be pairs $(\text{op}, s)$, and whose values will be $\mathbb{F}_{2^{128}}$-elements. For each $y$, the keys of $\textsf{index}[y]$ will be precisely those pairs $(\text{op}, s)$ for which the column $\widetilde{Z}_{\text{cons}, \text{op}}(X, y, s)$ is not identically zero; for each such key, the value $\textsf{index}[y][(\text{op}, s)]$ will be $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s)$. (In the case of many constraint arrays, the keys of $\textsf{index}[y]$ will be not pairs but triples, containing also an index into the list of constraint arrays.)

The following algorithm populates this data structure:

  • precompute $\mathsf{tensor} \coloneqq \left( \widetilde{\texttt{eq}}(r'_x, x) \right)_{x \in \mathcal{B}_{\ell_\text{and}}}$, the tensor expansion of $r'_x$.
  • let $\textsf{index}$ be a length-$n_\text{words}$ array, whose elements are dictionaries, initialized to empty.
  • for each $x \in \{0, \ldots, n_\text{cons} - 1\}$ do:
    • load the $x$th list $L^z_x$ from the constraint system.
    • for each $(y, \text{op}, s) \in L^z_x$ do:
      • update $\textsf{index}[y][(\text{op}, s)] \mathrel{+}= \mathsf{tensor}[x]$.
  • return $\textsf{index}$.

For each $y$, we understand the dictionary $\textsf{index}[y]$ as a defaultdict, in the sense that if the key $(\text{op}, s)$ is not yet present in $\textsf{index}[y]$, the operation $\textsf{index}[y][(\text{op}, s)] \mathrel{+}= \mathsf{tensor}[x]$ will implicitly create the key—with value 0—before executing.
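A minimal sketch of this population step, assuming the constraint system is exposed as a Python list lists with lists[x] equal to $L^z_x$, and with field elements again represented as integers (these names are illustrative, not the actual API):

```python
from collections import defaultdict

def build_index(lists, tensor, n_words):
    # lists[x] is the list L^z_x of triples (y, op, s); tensor is the
    # tensor expansion of r'_x. Field elements are ints; addition is XOR.
    index = [defaultdict(int) for _ in range(n_words)]
    for x, constraints in enumerate(lists):
        for (y, op, s) in constraints:
            # missing keys are implicitly created with value 0.
            index[y][(op, s)] ^= tensor[x]
    return index
```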

With these in hand, we're ready to compute the polynomials $g_\text{op}(J, S)$. To compute these, we run the following algorithm:

  • for each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, initialize $\mathsf{g}_\text{op}$ to be an all-zero $2^6 \times 2^6$ array of field elements.
  • for each $y \in \{0, \ldots, n_\text{words} - 1\}$ do:
    • for each key–value pair $\left( (\text{op}, s), \textsf{value} \right)$ in $\textsf{index}[y]$ do:
      • for each $j \in \{0, \ldots, 63\}$ do:
        • if $w[y]_j = 1$ do:
          • $\mathsf{g}_\text{op}[j][s] \mathrel{+}= \textsf{value}$.
  • return $\mathsf{g}_\text{op}$ for all three $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$.
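In code, this accumulation might look as follows; we assume the witness w is given as a list of 64-bit integers, so that the bit $w[y]_j$ is (w[y] >> j) & 1 (again an illustrative representation):

```python
def compute_g_tables(index, w):
    # index is as built above; w[y] is a 64-bit int whose bit j is w[y]_j.
    g = {op: [[0] * 64 for _ in range(64)] for op in ("sll", "srl", "sra")}
    for y, entries in enumerate(index):
        for (op, s), value in entries.items():
            for j in range(64):
                if (w[y] >> j) & 1:
                    g[op][j][s] ^= value  # field addition is XOR.
    return g
```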

We claim that for each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, the resulting array $\mathsf{g}_\text{op}$ contains precisely the table of values of $g_\text{op}(J, S)$ on $\mathcal{B}_6 \times \mathcal{B}_6$. This completes our description of the first phase.

The Second Phase

As we've seen, to prepare for the second phase of the shift reduction, the prover needs to compute the tables of values of two different $\ell_\text{words}$-variate multilinears:

\begin{equation*}\widetilde{w}(r_j, Y)\end{equation*}

and

\begin{equation*}\sum_\text{op} h_\text{op}(r_j, r_s) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, Y, r_s).\end{equation*}

The first is just a standard partial specialization. The prover needs only to tensor-expand the 6-dimensional challenge $r_j$, and then, for each $y \in \{0, \ldots, n_\text{words} - 1\}$, take the subset sum $\sum_{j = 0}^{63} w[y]_j \cdot \widetilde{\texttt{eq}}(r_j, j)$.
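A sketch of this specialization, assuming the tensor expansion eq_rj of $r_j$ has already been computed (the helper name is hypothetical):

```python
def specialize_witness(w, eq_rj):
    # eq_rj is the length-64 tensor expansion of r_j; w[y] is a 64-bit int.
    out = []
    for word in w:
        acc = 0
        for j in range(64):
            if (word >> j) & 1:
                acc ^= eq_rj[j]  # subset sum: witness bits select tensor entries.
        out.append(acc)
    return out
```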

For the second, we can again use our $\textsf{index}$. We claim that the following algorithm computes the required table of values:

  • initialize $\mathsf{Z}$, an all-zero, length-$n_\text{words}$ array of $\mathbb{F}_{2^{128}}$-elements.
  • compute in advance the tensor expansion $\left( \widetilde{\texttt{eq}}(r_s, s) \right)_{s \in \mathcal{B}_6}$.
  • for $y \in \{0, \ldots, n_\text{words} - 1\}$ do:
    • for each key–value pair $\left( (\text{op}, s), \textsf{value} \right)$ in $\textsf{index}[y]$ do:
      • update $\mathsf{Z}[y] \mathrel{+}= h_\text{op}(r_j, r_s) \cdot \textsf{value} \cdot \widetilde{\texttt{eq}}(r_s, s)$.
  • return $\mathsf{Z}$.
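In code, the loop might look as follows; we again include the toy field multiplication as a stand-in (the protocol's actual 128-bit binary field may differ), and assume the three scalars $h_\text{op}(r_j, r_s)$ and the tensor expansion of $r_s$ have been precomputed:

```python
def gf_mul(a, b):
    # toy GF(2^128) multiply, as in the earlier sketch.
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
        if a >> 128:
            a ^= (1 << 128) | 0x87
    return r

def compute_Z(index, h_at_point, eq_rs):
    # h_at_point[op] is the scalar h_op(r_j, r_s); eq_rs is the tensor
    # expansion of r_s. Returns the length-n_words table Z.
    Z = [0] * len(index)
    for y, entries in enumerate(index):
        for (op, s), value in entries.items():
            Z[y] ^= gf_mul(gf_mul(h_at_point[op], value), eq_rs[s])
    return Z
```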

This completes the precomputation; $\mathsf{Z}$ yields the table of values of the second multilinear in $Y$ above. The prover can at this point proceed with the second phase of the sumcheck.

Constraint Matrices

We touch on one final topic: how the verifier can evaluate $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, r_y, r_s)$, for all three $\text{op}$, himself. As we saw earlier, this is the last remaining task that the verifier needs to perform at the very end.

The correctness of the following algorithm essentially follows from the standard properties of multilinear evaluation. The point is to do everything in a way that exploits sparsity.

  • for each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, initialize a single scalar $\mathsf{m}_\text{op}$, starting at 0.
  • precompute the tensor expansions $\left( \widetilde{\texttt{eq}}(r'_x, x) \right)_{x \in \mathcal{B}_{\ell_\text{and}}}$, $\left( \widetilde{\texttt{eq}}(r_y, y) \right)_{y \in \mathcal{B}_{\ell_\text{words}}}$ and $\left( \widetilde{\texttt{eq}}(r_s, s) \right)_{s \in \mathcal{B}_6}$.
  • for each $x \in \{0, \ldots, n_\text{cons} - 1\}$ do:
    • load the $x$th list $L^z_x$ from the constraint system.
    • for each $(y, \text{op}, s) \in L^z_x$ do:
      • update $\mathsf{m}_\text{op} \mathrel{+}= \widetilde{\texttt{eq}}(r'_x, x) \cdot \widetilde{\texttt{eq}}(r_y, y) \cdot \widetilde{\texttt{eq}}(r_s, s)$.
  • return $\mathsf{m}_\text{op}$ for all $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$.
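A sketch of the verifier's computation, under the same illustrative conventions (toy field multiplication, integer-coded field elements; the tensor_expand helper and its coordinate ordering are assumptions of this sketch):

```python
def gf_mul(a, b):
    # toy GF(2^128) multiply, as in the earlier sketches.
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
        if a >> 128:
            a ^= (1 << 128) | 0x87
    return r

def tensor_expand(r):
    # (eq~(r, x))_x for a challenge r given coordinate-by-coordinate;
    # bit i of the index corresponds to coordinate i (a chosen convention).
    out = [1]
    for coord in r:
        # in characteristic 2, 1 - coord = 1 ^ coord.
        out = [gf_mul(v, 1 ^ coord) for v in out] + [gf_mul(v, coord) for v in out]
    return out

def evaluate_constraint_matrices(lists, r_x, r_y, r_s):
    eq_x, eq_y, eq_s = tensor_expand(r_x), tensor_expand(r_y), tensor_expand(r_s)
    m = {op: 0 for op in ("sll", "srl", "sra")}
    for x, constraints in enumerate(lists):
        for (y, op, s) in constraints:
            m[op] ^= gf_mul(gf_mul(eq_x[x], eq_y[y]), eq_s[s])
    return m
```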

This completes our treatment of the shift reduction.