
The Sumchecks

The two phases of the shift reduction

As we've explained, our goal now is to assess the validity of evaluation claims of the form

\begin{equation*}\alpha_z \stackrel{?}= \widehat{z}(r_{\widehat{i}}, r'_x),\end{equation*}

where $\widehat{z}(\widehat{I}, X)$ is an oblong-multilinearized constraint array. The values $\alpha_z$, $r_{\widehat{i}}$ and $r'_x$ are known to both the prover and the verifier. In each case, the evaluation points $r_{\widehat{i}}$ and $r'_x$ will be identical across all claims, whereas the claimed evaluations $\alpha_z$ of course won't be.

In this page, we explain how to reduce a plurality of evaluation claims of this form to a single evaluation claim on the multilinearization $\widetilde{w}(J, Y)$ of the witness. For expository reasons, we explain the case in which there is just one claim. The case in which there are more than one is similar, and we explain inline where it needs to be accommodated. This choice lets us abstract away how many claims there are (there will be 3 at the end of the AND reduction and 4 at the end of the MUL reduction).

Unwinding the description of $\widehat{z}(\widehat{I}, X)$ that we constructed in the last page, and inserting $r_{\widehat{i}}$ and $r'_x$, we obtain the claim:

\begin{equation*}\alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) \cdot \widetilde{w}(j, y).\end{equation*}

As usual, $\widetilde{Z}_{\text{cons}, \text{op}}$ is the multilinearized constraint matrix associated to the constraint array $z$.

A Factorization

At this point, it would be correct to just tell the prover to "run the sumcheck algorithm". The problem is that this sum is enormous: it is over $\ell_\text{words} + 18$ variables. Its raw table of values would thus be something like $2^{18}$ times as large as the witness, which is absurd and unacceptable.

The essential point for us is that not all variables appear in all functions. If they did, we'd have no choice but to run the standard algorithm. As it happens, though, each function has only some of the variables $I$, $J$, $S$, and $Y$. Moreover, the way that these variables are "distributed" allows some very interesting things to happen.

First, note that in each of the three sum expressions above, for each $j$, $s$ and $\text{op}$, among the four resulting inner factors, $\delta_D(r_{\widehat{i}}, \widehat{i})$ and $\widetilde{\textsf{shift-ind}}_\text{op}(i, j, s)$ contain $i$ but not $y$, while $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s)$ and $\widetilde{w}(j, y)$ contain $y$ but not $i$. Because of this, we can factor the innermost nested sums over $i$ and $y$ into a product of two sums, for each $j$, $s$ and $\text{op}$. In this way, we factorize the above claim as follows:

\begin{equation*} \alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} \Bigg[\sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) \Bigg] \cdot \Bigg[ \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s) \cdot \widetilde{w}(j, y) \Bigg].\end{equation*}
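This factorization is nothing but the distributive law: once the inner summand splits into an $i$-only factor and a $y$-only factor, the double sum over $i$ and $y$ equals a product of two single sums. A tiny numeric check of the identity, with toy integer functions standing in for the actual $\delta_D \cdot \widetilde{\textsf{shift-ind}}$ and $\widetilde{Z} \cdot \widetilde{w}$ factors:

```python
import itertools

# Toy stand-ins: a(i) depends only on i, b(y) depends only on y.
a = {i: (3 * i + 1) % 97 for i in range(8)}    # role of delta_D * shift-ind
b = {y: (5 * y + 2) % 97 for y in range(16)}   # role of Z_cons * w

# Naive double sum over the full cube: 8 * 16 = 128 products.
naive = sum(a[i] * b[y] for i, y in itertools.product(range(8), range(16)))

# Factored form: 8 + 16 additions and a single multiplication.
factored = sum(a.values()) * sum(b.values())

assert naive == factored
```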

This wouldn't do us much good if it weren't for the following further observation. In each of the three expressions above, both bracketed quantities are multilinear in both $J$ and $S$.

Indeed, in the left-hand bracketed sum, each summand is multilinear in $J$ and $S$, since those variables appear only in $\widetilde{\textsf{shift-ind}}_\text{op}(i, J, S)$ and not in $\delta_D(r_{\widehat{i}}, \widehat{i})$; thus, the entire sum is also multilinear in $J$ and $S$. Similarly, in the right-hand bracketed sum, each summand is again multilinear in $J$ and $S$, since $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, S)$ depends on $S$ but not on $J$, while $\widetilde{w}(J, y)$ depends on $J$ but not on $S$; hence that entire sum is multilinear in $J$ and $S$ as well. We conclude that both bracketed sums are themselves multilinear in $J$ and $S$, and we can treat the entire expression above as a sum, over the cube $\mathcal{B}_6 \times \mathcal{B}_6$, of a multivariate polynomial in $J$ and $S$ expressible as the sum of just three products of multilinears (i.e., over varying $\text{op}$).

If this weren't true, then the above multivariate in $J$ and $S$ would not be expressible as a composition of multilinears in $J$ and $S$ with reasonable composition functions. Rather, the composition function would be enormous—with $3 \times 64$, $3 \times n_\text{words}$, or even $3 \times 64 \times n_\text{words}$ summands—as opposed to just $3$.

The First Phase

The above remarks suggest the following first step. Purely for notation, we introduce some abbreviations. For each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, we define two 12-variate multilinears in $J$ and $S$:

\begin{equation*}h_\text{op}(J, S) \coloneqq \sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, J, S);\end{equation*}

and

\begin{equation*}g_\text{op}(J, S) \coloneqq \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, S) \cdot \widetilde{w}(J, y).\end{equation*}

Indeed, both multivariates above in $J$ and $S$ are multilinear (we just argued this). By definition, our claim now amounts to:

\begin{equation*}\alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} h_\text{op}(j, s) \cdot g_\text{op}(j, s).\end{equation*}

Thus, the prover can begin by tabulating the $2^6 \times 2^6$ tables of values of all six functions $h_\text{op}(J, S)$ and $g_\text{op}(J, S)$, i.e. for all three $\text{op}$. The total resulting data here is of size $2^6 \cdot 2^6 \cdot 6$ field elements, or 384 KiB total. In the next page, we talk about how the prover can prepare these tables of values efficiently, exploiting the sparsity of the constraint system.
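Restricted to boolean points $j, s \in \mathcal{B}_6$, the table of $g_\text{op}$ is literally a matrix product: $g_\text{op}[j][s] = \sum_y Z[y][s] \cdot w[j][y]$, where $Z$ denotes $\widetilde{Z}_{\text{cons},\text{op}}$ with its first slot specialized at $r'_x$. Here is a dense sketch of this tabulation over toy integer data (the real protocol works over a large binary field, and the next page's sparse method replaces this naive matmul):

```python
import random

N_J, N_S, N_Y = 64, 64, 256   # 2^6, 2^6, and a toy 2^{ell_words}
random.seed(0)

# Toy stand-ins for the specialized constraint table and the witness table.
Z = [[random.randrange(997) for _ in range(N_S)] for _ in range(N_Y)]  # Z[y][s]
w = [[random.randrange(997) for _ in range(N_Y)] for _ in range(N_J)]  # w[j][y]

# g[j][s] = sum_y Z[y][s] * w[j][y]  --  a (64 x N_Y) times (N_Y x 64) product.
g = [[sum(w[j][y] * Z[y][s] for y in range(N_Y)) for s in range(N_S)]
     for j in range(N_J)]
```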

Pending that problem, we can tell the parties to run the sumcheck above. In the case in which there are multiple constraint arrays at play, there will be one $g_\text{op}(J, S)$ for each array; at this point, the verifier will sample batching scalars for these.
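Batching by random scalars works in the standard way: the verifier samples one scalar per array, and the parties run a single sumcheck on the corresponding random linear combination of the claims. A toy sketch of the bookkeeping (hypothetical names; toy prime field standing in for the protocol's actual field):

```python
import random
random.seed(3)

P = 2**31 - 1   # toy prime modulus, standing in for the real field
N = 8           # toy cube size

# Three toy tables, one per constraint array, with their claimed sums.
tables = [[random.randrange(P) for _ in range(N)] for _ in range(3)]
claims = [sum(t) % P for t in tables]

# Verifier samples batching scalars; parties sumcheck the combined table
# against the combined claim.
lams = [random.randrange(P) for _ in tables]
batched = [sum(l * t[k] for l, t in zip(lams, tables)) % P for k in range(N)]
assert sum(batched) % P == sum(l * c for l, c in zip(lams, claims)) % P
```

If any individual claim were false, the batched claim would survive this combination only with probability on the order of $1/|\text{field}|$ over the choice of scalars.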

The sumcheck is 12-dimensional, over a sum of three products of multilinears (i.e., taken over $\text{op}$). After 12 rounds, the parties will have reduced the prover's claim to a further claim, say:

\begin{equation*}\beta \stackrel{?}= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot g_\text{op}(r_j, r_s),\end{equation*}

for 6-dimensional random points $r_j$ and $r_s$ sampled during the sumcheck.
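For concreteness, here is a minimal sketch of one round of this sumcheck for a single $\text{op}$ (the three-$\text{op}$ case just sums three such products in each round polynomial). All names and the toy prime field are illustrative, not the protocol's actual implementation:

```python
import random
random.seed(1)

P = 2**31 - 1   # toy prime field; the real protocol uses a binary tower field
n = 4           # toy number of variables (12 in the text)

# Toy tables for one op: values of two multilinears h, g on the n-cube,
# indexed so that the first variable is the high bit.
h = [random.randrange(P) for _ in range(2**n)]
g = [random.randrange(P) for _ in range(2**n)]

claim = sum(hv * gv for hv, gv in zip(h, g)) % P

def round_poly(h, g):
    """Evaluations at X = 0, 1, 2 of the degree-2 round polynomial
    r(X) = sum over the remaining cube of h(X, rest) * g(X, rest)."""
    half = len(h) // 2
    evals = []
    for x in (0, 1, 2):
        total = 0
        for k in range(half):
            hx = (h[k] + x * (h[half + k] - h[k])) % P   # multilinear in X
            gx = (g[k] + x * (g[half + k] - g[k])) % P
            total = (total + hx * gx) % P
        evals.append(total)
    return evals

def fold(t, r):
    """Specialize the first variable of a multilinear table at the challenge r."""
    half = len(t) // 2
    return [(t[k] + r * (t[half + k] - t[k])) % P for k in range(half)]

# One round: the verifier checks r(0) + r(1) == claim, then sends a challenge.
r0, r1, r2 = round_poly(h, g)
assert (r0 + r1) % P == claim
challenge = random.randrange(P)
h, g = fold(h, challenge), fold(g, challenge)
# The new claim is r(challenge), interpolated from (r0, r1, r2); after n
# rounds the tables have length 1 and the parties hold h(r) * g(r).
```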

The Second Phase

Rearranging the claim we left off with above, by unwinding what the evaluations $g_\text{op}(r_j, r_s)$ mean, we get the identical claim:

\begin{align*}\beta &\stackrel{?}= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot g_\text{op}(r_j, r_s) \\ &= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \Bigg[ \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, r_s) \cdot \widetilde{w}(r_j, y) \Bigg] \\ &= \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \left[ \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, r_s) \right] \cdot \widetilde{w}(r_j, y). \end{align*}
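The swap of the two sums in the last step is easy to check numerically: the scalars $h_\text{op}(r_j, r_s)$ can be folded into the specialized constraint tables once, collapsing the bracket into a single table over $y$. A toy sketch, with random integers standing in for the actual evaluations:

```python
import random
random.seed(2)

P = 2**31 - 1   # toy prime field standing in for the real one
N_Y = 16        # toy 2^{ell_words}
OPS = ("sll", "srl", "sra")

# Toy stand-ins: h_at[op] plays h_op(r_j, r_s); Z_at[op][y] plays
# Z_cons,op(r'_x, y, r_s); w_spec[y] plays w(r_j, y).
h_at = {op: random.randrange(P) for op in OPS}
Z_at = {op: [random.randrange(P) for _ in range(N_Y)] for op in OPS}
w_spec = [random.randrange(P) for _ in range(N_Y)]

# Collapse the bracket into one table m[y] = sum_op h_op * Z_op[y]; phase two
# is then an ordinary sumcheck of a product of two multilinears in Y.
m = [sum(h_at[op] * Z_at[op][y] for op in OPS) % P for y in range(N_Y)]
beta = sum(m[y] * w_spec[y] for y in range(N_Y)) % P

# Same value computed as in the first line of the display, without collapsing.
beta_direct = sum(h_at[op] * sum(Z_at[op][y] * w_spec[y] for y in range(N_Y))
                  for op in OPS) % P
assert beta == beta_direct
```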

We can now view the above expression as a further sumcheckable claim, now over the single $\ell_\text{words}$-dimensional variable $y$. The entire bracketed expression can be viewed as a single multilinear in $Y$; the partial specialization $\widetilde{w}(r_j, Y)$ is of course also multilinear in $Y$. Again in the next page, we go into how the prover can tabulate the necessary values efficiently.

For now, we instruct the parties to run an $\ell_\text{words}$-round sumcheck on the above. After doing that, they wind up with a final claim of the form:

\begin{equation*}\gamma \stackrel{?}= \left[ \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, r_y, r_s) \right] \cdot \widetilde{w}(r_j, r_y),\end{equation*}

where $r_y$ is a further value sampled during the sumcheck.

The End

We work under the assumption that the verifier can evaluate the entire bracketed quantity himself. We have already talked at great length about how the verifier can efficiently, locally, compute the evaluations $h_\text{op}(r_j, r_s)$. Our case is a bit different than what we previously covered, since the verifier will need to evaluate $\textsf{shift-ind}_\text{op}(i, r_j, r_s)$ for each $i \in \mathcal{B}_6$, as opposed to just at a single point. But the idea is similar. The verifier can carry out this computation in a way whose cost grows linearly in the size of $\mathcal{B}_6$, and with small constants. The concrete cost is very low.

It remains for the verifier to evaluate $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, r_y, r_s)$ for each $\text{op}$. At this point, we punt, and use a non-succinct approach. Indeed, there is a straightforward way for the verifier to do this that takes $O(n_\text{words}) + O(n_\text{and}) + O(\text{circuit size})$ work (we cover this technique in the next page). We plan to support fully succinct verification soon, possibly using the Spark compiler [Set20].

Finally, the verifier outputs the single evaluation claim on $\widetilde{w}(r_j, r_y)$. This is exactly the kind of thing that our PCS is designed to treat, so this completes our treatment of the shift reduction, at least from the verifier's point of view.