
The Sumchecks

The two phases of the shift reduction

As we've explained, our goal now is to assess the validity of evaluation claims of the form

\begin{equation*}\alpha_z \stackrel{?}= \widehat{z}(r_{\widehat{i}}, r'_x),\end{equation*}

where $\widehat{z}(\widehat{I}, X)$ is an oblong-multilinearized constraint array. The values $\alpha_z$, $r_{\widehat{i}}$ and $r'_x$ are known to both the prover and the verifier. In each case, the evaluation points $r_{\widehat{i}}$ and $r'_x$ will be identical across all claims, whereas the claimed evaluations $\alpha_z$ of course won't be.

In this page, we explain how to reduce a plurality of evaluation claims of this form to a single evaluation claim on the multilinearization $\widetilde{w}(J, Y)$ of the witness. For expository reasons, we explain the case in which there is just one claim. The case in which there are more than one is similar, and we explain inline where it needs to be accommodated. This choice lets us abstract away how many claims there are (there will be 3 at the end of the AND reduction and 4 at the end of the MUL reduction).

Unwinding the description of $\widehat{z}(\widehat{I}, X)$ that we constructed in the last page, and inserting $r_{\widehat{i}}$ and $r'_x$, we obtain the claim:

\begin{equation*}\alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) \cdot \widetilde{w}(j, y).\end{equation*}

As usual, $\widetilde{Z}_{\text{cons}, \text{op}}$ is the multilinearized constraint matrix associated to the constraint array $z$.

A Factorization

At this point, it would be correct to just tell the prover to "run the sumcheck algorithm". The problem is that this sum is enormous: it is over $\ell_\text{words} + 18$ variables. Its raw table of values would thus be something like $2^{18}$ times as large as the witness, which is absurd and unacceptable.

The essential point for us is that not all variables appear in all functions. If they did, we'd have no choice but to run the standard algorithm. As it happens, though, each function has only some of the variables $I$, $J$, $S$, and $Y$. Moreover, the way that these variables are "distributed" allows some very interesting things to happen.

First, note that in each of the three sum expressions above, for each $j$, $s$ and $\text{op}$, among the four resulting inner factors, $\delta_D(r_{\widehat{i}}, \widehat{i})$ and $\widetilde{\textsf{shift-ind}}_\text{op}(i, j, s)$ contain $i$ but not $y$, while $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s)$ and $\widetilde{w}(j, y)$ contain $y$ but not $i$. Because of this, we can factor the innermost nested sums over $i$ and $y$ into a product of two sums, for each $j$, $s$ and $\text{op}$. In this way, we factorize the above claim as follows:

\begin{equation*} \alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} \Bigg[\sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, j, s) \Bigg] \cdot \Bigg[ \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, s) \cdot \widetilde{w}(j, y) \Bigg].\end{equation*}
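This factorization is nothing but the distributive law: once the inner summand splits into an $i$-only factor and a $y$-only factor, the double sum over $i$ and $y$ equals a product of two single sums. A tiny numeric check of the identity, with toy integer functions standing in for the actual $\delta_D \cdot \widetilde{\textsf{shift-ind}}$ and $\widetilde{Z} \cdot \widetilde{w}$ factors:

```python
import itertools

# Toy stand-ins: a(i) depends only on i, b(y) depends only on y.
a = {i: (3 * i + 1) % 97 for i in range(8)}    # role of delta_D * shift-ind
b = {y: (5 * y + 2) % 97 for y in range(16)}   # role of Z_cons * w

# Naive double sum over the full cube: 8 * 16 = 128 products.
naive = sum(a[i] * b[y] for i, y in itertools.product(range(8), range(16)))

# Factored form: 8 + 16 additions and a single multiplication.
factored = sum(a.values()) * sum(b.values())

assert naive == factored
```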

This wouldn't do us much good if it weren't for the following further observation. In each of the three expressions above, both bracketed quantities are multilinear in both $J$ and $S$.

Indeed, in the left-hand bracketed sum, each summand is multilinear in $J$ and $S$, since those variables appear only in $\widetilde{\textsf{shift-ind}}_\text{op}(i, J, S)$ and not in $\delta_D(r_{\widehat{i}}, \widehat{i})$; thus, the entire sum is also multilinear in $J$ and $S$. Similarly, in the right-hand bracketed sum, each summand is again multilinear in $J$ and $S$, since $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, S)$ depends on $S$ but not on $J$, while $\widetilde{w}(J, y)$ depends on $J$ but not on $S$; hence that entire sum is multilinear in $J$ and $S$ as well. We conclude that both bracketed sums are themselves multilinear in $J$ and $S$, and we can treat the entire expression above as a sum, over the cube $\mathcal{B}_6 \times \mathcal{B}_6$, of a multivariate polynomial in $J$ and $S$ expressible as the sum of just three products of multilinears (i.e., over varying $\text{op}$).

If this weren't true, then the above multivariate in $J$ and $S$ would not be expressible as a composition of multilinears in $J$ and $S$ with reasonable composition functions. Rather, the composition function would be enormous—with $3 \times 64$, $3 \times n_\text{words}$, or even $3 \times 64 \times n_\text{words}$ summands—as opposed to just $3$.

The First Phase

The above remarks suggest the following first step. Purely for notation, we introduce some abbreviations. For each $\text{op} \in \{\mathsf{sll}, \mathsf{srl}, \mathsf{sra}\}$, we define two 12-variate multilinears in $J$ and $S$:

\begin{equation*}h_\text{op}(J, S) \coloneqq \sum_{i \in \mathcal{B}_6} \delta_D(r_{\widehat{i}}, \widehat{i}) \cdot \widetilde{\textsf{shift-ind}}_\text{op}(i, J, S);\end{equation*}

and

\begin{equation*}g_\text{op}(J, S) \coloneqq \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, S) \cdot \widetilde{w}(J, y).\end{equation*}

Indeed, both multivariates above in $J$ and $S$ are multilinear (we just argued this). By definition, our claim now amounts to:

\begin{equation*}\alpha_z \stackrel{?}= \sum_{j \in \mathcal{B}_6} \sum_{s \in \mathcal{B}_6} \sum_\text{op} h_\text{op}(j, s) \cdot g_\text{op}(j, s).\end{equation*}

Thus, the prover can begin by tabulating the $2^6 \times 2^6$ tables of values of all six functions $h_\text{op}(J, S)$ and $g_\text{op}(J, S)$, i.e. for all three $\text{op}$. The total resulting data here is of size $2^6 \cdot 2^6 \cdot 6$ field elements, or 384 KiB total. In the next page, we talk about how the prover can prepare these tables of values efficiently, exploiting the sparsity of the constraint system.
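Restricted to boolean points $j, s \in \mathcal{B}_6$, the table of $g_\text{op}$ is literally a matrix product: $g_\text{op}[j][s] = \sum_y Z[y][s] \cdot w[j][y]$, where $Z$ denotes $\widetilde{Z}_{\text{cons},\text{op}}$ with its first slot specialized at $r'_x$. Here is a dense sketch of this tabulation over toy integer data (the real protocol works over a large binary field, and the next page's sparse method replaces this naive matmul):

```python
import random

N_J, N_S, N_Y = 64, 64, 256   # 2^6, 2^6, and a toy 2^{ell_words}
random.seed(0)

# Toy stand-ins for the specialized constraint table and the witness table.
Z = [[random.randrange(997) for _ in range(N_S)] for _ in range(N_Y)]  # Z[y][s]
w = [[random.randrange(997) for _ in range(N_Y)] for _ in range(N_J)]  # w[j][y]

# g[j][s] = sum_y Z[y][s] * w[j][y]  --  a (64 x N_Y) times (N_Y x 64) product.
g = [[sum(w[j][y] * Z[y][s] for y in range(N_Y)) for s in range(N_S)]
     for j in range(N_J)]
```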

Pending that problem, we can tell the parties to run the sumcheck above. In the case in which there are multiple constraint arrays at play, there will be one $g_\text{op}(J, S)$ for each array; at this point, the verifier will sample batching scalars for these.
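Batching by random scalars works in the standard way: the verifier samples one scalar per array, and the parties run a single sumcheck on the corresponding random linear combination of the claims. A toy sketch of the bookkeeping (hypothetical names; toy prime field standing in for the protocol's actual field):

```python
import random
random.seed(3)

P = 2**31 - 1   # toy prime modulus, standing in for the real field
N = 8           # toy cube size

# Three toy tables, one per constraint array, with their claimed sums.
tables = [[random.randrange(P) for _ in range(N)] for _ in range(3)]
claims = [sum(t) % P for t in tables]

# Verifier samples batching scalars; parties sumcheck the combined table
# against the combined claim.
lams = [random.randrange(P) for _ in tables]
batched = [sum(l * t[k] for l, t in zip(lams, tables)) % P for k in range(N)]
assert sum(batched) % P == sum(l * c for l, c in zip(lams, claims)) % P
```

If any individual claim were false, the batched claim would survive this combination only with probability on the order of $1/|\text{field}|$ over the choice of scalars.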

The sumcheck is 12-dimensional, over a sum of three products of multilinears (i.e., taken over $\text{op}$). After 12 rounds, the parties will have reduced the prover's claim to a further claim, say:

\begin{equation*}\beta \stackrel{?}= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot g_\text{op}(r_j, r_s),\end{equation*}

for 6-dimensional random points $r_j$ and $r_s$ sampled during the sumcheck.
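For concreteness, here is a minimal sketch of one round of this sumcheck for a single $\text{op}$ (the three-$\text{op}$ case just sums three such products in each round polynomial). All names and the toy prime field are illustrative, not the protocol's actual implementation:

```python
import random
random.seed(1)

P = 2**31 - 1   # toy prime field; the real protocol uses a binary tower field
n = 4           # toy number of variables (12 in the text)

# Toy tables for one op: values of two multilinears h, g on the n-cube,
# indexed so that the first variable is the high bit.
h = [random.randrange(P) for _ in range(2**n)]
g = [random.randrange(P) for _ in range(2**n)]

claim = sum(hv * gv for hv, gv in zip(h, g)) % P

def round_poly(h, g):
    """Evaluations at X = 0, 1, 2 of the degree-2 round polynomial
    r(X) = sum over the remaining cube of h(X, rest) * g(X, rest)."""
    half = len(h) // 2
    evals = []
    for x in (0, 1, 2):
        total = 0
        for k in range(half):
            hx = (h[k] + x * (h[half + k] - h[k])) % P   # multilinear in X
            gx = (g[k] + x * (g[half + k] - g[k])) % P
            total = (total + hx * gx) % P
        evals.append(total)
    return evals

def fold(t, r):
    """Specialize the first variable of a multilinear table at the challenge r."""
    half = len(t) // 2
    return [(t[k] + r * (t[half + k] - t[k])) % P for k in range(half)]

# One round: the verifier checks r(0) + r(1) == claim, then sends a challenge.
r0, r1, r2 = round_poly(h, g)
assert (r0 + r1) % P == claim
challenge = random.randrange(P)
h, g = fold(h, challenge), fold(g, challenge)
# The new claim is r(challenge), interpolated from (r0, r1, r2); after n
# rounds the tables have length 1 and the parties hold h(r) * g(r).
```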

The Second Phase

Rearranging the claim we left off with above, by unwinding what the evaluations $g_\text{op}(r_j, r_s)$ mean, we get the identical claim:

\begin{align*}\beta &\stackrel{?}= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot g_\text{op}(r_j, r_s) \\ &= \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \Bigg[ \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, r_s) \cdot \widetilde{w}(r_j, y) \Bigg] \\ &= \sum_{y \in \mathcal{B}_{\ell_\text{words}}} \left[ \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, y, r_s) \right] \cdot \widetilde{w}(r_j, y). \end{align*}
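The swap of the two sums in the last step is easy to check numerically: the scalars $h_\text{op}(r_j, r_s)$ can be folded into the specialized constraint tables once, collapsing the bracket into a single table over $y$. A toy sketch, with random integers standing in for the actual evaluations:

```python
import random
random.seed(2)

P = 2**31 - 1   # toy prime field standing in for the real one
N_Y = 16        # toy 2^{ell_words}
OPS = ("sll", "srl", "sra")

# Toy stand-ins: h_at[op] plays h_op(r_j, r_s); Z_at[op][y] plays
# Z_cons,op(r'_x, y, r_s); w_spec[y] plays w(r_j, y).
h_at = {op: random.randrange(P) for op in OPS}
Z_at = {op: [random.randrange(P) for _ in range(N_Y)] for op in OPS}
w_spec = [random.randrange(P) for _ in range(N_Y)]

# Collapse the bracket into one table m[y] = sum_op h_op * Z_op[y]; phase two
# is then an ordinary sumcheck of a product of two multilinears in Y.
m = [sum(h_at[op] * Z_at[op][y] for op in OPS) % P for y in range(N_Y)]
beta = sum(m[y] * w_spec[y] for y in range(N_Y)) % P

# Same value computed as in the first line of the display, without collapsing.
beta_direct = sum(h_at[op] * sum(Z_at[op][y] * w_spec[y] for y in range(N_Y))
                  for op in OPS) % P
assert beta == beta_direct
```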

We can now view the above expression as a further sumcheckable claim, now over the single $\ell_\text{words}$-dimensional variable $y$. The entire bracketed expression can be viewed as a single multilinear in $Y$; the partial specialization $\widetilde{w}(r_j, Y)$ is of course also multilinear in $Y$. Again in the next page, we go into how the prover can tabulate the necessary values efficiently.

For now, we instruct the parties to run an $\ell_\text{words}$-round sumcheck on the above. After doing that, they wind up with a final claim of the form:

\begin{equation*}\gamma \stackrel{?}= \left[ \sum_\text{op} h_\text{op}(r_j, r_s) \cdot \widetilde{Z}_{\text{cons}, \text{op}}(r'_x, r_y, r_s) \right] \cdot \widetilde{w}(r_j, r_y),\end{equation*}

where $r_y$ is a further value sampled during the sumcheck.

The End

We work under the assumption that the verifier can evaluate the entire bracketed quantity himself. We have already talked at great length about how the verifier can efficiently, locally, compute the evaluations $h_\text{op}(r_j, r_s)$. Our case is a bit different than what we previously covered, since the verifier will need to evaluate $\textsf{shift-ind}_\text{op}(i, r_j, r_s)$ for each $i \in \mathcal{B}_6$, as opposed to just at a single point. But the idea is similar. The verifier can carry out this computation in a way whose cost grows linearly in the size of $\mathcal{B}_6$, and with small constants. The concrete cost is very low.

It remains for the verifier to evaluate $\widetilde{Z}_{\text{cons}, \text{op}}(r'_x, r_y, r_s)$ for each $\text{op}$. At this point, we punt, and use a non-succinct approach. Indeed, there is a straightforward way for the verifier to do this that takes $O(n_\text{words}) + O(n_\text{and}) + O(\text{circuit size})$ work (we cover this technique in the next page). We plan to support fully succinct verification soon, possibly using the Spark compiler [Set20].

Finally, the verifier outputs the single evaluation claim on $\widetilde{w}(r_j, r_y)$. This is exactly the kind of thing that our PCS is designed to treat, so this completes our treatment of the shift reduction, at least from the verifier's point of view.