Proof Complexity Meets Algebra

We analyse how the standard reductions between constraint satisfaction problems affect their proof complexity. We show that, for the most studied propositional, algebraic, and semi-algebraic proof systems, the classical constructions of pp-interpretability, homomorphic equivalence and addition of constants to a core preserve the proof complexity of the CSP. As a result, for those proof systems, the classes of constraint languages for which small unsatisfiability certificates exist can be characterised algebraically. We illustrate our results by a gap theorem saying that a constraint language either has resolution refutations of constant width, or does not have bounded-depth Frege refutations of subexponential size. The former holds exactly for the widely studied class of constraint languages of bounded width. This class is also known to coincide with the class of languages with refutations of sublinear degree in Sums-of-Squares and Polynomial Calculus over the real-field, for which we provide alternative proofs. We then ask for the existence of a natural proof system with good behaviour with respect to reductions and simultaneously small size refutations beyond bounded width. We give an example of such a proof system by showing that bounded-degree Lovász-Schrijver satisfies both requirements. Finally, building on the known lower bounds, we demonstrate the applicability of the method of reducibilities and construct new explicit hard instances of the graph 3-coloring problem for all studied proof systems.


Introduction
The notion of an efficient reduction lies at the heart of computational complexity. However, in some of its subareas such as proof complexity, even though the concept exists, it is much less developed. The study of the lengths of proofs has developed mostly by studying combinatorial statements, each somewhat in isolation. There is little theory, for instance, explaining why the best studied families of propositional tautologies are encodings of the pigeonhole principle or those derived from systems of linear equations over the 2-element field. Whether there is any connection between the two is an even less explored mystery.
Luckily this fact is subject to revision, especially if proof complexity exports its methods to the study of problems beyond universal combinatorial statements. Consider the NP-hard optimization problem called MAX-CUT. The objective is to find a partition of the vertices of a given graph which maximizes the number of edges that cross the partition. The best efficient approximation algorithm known for this problem relies on certifying a bound on the optimum of its semidefinite programming relaxation. Once the certificate for the relaxation is in place, a rounding procedure gives an approximate integral solution: at worst 87% of the optimum in this case [27].
In the example of the previous paragraph, the problem that is subject to proof complexity analysis is that of certifying a bound on the optimum of an arbitrary MAX-CUT instance. The celebrated Unique Games Conjecture (UGC) can be understood as a successful approach to explaining why current algorithms and proof complexity analyses stop being successful where they do, and reductions play an important role there [49]. One of the interesting open problems in this area is whether the analysis of the Sums-of-Squares semidefinite programming hierarchy of proof systems (SOS) could be used to improve over the 87% approximation ratio for MAX-CUT. Any improvement on this would improve the approximation status of all problems that reduce to it, and refute the UGC [34]. For the constraint satisfaction problem, in which all constraints must be satisfied, as well as for its optimisation version, the analogue question was resolved recently also by exploiting the theory of reducibility: in that arena, low-degree SOS unsatisfiability proofs exist only for problems of bounded width [47,25].
The goal of this paper is to develop the standard theory of reductions between constraint satisfaction problems in a way that it applies to many of the proof systems from the literature, including but not limited to Sums-of-Squares. Doing this requires a good amount of tedious work, but at the same time has some surprises to offer that we discuss next.
Consider a constraint language B given by a finite domain of values, and relations over that domain. The instances of the constraint satisfaction problem (CSP) over B are given by a set of variables and a set of constraints, each of which binds some tuple of the variables to take values in one of the relations of B. The literature on CSPs has focussed on three different types of conditions that, if met by two constraint languages, give a reduction from the CSP of one language to the CSP of the other. These conditions are a) pp-interpretability, b) homomorphic equivalence, and c) addition of constants to the core (see [21,14]). What makes these three types of reductions important is that they correspond to classical algebraic constructions at the level of the algebras of polymorphisms of the constraint languages. Indeed, pp-interpretations correspond to taking homomorphic images, subalgebras and powers. The other two types of reductions put together ensure that the algebra of the constraint language is idempotent. Thus, for any fixed algorithm, heuristic, or method M for deciding the satisfiability of CSPs, if the class of constraint languages that are solvable by M is closed under these notions of reducibility, then this class admits a purely algebraic characterization in terms of identities.
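To make the definition of a CSP instance concrete, the following is a minimal Python sketch of a brute-force satisfiability check. The data representation (a dictionary from relation names to sets of tuples) and the names `is_satisfiable`, `NEQ` and `coloring_constraints` are illustrative assumptions and not notation from this paper.

```python
from itertools import product

def is_satisfiable(domain, relations, variables, constraints):
    """Brute-force check of a CSP instance.

    domain      -- finite list of values
    relations   -- dict: relation name -> set of tuples over the domain
    variables   -- list of variable names
    constraints -- list of (relation name, tuple of variables)
    """
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # Every constraint binds its tuple of variables to a tuple in the relation.
        if all(tuple(assignment[v] for v in scope) in relations[name]
               for name, scope in constraints):
            return True
    return False

# Graph 3-coloring as a CSP: one binary disequality relation on {0, 1, 2}.
DOMAIN = [0, 1, 2]
RELATIONS = {"NEQ": {(a, b) for a in DOMAIN for b in DOMAIN if a != b}}

def coloring_constraints(edges):
    """One NEQ constraint per edge of the graph."""
    return [("NEQ", (u, v)) for u, v in edges]

triangle = [("x", "y"), ("y", "z"), ("x", "z")]
k4 = [(u, v) for i, u in enumerate("abcd") for v in "abcd"[i + 1:]]
```

In this sketch the triangle is a satisfiable instance, while the complete graph on four vertices is an unsatisfiable one, which is the kind of instance for which a proof system is asked to produce a refutation.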
Our first result is that, for most proof systems P in the literature, each of these methods of reduction preserves the proof complexity of the problem with respect to proofs in P. Technically, what this means is that if B′ is obtained from B by a finite number of constructions a), b) and c), then, for any appropriate encoding scheme of the statement that an instance is unsatisfiable, efficient proofs of unsatisfiability in P for instances of B translate into efficient proofs of unsatisfiability in P for instances of B′. Our results hold for a very general definition of an appropriate encoding scheme that we call local. The propositional proof systems for which we prove these results include DNF-resolution with terms of bounded size, Bounded-Depth Frege, and (unrestricted) Frege. The algebraic and semi-algebraic proof systems for which we prove it include Polynomial Calculus (PC) over any field, Sherali-Adams (SA), Lasserre/SOS, and Lovász-Schrijver (LS) of bounded and unbounded degree. This is the object of Section 4.
Our second main result is an application: we obtain unconditional gap theorems for the proof complexity of CSPs. Building on the bounded-width theorem for CSPs [12,19], the known correspondence between local consistency algorithms, existential pebble games and bounded width resolution [35,7], the lower bounds for propositional, algebraic and semi-algebraic proof systems [1,37,16,17,28,22,23], and a modest amount of additional work to fill in the gaps, we prove the following strong gap theorem:
Theorem 1. Let B be a finite constraint language. Then, exactly one of the following holds:
1. B has resolution refutations of constant width,
2. B has neither bounded-depth Frege refutations of subexponential size, nor PC over the reals, nor SOS refutations of sublinear degree.
In Theorem 1 and below, the statement that the constraint language B has efficient proofs in proof system P means that, for some and hence every local encoding scheme, all unsatisfiable instances of B have efficient refutations in P. Also, here and below, sublinear means $o(n)$, sublinear-exponential means $2^{o(n)}$, and subexponential means $2^{n^{o(1)}}$, where n is the number of variables of the instance. The proof of Theorem 1 actually shows that case 1 happens precisely if B has bounded width. As noted earlier, the collapse of Lasserre/SOS to bounded width was already known; here we give a different proof. By a very recent result on the simulation of Polynomial Calculus over the real-field by Lasserre/SOS [18], the collapse of Lasserre/SOS implies the collapse of Polynomial Calculus. The proof we present does not depend on that. Instead we exploit directly the theory of reducibility.
As an immediate corollary we get that resolution is also captured by algebra, despite the fact that our methods fall short of proving that it is closed under reductions.
Corollary 1. Let B be a finite constraint language. The following are equivalent:
1. B has bounded width,
2. B has resolution refutations of constant width,
3. B has resolution refutations of sublinear width,
4. B has resolution refutations of polynomial size,
5. B has resolution refutations of sublinear-exponential size,
6. B has Frege refutations of bounded depth and polynomial size,
7. B has Frege refutations of bounded depth and subexponential size,
8. B has SA, SOS, and PC refutations over the reals of constant degree,
9. B has SA, SOS, and PC refutations over the reals of sublinear degree.
The proof of this is the object of Sections 5 and 6.
Section 7 is about proof systems that operate with polynomial inequalities and that are stronger than Lasserre/SOS. Theorem 1 raises the question of identifying a proof system that is closed under reducibilities and that can surpass bounded width. In other words: is there a natural proof system for which the class of languages that have efficient unsatisfiability proofs is closed under the standard reducibility methods for CSPs, and that at the same time has efficient unsatisfiability proofs beyond bounded width? By the bounded-width theorem for CSPs, one way, and indeed the only way, of surpassing bounded width is by having efficient proofs of unsatisfiability for systems of linear equations over some finite Abelian group. A straightforward answer to our question is thus the following: Polynomial Calculus over a field of non-zero characteristic p has efficient unsatisfiability proofs for systems of linear equations over $Z_p$. On the other hand, in view of the limitations of Polynomial Calculus over the real-field, and of certain semi-algebraic proof systems that are imposed by Theorem 1, it is perhaps a surprise that, as we show, bounded degree Lovász-Schrijver also satisfies both requirements. Proving this amounts to showing that Gaussian elimination over $Z_2$ can be simulated by reasoning with low-degree polynomial inequalities over R. The proof of this counter-intuitive fact relies on earlier work in proof complexity for reasoning about gaps of the type (−∞, c] ∪ [c + 1, +∞), for c ∈ Z, through quadratic polynomial inequalities [30].
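As a reminder of what Gaussian elimination over the 2-element field does when it refutes a system, here is a minimal Python sketch. It is plain elimination on bit vectors, not an LS or PC proof, and the function name `refute_mod2` and the bit-packing convention are assumptions made for this illustration only.

```python
def refute_mod2(equations, n):
    """Return True if the linear system over Z_2 is unsatisfiable.

    Each equation is a pair (coeffs, rhs), where coeffs is a list of n bits
    and rhs is a bit. The system is unsatisfiable exactly when the equation
    0 = 1 is a sum of some of the given equations.
    """
    # Pack each equation into an integer: bit 0 is rhs, bit i+1 is coeffs[i].
    rows = []
    for coeffs, rhs in equations:
        row = rhs & 1
        for i, c in enumerate(coeffs):
            row |= (c & 1) << (i + 1)
        rows.append(row)
    for col in range(1, n + 1):
        pivot = next((r for r in rows if (r >> col) & 1), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        # Adding the pivot (XOR) eliminates the variable from the other rows.
        rows = [r ^ pivot if (r >> col) & 1 else r for r in rows]
    # Every remaining row has no variables left; the value 1 encodes 0 = 1.
    return any(r == 1 for r in rows)

# x + y = 1, y + z = 1, x + z = 1 sums to 0 = 1 over Z_2.
odd_cycle = [([1, 1, 0], 1), ([0, 1, 1], 1), ([1, 0, 1], 1)]
# Changing the last right-hand side makes the system satisfiable.
consistent = [([1, 1, 0], 1), ([0, 1, 1], 1), ([1, 0, 1], 0)]
```

Each elimination step adds two equations modulo 2, and it is steps of this kind that a proof system going beyond bounded width has to be able to simulate.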
It should be pointed out that another proof system that can efficiently solve CSPs of bounded width, and that at the same time goes beyond bounded width, is the proof system that operates with ordered binary decision diagrams from [8]. Although it looks unlikely that our methods could be used for this proof system, whether it is closed under the standard CSP reductions is something that was not checked, neither in [8], nor here.
In Section 8 we demonstrate the applicability of our results. Consider the graph 3-coloring problem seen as the CSP of a finite constraint language on a 3-element domain in the standard way. Since it is known that 3-coloring has unbounded width, Corollary 1 applies to it, and we get 3-coloring instances that are hard for all indicated proof systems. We open the box of the method, and elaborate on that, in order to get explicit 3-coloring instances that are hard for Polynomial Calculus over all fields simultaneously. This gives a new proof of the main result in [40]. Indeed, the same analysis applies to all CSPs that are NP-complete and all proof systems that are closed under reducibilities. This way we solve Open Problem 5.3 in [40] that asks for explicit 3-coloring instances that are hard for Lasserre/SOS.
This article is an extended version of [10]. Besides providing full proof details, we generalise the main gap theorem to cover Polynomial Calculus over the reals and apply our results to the 3-coloring problem, as explained in the paragraph above.

Propositional logic and propositional proofs
Formulas. Fix a set of propositional variables taking values true or false. A literal is a variable X or the negation $\overline{X}$ of a variable X. We write propositional formulas out of literals using conjunctions ∧, disjunctions ∨, and parentheses, with the usual conventions on parentheses. Also we implicitly think of ∧ and ∨ as commutative, associative and idempotent. Thus the formula A ∧ A is viewed literally the same as A, the formula A ∧ B is viewed literally the same as B ∧ A, and the formula (A ∧ B) ∧ C is viewed literally the same as A ∧ (B ∧ C). The same applies to disjunctions. Negation is allowed only at the level of literals, so our formulas are written in negation normal form. If A is a formula, we define its complement $\overline{A}$ inductively: if A is a variable X, then $\overline{A} = \overline{X}$; if A is a negated variable $\overline{X}$, then $\overline{A} = X$; if A is a conjunction C ∧ D, then $\overline{A} = \overline{C} \lor \overline{D}$; if A is a disjunction C ∨ D, then $\overline{A} = \overline{C} \land \overline{D}$. The empty formula is denoted 0 and is always false by convention. Its complement $\overline{0}$ is denoted 1, and is always true by convention. We think of 0 and 1 as the neutral elements of ∨ and ∧, respectively, and the absorbing elements of ∧ and ∨, respectively. Thus we view the formulas 0 ∨ A and 1 ∧ A as literally the same as A, and 0 ∧ A and 1 ∨ A as literally the same as 0 and 1, respectively. The size s(A) of a formula A is defined inductively: if A is 0 or 1, then s(A) = 0; if A is a literal, then s(A) = 1; if A is a conjunction C ∧ D or a disjunction C ∨ D with non-absorbing and non-neutral C and D, then s(A) = s(C) + s(D) + 1.
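The inductive definitions of complement and size can be sketched in a few lines of Python. The tuple encoding below is an assumption made for illustration; in particular it uses binary connectives and does not identify formulas up to commutativity, associativity and idempotency as the text does.

```python
ZERO, ONE = ("const", 0), ("const", 1)

def lit(name, positive=True):
    """A variable (positive=True) or a negated variable (positive=False)."""
    return ("lit", name, positive)

def complement(a):
    """Complement of a formula in negation normal form (De Morgan duality)."""
    kind = a[0]
    if kind == "const":
        return ("const", 1 - a[1])
    if kind == "lit":
        return ("lit", a[1], not a[2])
    dual = "or" if kind == "and" else "and"
    return (dual, complement(a[1]), complement(a[2]))

def size(a):
    """s(0) = s(1) = 0, s(literal) = 1, s(C op D) = s(C) + s(D) + 1."""
    kind = a[0]
    if kind == "const":
        return 0
    if kind == "lit":
        return 1
    return size(a[1]) + size(a[2]) + 1

def evaluate(a, assignment):
    """Truth value of a formula under a dict from variable names to bools."""
    kind = a[0]
    if kind == "const":
        return bool(a[1])
    if kind == "lit":
        return assignment[a[1]] == a[2]
    left, right = evaluate(a[1], assignment), evaluate(a[2], assignment)
    return (left and right) if kind == "and" else (left or right)

# X and (Y or not Z)
example = ("and", lit("X"), ("or", lit("Y"), lit("Z", False)))
```

On `example`, the complement is the formula not X or (not Y and Z); it has the same size as `example` and takes the opposite truth value under every assignment.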
Propositional proof systems. We work with a Tait-style proof system for propositional logic that we call Frege. The system manipulates formulas in negation normal form and has the following four rules of inference called axiom, cut, introduction of conjunction, and weakening:
$$\frac{}{A \lor \overline{A}} \qquad \frac{C \lor A \quad D \lor \overline{A}}{C \lor D} \qquad \frac{C \lor A \quad D \lor B}{C \lor D \lor (A \land B)} \qquad \frac{C}{C \lor D} \qquad (1)$$
In these rules, C and D could be the empty formula 0 or its complement 1. In particular 1 is an instance of an axiom rule. A Frege proof is called cut-free if it does not use the cut rule. A Frege proof from a set of formulas F is a proof in which the formulas in F are allowed as additional axioms. In case such a proof ends with the empty formula we call it a Frege refutation of F. As a proof system, Frege is sound and implicationally complete, which means that if A is a logical consequence of $A_1, \ldots, A_m$, then there is a Frege proof of A from $A_1, \ldots, A_m$. We will give a proof of this in Section 2.2 that will apply also to certain subsystems of Frege. If C is a class of formulas, a C-Frege proof is one that has all its formulas in the class C. The size of a proof is the sum of the sizes of the formulas in it. The length of a proof is the number of formulas in it.
Resolution, k-DNF Frege and Bounded Depth Frege. A term is a conjunction of literals and a clause is a disjunction of literals. A k-term or a k-clause is one with at most k literals. A k-DNF is a disjunction of k-terms and a k-CNF is a conjunction of k-clauses.
We define the classes of $\Sigma_{t,k}$- and $\Pi_{t,k}$-formulas inductively. For t = 1, these are just the classes of k-DNF and k-CNF formulas, respectively. For t ≥ 2, a formula is $\Sigma_{t,k}$ if it is a disjunction of $\Pi_{t-1,k}$-formulas, and it is $\Pi_{t,k}$ if it is a conjunction of $\Sigma_{t-1,k}$-formulas.
In this paper, we use the expression Frege proof of depth t and bottom fan-in k to mean a $\Sigma_{t,k}$-Frege proof. Bounded-depth Frege means $\Sigma_{t,k}$-Frege for some fixed t and k. This coincides with other definitions in the literature. Frege of depth t and bottom fan-in k, as a proof system, is sound and implicationally complete for proving $\Sigma_{t,k}$-formulas from $\Sigma_{t,k}$-formulas. A proof of this will follow from the general completeness theorem below. $\Sigma_{1,1}$ is the class of clauses. It is well-known that $\Sigma_{1,1}$-Frege and resolution proofs are basically the same thing (the difference is that in $\Sigma_{1,1}$-Frege proofs we allow clause axioms and weakening, but these can always be removed at no cost). A resolution proof which uses only l-clauses is called a proof of width l. $\Sigma_{1,k}$-Frege, for k ≥ 2, is the system R(k) introduced by Krajíček [36], also known as Res(k), k-DNF resolution, and k-DNF Frege. This family of proof systems is important for us because, by letting k range over all constants (i.e., by considering R(const)), it is the weakest for which we can prove closure under reductions.
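For illustration, the cut rule restricted to clauses (the resolution rule) and the width measure can be sketched in Python as follows. The encoding of a clause as a frozenset of (variable, sign) pairs and the names `resolve` and `width` are assumptions of this sketch.

```python
def resolve(c1, c2, var):
    """Cut on var: from C or var, and D or not var, derive C or D."""
    pos, neg = (var, True), (var, False)
    if pos not in c1 or neg not in c2:
        raise ValueError(f"clauses do not clash on {var}")
    return (c1 - {pos}) | (c2 - {neg})

def width(proof):
    """Width of a proof: the largest number of literals in any of its clauses."""
    return max(len(clause) for clause in proof)

# Refutation of the clauses X, (not X or Y), (not Y).
c_x = frozenset({("X", True)})
c_xy = frozenset({("X", False), ("Y", True)})
c_y = frozenset({("Y", False)})
step1 = resolve(c_x, c_xy, "X")   # the clause Y
step2 = resolve(step1, c_y, "Y")  # the empty clause, i.e. the formula 0
proof = [c_x, c_xy, c_y, step1, step2]
```

Because clauses are sets, the idempotency of disjunction is built into the encoding; the refutation above uses only 2-clauses, so it is a proof of width 2.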

Completeness of Frege and its subsystems
The proof that Frege is implicationally complete is rather standard. We give a detailed proof nonetheless because we want to have concrete bounds.
Theorem 3 (Quantitative Completeness). Let C be a class of formulas that is closed under subformulas and complementation, and let C′ be the closure of C under disjunctions. Let $A_1, \ldots, A_m$ and A be formulas in C′. If $A_1, \ldots, A_m$ logically imply A, then there is a C′-Frege proof of A from $A_1, \ldots, A_m$. Moreover, if the formulas $A_1, \ldots, A_m$ and A have n variables and size at most s, then the size of the proof is at most polynomial in n, s, m, $2^n$ and $s^m$.
Proof. Throughout, for a truth assignment $b = (b_1, \ldots, b_n) \in \{0,1\}^n$, the symbol $S_b$ stands for the clause on the variables $X_1, \ldots, X_n$ that is falsified by b and by no other assignment. First we show that, for each formula B on the variables $X_1, \ldots, X_n$ and each truth assignment $b \in \{0,1\}^n$, if b satisfies B, then there is a cut-free proof of $S_b \lor B$ from no assumptions. This is proved by induction on the size of B. If B is a literal, say $B = X_i$ or $B = \overline{X}_i$, then $S_b \lor B$ is obtained as a weakening of the axiom $X_i \lor \overline{X}_i$. If B is a conjunction, say B = C ∧ D, then b satisfies both C and D, and by induction hypothesis there are cut-free proofs of $S_b \lor C$ and $S_b \lor D$. A cut-free proof of $S_b \lor B$ then follows from applying introduction of conjunction. If B is a disjunction, say B = C ∨ D, then b satisfies either C or D, and by induction hypothesis there is a cut-free proof of either $S_b \lor C$ or $S_b \lor D$. A cut-free proof of $S_b \lor B$ then follows from applying weakening. Note that the length of the proof constructed this way is bounded by s(B), and since all the formulas in the proof have sizes bounded by n + s(B), the size of the proof is bounded by (n + s(B))s(B). Note also for later use that, as a consequence of the assumption that C is closed under subformulas, the following holds: if B is a disjunction of formulas in C, say $B = \bigvee_i B_i$, then each formula in this proof is a disjunction of formulas in C, and if B is a conjunction of formulas in C, say $B = \bigwedge_i B_i$, then the construction gives a cut-free proof of $S_b \lor B_i$ for each $B_i$, and each formula in the proof of $S_b \lor B_i$ is again a disjunction of formulas in C.
Now we assume that A is a logical consequence of $A_1, \ldots, A_m$ and we build a proof of A from $A_1, \ldots, A_m$. This proof will not yet be guaranteed to have all its formulas in C′. We will deal with this issue later. For each truth assignment $b \in \{0,1\}^n$, the following hold: 1) if b satisfies A, then the previous paragraph gives a proof of $S_b \lor A$, and 2) if b falsifies A, then it also falsifies $A_j$ for some j ∈ [m], and the previous paragraph gives a proof of $S_b \lor \overline{A_j}$. From these $2^n$ proofs, a sequence of $2^n - 1$ cuts followed by one weakening gives a proof of $A \lor \overline{A_1} \lor \cdots \lor \overline{A_m}$. From there a sequence of m cuts with the m hypotheses $A_1, \ldots, A_m$ gives a proof of A. Finally we argue how to turn this proof into one that uses only formulas in C′. For the proofs of the type $S_b \lor A$ there is no issue because A is a disjunction of formulas in C and the previous paragraph argues that such proofs have all their formulas in C′. The problem comes from the proofs of the type $S_b \lor \overline{A_j}$. However, since each $A_j$ is a disjunction of formulas in C, say $A_j = \bigvee_{k \in I_j} A_{jk}$, its negation $\overline{A_j}$ is a conjunction of formulas in C, because C is closed under complementation. This means that instead of using the proof of $S_b \lor \overline{A_j}$, we could have used the proof of $S_b \lor \overline{A_{jk}}$ for each $k \in I_j$. We do this for each choice of $(k_1, \ldots, k_m) \in I_1 \times \cdots \times I_m$, and what we get are proofs of $A \lor \overline{A_{1k_1}} \lor \cdots \lor \overline{A_{mk_m}}$. These proofs now have all their formulas in C′. Combining these at most $s^m$ many proofs with the hypotheses $A_1, \ldots, A_m$ in a sequence of at most $s^m$ cuts, we get a proof of A from $A_1, \ldots, A_m$, and all the formulas in this proof are in C′. The size is polynomial in n, s, m, $2^n$ and $s^m$, and the proof is complete.
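The key fact used in the first step of the proof can be checked by brute force on small examples. The sketch below reads $S_b$ as the clause that is falsified exactly by the assignment b, which is the reading under which the statement holds; that reading, the encoding of clauses as lists of (index, sign) pairs, and all function names are assumptions of this illustration.

```python
from itertools import product

def s_b(b):
    """Clause falsified exactly by b: the literal on X_i is negative iff b[i] == 1."""
    return [(i, bit == 0) for i, bit in enumerate(b)]

def clause_true(clause, a):
    """A clause holds under a if some literal agrees with a."""
    return any((a[i] == 1) == positive for i, positive in clause)

def s_b_or_formula_is_valid(b, formula, n):
    """Check that S_b or formula holds under every assignment to n variables."""
    clause = s_b(b)
    return all(clause_true(clause, a) or formula(a)
               for a in product((0, 1), repeat=n))

def example(a):
    """The formula X_1 and not X_2, on assignments a = (a_1, a_2)."""
    return a[0] == 1 and a[1] == 0
```

For the sample formula, the disjunction with $S_b$ is valid exactly for those b that satisfy the formula, which is why the $2^n$ derived formulas can be cut together on the variables of the clauses $S_b$.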
The quantitative completeness theorem applies to $\Sigma_{1,k}$-Frege (k-DNF Frege and resolution) because if C is the class of k-terms and k-clauses, then C is closed under subformulas and complementation, and the closure of C under disjunctions is the class of k-DNFs. It also applies to $\Sigma_{t,k}$-Frege, for t ≥ 2, because the class $\Sigma_{t-1,k} \cup \Pi_{t-1,k}$ is closed under subformulas and complementation, and its closure under disjunctions is precisely $\Sigma_{t,k}$.

Polynomials and algebraic proofs
Polynomials. We define everything for the real field R for simplicity. For algebraic proofs the field would not matter, but for semi-algebraic proofs we need an ordered field such as R. Let $X_1, \ldots, X_n$ be n algebraic commuting variables ranging over R. We want to define proof systems that manipulate equations of the form P = 0 and inequalities of the form P ≥ 0, where P is a polynomial in $R[X_1, \ldots, X_n]$, the ring of polynomials with commuting variables $X_1, \ldots, X_n$ and coefficients in R. For our purposes it will suffice to assume that the variables range over {0, 1}. Accordingly, it will also be convenient to introduce twin variables $\overline{X}_1, \ldots, \overline{X}_n$ with the intended meaning that $\overline{X}_i = 1 - X_i$ for i = 1, ..., n. In all proof systems of this section, the following axioms will be imposed on the variables:
$$X_i^2 - X_i = 0, \qquad \overline{X}_i^2 - \overline{X}_i = 0, \qquad X_i + \overline{X}_i - 1 = 0. \qquad (2)$$
Observe that $X_i \overline{X}_i = 0$ follows from these axioms: multiply $X_i + \overline{X}_i - 1 = 0$ by $X_i$ and subtract $X_i^2 - X_i = 0$. This sort of reasoning is captured by the proof systems we are about to define.
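Spelled out step by step, the derivation of $X_i \overline{X}_i = 0$ described above reads as follows (a worked version of the same two steps, with nothing new assumed):

```latex
\begin{align*}
X_i \cdot (X_i + \overline{X}_i - 1) &= X_i^2 + X_i \overline{X}_i - X_i = 0
  && \text{(axiom multiplied by } X_i\text{)} \\
(X_i^2 + X_i \overline{X}_i - X_i) - (X_i^2 - X_i) &= X_i \overline{X}_i = 0
  && \text{(subtracting the axiom } X_i^2 - X_i = 0\text{)}
\end{align*}
```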
Algebraic and semi-algebraic proof systems. Let P and Q denote polynomials. In addition to the axioms in (2), consider the following inference rules called addition and multiplication:
$$\frac{P = 0 \quad Q = 0}{P + Q = 0} \qquad \frac{P = 0}{PQ = 0} \qquad (3)$$
Clearly, these rules are sound: any assignment $f : \{X_1, \ldots, X_n, \overline{X}_1, \ldots, \overline{X}_n\} \to R$ that satisfies the equations in the premises, also satisfies the equation in the conclusions. For semi-algebraic proofs we add the following axioms:
$$X_i \geq 0, \qquad \overline{X}_i \geq 0, \qquad 1 - X_i \geq 0, \qquad 1 - \overline{X}_i \geq 0, \qquad 1 \geq 0, \qquad (4)$$
and the following inference rules for polynomial inequalities:
$$\frac{P \geq 0 \quad Q \geq 0}{P + Q \geq 0} \qquad \frac{P \geq 0 \quad Q \geq 0}{PQ \geq 0} \qquad \frac{}{P^2 \geq 0} \qquad (5)$$
These rules are called addition, multiplication and positivity of squares and are also sound for assignments $f : \{X_1, \ldots, X_n, \overline{X}_1, \ldots, \overline{X}_n\} \to R$. One could also consider additional rules that link equalities with inequalities, such as deriving P ≥ 0 from P = 0, or deriving P = 0 from P ≥ 0 and −P ≥ 0, but if we think of an equality as two inequalities, then they are not strictly necessary. On the other hand, some of the axioms are redundant, such as 1 ≥ 0 which can be obtained from adding $X_i \geq 0$ and $1 - X_i \geq 0$, but for the sake of clarity in writing proofs we prefer to keep them.
If H denotes a system of polynomial equations $P_1 = 0, \ldots, P_r = 0$ and P = 0 is a further equation, an algebraic proof of P = 0 from H is a sequence of polynomial equations ending with P = 0 where each equation in the proof is either a hypothesis equation from H, or an axiom equation as in (2), or follows from previous equations in the sequence by one of the inference rules in (3). If H in addition includes a system of polynomial inequalities $Q_1 \geq 0, \ldots, Q_s \geq 0$, then a semi-algebraic proof of Q ≥ 0 from H is defined analogously except that we think of each equation as two inequalities, we use additionally the axioms in (4), and we use additionally the rules in (5). Note that by writing $Q = Q^+ - Q^-$, where $Q^+$ and $Q^-$ have only positive coefficients, the rules in (3) are actually easily simulated by the rules in (5) (for the multiplication rule, this uses also the axioms in (4)).
If an algebraic proof ends with the equation 1 = 0, or similarly if a semi-algebraic proof ends with the inequality −1 ≥ 0, we call it a refutation of H.
As proof systems for deriving new polynomial equations or inequalities that follow from old ones on all evaluations of their variables in {0, 1}, both systems are sound and implicationally complete (we note, however, that without some restrictions on the domain of evaluation, such as {0, 1} in our case, the completeness claim is not true). In Section 2.4 below we will prove implicational completeness for two subsystems of algebraic and semi-algebraic proofs, and hence for algebraic and semi-algebraic proofs themselves.
The main complexity measures for algebraic and semi-algebraic proofs are size and degree. Size is measured by the number of symbols it takes to write the representations of the polynomials in the proofs, and degree is the maximum of the total degrees of the polynomials in the proofs. Polynomials are typically represented as explicit sums of monomials, or as algebraic formulas or circuits. Using formulas or circuits as representations requires some additional technicalities in the definitions of the rules, that we want to avoid (see [42,29]). For all our examples below, we use the representation of an explicit sum of monomials.
Some proof systems from the literature. The proofs in the Polynomial Calculus (PC) are algebraic proofs restricted in such a way that the polynomial Q in the multiplication rule in (3) is either a scalar or a variable [24]. In the literature, this has been called PCR for PC with resolution (see [2]), due to the presence of twin variables, but in recent works the shorter original name PC is used. As pointed out earlier, algebraic proofs can be defined over arbitrary scalar-fields F beyond the real-field R. A claim about algebraic proofs in which the field is omitted is meant to hold for all fields simultaneously. Whenever we need to specify the field F , we speak of algebraic and PC proofs over F .
The proofs in the Lovász-Schrijver (LS) proof system are semi-algebraic proofs for which the following restrictions apply: 1) the polynomial Q in the multiplication rule in (5) is either a positive scalar or a variable, and 2) the positivity-of-squares rule in (5) is not allowed. When the positivity-of-squares rule is also allowed, the system is called Positive Semidefinite Lovász-Schrijver and is denoted $LS^+$. Originally the Lovász-Schrijver proof system was defined to manipulate quadratic polynomials only (see [41,43]). We follow [30] and consider the extension to arbitrary degree. For the original Lovász-Schrijver proof systems we use $LS_2$ and $LS^+_2$. Degree-d Lovász-Schrijver and degree-d Positive Semidefinite Lovász-Schrijver are denoted $LS_d$ and $LS^+_d$, respectively. For LS and $LS^+$ proofs, an important complexity measure originally studied by Lovász and Schrijver is their rank, which is the maximum nesting depth of multiplication by a variable in the proof. Note that, due to possible cancellations, the degree of an LS proof could in principle be much smaller than its rank.
We define four additional proof systems called Nullstellensatz (NS), Sherali-Adams (SA), Positive Semidefinite Sherali-Adams ($SA^+$), and Lasserre/Sums-of-Squares (SOS). For NS, SA and $SA^+$, we define them as the subsystems of PC, LS and $LS^+$, respectively, in which all applications of the multiplication rule must precede all applications of the addition rule. Due to the structural restriction in which multiplications precede additions, we can think of a proof from a set H of hypotheses as a static polynomial identity of the form
$$\sum_{i=1}^{r} c_i \prod_{k \in I_i} X_k \prod_{k \in J_i} \overline{X}_k \cdot P_i = P, \qquad (6)$$
where $I_i, J_i \subseteq [n]$, and $P_1, \ldots, P_r$ are polynomials that either come from the set H of hypotheses, or they are axiom polynomials from the lists in (2) and (4) as appropriate (i.e., from (2) for NS, and from both (2) and (4) for SA and $SA^+$), or are squares of polynomials when they are allowed (i.e., for $SA^+$), and $c_1, \ldots, c_r$ are scalars of the appropriate type (i.e., arbitrary when the $P_i$ they multiply comes from an equation, or positive when the $P_i$ they multiply comes from an inequality). Finally we define the Lasserre/Sums-of-Squares proof system as the subsystem of semi-algebraic proofs to which the following restrictions apply: 1) the polynomial Q is arbitrary in the multiplication rule in (3) and it is a square polynomial in the multiplication rule in (5), and 2) all multiplications precede all additions. Thus, in terms of static identities, these are proofs of the form
$$\sum_{i=1}^{r} S_i P_i = P, \qquad (7)$$
where $P_1, \ldots, P_r$ are polynomials that either come from the set H of hypotheses, or they are axiom polynomials from the lists (2) and (4), or they are squares, and $S_1, \ldots, S_r$ are arbitrary polynomials or square polynomials as appropriate (i.e., arbitrary if the $P_i$ they multiply comes from an equation, and squares if the $P_i$ they multiply comes from an inequality). Note that the size of an NS, SA, $SA^+$ or SOS proof is polynomially related to the sum of the sizes of the non-zero $c_i$'s and $S_i$'s in the corresponding static identities (6) and (7).
Non-static proofs are sometimes called dynamic [30]. We will avoid using this term here. We close this section by noting the relationships between these proof systems. Clearly, every NS proof of degree d is also a PC proof of degree d. The converse is certainly not true, but what is true is that every PC proof of degree d and rank k can be converted into an NS proof of degree d + k, where the rank of a PC proof is the analogue of the rank measure for LS proofs that we defined earlier. The same relationships hold between SA and LS, and $SA^+$ and $LS^+$. In all three cases, the conversions go by swapping the order in which the addition and the multiplication rules are applied, when they appear in the wrong order. Also, every NS proof over the reals is an SA proof, which is an $SA^+$ proof. Finally, thanks to the axioms (2), each $SA^+$ proof can be easily converted to an SOS proof of twice the degree: replace each multiplication by a variable X by a multiplication by $X^2$, and subtract the appropriate multiple of the axiom $X^2 - X = 0$ to effectively simulate the multiplication by X. See [39] for a related discussion.
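The simulation of a multiplication by a variable X through a multiplication by the square $X^2$ rests on a one-line identity, written out here for a single multiplication step applied to a polynomial P:

```latex
\[
  X \cdot P \;=\; X^2 \cdot P \;-\; (X^2 - X) \cdot P .
\]
```

The first term on the right multiplies P by a square, and the second is a polynomial multiple of the axiom $X^2 - X = 0$ from (2), so both are of the kind that a static SOS identity can contain.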
Discussion on variants of NS, SA, $SA^+$ and SOS. The polynomial identity interpretations of NS, SA, $SA^+$ and SOS, cf. (6) and (7), are closely related to the original definitions by Beame et al. [15] for NS, and the settings of Sherali and Adams [46] and Lasserre [38] for SA and SOS, respectively. In most incarnations of these proof systems the twin variables are not present; in some others they are (e.g., [9]). If we care only about degree, the presence of twin variables makes no difference at all for Nullstellensatz since we can always simulate a multiplication by $\overline{X}_i$ by subtracting a multiplication by $X_i$. Note, however, that this blows up the size exponentially in the degree. In order to make sense of Sherali-Adams without twin variables, we need to extend the definition to allow Q in the multiplication rule to be, besides a positive scalar or a variable $X_i$, a linear polynomial of the form $1 - X_i$. The static form of such a proof is an identity such as
$$\sum_{i=1}^{r} c_i \prod_{k \in I_i} X_k \prod_{k \in J_i} (1 - X_k) \cdot P'_i = P', \qquad (8)$$
where $P'_1, \ldots, P'_r$ and $P'$ are polynomials as in (6), but without twin variables. If $P'_1, \ldots, P'_r$ and $P'$ denote the polynomials over $X_1, \ldots, X_n$ that result from the polynomials $P_1, \ldots, P_r$ and P over $X_1, \ldots, X_n, \overline{X}_1, \ldots, \overline{X}_n$ when each twin variable $\overline{X}_i$ is replaced by $1 - X_i$, then any valid proof with twin variables as in (6) transforms into a valid proof without twin variables as in (8). Thus, if we care only about degree, the versions of Sherali-Adams and Positive Semidefinite Sherali-Adams without twin variables simulate the versions with twin variables, for polynomials without twin variables. As for Nullstellensatz, the size could blow up exponentially in the degree. The same facts are true for Sums-of-Squares. Two further comments are in order.
First, for Nullstellensatz, one could consider an alternative definition in which proofs are polynomial identities of the form $\sum_i P_i \cdot R_i = P$, where the $P_i$ are hypotheses or axiom polynomials, and the $R_i$ are arbitrary polynomials. However this difference is minor since we can always write each $R_i$ as a combination of monomials $\sum_j c_{ij} M_{ij}$ and split $P_i \cdot R_i$ into $\sum_j P_i \cdot c_{ij} M_{ij}$. Second, one could consider the version of Sums-of-Squares in which in addition to squares $S_i$ as in (7), one is also allowed multiplication by variables. As noted earlier, such multiplications by a variable X can be simulated by multiplications by their squares $X^2$, thanks to the axioms $X^2 - X = 0$ from (2), at the cost of at most doubling the degree, and blowing up the size at most polynomially.

Completeness of Nullstellensatz and Sherali-Adams
In this section we prove the implicational completeness of Nullstellensatz and Sherali-Adams with quantitative bounds. We start with two technical lemmas that will be used to justify the elimination of twin variables. The second technical lemma that we need formalizes the elimination of twin variables.
Lemma 2. For every polynomial P of degree d, every scalar c and every two subsets J and K of [n], with |J| + |K| = ℓ, there are NS and SA proofs of the equation of degree d + ℓ and size polynomial in $2^{\ell}$ and the size of c and P.
Proof. Assume without loss of generality that for each j ∈ [t]. Lemma 1 gives proofs of (1 − X j −X j )T j = 0 for every j ∈ [t]. Adding them all together gives R t − R 0 = 0 by (10) and we are done.
We will need the following definitions. For every assignment $a : \{X_1, \ldots, X_n\} \to \{0, 1\}$, define its indicator polynomial
\[ I_a := \prod_{i \,:\, a(X_i) = 1} X_i \cdot \prod_{i \,:\, a(X_i) = 0} (1 - X_i). \]
For every polynomial $P$, let $P(a)$ denote the evaluation of $P$ when $X_i$ is assigned $a(X_i)$. For a polynomial $P$ on the variables $X_1, \ldots, X_n$, its multilinearization is the unique multilinear polynomial that agrees with $P$ on all assignments of values in $\{0, 1\}$ to its variables. The uniqueness of the multilinearization follows from the fact that the collection of multilinear polynomials in $\mathbb{R}[X_1, \ldots, X_n]$ forms a vector space of dimension $2^n$ for which the multilinear monomials make a basis. Note that this holds for any field, not just $\mathbb{R}$.
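The multilinearization can be computed mechanically. The following minimal sketch assumes a polynomial is stored as a dictionary from exponent tuples to coefficients (our own representation, not one fixed by the text): it caps every exponent at one and checks agreement on all 0/1 assignments.

```python
from itertools import product

def multilinearize(poly):
    """poly maps exponent tuples to coefficients; returns its multilinearization."""
    out = {}
    for exps, coeff in poly.items():
        # On 0/1 points X^e equals X for every e >= 1, so cap each exponent at 1.
        capped = tuple(min(e, 1) for e in exps)
        out[capped] = out.get(capped, 0) + coeff
    return {m: c for m, c in out.items() if c != 0}

def evaluate(poly, point):
    """Evaluate poly at a tuple of values, one value per variable."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

# Hypothetical example: P = 3*X1^2*X2 - X1*X2 + 2*X2^3 over the variables (X1, X2).
P = {(2, 1): 3, (1, 1): -1, (0, 3): 2}
P_star = multilinearize(P)
agree = all(evaluate(P, a) == evaluate(P_star, a)
            for a in product((0, 1), repeat=2))
```

Here `P_star` collects the two monomials in $X_1 X_2$ into a single one and turns $X_2^3$ into $X_2$.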
Lemma 3. For every polynomial $P$ on the variables $X_1, \ldots, X_n$, there are polynomials $Q_1, \ldots, Q_n$ such that the following identity holds:
\[ P + \sum_{i=1}^{n} Q_i \cdot (X_i^2 - X_i) = P^*, \tag{12} \]
where $P^*$ denotes the multilinearization of $P$. Moreover, each $Q_i$ has size polynomial in the size of $P$.
Proof. Observe that it is enough to prove the lemma for the special case of monomials. Indeed, if $P$ is an arbitrary polynomial, we get the identity (12) by splitting $P$ into a sum of monomials, applying the lemma to each monomial, and adding up the obtained identities. Let $P$ be a monomial. We proceed by induction on the sum of the individual degrees of the variables. If all variables have individual degree at most one, then $P$ is already multilinear and there is nothing to prove. Otherwise, some variable must have individual degree at least two. Say this variable is $X_j$ and let $P'$ and $P''$ be such that $P = X_j P'$ and $P' = X_j P''$. Note that the multilinearizations of $P$ and $P'$ are the same, and in both $P'$ and $P''$ the sum of the individual degrees is strictly smaller. The induction hypothesis applied to $P'$ gives polynomials $Q'_1, \ldots, Q'_n$ such that
\[ P' + \sum_{i=1}^{n} Q'_i \cdot (X_i^2 - X_i) = P^*. \tag{13} \]
Now the identity we want is obtained by defining $Q_i = Q'_i$ for $i \neq j$, and $Q_j = Q'_j - P''$. Indeed:
\[ P + \sum_{i=1}^{n} Q_i \cdot (X_i^2 - X_i) = P - P'' \cdot (X_j^2 - X_j) + \sum_{i=1}^{n} Q'_i \cdot (X_i^2 - X_i) = P' + \sum_{i=1}^{n} Q'_i \cdot (X_i^2 - X_i), \]
and we already proved in (13) that this last thing is $P^*$.

Proof. Both proofs are essentially the same; first we give the proof for Sherali-Adams and then indicate how to adapt it to Nullstellensatz. We prove the theorem when $P$ is multilinear and then we adapt it to the general case. Assume … Observe that in all cases $c_{a,i}$ is non-negative. In the first case because $P(a)$ was non-negative, and in the second case because both $P_{i^*}(a)$ and $P(a)$ were negative, so their ratio is positive. The choice of these reals guarantees that, for every assignment $a$,
\[ c_{a,0} + \sum_{i=1}^{m} c_{a,i} \cdot P_i(a) = P(a). \tag{17} \]
We need the following claim.
Claim 1. For every assignment $a$ and every $i \in [m]$, the polynomial $P_i(a) \cdot I_a$ is the multilinearization of $P_i \cdot I_a$. In addition,
\[ \sum_{a} \Big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot P_i(a) \cdot I_a \Big) = P. \]

Proof. Since the multilinearization is unique and the polynomial $P_i(a) \cdot I_a$ is multilinear, it suffices to show that $P_i(a) \cdot I_a$ and $P_i \cdot I_a$ agree on all assignments of values in $\{0, 1\}$ to their variables. But this is easy: they both evaluate to $P_i(a)$, or both evaluate to 0, depending on whether the assignment is $a$, or different from $a$, respectively. For the second claim we use the same argument, and add the additional fact that $P$ is itself multilinear: the big sum over $a$ is a multilinear polynomial and, by (17), it agrees with $P$ on all assignments of values in $\{0, 1\}$ to its variables. Hence, by the uniqueness of the multilinearization, and since $P$ is multilinear, it is $P$ itself.
Back to the proof, by the first part of Claim 1, for every assignment $a$ and every $i \in [m]$, there exist polynomials $Q^1_{a,i}, \ldots, Q^n_{a,i}$ according to Lemma 3 that make the following identities hold:
\[ P_i \cdot I_a + \sum_{j=1}^{n} Q^j_{a,i} \cdot (X_j^2 - X_j) = P_i(a) \cdot I_a. \tag{18} \]
We are ready to build up the proof of $P \geq 0$ from $P_1 \geq 0, \ldots, P_m \geq 0$. We claim that the following identity holds:
\[ \sum_{a} \Big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot \Big( P_i \cdot I_a + \sum_{j=1}^{n} Q^j_{a,i} \cdot (X_j^2 - X_j) \Big) \Big) = P. \tag{19} \]
First we claim that the left-hand side can be converted into a valid SA proof (with multiplications by $X_j$'s and $1 - X_j$'s, which can be simulated in our definition of Sherali-Adams as discussed in Lemma 2). To see this, just reorder the terms and apply Lemma 1 to replace $Q^j_{a,i} \cdot (X_j^2 - X_j)$ by proper SA proofs. It remains to see that the identity (19) holds; this will show that it is an SA proof of $P \geq 0$ from $P_1 \geq 0, \ldots, P_m \geq 0$.
In order to see that (19) holds, first use equation (18) to rewrite its left-hand side as
\[ \sum_{a} \Big( c_{a,0} \cdot I_a + \sum_{i=1}^{m} c_{a,i} \cdot P_i(a) \cdot I_a \Big). \]
Now use the second part of Claim 1 to complete the proof when $P$ is multilinear. When $P$ is not multilinear, it suffices to apply the above argument to get its multilinearization $P^*$, and then apply the reverse identity in Lemma 3. Indeed,
\[ P = P^* - \sum_{i=1}^{n} Q_i \cdot (X_i^2 - X_i). \]
To turn this into a proper SA proof we need to use Lemma 1 again.
For Nullstellensatz, the argument is the same except that, in order to handle arbitrary fields besides the real field R, the coefficients $c_{a,i}$ need to be redefined. Let $H = \{P_1 = 0, \ldots, P_m = 0\}$. If $P_i(a) = 0$, define $c_{a,i} = 0$ for all $i \in [m]$. Otherwise, let $i^*$ be the smallest element in $[m]$ such that $P_{i^*}(a) \neq 0$, which must exist by hypothesis, and define $c_{a,i} = P(a)/P_i(a)$ for $i = i^*$ and $c_{a,i} = 0$ for $i \in ([m] \cup \{0\}) \setminus \{i^*\}$. This choice is well-defined over any field and guarantees (17). The rest of the proof is the same.

Constraint satisfaction problem
There are many equivalent definitions of the constraint satisfaction problem. Here we use the definition in terms of homomorphisms. Below we introduce the necessary terminology. A concrete example will be developed in Section 8 where we apply the method of reducibilities to the graph k-coloring problem for k ≥ 3.

CSPs and homomorphisms.
A relational vocabulary L is a set of symbols; each symbol has an associated natural number called its arity. A relational structure B over L (or an L-structure) is a set B, called the domain, together with a set of relations over B. For each natural number r and each relation symbol R ∈ L of arity r, there is a relation in B of arity r denoted R(B), i.e., R(B) ⊆ B^r. Sometimes we call it the interpretation of R in B. We say that a relational structure is finite if its domain is finite and it has finitely many non-empty relations.
Let B and B ′ be L-structures, for some relational vocabulary L. A homomorphism from B to B ′ is a function h : B → B ′ , which preserves all the relations, that is, for every natural number r and each relation symbol R ∈ L of arity r, if (b 1 , . . . , b r ) ∈ R(B), then (h(b 1 ), . . . , h(b r )) ∈ R(B ′ ).
For a fixed L-structure B, the constraint satisfaction problem of B, denoted CSP(B), is the following computational problem: given a finite L-structure A, decide whether there exists a homomorphism from A to B. If the answer is positive we call the instance A satisfiable; otherwise we call it unsatisfiable. The size of an instance A is the number of elements in its domain plus the number of tuples in all its relations. Note that if the vocabulary L is fixed and finite, then the size of A is polynomial in the number of elements of its domain, which we denote by |A|. In the context of CSP the structure B is often called a constraint language or a template. We usually assume that the constraint language B is finite.
Bounded-width. The existential k-pebble game is played on two relational structures A and B over the same vocabulary by two players called Spoiler and Duplicator. The players are given two corresponding sets of pebbles {a_1, ..., a_k} and {b_1, ..., b_k}. In each round Spoiler picks one of the k pebbles a_1, ..., a_k, say a_i, and puts it on an element of the structure A. Duplicator responds by picking the corresponding pebble b_i and placing it on some element of the structure B. For simplicity, in any given configuration of the game let us identify a pebble with the element of the structure that it is placed on. Spoiler wins if at any point during the game the partial function f : A → B defined by f(a_i) = b_i, for each pebbled element a_i of A, is either not well defined (because there exist indices i, j ∈ [k] of two pebbled elements such that a_i = a_j but b_i ≠ b_j), or is not a partial homomorphism. Otherwise, Duplicator wins.
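To make the definition concrete, here is a minimal brute-force sketch of CSP(B) for graph 3-colouring. The dictionary representation of structures and the names `homomorphisms`, `K3`, `C5` and `K4` are our own illustrative choices; the search is exponential and only meant for tiny instances.

```python
from itertools import product

def homomorphisms(A_dom, A_rels, B_dom, B_rels):
    """Yield every homomorphism from A to B as a dict.

    Relations are dicts from relation names to sets of tuples; both
    structures are assumed to use the same relation names.
    """
    A_dom = list(A_dom)
    for images in product(list(B_dom), repeat=len(A_dom)):
        h = dict(zip(A_dom, images))
        # h is a homomorphism if it maps every tuple of A into the same relation of B.
        if all(tuple(h[x] for x in tup) in B_rels[name]
               for name, tuples in A_rels.items() for tup in tuples):
            yield h

def sym(edges):
    """Close an edge list under reversal, giving a symmetric binary relation."""
    return {(u, v) for u, v in edges} | {(v, u) for u, v in edges}

# Template B: the complete graph K3, so homomorphisms are proper 3-colourings.
K3 = {"E": {(b, c) for b in range(3) for c in range(3) if b != c}}
# Two instances: the 5-cycle (satisfiable) and K4 (unsatisfiable).
C5 = {"E": sym([(i, (i + 1) % 5) for i in range(5)])}
K4 = {"E": {(u, v) for u in range(4) for v in range(4) if u != v}}

c5_colourings = list(homomorphisms(range(5), C5, range(3), K3))
k4_colourings = list(homomorphisms(range(4), K4, range(3), K3))
```

The instance `C5` is satisfiable with 30 homomorphisms, while `K4` has none and is therefore unsatisfiable.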
We say that a finite relational structure B has width k if, for every finite structure A of the same vocabulary as B, if there is no homomorphism from A to B, then Spoiler wins the existential k-pebble game on A and B. The structure B has bounded width if it has width k for some k. Structures of bounded width are exactly those structures for which CSP(B) can be solved by a local consistency algorithm [35].
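The winner of the existential k-pebble game can be computed by a fixed-point procedure. The sketch below is our own illustration: it uses the standard characterisation that Duplicator wins exactly when there is a non-empty family of partial homomorphisms of size at most k that is closed under restrictions and has the extension property. All names are hypothetical, and the running time is exponential in k.

```python
from itertools import combinations, product

def is_partial_hom(h, A_rels, B_rels):
    """Check every tuple of A that lies inside dom(h) is mapped into B."""
    return all(tuple(h[x] for x in tup) in B_rels[name]
               for name, tuples in A_rels.items()
               for tup in tuples if all(x in h for x in tup))

def spoiler_wins(k, A_dom, A_rels, B_dom, B_rels):
    A_dom, B_dom = list(A_dom), list(B_dom)
    # Start from all partial homomorphisms with at most k pebbled elements.
    H = set()
    for size in range(k + 1):
        for dom in combinations(A_dom, size):
            for img in product(B_dom, repeat=size):
                h = dict(zip(dom, img))
                if is_partial_hom(h, A_rels, B_rels):
                    H.add(frozenset(h.items()))
    changed = True
    while changed:
        changed = False
        for h in list(H):
            dom = {a for a, _ in h}
            # Closure under restriction: Spoiler may lift any pebble.
            restrict_ok = all(h - {p} in H for p in h)
            # Extension: Duplicator must be able to answer any new pebble.
            extend_ok = len(h) == k or all(
                any(h | {(a, b)} in H for b in B_dom)
                for a in A_dom if a not in dom)
            if not (restrict_ok and extend_ok):
                H.discard(h)
                changed = True
    # Duplicator has a winning strategy iff the empty map survives.
    return frozenset() not in H

triangle = {"E": {(u, v) for u in range(3) for v in range(3) if u != v}}
K2 = {"E": {(0, 1), (1, 0)}}
wins_with_2 = spoiler_wins(2, range(3), triangle, range(2), K2)
wins_with_3 = spoiler_wins(3, range(3), triangle, range(2), K2)
```

On the triangle against the template `K2`, two pebbles are not enough for Spoiler, while three are.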

Propositional and polynomial encodings
To reason about proof systems for CSPs we encode the fact that a finite structure A maps homomorphically to a finite structure B, over the same vocabulary, as a CNF or a system of polynomial inequalities or/and equations. In the proofs we will use concrete fixed encodings but our results hold for a whole class of encodings which we call local.
Local encodings. First let us fix some notation. In the context of propositional proof systems, for any sets A and B by V (A, B) we denote a set of propositional variables: for every a ∈ A and every b ∈ B there is a variable X(a, b) in the set V (A, B). Truth valuations of the variables in V (A, B) and relations on A×B have a natural one-to-one correspondence: a variable X(a, b) is assigned the truth value 1 if and only if the pair (a, b) belongs to the relation. Recall that a function f from A to B is a relation {(a, f (a)) : a ∈ A} on A × B. Hence, a homomorphism from an L-structure A to an L-structure B is a relation on A × B.
Fix a finite relational vocabulary L and a finite structure B over L.
A propositional encoding scheme E for CSP(B) is a mapping which assigns to every L-structure A a set of clauses E(A) over the variables in V (A, B) in such a way that there is a one-to-one correspondence between the truth valuations of the variables in V (A, B) satisfying E(A) and the homomorphisms from A to B.
In the context of algebraic and semi-algebraic proof systems we additionally assume the presence of twin variables. For every a ∈ A and every b ∈ B there is both the algebraic variable $X(a, b)$ and the algebraic variable $\bar{X}(a, b)$.

An algebraic encoding scheme E over a field F for CSP(B) is a mapping which assigns to every L-structure A a set of polynomial equations E(A) over the variables in V (A, B) in such a way that there is a one-to-one correspondence between the evaluations of the variables from V (A, B) in {0, 1} satisfying E(A) and the axioms from (2) over F, and the homomorphisms from A to B. Finally, a semi-algebraic encoding scheme E for CSP(B) is a mapping which assigns to every L-structure A a set of polynomial inequalities E(A) over the variables in V (A, B) in such a way that there is a one-to-one correspondence between the evaluations of the variables from V (A, B) in {0, 1} satisfying E(A) and the axioms from (2) and (4), and the homomorphisms from A to B. Observe that every algebraic encoding scheme over the real field is also a semi-algebraic encoding scheme.
An encoding scheme E is invariant under isomorphisms if, whenever f : A → A′ is an isomorphism from an L-structure A to an L-structure A′, it holds that E(A′) is obtained from E(A) by renaming each variable indexed by (a, b) to the corresponding variable indexed by (f(a), b).

Next we define the key notion of local encoding scheme. We need two pieces of notation. If the structure A has a single element a and each of its relations is empty, we denote the encoding E(A) by E(a). If the structure A has a single non-empty relation R(A) with a single tuple $(a_1, \ldots, a_r)$ in it, and its domain is $\{a_1, \ldots, a_r\}$, then we denote E(A) by $E(R(a_1, \ldots, a_r))$. Since the vocabulary L is finite, up to isomorphism there are only finitely many structures of one of the above-mentioned two kinds. Therefore, for any relational structure B over a finite vocabulary L and any encoding scheme E that is invariant under isomorphisms, the size of encodings of the form E(a) or $E(R(a_1, \ldots, a_r))$ is bounded by a constant. We call it the local bound of the encoding scheme.
An encoding scheme E is local if it is invariant under isomorphisms and, for every L-structure A, the encoding E(A) is the union of E(a) over all a ∈ A and $E(R(a_1, \ldots, a_r))$ over all R ∈ L and $(a_1, \ldots, a_r) \in R(A)$. For our purposes all local encodings of the same kind (i.e., propositional, algebraic or semi-algebraic) are essentially equivalent, as formalized by the following result.

Proof. For 1, let s and s′ be the local bounds of E and E′, respectively. Take a clause C from E′(A). The clause C belongs to a subset of E′(A) of the form E′(a) or $E'(R(a_1, \ldots, a_r))$, so the size of C is bounded by s′. Without loss of generality suppose that C belongs to a set $E'(R(a_1, \ldots, a_r))$. The corresponding subset $E(R(a_1, \ldots, a_r))$ of E(A) has size at most s. The satisfying truth valuations for $E(R(a_1, \ldots, a_r))$ and $E'(R(a_1, \ldots, a_r))$ are the same. Therefore, since C is an element of $E'(R(a_1, \ldots, a_r))$, we have that $E(R(a_1, \ldots, a_r))$ logically implies C. It follows from the quantitative completeness theorem for resolution (cf. Theorem 3) that the clause C has a resolution derivation from $E(R(a_1, \ldots, a_r))$ of size bounded by a function of s and s′.
The proofs of 2 and 3 are analogous. The completeness theorem for Nullstellensatz and Sherali-Adams (cf. Theorem 4) needs to be used instead of Theorem 3.
Three specific examples. The results of this paper hold for arbitrary local encoding schemes. However, in the proofs it is often convenient to be specific. We now introduce three concrete encoding schemes that, in addition, are defined uniformly with respect to the template B.
For every structures A and B over the same vocabulary, let CNF(A, B) be a set of clauses with: Note that the mapping that to an L-structure A assigns CNF(A, B) is a local encoding scheme for CSP(B). Since this definition is uniform with respect to B we call it simply the CNF encoding scheme. We use it to reason about propositional proof systems for CSP(B). There are two standard ways of encoding a clause into a system of inequalities: multiplicatively and additively. These give rise to two local encoding schemes which we use to reason about algebraic and semi-algebraic proof systems in the context of CSP. Specifically, the multiplicative and additive encodings of a clause C = X 1 ∨ · · · ∨ X ℓ ∨ X ℓ+1 ∨ · · · ∨ X k are the following equation and inequality, respectively: Let EQ(A, B) be the system of polynomial equations that are multiplicative encodings of the clauses in CNF(A, B), that is: The mapping that to an L-structure A assigns EQ(A, B) is a local encoding scheme for CSP(B). Note that this scheme makes sense over any field. We call it the EQ encoding scheme. It is used in Section 4 to reason both about algebraic and semi-algebraic proof systems, and in Section 6 while discussing lower bounds for SOS.
Similarly, let INEQ(A, B) be a system of linear inequalities that are additive encodings of the clauses in CNF(A, B), that is: The mapping that to an L-structure A assigns INEQ(A, B) is a local encoding scheme for CSP(B). We call it the INEQ encoding scheme. It is used in Section 7 to reason about semi-algebraic proof systems.
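The following sketch illustrates the three encodings on a tiny instance. It rests on assumptions that should be read as ours: the clause families used are the usual ones (every element has an image, no element has two images, no constraint tuple is sent outside the relation), twin variables are evaluated as one minus the variable, and all function names are hypothetical.

```python
from itertools import product

def cnf(A_dom, A_rels, B_dom, B_rels):
    """Assumed clause families; a literal is ((a, b), sign)."""
    clauses = []
    for a in A_dom:
        # Every element of A has at least one image ...
        clauses.append([((a, b), True) for b in B_dom])
        # ... and at most one image.
        for b, c in product(B_dom, repeat=2):
            if b < c:
                clauses.append([((a, b), False), ((a, c), False)])
    for name, tuples in A_rels.items():
        for tup in tuples:
            # Forbid every image tuple that lies outside R(B).
            for img in product(B_dom, repeat=len(tup)):
                if img not in B_rels[name]:
                    clauses.append([((x, y), False) for x, y in zip(tup, img)])
    return clauses

def clause_true(val, clause):
    return any((val[v] == 1) == sign for v, sign in clause)

def mult_zero(val, clause):
    # Multiplicative encoding: twin variables for positive literals,
    # plain variables for negative ones; the product must vanish.
    prod = 1
    for v, sign in clause:
        prod *= (1 - val[v]) if sign else val[v]
    return prod == 0

def add_ok(val, clause):
    # Additive encoding: the literal values must sum to at least 1.
    return sum(val[v] if sign else 1 - val[v] for v, sign in clause) >= 1

# Instance: a single edge; template: K3 (3-colouring).
A_dom, A_rels = [0, 1], {"E": {(0, 1), (1, 0)}}
B_dom = [0, 1, 2]
B_rels = {"E": {(b, c) for b in B_dom for c in B_dom if b != c}}
clauses = cnf(A_dom, A_rels, B_dom, B_rels)
variables = [(a, b) for a in A_dom for b in B_dom]

models, encodings_agree = 0, True
for bits in product((0, 1), repeat=len(variables)):
    val = dict(zip(variables, bits))
    sat = all(clause_true(val, c) for c in clauses)
    encodings_agree &= (sat == all(mult_zero(val, c) for c in clauses)
                        == all(add_ok(val, c) for c in clauses))
    models += sat
```

The six satisfying valuations correspond to the six proper 3-colourings of an edge, and the clause, multiplicative and additive readings agree on every 0/1 point.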
In Section 8 we will discuss one more local semi-algebraic encoding scheme that was used in [40] to prove PC lower bounds for graph coloring.

General proof complexity facts
Substitutions will play a central role in showing that certain propositional and semi-algebraic proof systems behave well with respect to the classical CSP reductions. In the case of propositional proof systems we will consider substitutions of variables by bounded-DNF formulas with a bounded number of terms, and in the case of algebraic and semi-algebraic proof systems we will use substitutions by polynomials with bounded degree and a bounded number of monomials. We now prove some key technical lemmas regarding such substitutions.

Substitutions in Frege
In the case of propositional proof systems, a substitution is a mapping from variables to formulas. Applying a substitution to a formula means replacing all variables by the corresponding formulas simultaneously. Since our formulas are in negation normal form, it is implicit that the result of applying the substitution $X \to F$ to a negative literal $\bar{X}$ is the formula dual to $F$, i.e., $\bar{F}$.

Lemma 5. Let k, d and m be positive integers, let $A$ be a k-term and let $A^+$ be the result of replacing each variable in $A$ by a (possibly different) d-DNF with at most m many terms. Then $A^+$ is logically equivalent to a $k(d + m)$-DNF with at most $m^k d^{km}$ many terms.
Proof. Let p and n be the numbers of positive and negative literals in A, respectively. After applying the substitution, the k-term becomes a conjunction of p many d-DNFs and n many negations of d-DNFs, where each d-DNF has at most m many terms. Applying the De Morgan rules to the negated d-DNFs, what we get is a formula of the following schematic form: In the left subformula in (22), distributing the outer conjunction over the disjunction gives a disjunction of at most $m^p$ many $pd$-terms. In the right subformula in (22), distributing the two outer conjunctions over the disjunction gives a disjunction of at most $d^{nm}$ many $nm$-terms. Schematically: Finally, in formula (23), distributing the outer conjunction over the disjunctions gives a disjunction of $m^p d^{nm}$ many $(pd + mn)$-terms: Using $p + n \leq k$ we get the result.
Lemma 6. Fix any positive integers q, d, m and p. Let F and G be sets of clauses with at most q variables each, and let σ be a substitution of the variables of F into d-DNFs with at most m many terms on the variables of G. For any positive integers k, s and any t ≥ 2, if F has a Frege refutation of depth t, bottom fan-in k, and size s, and for each formula in F its substitution is a logical consequence of at most p many clauses from G, then G has a Frege refutation of depth t, bottom fan-in $k(d + m)$, and size polynomial in $2^k$ and s.
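The counting in this proof can be checked on a small case. The sketch below uses our own representation (a literal is a pair of a variable name and a sign, a DNF is a list of terms) and a hypothetical substitution with k = d = m = 2. It expands a substituted term by De Morgan and distribution, then compares the result with the bounds of Lemma 5.

```python
from itertools import product

def negate_dnf(dnf):
    """Negation of a DNF, again as a DNF (De Morgan, then distribution)."""
    # Each term turns into a clause of negated literals; pick one literal per clause.
    clauses = [[(v, not s) for v, s in term] for term in dnf]
    return [frozenset(choice) for choice in product(*clauses)]

def substitute_term(term, sigma):
    """term: list of (var, sign); sigma maps each var to a DNF. Returns a DNF."""
    parts = [sigma[v] if sign else negate_dnf(sigma[v]) for v, sign in term]
    # Distribute the outer conjunction over the disjunctions.
    return [frozenset().union(*choice) for choice in product(*parts)]

def eval_dnf(dnf, val):
    return any(all(val[v] == s for v, s in term) for term in dnf)

k, d, m = 2, 2, 2
sigma = {
    "X": [frozenset({("u1", True), ("u2", True)}),
          frozenset({("u3", True), ("u4", True)})],
    "Y": [frozenset({("u5", True), ("u6", True)}),
          frozenset({("u7", True), ("u8", True)})],
}
A = [("X", True), ("Y", False)]          # the 2-term X AND (NOT Y)
A_plus = substitute_term(A, sigma)

num_terms = len(A_plus)
max_width = max(len(t) for t in A_plus)

# Semantic check over all valuations of u1..u8.
names = ["u%d" % i for i in range(1, 9)]
equivalent = all(
    eval_dnf(A_plus, val) == (eval_dnf(sigma["X"], val)
                              and not eval_dnf(sigma["Y"], val))
    for bits in product((False, True), repeat=8)
    for val in [dict(zip(names, bits))])
```

In this instance p = n = 1, so the expansion has $m^p d^{nm} = 8$ terms of width $pd + mn = 4$, within the bounds $m^k d^{km} = 64$ and $k(d+m) = 8$.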
Proof. Fix some positive integers q, d, m and p. Assume that F has a Σ t,k -Frege refutation, for some k and t ≥ 2. Let ℓ = k(d + m).
We now define an operator which maps formulas in $\Sigma_{t,k}$ to formulas in $\Sigma_{t,\ell}$. If a formula D is a variable or the negation of a variable, then we define $D^+$ simply as the d-DNF or d-CNF obtained by applying the substitution σ to D. For a k-DNF D, we put $D^+$ to be the ℓ-DNF that one gets from applying Lemma 5 to each k-term in D with the substitution σ and then taking the disjunction of the resulting DNFs. For a k-CNF D, we define $D^+$ as the complement of $(\bar{D})^+$. In this case $D^+$ is an ℓ-CNF. Clauses and terms are treated as 1-DNFs and 1-CNFs, respectively. Finally, if D is a formula from $\Sigma_{t,k}$ of depth at least 3, then we define $D^+$ to be the formula constructed by replacing each maximal subformula E of D of depth at most 2 by $E^+$. By Lemma 5, the size of $D^+$ is at most polynomial in $2^k$ and the size of D.
If D and E are both formulas in $\Sigma_{t,k}$, then $(D \vee E)^+ = D^+ \vee E^+$. Moreover, for any D it holds that $(\bar{D})^+ = \overline{D^+}$. Hence also $(D \wedge E)^+ = D^+ \wedge E^+$. This means that the result of applying our operator to the premises and conclusion of any of the rules of Frege is an instance of the same rule.
Let D 1 , D 2 , . . . , D t be a Σ t,k -Frege refutation of F of size s. In order to transform the sequence of formulas D + 1 , D + 2 , . . . , D + t into a valid Σ t,ℓ -Frege refutation of G we need to prove that for each non-logical axiom D i , the formula D + i has constant size Σ t,ℓ -Frege proof from G.
Each non-logical axiom D i is a q-clause C from F . By assumption, the substitution σ(C) and hence also C + is a logical consequence of at most p many q-clauses of G. Moreover, the size of C + is bounded by a function of d, m and q, and the total size of the p many q-clauses of G that imply C + is bounded by a function of p and q. The quantitative completeness theorem for Σ t,ℓ -Frege does the rest: D + i has a Σ t,ℓ -Frege derivation from G of size bounded by a function of d, m, p and q.

Substitutions in algebraic and semi-algebraic proof systems
In the case of algebraic and semi-algebraic proof systems, a substitution is a mapping from variables to polynomials. Applying a substitution to an equation or inequality means replacing all variables by the corresponding polynomials, simultaneously all at once.
For every set of polynomial equations F, by Eq(F) we denote the union of F and all the axiom polynomial equations from (2) for the variables in F, i.e., for each variable $X$ or $\bar{X}$ appearing in one of the equations from F, we add to F the polynomial equations $X^2 - X = 0$, $\bar{X}^2 - \bar{X} = 0$ and $X + \bar{X} - 1 = 0$.
Lemma 7. Fix any positive integers d, m, p, and q. Let F and G be sets of polynomial equations of the form P = 0, where P is a monomial of degree at most q with coefficient 1, and let σ be a substitution of the variables of F into polynomials on the variables of G of degree at most d, with at most m many monomials and every coefficient equal to 1. For P being the Nullstellensatz or Polynomial Calculus proof system over any field, and for any positive integers k and s, if F has a P refutation of degree k, size s, and for each equation in Eq(F) its substitution follows from at most p many equations from G on all evaluations of its variables in {0, 1} over the underlying field, then G has a P refutation of degree linear in k and size polynomial in $2^k$ and s.
Proof. Let us fix some positive integers d, m, p, and q. Let F and G be sets of polynomial equations of the form P = 0, where P is a monomial of degree at most q with coefficient 1, and let σ be a substitution of the variables of F into polynomials on the variables of G of degree at most d, with at most m many monomials and every coefficient equal 1. If for each equation in Eq(F ) its substitution follows from at most p many equations from G on all evaluations of its variables in {0, 1}, then by Theorem 4 for every equation in Eq(F ) its substitution has an NS derivation from G. The size and degree of this derivation are bounded by some constants which depend on d, m, p, and q.
Suppose that P is the Nullstellensatz proof system and assume that for some positive integers k and s, the set of equations F has an NS refutation of degree k, size s. The refutation of F is of the form where P 1 , . . . , P r are polynomials such that the equation P i = 0 is in the set Eq(F ), and c 1 , . . . , c r are scalars. We substitute the variables in the above equality according to σ and substitute the polynomials from the set Eq(F ) by their NS derivations. This way we obtain an NS refutation of G of degree linear in k and size polynomial in 2 k and s. Suppose that P is the Polynomial Calculus proof system and assume that for some positive integers k and s, the set of equations F has a PC refutation of degree k, size s. The PC refutation of G goes as follows: first for each equation in Eq(F ) we derive its substitution in the Nullstellensatz proof system, and then we simulate the subsequent steps of the refutation of F . Applications of addition and multiplication by scalars remain as they were, and applications of multiplication by variables are simulated in several steps. Since after applying the substitution to the variables they become polynomials of degree at most d, with at most m many monomials and every coefficient equal 1, we can simulate multiplication by a variable by at most md multiplication steps and at most m − 1 additions. The substitution of variables causes a blow-up in size which is polynomial in 2 k , and the simulation additionally increases the size by a constant factor. Altogether, the degree of the PC refutation of G described above is linear in k and its size is polynomial in 2 k and s.
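As a worked instance of the step count in the Polynomial Calculus case, take the hypothetical substitution $\sigma(X) = Y_1 Y_2 + Y_3$, so that $d = 2$ and $m = 2$. A single multiplication of a derived polynomial $P$ by $X$ then unfolds as follows; the concrete substitution is only an illustration.

```latex
\[
  P \cdot \sigma(X) \;=\; (P \cdot Y_1) \cdot Y_2 \;+\; P \cdot Y_3 .
\]
```

This uses three multiplications by variables, within the bound $md = 4$, and one addition, matching $m - 1 = 1$.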
For every set of polynomial inequalities F, by Ineq(F) we denote the union of F and all the axiom polynomial inequalities and equations from (4) and (2) for the variables in F, i.e., for each variable $X$ or $\bar{X}$ appearing in one of the equations from F, we add to F the polynomial equations $X^2 - X = 0$, $\bar{X}^2 - \bar{X} = 0$, $X + \bar{X} - 1 = 0$, and the inequalities from (4), such as $X \geq 0$.

Lemma 8. Fix any positive integers d, m, p, and q. Let F and G be sets of polynomial equations of the form P = 0, where P is a monomial of degree at most q with coefficient 1, and let σ be a substitution of the variables of F into polynomials on the variables of G of degree at most d, with at most m many monomials and every coefficient equal to 1. For P being the Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver proof system, and for any positive integers k and s, if F has a P refutation of degree k, size s, and for each inequality and equation in Ineq(F) its substitution follows from at most p many equations from G on all evaluations of its variables in {0, 1}, then G has a P refutation of degree linear in k and size polynomial in $2^k$ and s.
Proof. Let us fix some positive integers d, m, p, and q. Let F and G be sets of polynomial equations of the form P = 0, where P is a monomial of degree at most q with coefficient 1, and let σ be a substitution of the variables of F into polynomials on the variables of G of degree at most d, with at most m many monomials and every coefficient equal 1. If for an inequality or equation in Ineq(F ) its substitution follows from at most p many equations from G on all evaluations of its variables in {0, 1}, then by Theorem 4 such substitution has an SA derivation from G. Moreover, the size and degree of those derivations are bounded by some constants which depend on d, m, p, and q.
Suppose that P is the Sherali-Adams or Positive Semidefinite Sherali-Adams proof system and assume that for some positive integers k and s, the set of equations F has an SA (or SA + ) refutation of degree k, size s. The refutation of F is of the form where c 1 , . . . , c r are reals and P 1 , . . . , P r are polynomials such that the equation P i = 0 or the inequality P i ≥ 0 is in the set Ineq(F ), or they are squares of polynomials when they are allowed (i.e., for SA + ). We substitute the variables in the above equality according to σ and substitute the polynomials from the set Ineq(F ) by their SA derivations. This way we obtain an SA (or SA + ) refutation of G of degree linear in k and size polynomial in 2 k and s.
Suppose that P is the Sum-of-Squares proof system and assume that for some positive integers k and s, the set of equations F has an SOS refutation of degree k, size s. The refutation of F is of the form where P 1 , . . . , P r are polynomials such that either the equation P i = 0 or the inequality P i ≥ 0 is in the set Ineq(F ), or they are squares, and S 1 , . . . , S r are arbitrary polynomials. We substitute the variables in the above equality according to σ and substitute the polynomials from the set Ineq(F ) by their SA derivations. This way we obtain an SOS refutation of degree linear in k and size polynomial in 2 k and s. Suppose that P is the Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver proof system and assume that for some positive integers k and s, the set of equations F has an LS (or LS + ) refutation of degree k, size s. The refutation of G goes as follows: first for each equation and inequality in Ineq(F ) we derive its substitution in the Sherali-Adams proof system, and then we simulate the subsequent steps of the refutation of F . Applications of addition and multiplication by positive reals as well as positivity-of-squares when it is allowed (i.e. for LS + ) remain as they were, and applications of multiplication by variables are simulated in several steps. Since after applying the substitution to the variables they become polynomials of degree at most d, with at most m many monomials and every coefficient equal 1, we can simulate multiplication by a variable by at most md multiplication steps and at most m − 1 additions. The substitution of variables causes a blow-up in size which is polynomial in 2 k , and the simulation additionally increases the size by a constant factor. Altogether, the degree of the LS (or LS + ) refutation of G described above is linear in k and its size is polynomial in 2 k and s.

Simulations
In a later section we will need the known fact that both Polynomial Calculus and Sherali-Adams efficiently simulate resolution. If C is the clause $\bigvee_{i \in I} X_i \vee \bigvee_{j \in J} \bar{X}_j$, let
\[ M(C) := \prod_{i \in I} \bar{X}_i \cdot \prod_{j \in J} X_j. \]
Note that, under the axioms (2), the clause C is encoded by the equation $M(C) = 0$ or, in the context of semi-algebraic proofs, by the pair of inequalities $M(C) \geq 0$ and $-M(C) \geq 0$. In Section 2.6 we called this the multiplicative encoding of C.

Proof. Assume that C has a resolution derivation of width k and size s. Before we describe the conversions we need to apply a light pre-processing to the resolution derivation. Convert each resolution step deriving $D \vee E$ from $D \vee X$ and $E \vee \bar{X}$ into a symmetric resolution step in which first $D \vee E \vee X$ and $D \vee E \vee \bar{X}$ are derived by weakenings from $D \vee X$ and $E \vee \bar{X}$, respectively, and then $D \vee E$ is derived from these by resolving on $X$. Let $D_1, D_2, \ldots, D_t$ be the resulting resolution derivation. The proofs for Polynomial Calculus and for Sherali-Adams are quite different because the latter is a static proof system while the former is not.

For Polynomial Calculus, we derive the equation $M(D_i) = 0$ for $i = 1, \ldots, t$, by induction on i. When $D_i$ is a clause from the set $\{C_1, \ldots, C_m\}$, there is nothing to do. Assume now that $D_i$ is derived by a symmetric resolution step from $D_j = D_i \vee X$ and $D_k = D_i \vee \bar{X}$, where $j, k < i$. By induction hypothesis the equations $M(D_i)\bar{X} = 0$ and $M(D_i)X = 0$ have already been derived. Add these equations to the lift of the axiom $1 - X - \bar{X} = 0$ by $M(D_i)$ to get the equation $M(D_i) = 0$. Next assume that $D_i$ is derived by a weakening step from $D_j$, say $D_i = D_j \vee X$ or $D_i = D_j \vee \bar{X}$. By induction hypothesis the equation $M(D_j) = 0$ has already been derived. Lift this equation by $\bar{X}$ or $X$ as appropriate to get $M(D_i) = 0$. Clearly, the degree of this proof is linear in k and the size is polynomial in s and k.
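The Polynomial Calculus treatment of a symmetric resolution step can be displayed as a short derivation. Here $M(D)$ stands for the monomial attached to the clause $D$ and $\bar{X}$ for the twin variable of $X$; this layout is our own sketch of the step just described.

```latex
\begin{align*}
  M(D)\,\bar{X} &= 0 && \text{(derived for } D \vee X\text{)}\\
  M(D)\,X &= 0 && \text{(derived for } D \vee \bar{X}\text{)}\\
  M(D)\,(1 - X - \bar{X}) &= 0 && \text{(axiom lifted by } M(D)\text{)}\\
  M(D) &= 0 && \text{(sum of the three lines)}
\end{align*}
```

Adding the three left-hand sides cancels the terms $M(D)X$ and $M(D)\bar{X}$, leaving $M(D)$, and no line has degree above that of $M(D)$ plus one.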
For Sherali-Adams the proof is quite different. For each D i in the resolution derivation we produce an inequality Q i ≥ 0 as follows. If D i is an initial clause, let Next consider the DAG of the resolution derivation oriented from the initial clauses towards the conclusion D t . We assign a weight c i to each D i in this DAG inductively: the conclusion D t gets weight 1, and if all immediate successors of D i have already been assigned weights, then D i gets as weight the sum of the weights of its immediate successors. Next multiply each inequality Q i ≥ 0 by its weight c i and add them together. This could cause the coefficients in the SA proof to go exponentially big, but their bitsize is still polynomial. The result is an SA proof of −M(C) ≥ 0 since the only monomial that survives is the conclusion. The reverse inequality M(C) ≥ 0 follows from lifting the axiom 1 ≥ 0. This gives an SA proof of M(C) = 0 as required. The degree of this proof is linear in k and its size is polynomial in s and k.
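The weights in the Sherali-Adams argument can be computed by a simple recursion over the DAG. In the sketch below, the adjacency-list representation and the name `path_weights` are our own. The resulting weight of a clause is the number of paths from it to the conclusion, which explains why the coefficients can become exponentially large while keeping polynomial bit-size.

```python
def path_weights(successors, conclusion):
    """successors[i] lists the clauses derived directly from clause i.

    The conclusion gets weight 1; every other clause gets the sum of the
    weights of its immediate successors.
    """
    weights = {conclusion: 1}

    def weight(i):
        if i not in weights:
            weights[i] = sum(weight(j) for j in successors[i])
        return weights[i]

    for i in successors:
        weight(i)
    return weights

# Hypothetical derivation: clauses 0 and 1 are initial, 2 and 3 are each
# derived from both of them, and the conclusion 4 is derived from 2 and 3.
dag = {0: [2, 3], 1: [2, 3], 2: [4], 3: [4], 4: []}
weights = path_weights(dag, conclusion=4)
```

In this example each initial clause is used twice, so it receives weight 2.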

Closure under reductions
Three types of reductions are often considered in the context of constraint satisfaction problems: a) pp-interpretability, b) homomorphic equivalence, c) addition of constants to a core. In this section we give their precise definitions and show that many proof systems behave well with respect to those types of reductions.

Reductions.
Let B and B ′ be finite relational structures over finite vocabularies L and L ′ , respectively. We say that the structure B ′ is pp-definable in the structure B if it has the same domain and for every relation symbol T ∈ L ′ the relation T (B ′ ) is definable in B by a pp-formula. Recall that a primitive positive formula over L, or pp-formula, is a first-order formula which uses only symbols from L, equality, conjunction, and first-order existential quantification.
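To make the definition concrete, the following sketch evaluates a pp-formula over a small finite structure by brute force. The function name `pp_define`, the encoding of atoms as pairs (relation name, variable tuple) and the example structure are illustrative assumptions, not notation from the paper.

```python
from itertools import product


def pp_define(domain, relations, atoms, free_vars, bound_vars):
    """Return the relation defined by: exists bound_vars . AND of atoms.

    atoms is a list of (relation_name, tuple_of_variable_names)."""
    all_vars = list(free_vars) + list(bound_vars)
    defined = set()
    for values in product(domain, repeat=len(all_vars)):
        env = dict(zip(all_vars, values))
        if all(tuple(env[v] for v in args) in relations[name]
               for name, args in atoms):
            defined.add(tuple(env[v] for v in free_vars))
    return defined


# Example structure: the directed 3-cycle on {0, 1, 2}.
domain = [0, 1, 2]
relations = {"S": {(0, 1), (1, 2), (2, 0)}}

# T(x1, x2) := exists y . S(x1, y) AND S(y, x2)
T = pp_define(domain, relations,
              atoms=[("S", ("x1", "y")), ("S", ("y", "x2"))],
              free_vars=["x1", "x2"], bound_vars=["y"])
print(sorted(T))
```

Here $T$ consists of the pairs joined by a directed path of length two, so the expanded structure with $T$ added is pp-definable in the 3-cycle.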
Pp-interpretability is a generalization of pp-definability which allows for changing the domain of a CSP language. Given two relational structures $B$ and $B'$ in finite vocabularies $L$ and $L'$, respectively, we say that $B'$ is pp-interpretable in $B$ if there exist a positive integer $n$ and a surjective partial function $f : B^n \to B'$ such that the preimages of all relations in $B'$ (including the equality relation) and the domain of $f$ are pp-definable in $B$. The fact that the CSP of a language $B'$ that is pp-interpretable in $B$ is no harder than the CSP of $B$ itself [21] is one of the fundamental results of the so-called algebraic approach to constraint satisfaction problems, which has led to many breakthrough results in the area.
Probably the simplest of all the constructions is homomorphic equivalence. Structures $B$ and $B'$ over a vocabulary $L$ are homomorphically equivalent if there exists a homomorphism from $B$ to $B'$ and a homomorphism from $B'$ to $B$. Since homomorphisms compose, if $L$-structures $B$ and $B'$ are homomorphically equivalent, then any $L$-structure $A$ maps homomorphically to $B$ if and only if it maps homomorphically to $B'$. So the CSPs over both languages are the same.
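Homomorphic equivalence of small structures can be tested by exhaustive search. The sketch below is only an illustration for tiny examples (the search is exponential in the size of the source structure); the helper names and the choice of undirected cycles as examples are assumptions made here.

```python
from itertools import product


def homomorphisms(src_dom, src_rels, dst_dom, dst_rels):
    """Yield every homomorphism from the source to the target structure."""
    src_dom = list(src_dom)
    for values in product(list(dst_dom), repeat=len(src_dom)):
        h = dict(zip(src_dom, values))
        if all(tuple(h[a] for a in t) in dst_rels[name]
               for name, tuples in src_rels.items() for t in tuples):
            yield h


def hom_equivalent(dom1, rels1, dom2, rels2):
    """True if homomorphisms exist in both directions."""
    forward = next(homomorphisms(dom1, rels1, dom2, rels2), None)
    backward = next(homomorphisms(dom2, rels2, dom1, rels1), None)
    return forward is not None and backward is not None


def cycle(n):
    """Undirected n-cycle as a structure with one symmetric binary relation E."""
    edges = set()
    for i in range(n):
        j = (i + 1) % n
        edges |= {(i, j), (j, i)}
    return list(range(n)), {"E": edges}


c6_vs_k2 = hom_equivalent(*cycle(6), *cycle(2))  # 6-cycle vs single edge
k3_vs_k2 = hom_equivalent(*cycle(3), *cycle(2))  # triangle vs single edge
print(c6_vs_k2, k3_vs_k2)
```

The 6-cycle and the single edge are homomorphically equivalent because both are bipartite and contain an edge, whereas the triangle does not map to an edge.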
Homomorphic equivalence allows us to focus on studying constraint satisfaction problems of well-behaved structures which in this context turn out to be those exhibiting little symmetry. A finite relational structure is called a core if all its endomorphisms are surjective. It is known that every relational structure has a homomorphically equivalent substructure that is a core. Core structures can be extended by one-element unary relations which we refer to as constants, without increasing the complexity of the language [21].
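For very small structures the core property can be checked directly from the definition. The following sketch enumerates all endomorphisms; it relies on the standard fact that the image of an endomorphism with the fewest elements induces a core. The function names and the two example graphs are illustrative choices.

```python
from itertools import product


def endomorphisms(domain, relations):
    """Yield every homomorphism from the structure to itself."""
    domain = list(domain)
    for values in product(domain, repeat=len(domain)):
        h = dict(zip(domain, values))
        if all(tuple(h[a] for a in t) in tuples
               for tuples in relations.values() for t in tuples):
            yield h


def is_core(domain, relations):
    """A finite structure is a core if all its endomorphisms are surjective."""
    return all(set(h.values()) == set(domain)
               for h in endomorphisms(domain, relations))


def core_domain(domain, relations):
    """Domain of a core substructure: the smallest image of an endomorphism."""
    best = min(endomorphisms(domain, relations),
               key=lambda h: len(set(h.values())))
    return set(best.values())


k3 = ([0, 1, 2], {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}})
c4 = ([0, 1, 2, 3], {"E": {(0, 1), (1, 0), (1, 2), (2, 1),
                           (2, 3), (3, 2), (3, 0), (0, 3)}})

k3_is_core = is_core(*k3)
c4_is_core = is_core(*c4)
core_of_c4 = core_domain(*c4)
print(k3_is_core, c4_is_core, core_of_c4)
```

The triangle is a core, while the 4-cycle folds onto a single edge, which is its core.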
The importance of the constructions a), b) and c) follows from the fact that classes of constraint languages closed under those constructions can be studied via the corresponding algebras of polymorphisms, that is algebras of operations which preserve all the relations in the language (for details see e.g. the survey [13]). Here we show that bounded-DNF Frege, bounded-depth Frege, Frege, Polynomial Calculus, Sherali-Adams, Sums-of-Squares and Lovász-Schrijver of bounded and unbounded degree behave well with respect to those three types of reductions. This allows us to apply (in Section 6) strong results based on the algebraic approach to CSP.

Results
Let us fix relational structures B and B ′ over finite vocabularies L and L ′ , respectively, such that B ′ is obtained from B by a finite sequence of constructions a), b) and c). In the following we recall the known polynomial-time computable transformation that maps instances A of CSP(B ′ ) to instances A ′ of CSP(B) such that A ′ is satisfiable if and only if A is satisfiable, and the size of A ′ is linear in the size of A. The notation is supposed to remind the reader that once a template B ′ is constructed from a template B, the transformation of instances goes in the other direction: from an instance A of CSP(B ′ ) we build an instance A ′ of CSP(B) satisfying the above mentioned conditions.
We prove that if E and E ′ are any local propositional encoding schemes for CSP(B) and CSP(B ′ ), respectively, then this transformation satisfies the following: As a special case of the above theorem, by taking t = 1 we obtain that bounded-DNF Frege behaves well with respect to the classical CSP reductions: Notice also that a Frege refutation of depth t and bottom fan-in k can be seen as a Frege refutation of depth t + 1 and bottom fan-in 1. Therefore, Theorem 5 implies the following statement, which will be crucial for obtaining lower bounds in Section 6: One more consequence of Theorem 5 concerns proofs in Frege proof system without any bounds on the depth. Corollary 3 above immediately implies that Frege is well-behaved with respect to the classical CSP reductions, that is: In the case of algebraic proof systems, if E and E ′ are any local algebraic encoding schemes over a field F for CSP(B) and CSP(B ′ ), respectively, we show that:  We point out that Theorem 7 in the case of the Sherali-Adams and Sums-of-Squares proof systems and the EQ encoding scheme can be extracted from [48] and [47].
The main idea in proving the above theorems is the same for all the proof systems under consideration. The refutation for an instance $A'$ of CSP($B$) is transformed into a refutation for an instance $A$ of CSP($B'$) by substituting the variables of $E(A')$ by DNFs with a bounded number of terms and a bounded number of literals in each term, or by polynomials with bounded degree, a bounded number of monomials and all coefficients equal to 1. The additional condition we need to ensure is that each element of $E(A')$, after applying the substitution, is a logical consequence of a subset of $E'(A)$ of bounded size. This way we can use Lemmas 6, 7 and 8 from Section 3 to control the growth of the size and depth/degree of the refutations. This argument, however, fails if one of the steps in constructing $B'$ from $B$ is adding the equality relation (which is a special case of a pp-definition). We deal with this by showing that equality propagation can be done in bounded-width resolution.
We prove Theorems 5, 6 and 7 for CNF and EQ encoding schemes in a series of lemmas below. It follows from Lemma 4 that this suffices to obtain the theorems in full generality. Let us see how to argue this for propositional proof systems. The reasoning in the case of algebraic and semi-algebraic proof systems is analogous.
Proof of Theorem 5. Assume that the statement of the theorem holds for E and E ′ being the CNF encoding scheme. Let now E and E ′ be arbitrary local propositional encoding schemes for CSP(B) and CSP(B ′ ), respectively.
By Lemma 4 there exist positive integers $p$ and $p'$ such that for each $L$-structure $A'$, every clause in $E(A')$ has a resolution proof from $\mathrm{CNF}(A', B)$ of size bounded by $p$, and for each $L'$-structure $A$, every clause in $\mathrm{CNF}(A, B')$ has a resolution proof from $E'(A)$ of size bounded by $p'$.
Take an $L'$-structure $A$, and assume that there is a Frege refutation of $E(A')$ of depth $t$, bottom fan-in $k$, and size $s$. Since every clause in $E(A')$ has a resolution proof from $\mathrm{CNF}(A', B)$ of size bounded by $p$, it follows that $\mathrm{CNF}(A', B)$ has a Frege refutation of depth $t$, bottom fan-in $k$, and size linear in $s$. The statement of the theorem holds for the CNF encoding schemes, so $\mathrm{CNF}(A, B')$ has a Frege refutation of depth $t$, bottom fan-in polynomial in $k$, and size polynomial in $2^k$, $s$ and the size of $A$. Since every clause in $\mathrm{CNF}(A, B')$ has a resolution proof from $E'(A)$ of size bounded by $p'$, it follows that $E'(A)$ has a Frege refutation of depth $t$, bottom fan-in polynomial in $k$, and size polynomial in $2^k$, $s$ and the size of $A$.
In the subsequent sections we consider one by one the cases when B ′ is constructed from B using a), b) and c). We begin with pp-definability, with which we deal in three steps: by considering the equality relation, pp-formulas using conjunction only and existential quantification only.

Equality
Suppose that none of the relation symbols in L interprets in B as the equality relation. For a binary relation symbol E not in L, let L ′ = L ∪ {E}. Assume that B ′ is the L ′ -structure with domain B, all relation symbols from L interpreted as in B, i.e., R(B ′ ) = R(B) for every R ∈ L, and the relation symbol E interpreted as the equality relation over B, i.e., For every instance of the CSP of the language B ′ , that is for every finite L ′ -structure A, there is a natural corresponding instance A ′ of the CSP over the language B. If ≡ is the smallest equivalence relation on A which contains E(A), then define A ′ to be the L-structure whose domain A ′ is the set of the equivalence classes of the relation ≡ and every relation symbol R ∈ L is interpreted as for every [a] ≡ ∈ A ′ and b ∈ B. We show that, for a constant c to be determined later, for every clause C from F , the substituted formula σ(C) has a resolution proof from G of size at most c|A|. It follows that G has a Frege refutation of depth t, bottom fan-in k and size at most (c|A| + 1)s. Let C be any of the clauses in F . Note that if C is of type 1 or 2 then by applying the substitution we obtain a clause in G, so there is nothing to be proved. Now, let us assume that C is a clause of type 3, i.e., The following claim will finish the proof. We state the bound on width because it will be useful later. By q we denote the number of elements in B.

Claim 2.
There are constants c and d such that the clause σ(C) has a resolution derivation from G of width at most d and size at most c|A|.
To prove this claim the following observation will be helpful.

Claim 3.
There is a constant $e$ such that for every $a, a' \in A$ such that $a \equiv a'$, and every $b' \in B$, there is a resolution proof of $\overline{X}(a, b')$ from $\overline{X}(a', b')$ and clauses in $G$ of width at most $e$, length at most $(2q + 1)|A|$ and size at most $(2q + 1)^2 |A|$.
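The single-edge case behind this claim, where $a$ and $a'$ are directly related by $E(A)$, can be sketched as an explicit derivation. The code assumes the CNF encoding in which the clauses of types 2 and 3 consist of negated variables, so that the clause propagated along an equality edge is the negative literal $\overline{X}(a', b')$; the literal representation `(sign, (element, value))` and the function names are hypothetical.

```python
def resolve(pos_clause, neg_clause, var):
    """Cut on var, which occurs positively in pos_clause, negatively in neg_clause."""
    assert (True, var) in pos_clause and (False, var) in neg_clause
    return (pos_clause - {(True, var)}) | (neg_clause - {(False, var)})


def propagate_edge(a, a2, b_target, B):
    """Derive not-X(a, b_target) from not-X(a2, b_target), given E(a, a2) in A."""
    lines = []
    start = frozenset({(False, (a2, b_target))})
    lines.append(start)
    type1 = frozenset((True, (a2, b)) for b in B)  # "a2 takes some value"
    lines.append(type1)
    current = resolve(type1, start, (a2, b_target))
    lines.append(current)
    for b in B:
        if b == b_target:
            continue
        # Type 3 clause for E: a and a2 cannot take the distinct values b_target, b.
        eq_clause = frozenset({(False, (a, b_target)), (False, (a2, b))})
        lines.append(eq_clause)
        current = resolve(current, eq_clause, (a2, b))
        lines.append(current)
    return lines


B = ["b1", "b2", "b3"]  # q = 3
derivation = propagate_edge("a", "a2", "b1", B)
print(len(derivation), max(len(c) for c in derivation))
```

For $q = 3$ the derivation has $2q + 1 = 7$ lines and width at most $q$, in line with the bounds used for a single edge; chaining such derivations along a path of at most $|A|$ edges gives the bounds of the claim.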
We use Claim 3 to prove Claim 2. Note that for every $i \in [r]$ there exists $a'_i \in [a_i]_\equiv$ such that $(a'_1, \ldots, a'_r) \in R(A)$. Therefore, the clause $\overline{X}(a'_1, b_1) \vee \cdots \vee \overline{X}(a'_r, b_r)$ belongs to $G$. Now, since $a^*_1 \equiv a'_1$, it follows from Claim 3 that there is a resolution derivation of width at most $e$, length at most $(2q + 1)|A|$ and size bounded by $(2q + 1)^2 |A|$ of $\overline{X}(a^*_1, b_1)$ from $\overline{X}(a'_1, b_1)$ and clauses in $G$. If we reproduce exactly the same derivation starting with the clause $\overline{X}(a'_1, b_1) \vee \cdots \vee \overline{X}(a'_r, b_r)$ instead of $\overline{X}(a'_1, b_1)$, what we get is a valid resolution derivation of $\overline{X}(a^*_1, b_1) \vee \overline{X}(a'_2, b_2) \vee \cdots \vee \overline{X}(a'_r, b_r)$ of width at most $e + r$ and size at most $(2q + 1)^2 |A| + (2q + 1)(2r - 2)|A|$. We repeat the same construction $r - 1$ more times starting with the last clause derived and get a resolution derivation of $\sigma(C)$ whose width is bounded by $re$ and whose size is bounded by $((2q + 1)^2 r + (2q + 1)(2r - 2)r)|A|$. It remains to prove Claim 3.
Proof of Claim 3. First let us show that for every $a, a' \in A$ such that $(a, a') \in E(A)$ or $(a', a) \in E(A)$, and every $b' \in B$, there is a resolution proof of width at most $q$, length at most $2q + 1$ and size bounded by $(2q + 1)^2$ of $\overline{X}(a, b')$ from $\overline{X}(a', b')$ and the clauses in $G$. Indeed, the cut rule applied to $\overline{X}(a', b')$ and the formula $\bigvee_{b \in B} X(a', b)$ gives $\bigvee_{b \in B \setminus \{b'\}} X(a', b)$. Then by a sequence of $q - 1$ cuts with formulas $\overline{X}(a, b') \vee \overline{X}(a', b)$, for $b \in B \setminus \{b'\}$, we derive $\overline{X}(a, b')$. The total number of formulas in this sequence is $2q + 1$, and each has width at most $q$ and size at most $2q + 1$. Now, let $a = a_1, \ldots, a_m = a'$ be a sequence of elements of $A$ such that $(a_i, a_{i+1}) \in E(A)$ or $(a_{i+1}, a_i) \in E(A)$, and let us assume that this is one of the shortest sequences with this property. The statement of the claim then follows from the fact that $m \leq |A|$.
Proof. Let $F$ denote $\mathrm{EQ}(A', B)$ and let $G$ denote $\mathrm{EQ}(A, B')$. As in the proof of Lemma 10 above, for each $[a]_\equiv \in A'$, we choose one element of $[a]_\equiv$, denoted by $a^*$, and consider the substitution $\sigma$ of the variables in $F$ defined by: We show that every equation from $\mathrm{Eq}(F)$ after applying the substitution $\sigma$ has a PC derivation from $\mathrm{Eq}(G)$ of constant degree and size polynomial in $|A|$. Once we have this, the proof for $P$ being the Polynomial Calculus proof system follows the same lines as the proof of Lemma 7. Similarly, for the (Positive Semidefinite) Sherali-Adams, Sums-of-Squares or (Positive Semidefinite) Lovász-Schrijver proof systems, the proof follows the same lines as the proof of Lemma 8 once we show that every inequality from $\mathrm{Ineq}(F)$ after applying the substitution $\sigma$ has an SA derivation from $\mathrm{Ineq}(G)$ of constant degree and size polynomial in $|A|$.
Note that by applying the substitution to equations of type 1 and 2, and to the axiom equations and inequalities we obtain equations and inequalities from Eq(G) and Ineq(G) so there is nothing to be proved. Now, consider an equation of type 3 from F , i.e., X([

Conjunction
We now consider the case when the structure B ′ is pp-definable from B by adding a single relation pp-definable using conjunction only. Let S and P be relation symbols in L, let T be a relation symbol not in L, let L ′ = L ∪ {T }, and assume that B ′ is the expansion of B with the relation T (B ′ ) defined using a pp-formula φ(x 1 , . . . , x r ), where r is the arity of T , that is made of a conjunction of one atom on S and one atom on P . That is, R(B ′ ) = R(B) for every R ∈ L, and T (B ′ ) = {(b 1 , . . . , b r ) ∈ B r : B |= φ(x 1 /b 1 , . . . , x r /b r )}. To focus our attention let us assume that S and P are binary, T is ternary, and that the pp-formula that defines T is φ(x 1 , x 2 , x 3 ) = S(x 1 , x 2 ) ∧ P (x 2 , x 3 ). The proof of the general case will be the same.
For a finite L ′ -structure A the corresponding L-structure A ′ has the same domain and all the relation symbols except for S and P interpreted the same as in A. Proof. Let F denote CNF(A ′ , B) and let G denote CNF(A, B ′ ). Observe that the variables of F and G as well as the clauses of type 1 and 2 are the same. Below we show that every clause C of type 3 in F is a logical consequence of a bounded number of clauses of G. It follows that every clause C of type 3 in F has a resolution derivation from G of size bounded by some constant c, hence G has a Frege refutation of depth t, bottom fan-in k and size at most cs + s.
Let $C$ be a clause $\overline{X}(a_1, b_1) \vee \cdots \vee \overline{X}(a_r, b_r)$ for some natural number $r$, $R \in L$ of arity $r$, $(a_1, \ldots, a_r) \in R(A')$, and $(b_1, \ldots, b_r) \in B^r \setminus R(B)$. If $R \notin \{S, P\}$ then $C$ is also a clause of $G$ and there is nothing to be proved. Without loss of generality let us assume that $R = S$ and hence $C$ is of the form $\overline{X}(a_1, b_1) \vee \overline{X}(a_2, b_2)$ where $(a_1, a_2) \in S(A')$, and $(b_1, b_2) \in B^2 \setminus S(B)$. Now if $(a_1, a_2) \in S(A)$ then $C$ is a clause of $G$ and we are done. Otherwise, there exists $a_3 \in A'$ such that $(a_1, a_2, a_3) \in T(A)$ and for every $b_3 \in B$ there is a clause $\overline{X}(a_1, b_1) \vee \overline{X}(a_2, b_2) \vee \overline{X}(a_3, b_3)$ in $G$, because $(b_1, b_2) \notin S(B)$ implies $(b_1, b_2, b_3) \notin T(B')$. The number of such clauses is bounded by $q^\ell$, where $q$ is the number of elements in $B$ and $\ell$ is the arity of $T$. Those clauses together with the clause of type 1 for $a_3$ logically imply $C$.
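The instance transformation for the conjunction case can be sketched as follows. The definition of $A'$ used here is an assumption inferred from the case analysis in the proof: every tuple $(a_1, a_2, a_3) \in T(A)$ is replaced by the constraints $S(a_1, a_2)$ and $P(a_2, a_3)$, and all other relations are kept. The function name and the example instance are hypothetical.

```python
def split_conjunction(A_rels):
    """Build the relations of A' from those of A for
    T(x1, x2, x3) := S(x1, x2) AND P(x2, x3).

    Assumed rule: drop T and add S(a1, a2), P(a2, a3) for each T-tuple."""
    new_rels = {name: set(tuples) for name, tuples in A_rels.items()
                if name != "T"}
    new_rels.setdefault("S", set())
    new_rels.setdefault("P", set())
    for (a1, a2, a3) in A_rels.get("T", ()):
        new_rels["S"].add((a1, a2))
        new_rels["P"].add((a2, a3))
    return new_rels


A_rels = {"S": {("u", "v")}, "P": set(), "T": {("v", "w", "u")}}
A_prime_rels = split_conjunction(A_rels)
print(A_prime_rels)
```

The domain is unchanged and each tuple of $T(A)$ contributes two tuples, so the size of $A'$ is linear in the size of $A$.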

Existential quantification
We now consider the case when the structure $B'$ is pp-definable from $B$ by adding a single relation definable using existential quantification only. Let $S$ be a relation symbol in $L$, let $T$ be a relation symbol not in $L$, let $L' = L \cup \{T\}$, and assume that $B'$ is the expansion of $B$ with the relation $T(B')$ defined using a pp-formula $\varphi(x_1, \ldots, x_r)$, where $r$ is the arity of $T$, that is made of the existential quantification of one variable over an atom on $S$. That is, $R(B') = R(B)$ for every $R \in L$, and $T(B') = \{(b_1, \ldots, b_r) \in B^r : B \models \varphi(x_1/b_1, \ldots, x_r/b_r)\}$. To focus our attention let us assume that $S$ is ternary, $T$ is binary, and that the pp-formula that defines $T$ is $\varphi(x_1, x_2) = \exists y\, S(x_1, x_2, y)$.
For a finite L ′ -structure A the corresponding L-structure A ′ has domain A extended by a set of witnesses for S. For each (a 1 , a 2 ) ∈ T (A), we add to A a new point y(a 1 , a 2 ) so the domain A ′ is equal to A ∪ {y(a 1 , a 2 ) : (a 1 , a 2 ) ∈ T (A)}. All the relation symbols from L except for S are interpreted in A ′ the same as in A, and S(A ′ ) = S(A) ∪ {(a 1 , a 2 , y(a 1 , a 2 )) : (a 1 , a 2 ) ∈ T (A)}. It is not difficult to see that A maps homomorphically to B ′ if and only if A ′ maps homomorphically to B.
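The witness construction and the stated equivalence can be checked by brute force on a toy example. In the sketch below the template, the two instances and the helper names are hypothetical; only the construction of $A'$ (one fresh point $y(a_1, a_2)$ and one $S$-tuple per tuple of $T(A)$) follows the description above.

```python
from itertools import product


def maps_to(src_dom, src_rels, dst_dom, dst_rels):
    """Brute-force test for the existence of a homomorphism."""
    src_dom = list(src_dom)
    for values in product(list(dst_dom), repeat=len(src_dom)):
        h = dict(zip(src_dom, values))
        if all(tuple(h[a] for a in t) in dst_rels[name]
               for name, tuples in src_rels.items() for t in tuples):
            return True
    return False


def add_witnesses(A_dom, A_rels):
    """Build A' from A: drop T, add S(a1, a2, y(a1, a2)) for each T-tuple."""
    new_dom = set(A_dom)
    new_rels = {name: set(tuples) for name, tuples in A_rels.items()
                if name != "T"}
    new_rels.setdefault("S", set())
    for (a1, a2) in A_rels.get("T", ()):
        witness = ("y", a1, a2)  # fresh point y(a1, a2)
        new_dom.add(witness)
        new_rels["S"].add((a1, a2, witness))
    return new_dom, new_rels


# Toy template B and its expansion B' by T(x1, x2) := exists y . S(x1, x2, y).
B_dom = [0, 1]
S_B = {(0, 1, 0), (0, 1, 1)}
B_rels = {"S": S_B}
B_prime_rels = {"S": S_B, "T": {(b1, b2) for (b1, b2, _) in S_B}}

instances = [
    ({"u", "v"}, {"S": set(), "T": {("u", "v")}}),              # satisfiable
    ({"u", "v"}, {"S": set(), "T": {("u", "v"), ("v", "u")}}),  # unsatisfiable
]
results = []
for A_dom, A_rels in instances:
    A2_dom, A2_rels = add_witnesses(A_dom, A_rels)
    results.append((maps_to(A_dom, A_rels, B_dom, B_prime_rels),
                    maps_to(A2_dom, A2_rels, B_dom, B_rels)))
print(results)
```

On both instances the two satisfiability answers agree, as the equivalence requires.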
Note that F (1), . . . , F (q) cover T (B ′ ) and are pairwise disjoint. In other words, they partition T (B ′ ); note however that some F (b)'s may be empty.
Consider the substitution $\sigma$ defined by the identity on all variables of $G$ and defined as follows for every variable in $F$ that is not in $G$: $\sigma(X(y(a_1, a_2), b)) = \bigvee_{(b_1, b_2) \in F(b)} (X(a_1, b_1) \wedge X(a_2, b_2))$ for every $(a_1, a_2) \in T(A)$ and $b \in [q]$. Note that this is an $\ell$-DNF with at most $q^\ell$ many terms, where $\ell$ is the arity of $T$. By Lemma 6 it suffices to check that, for each clause $C$ of $F$, the substituted formula $\sigma(C)$ is a logical consequence of a bounded number of clauses of $G$.
To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b \in [q]} X(a, b)$ for $a$ in the domain of $A'$. If $a$ is not of the form $y(a_1, a_2)$, then the clause is left untouched by the substitution. Since the same clause is also in $G$, there is nothing to prove. Suppose now that $a$ is $y(a_1, a_2)$. The substituted formula is then the following: $\bigvee_{b \in [q]} \bigvee_{(b_1, b_2) \in F(b)} (X(a_1, b_1) \wedge X(a_2, b_2))$. Since the sets $F(1), \ldots, F(q)$ cover $T(B')$, this is indeed equivalent to $\bigvee_{(b_1, b_2) \in T(B')} (X(a_1, b_1) \wedge X(a_2, b_2))$. Note now that this formula is a logical consequence of the following clauses of $G$: those of type 1 for $a_1$ and $a_2$, and all those of type 3 for $(a_1, a_2)$ and the relation symbol $T$. These are at most $q^\ell + 2$ many clauses, where $\ell$ is the arity of $T$, and we are done for this case.
Suppose now that $C$ is the clause $\overline{X}(a, b) \vee \overline{X}(a, b')$ for $a$ in the domain of $A'$ and $(b, b') \in B^2$ with $b \neq b'$. As in the previous case, if $a$ is not of the form $y(a_1, a_2)$, then the clause is left untouched by the substitution and there is nothing to prove. Suppose now that $a$ is $y(a_1, a_2)$. By applying the substitution $\sigma$ and converting the resulting formula into negation normal form we obtain the following: $\bigwedge_{(b_1, b_2) \in F(b)} (\overline{X}(a_1, b_1) \vee \overline{X}(a_2, b_2)) \vee \bigwedge_{(b_1, b_2) \in F(b')} (\overline{X}(a_1, b_1) \vee \overline{X}(a_2, b_2))$. This formula says that the tuple $(a_1, a_2)$ is either not mapped to any tuple in $F(b)$ or not mapped to any tuple in $F(b')$. Since the sets $F(b)$ and $F(b')$ are disjoint, this is a logical consequence of at most $2q^2$ many clauses of $G$: those of type 2 for $a_1$ and $a_2$. Indeed, those clauses imply that the tuple $(a_1, a_2)$ can be mapped to at most one tuple from $B^2$.
Now, let C be the clause X(a 1 , b 1 ) ∨ · · · ∨ X(a r , b r ) for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A ′ ), and (b 1 , . . . , b r ) ∈ B r \ R(B). If (a 1 , . . . , a r ) ∈ A r then the same argument as above shows that there is nothing to be proved. Observe that the only other case is when R = S and C is of the form X( The substituted formula (after converting to negation normal form) is then the following: There are two possibilities. If (b 1 , b 2 ) ∈ B 2 \ T (B ′ ), then the formula above is the logical consequence of the clause X( . Observe that the substituted formula says that the tuple (a 1 , a 2 ) is not mapped to the tuple (b 1 , b 2 ) or it is not mapped to any tuple from F (b 3 ). Similarly to the previous case, this is a logical consequence of at most 2q 2 many clauses of G: those of type 2 for a 1 and a 2 . This is because those clauses imply that the tuple (a 1 , a 2 ) can be mapped to at most one tuple from B 2 .
for every (a 1 , a 2 ) ∈ T (A) and b ∈ [q]. Note that those are polynomials of degree m with at most q m many monomials and all coefficients equal 1, where m is the arity of T . We will show that for each equation in F and for each axiom inequality and equation, its substitution follows on all evaluations of its variables in {0, 1} from a bounded number of equations in Eq(G). By Lemmas 7 and 8 this implies the statement of the lemma. The way to show this is analogous as in Lemma 14 above.
Let P = 0 be any of the equations in F , say b∈BX (a, b) = 0 for a in the domain of A ′ . If a is not of the form y(a 1 , a 2 ), then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that a is y (a 1 , a 2 ). The substituted equation is then the following: This equation follows on all evaluations of its variables in {0, 1} from the set Eq(G ′ ), where G ′ contains the following equations of G: those of type 1 and 2 for a 1 and a 2 , and all those of type 3 for (a 1 , a 2 ) and the relation symbol T . Indeed, take any evaluation satisfying Eq(G ′ ). It corresponds to a mapping from {a 1 , a 2 } to B, where (a 1 , a 2 ) is mapped to a pair (b There are at most q ℓ + 2q 2 + 2 many equations in G ′ , where ℓ is the arity of T , so we are done for this case. Suppose now that P = 0 is the equation X(a, b)X(a, b ′ ) = 0 for a in the domain of A ′ and (b, b ′ ) ∈ B 2 with b = b ′ . As in the previous case, if a is not of the form y(a 1 , a 2 ), then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that a is y(a 1 , a 2 ). By applying the substitution σ we obtain the following: Since the sets F (b) and F (b ′ ) are disjoint, this equation follows on all evaluations of its variables in {0, 1} from the set of equations of type 2 for a 1 and a 2 . Indeed, those equations imply that at most one of the pairs (b 1 , b 2 ) ∈ B 2 the product X(a 1 , b 1 )X(a 2 , b 2 ) is 1. Now, let P = 0 be the equation X(a 1 , b 1 ) · . . . · X(a r , b r ) = 0 for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A ′ ), and (b 1 , . . . , b r ) ∈ B r \ R(B). If (a 1 , . . . , a r ) ∈ A r then the same argument as above shows that there is nothing to be proved. Observe that the only other case is when R = S and the equation is of the form The substituted equation is then the following: There are two possibilities. 
If $(b_1, b_2) \in B^2 \setminus T(B')$, then the equation above follows on all evaluations of its variables in $\{0, 1\}$ from the equation $X(a_1, b_1)X(a_2, b_2) = 0$ from $G$.
. In this case, the substituted equation follows on all evaluations of its variables in {0, 1} from the set of all equations of type 2 for a 1 and a 2 , which imply that for at most one of the pairs (b 1 , b 2 ) ∈ B 2 the product X(a 1 , b 1 )X(a 2 , b 2 ) is 1. Let us consider the axiom equation X(a, b) 2 − X(a, b) = 0 for a in the domain of A ′ and b ∈ B. If a is not of the form y(a 1 , a 2 ), then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that a is y (a 1 , a 2 ). By applying the substitution σ we obtain the following: This equation follows on all evaluations of its variables in {0, 1} from Eq(G ′ ) where G ′ is the set of equations of type 1 and 2 for a 1 and a 2 . Let us consider the axiom equation X(a, b) +X(a, b) − 1 = 0 for a in the domain of A ′ and b ∈ B. If a is not of the form y(a 1 , a 2 ), then the equation is left untouched by the substitution and there is nothing to prove. Suppose now that a is y(a 1 , a 2 ). By applying the substitution σ we obtain the following: This equation follows on all evaluations of its variables in {0, 1} from Eq(G ′ ) where G ′ is the set of equations of type 1 and 2 for a 1 and a 2 .
Let us consider the axiom inequality 1 −X(a, b) ≥ 0, for a in the domain of A ′ and b ∈ B. If a is not of the form y(a 1 , a 2 ), then the inequality is left untouched by the substitution and there is nothing to prove. Suppose now that a is y(a 1 , a 2 ). By applying the substitution σ we obtain the following: This inequality follows on all evaluations of its variables in {0, 1} from Eq(G ′ ) where G ′ is the set of equations of type 1 and 2 for a 1 and a 2 . They imply that at most one of the products X(a 1 , b 1 )X(a 2 , b 2 ) for (b 1 , b 2 ) ∈ F (b) is equal 1. The same way we deal with the case, when the inequality in question is the axiom inequality 1 −X(a, b) ≥ 0, for a in the domain of A ′ and b ∈ B.
Finally, the axiom inequalities $X(a, b) \geq 0$ and $\overline{X}(a, b) \geq 0$, for $a$ in the domain of $A'$ and $b \in B$, after applying the substitution $\sigma$ are always satisfied on evaluations of their variables in $\{0, 1\}$.

All together: pp-interpretations
Let B ′ be a finite L ′ -structure pp-interpretable in B, and let f : B n → B ′ be a surjective partial function such that the domain of f is defined by a pp-formula δ(x 1 , . . . , x n ) in the language L, i.e,. f −1 (B ′ ) = {(b 1 , . . . , b n ) ∈ B n : B |= δ(x 1 /b 1 , . . . , x n /b n )}, the preimage of the equality relation on B ′ is defined by a pp-formula ǫ(x 1 , . . . , x 2n ) in the language L, i.e., . . , x 2n /b 2n )}, and for every relation symbol R ∈ L ′ of arity r, the preimage of the relation R(B ′ ) is defined by a pp-formula ϕ R (x 1 , . . . , x rn ) in the vocabulary L, i.e., For every equivalence class [(b 1 , . . . , b n )] we choose a representative (b 1 , . . . , b n ) * . The L ′ -structure whose domain is the set of all representatives and for each R ∈ L ′ of arity r the relation R interpreted as {((b 1 , . . . , b n ) From now on whenever we talk about the structure B ′ we mean the structure that we have just defined.
We now define a structure B ′′ pp-definable in B and show intuitively that small refutations for B ′′ imply small refutations for B ′ . By the results of previous sections it follows that small refutations for B imply small refutations for B ′ . To this end, for every relation symbol R ∈ L ′ of arity r, letR be a relation symbol of arity nr, and let L ′′ = {R : R ∈ L ′ }. We define B ′′ to be the finite L ′′ -structure with domain B and relations defined byR For every instance A of the CSP of the language B ′ , that is, for every finite L ′ -structure A, the corresponding instance of the CSP of the language B ′′ is the L ′′ -structure A ′′ whose domain A ′′ is A × [n] and whose relations are defined bŷ Observe that for a fixed i the sets F (b, i) are disjoint subsets of B ′ and they cover the whole B ′ . In other words, they partition B ′ ; note however that some F (b, i)'s may be empty.
Consider the following substitution σ of the variables of F : Note that this is a clause with at most q n−1 many literals, and hence a 1-DNF with at most q n−1 many terms. By Lemma 6 it suffices to check that, for each clause C of F , the substituted formula σ(C) is a logical consequence of a bounded number of clauses of G.
To argue this, let C be any of the clauses in F , say b∈B X((a, i), b) for (a, i) ∈ A × [n]. The substituted formula is then the following: , (b 1 , . . . , b n )).
Since for each i ∈ [n] the sets F (b, i) partition B ′ , this is equivalent to , (b 1 , . . . , b n )), which is the clause of type 1 for a in G. Hence, we are done for this case.
Suppose now that C is the clause X ((a, i), b ′ ) respectively) is substituted by the empty formula and σ(C) is true so there is nothing to be proved. Otherwise, the substituted formula (after converting to negation normal form) is the following: and it says that either a is not mapped to any of the elements in F (b, i), or it is not mapped to any of the elements in F (b ′ , i). Since the sets F (b, i) and F (b ′ , i) are disjoint, this formula is a logical consequence of q n (q n − 1)/2 clauses of G: those of type 2 for a. Indeed, those clauses imply that the element a can be mapped to at most one tuple from B ′ . Now, let C be the clause X((a 1 , 1), b 1 ) ∨ · · · ∨ X((a r , n), b nr ) for someR ∈ L ′′ of arity nr, (a 1 , . . . , a r ) ∈ R(A), and (b 1 , . . . , b nr ) ∈ B nr \R(B ′′ ). If for some j ∈ [r] and some i ∈ [n] the set F (b nj−n+i , i) is empty, then the variable X((a j , i), b nj−n+i ) is substituted by the empty formula, in which case σ(C) is true, and there is nothing to be proved. Otherwise, the substituted formula is a q n−1 -DNF: for each j ∈ [r] and each i ∈ [n] there is a term which says that a j is not mapped to any tuple from F (b nj−n+i , i), that is, a j is not mapped to any tuple in B ′ which has b nj−n+i on the i-th coordinate. There are two cases: either the tuple ((b 1 , . . . , b n ), . . . , (b nr−n+1 , . . . , b nr )) belongs to (B ′ ) r or not. In the second case without loss of generality let us assume that (b 1 , . . . , b n ) ∈ B ′ . In particular this means that i∈[n] F (b i , i) = ∅. Then we argue that the formula and hence also the substituted formula σ(C) is a logical consequence of q n (q n − 1)/2 + 1 clauses of G: those of type 1 and 2 for a 1 . Indeed, those q n (q n − 1)/2 + 1 clauses imply that a 1 is mapped to exactly one element from B ′ . Since i∈[n] F (b i , i) = ∅, this in turn implies that there exist i such that a 1 is not mapped to any tuple from F (b i , i) and we are done. Otherwise, if the tuple ((b 1 , . . . 
, b n ), . . . , (b nr−n+1 , . . . , b nr )) belongs to (B ′ ) r , then the substituted formula is a logical consequence of at most rq n (q n − 1)/2 + 1 clauses of G: the clauses of type 2 for a 1 , . . . , a r and the clause of type 3 for R ∈ L ′ , (a 1 , . . . , a r ) ∈ R(A), and  ((b 1 , . . . , b n ), . . . , (b nr−n+1 , . . . , b nr )) ∈ (B ′ ) r \ R(B ′ ). This is not very difficult to see since those rq n (q n − 1)/2 + 1 clauses imply that the tuple (a 1 , . . . , a r ) is mapped to at most one tuple from (B ′ ) r and is not mapped to ((b 1 , . . . , b n ), . . . , (b nr−n+1 , . . . , b nr )). This in turn implies that for some j ∈ [r] and some i ∈ [n], a j is not mapped to any tuple in B ′ which has b nj−n+i on the i-th coordinate, and we are done. More formally, the rq n (q n − 1)/2 + 1 clauses in question imply that for every j ∈ [r] at most one of the variables in  Now, let P = 0 be the equation X((a 1 , 1), b 1 ) · . . . · X ((a r , n), b nr ) = 0 for someR ∈ L ′′ of arity nr, (a 1 , . . . , a r ) ∈ R(A), and (b 1 , . . . , b nr ) ∈ B nr \R(B ′′ ). If for some j ∈ [r] and some i ∈ [n] the set F (b nj−n+i , i) is empty, then the variable X((a j , i), b nj−n+i ) is substituted by 0 and the substituted equation is always satisfied. Otherwise, there are two cases: either the tuple ((b 1 , . . . , b n ), . . . , (b nr−n+1 , . . . , b nr )) belongs to (B ′ ) r or not. In the second case without loss of generality let us assume that (b 1 , . . . , b n ) ∈ B ′ . In particular this means that Hence, the set of equations of type 2 for a 1 in G imply

Homomorphic equivalence
Now let B ′ be a finite L-structure homomorphically equivalent to B. Any L-structure A maps homomorphically to B ′ if and only if it maps homomorphically to B. We fix some homomorphism from B ′ to B and denote it by h. Proof. Let F denote CNF (A, B) and let G denote CNF(A, B ′ ). Consider the substitution σ defined as follows for every variable in F : for every a ∈ A and b ∈ B. By Lemma 6 it suffices to check that, for each clause C of F , the substituted formula σ(C) is a logical consequence of a bounded number of clauses of G.
To argue this, let $C$ be any of the clauses in $F$, say $\bigvee_{b \in B} X(a, b)$ for $a$ in the domain of $A$. Observe that $\sigma(C)$ is $\bigvee_{b \in B'} X(a, b)$, which is a clause that belongs to $G$.
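The effect of the substitution on a type 1 clause can be illustrated with a short sketch. It assumes that $\sigma$ replaces $X(a, b)$ by the disjunction of the variables $X(a, b')$ over all $b' \in h^{-1}(b)$, which is what the description of $\sigma(C)$ suggests; the map `h`, the element names and the helper functions are hypothetical.

```python
def preimages(B_dom, h):
    """Group the elements of B' by their image under h : B' -> B."""
    pre = {b: [] for b in B_dom}
    for b_prime, b in h.items():
        pre[b].append(b_prime)
    return pre


def substitute_type1(a, B_dom, h):
    """Apply sigma to the clause OR_{b in B} X(a, b).

    Assumed rule: sigma(X(a, b)) = OR_{b' in h^{-1}(b)} X(a, b').
    The result is returned as a set of (element, value) variables."""
    pre = preimages(B_dom, h)
    return {(a, b_prime) for b in B_dom for b_prime in pre[b]}


B_dom = ["p", "q", "r"]       # domain of B
h = {0: "p", 1: "q", 2: "q"}  # a map from B' = {0, 1, 2} to B
clause = substitute_type1("a", B_dom, h)
print(sorted(clause))
```

Because the preimages $h^{-1}(b)$ partition $B'$, the substituted clause mentions every element of $B'$ exactly once, so it coincides with the type 1 clause for $a$ in $G$. Elements of $B$ outside the image of $h$, such as `"r"` here, contribute an empty disjunction.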
The substituted clause says that a is either not mapped to any of the elements in h −1 (b) or it is not mapped to any of the elements in h −1 (b ′ ). If any of those sets is empty, then σ(C) is true. Otherwise, since the sets h −1 (b) and h −1 (b ′ ) are disjoint, σ(C) is a consequence of the clauses of type 2 for a, which imply that a can be mapped to at most one element in B ′ . Now, let C be the clause X(a 1 , b 1 ) ∨ · · · ∨ X(a r , b r ) for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A), and (b 1 , . . . , b r ) . Therefore, σ(C) is a logical consequence of the clauses of type 3 in G for the relation symbol R, (a 1 , . . . , a r ) ∈ R(A) and all tuples Proof. Let F denote EQ(A, B) and let G denote EQ(A, B ′ ). Consider the substitution σ defined as follows for every variable in F : Now, let P = 0 be the equation X(a 1 , b 1 ) · . . . · X(a r , b r ) = 0 for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A) , and (b 1 , . . . , b r ) . Therefore, the substituted equation follows on all valuations of its variables in {0, 1} from the set of equations of type 2 for a 1 , . . . , a r and of type 3 for the relation symbol R, (a 1 , . . . , a r ) ∈ R(A) and all tuples . The argument for the axiom equation and inequalities is the same as in the proof of Lemma 15.

Adding constants
Finally we consider the extension by unary one-element relations under the assumption of B being a core. For each b ∈ B, let R b be a unary relation symbol, not in L, and let L ′ = L ∪ {R b : b ∈ B}. We assume that B is a core and B ′ is the L ′ -structure with domain B, each relation symbol from L interpreted as in B, and R b (B ′ ) = {b}, for every b ∈ B.
For every finite L ′ -structure A the corresponding L-structure A ′ has domain A ′ = A ∪ B (we assume that the sets A and B are disjoint), and every relation symbol Proof. Let F denote CNF(A ′ , B) and let G denote CNF(A, B ′ ). Consider the substitution σ defined by the identity on all variables of G and defined as follows for every variable in F that is not in G: By Lemma 6 it suffices to check that, for each clause C of F , the substituted formula σ(C) is a logical consequence of a bounded number of clauses of G.
To argue this, let C be any of the clauses in F , say $\bigvee_{b \in B} X(a, b)$ for a in the domain of A ′ . If a ∈ A, then the clause is left untouched by the substitution. Since the same clause is also in G, there is nothing to prove. Suppose now that a = b ′ ∈ B. One of the variables in C is then X(b ′ , b ′ ). This variable is substituted by the true formula so σ(C) is true, which finishes the proof in this case.
Suppose now that C is the clause ¬X(a, b) ∨ ¬X(a, b ′ ) for a in the domain of A ′ and (b, b ′ ) ∈ B 2 with b ≠ b ′ . As in the previous case, if a ∈ A, then the clause is left untouched by the substitution and there is nothing to prove. Suppose now that a = b ′′ ∈ B. Then either b ≠ b ′′ or b ′ ≠ b ′′ . Therefore, either the variable X(b ′′ , b) or the variable X(b ′′ , b ′ ) gets substituted by the empty formula, so its negation, and hence σ(C), is true.
Now, let C be the clause X(a 1 , b 1 ) ∨ · · · ∨ X(a r , b r ) for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A ′ ), and (b 1 , . . . , b r ) ∈ B r \ R(B). If (a 1 , . . . , a r ) ∈ A r then the same argument as above shows that there is nothing to be proved. If (a 1 , . . . , a r ) is substituted by the empty formula, so once again σ(C) is true. The only remaining case is when (a 1 , . . . , a r ) ∈ R(B) b:=a , where a ∈ R b (B ′ ) and a j = a for some (possibly more than one) j ∈ [r]. If there is substituted by the empty formula, so once again σ(C) is true. Otherwise, there exists j ∈ [r] such that a j = a and b j = b. Then the substituted formula is (possibly a weakening of) the formula X(a, b j ) where b j = b. This formula belongs to G: it is the clause of type 3 for a ∈ R b (A ′ ) and b j ∈ B \ R b (B ′ ).
Lemma 21. Let P be Polynomial Calculus, Sherali-Adams, Positive Semidefinite Sherali-Adams, Sums-of-Squares, Lovász-Schrijver or Positive Semidefinite Lovász-Schrijver. For every two positive integers k and s, and every finite L ′ -structure A, if there is a P refutation of EQ(A ′ , B) of degree k and size s, then there is a P refutation of EQ(A, B ′ ) of degree k and size at most s.
Proof. Let F denote EQ(A ′ , B) and let G denote EQ (A, B ′ ). Consider the substitution σ defined by the identity on all variables of G and defined as follows for every variable in F that is not in G: Now, let P = 0 be the equation X(a 1 , b 1 ) · . . . · X(a r , b r ) = 0 for some natural number r, R ∈ L of arity r, (a 1 , . . . , a r ) ∈ R(A ′ ), and (b 1 , . . . , b r ) ∈ B r \ R(B). If (a 1 , . . . , a r ) ∈ A r then the same argument as above shows that there is nothing to be proved. Otherwise, if (a 1 , . . . , a r ) ∈ B r then P = 0 is of the form All the axiom equations and inequalities from Ineq(F ) after applying the substitution σ either become true or are axiom equations and inequalities for the variables of G.

Upper bound
In this section we show that templates of bounded width (cf. Section 2.5) admit efficient refutations in resolution. It immediately follows that the bounded width property ensures efficient refutations in bounded depth Frege, as well as in Polynomial Calculus over the reals, Sherali-Adams and Sums-of-Squares proof systems (cf. Lemma 9). Together with matching lower bounds obtained in the next section, this will complete the proof of Theorem 1.
Let k(n) be a function. Let B be a finite relational structure over a finite vocabulary and let E be a propositional encoding scheme for CSP(B). We say that B has resolution refutations of width k(n) with respect to the encoding scheme E if, for every finite structure A over the same vocabulary as B with n elements that admits no homomorphism to B, the formula E(A) has a resolution refutation of width k(n). We say that B has resolution refutations of constant width if there exist a local encoding E and a function k(n) = O(1) such that B has resolution refutations of width k(n) with respect to E. Lemma 4 implies that a structure B has resolution refutations of constant width if and only if it has resolution refutations of constant width with respect to any local encoding scheme. In this section we use the CNF encoding scheme. The goal is to prove the following:

Theorem 8. Let B be a finite relational structure. The following are equivalent:
1. B has bounded width,
2. B has resolution refutations of constant width.
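As a concrete reference point, the following minimal sketch generates the three clause types of CNF(A, B) that the proofs in this paper refer to (type 1: every a has some image; type 2: every a has at most one image; type 3: no tuple of R(A) is sent outside R(B)). The function name `cnf_encoding`, the literal representation and the use of unordered pairs in type 2 are illustrative assumptions, not notation from the paper.

```python
from itertools import product

def cnf_encoding(A_dom, B_dom, rels_A, rels_B):
    """Return the clauses of CNF(A, B) as lists of (sign, (a, b)) literals.

    rels_A and rels_B map each relation symbol to a set of tuples.
    sign=True stands for X(a, b), sign=False for its negation.
    """
    clauses = []
    # Type 1: every element a of A is mapped to at least one b.
    for a in A_dom:
        clauses.append([(True, (a, b)) for b in B_dom])
    # Type 2: every element a of A is mapped to at most one b
    # (one clause per unordered pair of distinct elements of B).
    for a in A_dom:
        for i, b in enumerate(B_dom):
            for b2 in B_dom[i + 1:]:
                clauses.append([(False, (a, b)), (False, (a, b2))])
    # Type 3: no tuple in R(A) is mapped to a tuple outside R(B).
    for R, tuples in rels_A.items():
        for ta in tuples:
            for tb in product(B_dom, repeat=len(ta)):
                if tb not in rels_B[R]:
                    clauses.append([(False, (a, b)) for a, b in zip(ta, tb)])
    return clauses

# Hypothetical example: A is the triangle K3, B is the single edge K2.
K3 = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
K2 = {"E": {(0, 1), (1, 0)}}
clauses = cnf_encoding([0, 1, 2], [0, 1], K3, K2)
width = max(len(c) for c in clauses)  # largest clause of the encoding
```

Every clause produced this way has at most max(|B|, r) literals when r ≥ 2, which is why the encoding is a q-CNF for a constant q depending only on B.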
In preparation for the proof we revisit the characterization of resolution width in terms of existential pebble games from [7].
Let $L = \{R_0, \dots, R_q\}$ be a finite relational vocabulary consisting of $q + 1$ symbols of arity $q$. Let $S_q$ be an $L$-structure with two-element domain $\{0, 1\}$, where each relation $R_i(S_q)$ encodes the set of valuations that satisfy a $q$-clause with $i$ negated variables. More precisely, for $0 \le i \le q$, let $R_i(S_q) = \{0, 1\}^q \setminus \{(x_1, \dots, x_q)\}$, where $(x_1, \dots, x_q) \in \{0, 1\}^q$ is the vector defined by $x_j = 0$ for $j > i$ and $x_j = 1$ otherwise. Now for every $q$-CNF $F$, we define an $L$-structure $A_F$. Its domain is the set of variables in $F$, and the relation $R_i(A_F)$ is the set of all tuples $(X_1, \dots, X_q)$ such that the clause $\neg X_1 \vee \dots \vee \neg X_i \vee X_{i+1} \vee \dots \vee X_q$ belongs to $F$. We allow the variables in the clauses to repeat, so the definition covers clauses with fewer than $q$ literals. Observe that partial homomorphisms from $A_F$ to $S_q$ correspond to partial truth assignments to the variables of $F$ that do not falsify any clause from $F$. Hence, for every $q$-CNF $F$, it holds that $F$ is satisfiable if and only if there is a homomorphism from $A_F$ to $S_q$.

Theorem 9 ([7]). Let $k$ and $q$ be positive integers such that $k \ge q$ and let $F$ be a $q$-CNF. Then $F$ has a resolution refutation of width $k$ if and only if Spoiler wins the existential $(k + 1)$-pebble game on $A_F$ and $S_q$.
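The correspondence between truth assignments and homomorphisms can be checked mechanically on small formulas. The sketch below builds the relations of S_q and of A_F and compares "satisfies F" with "is a homomorphism"; the literal representation `(variable, negated)`, the padding choice for short clauses and all function names are assumptions made for illustration.

```python
from itertools import product

def relations_S(q):
    """R_i(S_q): all of {0,1}^q except the vector with i ones followed by zeros."""
    all_vecs = set(product((0, 1), repeat=q))
    return {i: all_vecs - {tuple([1] * i + [0] * (q - i))} for i in range(q + 1)}

def structure_of_cnf(clauses, q):
    """Relations of A_F; each clause is a non-empty list of (variable, negated)."""
    rel = {i: set() for i in range(q + 1)}
    for clause in clauses:
        neg = [v for v, negated in clause if negated]
        pos = [v for v, negated in clause if not negated]
        missing = q - len(clause)
        # Short clauses are padded by repeating one of their own literals.
        if pos:
            pos = pos + [pos[-1]] * missing
        else:
            neg = neg + [neg[-1]] * missing
        rel[len(neg)].add(tuple(neg + pos))
    return rel

def is_homomorphism(assignment, rel_A, rel_S):
    return all(
        tuple(assignment[v] for v in t) in rel_S[i]
        for i, tuples in rel_A.items()
        for t in tuples
    )

def satisfies(assignment, clauses):
    return all(
        any(bool(assignment[v]) != negated for v, negated in clause)
        for clause in clauses
    )

# Hypothetical 2-CNF: (x or y), (not x or y), (not y).
F = [[("x", False), ("y", False)], [("x", True), ("y", False)], [("y", True)]]
A_F, S_2 = structure_of_cnf(F, 2), relations_S(2)
agree = all(
    is_homomorphism({"x": a, "y": b}, A_F, S_2) == satisfies({"x": a, "y": b}, F)
    for a, b in product((0, 1), repeat=2)
)
```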
In this section we use the above theorem to establish a similar correspondence between existential pebble games on arbitrary structures A and B and bounded-width resolution refutations of CNF(A, B). In the following, let the notation $A \le_k B$ mean that Duplicator wins the existential k-pebble game on A and B.
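For very small structures, the winner of the existential k-pebble game can be computed directly. The sketch below relies on a standard characterisation that is not spelled out in this section and is therefore an assumption here: Duplicator wins if and only if there is a non-empty family of partial homomorphisms with domains of size at most k that is closed under restrictions and has the extension ("forth") property. The function names are hypothetical and the running time is exponential in k.

```python
from itertools import combinations, product

def is_partial_hom(f, rels_A, rels_B):
    """f is a dict; only tuples lying entirely inside its domain are checked."""
    return all(
        tuple(f[a] for a in t) in rels_B[R]
        for R, tuples in rels_A.items()
        for t in tuples
        if all(a in f for a in t)
    )

def duplicator_wins(A_dom, B_dom, rels_A, rels_B, k):
    # Start from all partial homomorphisms with at most k pebbled elements,
    # stored as frozensets of (a, b) pairs.
    family = set()
    for size in range(k + 1):
        for dom in combinations(A_dom, size):
            for img in product(B_dom, repeat=size):
                f = dict(zip(dom, img))
                if is_partial_hom(f, rels_A, rels_B):
                    family.add(frozenset(f.items()))
    # Repeatedly delete positions from which Spoiler can force a win.
    changed = True
    while changed:
        changed = False
        for f in list(family):
            dom = {a for a, _ in f}
            restrictions_ok = all(f - {pair} in family for pair in f)
            extensions_ok = len(f) == k or all(
                any(f | {(a, b)} in family for b in B_dom)
                for a in A_dom if a not in dom
            )
            if not (restrictions_ok and extensions_ok):
                family.discard(f)
                changed = True
    # Duplicator wins exactly when the empty position survives.
    return frozenset() in family

# Hypothetical example: the triangle K3 against the single edge K2.
K3 = {"E": {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}}
K2 = {"E": {(0, 1), (1, 0)}}
two_pebbles = duplicator_wins([0, 1, 2], [0, 1], K3, K2, 2)
three_pebbles = duplicator_wins([0, 1, 2], [0, 1], K3, K2, 3)
```

In this example two pebbles never expose the odd cycle, while three pebbles let Spoiler cover the whole triangle.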

Lemma 22. Let A and B be relational structures over the same vocabulary of maximum arity r, and let k be an integer such that k ≥ |B| and k ≥ r. Then:
1. if $A \not\le_{k+2} B$, then CNF(A, B) has a resolution refutation of width k + |B|,
2. if $A \le_{k+2} B$, then CNF(A, B) does not have a resolution refutation of width k + 1.
Proof. Let F denote CNF(A, B). Let q be the maximum of the number of elements in B and the arity of relation symbols in the vocabulary of A and B. Observe that F is a q-CNF. Lemma 22 follows from Theorem 9 together with the following facts.
Indeed, if Spoiler wins the existential (k + 2)-pebble game on A and B then, by Claim 4, Spoiler wins the existential (k + |B| + 1)-pebble game on A F and S q and, by Theorem 9, F has a resolution refutation of width k + |B|. On the other hand, if Duplicator wins the existential (k + 2)-pebble game on A and B then, by Claim 5, Duplicator wins the existential (k+2)-pebble game on A F and S q and, by Theorem 9, F does not have a resolution refutation of width k + 1. It remains to prove Claims 4 and 5.
Proof of Claim 4. We prove the contrapositive. Suppose that Spoiler wins the existential (k + 1)-pebble game on A and B. We give a winning strategy for Spoiler in the existential (k + |B|)-pebble game on A F and S q . We simulate each move of Spoiler in the game on A and B by |B| moves in the game on A F and S q . Suppose that Spoiler puts the i-th pebble on an element a of A. We simulate this by pebbling elements X(a, b) of A F , for each b ∈ B. There are two possibilities. If the answer of Duplicator falsifies any of the clauses of types 1 or 2 in F then Spoiler wins immediately. Otherwise, the answer of Duplicator is 1 for exactly one element X(a, b ′ ). We simulate this by putting a pebble on the element b ′ of B in the game on A and B. Now, in the game on A F and S q the pebble which lies on the element X(a, b ′ ) stays there until Spoiler picks up the i-th pebble from the element a in A. The other |B| − 1 pebbles which lie on elements X(a, b) for b ≠ b ′ can be used to simulate subsequent moves. Therefore, to simulate the existential (k + 1)-pebble game on A and B we need only |B| − 1 extra pebbles.
If during the course of the game Spoiler does not win by falsifying any of the clauses of types 1 or 2 then the simulation of the game on A and B continues. Since in the simulated game Spoiler has a winning strategy, after a finite number of rounds the partial assignment f : A → B defined by f (a i ) = b i is not a partial homomorphism. This means that there exist a natural number r, a relation symbol R ∈ L of arity r, a tuple (a ′ 1 , . . . , a ′ r ) ∈ R(A) and a tuple (b ′ 1 , . . . , b ′ r ) ∈ B r \ R(B), such that for every i ∈ [r] the pairs of elements (a ′ i , b ′ i ) are pebbled by pairs of corresponding pebbles. It follows from the construction that in the simulation game on A F and S q the pairs of elements (X(a ′ i , b ′ i ), 1) are pebbled by pairs of corresponding pebbles. This means that the partial assignment defined by the current configuration of the game falsifies one of the clauses of type 3 in F and Spoiler wins.
Proof of Claim 5. We prove the contrapositive. Suppose that Spoiler wins the existential k-pebble game on A F and S q . We give a winning strategy for Spoiler in the existential k-pebble game on A and B. We simulate each move of Spoiler in the game on A F and S q by a single move in the game on A and B. Suppose that Spoiler puts a pebble on an element X(a, b) of A F . We simulate this by pebbling the element a of A. If Duplicator responds by putting the corresponding pebble on the element b of B, then we simulate this by pebbling 1 in S q , otherwise we pebble 0 in S q .
It is not difficult to see that this is indeed a winning strategy for Spoiler in the existential k-pebble game on A and B. Since in the simulated game Spoiler has a winning strategy, after a finite number of rounds the partial assignment f : A F → {0, 1} corresponding to the current configuration of the game on A F and S q is not a partial homomorphism. Observe that it is not possible to falsify any of the clauses of type 1 in F . If for some a ∈ A and some (b, b ′ ) ∈ B 2 such that b ≠ b ′ , the partial assignment f falsifies ¬X(a, b) ∨ ¬X(a, b ′ ), it means that pairs of corresponding pebbles lie on pairs of elements (X(a, b), 1) and (X(a, b ′ ), 1). Hence, in the simulation game on A and B pairs of corresponding pebbles lie on pairs of elements (a, b) and (a, b ′ ) and the partial assignment is not well defined. Finally, if for some natural number r, a relation symbol R ∈ L of arity r, a tuple (a ′ 1 , . . . , a ′ r ) ∈ R(A) and a tuple (b ′ 1 , . . . , b ′ r ) ∈ B r \ R(B), the partial assignment f falsifies the clause ¬X(a ′ 1 , b ′ 1 ) ∨ · · · ∨ ¬X(a ′ r , b ′ r ), it means that in the game on A and B for every i ∈ [r], the pairs of elements (a ′ i , b ′ i ) are pebbled by pairs of corresponding pebbles, and the partial assignment given by the current configuration of the game is also not a partial homomorphism, which ends the proof.
We are ready to wrap up.

Proof of Theorem 8. For the implication 1 to 2, assume that B has bounded width, say width l, and let k = max{|B|, r, l}, where r is the maximum arity of the relations in the vocabulary of B. Let A be a structure over the same vocabulary as B and assume that there is no homomorphism from A to B. Then Spoiler wins the existential l-pebble game on A and B, and hence also the existential (k + 2)-pebble game on A and B, since k + 2 ≥ l. The hypotheses of Lemma 22 hold, so by part 1 of that lemma, CNF(A, B) has a resolution refutation of width k + |B|. This shows that B has resolution refutations of width k + |B|, and hence resolution refutations of constant width.
For the implication 2 to 1, assume that B has resolution refutations of width l. Again let k = max{|B|, r, l}, where r is the maximum arity of the relations in the vocabulary of B. Let A be a structure over the same vocabulary as B and assume that there is no homomorphism from A to B. Then CNF(A, B) has a resolution refutation of width l, and hence of width k + 1 since k + 1 ≥ l. The hypotheses of Lemma 22 hold, so by the contrapositive of part 2 of that lemma, Spoiler wins the existential (k + 2)-pebble game on A and B. This shows that B has width k + 2, and hence bounded width.

Lower bounds
Let d(n), k(n) and s(n) be functions. Let B be a finite relational structure over a finite vocabulary and let E be a propositional encoding scheme for CSP(B). We say that the structure B has Frege refutations of depth d(n), bottom fan-in k(n), and size s(n) with respect to the encoding scheme E if, for every finite structure A over the same vocabulary as B with n elements, if there is no homomorphism from A to B, then E(A) has a Frege refutation of depth d(n), bottom fan-in k(n), and size s(n). We say that B has bounded-depth Frege refutations of subexponential size if there exist a local encoding scheme E and functions d(n) = O(1), k(n) = O(1) and $s(n) = 2^{n^{o(1)}}$ such that the structure B has Frege refutations of depth d(n), bottom fan-in k(n), and size s(n) with respect to E. Due to Lemma 4 the structure B has bounded-depth Frege refutations of subexponential size if and only if it has bounded-depth Frege refutations of subexponential size with respect to any local propositional encoding scheme.
Similarly, for any field F , if E is an algebraic encoding scheme over F , we say that the structure B has PC refutations over F of degree d(n) with respect to the encoding scheme E if, for every finite structure A over the same vocabulary as B with n elements, if there is no homomorphism from A to B, then E(A) has a PC refutation over F of degree d(n). We say that B has PC refutations over F of sublinear degree if there exist a local encoding scheme E over F and a function d(n) = o(n) such that the structure B has PC refutations over F of degree d(n) with respect to E. Due to Lemma 4 the structure B has PC refutations over F of sublinear degree if and only if it has PC refutations over F of sublinear degree with respect to any local algebraic encoding scheme.
Finally, if E is a semi-algebraic encoding scheme, we say that the structure B has SOS refutations of degree d(n) with respect to the encoding scheme E if, for every finite structure A over the same vocabulary as B with n elements, if there is no homomorphism from A to B, then E(A) has a SOS refutation of degree d(n). We say that B has SOS refutations of sublinear degree if there exist a local encoding scheme E and a function d(n) = o(n) such that the structure B has SOS refutations of degree d(n) with respect to E. Due to Lemma 4 the structure B has SOS refutations of sublinear degree if and only if it has SOS refutations of sublinear degree with respect to any local semi-algebraic encoding scheme.
The goal of this section is to prove the following:

The equivalence of 1 and 4 is known [47]. Here we provide an alternative proof. The implication 1 to 2 follows from Theorem 8: every resolution refutation is a Frege refutation of depth one, and if the refutation has bounded width, then it has polynomial size and hence subexponential size. The implications 1 to 3 and 1 to 4 follow from Theorem 8 via the fact that bounded-degree Polynomial Calculus and bounded-degree Sherali-Adams simulate bounded-width resolution (cf. Lemma 9); note that the simulation by bounded-degree Sherali-Adams implies also the simulation by bounded-degree Sums-of-Squares and, for both Polynomial Calculus and Sums-of-Squares, bounded degree implies constant, and hence sublinear, degree. For implications 2 to 1, 3 to 1 and 4 to 1 we use an algebraic characterization of unbounded width. We begin with some definitions.

Algebraic characterization of unbounded width
Let $G = (G, +, 0)$ be a finite Abelian group. For every positive integer $n$, each $g \in G$ and every $(z_1, \dots, z_n) \in \mathbb{Z}^n$, we define a relation $R_{(g,z_1,\dots,z_n)} = \{(g_1, \dots, g_n) \in G^n : z_1 g_1 + \dots + z_n g_n = g\}$, where $z_i g_i$ is a shorthand for the sum of $|z_i|$ copies of $\mathrm{sign}(z_i) g_i$. Let $\sim$ be the equivalence relation on the set $\bigcup_{n>0} G \times \mathbb{Z}^n$ that identifies tuples defining the same relation, i.e., $(g, z_1, \dots, z_n) \sim (g', z'_1, \dots, z'_{n'})$ if and only if $n = n'$ and $R_{(g,z_1,\dots,z_n)} = R_{(g',z'_1,\dots,z'_{n'})}$. Let $L(G)$ be the infinite relational vocabulary that for every equivalence class $[(g, z_1, \dots, z_n)]$ has one $n$-ary relation symbol $E_{[(g,z_1,\dots,z_n)]}$, and let $B(G)$ be the $L(G)$-structure that has domain $G$ and where each relation symbol $E_{[(g,z_1,\dots,z_n)]}$ is interpreted as $R_{(g,z_1,\dots,z_n)}$. The CSP of $B(G)$ is called LIN(G). One should think about instances of LIN(G) as systems of linear equations over the group $G$. For simplicity, for any instance $A$ of LIN(G) we denote the fact that a tuple $(a_1, \dots, a_n) \in A^n$ belongs to the relation $E_{[(g,z_1,\dots,z_n)]}(A)$ by $z_1 a_1 + \dots + z_n a_n = g$.
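To make the equivalence relation on coefficient tuples tangible, the following sketch computes the relations defined above for the cyclic group Z_q and collects the distinct ones of a fixed arity. The restriction to Z_q and the helper names `relation` and `distinct_relations` are assumptions chosen for illustration.

```python
from itertools import product

def relation(q, g, zs):
    """Tuples (g_1, ..., g_n) over Z_q with z_1 g_1 + ... + z_n g_n = g (mod q)."""
    return frozenset(
        gs
        for gs in product(range(q), repeat=len(zs))
        if sum(z * x for z, x in zip(zs, gs)) % q == g % q
    )

def distinct_relations(q, n, coeff_range):
    """Set of distinct n-ary relations obtained from coefficients in coeff_range."""
    return {
        relation(q, g, zs)
        for g in range(q)
        for zs in product(coeff_range, repeat=n)
    }

# Two different coefficient tuples that define the same relation over Z_3,
# so they fall into the same equivalence class.
same_class = relation(3, 1, (1, 2)) == relation(3, 1, (4, -1))

# Enlarging the coefficient range does not create new binary relations.
small = distinct_relations(3, 2, range(0, 3))
large = distinct_relations(3, 2, range(-3, 6))
```

Although infinitely many integer tuples are available, only finitely many relations of each arity arise, which is the finiteness that the argument following this definition relies on.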
Observe that, since there are only finitely many relations of a fixed arity k on the finite set G, the equivalence relation ∼ restricted to G × Z k has finitely many equivalence classes. Thus, in view of Theorems 5 and 7, in order to prove that 2. implies 1., 3. implies 1., and 4. implies 1. in Theorem 10, it suffices to prove lower bounds for 3LIN(G), for every non-trivial finite Abelian group G.

Lower bound for bounded-depth Frege
In [17], an exponential lower bound on the size of bounded-depth Frege proofs of the so-called Tseitin formulas was obtained by reduction from the pigeonhole principle formulas; the latter are known to be hard for bounded-depth Frege by the so-called Jewel Theorem of Proof Complexity [1,16,37]. The Tseitin formulas encode certain systems of linear equations over Z 2 that are derived from expander graphs. Here we adapt the formulas to encode systems of linear equations over arbitrary finite Abelian groups, and then show that the reduction in [17] can be generalised to our formulas. We use the CNF encoding scheme.
Theorem 12. For every integer d and every non-trivial finite Abelian group G there exists a positive constant δ and a family of unsatisfiable instances (A n ) n≥1 of 3LIN(G), where A n has Θ(n) variables and Θ(n) equations, such that for every sufficiently large integer n every Frege refutation of CNF(A n , B(G, 3)) of depth d has size at least $2^{n^{\delta}}$.
The rest of this section is devoted to the proof of Theorem 12. We provide a proof for the special case when G is the cyclic group Z q of integers under addition modulo q, for some q ≥ 2. Lemma 25 at the end of this section shows that, thanks to the Fundamental Theorem of Finite Abelian Groups, the special case of G = Z q implies Theorem 12 in full generality. The proof of the general case would actually be the same, however we believe that by focusing on simpler groups we make the arguments easier to follow.
Linear equations over Abelian groups. For the rest of this section, let us fix G to be the cyclic group Z q of integers under addition modulo q, for some q ≥ 2. Whenever we talk about an element z of the group G, where z is some integer, we mean the unique element corresponding to z modulo q. The instances of 3LIN(G) that we show to be hard for bounded-depth Frege are special cases of so-called Tseitin graph tautologies for Z q as defined in [22]. Before defining them we need to introduce some terminology. • the set of variables is the set E(H) of the edges of the graph; • for every vertex v ∈ V (H) there is an equation The system A(H, σ) can be seen as an instance of LIN(G). The formula CNF (A(H, σ), B(G)) is called a Tseitin formula. If the graph H is obtained from directing the edges of a k-regular undirected graph, i.e., a graph in which each vertex has degree k, then A(H, σ) is an instance of kLIN(G).
It is easy to see that if $\sum_{v \in V(H)} \sigma(v) \neq 0$, then the instance A(H, σ) is unsatisfiable. Indeed, since every variable e appears positively on the left-hand side of exactly one equation and negatively on the left-hand side of exactly one equation, by summing up all the equations we get 0 on the left-hand side and $\sum_{v \in V(H)} \sigma(v)$ on the right-hand side. If $\sum_{v \in V(H)} \sigma(v) \neq 0$ we obtain a contradiction. It is not difficult to show that for a connected graph H, the converse statement holds as well.

Proof. The left-to-right direction is clear. For the opposite direction we define a solution to A(H, σ) by assigning values to edges of the graph H one by one while keeping two invariants: none of the equations gets falsified and the graph induced by unassigned edges is connected. Below we formalize this intuition.
Since H is connected, we can enumerate its edges e 1 , . . . , e m in such a way that for every i ∈ [m] the graph H i+1 obtained from H by removing the edges e 1 , . . . , e i and then deleting all isolated vertices, is connected. Let us denote the system A(H, σ) by A 1 . We assign values to the edges of H in the order specified above. Additionally, for each i ∈ [m − 1], after assigning a value to e i we substitute the variable e i in the system A i with this value, and next move all constants to the right-hand side of the equations. We denote the obtained system by A i+1 . The variables of A i+1 are e j , for j > i. Observe that for every i ∈ [m], the sum of group elements that appear on the right-hand side of all the equations in A i is 0.
Assume that we have already assigned values to the edges e j for j < i without falsifying any of the equations in A(H, σ) (this is true for i = 1). The variable e i appears in exactly two equations in A i . There are two possibilities:
• if i ≠ m then at least one of the two equations has at least one more variable whose value has not yet been assigned. This is because the graph H i is connected. Then we can assign a value to e i in such a way that none of the equations in A i gets falsified: the value is either forced by the other equation, or can be assigned arbitrarily.
• if i = m then the two equations which mention the variable e m are of the form e m = g and −e m = h, for some elements g and h of the group. All the other equations in the system A m are of the form 0 = 0. Since the sum of the group elements on the right-hand side of the equations in A m is 0, we have that g = −h and we can assign the value g to e m , satisfying the last two equations.
This finishes the construction of a solution to A(H, σ).

Proof. Since A(∂(W ), σ) is the sum of the equations in A(W, σ), the left-to-right direction is obvious. For the opposite direction, let f : ∂(W ) → G be an assignment satisfying A(∂(W ), σ). Let us denote by H ′ the graph induced by W . Observe that by assigning values to the variables in ∂(W ) according to f and moving all the constants in the system A(W, σ) to the right we obtain a system A(H ′ , σ ′ ) for some labelling σ ′ : W → G of the vertices in H ′ which satisfies $\sum_{v \in W} \sigma'(v) = 0$. By Lemma 23 there exists a solution g to the system A(H ′ , σ ′ ). By extending f with g we obtain a solution to A(W, σ).
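The satisfiability criterion for connected graphs can be confirmed by exhaustive search on small instances. In the sketch below each directed edge is a variable over Z_q, and the equation at a vertex is the sum of its outgoing edges minus the sum of its incoming edges; this sign convention is an assumption (it only needs each edge to appear once positively and once negatively), as are the function name and the example graph.

```python
from itertools import product

def tseitin_satisfiable(q, vertices, edges, sigma):
    """Brute-force satisfiability of the system A(H, sigma) over Z_q.

    edges is a list of directed pairs (tail, head); sigma maps vertices to labels.
    Assumed convention: an edge counts +1 at its tail and -1 at its head.
    """
    for values in product(range(q), repeat=len(edges)):
        satisfied = True
        for v in vertices:
            lhs = sum(x for (tail, head), x in zip(edges, values) if tail == v)
            lhs -= sum(x for (tail, head), x in zip(edges, values) if head == v)
            if lhs % q != sigma[v] % q:
                satisfied = False
                break
        if satisfied:
            return True
    return False

# Hypothetical connected graph: a directed 4-cycle with one chord, over Z_3.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
balanced = {0: 1, 1: 2, 2: 0, 3: 0}    # labels sum to 0 modulo 3
unbalanced = {0: 1, 1: 0, 2: 0, 3: 0}  # labels sum to 1 modulo 3
sat_balanced = tseitin_satisfiable(3, vertices, edges, balanced)
sat_unbalanced = tseitin_satisfiable(3, vertices, edges, unbalanced)
```

The search is exponential in the number of edges, so it is only meant as a sanity check of the statement, not as a replacement for the constructive argument in the proof.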
Hard Tseitin graph tautologies are usually based on graphs that are expanders. For an undirected graph H = (V (H), E(H)) the expansion constant is: We call a family H of undirected graphs a family of expander graphs if it is infinite and there exists a positive constant e such that e(H) ≥ e for every graph H in H. For more information on expanders see e.g. the survey [31]. Here we only need the well known fact that expander families exist. Our proof strategy is now the following. We take a family (H n ) n≥1 of connected 3regular undirected expander graphs and show that for every sufficiently large integer n, one can specify edge directions inH n obtaining a directed graph H n , and choose an appropriate labelling σ n of the vertices of H n with elements of G, such that every Frege refutation of CNF(A(H n , σ n ), B(G, 3)) of depth d has size at least 2 n δ . To this end, following the lines of [17], we reduce the onto-pigeonhole principle, which states that there is no bijection between sets of size m and m + 1, to a Tseitin formula over a complete bipartite graph and further reduce the latter to the Tseitin formula over H n . Let us begin with the second reduction.
Reducing Tseitin formulas over a complete bipartite graph. We now define the graphs H n together with labellings σ n and show a reduction of a Tseitin formula over a complete bipartite graph to the Tseitin formula over H n .
The following is a special case of Theorem 4.2 in [17]. • For every i ∈ [h] the subgraph induced by V i is connected.
• For any 1 ≤ i < j ≤ h, there is at least one edge incident to some vertex in V i and to some vertex in V j .
Let us fix a family (H n ) n≥1 of connected 3-regular undirected expander graphs, where the graphH n has Θ(n) vertices. Let c be the constant whose existence follows from the theorem above. Consider the graphH n , for some n ≥ (5/c) 3 , and take a partition of the set of vertices V (H n ) into at least h ≥ cn 1/3 ≥ 5 subsets satisfying the conditions given in Theorem 13. Let us call them bubbles. Without loss of generality we can assume that the number of bubbles is odd, i.e., h = 2m + 1, otherwise we remove the bubbles V h−1 and V h and substitute them by a single bubble V h−1 ∪ V h . The set of bubbles is denoted W.
Based on the undirected expander graph fixed above, together with the partition of the set of its vertices into 2m + 1 bubbles, we define a directed graph H n and a labelling σ n . The idea is to simulate the complete bipartite graph K m,m+1 . To this end, let us fix a partition of the set of bubbles into two disjoint sets: {V 1 , . . . , V m } and {W 1 , . . . , W m+1 }. For each i ∈ [m] and each j ∈ [m + 1] let us fix an edge e i,j incident to some vertex in V i and to some vertex in W j . For future reference, let us say that we paint those edges red. We fix the direction of each of those edges from W j to V i . The directions of the rest of the edges in the graph H n are set arbitrarily. Now, for each i ∈ [m] we fix one vertex v i in the bubble V i , paint it blue and label it with −1; similarly for each j ∈ [m + 1] we fix one vertex v j in the bubble W j , paint it green and label it with 1. The rest of the vertices of the graph H n are labelled with 0. This finishes the definition of the directed graph H n and the labelling σ n .
Observe that the Tseitin tautology A(H n , σ n ) is unsatisfiable. Indeed, the sum of all labels of the vertices of H n is (m + 1) · 1 − m · 1 = 1 ≠ 0.
We now show that by assigning truth values to some variables in CNF(A(H n , σ n ), B(G, 3)) we obtain an encoding of CNF(A(K m,m+1 , σ), B(G)), where K m,m+1 is a complete bipartite graph, and σ is some labelling of its vertices with elements of the group G. Let us first make some general comments about partial assignments for instances of LIN(G) and variable substitutions for corresponding CNFs.
Let A be any system of linear equations over the group G. For a partial assignment ρ which maps some of the variables of A to elements of the group, there is a natural corresponding substitution of the variables in CNF(A, B(G)): if the partial assignment ρ maps a variable a to g, then the variable X(a, g) is substituted by 1 and the variables X(a, g ′ ), for g ′ ≠ g, are substituted by 0; if the partial assignment ρ leaves the value of some variable a unassigned, then for all the variables X(a, g), the substitution is defined by the identity. It is not difficult to see that the result of applying this substitution to CNF(A, B(G)) is CNF(A| ρ , B(G)), where A| ρ is the system of linear equations obtained by applying ρ to the variables of A. For simplicity, we denote the above defined substitution by ρ. Hence, CNF(A, B(G))| ρ = CNF(A| ρ , B(G)).
Coming back to the graph H n , let us consider a partial assignment ρ which, for each of the bubbles V ∈ W, maps the non-red edges in ∂(V ) to the group element 0, and leaves the value of the rest of the edges in H n unassigned. Observe that ρ does not falsify any of the equations in A(H n , σ n ). Indeed, since every subgraph induced by a single bubble is connected, for every vertex v, the value of at least one variable that appears in the equation is left unassigned. Moreover, for every bubble V ∈ W the equation A(∂(V ), σ n )| ρ says that the sum of the red edges in ∂(V ) is 1. This is clear for the bubbles in {W 1 , . . . , W m+1 }, and for the bubbles in {V 1 , . . . , V m } one only needs to multiply the corresponding equations by −1.
Consider a complete bipartite graph K m,m+1 with m blue vertices, m + 1 green vertices and a directed red edge from every green vertex to every blue vertex. Let the labelling σ assign −1 to the blue vertices and 1 to the green ones. The Tseitin tautology A(K m,m+1 , σ) is up to renaming of variables the same as the set of equations in: Therefore, from now on we denote the above system of linear equations by A(K m,m+1 , σ). Note that, for each of the vertices v of the graph K m,m+1 , the corresponding equation says that the sum of the variables in ∂(v) is 1.
Let r = max (3, |G|). For m ≤ l, an r-CNF F over variables X 1 , . . . , X l is called an implicit encoding [17] of a propositional formula ψ over variables X 1 , . . . , X m if the following holds: a truth assignment to the variables of ψ satisfies ψ if and only if it can be extended to a truth assignment to the variables of F which satisfies F . The variables X m+1 , . . . , X l are called auxiliary variables.
It follows from Lemma 24 that, for each bubble V ∈ W, the formula CNF(A(V, σ n ), B(G)) is an implicit encoding of the formula CNF(A(∂(V ), σ n ), B(G)), with the set of auxiliary variables being the set of edges of the subgraph induced by V . Since on this set of edges the substitution ρ is defined as the identity, it is not difficult to see that, for each bubble V ∈ W, the substituted formula CNF(A(V, σ n ), B(G))| ρ is an implicit encoding of the substituted formula CNF(A(∂(V ), σ n ), B(G))| ρ . Moreover, the sets of auxiliary variables in those implicit encodings are pairwise disjoint, hence the formula This way we have reduced an implicit encoding of a Tseitin formula over a complete bipartite graph K m,m+1 to the Tseitin formula over the expander graph H n , where m > Cn 1/3 , and C is a constant which does not depend on n.
Reducing the pigeonhole principle. We now use the technique for removing auxiliary variables without significantly increasing the proof size introduced in [17] to reduce the onto-pigeonhole principle formula OPHP(m, m + 1), as defined below, to the formula CNF (A(H n , σ n ), B(G, 3))| ρ .
For a positive integer $l$ and a set of variables $X_1, \dots, X_l$, we denote by $U(X_1, \dots, X_l)$ the CNF which has a clause $\bigvee_{i \in [l]} X_i$, and for every $1 \le i < i' \le l$, a clause $\neg X_i \vee \neg X_{i'}$. For a complete bipartite graph $K_{l,l+1}$, the onto-pigeonhole principle OPHP(l, l + 1) is the CNF which is the union of $U(\partial(v))$ over the set of all vertices $v$ of the graph.
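The definition of OPHP(l, l + 1) is easy to instantiate for small l. The sketch below generates the clauses of U and of OPHP and checks satisfiability by brute force; the literal representation `(variable, negated)`, the naming of edge variables as pairs (i, j) and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

def U(variables):
    """Clauses stating that exactly one of the given variables is true."""
    clauses = [[(x, False) for x in variables]]
    clauses += [[(x, True), (y, True)] for x, y in combinations(variables, 2)]
    return clauses

def ophp(l):
    """OPHP(l, l+1): the union of U over the edges at each vertex of K_{l,l+1}.

    The variable (i, j) stands for the edge between vertex i on the side of
    size l and vertex j on the side of size l + 1.
    """
    clauses = []
    for i in range(l):
        clauses += U([(i, j) for j in range(l + 1)])
    for j in range(l + 1):
        clauses += U([(i, j) for i in range(l)])
    return clauses

def satisfiable(clauses):
    """Exhaustive search over all truth assignments (exponential time)."""
    variables = sorted({x for clause in clauses for x, _ in clause})
    for bits in product((False, True), repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[x] != neg for x, neg in clause) for clause in clauses):
            return True
    return False

php2 = ophp(2)                       # six edge variables
php2_sat = satisfiable(php2)
single_sat = satisfiable(U(["a", "b", "c"]))
```

A single copy of U is satisfiable on its own; the contradiction in OPHP comes only from combining the constraints of the two sides of the bipartite graph, whose sizes differ by one.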
Let us consider the following substitution of the variables in CNF(A(K m,m+1 , σ), B(G)) and its implicit encoding CNF(A(H n , σ n ), B(G, 3))| ρ : for every red edge e the variable X(e, 1) is substituted by e, the variable X(e, 0) is substituted by ¬e, and for every g ∈ G such that g ∉ {0, 1}, the variable X(e, g) is substituted by 0. On the auxiliary variables of the implicit encoding the substitution is defined by the identity. For simplicity, let us consider this substitution as an extension of the substitution ρ, and let us denote it by ρ ′ . Intuitively, the substituted formula CNF(A(K m,m+1 , σ), B(G))| ρ ′ encodes those assignments to the variables of A(K m,m+1 , σ) that map each variable either to the group element 0 or to the group element 1. Setting the truth value of the variable e to 1 corresponds to mapping e to 1, and setting it to 0 corresponds to mapping e to 0.
Observe that for every vertex v ∈ K m,m+1 , we have U(∂(v)) |= CNF (A(v, σ), B(G))| ρ ′ . Indeed, U(∂(v)) is satisfied if and only if exactly one of the variables in ∂(v) is assigned a truth value 1. This truth assignment corresponds to mapping exactly one of the red edges incident to v to the group element 1 and mapping the rest of the red edges incident to v to the identity element 0. It is not difficult to see that such an assignment satisfies the equation A(v, σ).
For a CNF F with variables X 1 , . . . , X l , by DNF(F ) we denote the l-DNF formula which, for every truth assignment satisfying F , has an l-term representing this assignment, i.e., the unique l-term which is satisfied by this assignment and no other.
We now have all the ingredients necessary to remove the auxiliary variables using the technique from [17]. We remark that the Frege system studied therein differs from the system considered in the present paper. The formulas are formed from variables using negation and disjunction only, and there is no introduction of conjunction rule. However, it follows from the theorem of [44] that those two Frege systems polynomially simulate each other up to a constant factor loss in depth. Therefore, since the lower bound we aim at is exponential and for all constant depths, we can apply the technique from [17].
Hence, by Theorem 5.5 of [17], if $\mathrm{CNF}(A(H_n, \sigma_n), B(G))|_{\rho'}$ has a Frege refutation of depth $d$ and size $s$, then there exists a Frege refutation of $\mathrm{OPHP}(m, m+1) = \bigcup_{v \in V(K_{m,m+1})} U(\partial(v))$ of depth $d + 10$ and size at most polynomial in $m^4 s$, that is, of size at most polynomial in $n^{4/3} s$. To complete the proof in the case of $G = \mathbb{Z}_q$ it suffices to refer to the following theorem, proved independently in [16] and [37] as an exponential improvement over [1].
Theorem 14 (The Jewel Theorem of Proof Complexity [16,37,1]). For every integer $d$ there exists a constant $\delta$ such that for every sufficiently large integer $m$ every Frege refutation of $\mathrm{OPHP}(m, m+1)$ of depth $d$ has size at least $2^{m^{\delta}}$.
It remains to show that thanks to the Fundamental Theorem of Finite Abelian Groups, the special case of G = Z q implies Theorem 12 in full generality.
Lemma 25. Let $G = \prod_{q \in Q} \mathbb{Z}_q$ be a finite Abelian group, and let $n$, $d$ and $s$ be positive integers. If for some $q \in Q$ there is an unsatisfiable instance $A$ of $3\mathrm{LIN}(\mathbb{Z}_q)$ with $n$ variables such that every Frege refutation of $\mathrm{CNF}(A, B(\mathbb{Z}_q, 3))$ of depth $d$ has size at least $s$, then there is an unsatisfiable instance $A'$ of $3\mathrm{LIN}(G)$ with $n$ variables such that every Frege refutation of $\mathrm{CNF}(A', B(G, 3))$ of depth $d$ has size at least $s$.
Proof. Let A be an unsatisfiable instance of 3LIN(Z q ) with n variables, and assume that every Frege refutation of CNF(A, B(Z q , 3)) of depth d has size at least s. The instance A is a system of linear equations over the group Z q . Since Z q is a subgroup of G, we can think of the same system of linear equations as a system of linear equations over the group G. Let A ′ be the corresponding instance of 3LIN(G). It is not difficult to see that it is unsatisfiable. Moreover, every Frege refutation of CNF(A ′ , B(G, 3)) of depth d has size at least s. Indeed, by applying a substitution ρ which for every a ∈ A ′ and every g ∈ G \ Z q substitutes the variable X(a, g) with 0 and on all other variables is defined by identity, we transform a Frege refutation of CNF(A ′ , B(G, 3)) of depth d to a Frege refutation of CNF(A, B(Z q , 3)) of depth d.

Lower bound for Polynomial Calculus
The original motivation in [22] for defining the Tseitin graph tautologies for Abelian groups beyond Z 2 was to compare the strength of Polynomial Calculus over different fields. Here we use their results with the different purpose of getting lower bounds for Polynomial Calculus over the real-field for all CSPs of unbounded width. Along the lines of the previous section for bounded-depth Frege, this will be a consequence of Theorem 6, Theorem 11, and the following lower bound (for which we use the EQ encoding scheme).
Theorem 15. For every non-trivial finite Abelian group G there exists a positive constant δ and a family of unsatisfiable instances (A n ) n≥1 of 3LIN(G), where A n has Θ(n) variables and Θ(n) equations, such that for every sufficiently large n every PC refutation over the reals of EQ (A n , B(G, 3)) has degree at least δn.
By the same argument as in the previous section, Theorem 15 will follow from the special case for Abelian groups of the form Z m proved in [22]. Let us note that the statement in [22] is made only for fields of prime characteristic and for prime m, but the same proof goes through for arbitrary fields whose characteristic does not divide m.
Strictly speaking, the form of the Tseitin system of equations that we defined in the previous section is slightly more general than the original one from [22]. In [22], the definition starts with an undirected graph H and, given a labelling σ : V (H) → Z m , the system of equationsÂ(H, σ) over Z m is defined as follows: • there is a pair of variables (u, v) and (v, u) for each edge {u, v} in E(H),  H, σ). This justifies the claim that the definition of the Tseitin system from the previous section is a generalization of the definition in [22]. Another sense in which the definition of the Tseitin system from the previous section is more general is that the original definition requires m in Z m to be a prime number; however, going through their proof it is readily seen that this is not essential. Finally, the original definition also requires the condition that the sum of the labels σ(u) is 1 (mod m), but again this is not essential in their proof as long as the sum is non-zero.
Let $B'(\mathbb{Z}_m, 3)$ be the template $B(\mathbb{Z}_m, 3)$ extended with the binary relation $\{(g, g') \in \mathbb{Z}_m^2 : g + g' = 0\}$, and let $\mathrm{EQ}'$ denote the modification of the encoding scheme EQ in which each twin variable $\overline{X}(a, b)$ is replaced by $1 - X(a, b)$. It turns out that the system of polynomial equations $\mathrm{EQ}'(\hat{A}(H_n, \sigma_n), B'(\mathbb{Z}_m, 3))$, for a fixed family of 3-regular expander graphs $(H_n)_{n \ge 1}$ and a labelling $\sigma_n : V(H_n) \to \mathbb{Z}_m$ of total sum 1 mod $m$, is literally the same as the system of polynomial equations that [22] calls $\mathrm{BTS}_{n,m}$. Note that $\mathrm{BTS}_{n,m}$ has $\Theta(n)$ variables. We have the following:

Theorem 16 (see Corollary 21 in [22]). For every integer $m \ge 2$ and every field $F$ of a characteristic that does not divide $m$ there exists a positive $\delta$ such that for every sufficiently large $n$ every PC refutation over $F$ of $\mathrm{BTS}_{n,m}$ has degree at least $\delta n$.
This gives us a family of instances over the template $B'(\mathbb{Z}_m, 3)$ that are hard for Polynomial Calculus over the real-field. Since the template $B'(\mathbb{Z}_m, 3)$ is pp-definable in $B(\mathbb{Z}_m, 3)$, Theorem 6 implies the existence of such a family for $3\mathrm{LIN}(\mathbb{Z}_m)$.
In order to complete the proof of Theorem 15 from Theorem 16 it suffices to invoke a version of Lemma 25 for Polynomial Calculus, whose statement and proof are virtually identical to those of Lemma 25, and are thus omitted.

Lower bound for Sums-of-Squares
In the case of Sums-of-Squares, similarly as for Polynomial Calculus, we do not need to adapt an existing lower bound proof from the literature for $\mathbb{Z}_2$ to all finite Abelian groups because this was already done. The lower bound that we need to complete the proof of Theorem 10 is the following:

Theorem 17 (see [23]). For every non-trivial finite Abelian group $G$ there exists a positive $\delta$ and a family of unsatisfiable instances $(A_n)_{n \ge 1}$ of $3\mathrm{LIN}(G)$, where $A_n$ has $\Theta(n)$ variables and $\Theta(n)$ equations, such that for every sufficiently large $n$ every SOS refutation of $\mathrm{EQ}(A_n, B(G, 3))$ has degree at least $\delta n$.
The exact statement that we are referring to is Theorem G.8 from Appendix G in [23]. In order to be able to state the theorem and compare it to the statement of Theorem 17 we need to introduce some definitions.
Let $G$ be a finite Abelian group and let $C$ be a subgroup of $G^k$, where $k \ge 3$. The problem Additive-CSP($C$), as defined in [23], is the constraint satisfaction problem that has constraint relations of the form $\{(c_1, \dots, c_k) : (c_1 - b_1, \dots, c_k - b_k) \in C\}$, for all $(b_1, \dots, b_k) \in G^k$. Note that if the set of variables is $V$, then the set of all possible constraints can be identified with the set $V^k \times G^k$. The instances are presented as distributions $\pi$ over $V^k \times G^k$. This amounts to assigning weights to the constraints. The value of an instance is the maximum, over all assignments of values to variables, of the probability that a random constraint chosen from $\pi$ is satisfied by the assignment. We say that $C \subseteq G^k$ is balanced pairwise independent if for every pair $i, j \in [k]$ with $i \ne j$, and every two elements $a, b \in G$, the number of $k$-tuples $(c_1, \dots, c_k)$ from $C$ such that $c_i = a$ and $c_j = b$ is $|C|/|G|^2$. For example, any $C$ of the form $\{(c_1, \dots, c_k) : c_1 + \cdots + c_k = 0\}$ is balanced pairwise independent, and it is a subgroup of $G^k$.

Chan's Theorem G.8 in [23] states that if $C$ is any balanced pairwise independent subgroup of $G^k$ and $\epsilon$ is an arbitrary positive constant, then for every sufficiently large $n$, there is an instance $M$ of Additive-CSP($C$) with $n$ variables, whose value is bounded by $|C|/|G|^k + \epsilon$, and that has a Lasserre solution of value 1 for $cn$ rounds, where $c = c_{G,k,\epsilon}$ is a constant that depends only on the group $G$, the arity $k$, and the tolerance parameter $\epsilon$. Moreover, it follows from the proof in [23] (see Theorem G.7) that the instance $M$ can be chosen to have $e_{G,k,\epsilon}\, n$ constraints, where $e_{G,k,\epsilon}$ is a constant that depends only on the group $G$, the arity $k$, and the parameter $\epsilon$. We discuss what a Lasserre solution is and how it relates to SOS proofs.
Before we do that we fix some of the parameters. We want to build an unsatisfiable instance $A$, and we do so by choosing the parameters to make the value of $M$ in Chan's Theorem strictly smaller than 1. Fix $k = 3$ and $C = \{(c_1, c_2, c_3) : c_1 + c_2 + c_3 = 0\}$, and take $\epsilon = 1/4$. Then the value of the instance $M$ is bounded by $|C|/|G|^3 + \epsilon = 1/|G| + 1/4 \le 1/2 + 1/4 < 1$. This means that the collection of constraints of $M$ that have non-zero probability in $\pi$ is unsatisfiable; i.e., not all constraints can be satisfied at the same time by a single assignment. Thus, our unsatisfiable instance $A$ will just be the set of all constraints with non-zero probability in $\pi$. Now we are ready to define what a Lasserre solution of value 1 is.
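The two facts used in this choice of parameters can be checked mechanically on small examples. The sketch below assumes, for illustration only, that $G$ is a cyclic group $\mathbb{Z}_m$ represented by the integers $0, \dots, m-1$; the helper names are hypothetical. It verifies that the zero-sum subgroup is balanced pairwise independent and evaluates the bound $|C|/|G|^3 + \epsilon$ with exact fractions.

```python
from fractions import Fraction
from itertools import product

def zero_sum_subgroup(m, k=3):
    """C = {(c_1, ..., c_k) in Z_m^k : c_1 + ... + c_k = 0 (mod m)}."""
    return [c for c in product(range(m), repeat=k) if sum(c) % m == 0]

def is_balanced_pairwise_independent(C, m, k=3):
    """Check that, for every pair of distinct coordinates i, j and every
    pair of values (a, b), exactly |C| / m^2 tuples have c_i = a and c_j = b."""
    if len(C) % (m * m) != 0:
        return False
    target = len(C) // (m * m)
    for i, j in product(range(k), repeat=2):
        if i == j:
            continue
        for a, b in product(range(m), repeat=2):
            count = sum(1 for c in C if c[i] == a and c[j] == b)
            if count != target:
                return False
    return True

def value_bound(m, k=3, eps=Fraction(1, 4)):
    """The bound |C| / |G|^k + eps on the value of the instance M."""
    C = zero_sum_subgroup(m, k)
    return Fraction(len(C), m ** k) + eps

if __name__ == "__main__":
    for m in (2, 3, 4):
        C = zero_sum_subgroup(m)
        print(m, is_balanced_pairwise_independent(C, m), value_bound(m))
```

For $m = 2$ the bound is $3/4$, the worst case in the chain of inequalities above; larger groups only make it smaller.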
According to Definition G.3 from Appendix G in [23], a Lasserre solution of value 1 for $t$ rounds is a collection $u = \{u_f : f \in G^S, S \subseteq V, |S| \le t\}$ of vectors in Euclidean space $\mathbb{R}^d$, of some finite dimension $d$, such that for every $S \subseteq V$ with $|S| \le 2t$ there exists a probability distribution $\mu_S$ over $G^S$ with the following properties: for every $R, S, T \subseteq V$ with $|S|, |T| \le t$ and $R = S \cup T$, and every $f \in G^S$ and $g \in G^T$, it holds that

$$\langle u_f, u_g \rangle = \Pr_{h \sim \mu_R}\bigl[\, h|_S = f \text{ and } h|_T = g \,\bigr], \tag{29}$$

and for every constraint with variables $S$ in the support of $\pi$ and every $f \in G^S$ that does not satisfy this constraint we have

$$\Pr_{h \sim \mu_S}\bigl[\, h = f \,\bigr] = 0. \tag{30}$$

At this point we have all the necessary material to argue that $A$, or more precisely, $\mathrm{EQ}(A, B(G, 3))$, does not have SOS refutations of degree $\delta n$, where $\delta = 2 c_{G,k,\epsilon}$. Let $\mathrm{EQ}'$ be the result of replacing each twin variable $\overline{X}(a, b)$ in $\mathrm{EQ}(A, B(G, 3))$ by $1 - X(a, b)$. By the remarks at the end of Section 2.3, it suffices to show that $\mathrm{EQ}'$ does not have SOS refutations of degree $\delta n$ for the definition of Sums-of-Squares without twin variables. Assume, for the sake of contradiction, that $\mathrm{EQ}'$ does have such an SOS refutation of degree at most $2t$, where $t := c_{G,k,\epsilon}\, n$ is the number of rounds for which there exists a Lasserre solution of value 1 for the instance $M$. The refutation has the form

$$\sum_{i=1}^{r} P_i S_i = -1, \tag{31}$$

where $P_1, \dots, P_r$ are polynomials that either come from $\mathrm{EQ}'$, or they are axiom polynomials from the lists (2) and (4) without twin variables, or they are squares, $S_1, \dots, S_r$ are arbitrary or square polynomials without twin variables as appropriate (i.e., arbitrary if the $P_i$ they multiply come from an equation, and squares if the $P_i$ they multiply come from an inequality), and the total degree of each product $P_i \cdot S_i$ is at most $2t$. Multiplications by $X$ and $1 - X$ can be simulated by multiplications by their squares, thanks to the axioms $X^2 - X = 0$ from (2), so we can assume that the refutation has the form

$$\sum_{i=1}^{m} P_i S_i + \sum_{j=1}^{\ell} Q_j^2 = -1, \tag{32}$$

where $P_1, \dots, P_m$ are polynomials that either come from $\mathrm{EQ}'$, or they are one of the axiom polynomials of the form $X^2 - X$ from (2), and $S_1, \dots, S_m, Q_1, \dots, Q_\ell$ are arbitrary polynomials.
Recall that the variables of EQ ′ have the form X(a, b) where (a, b) ∈ V × G. We say that the element a is mentioned in X(a, b), and that it is mentioned in any monomial that contains this variable. Now we define a linear functional E : P 2t → R, where P 2t denotes the vector space of polynomials of degree at most 2t on the X(a, b)-variables, as follows.
For each monomial $M$ of degree at most $2t$ on the $X(a, b)$-variables, with all mentioned elements in $S \subseteq V$, define

$$E(M) := \Pr_{h \sim \mu_S}\bigl[\, h(M) = 1 \,\bigr], \tag{33}$$

where the notation $h(M)$ stands for the evaluation of the monomial $M$ by the partial assignment given by $h$; i.e., all variables $X(a, h(a))$ with $a \in S$ are set to 1, all variables $X(a, b)$ with $a \in S$ and $b \ne h(a)$ are set to 0, and all other variables are left unset. Note that (29) ensures that (33) is a well-defined quantity that does not depend on the choice of $S$, as long as $S$ contains all the elements that are mentioned in $M$. Once $E$ is defined for all monomials of degree at most $2t$, we extend it to $P_{2t}$ by linearity. The final step in the argument is to show that $E$ evaluates the left-hand side in (32) to some non-negative quantity; this will imply that the identity in (32) does not hold, and finish the proof.

In order to prove this, the following matrix $A = (A_{M,N})_{M,N}$ will be instrumental. The indices are monomials $M$ of degree at most $t$ on the $X(a, b)$-variables. The entry $A_{M,N}$ of $A$ is defined to be $E(MN)$. For later use, observe that if $S$ denotes the set of elements that are mentioned in $M$ and there exists $f \in G^S$ such that $f(M) = 1$, then this partial assignment $f$ with domain $S$ is uniquely determined by $M$. We let $f_M \in G^S \cup \{\bot\}$ denote this unique partial assignment $f$ that makes $f(M) = 1$, when it exists, or the default value $\bot$ when it does not exist.

We argue that Equation (29) ensures that $A$ is a positive semi-definite matrix. First, extend the collection of vectors $u$ to a new collection of vectors $u^* = \{u^*_f : f \in G^S \cup \{\bot\}, S \subseteq V, |S| \le t\}$ by defining $u^*_f = u_f$ for $f \in G^S$, and $u^*_f = 0$ for $f = \bot$. Fix indices $M$ and $N$, and let $S$ and $T$ be the sets of elements mentioned in $M$ and $N$, respectively. Let $R = S \cup T$. Then $E(MN)$, according to its definition (33), is the probability of the event that $h \sim \mu_R$ makes $h(MN) = 1$, or equivalently, that $h \sim \mu_R$ makes $h|_S(M) = 1$ and $h|_T(N) = 1$, or equivalently, that $h \sim \mu_R$ makes $h|_S = f_M$ and $h|_T = g_N$. Thus, equation (29) and the definition of the extended collection of vectors $u^*$ ensure that $A_{M,N} = \langle u^*_{f_M}, u^*_{g_N} \rangle$, and hence $A$ is a Gram matrix. Thus $A$ is positive semi-definite.

Now we use the positive semi-definiteness of $A$ to show that $E$ evaluates squares to non-negative quantities: for a square $Q^2$ with $Q = \sum_M q_M M$ of degree at most $t$, we have

$$E(Q^2) = \sum_{M,N} q_M q_N E(MN) = \sum_{M,N} q_M q_N A_{M,N},$$

which is non-negative because $A$ is a positive semi-definite matrix. For terms in (32) that are liftings of equations from $\mathrm{EQ}'$, the evaluation through $E$ is 0. This is clear for equations of type 2, since every monomial which contains a pair of variables $X(a, b)$ and $X(a, b')$ for $b \ne b'$ evaluates to 0 by (33). For the same reason, if we take any equation of type 1 in $\mathrm{EQ}'$, i.e., $P = \prod_{b \in G} (1 - X(a, b)) = 0$ for some $a \in V$, and an arbitrary monomial $M$ on the $X(a, b)$-variables such that $P \cdot M$ has total degree at most $2t$, it holds that

$$E(P \cdot M) = E(M) - \sum_{b \in G} E(X(a, b) \cdot M).$$

By (33) again, we have $E(M) = \sum_{b \in G} E(X(a, b) \cdot M)$, and the right-hand side vanishes too. Finally, liftings of equations of type 3 from $\mathrm{EQ}'$ evaluate to 0 thanks to equation (30).

Lemma 26 ([30]). For every integer $c$ and for every linear form $L(X_1, \dots, X_n) = \sum_{i=1}^{n} a_i X_i$ with integer coefficients $a_1, \dots, a_n$, there is an LS proof of the inequality $D_c(L) \ge 0$ (from nothing) of degree at most 3 and size polynomial in $\max\{|a_i| : i = 1, \dots, n\}$, $|c|$ and $n$.
In the following, for $I \subseteq [n]$ and $T \subseteq I$, let $M^I_T(X_1, \dots, X_n) := \prod_{i \in T} X_i \prod_{i \in I \setminus T} \overline{X}_i$. As usual, $M^{\emptyset}_{\emptyset}(X_1, \dots, X_n) = 1$. Such polynomials are called extended monomials.

Proof. For simplicity, let $q = |I|$ and assume $I = \{1, \dots, q\}$. We build the proof inductively on $q$. For $q = 0$, what we need is trivial since the left-hand side is 0. Assume now $q \in \{1, \dots, n\}$ and that we have $\sum_{T \subseteq [q-1]} M^{[q-1]}_T - 1 = 0$. Multiply this once by $X_q$ and once by $\overline{X}_q$. Adding the results we get $\sum_{T \subseteq [q]} M^{[q]}_T - X_q - \overline{X}_q = 0$, from which $\sum_{T \subseteq [q]} M^{[q]}_T - 1 = 0$ follows by adding the axiom $X_q + \overline{X}_q - 1 = 0$ to it. The size is exponential in $|I|$ because the inductive step is used twice.
The next lemma is as technical as it is useful.

Proof. Write $M$ for $M^I_T$. For every $i \in I \setminus T$, using $X_i \overline{X}_i = 0$ we get $X_i M = 0$. For every $i \in T$, using $X_i^2 - X_i = 0$ we get $X_i M = M$. Adding up we get $\sum_{i \in I} X_i M = |T| M$.
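As a sanity check on the two identities derived in these proofs, the following Python sketch evaluates extended monomials at all 0/1 points, reading each twin variable $\overline{X}_i$ as $1 - X_i$. It confirms only that the identities hold as statements about Boolean points; it does not reproduce the LS derivations or their size and degree bounds. The function names are hypothetical.

```python
from itertools import combinations, product

def extended_monomial(I, T, x):
    """Value of M^I_T at the 0/1 point x, with the twin of X_i read as 1 - X_i."""
    value = 1
    for i in I:
        value *= x[i] if i in T else 1 - x[i]
    return value

def subsets(I):
    """All subsets T of the index set I, as frozensets."""
    items = list(I)
    for r in range(len(items) + 1):
        for T in combinations(items, r):
            yield frozenset(T)

def check_identities(n, I):
    """Check, at every point of {0,1}^n, that
    (1) the extended monomials M^I_T, T a subset of I, sum to 1, and
    (2) sum_{i in I} X_i * M^I_T equals |T| * M^I_T for every T."""
    for x in product([0, 1], repeat=n):
        if sum(extended_monomial(I, T, x) for T in subsets(I)) != 1:
            return False
        for T in subsets(I):
            M = extended_monomial(I, T, x)
            if sum(x[i] * M for i in I) != len(T) * M:
                return False
    return True

if __name__ == "__main__":
    print(check_identities(4, [0, 2, 3]))
```

Both checks rest on the same observation: at a 0/1 point, exactly one extended monomial over $I$ is non-zero, namely the one whose $T$ is the set of indices in $I$ where the point is 1.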
Simulating Gaussian elimination. We use these lemmas to prove the main result of this section.
Theorem 18. Let E be an instance of 3LIN(Z 2 ) with n variables and m equations. If E is unsatisfiable, then S(E) has an LS refutation of degree 6 and size polynomial in n and m.
Proof. Write $E$ in matrix form $Ax = b$, where $x$ is a column vector of $n$ variables, $A$ is a matrix in $\mathbb{Z}_2^{m \times n}$, and $b$ is a vector in $\mathbb{Z}_2^m$. Let $a_{j,1}, \dots, a_{j,n}$ be the $j$-th row of $A$, so the $j$-th equation of $E$ is $E_j : a_{j,1} X_1 + \cdots + a_{j,n} X_n = b_j$. Assume $E$ is unsatisfiable over $\mathbb{Z}_2$. Then $b$ cannot be expressed as a $\mathbb{Z}_2$-linear combination of the columns of $A$, so the $\mathbb{Z}_2$-rank of the matrix $[\, A \mid b \,]$ exceeds the $\mathbb{Z}_2$-rank of $A$. Since the rank of $[\, A \mid b \,]$ is at most $n + 1$, this means that there exists a subset $J$ of at most $n + 1$ rows such that, with arithmetic in $\mathbb{Z}_2$, we have $\sum_{j \in J} a_{j,i} = 0$ for every $i \in [n]$, and at the same time $\sum_{j \in J} b_j = 1$. In order to simplify the notation, we assume without loss of generality that $J = \{1, \dots, |J|\}$.
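The set $J$ of rows can be found by ordinary Gaussian elimination over $\mathbb{Z}_2$ while recording which original rows have been added together. The sketch below is one way to do this; it is an illustration under the assumption that $A$ is given as a list of 0/1 rows, and the function names are hypothetical. It computes only the combinatorial certificate, not the LS refutation built from it.

```python
def unsat_certificate(A, b):
    """Return a set J of row indices such that the rows of A indexed by J
    sum to the zero vector over Z_2 while the entries of b indexed by J
    sum to 1, or None if Ax = b is satisfiable over Z_2."""
    pivots = []  # (leading column, coefficients, right-hand side, rows used)
    for j, row in enumerate(A):
        coeffs = [a % 2 for a in row]
        rhs = b[j] % 2
        used = {j}
        # Eliminate the leading column of every pivot found so far.
        for col, p_coeffs, p_rhs, p_used in pivots:
            if coeffs[col]:
                coeffs = [x ^ y for x, y in zip(coeffs, p_coeffs)]
                rhs ^= p_rhs
                used = used ^ p_used  # symmetric difference: rows cancel in pairs
        lead = next((i for i, x in enumerate(coeffs) if x), None)
        if lead is None:
            if rhs == 1:
                return used  # the combination reads 0 = 1
        else:
            pivots.append((lead, coeffs, rhs, used))
    return None

def is_certificate(A, b, J):
    """Check the two defining conditions of J directly."""
    n = len(A[0])
    columns_cancel = all(sum(A[j][i] for j in J) % 2 == 0 for i in range(n))
    return columns_cancel and sum(b[j] for j in J) % 2 == 1

if __name__ == "__main__":
    A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
    b = [1, 1, 1]
    J = unsat_certificate(A, b)
    print(J, is_certificate(A, b, J))
```

In the example, the three equations $x_1 + x_2 = 1$, $x_2 + x_3 = 1$, $x_1 + x_3 = 1$ sum to $0 = 1$ over $\mathbb{Z}_2$, so all three rows belong to the certificate.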
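The base case $k = |J|$ is a special case of Lemma 26. To see why, note that the condition $\sum_{j \in J} a_{j,i} = 0$ over $\mathbb{Z}_2$ means that, if arithmetic were done in $\mathbb{Q}$, then $\sum_{j \in J} a_{j,i}$ is an even natural number. But then all the coefficients of $L_{|J|}(X_1, \dots, X_n) = \frac{1}{2} \sum_{j=1}^{|J|} \sum_{i=1}^{n} a_{j,i} X_i = \sum_{i=1}^{n} \bigl( \frac{1}{2} \sum_{j=1}^{|J|} a_{j,i} \bigr) X_i$ are integers.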

Applications to k-coloring
We illustrate the power of the general method of reductions for CSPs by applying it to the graph k-coloring problem for k ≥ 3. This will allow us to rederive one of the results in [40], as well as answer one of their open problems.

Blackbox application to k-coloring
An undirected graph $G$ is $k$-colorable if and only if it has a homomorphism into the $k$-clique $K_k$. Thus the $k$-coloring problem is a special case of the CSP of the template $K_k$, which we abbreviate by $k$-COLOR. We say that it is a special case and not exactly the same problem because the inputs to $k$-COLOR need not be undirected graphs; in full generality they are directed graphs that allow loops. Note, however, that a directed graph that has loops would never have a homomorphism into the template, and that a loopless directed graph has a homomorphism into the template if and only if the underlying undirected graph that ignores the directions of the edges is $k$-colorable. Thus, for all practical purposes, the two problems are the same, and proof complexity lower bounds for one version of the problem will give proof complexity lower bounds for the other. We discuss this in due time; for now we focus on the proof complexity of the CSP with template $K_k$. It is well-known that $K_k$ is a template of unbounded width for each $k \ge 3$ (see e.g. [26]). As a consequence of our main result we get the following:

Corollary 5. For every integer $k \ge 3$, there exist families $(G_n)_{n \ge 1}$ of unsatisfiable instances of $k$-COLOR, where $G_n$ has $\Theta(n)$ vertices and $\Theta(n)$ edges, such that for every positive integer $d$ there exists a positive $\epsilon$ such that, for any local encoding scheme $E$ of the appropriate type, and any sufficiently large $n$, the following hold:

3. $X(u, b) X(v, b) = 0$ for each $u, v \in V$ with $(u, v) \in E$ and $b \in [k]$.
It is easy to see that this is a local encoding scheme for $k$-COLOR in the sense of Section 2.6. Thus, Corollary 5 applies to it and we get a family of instances $(G_n)_{n \ge 1}$ that are hard for Polynomial Calculus in the indicated encoding scheme. Note that, since the instances are hard, they must be loopless graphs. Indeed, if $(u, u)$ is a loop in $G_n$, then $X(u, b) X(u, b) = 0$ is an equation in the encoding of the instance $G_n$ for all $b \in [k]$. These equations, together with the axioms $X(u, b)^2 - X(u, b) = 0$ and the equation $\sum_{b \in [k]} X(u, b) - 1 = 0$, would give a PC derivation of $1 = 0$ in degree 2 and constant size. Thus the instances in the family are loopless graphs. We may also assume that they are undirected graphs for the simple reason that the equations $X(u, b) X(v, b) = 0$ and $X(v, b) X(u, b) = 0$ are identical (recall that all our variables commute by assumption). It follows that Corollary 5 has the real-field case of Theorem 1.1 from [40] as a special case, except for the fact that, unlike Theorem 1.1 from [40], Corollary 5 does not state that the family of graphs is explicit. In the next section we show that we can also get an explicit family of graphs with the same properties.
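The equations mentioned in this paragraph can be tested on small graphs. The sketch below assumes the encoding consists of the Boolean axioms $X(u,b)^2 - X(u,b) = 0$, the equations $\sum_{b} X(u,b) - 1 = 0$, and the edge equations $X(u,b)X(v,b) = 0$; this is an assumption based on the equations named here, and the function names are hypothetical. Restricting to 0/1 points takes care of the Boolean axioms, and the remaining equations are checked directly.

```python
from itertools import product

def satisfies_encoding(vertices, edges, k, x):
    """x maps (vertex, colour) to 0 or 1. Check the 'exactly one colour'
    equations and the edge equations at this 0/1 point."""
    for u in vertices:
        if sum(x[u, b] for b in range(k)) != 1:
            return False
    for u, v in edges:
        for b in range(k):
            if x[u, b] * x[v, b] != 0:
                return False
    return True

def count_solutions(vertices, edges, k):
    """Number of 0/1 solutions of the encoding, by brute force."""
    keys = [(u, b) for u in vertices for b in range(k)]
    count = 0
    for bits in product([0, 1], repeat=len(keys)):
        if satisfies_encoding(vertices, edges, k, dict(zip(keys, bits))):
            count += 1
    return count

if __name__ == "__main__":
    triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
    k4 = ([0, 1, 2, 3], [(u, v) for u in range(4) for v in range(u + 1, 4)])
    loop = ([0], [(0, 0)])
    for vertices, edges in (triangle, k4, loop):
        print(len(vertices), len(edges), count_solutions(vertices, edges, 3))
```

The 0/1 solutions are in bijection with proper colourings, so the triangle has six solutions for $k = 3$, while the 4-clique and the single loop have none.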

Opening the box
In the rest of this section we open the box of the method that underlies Corollary 5. This will allow us to re-derive Theorem 1.1 from [40] for all fields, and not just for the real-field as is stated in Corollary 5. Moreover, it will suggest a way to apply the method to any other problem that is NP-complete via gadget reductions.
Since $K_k$ for $k \ge 3$ is a template of unbounded width, Theorem 11 applies to it. It is not difficult to see that $K_k$ is a core, hence by Theorem 11 there exists a non-trivial finite Abelian group $G$ such that $B(G, 3)$ is pp-interpretable in $K_k^+$, where $K_k^+$ is the expansion of $K_k$ with all constants; i.e., the expansion with the relations $R_1 = \{1\}, \dots, R_k = \{k\}$. Indeed, this is the case even for the group $G = \mathbb{Z}_2$. Concrete such pp-interpretations are well-known and also easy to construct. For the sake of completeness and by way of example, we propose one such pp-interpretation in two steps. First we pp-interpret $B(\mathbb{Z}_2, 3)$ in the template of 3-SAT, and then we pp-interpret the template of 3-SAT in $K_k^+$. Since pp-interpretations compose, we get what we want.
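The first of the two steps rests on a standard fact: a ternary parity constraint over $\mathbb{Z}_2$ is a conjunction of four 3-clauses, one for each assignment of the wrong parity. The sketch below checks this by brute force. It assumes, for illustration, that the relevant relations of $B(\mathbb{Z}_2, 3)$ have the form $x + y + z = b$, and it represents a literal as a hypothetical pair (variable index, required value); it illustrates only this quantifier-free definition, not the second step into $K_k^+$.

```python
from itertools import product

def parity_clauses(b):
    """3-clauses over (x, y, z) whose conjunction expresses x + y + z = b (mod 2).
    Each clause forbids exactly one assignment of the wrong parity.
    A literal (i, val) is satisfied when variable i takes the value val."""
    clauses = []
    for bad in product([0, 1], repeat=3):
        if sum(bad) % 2 != b:
            # Satisfied by every assignment that differs from `bad` somewhere.
            clauses.append(tuple((i, 1 - bad[i]) for i in range(3)))
    return clauses

def satisfies(clauses, assignment):
    """True if the assignment satisfies every clause."""
    return all(any(assignment[i] == val for i, val in clause) for clause in clauses)

if __name__ == "__main__":
    for b in (0, 1):
        clauses = parity_clauses(b)
        agree = all(
            satisfies(clauses, a) == (sum(a) % 2 == b)
            for a in product([0, 1], repeat=3)
        )
        print(b, len(clauses), agree)
```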
Our next goal is to extend the PC lower bound for $k$-COLOR to all fields. Before we do so, let us note that exactly the same strategy as in the previous paragraph is not enough. The reason is that $3\mathrm{LIN}(\mathbb{Z}_2)$ is easy for Polynomial Calculus over fields of characteristic two. Surely we could start with an instance of $3\mathrm{LIN}(\mathbb{Z}_3)$, which is going to be hard for fields of characteristic two, but the result is again not going to be hard for all fields simultaneously as it will fail to be hard for fields of characteristic three. The solution is to start with a problem that has instances that are hard for Polynomial Calculus for all fields simultaneously. Luckily, 3-SAT is such a case:

Theorem 19 (see Theorem 3.13 in [3]). There exists a positive real $\delta$ and an explicit family $(G_n)_{n \ge 1}$ of unsatisfiable instances of 3-SAT, where $G_n$ has $\Theta(n)$ variables and $\Theta(n)$ clauses, such that, for every field $F$ and every sufficiently large $n$, every PC refutation over $F$ of $G_n$ with respect to the EQ encoding scheme has degree at least $\delta n$.
Let us note that in order to get Theorem 19 from the exact statement of Theorem 3.13 in [3] one needs explicit families of 3-regular unique-neighbor expanders. Such families were described in [4].
With the lower bound of Theorem 19 in place we can get the version of the PC lower bound of Corollary 5 for all fields: the corresponding explicit instances of k-COLOR are obtained by applying the conjunction of Theorem 19 and Theorem 6 on the already noted fact that the template of 3-SAT pp-interprets in K + k . This gives a new proof of Theorem 1.1 from [40].
Let us point out the main differences and similarities between the original proof from [40] and our new proof. At a high level, those proofs are very similar: both are gadget reductions that convert hard CNF formulas into hard instances of k-COLOR. In our proof, the gadgets are based on the way the template of 3-SAT is constructed from the template of k-COLOR by the addition of constants followed by the pp-interpretation (as presented in Section 4). Hence, the starting hard formulas can be any family of 3-CNF formulas that are hard for Polynomial Calculus. The proof from [40] is also a gadget reduction, but in their case the reduction is specifically tailored to a concrete family of CNF formulas that encode a sparse

Concluding remarks
Theorems 5, 6 and 7 imply that for the proof systems under consideration the class of constraint languages admitting efficient refutations can be characterised algebraically. For most of those proof systems such a characterisation follows from the fact that efficient proofs of unsatisfiability exist exactly for languages of bounded width. However, by Theorem 18 the class of constraint languages admitting efficient refutations in Lovász-Schrijver, and consequently also the class of constraint languages admitting efficient Frege refutations, exceeds bounded width. At the same time both of those classes are shown to admit algebraic characterisations. Providing such characterisations is a natural open problem that arises from our work. In particular, with the Algebraic CSP Dichotomy Conjecture recently confirmed [20,50], it would be interesting to verify or refute the tempting conjecture that the class of languages admitting polynomial size Frege (or Extended Frege) refutations coincides with the class of all polynomial time solvable constraint languages.
Other proof systems which are shown to be closed under reducibilities and can surpass bounded width are Polynomial Calculus proof systems over fields of prime characteristic. Finding algebraic characterisations for the classes of constraint languages admitting efficient unsatisfiability proofs in each of those proof systems is another question suggested by our work. Importantly, Polynomial Calculus over a field of non-zero characteristic p has efficient refutations for systems of linear equations over Z p and does not have efficient refutations for systems of linear equations over Z m if p does not divide m (cf. Theorem 16). Hence, for two fields of distinct prime characteristics, such characterisations will necessarily be different.
Both questions raised so far could lead to the discovery of some interesting new families of algebras, as has happened before in the development of the algebraic approach to CSPs (cf. the class of algebras with few subpowers [32]).
A related direction suggested by our work is whether the proof complexity of approximating MAX CSPs is also preserved by reductions. On the one hand, it is known that most of the classical CSP constructions preserve almost satisfiability; e.g., if B ′ is pp-definable without equality in B, and A is an instance of MAX CSP(B ′ ) that is almost satisfiable, then its standard transformation into an instance A ′ of MAX CSP(B) is also almost satisfiable. The question we raise is the following: For which proof systems is it also the case that if there are efficient proofs of the fact that A ′ is far from satisfiable then there are also efficient proofs of the fact that A is far from satisfiable? Depending on how the terms "almost satisfiable" and "far from satisfiable" are quantified, a positive answer to such questions could lead to an algebraic approach to the proof complexity of approximating MAX CSPs and the UGC.