Structure and Complexity of Bag Consistency

Since the early days of relational databases, it was realized that acyclic hypergraphs give rise to database schemas with desirable structural and algorithmic properties. In a by-now classical paper, Beeri, Fagin, Maier, and Yannakakis established several different equivalent characterizations of acyclicity; in particular, they showed that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency property for relations over that schema holds, which means that every collection of pairwise consistent relations over the schema is globally consistent. Even though real-life databases consist of bags (multisets), there has not been a study of the interplay between local consistency and global consistency for bags. We embark on such a study here and we first show that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency property for bags over that schema holds. After this, we explore algorithmic aspects of global consistency for bags by analyzing the computational complexity of the global consistency problem for bags: given a collection of bags, are these bags globally consistent? We show that this problem is in NP, even when the schema is part of the input. We then establish the following dichotomy theorem for fixed schemas: if the schema is acyclic, then the global consistency problem for bags is solvable in polynomial time, while if the schema is cyclic, then the global consistency problem for bags is NP-complete. The latter result contrasts sharply with the state of affairs for relations, where, for each fixed schema, the global consistency problem for relations is solvable in polynomial time.


INTRODUCTION
This paper brings together two different strands of research in database theory: the study of global consistency and the study of bag semantics. Before presenting an overview of our main results, we provide some background to each of these two strands.
The study of global consistency in relational databases arose from the universal relation model, which is the assumption that all relations at hand are projections of a single relation, called the universal relation. Much of the work on database dependencies and normalization during the 1970s made this assumption first implicitly and then explicitly, as for instance in the paper by Beeri, Bernstein, and Goodman [8]. The universal relation model implies that occurrences of the same attribute in different relations have the same meaning; it also provides a framework to study dependencies across different relations. Furthermore, it has been argued that the universal relation model yields logical independence and access-path independence [22], thus it can be regarded as an early model of data integration. At times, the universal relation model was surrounded by controversy with arguments both against it [18] and in favor of it [24]. The controversy notwithstanding and instead of assuming the presence of a universal relation, researchers also investigated when a universal relation exists.
On the algorithmic side, the universal relation problem (also known as the global consistency problem) is the following decision problem: given relations R1, . . . , Rm, is there a relation R such that, for every i ≤ m, the projection of R on the attributes of Ri is equal to Ri? If the answer is positive, then the relations R1, . . . , Rm are said to be globally consistent and R is said to be a universal relation for them or a witness to their global consistency. Honeyman, Ladner, and Yannakakis [15] showed that the universal relation problem is NP-complete, even for relations of arity 2.
On the structural side, the problem is to characterize when a collection of relations is globally consistent. It is easy to see that if the relations R1, . . . , Rm are globally consistent, then they are pairwise consistent (i.e., every two of them are globally consistent). As pointed out in [15], however, the converse does not hold in general; in other words, pairwise consistency is a necessary but not sufficient condition for global consistency. This state of affairs raised the question: can we identify the settings in which pairwise consistency is both a necessary and sufficient condition for global consistency? Let R1, . . . , Rm be a collection of relations over a schema with X1, . . . , Xm as the sets of attributes. The sets X1, . . . , Xm can be viewed as the hyperedges of a hypergraph. Beeri et al. [9] showed that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency property for relations over that schema holds, which means that every collection of pairwise consistent relations over the schema is globally consistent. Thus, for acyclic schemas, pairwise consistency is necessary and sufficient for global consistency. Consequently, the universal relation problem is solvable in polynomial time, if the sets of attributes of the schema form an acyclic hypergraph.
Much of the research in database theory assumes that relations are sets. In 1993, Chaudhuri and Vardi [12] pointed out that there is a gap between database theory and database practice because "real" databases use bags (multisets). They called for a re-examination of the foundations of databases where the fundamental concepts and algorithmic problems are investigated under bag semantics, instead of set semantics. In particular, Chaudhuri and Vardi [12] raised the question of the decidability of the conjunctive query containment problem under bag semantics (the same problem under set semantics is known to be NP-complete [11]). In spite of various efforts in the past and some recent progress [19,20], this question remains unanswered at present.
It is perhaps surprising that a study of consistency notions under bag semantics has not been carried out so far. Our main goal here is to embark on such a study and to explore both structural and algorithmic aspects of pairwise consistency and of global consistency under bag semantics. In this study, the consistency notions for bags are, of course, defined using bag semantics in the computation of projections.

Summary of Results
In general, properties of relations need not carry over automatically to similar properties of bags. This phenomenon manifests itself in the context of consistency properties. Indeed, it is well known that if a collection of relations is globally consistent, then their relational join is a witness to their global consistency (see, e.g., [15]); in other words, their relational join is a universal relation for them and, in fact, it is the largest universal relation. In contrast, we point out that this property fails for bags, i.e., there is a collection of bags that is globally consistent but the bag-join of the bags in the collection is not a witness to their global consistency; furthermore, there may be no biggest witness to the consistency of these bags.
Our first result asserts that two bags are consistent if and only if they have the same projection on their common attributes. While the analogous fact for relations is rather trivial, here we need to bring in tools from the theory of linear programming and maximum flow problems. As a corollary, we obtain a polynomial-time algorithm for checking whether two given bags are consistent and returning a witness to their consistency, if they are consistent. After this, we establish our main result concerning the structure of bag consistency. Specifically, we show that the sets of attributes of a schema form an acyclic hypergraph if and only if the local-to-global consistency for bags over that schema holds. Thus, the main finding by Beeri et al. [9] about acyclicity and consistency extends to bags. The architecture of the proof, however, is different from that in [9]. In particular, if a schema is cyclic, we give an explicit construction of a collection of bags that are pairwise consistent, but not globally consistent; the inspiration for our construction comes from an earlier construction of hard-to-prove tautologies in propositional logic by Tseitin [23].
We then explore algorithmic aspects of global consistency for bags by analyzing the computational complexity of the global consistency problem for bags: given a collection of bags, are these bags globally consistent? Using a sparse-model property of integer programming that is reminiscent of Carathéodory's Theorem for conic hulls [13], we first show that this problem is in NP, even when the schema is part of the input. After this, we establish the following dichotomy theorem for fixed schemas: if the schema is acyclic, then the global consistency problem for bags is solvable in polynomial time, while if the schema is cyclic, then the global consistency problem for bags is NP-complete. The latter result contrasts sharply with the state of affairs for relations, where, for each fixed schema, the global consistency problem for relations is solvable in polynomial time. Our NP-hardness results build on an earlier NP-hardness result about three-dimensional statistical data tables by Irving and Jerrum [17], which was later refined by De Loera and Onn [21]. Translated into our context, this result asserts the NP-hardness of the global consistency problem for bags over the triangle hypergraph, i.e., the hypergraph with hyperedges of the form {A1, A2}, {A2, A3}, {A3, A1}.
We conclude the paper with a brief overview of extensions of the results reported here to relations over semirings.

Related Work
The interplay between local consistency and global consistency arises naturally in several different settings. Already in 1962, Vorob'ev [25] studied this interplay in the setting of probability distributions and characterized the local-to-global consistency property for probability distributions in terms of a structural property of hypergraphs that turned out to be equivalent to hypergraph acyclicity. It appears that Beeri et al. [9] were unaware of Vorob'ev's work, but later on Vorob'ev's work was cited in a survey of database theory by Yannakakis [27]. In recent years, the interplay between local consistency and global consistency has been explored in great depth in the setting of quantum information by Abramsky and his collaborators (see, e.g., [3,4,5]). In that setting, the interest is in contextuality phenomena, which are situations where collections of measurements are locally consistent but globally inconsistent; Bell's celebrated theorem [10] is an instance of this. The similarities between these different settings (probability distributions, relational databases, and quantum mechanics) were pointed out explicitly by Abramsky [1,2]. This also raised the question of developing a unifying framework in which, among other things, the results by Vorob'ev and the results by Beeri et al. are special cases of a single result. Using a relaxed notion of consistency, we established such a result for relations over semirings [6]. For the bag semiring, however, the relaxed notion of consistency that we studied in [6] is essentially equivalent to the consistency of probability distributions with rational values (and not to the consistency of bags). This left open the question of exploring the interplay between (the standard notions of) local consistency and global consistency for bags, which is what we set out to do in the present paper.

RELATIONAL CONSISTENCY
Basic Notions
An attribute A is a symbol with an associated set Dom(A) called its domain. If X is a finite set of attributes, then we write Tup(X) for the set of X-tuples; this means that Tup(X) is the set of functions that take each attribute A ∈ X to an element of its domain Dom(A). Note that Tup(∅) is non-empty as it contains the empty tuple, i.e., the unique function with empty domain. If Y ⊆ X is a subset of attributes and t is an X-tuple, then the projection of t on Y, denoted by t[Y], is the unique Y-tuple that agrees with t on Y. In particular, t[∅] is the empty tuple.
Let X be a set of attributes. A relation over X is a function R : Tup(X) → {0, 1}. We write R(X) to emphasize the fact that R is a relation over schema X. The support Supp(R) of R is the set of X-tuples t with a non-zero value, i.e., Supp(R) := {t ∈ Tup(X) : R(t) ≠ 0}. Whenever no confusion arises, we write R′ to denote Supp(R). We say that R is finite if its support R′ is a finite set. In what follows, we will make the blanket assumption that all relations considered are finite, so we will omit the term "finite". Every relation R can be identified with its support R′, thus every relation R can be viewed as a finite set of X-tuples.
Let R be a relation over X and assume that Z ⊆ X. The projection of R on Z, denoted by R[Z], is the relation over Z consisting of the projections t[Z] of the tuples t in R. If X and Y are sets of attributes, then we write XY as shorthand for the union X ∪ Y. Accordingly, if x is an X-tuple and y is a Y-tuple such that x[X ∩ Y] = y[X ∩ Y], then we write xy to denote the XY-tuple that agrees with x on X and with y on Y. We say that x joins with y, and that y joins with x, to produce the tuple xy.
The join R ⋈ S of two relations R(X) and S(Y) is the relation over XY consisting of all XY-tuples t such that t[X] is in R and t[Y] is in S, i.e., all tuples of the form xy such that x ∈ R, y ∈ S, and x joins with y.
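The notions of projection and join can be made concrete with a minimal Python sketch. The encoding of an X-tuple as a frozenset of (attribute, value) pairs and the helper names tup, project_tuple, project, and join are assumptions made for this illustration only.

```python
def tup(**values):
    """Build an X-tuple as a hashable set of (attribute, value) pairs."""
    return frozenset(values.items())

def project_tuple(t, attrs):
    """t[Y]: the restriction of the tuple t to the attributes in attrs."""
    return frozenset((a, v) for a, v in t if a in attrs)

def project(rel, attrs):
    """R[Z]: the set of all projections t[Z] with t in R."""
    return {project_tuple(t, attrs) for t in rel}

def join(r, x, s, y):
    """R join S: all tuples xy with x in R, y in S that agree on X and Y's
    common attributes."""
    common = set(x) & set(y)
    return {tx | ty for tx in r for ty in s
            if project_tuple(tx, common) == project_tuple(ty, common)}

R = {tup(A=1, B=2), tup(A=2, B=2)}   # a relation over {A, B}
S = {tup(B=2, C=5), tup(B=3, C=6)}   # a relation over {B, C}
J = join(R, {"A", "B"}, S, {"B", "C"})

print(project(J, {"A", "B"}) == R)   # every tuple of R joins with some tuple of S
print(project(J, {"B", "C"}) == S)   # the tuple (B=3, C=6) of S is dangling
```

In this example the join loses the tuple of S that joins with nothing in R, so the projection of the join on {B, C} is a proper subset of S.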

Consistency of Two Relations
Assume that R(X) and S(Y) are two relations over the schemas X and Y. We say that R(X) and S(Y) are consistent if there is a relation T over XY such that T[X] = R and T[Y] = S. We also say that T witnesses the consistency of R and S. The next proposition, whose proof is straightforward, gives a useful criterion for the consistency of R and S.
Proposition 1. Let R(X) and S(Y) be two relations. The following statements are equivalent:
(a) R and S are consistent.
(b) R[X ∩ Y] = S[X ∩ Y].
(c) R ⋈ S witnesses the consistency of R and S.

Global Consistency of Relations
Let R1(X1), . . . , Rm(Xm) be relations over the schemas X1, . . . , Xm. We say that the collection R1, . . . , Rm is globally consistent if there is a relation T over X1 ∪ · · · ∪ Xm such that Ri = T[Xi] for all i ∈ [m] = {1, . . . , m}. We say that T witnesses the global consistency of R1, . . . , Rm, and we call it a universal relation for R1, . . . , Rm. The next result presents well known and easy to prove facts about global consistency (see, e.g., [15]).
Proposition 2. Let R1(X1), . . . , Rm(Xm) be relations. The following statements are equivalent:
(a) The collection R1, . . . , Rm is globally consistent.
(b) The join R1 ⋈ · · · ⋈ Rm witnesses the global consistency of R1, . . . , Rm.
In relational database theory, there has been an extensive study of both the structural and the algorithmic aspects of global consistency. We begin by surveying some of the results concerning the structural aspects of global consistency. The main problem is to characterize when a collection of relations is globally consistent.
We say that the relations R1(X1), . . . , Rm(Xm) are pairwise consistent if for every i, j ∈ [m], the relations Ri(Xi) and Rj(Xj) are consistent. Clearly, if a relation T witnesses the global consistency of R1, . . . , Rm, then the relation T [XiXj] witnesses the consistency of Ri and Rj, for every i, j ∈ [m]. Thus, if the collection R1, . . . , Rm is globally consistent, then the relations R1, . . . , Rm are pairwise consistent. The converse, however, is not true, in general.
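To see concretely why the converse fails, consider three copies of the binary "not equal" relation on {0, 1}, placed on the three edges of a triangle. The sketch below is an illustration of our own choosing: relations are sets of value tuples listed in schema order, pairwise consistency is tested by comparing projections on common attributes, and global consistency is tested by brute-forcing the join. The function names are hypothetical.

```python
from itertools import product

def project(rel, attrs, keep):
    """Project a set of value tuples, listed in the order of attrs, onto keep."""
    return {tuple(t[attrs.index(a)] for a in keep) for t in rel}

def pairwise_consistent(schemas, relations):
    """Test every pair for equal projections on the common attributes."""
    for i in range(len(schemas)):
        for j in range(i + 1, len(schemas)):
            common = sorted(set(schemas[i]) & set(schemas[j]))
            if (project(relations[i], schemas[i], common)
                    != project(relations[j], schemas[j], common)):
                return False
    return True

def join_all(schemas, relations, domain):
    """Brute-force join: keep each total assignment whose projections all occur."""
    attrs = tuple(sorted(set().union(*schemas)))
    rows = {v for v in product(domain, repeat=len(attrs))
            if all(tuple(v[attrs.index(a)] for a in sch) in rel
                   for sch, rel in zip(schemas, relations))}
    return attrs, rows

def globally_consistent(schemas, relations, domain):
    """The join is a universal relation exactly when it projects back onto each Ri."""
    attrs, rows = join_all(schemas, relations, domain)
    return all(project(rows, attrs, sch) == rel
               for sch, rel in zip(schemas, relations))

schemas = [("A", "B"), ("B", "C"), ("A", "C")]
different = {(0, 1), (1, 0)}
relations = [different, different, different]

print(pairwise_consistent(schemas, relations))          # pairwise consistent
print(globally_consistent(schemas, relations, (0, 1)))  # but no universal relation
```

No assignment of two values to A, B, C can make all three pairs different, so the join is empty even though every two of the relations are consistent.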
Beeri, Fagin, Maier, and Yannakakis [9] characterized the set of schemas for which pairwise consistency is a necessary and sufficient condition for global consistency of relations. Their characterization involves notions from hypergraph theory that we now review.

Acyclic Hypergraphs
A hypergraph is a pair H = (V, E), where V is a set of vertices and E is a set of hyperedges, each of which is a non-empty subset of V. Every collection X1, . . . , Xm of sets of attributes can be identified with a hypergraph H = (V, E), where V = X1 ∪ · · · ∪ Xm and E = {X1, . . . , Xm}. Conversely, every hypergraph H = (V, E) gives rise to a collection X1, . . . , Xm of sets of attributes, where X1, . . . , Xm are the hyperedges of H. Thus, we can move from collections of sets of attributes to hypergraphs, and vice versa. The notion of an acyclic hypergraph generalizes the notion of an acyclic graph. Since we will not work directly with the definition of an acyclic hypergraph, we refer the reader to [9] for the precise definition. Instead, we focus on other notions that are equivalent to hypergraph acyclicity and will be of interest to us in the sequel.

Conformal and Chordal Hypergraphs
The primal graph of a hypergraph H = (V, E) is the undirected graph that has V as its set of vertices and has an edge between any two distinct vertices that appear together in at least one hyperedge of H.
A hypergraph H is conformal if the set of vertices of every clique (i.e., complete subgraph) of the primal graph of H is contained in some hyperedge of H. A hypergraph H is chordal if its primal graph is chordal, that is, if every cycle of length at least four of the primal graph of H has a chord (i.e., an edge that connects two nodes on the cycle, but is not one of the edges of the cycle). To illustrate these concepts, let Vn = {A1, . . . , An} be a set of n vertices and consider the hypergraphs
Pn = (Vn, {{Ai, Ai+1} : 1 ≤ i ≤ n − 1}),
Cn = (Vn, {{Ai, Ai+1} : 1 ≤ i ≤ n − 1} ∪ {{An, A1}}),
Hn = (Vn, {Vn \ {Ai} : 1 ≤ i ≤ n}).
If n ≥ 2, then the hypergraph Pn is both conformal and chordal. The hypergraph C3 = H3 is chordal, but not conformal. For every n ≥ 4, the hypergraph Cn is conformal, but not chordal, while the hypergraph Hn is chordal, but not conformal.
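The two properties can be checked mechanically on small hypergraphs. The sketch below assumes that Pn is the path on A1, . . . , An, that Cn is the cycle on A1, . . . , An, and that Hn has the n hyperedges Vn \ {Ai}; this is our reading of the three families. Conformality is tested by brute force over all cliques (exponential, so only for tiny inputs), and chordality is tested by repeatedly deleting a simplicial vertex, which succeeds exactly on chordal graphs. All function names are hypothetical.

```python
from itertools import combinations

def primal_graph(edges):
    """Adjacency sets of the primal graph of a hypergraph given by its hyperedges."""
    adj = {v: set() for e in edges for v in e}
    for e in edges:
        for u, v in combinations(e, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj

def is_conformal(edges):
    """Every clique of the primal graph must lie inside some hyperedge."""
    adj = primal_graph(edges)
    for size in range(1, len(adj) + 1):
        for clique in combinations(sorted(adj), size):
            if all(v in adj[u] for u, v in combinations(clique, 2)):
                if not any(set(clique) <= set(e) for e in edges):
                    return False
    return True

def is_chordal(edges):
    """Delete simplicial vertices (neighborhood is a clique) until none are left."""
    adj = primal_graph(edges)
    while adj:
        simplicial = next((v for v, nbrs in adj.items()
                           if all(b in adj[a] for a, b in combinations(nbrs, 2))),
                          None)
        if simplicial is None:
            return False
        for n in adj.pop(simplicial):
            adj[n].discard(simplicial)
    return True

def P(n):
    return [frozenset({f"A{i}", f"A{i + 1}"}) for i in range(1, n)]

def C(n):
    return P(n) + [frozenset({f"A{n}", "A1"})]

def H(n):
    vertices = {f"A{i}" for i in range(1, n + 1)}
    return [frozenset(vertices - {f"A{i}"}) for i in range(1, n + 1)]

for name, h in [("P4", P(4)), ("C3", C(3)), ("C4", C(4)), ("H4", H(4))]:
    print(name, "conformal:", is_conformal(h), "chordal:", is_chordal(h))
```

On these four inputs the output agrees with the classification stated in the text.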

Running Intersection Property
A hypergraph H has the running intersection property if there is a listing X1, . . . , Xm of all hyperedges of H such that for every i ∈ [m] with i ≥ 2, there is a j < i such that Xi ∩ (X1 ∪ · · · ∪ Xi−1) ⊆ Xj.

Join Tree
A join tree for a hypergraph H is an undirected tree T with the set E of the hyperedges of H as its vertices and such that for every vertex v of H, the set of vertices of T containing v forms a subtree of T, i.e., if v belongs to two vertices Xi and Xj of T, then v belongs to every vertex of T in the unique simple path from Xi to Xj in T.

Local-to-Global Consistency Property for Relations
Let H be a hypergraph and let X1, . . . , Xm be a listing of all hyperedges of H. We say that H has the local-to-global consistency property for relations if every pairwise consistent collection R1(X1), . . . , Rm(Xm) of relations over the schemas X1, . . . , Xm is globally consistent.
We are ready to state the main result in Beeri et al. [9].
Theorem 1 (Theorem 3.4 in [9]). Let H be a hypergraph. The following statements are equivalent: (a) H is an acyclic hypergraph.
(b) H is a conformal and chordal hypergraph.
(c) H has the running intersection property.
(d) H has a join tree.
(e) H has the local-to-global consistency property for relations.
As an illustration, if n ≥ 2, the hypergraph Pn is acyclic, hence it has the local-to-global consistency property for relations. In contrast, if n ≥ 3, the hypergraphs Cn and Hn are cyclic, hence they do not have the local-to-global consistency property for relations.
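Acyclicity itself can be tested with the classical reduction procedure usually attributed to Graham (often called GYO reduction): repeatedly delete vertices that occur in a single hyperedge and hyperedges that are empty or contained in another hyperedge; the hypergraph is acyclic exactly when nothing is left. The following is a minimal sketch of that procedure, not code from [9]; the function name and the hard-coded test hypergraphs are assumptions for illustration.

```python
def is_acyclic(edges):
    """Graham (GYO) reduction: True iff the hypergraph reduces to nothing."""
    edges = [set(e) for e in edges]
    changed = True
    while changed:
        changed = False
        # Rule 1: delete every vertex that occurs in exactly one hyperedge.
        for e in edges:
            lonely = {v for v in e if sum(v in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: delete one hyperedge that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges

path_p3 = [{"A1", "A2"}, {"A2", "A3"}]
triangle_c3 = [{"A1", "A2"}, {"A2", "A3"}, {"A3", "A1"}]
cycle_c4 = [{"A1", "A2"}, {"A2", "A3"}, {"A3", "A4"}, {"A4", "A1"}]
h4 = [{"A2", "A3", "A4"}, {"A1", "A3", "A4"}, {"A1", "A2", "A4"}, {"A1", "A2", "A3"}]

for name, h in [("P3", path_p3), ("C3", triangle_c3), ("C4", cycle_c4), ("H4", h4)]:
    print(name, "acyclic:", is_acyclic(h))
```

The path reduces to nothing, whereas in the triangle, the 4-cycle, and H4 every vertex lies in at least two hyperedges and no hyperedge is contained in another, so neither rule ever applies.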

Complexity of Global Consistency for Relations
We now discuss the algorithmic aspects of global consistency. The global consistency problem for relations (also known as the universal relation problem for relations) asks: given a hypergraph H = (V, {X1, . . . , Xm}) and relations R1, . . . , Rm over H, is the collection R1, . . . , Rm globally consistent? Honeyman, Ladner, and Yannakakis [15] established the following result.
Theorem 2. The global consistency problem for relations is NP-complete.
The NP-hardness of the global consistency problem for relations is proved via a reduction from 3-Colorability in which each relation has arity 2 and consists of just six pairs. Specifically, each edge (u, v) in a given graph G gives rise to a relation of arity 2 with attributes u and v; the six pairs in the relation are the pairs of different colors chosen from the three colors "red", "blue", and "green". The membership in NP uses the observation that if a collection R1, . . . , Rm of relations is globally consistent, then a witness W of this fact can be obtained as follows: for each i ≤ m and each tuple t ∈ Ri, pick a tuple in the join R1 ⋈ · · · ⋈ Rm that extends t and insert it in W. In particular, the cardinality |W| of W is bounded by the sum Σ_{i=1}^{m} |Ri| ≤ m · max{|Ri| : i ∈ [m]}, and thus the size of W is bounded by a polynomial in the size of the input hypergraph H and the input relations R1, . . . , Rm.
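The reduction from 3-Colorability is easy to write down. In the sketch below, the vertices of the input graph serve as attributes, each edge contributes the relation of the six ordered pairs of distinct colors, and global consistency is decided by brute-forcing the join over all colorings, which takes exponential time and is meant only to illustrate the correspondence. The function names are hypothetical.

```python
from itertools import permutations, product

COLORS = ("red", "blue", "green")

def reduction(graph_edges):
    """One binary relation over attributes (u, v) per edge: six pairs of distinct colors."""
    return [((u, v), set(permutations(COLORS, 2))) for u, v in graph_edges]

def join(instance):
    """All colorings of the vertices whose projection on every edge is in its relation."""
    attrs = sorted({a for schema, _ in instance for a in schema})
    rows = []
    for values in product(COLORS, repeat=len(attrs)):
        t = dict(zip(attrs, values))
        if all((t[u], t[v]) in rel for (u, v), rel in instance):
            rows.append(t)
    return rows

def globally_consistent(instance):
    """Check that the join projects back onto each input relation."""
    rows = join(instance)
    return all({(t[u], t[v]) for t in rows} == rel for (u, v), rel in instance)

triangle = [(1, 2), (2, 3), (1, 3)]
k4 = [(u, v) for u in range(1, 5) for v in range(u + 1, 5)]

print(globally_consistent(reduction(triangle)))  # the triangle is 3-colorable
print(globally_consistent(reduction(k4)))        # K4 is not
```

The tuples of the join are exactly the proper 3-colorings of the graph. Since permuting the colors of a proper coloring yields another proper coloring, a single proper coloring already forces all six pairs to appear in the projection on every edge.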
Several restricted cases of the global consistency problem for relations turn out to be solvable in polynomial time.
First, Proposition 1 implies that the consistency problem for two relations is solvable in polynomial time, since it amounts to checking that the two given relations R(X) and S(Y ) have the same projection on X ∩ Y .
Second, from the preceding fact and from Theorem 1, it follows that the global consistency problem for relations is solvable in polynomial time when restricted to acyclic hypergraphs, since, in this case, the global consistency of a collection of relations is equivalent to the pairwise consistency of the relations in the collection.
Finally, for every fixed hypergraph H = (V, {X1, . . . , Xm}) (be it cyclic or acyclic), the global consistency problem restricted to relations R1(X1), . . . , Rm(Xm) with sets of attributes X1, . . . , Xm is also solvable in polynomial time. This is so because, by Proposition 2, one can first compute the join J = R1 ⋈ · · · ⋈ Rm in polynomial time and then check whether J[Xi] = Ri holds, for i = 1, . . . , m. While the cardinality |J| of this witness J can only be bounded by ∏_{i=1}^{m} |Ri| ≤ (max{|Ri| : i ∈ [m]})^m, this cardinality is still polynomial in the size of the input because, in this case, the exponent m is fixed and not part of the input.

BAG CONSISTENCY
Basic Notions
Let X be a set of attributes. A bag over X is a function R : Tup(X) → {0, 1, 2, . . .}. As with relations, we write R(X) to emphasize the fact that R is a bag over X; the support Supp(R) (also denoted by R′) of R is the set of X-tuples t that are assigned non-zero value. We say that R is finite if its support R′ is a finite set. In the sequel, we will assume that all bags are finite.
If R is a bag and t is an X-tuple, then the non-negative integer R(t) is called the multiplicity of t in R; we write t : R(t) to denote that the multiplicity of t in R is equal to R(t). Every bag R can be viewed as a finite set of elements of the form t : R(t), where t ∈ R′ and R(t) ≠ 0. A bag can also be represented in tabular form. For example, the table

A    B    #
a1   b1 : 2
a2   b2 : 1
a3   b3 : 5

represents the bag R = {(a1, b1) : 2, (a2, b2) : 1, (a3, b3) : 5}.
Let R be a bag over X and assume that Z ⊆ X. If t is a Z-tuple, then the marginal of R over t is defined by
R(t) := Σ_{r ∈ R′ : r[Z] = t} R(r).
In this way, R induces a bag over Z, called the marginal of R on Z and denoted by R[Z].
Let R be a bag over X and S a bag over Y. The bag join R ⋈b S of R and S is the bag over XY having support R′ ⋈ S′ and such that every XY-tuple t ∈ R′ ⋈ S′ has multiplicity (R ⋈b S)(t) := R(t[X]) · S(t[Y]).
Two bags R(X) and S(Y) are consistent if there is a bag T over XY such that T[X] = R and T[Y] = S; we say that T witnesses the consistency of R and S. It is easy to see that if R(X) and S(Y) are consistent bags and T is a bag that witnesses their consistency, then T′ ⊆ R′ ⋈ S′, that is, the support of T is contained in the join of the supports of R and S.
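The bag operations can be sketched in a few lines of Python. The sketch assumes that the marginal of a bag sums the multiplicities of all tuples with the same projection and that the bag join multiplies multiplicities, which is the standard reading of these operations; bags are encoded as Counters from value tuples (listed in schema order) to positive multiplicities, and the bag S is a hypothetical example.

```python
from collections import Counter

def marginal(bag, attrs, keep):
    """R[Z]: add up the multiplicities of all tuples with the same projection on keep."""
    out = Counter()
    for t, mult in bag.items():
        out[tuple(t[attrs.index(a)] for a in keep)] += mult
    return out

def bag_join(r, x, s, y):
    """Bag join: support is the join of the supports, multiplicities are multiplied."""
    common = [a for a in x if a in y]
    extra = [a for a in y if a not in x]
    out = Counter()
    for tr, mr in r.items():
        for ts, ms in s.items():
            if all(tr[x.index(a)] == ts[y.index(a)] for a in common):
                out[tr + tuple(ts[y.index(a)] for a in extra)] = mr * ms
    return tuple(x) + tuple(extra), out

# The bag R from the table in the text, over (A, B).
R = Counter({("a1", "b1"): 2, ("a2", "b2"): 1, ("a3", "b3"): 5})
# A hypothetical bag over (B, C), chosen for illustration.
S = Counter({("b1", "c1"): 1, ("b1", "c2"): 3})

print(marginal(R, ("A", "B"), ("B",)))   # one entry per B-value
print(marginal(R, ("A", "B"), ()))       # the empty tuple carries the total count 8
attrs, J = bag_join(R, ("A", "B"), S, ("B", "C"))
print(attrs, dict(J))
print(marginal(J, attrs, ("A", "B")))    # (a1, b1) now has multiplicity 8, not 2
```

The last line shows that projecting a bag join back onto one of its arguments can inflate multiplicities, which is one reason why the bag join behaves differently from the relational join as a candidate witness.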

Consistency of Two Bags
By Proposition 1, if two relations R(X) and S(Y) are consistent, then their join R ⋈ S witnesses their consistency; moreover, R ⋈ S is the largest relation that has this property. In contrast, this is not true for bags because there are consistent bags R(X) and S(Y) such that the support T′ of every bag T witnessing their consistency is a proper subset of R′ ⋈ S′.
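A small example of our own choosing (not taken from the text) makes this phenomenon visible. The sketch below enumerates, by brute force, all bags over (A, B, C) supported on the join of the supports of two tiny bags and keeps those whose marginals are the two given bags.

```python
from collections import Counter
from itertools import product

R = {(1, 2): 1, (2, 2): 1}   # a bag over (A, B)
S = {(2, 1): 1, (2, 2): 1}   # a bag over (B, C)

# Join of the supports: all (a, b, c) with (a, b) in R' and (b, c) in S'.
J = [(a, b, c) for (a, b) in R for (b2, c) in S if b == b2]

def is_witness(T):
    """Check that the marginals of T on (A, B) and on (B, C) are R and S."""
    ab, bc = Counter(), Counter()
    for (a, b, c), mult in T.items():
        ab[(a, b)] += mult
        bc[(b, c)] += mult
    return ab == Counter(R) and bc == Counter(S)

# No multiplicity in a witness can exceed the largest multiplicity in R or S.
bound = max(max(R.values()), max(S.values()))
witnesses = []
for mults in product(range(bound + 1), repeat=len(J)):
    T = {t: m for t, m in zip(J, mults) if m > 0}
    if is_witness(T):
        witnesses.append(T)

print(len(J), "tuples in the join of the supports")
for T in witnesses:
    print("witness:", T)
print("bag join is a witness:", is_witness({t: 1 for t in J}))
```

The enumeration finds exactly two witnesses, each supported on two of the four tuples of the join of the supports. Neither contains the other, so in this example there is no largest witness, and the bag join (all four tuples with multiplicity 1) is not a witness at all.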
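It is easy to see that if two bags R(X) and S(Y) are consistent, then R[X ∩ Y] = S[X ∩ Y]; indeed, if T witnesses their consistency, then both marginals coincide with T[X ∩ Y]. The converse turns out to also be true, but its proof is far from obvious. We will establish the converse by bringing into the picture concepts from linear programming and from the theory of maximum flows.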
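With each pair of bags R(X) and S(Y), we associate the following linear program P(R, S). Let J = R′ ⋈ S′ be the join of the supports of R and S. For each t ∈ J, there is a variable xt. The constraints of P(R, S) are
Σ_{t ∈ J : t[X] = r} xt = R(r) for each r ∈ R′,
Σ_{t ∈ J : t[Y] = s} xt = S(s) for each s ∈ S′,
xt ≥ 0 for each t ∈ J.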
The linear program P(R, S) can be viewed as the set of the flow constraints of an instance of the max-flow problem. A network N = (V, E, c, s, t) is a directed graph G = (V, E) with a non-negative weight c(u, v), called the capacity, assigned to each edge (u, v) ∈ E, and two distinguished vertices s, t ∈ V, called the source and the sink. A flow for the network is an assignment of non-negative weights f(u, v) on the edges (u, v) ∈ E so that both the capacity constraints and the flow constraints are respected, that is,
0 ≤ f(u, v) ≤ c(u, v) for each (u, v) ∈ E,
Σ_{v ∈ N−(u)} f(v, u) = Σ_{w ∈ N+(u)} f(u, w) for each u ∈ V \ {s, t},
where N−(u) and N+(u) denote the sets of in-neighbors and out-neighbors of u in G. The value of such a flow is the quantity
Σ_{w ∈ N+(s)} f(s, w) = Σ_{v ∈ N−(t)} f(v, t),
where the equality follows from the flow constraints. In the max-flow problem, the goal is to find a flow of maximum value. A flow is saturated if f(s, w) = c(s, w) for every w ∈ N+(s) and f(v, t) = c(v, t) for every v ∈ N−(t). It is obvious that if a saturated flow exists, then every max flow is saturated.
With each pair R(X) and S(Y) of bags, we associate the following network N(R, S). The network has 1 + |R′| + |S′| + 1 vertices: one source vertex s*, one vertex for each tuple r in the support R′ of R, one vertex for each tuple s in the support S′ of S, and one target vertex t*. There is an arc of capacity R(r) from s* to r for each r ∈ R′, an arc of capacity S(s) from s to t* for each s ∈ S′, and an arc of unbounded (i.e., very large) capacity from t[X] to t[Y] for each t ∈ R′ ⋈ S′. The next result yields several different characterizations of the consistency of two bags.
Lemma 1. Let R(X) and S(Y) be two bags. The following statements are equivalent:
1. R(X) and S(Y) are consistent.
2. R[X ∩ Y] = S[X ∩ Y].
3. P(R, S) is feasible over the rationals.
4. P(R, S) is feasible over the integers.
5. N(R, S) admits a saturated flow.
Proof. (Sketch) The equivalence of the statements (1) and (4) is immediate from the definitions. As discussed earlier, (1) implies (2). For (2) implies (3), it can be checked that setting xt := R(t[X]) · S(t[Y]) / R(t[X ∩ Y]) for each t ∈ J gives a rational solution for P(R, S). For (3) implies (5), let x* = (x*t)t∈J be a rational solution for P(R, S) and let f be the following assignment for N(R, S): f(s*, r) := R(r) for each r ∈ R′, f(s, t*) := S(s) for each s ∈ S′, and f(t[X], t[Y]) := x*t for each t ∈ J. This assignment is a flow since the equations of P(R, S) say that the flow constraints are satisfied; furthermore, it is a saturated flow by construction. For (5) implies (1), let g be a saturated flow for N(R, S); in particular, this is a max flow for N(R, S). Since all capacities in N(R, S) are integers, the integrality theorem for the max-flow problem asserts that there is a max flow f consisting of integers (see, e.g., [26]), which, of course, is also a saturated flow. Let T(XY) be the bag defined by setting T(t) := f(t[X], t[Y]) for each t ∈ R′ ⋈ S′. Since f is saturated, we have that f(s*, r) = c(s*, r) = R(r) for each r ∈ R′ and f(s, t*) = c(s, t*) = S(s) for each s ∈ S′. This means that the flow constraints imply that T witnesses the consistency of R and S. Thus, the statements (1), (2), (3), and (5) are equivalent.
The equivalence of statements (1) and (2) in Lemma 1 yields a simple polynomial-time test to determine the consistency of two bags, namely, given two bags R(X) and S(Y), check whether or not R[X ∩ Y] = S[X ∩ Y]. Later on, we will see that the equivalence of statements (1) and (5) implies that there is a polynomial-time algorithm for constructing a witness to the consistency of two consistent bags.
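Both the test and the construction of a witness can be sketched in Python. The sketch below does not run a general max-flow algorithm. Instead it uses a swapped-in technique: because the middle arcs of N(R, S) connect every tuple of R to every tuple of S with the same projection on X ∩ Y, a saturated integral flow can be filled in greedily, group by group, in the style of the northwest-corner rule for transportation problems. The encoding of bags as dictionaries from value tuples to positive multiplicities and all function names are assumptions of this illustration.

```python
from collections import Counter, defaultdict

def marginal(bag, attrs, keep):
    """Marginal of a bag (dict: value tuple -> positive multiplicity) on keep."""
    out = Counter()
    for t, mult in bag.items():
        out[tuple(t[attrs.index(a)] for a in keep)] += mult
    return out

def witness(r, x, s, y):
    """Return (attrs, T) with T[X] = r and T[Y] = s, or None if r, s are inconsistent."""
    common = [a for a in x if a in y]
    extra = [a for a in y if a not in x]
    # Consistency test: the marginals on the common attributes must coincide.
    if marginal(r, x, common) != marginal(s, y, common):
        return None
    # Group the tuples of each bag by their projection on the common attributes.
    rows, cols = defaultdict(list), defaultdict(list)
    for t, mult in r.items():
        rows[tuple(t[x.index(a)] for a in common)].append([t, mult])
    for t, mult in s.items():
        cols[tuple(t[y.index(a)] for a in common)].append([t, mult])
    out = {}
    for key, left in rows.items():
        right = cols[key]
        i = j = 0
        # Greedy fill: ship as much as possible between the current pair of tuples.
        while i < len(left) and j < len(right):
            amount = min(left[i][1], right[j][1])
            tr, ts = left[i][0], right[j][0]
            out[tr + tuple(ts[y.index(a)] for a in extra)] = amount
            left[i][1] -= amount
            right[j][1] -= amount
            if left[i][1] == 0:
                i += 1
            if right[j][1] == 0:
                j += 1
    return tuple(x) + tuple(extra), out

R = {("a1", "b"): 3, ("a2", "b"): 2}   # a bag over (A, B)
S = {("b", "c1"): 4, ("b", "c2"): 1}   # a bag over (B, C)
attrs, T = witness(R, ("A", "B"), S, ("B", "C"))
print(attrs, T)
```

Within each group the remaining totals on the two sides stay equal, so both pointers reach the end together and every multiplicity of R and S is used up exactly. The running time is linear in the sizes of the two bags once the groups are formed.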

Global Consistency for Bags
Let R1(X1), . . . , Rm(Xm) be bags over the schemas X1, . . . , Xm. We say that the collection R1, . . . , Rm is globally consistent if there is a bag T over X1 ∪ · · · ∪ Xm such that T[Xi] = Ri for all i ∈ [m]. We say that the bag T witnesses the global consistency of the bags R1, . . . , Rm. As with relations, pairwise consistency of a collection of bags is a necessary, but not sufficient, condition for the global consistency of the collection. Let H be a hypergraph and let X1, . . . , Xm be a listing of all hyperedges of H. We say that H has the local-to-global consistency property for bags if every pairwise consistent collection R1(X1), . . . , Rm(Xm) of bags over the schemas X1, . . . , Xm is globally consistent. The main structural result of this paper asserts that the acyclic hypergraphs are precisely the hypergraphs for which the local-to-global consistency property for bags holds.
Theorem 3. Let H be a hypergraph. The following statements are equivalent:
(a) H is an acyclic hypergraph.
(b) H is a conformal and chordal hypergraph.
(c) H has the running intersection property.
(d) H has a join tree.
(e) H has the local-to-global consistency property for bags.
Proof. (Outline) Let H be a hypergraph. By Theorem 1, statements (a), (b), (c), and (d) are equivalent, because these statements express "structural" properties of hypergraphs, i.e., they involve only the vertices and the hyperedges of the hypergraph at hand. So, we only have to show that statement (e), which involves "semantic" notions about bags, is equivalent to (one of) the other four statements. This will be achieved in two steps. First, we show that statement (c) implies statement (e), i.e., if H has the running intersection property, then H has the local-to-global consistency property for bags. Second, we show that statement (e) implies statement (b) by showing the contrapositive: if H is not conformal or H is not chordal, then H does not have the local-to-global consistency property for bags.
Step 1. If the hypergraph H has the running intersection property, then there is a listing X1, . . . , Xm of its hyperedges such that for every i ∈ [m] with i ≥ 2, there is a j ∈ [i − 1] such that Xi ∩ (X1 ∪ · · · ∪ Xi−1) ⊆ Xj. Let R1(X1), . . . , Rm(Xm) be a collection of pairwise consistent bags over the schemas X1, . . . , Xm. By induction on i = 1, . . . , m, we show that there is a bag Ti over X1 ∪ · · · ∪ Xi that witnesses the global consistency of the bags R1, . . . , Ri. The claim is obvious for the base case i = 1. Assume that i ≥ 2 and that the claim is true for all smaller indices. Let X := X1 ∪ · · · ∪ Xi−1 and, by the running intersection property, let j ∈ [i − 1] be such that Xi ∩ X ⊆ Xj. By induction hypothesis, there is a bag Ti−1 over X that witnesses the global consistency of R1, . . . , Ri−1. We show that Ti−1 and Ri are consistent by showing that Ti−1[X ∩ Xi] = Ri[X ∩ Xi] and invoking Lemma 1. After this, we show that if Ti is a bag that witnesses the consistency of the bags Ti−1 and Ri, then Ti witnesses the global consistency of R1, . . . , Ri.
Step 2. We have to show that if H is not conformal or H is not chordal, then H does not have the local-to-global consistency property for bags. We first establish that it is enough to show that certain "minimal" hypergraphs do not have the local-to-global consistency property for bags. Specifically, it is enough to show the following two statements:
1. For every n ≥ 4, the hypergraph Cn = (Vn, {{Ai, Ai+1} : 1 ≤ i ≤ n − 1} ∪ {{An, A1}}) with Vn = {A1, . . . , An} does not have the local-to-global consistency property for bags. Recall that Cn is not chordal.
2. For every n ≥ 3, the hypergraph Hn = (Vn, {Vn \ {Ai} : 1 ≤ i ≤ n}) with Vn = {A1, . . . , An} does not have the local-to-global consistency property for bags. Recall that Hn is not conformal.
The preceding "minimal" non-conformal and non-chordal hypergraphs share the following properties: all their hyperedges have the same number of vertices and all their vertices appear in the same number of hyperedges. Let H* = (V*, E*) be a hypergraph and let d and k be positive integers. The hypergraph H* is called k-uniform if every hyperedge of H* has exactly k vertices. It is called d-regular if every vertex of H* appears in exactly d hyperedges of H*. Thus, the "minimal" non-conformal hypergraph Hn is (n − 1)-uniform and (n − 1)-regular. Likewise, the "minimal" non-chordal hypergraph Cn is 2-uniform and 2-regular.
Assume that H* is a k-uniform and d-regular hypergraph with d ≥ 2 and with hyperedges E* = {X1, . . . , Xm}. We construct a collection C(H*) := {R1(X1), . . . , Rm(Xm)} of bags and show that the bags in this collection are pairwise consistent but are not globally consistent. This will imply that the local-to-global consistency property for bags fails for the hypergraphs Hn and Cn above. For each i ∈ [m − 1], the support of the bag Ri consists of all Xi-tuples t with values in {0, . . . , d − 1} whose sum Σ_{C ∈ Xi} t(C) is congruent to 0 mod d; the support of Rm is defined in the same way, but with sum congruent to 1 mod d. Every tuple in the support of each of these bags has multiplicity 1. To show that the bags R1, . . . , Rm are not globally consistent, we proceed by contradiction. If T were a bag that witnesses their consistency, then T would be non-empty and its support would contain a tuple t such that the projections t[Xi] belong to the supports R′i of the Ri, for each i ∈ [m]. In turn this means that
Σ_{C ∈ Xi} t(C) ≡ 0 (mod d) for each i ∈ [m − 1], (6)
Σ_{C ∈ Xm} t(C) ≡ 1 (mod d). (7)
Since by d-regularity each C ∈ V* belongs to exactly d many sets Xi, adding up all the equations in (6) and (7) gives
Σ_{C ∈ V*} d · t(C) ≡ 1 (mod d),
which is absurd since the left-hand side is congruent to 0 mod d and the right-hand side is congruent to 1 mod d.
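The parity argument can be checked experimentally. The sketch below rests on an assumption about the construction: every bag has multiplicity 1 on the tuples with values in {0, . . . , d − 1} whose sum is congruent to 0 mod d, except for the last bag, where the sum must be congruent to 1 mod d. Under that assumption, the code verifies pairwise consistency through equality of marginals on common attributes (statement (2) of Lemma 1) and verifies global inconsistency by showing that the join of the supports is empty. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def parity_bags(edges, d):
    """One bag per hyperedge: tuples over {0..d-1} with sum = 0 mod d (1 for the last)."""
    bags = []
    for i, x in enumerate(edges):
        target = 1 if i == len(edges) - 1 else 0
        bags.append({t: 1 for t in product(range(d), repeat=len(x))
                     if sum(t) % d == target})
    return bags

def marginal(bag, attrs, keep):
    out = Counter()
    for t, mult in bag.items():
        out[tuple(t[attrs.index(a)] for a in keep)] += mult
    return out

def pairwise_consistent(edges, bags):
    """Two bags are consistent iff their marginals on the common attributes agree."""
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            common = [a for a in edges[i] if a in edges[j]]
            if marginal(bags[i], edges[i], common) != marginal(bags[j], edges[j], common):
                return False
    return True

def join_of_supports(edges, bags, d):
    """All total assignments whose projection on every hyperedge lies in its support."""
    attrs = sorted({a for x in edges for a in x})
    return [v for v in product(range(d), repeat=len(attrs))
            if all(tuple(v[attrs.index(a)] for a in x) in bag
                   for x, bag in zip(edges, bags))]

triangle = [("A1", "A2"), ("A2", "A3"), ("A3", "A1")]           # C3: 2-uniform, 2-regular
h4 = [("A2", "A3", "A4"), ("A1", "A3", "A4"),
      ("A1", "A2", "A4"), ("A1", "A2", "A3")]                   # H4: 3-uniform, 3-regular

for name, edges, d in [("C3", triangle, 2), ("H4", h4, 3)]:
    bags = parity_bags(edges, d)
    print(name, "pairwise consistent:", pairwise_consistent(edges, bags),
          "| tuples in the join of supports:", len(join_of_supports(edges, bags, d)))
```

Since the support of any witness would have to be a non-empty subset of the join of the supports, an empty join rules out global consistency, in line with the counting argument modulo d.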
It should be pointed out that the proof of Theorem 1 in [9] has a different architecture than the proof of our Theorem 3. In particular, to prove the equivalence between the local-to-global consistency property for relations and acyclicity, Beeri et al. make use of Graham's algorithm, which is an algorithm for testing if a given hypergraph is acyclic. More importantly, for every cyclic hypergraph H, the proof of Theorem 1 in [9] yields a collection of relations over H that are pairwise consistent but not globally consistent; these relations, however, are not pairwise consistent as bags, therefore they cannot be used to prove Theorem 3.
As an immediate consequence of Theorems 1 and 3, we obtain the following result.
Corollary 1. Let H be a hypergraph. The following statements are equivalent:
(a) H has the local-to-global consistency property for relations.
(b) H has the local-to-global consistency property for bags.
Complexity of Global Consistency for Bags
The global consistency problem for bags asks: given a hypergraph H = (V, {X1, . . . , Xm}) and bags R1, . . . , Rm over H, is the collection R1, . . . , Rm globally consistent? Using an integral version of Carathéodory's Theorem due to Eisenbrand and Shmonin [13], we can show that this problem is in NP. At the end of Section 2, we saw that for every fixed hypergraph H, the global consistency problem for relations over the hyperedges of H is solvable in polynomial time. As we shall see next, the state of affairs is far more nuanced for bags. Every fixed hypergraph H gives rise to the decision problem GCPB(H), which asks: given bags R1, . . . , Rm over H, is the collection R1, . . . , Rm globally consistent? The next result is a dichotomy theorem that classifies the complexity of all decision problems GCPB(H), where H is a hypergraph: if H is acyclic, then GCPB(H) is solvable in polynomial time, while if H is cyclic, then GCPB(H) is NP-complete.
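Membership in NP rests on two ingredients: a witness of small enough size exists whenever any witness exists (this is where the integral version of Carathéodory's Theorem is used), and a proposed witness can be verified efficiently. The following Python sketch illustrates only the second, verification step. The function names and the example bags are hypothetical, and bags are represented as Counter objects that map tuples to positive multiplicities.

```python
from collections import Counter


def marginal(bag, attrs, onto):
    """Marginal of a bag over the attribute list `attrs` on the sublist `onto`."""
    positions = [attrs.index(a) for a in onto]
    out = Counter()
    for tup, mult in bag.items():
        out[tuple(tup[p] for p in positions)] += mult
    return out


def is_witness(T, vertices, edges, bags):
    """True iff the bag T over `vertices` satisfies T[X_i] = R_i for every hyperedge X_i."""
    if any(mult <= 0 for mult in T.values()):
        return False  # multiplicities of a bag must be positive
    # Unary plus drops zero entries, so the comparison ignores absent tuples.
    return all(
        +marginal(T, vertices, list(edge)) == +bag
        for edge, bag in zip(edges, bags)
    )


if __name__ == "__main__":
    vertices = ["A", "B", "C"]
    edges = [("A", "B"), ("B", "C")]
    R = Counter({(1, 2): 2, (2, 2): 1})
    S = Counter({(2, 5): 1, (2, 6): 2})
    T = Counter({(1, 2, 5): 1, (1, 2, 6): 1, (2, 2, 6): 1})
    print(is_witness(T, vertices, edges, [R, S]))  # True
```

The check runs in time polynomial in the sizes of T and of the given bags, so the difficulty for cyclic schemas lies in finding a witness, not in verifying one.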

RELATIONS OVER SEMIRINGS
What do relations and bags have in common? For quite some time, it has been realized that relations and bags can be viewed as different instances of a single generalized concept of a relation in which tuples have "labels" that come from the domain of some algebraic structure.
Ioannidis and Ramakrishnan [16] considered relations over labeled systems and studied the query containment problem for relations over such systems. Later on Green, Karvounarakis, and Tannen [14] considered relations over semirings and studied the provenance of query answers. A semiring is an algebraic structure of the form K = (A, +, ×, 0, 1) such that (A, +, 0) is a commutative monoid, (A, ×, 1) is a monoid, × distributes over +, and a × 0 = 0 × a = 0, for every a ∈ A. A semiring K is positive if the following two properties hold: (i) if a + b = 0, then a = 0 and b = 0; (ii) if a × b = 0, then a = 0 or b = 0 (i.e., K has no zero divisors). If K is a semiring and X is a set of attributes, then a K-relation over X is a function R : Tup(X) → A. In the PODS 2021 proceedings version of the present paper [7], we raised the question of whether or not the results about the global consistency for bags extend to K-relations, where K is a positive semiring. In particular, does the analog of Theorem 3 for K-relations hold, where K is an arbitrary positive semiring? If not, are there broad classes of semirings for which the analog of Theorem 3 for K-relations holds? Since that time, we have obtained fairly complete answers to these questions that we summarize next; these results will appear in a forthcoming paper.
Our first finding asserts that if K is an arbitrary positive semiring and H is a hypergraph such that the local-to-global consistency property for K-relations holds, then H must be acyclic. Thus, one of the two directions in Theorem 3 holds for arbitrary positive semirings. Our second finding, however, reveals that the reverse direction does not hold for arbitrary positive semirings. For this, we consider a particular positive semiring, denoted R1, and an acyclic hypergraph H, and we show that there are three R1-relations T1, T2, T3 over H that are pairwise consistent but not globally consistent.
According to Proposition 1 and to Lemma 1, both relations and bags have the following property: two relations R(X), S(Y) (or two bags R(X), S(Y)) are consistent if and only if R[X ∩ Y] = S[X ∩ Y]. We say that a semiring K has the inner consistency property if the preceding property holds for all pairs of K-relations. Our third finding tells us that if K is a positive semiring with the inner consistency property and if H is an acyclic hypergraph, then the local-to-global consistency property for K-relations holds for H. Thus, for positive semirings with the inner consistency property, the acyclicity of a hypergraph H is equivalent to the local-to-global consistency property for K-relations over H. This result provides a common generalization of Theorem 1 for relations and of Theorem 3 for bags.
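The marginal condition R[X ∩ Y] = S[X ∩ Y] can be evaluated uniformly over any semiring, which makes concrete the sense in which relations and bags are two instances of one notion. The Python sketch below is illustrative only: the Semiring class and all function names are hypothetical, K-relations are represented as dictionaries from tuples to non-zero labels, and the code tests the marginal condition alone. It does not decide consistency over an arbitrary semiring, since the equivalence between the two is precisely what the inner consistency property asserts.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]  # part of the structure; unused by marginals
    zero: Any
    one: Any


# Relations are K-relations over the Boolean semiring; bags use the naturals.
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
NATURALS = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)


def marginal(K, relation, attrs, onto):
    """K-relation R[onto]: add up in K the labels of tuples with equal projection."""
    positions = [attrs.index(a) for a in onto]
    out = {}
    for tup, label in relation.items():
        key = tuple(tup[p] for p in positions)
        out[key] = K.add(out.get(key, K.zero), label)
    return {key: label for key, label in out.items() if label != K.zero}


def marginals_agree(K, R, X, S, Y):
    """True iff R[X ∩ Y] = S[X ∩ Y] as K-relations."""
    common = sorted(set(X) & set(Y))
    return marginal(K, R, X, common) == marginal(K, S, Y, common)


def support_as_relation(bag):
    """Forget multiplicities: label every tuple of the support with True."""
    return {tup: True for tup in bag}


if __name__ == "__main__":
    X, Y = ["A", "B"], ["B", "C"]
    R = {(1, 2): 2, (2, 2): 1}
    S = {(2, 5): 1, (2, 6): 1}
    print(marginals_agree(NATURALS, R, X, S, Y))  # False
    print(marginals_agree(BOOLEAN, support_as_relation(R), X,
                          support_as_relation(S), Y))  # True
```

In this example the supports have equal marginals as relations, while the bags do not, because the multiplicities on B = 2 add up to 3 in R and to 2 in S.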
Finally, we identify several different sufficient conditions for a semiring to have the inner consistency property. As a result, we establish that the equivalence between acyclicity and the local-to-global consistency property holds for a plethora of semirings, including the tropical semirings, the log semirings, Łukasiewicz's semiring, and every semiring that is a bounded distributive lattice.