Minimizing Margin of Victory for Fair Political and Educational Districting
Ana-Andreea Stoica,1,2 Abhijnan Chakraborty,1 Palash Dey,3 Krishna P. Gummadi1
1 Max Planck Institute for Software Systems, Germany
2 Columbia University, USA
3 Indian Institute of Technology Kharagpur, India
arXiv:1909.05583v1 [cs.SI] 12 Sep 2019
Copyright © 2019, all rights reserved by the authors.

Abstract

In many practical scenarios, a population is divided into disjoint groups for better administration, e.g., electorates into political districts, employees into departments, students into school districts, and so on. However, grouping people arbitrarily may lead to biased partitions, raising concerns of gerrymandering in political districting, racial segregation in schools, etc. To counter such issues, in this paper, we conceptualize such problems in a voting scenario, and propose the FAIR DISTRICTING problem to divide a given set of people having preferences over candidates into k groups such that the maximum margin of victory of any group is minimized. We also propose the FAIR CONNECTED DISTRICTING problem, which additionally requires each group to be connected. We show that the FAIR DISTRICTING problem is NP-complete for plurality voting even if we have only 3 candidates, but admits polynomial time algorithms if we assume k to be some constant or everyone can be moved to any group. In contrast, we show that the FAIR CONNECTED DISTRICTING problem is NP-complete for plurality voting even if we have only 2 candidates and k = 2. Finally, we propose heuristic algorithms for both problems and show their effectiveness in UK political districting and in lowering racial segregation in public schools in the US.

Introduction

Dividing a population into smaller groups is often a practical necessity for better administration. For example, in many democratic countries (most notably, the countries that follow the Westminster System, e.g., UK, Canada, India, Australia, or the Presidential System, e.g., US, Brazil, Mexico, Indonesia), electorates are divided into electoral districts; in many organizations, employees are divided into administrative units like departments; students enrolled in public schools in the US are divided into school districts; and so on. However, the population is not homogeneous; it consists of people with different attributes: gender, race, religion, ideology, etc. Dividing people arbitrarily may lead to biased grouping, manifested differently in different contexts.

In electoral districting, given the voting pattern of the electorate, political parties in power may draw district boundaries that favor them, a practice termed gerrymandering (Lewenberg, Lev, and Rosenschein 2017). For example, a majority of the opposition's supporters may be assigned to a few districts, such that the opponents become a minority in the other districts. Alternatively, the ruling party may want to ensure that it enjoys a healthy lead over its opponents in many districts, so that if a handful of its supporters change sides, its chances of winning are not hampered. There have been several instances of such manipulations in electoral (re)districting in the US, starting as early as 1812 with then Massachusetts governor Elbridge Gerry (after whom the term gerrymandering is named) (Issacharoff 2002).

Public schools in the US are governed by school boards representing local communities, and are largely funded from local property taxes (Corsi-Bunker 2015; Chakraborty et al. 2019). Most students go to a school in the district where they live, with proximity playing an important role in school choice (Douglas N. Harris 2015). Thus, the way schools are distributed determines the racial composition of their students and the revenue they earn. Several reports claim that wealthier, whiter communities have pushed policies so that white families can live in white-majority areas and attend white-majority schools (Richards 2014; Chang 2018). Despite the desegregation efforts following the landmark Supreme Court verdict in the Brown v. Board of Education case in 1954 (which ruled racial segregation of children in public schools to be unconstitutional), 63% of the classmates of a white student are white, compared to 48% of all students being white; similarly, 40% of black and Hispanic students attend schools where over 90% of students are people of color (Frankenberg 2019). As a consequence, a recent report by the educational non-profit EdBuild claimed that “Non-white school districts get $23 Billion less than white districts, despite serving the same number of students” (EdBuild 2019).
To counter such unfairness issues while dividing people into groups, in this paper, we conceptualize such problems in a voting scenario: the goal is to divide a set of n people, each having a preference over a set of candidates, into k groups. While the mapping of electoral districting into voting is direct and utilizes people's ideological preferences, we can think of context-specific mappings in other scenarios. For example, in school districting, we can think of students having preferences according to their sensitive attributes (e.g., gender, race, etc.). Once the mapping is done, we propose the FAIR DISTRICTING problem to create k groups such that the maximum margin of victory of any group is minimized, where margin of victory is defined as the number of people who need to change their preference to change the winner. We also propose the FAIR CONNECTED DISTRICTING problem, which additionally requires each group to be connected. Reducing the margin of victory makes everyone's opinion within a group valuable, since the consensus of the group can be changed even if a small number of people change their preferences. In practical applications, it would lead to higher accountability from the elected candidate in political districting, lower racial segregation in schools, increased interdisciplinary exchange in academic departments, and so on.

Contribution

We make the following contributions in this paper.

• We show that the FAIR DISTRICTING problem is NP-complete for the plurality voting rule even when we have only 3 alternatives and there is no constraint on the size of individual groups [Theorem 1]. We complement this intractability result by proving the existence of polynomial time algorithms for the plurality voting rule when (i) every voter can be moved to any group (which we term the FAIR PARTITIONING problem) [Theorem 3], and (ii) we have a constant number of groups [Theorem 4].

• We show that the FAIR CONNECTED DISTRICTING problem is NP-complete for the plurality voting rule even when there are only 2 alternatives, 2 districts, the maximum degree of any vertex in the underlying graph is 5, and there is no constraint on the size of districts [Theorem 2]. This shows that, although both the FAIR DISTRICTING and FAIR CONNECTED DISTRICTING problems are NP-complete, the FAIR CONNECTED DISTRICTING problem is computationally harder than the FAIR DISTRICTING problem.

• We propose heuristic algorithms for both the FAIR DISTRICTING and FAIR CONNECTED DISTRICTING problems and show their effectiveness in reducing margin of victory in electoral districts in the UK, as well as in lowering racial segregation in public schools in the US.

Related Work

Voting mechanisms have been at the center of historical, political, and sociological studies (Barbara and Garcia-Molina 1987; Barberà et al. 1991; Lublin 1999; Erdélyi, Hemaspaandra, and Hemaspaandra 2015), due to their impact on local communities and society at large. The problem of unfair distribution of voters into districts, otherwise known as gerrymandering, has received significant attention from academic researchers (Butler 1992; Johnston, Rossiter, and Pattie 2006; Issacharoff 2002; Bachrach et al. 2016), and in particular from computational social choice theorists, who have studied geographical (Lewenberg, Lev, and Rosenschein 2017) and social constraints (Cohen-Zemach, Lewenberg, and Rosenschein 2018; Ito et al. 2019; Borodin et al. 2018) on population mobility.

Central to the problem of gerrymandering is the concept of representation: does a collective represent the choices or attributes of those comprising it? In other words, does a district represent its voters? While recent papers conceptualize different measures of representation in voting scenarios (Bachrach et al. 2016; Johnston, Rossiter, and Pattie 2006; Feix et al. 2008; Gelman, Katz, and Tuerlinckx 2002; Felsenthal and Miller 2015), to our knowledge, we are the first to use the concept of margin of victory for re-districting voters to achieve better representation. While minimizing margin of victory does not ensure proportional representation of all voter choices in each district, it at least ensures that the voices present are not lost in the crowd. Intuitively, lowering the margin of victory across districts would ensure a strong opposition to each majority winner, safeguarding against district monopolies, as well as against diluting voter power across many districts.

Computing the margin of victory for different voting rules has been studied in (Xia 2012), while (Dey and Narahari 2015) and (Blom, Stuckey, and Teague 2018) estimate it in real elections. However, to our knowledge, the problem of minimizing margin of victory has not attracted much attention. In this paper, we characterize the complexity of this problem for plurality voting, one of the most common voting rules, and give practical algorithms to solve it on real and synthetic datasets.

Preliminaries

Voting Setting

For a positive integer k, we denote the set {1, 2, . . . , k} by [k]. Let A = {a_i : i ∈ [m]} be a set of m alternatives. A complete order over the set A of alternatives is called a preference. We denote the set of all possible preferences over A by L(A). A tuple (≻_i)_{i∈[n]} ∈ L(A)^n of n preferences is called a profile. An election E is a tuple (≻, A) where ≻ is a profile over a set A of alternatives. If not mentioned otherwise, we denote the number of alternatives and the
number of preferences by m and n respectively. A map $r : \biguplus_{n,|A| \in \mathbb{N}^+} L(A)^n \longrightarrow 2^{A} \setminus \{\emptyset\}$ is called a voting rule. In this paper we consider the plurality voting rule, where the set of winners is the set of alternatives that appear at the first position of the highest number of preferences. We say that a voter votes for an alternative if the voter prefers that alternative most. The number of preferences where an alternative appears at the first place is called its plurality score.

Margin of Victory

Let r be any voting rule. The margin of victory of an election ((≻_i)_{i∈[n]}, A) is the minimum number of votes that need to be changed to change the election outcome. It easily follows that the margin of victory of a plurality election is the ceiling of half the difference between the two highest plurality scores of the alternatives. For ease of notation, we assume that the margin of victory of an empty election (no voters) is ∞.

Problem Definition

We now define our basic problem, which we call FAIR DISTRICTING.

Definition 1 (FAIR DISTRICTING). Given a set A of m alternatives, a set V of n voters along with their corresponding preferences, a set of k groups H = {H_i : i ∈ [k]} along with the set V_i of voters corresponding to each group H_i for i ∈ [k] such that (V_i)_{i∈[k]} forms a partition of V, a function π : V → 2^H \ {∅} denoting the set of groups that each voter can be part of, a minimum size s_min and a maximum size s_max of every group, and a target t for the maximum margin of victory of any group, compute if there exists a partition (V_i′)_{i∈[k]} of V into these k groups such that the following holds.
(i) For every i ∈ [k] and v ∈ V_i′, we have H_i ∈ π(v)
(ii) For every i ∈ [k], we have s_min ≤ |V_i′| ≤ s_max
(iii) The margin of victory in the group H_i is at most t for every i ∈ [k]
We denote an arbitrary instance of this problem by (A, V, k, H = (H_i)_{i∈[k]}, (V_i)_{i∈[k]}, π, s_min, s_max, t).

An important special case of FAIR DISTRICTING is when every voter can be moved to every district; that is, π(v) = H for every voter v ∈ V. We call this problem FAIR PARTITIONING. We denote an arbitrary instance of FAIR PARTITIONING by (A, V, k, H = (H_i)_{i∈[k]}, (V_i)_{i∈[k]}, s_min, s_max, t).

The FAIR DISTRICTING problem is generalized to define the FAIR CONNECTED DISTRICTING problem, where the input also has a graph defined on the set of voters, the given districts are all connected, and we require the new districts to be connected as well. We denote an arbitrary instance of FAIR CONNECTED DISTRICTING by (A, V, G, k, H = (H_i)_{i∈[k]}, (V_i)_{i∈[k]}, π, s_min, s_max, t). In this paper, we study the above problems only for the plurality voting rule and thus omit specifying it every time.

The following observation is immediate from the problem definitions.

Observation 1. FAIR PARTITIONING many-to-one reduces to FAIR DISTRICTING, which many-to-one reduces to FAIR CONNECTED DISTRICTING.

Results: Intractability

In this section, we present our hardness results. Our first result shows that FAIR DISTRICTING is NP-complete even with 3 alternatives. For that we reduce from the well known SAT problem, which is known to be NP-complete.

Theorem 1. The FAIR DISTRICTING problem is NP-complete even if we have only 3 alternatives and there is no constraint on the size of any district.

Proof. FAIR DISTRICTING clearly belongs to NP. To prove NP-hardness, we reduce from the SAT problem. Let (X = {x_i : i ∈ [n]}, C = {C_j : j ∈ [m]}) be an arbitrary instance of SAT. Let us consider the following instance (A, V, k, H, (V_i)_{i∈[k]}, π, s_min = 0, s_max = ∞, t = 2) of FAIR DISTRICTING.

A = {a, b, c}
H = {X_i, X̄_i, Z_i : i ∈ [n]} ∪ {Y_j : j ∈ [m]}
∀i ∈ [n], votes in X_i: m + 2 votes for a, m votes for b, m − 1 votes for c
∀i ∈ [n], votes in X̄_i: m + 2 votes for a, m votes for b, m − 1 votes for c
∀i ∈ [n], votes in Z_i: m + 2 votes for a, m + 1 votes for c
∀j ∈ [m], votes in Y_j: m + 3 votes for a, m votes for b
t = 2

Let f be a function defined on the set of literals as f(x_i) = X_i and f(x̄_i) = X̄_i for every i ∈ [n]. We now describe the function π. For i ∈ [n], no voter in Z_i can move to any other district, except one voter who votes for the alternative c, and she can move to X_i and X̄_i. For i ∈ [n], no voter voting for the alternatives a or c in X_i or X̄_i leaves their current district; a voter in X_i (X̄_i respectively) who votes for the alternative b can move to the district Y_j for some j ∈ [m] if the literal x_i (x̄_i respectively) appears in the clause C_j. Finally, no voter in the district Y_j, j ∈ [m], leaves their current district. This finishes the description of π and the description of the instance of FAIR DISTRICTING. We claim that the two instances are equivalent.
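The construction just described can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name `build_fair_districting_instance`, the district labels (`"X1"`, `"notX1"`, `"Z1"`, `"Y1"`), the DIMACS-style clause encoding (`+i` for x_i, `-i` for x̄_i), and the representation of π as a dictionary of mobile voter groups are all assumptions introduced here for illustration.

```python
from collections import Counter


def build_fair_districting_instance(num_vars, clauses):
    """Build the districts of the Theorem 1 reduction from a CNF formula.

    clauses: list of clauses, each a list of nonzero ints
             (+i stands for literal x_i, -i for its negation); hypothetical encoding.
    Returns (votes, mobile, t):
      votes[district]       -> Counter of first-choice votes in that district
      mobile[(district, z)] -> (number of movable z-voters, allowed target districts);
                               every voter not listed here must stay put.
    """
    m = len(clauses)
    votes = {}
    for i in range(1, num_vars + 1):
        votes[f"X{i}"] = Counter(a=m + 2, b=m, c=m - 1)
        votes[f"notX{i}"] = Counter(a=m + 2, b=m, c=m - 1)
        votes[f"Z{i}"] = Counter(a=m + 2, c=m + 1)
    for j in range(1, m + 1):
        votes[f"Y{j}"] = Counter(a=m + 3, b=m)

    mobile = {}
    for i in range(1, num_vars + 1):
        # Exactly one c-voter of Z_i may move, to X_i or to its negated twin.
        mobile[(f"Z{i}", "c")] = (1, {f"X{i}", f"notX{i}"})
        # b-voters of f(literal) may move to Y_j when the literal occurs in C_j.
        for literal, district in ((i, f"X{i}"), (-i, f"notX{i}")):
            targets = {
                f"Y{j}"
                for j, clause in enumerate(clauses, start=1)
                if literal in clause
            }
            mobile[(district, "b")] = (m, targets)
    return votes, mobile, 2  # target t = 2, as in the instance above


if __name__ == "__main__":
    # (x1 OR not x2) AND (x2)
    votes, mobile, t = build_fair_districting_instance(2, [[1, -2], [2]])
    for name in sorted(votes):
        print(name, dict(votes[name]))
    print(mobile[("X1", "b")])
```

The sketch only materializes the instance (3n + m districts and the mobility constraints); it does not attempt to decide it.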
In one direction, let us assume that the SAT instance is a YES instance. Let g : X → {0, 1} be a satisfying assignment for the SAT instance. Let us consider the following movement of the voters: for i ∈ [n], if g(x_i) = 1, then one voter in the district Z_i who votes for the alternative c moves to the district X_i; otherwise one voter in the district Z_i who votes for the alternative c moves to the district X̄_i. For j ∈ [m], let the clause C_j be ℓ_1 ∨ ℓ_2 ∨ ℓ_3 and let g set the literal ℓ_1 to 1 (we can assume this without loss of generality). Then one voter from the district f(ℓ_1) who votes for b moves to the district Y_j. Since the assignment g satisfies all the clauses, the margin of victory in the district Y_j is 2 for every j ∈ [m]. For i ∈ [n], if g(x_i) = 0 (g(x_i) = 1 respectively), then the margin of victory in the district X̄_i (X_i respectively) is 2 since it receives a voter voting for the alternative c. The rest of the districts (for i ∈ [n], X_i if g(x_i) = 0 and X̄_i if g(x_i) = 1) remain the same and their margin of victory remains 2. Hence the FAIR DISTRICTING instance is also a YES instance.

In the other direction, let us assume that the FAIR DISTRICTING instance is a YES instance. We define an assignment g : X → {0, 1} to the variables in the SAT instance as follows. For i ∈ [n], if a voter in the district Z_i who votes for c moves to X_i, then we define g(x_i) = 1; otherwise we define g(x_i) = 0. We claim that g is a satisfying assignment for the SAT instance. Suppose not; then there exists a clause C_j = ℓ_1 ∨ ℓ_2 ∨ ℓ_3 for some j ∈ [m] which g does not satisfy. To make the margin of victory of the district Y_j at most 2, one voter who votes for b must move into Y_j either from the district f(ℓ_1), or from the district f(ℓ_2), or from the district f(ℓ_3). However, since g does not set any of ℓ_1, ℓ_2, or ℓ_3 to 1, none of these districts receives any voter who votes for the alternative c. Consequently, none of these districts can send a voter who votes for the alternative b to the district Y_j, since otherwise the margin of victory of the district which sends such a voter becomes at least 3, contradicting our assumption that the FAIR DISTRICTING instance is a YES instance. Hence the SAT instance is a YES instance.

Due to Observation 1, it follows immediately from Theorem 1 that the FAIR CONNECTED DISTRICTING problem for the plurality voting rule is also NP-complete. We next show that FAIR CONNECTED DISTRICTING is NP-complete even if we simultaneously have 2 alternatives and 2 districts. For that, we reduce from 2-DISJOINT CONNECTED PARTITIONING, which is defined as follows.

Definition 2 (2-DISJOINT CONNECTED PARTITIONING). Given a connected graph G = (V, E) and two disjoint nonempty sets Z_1, Z_2 ⊂ V, compute if there exists a partition (V_1, V_2) of V such that Z_1 ⊆ V_1, Z_2 ⊆ V_2, and G[V_1] and G[V_2] are both connected. We denote an arbitrary instance of 2-DISJOINT CONNECTED PARTITIONING by (G, Z_1, Z_2).

It is already known that the 2-DISJOINT CONNECTED PARTITIONING problem is NP-complete (van 't Hof, Paulusma, and Woeginger 2009, Theorem 1). Moreover, the proof of Theorem 1 in (van 't Hof, Paulusma, and Woeginger 2009) can be imitated as a reduction from the version of SAT where every literal appears in exactly two clauses; this version is also known to be NP-complete (Berman, Karpinski, and Scott 2003). This proves the following.

Proposition 1. The 2-DISJOINT CONNECTED PARTITIONING problem is NP-complete even if the maximum degree of the input graph is 5.

Theorem 2. The FAIR CONNECTED DISTRICTING problem is NP-complete even if we have only 2 alternatives, 2 districts, the maximum degree of any vertex in the underlying graph is 5, and we do not have any constraint on the size of districts.

Proof. The FAIR CONNECTED DISTRICTING problem is clearly in NP. To prove NP-hardness, we reduce from 2-DISJOINT CONNECTED PARTITIONING to FAIR CONNECTED DISTRICTING. Let (G′ = (U, E′), Z_1, Z_2) be an arbitrary instance of 2-DISJOINT CONNECTED PARTITIONING. Without loss of generality, let us assume that the degree of every vertex in Z_2 is 2; let z_2 be an arbitrary (fixed) vertex of Z_2. Let us consider the following instance (A, V, G = (V, E), k = 2, H = (H_i)_{i∈[2]}, (V_i)_{i∈[2]}, π, s_min = 0, s_max = ∞, t = 1) of FAIR CONNECTED DISTRICTING.

A = {x, y}
V = {v_z : z ∈ Z_2} ∪ {v_u, w_u : u ∈ V[G′] \ Z_2} ∪ D, where D = {d_i : i ∈ [|Z_2| + 1]}
E = {{v_a, v_b} : {a, b} ∈ E′} ∪ {{v_u, w_u} : u ∈ V[G′] \ Z_2} ∪ {{d_i, d_j} : i, j ∈ [|Z_2| + 1], j = i + 1} ∪ {{z_2, d_1}}
H_2 = {d_i : i ∈ [|Z_2| + 1]}
H_1 = V \ H_2
Vote of v_u, u ∈ V[G′]: x ≻ y
Vote of w_u, u ∈ V[G′] \ Z_2: y ≻ x
Vote of d_i, i ∈ [|Z_2| + 1]: y ≻ x
π(v_z) = {H_1}, z ∈ Z_1
π(d_i) = {H_2}, i ∈ [|Z_2| + 1]
π(v) = {H_1, H_2} for every other vertex v

This finishes the description of the instance of FAIR CONNECTED DISTRICTING. We now claim that the FAIR CONNECTED DISTRICTING instance is equivalent to the 2-DISJOINT CONNECTED PARTITIONING instance.
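As an illustration of the instance just described, the sketch below builds the voter graph, the votes, the initial districts, and π from a given graph G′. It is only a sketch under assumptions made here: the function name `build_connected_instance`, the tuple labels `("v", u)`, `("w", u)`, `("d", i)`, and the reading of the edge {z_2, d_1} as joining the voter v_{z_2} to d_1 are illustrative choices, not notation fixed by the text.

```python
def build_connected_instance(vertices, edges, Z1, Z2, z2):
    """Build the Theorem 2 instance from G' = (vertices, edges), Z1, Z2, z2 in Z2.

    Returns (vote, E, H1, H2, pi):
      vote[voter] -> "x" or "y" (the voter's top choice)
      E           -> set of undirected edges (frozensets) on the voters
      H1, H2      -> the two initial districts
      pi[voter]   -> set of district names the voter may belong to
    """
    Z1, Z2 = set(Z1), set(Z2)
    assert z2 in Z2 and not (Z1 & Z2)
    outside_Z2 = [u for u in vertices if u not in Z2]
    D = [("d", i) for i in range(1, len(Z2) + 2)]  # d_1, ..., d_{|Z2|+1}

    # Votes: every v_u prefers x; every w_u and every d_i prefers y.
    vote = {("v", u): "x" for u in vertices}
    vote.update({("w", u): "y" for u in outside_Z2})
    vote.update({d: "y" for d in D})

    # Edges: a copy of G', a pendant w_u on each v_u outside Z2,
    # a path on D, and one edge attaching the path to v_{z2} (assumed reading).
    E = {frozenset({("v", a), ("v", b)}) for a, b in edges}
    E |= {frozenset({("v", u), ("w", u)}) for u in outside_Z2}
    E |= {frozenset({D[i], D[i + 1]}) for i in range(len(D) - 1)}
    E.add(frozenset({("v", z2), D[0]}))

    H2 = set(D)
    H1 = set(vote) - H2

    pi = {voter: {"H1", "H2"} for voter in vote}
    for z in Z1:
        pi[("v", z)] = {"H1"}
    for d in D:
        pi[d] = {"H2"}
    return vote, E, H1, H2, pi


if __name__ == "__main__":
    # G' is the path a - b - c with Z1 = {a}, Z2 = {c}.
    vote, E, H1, H2, pi = build_connected_instance(
        ["a", "b", "c"], [("a", "b"), ("b", "c")], {"a"}, {"c"}, "c"
    )
    print(len(vote), "voters,", len(E), "edges")
```

Each vertex of G′ gains at most one new neighbour (a pendant w_u, or d_1 for v_{z_2}), and the path on D has maximum degree 2.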
In one direction, let us assume that the 2-DISJOINT CONNECTED PARTITIONING instance is a YES instance. Let (V_1, V_2) be a partition of U such that Z_1 ⊆ V_1, Z_2 ⊆ V_2, and G′[V_1] and G′[V_2] are both connected. We consider the following new partition of the voters.

Voters of H_1: {v_u, w_u : u ∈ V_1}; voters of H_2: others

Since G′[V_1] is connected, it follows that G[H_1] is also connected. Similarly, since G′[V_2] is connected, G[D] is connected, and {z_2, d_1} ∈ E[G], it follows that G[H_2] is also connected. In H_1, both the alternatives x and y receive the same number of votes and thus the margin of victory of H_1 is 1. In H_2, the alternative x receives one vote fewer than the alternative y and thus the margin of victory of H_2 is 1. Thus the FAIR CONNECTED DISTRICTING instance is also a YES instance.

In the other direction, let us assume that there exists a valid partition (H_1′, H_2′) of the voters such that both G[H_1′] and G[H_2′] are connected and the margin of victory of both H_1′ and H_2′ is 1. Let us define V_1 = {u ∈ V[G′] : v_u ∈ H_1′} and V_2 = V[G′] \ V_1. It follows from the function π that we have Z_1 ⊆ V_1. Also G′[V_1] is connected since the voters in H_1′ are connected. We also have that G′[V_2] is connected since the voters in H_2′ are connected, the vertices in D form a path, and there exists a pendant vertex in D. We also have Z_2 ⊆ V_2 since the voters in {v_u : u ∈ Z_2} belong to H_2′; otherwise the margin of victory of H_2′ would be strictly more than 1. Hence (V_1, V_2) is a solution of the 2-DISJOINT CONNECTED PARTITIONING instance and thus the instance is a YES instance.

Results: Polynomial Time Algorithms

We now present our polynomial time algorithms. We first show that FAIR PARTITIONING is polynomial time solvable.

Theorem 3. The FAIR PARTITIONING problem is polynomial time solvable if the number of alternatives is a constant.

Proof. Let an arbitrary instance of FAIR PARTITIONING be (A, V, k, H = (H_i)_{i∈[k]}, (V_i)_{i∈[k]}, s_min, s_max, t). For an alternative a ∈ A, let n_a be the number of votes that a receives. We present a dynamic programming based algorithm for the FAIR PARTITIONING problem. The dynamic programming table T((i_a ∈ {0, 1, . . . , n_a})_{a∈A}, ℓ ∈ [k]) is defined as follows: T((i_a)_{a∈A}, ℓ) is the minimum integer λ such that the voting profile consisting of i_a voters voting for the alternative a, for every a ∈ A, can be partitioned into ℓ districts such that the margin of victory of any district is at most λ. For every i_a ∈ {0, 1, . . . , n_a}, a ∈ A, we initialize T((i_a)_{a∈A}, 1) to be the margin of victory of the voting profile which consists of i_a voters voting for the alternative a for a ∈ A. We update the entries in the table T as follows for every ℓ ∈ {2, 3, . . . , k}.

\[
T\big((i_a)_{a\in A}, \ell\big) \;=\; \min_{\substack{(i'_a)_{a\in A}\,:\; 0 \le i'_a \le i_a \ \forall a\in A,\\ s_{\min} \,\le\, \sum_{a\in A} i'_a \,\le\, s_{\max}}} \; \max\Big\{ \mathrm{mv}\big((i'_a)_{a\in A}\big),\; T\big((i_a - i'_a)_{a\in A},\, \ell-1\big) \Big\}
\]

In the above expression, mv((i′_a)_{a∈A}) denotes the plurality margin of victory of the profile which consists of i′_a voters voting for the alternative a for a ∈ A. Updating each entry of the table takes $O\big(\prod_{a\in A} n_a\big)\cdot \mathrm{poly}(m, n)$ time. The table has $k \prod_{a\in A} n_a$ entries. Hence the running time of our algorithm is $O\big(\big(\prod_{a\in A} n_a\big)^2\, \mathrm{poly}(m, n)\big) = O\big(n^{2m}\, \mathrm{poly}(m, n)\big)$, which is $n^{O(1)}$ when we have m = O(1).

We next present a polynomial time algorithm for FAIR DISTRICTING if we have a constant number of districts.

Theorem 4. The FAIR DISTRICTING problem is polynomial time solvable if the number of districts is a constant.

Proof. Let an arbitrary instance of FAIR DISTRICTING be (A, V, k, H = (H_i)_{i∈[k]}, (V_i)_{i∈[k]}, π, s_min, s_max, t). We guess a winner and a runner up of every district; there are $\binom{m}{2}^k = O(m^{2k})$ possibilities. We also guess the plurality score of a winner of every district; there are O(n^k) possibilities. Given a guess of a winner, its plurality score, and a runner up alternative of every district, we reduce the problem of computing if there exists a partition of V (respecting the given guesses) which achieves a maximum margin of victory of at most t to an s′ to t′ flow problem (with demands on edges) instance (G = (U, E), c : E → R+, d : E → R+) as follows.

U = U_L ∪ U_M ∪ U_R ∪ {s′, t′} where
U_L = {u_v : v ∈ V}
U_M = {u_{a,i} : a ∈ A, i ∈ [k]}
U_R = {u_i : i ∈ [k]}
E = {(s′, u_v) : v ∈ V}
  ∪ {(u_v, u_{a,i}) : v ∈ V, i ∈ [k], v's vote is a ≻ · · · , H_i ∈ π(v)}
  ∪ {(u_{a,i}, u_i) : a ∈ A, i ∈ [k]}
  ∪ {(u_i, t′) : i ∈ [k]}

The capacity c of every edge from s′ to U_L and from U_L to U_M is 1. For every i ∈ [k], if x and y are respectively the (guessed) winner and runner up of H_i and n_i is the (guessed) plurality score of a winner in H_i, then we define the capacity and demand of the edge (u_{x,i}, u_i) to be n_i and the capacity and demand of the
edge (u_{y,i}, u_i) to be (n_i − t); if (n_i − t) is not positive, then we discard the current guess. We define the capacity of the edge (u_{z,i}, u_i) to be n_i for every alternative z who is not the guessed winner in H_i for i ∈ [k]. Finally we define the capacity and demand of every edge from U_R to t′ to be s_max and s_min respectively. We claim that the given FAIR DISTRICTING instance is a YES instance if and only if there exists a guess of a winner, its plurality score, and a runner up alternative of every district whose corresponding flow instance has an s′ to t′ flow of value n.

In one direction, suppose the FAIR DISTRICTING instance is a YES instance. Let x_i and y_i be a winner and a runner up respectively in H_i and n_i be the plurality score of a winner in H_i for i ∈ [k]. For the guess corresponding to the solution of FAIR DISTRICTING, we send 1 unit of flow from s′ to u_v for every v ∈ V, and from u_v to u_{a,i} if the voter v belongs to H_i in the solution and v votes for a. Since every vertex in U_M has exactly one outgoing neighbor, all the incoming flow at every vertex in U_M moves to its corresponding neighbor in U_R. Similarly, since the only outgoing neighbor of every vertex in U_R is t′, all the incoming flow at every vertex in U_R moves to t′. The flow conservation property is clearly satisfied at every vertex. The capacity and demand constraints are also satisfied at every edge since the guess corresponds to a solution of the FAIR DISTRICTING instance. Finally, since the total outgoing flow at s′ is n, the total flow value is also n.

In the other direction, suppose that, for a guess in which x_i and y_i are the guessed winner and runner up respectively in H_i and n_i is the plurality score of a winner in H_i for i ∈ [k], the corresponding flow network has a flow of value n. We claim that the FAIR DISTRICTING instance is a YES instance. We can assume without loss of generality that the flow value on every edge in a maximum flow is an integer since the demand and capacity of every edge are integers. We define a voter v ∈ V to be in the district H_i, i ∈ [k], if there exists an alternative a such that there is one unit of flow in the edge (u_v, u_{a,i}). It follows from the construction of the maximum flow instance that the above partitioning of the voters into the districts H_i is valid (that is, it respects π, s_min, and s_max) and the maximum margin of victory of any district is at most t. Hence the FAIR DISTRICTING instance is also a YES instance.

Experimental Evaluation

Greedy Algorithms

Given the high complexities of the FAIR PARTITIONING, FAIR DISTRICTING, and FAIR CONNECTED DISTRICTING problems, we propose a set of fast greedy heuristics to minimize the margin of victory by moving voters between districts, while respecting the constraints on the mobility of the users and on connectedness. We describe the algorithms below, given an initial partition as input.

• GREEDY PARTITIONING minimizes the maximum margin of victory of all districts by greedily moving voters between districts (starting from voters in the district having the highest margin of victory in the initial partition), allowing voters to move to any district.

• GREEDY DISTRICTING minimizes the maximum margin of victory of all districts by greedily moving voters between districts, where every voter has a constraint on where they can move.

• GREEDY CONNECTED DISTRICTING minimizes the maximum margin of victory of all districts by greedily moving voters between districts such that no district becomes a disconnected subgraph.

Data

We collected three main datasets: two real datasets and one synthetic dataset generated using graph models. The real datasets consist of the 2017 general parliamentary elections in the U.K. and demographic information of students in public schools of Detroit, MI. We evaluate all three greedy algorithms on the synthetic dataset, but only evaluate GREEDY PARTITIONING and GREEDY DISTRICTING on the real datasets, as we lack the (social) network information for them.

UK General Elections: We collected data regarding the UK Parliament elections in 2017 from The Electoral Commission (electoralcommission.org.uk), using constituencies as districts and parties as alternatives. Although votes are cast for individuals, what counts in Parliament is the number of seats each party holds, so we are interested in the effect of districting on the distribution of votes over parties rather than over individuals. Knowing the number of votes each party got in each constituency, we simulated the preferences of the voters.

We tested our algorithm on 10 neighboring constituencies out of the 650, in the region of Scotland bordering Edinburgh, which represents a very diverse area in terms of voting preferences. Indeed, these neighboring constituencies voted very differently, each having a clear majority. For example, the distribution of votes in East Lothian was 36.3% Labour, 29.8% Conservative, 30.7% Scottish National Party (SNP), and 3.1% Liberal Democrats, while in Edinburgh East it was 34.6% Labour, 18.5% Conservative, 42.5% SNP, and 4.2% Liberal Democrats.1 We subsampled this dataset, working with a randomized sample of approximately 50,000 people, and we recorded the location of each constituency (represented by its center), enforcing in GREEDY DISTRICTING that voters can be incentivized to move or to vote only in their closest two constituencies.

1 The 10 constituencies we sampled are: Dumfriesshire, Clydesdale and Tweeddale; Berwickshire, Roxburgh and Selkirk; East Lothian; Midlothian; Edinburgh South; Edinburgh East; Edinburgh North and Leith; Edinburgh South West; Edinburgh West; and Livingston, for which an interactive map with the vote distribution can be found at https://www.bbc.com/news/election-2017-40176349.

Figure 1: Maximum margin of victory for all algorithms in (a) real data and (b) synthetic data. Total (sum) of margin of victory in (c) real data and (d) synthetic data.
[Figure 1 panels: (a), (c) real data (UK Data, School Data); (b), (d) synthetic data, plotted against homophily. Series: Initial, Baseline, Greedy-partitioning, Greedy-districting, Greedy-connected.]

Figure 1(a) shows that both GREEDY DISTRICTING and GREEDY PARTITIONING are able to reduce the maximum margin of victory of this dataset by approximately 91-92%, from 776 to 67 and 55, respectively. Figure 1(c) shows the effect the greedy algorithms had on minimizing the overall margin of victory, showing an even larger decrease of almost 95%, from 2652 to 148 and 135, respectively. Since GREEDY DISTRICTING represents the more realistic application, we show in Figure 2 its effect on the voters' distribution in East Lothian and Edinburgh East, showing that it created a stronger opposition for the leading parties (for Labour in East Lothian and for SNP in Edinburgh East).

Figure 2: Voters' distribution in UK constituencies before and after applying GREEDY DISTRICTING: (a) East Lothian, (b) Edinburgh East.
[Figure 2 panels show the % of voters for SNP, Lib Dem, Labour, and Conservative, Initial vs. Greedy.]

US Public School Districting: Neighborhood racial segregation is still widespread in many places in the US, trickling down to segregation in schools (Frankenberg 2019; Richards 2014). One of the main consequences of that is white-majority schools receiving substantially more funding than schools with mostly students of color (EdBuild 2019). In this paper, we attempt to show that our algorithms can be used to increase racial diversity (and lower segregation) in schools, if accompanied by government policies that facilitate movement of students between schools (Montgomery III 1970).

We collected school data from the National Center for Education Statistics (NCES: nces.ed.gov/ccd) about public schools in Detroit, MI, which is still one of the cities with the highest rates of segregation and the most economic and social struggles encountered by minorities (Institute 2018; Kent and Thomas C. Frohlich 2015). We gathered data from 61 schools in Detroit, each containing between 40 and 5000 students, summing up to 41,834 students and their reported race. We modeled this data in the form of an election, where the voters are the students and the alternatives are their reported race (NCES data has 7 reported races: Asian, Native American, Hispanic, Black, White, Hawaiian, and Mixed-race). Given each student's race, we modeled this ‘election’ as a plurality voting scenario, where each student only ‘votes’ for their reported race. Furthermore, we recorded the location of each school, enforcing in GREEDY DISTRICTING that students can only go to their closest 5 schools.

Figure 1(a) shows that both GREEDY DISTRICTING and GREEDY PARTITIONING decrease the maximum margin of victory by 11-12% on average, from 2,501 to 2,213 and 2,311, respectively. Again, Figure 1(c) shows the overall decrease in margin of victory by the greedy algorithms, showing a larger decrease of 18-24%, from 18,870 to 15,360 and 14,376, respectively. As GREEDY DISTRICTING represents the more realistic scenario, we present in Figure 3 its effect on the racial distribution of students in a selection of three schools. We observe that schools containing students from one predominant racial group become more balanced: Dove Academy goes from having 98% Black students and 2% White to having 80% Black, 14.5% White, and 5.5% Hispanic students, Universal Academy goes from having 95% White students, 3.5% Hispanic, and 2.5% Black to having 50.15% White and 49.7% Hispanic, while Cesar Chavez Academy goes from having 88% Hispanic students, 8% Black, and 3% White to
having 48% Hispanic, 48% Black, and 3% White students. Given the discrepancy between White-majority schools and other schools, we hope that such demographic changes can help in equalizing funding for all students.

Of course, since minimizing margin of victory only considers the two most predominant races, we may need to enforce an additional diversity constraint to preserve a minimum fraction of students from other races in a school (e.g., the 2.5% Black students in Universal Academy may need to stay). We leave exploring this direction for future work.

Figure 3: Racial distribution of students in selected schools before and after applying GREEDY DISTRICTING. [Panels: (a) Dove Academy, (b) Universal Academy, (c) Cesar Chavez Academy; y-axis: racial distribution by percentage; bars for Initial and Greedy; groups: Hispanic, Black, White.]

Graph Simulations: To further understand the relationship between margin of victory and population structure, we simulated a set of graphs based on the Erdős-Rényi (ER) graph model, varying the level of connectivity between people with similar political leanings. Since we cannot vary such a parameter in the real data, we turn to classical graph models to do so. Following the methodology in (Cohen-Zemach, Lewenberg, and Rosenschein 2018), we used the line model to simulate voters, alternatives, and voters' political affiliation. We then created 50 instances of the ER graph model, where each node represents a voter and the edges are formed according to the model with an added homophily factor based on the distance between nodes (as simulated by the line model). For every node, we generated the list of preferences over the candidates according to the distance between the voter and the candidates. The inputs to each such graph are the number of voters N (100), the number of candidates C (5), the number of districts K (5), and a homophily parameter.

Such models capture the network and clustering effects exhibited by voter districts in the real world (Keegan 2016; Adamic and Glance 2005; Conover et al. 2011). We then test all three greedy algorithms in minimizing the maximum margin of victory by re-districting the population in these graphs. We further add a baseline algorithm that computes the optimal partition of voters into districts given a network, the districts' size constraints, and mobility of voters. While GREEDY PARTITIONING and GREEDY DISTRICTING come as natural formulations, we argue the need for GREEDY CONNECTED DISTRICTING as well, since re-districting cannot be done arbitrarily and will be more effective if the population remains connected. Thus, in our simulations, the graph models aim to mimic the natural connections individuals make.

We simulated the greedy algorithm for each graph instance, averaging over 10 iterations the minimum value of the maximum margin of victory that it can reach, and compared that to the baseline value. Figure 1(b) shows the effect of these algorithms in improving the maximum margin of victory aggregated over all graph instances, varying the homophily factor and allowing districts to change up to 20% in size. We observe that no matter how homophilic the initial graph is, all three greedy algorithms reduce the margin of victory: GREEDY PARTITIONING performs the best as it places no constraints on mobility of voters, coming close to the baseline value and reducing the maximum margin of victory from 10 to 6–7 on average; GREEDY DISTRICTING performs second-best, reducing it from 10 to 8 on average; while GREEDY CONNECTED DISTRICTING reduces it from 10 to 9 on average, performing worse than the other two due to a tighter connectivity constraint. Figure 1(d) shows the overall decrease in margin of victory, where the effect is more significant: GREEDY PARTITIONING and GREEDY DISTRICTING achieve a result close to the baseline, while GREEDY CONNECTED DISTRICTING performs slightly worse, reducing the total margin of victory by 46% on average. The results are qualitatively similar for varying the district size constraints, which we omit due to lack of space.

Conclusion and Future Directions

In this paper, we tackled the problem of fairly dividing people into groups by conceptualizing the problem in a
voting scenario. By modeling the preferences of people over different candidates, we set the goal to minimize the maximum margin of victory in any group. In doing so, we provide a rigorous framework to reason about the complexity of the problem, showing that dividing people with constraints on their neighborhood or their connections is NP-complete in the most general case, and admits polynomial-time algorithms for particular cases. Furthermore, we develop and evaluate fast greedy heuristics to minimize the maximum margin of victory in practical scenarios. Indeed, our results show significant improvement of the margin of victory in the case of elections and school choice, as well as in synthetic experiments. In the case of elections, minimizing margin of victory leads to better representation of opposing parties in electoral districts, where we notice that the opposing parties in the UK can gain more power through re-districting. In the case of school choice, we model students' demographic information as an election, where each student 'votes' for (or prefers) their own demographic attribute, and show that our greedy algorithms are able to provide more diversity in highly segregated schools. While government policies are ultimately crucial in reducing segregation, we hope that this quantitative analysis can motivate them and show their potential efficacy.

Multiple directions remain open for future work. We plan to (i) include an analysis of the social connections in real datasets that may further constrain people's mobility, and (ii) extend synthetic experiments to other graph models as well. Finally, it would be interesting to (iii) measure the effect of minimizing the margin of victory on different gerrymandering metrics, and (iv) investigate whether lowering racial segregation would lead to a more equitable distribution of revenues to public schools.

References

[Adamic and Glance 2005] Adamic, L. A., and Glance, N. 2005. The political blogosphere and the 2004 US election: divided they blog. In Proceedings of the 3rd international workshop on Link discovery, 36–43. ACM.

[Bachrach et al. 2016] Bachrach, Y.; Lev, O.; Lewenberg, Y.; and Zick, Y. 2016. Misrepresentation in district voting. In IJCAI, 81–87.

[Barbara and Garcia-Molina 1987] Barbara, D., and Garcia-Molina, H. 1987. The reliability of voting mechanisms. IEEE Transactions on Computers (10):1197–1208.

[Barberà et al. 1991] Barberà, S.; Sonnenschein, H.; Zhou, L.; et al. 1991. Voting by committees. Econometrica 59(3):595–609.

[Berman, Karpinski, and Scott 2003] Berman, P.; Karpinski, M.; and Scott, A. D. 2003. Approximation hardness and satisfiability of bounded occurrence instances of SAT. Electronic Colloquium on Computational Complexity (ECCC) 10(022).

[Blom, Stuckey, and Teague 2018] Blom, M.; Stuckey, P. J.; and Teague, V. J. 2018. Computing the margin of victory in preferential parliamentary elections. In International Joint Conference on Electronic Voting, 1–16. Springer.

[Borodin et al. 2018] Borodin, A.; Lev, O.; Shah, N.; and Strangway, T. 2018. Big city vs. the great outdoors: Voter distribution and how it affects gerrymandering. In IJCAI, 98–104.

[Butler 1992] Butler, D. 1992. The redrawing of parliamentary boundaries in Britain. British Elections and Parties Yearbook 2(1):5–12.

[Chakraborty et al. 2019] Chakraborty, A.; Mota, N.; Biega, A. J.; Mohammadi, N.; Gummadi, K. P.; and Heidari, H. 2019. Nudging toward equitable online donations: A case study of educational charity.

[Chang 2018] Chang, A. 2018. We can draw school zones to make classrooms less segregated. This is how well your district does.

[Cohen-Zemach, Lewenberg, and Rosenschein 2018] Cohen-Zemach, A.; Lewenberg, Y.; and Rosenschein, J. S. 2018. Gerrymandering over graphs. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 274–282. International Foundation for Autonomous Agents and Multiagent Systems.

[Conover et al. 2011] Conover, M. D.; Ratkiewicz, J.; Francisco, M.; Gonçalves, B.; Menczer, F.; and Flammini, A. 2011. Political polarization on Twitter. In Fifth international AAAI conference on weblogs and social media.

[Corsi-Bunker 2015] Corsi-Bunker, A. 2015. Guide to the education system in the United States. University of Minnesota 23.

[Dey and Narahari 2015] Dey, P., and Narahari, Y. 2015. Estimating the margin of victory of an election using sampling. In Twenty-Fourth International Joint Conference on Artificial Intelligence.

[Douglas N. Harris 2015] Douglas N. Harris, M. F. L. 2015. The identification of schooling preferences: methods and evidence from post-Katrina New Orleans.

[EdBuild 2019] EdBuild. 2019. Non-white school districts get $23 billion less than white districts, despite serving the same number of students.

[Erdélyi, Hemaspaandra, and Hemaspaandra 2015] Erdélyi, G.; Hemaspaandra, E.; and Hemaspaandra, L. A. 2015. More natural models of electoral control by partition. In International Conference on Algorithmic Decision Theory, 396–413. Springer.
[Feix et al. 2008] Feix, M. R.; Lepelley, D.; Merlin, V.; Rouet, J.-L.; and Vidu, L. 2008. Majority efficient representation of the citizens in a federal union. Manuscript, Université de la Réunion, Université de Caen, and Université d'Orléans.

[Felsenthal and Miller 2015] Felsenthal, D. S., and Miller, N. R. 2015. What to do about election inversions under proportional representation? Representation 51(2):173–186.

[Frankenberg 2019] Frankenberg, E. 2019. What school segregation looks like in the US today, in 4 charts.

[Gelman, Katz, and Tuerlinckx 2002] Gelman, A.; Katz, J. N.; and Tuerlinckx, F. 2002. The mathematics and statistics of voting power. Statistical Science 420–435.

[Institute 2018] Institute, U. 2018. Segregated neighborhoods, segregated schools?

[Issacharoff 2002] Issacharoff, S. 2002. Gerrymandering and political cartels. Harvard Law Review 116.

[Ito et al. 2019] Ito, T.; Kamiyama, N.; Kobayashi, Y.; and Okamoto, Y. 2019. Algorithms for gerrymandering over graphs. In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 1413–1421. International Foundation for Autonomous Agents and Multiagent Systems.

[Johnston, Rossiter, and Pattie 2006] Johnston, R.; Rossiter, D.; and Pattie, C. 2006. Disproportionality and bias in the results of the 2005 general election in Great Britain: evaluating the electoral system's impact. Journal of Elections, Public Opinion and Parties 16(1):37–54.

[Keegan 2016] Keegan, J. 2016. Blue feed, red feed. The Wall Street Journal 18.

[Kent and Thomas C. Frohlich 2015] Kent, A., and Thomas C. Frohlich, H. P. 2015. The 9 most segregated cities in America.

[Lewenberg, Lev, and Rosenschein 2017] Lewenberg, Y.; Lev, O.; and Rosenschein, J. S. 2017. Divide and conquer: Using geographic manipulation to win district-based elections. In AAMAS, 624–632.

[Lublin 1999] Lublin, D. 1999. The paradox of representation: Racial gerrymandering and minority interests in Congress. Princeton University Press.

[Montgomery III 1970] Montgomery III, J. 1970. Swann v. Charlotte-Mecklenburg Board of Education: Roadblocks to the implementation of Brown. Wm. & Mary L. Rev. 12:838.

[Richards 2014] Richards, M. P. 2014. The gerrymandering of school attendance zones and the segregation of public schools: A geospatial analysis. American Educational Research Journal 51(6):1119–1157.

[van ’t Hof, Paulusma, and Woeginger 2009] van ’t Hof, P.; Paulusma, D.; and Woeginger, G. J. 2009. Partitioning graphs into connected parts. Theor. Comput. Sci. 410(47-49):4834–4843.

[Xia 2012] Xia, L. 2012. Computing the margin of victory for various voting rules. In Proceedings of the 13th ACM Conference on Electronic Commerce, 982–999. ACM.
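
As an illustration of the synthetic setup used in the graph simulations, the following minimal Python sketch places voters and candidates on a line, ranks candidates by distance, draws Erdős-Rényi-style edges with a homophily factor, and evaluates the maximum and total margin of victory of a given partition. All function names are hypothetical, and the uniform placement on [0, 1], the edge-probability rule p(1 - h·d), and the use of the gap between the two largest plurality counts as the margin are assumptions made for illustration only; none of them is taken from the authors' implementation, and a vote-changing definition of the margin would be roughly half of this gap.

```python
import random
from collections import Counter


def line_model(n_voters, n_candidates, rng):
    """Place voters and candidates on the unit interval (assumed uniform)."""
    voters = [rng.random() for _ in range(n_voters)]
    candidates = [rng.random() for _ in range(n_candidates)]
    return voters, candidates


def preferences(voter_pos, candidates):
    """Rank candidate indices from nearest to farthest from the voter."""
    return sorted(range(len(candidates)),
                  key=lambda c: abs(voter_pos - candidates[c]))


def homophilous_er_edges(voters, p, homophily, rng):
    """Sample undirected edges; closer voters are more likely to connect.

    Assumed rule: P(edge i-j) = p * (1 - homophily * |x_i - x_j|).
    With homophily = 0 this is the plain Erdos-Renyi model G(n, p).
    """
    edges = []
    for i in range(len(voters)):
        for j in range(i + 1, len(voters)):
            prob = p * (1.0 - homophily * abs(voters[i] - voters[j]))
            if rng.random() < prob:
                edges.append((i, j))
    return edges


def plurality_margin(top_choices):
    """Gap between the two largest plurality counts (illustrative proxy)."""
    counts = sorted(Counter(top_choices).values(), reverse=True)
    if not counts:
        return 0
    runner_up = counts[1] if len(counts) > 1 else 0
    return counts[0] - runner_up


def max_and_total_margin(districts, top_choice):
    """Return (maximum, sum) of the margins over a list of districts."""
    margins = [plurality_margin([top_choice[v] for v in d]) for d in districts]
    return max(margins), sum(margins)


def build_instance(n_voters, n_candidates, p, homophily, seed):
    """Generate one synthetic instance: top choices and graph edges."""
    rng = random.Random(seed)
    voters, candidates = line_model(n_voters, n_candidates, rng)
    top_choice = [preferences(v, candidates)[0] for v in voters]
    edges = homophilous_er_edges(voters, p, homophily, rng)
    return top_choice, edges
```

For the sizes reported in the experiments (N = 100, C = 5, K = 5), one could call `build_instance(100, 5, 0.1, 0.5, seed=0)`, split the 100 voters into five districts, and pass that partition to `max_and_total_margin`; the edge density 0.1 and the homophily value 0.5 are arbitrary choices here, and how the initial districts are formed is left open.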