ABSTRACT Encouraging voters to truthfully reveal their preferences in an election has long been an important issue. Previous studies have shown that some voting protocols are hard to manipulate, but predictably used NP-hardness as the complexity measure. Such a worst-case analysis may be an insufficient guarantee of resistance to manipulation. Indeed, we demonstrate that NP-hard manipulations may be tractable in the average-case. For this purpose, we augment the existing theory of average-case complexity with new concepts; we consider elections distributed with respect to junta distributions, which concentrate on hard instances, and introduce a notion of heuristic polynomial time. we use our techniques to prove that a family of important voting protocols is susceptible to manipulation by coalitions, when the number of candidates is constant. categories and subject descriptors f.2 [theory of computation]: analysis of algorithms and problem complexity; i.2.11 [artificial intelligence]: distributed artificial intelligence‹multiagent systems; j.4 [computer applications]: social and behavioral sciences‹economics general terms algorithms, theory, economics keywords computational complexity, voting 1. introduction in multiagent environments, it may be the case that different agents have diverse preferences, and it is therefore important to find a way to aggregate agent preferences. a general scheme for preference aggregation is voting: the agents permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. to copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. aamas'06 may 812 2006, hakodate, hokkaido, japan. copyright 2006 acm 1-59593-303-4/06/0005 …$5.00. reveal their preferences by ranking a set of candidates, and a winner is determined according to a voting protocol. the candidates can be various entities such as beliefs or plans, and indeed may be potential real-life parliament members. things are made complicated by the fact that in many settings (as in reality) the agents are self-interested. such an agent may reveal its preferences untruthfully, if it believes this would make the final outcome of the elections more favorable for it. consequently, the outcome may be one that does not maximize social welfare. this problem is provably acute: it is known [8, 10] that, for elections with three or more candidates, in any voting protocol that is nondictatorial, 1 there are elections where an agent is better off
by voting untruthfully. fortunately, it is reasonable to make the assumption that the agents are computationally bounded. therefore, although in principle an agent may be able to manipulate an election, the computation required may be infeasible. this has motivated researchers to study the computational complexity of manipulating voting protocols. it has long been known [3] that there are voting protocols that are np-hard to manipulate by a single voter. recent results by conitzer and sandholm [5, 4] show that some manipulations of common voting protocols are np-hard, even for a small number of candidates. moreover, in [6], it is shown that adding a pre-round to some voting protocols can make manipulations hard (even pspace-hard in some cases). elkind and lipmaa [7] show that the notion of pre-round, together with one-way functions, can be used to construct protocols that are hard to manipulate even by a large minority fraction of the voters. in computer science, the notion of hardness is usually considered in the sense of worst-case complexity. not surprisingly, most results on the complexity of manipulation use np-hardness as the complexity measure. however, it may still be the case that most instances of the problem are easy to manipulate. a relatively little-known theory of average case complexity exists [11]; that theory introduces the concept of distributional problems, and defines what a reduction between distributional problems is. it is also known that averagecase complete problems exist (albeit artificial ones, such as a distributional version of the halting problem). sadly, it is very difficult to show that a certain problem is average-case complete, and such results are known only for a handful of problems. additionally, the goal of the 1in a dictatorial protocol, there is an agent that dictates the outcome regardless of the others' choices. Koch Industries Funding Wisconsin Governor Scott Walker Recall Election existing theory is to define when a problem is hard in the average-case; it does not provide criteria for deciding when a problem is easy. a step towards showing that a manipulation is easy on average was made in [7]. it involves an analysis of the plurality protocol with a pre-round, but focuses on a very specific distribution, which does not satisfy some basic desiderata as to what properties an interesting distribution should have. in this paper, we engage in a novel average-case analysis, based on criteria we propose. coming up with an interesting distribution of problem instances with respect to which the average-case complexity is computed is a difficult task, and the solution may be controversial.
we analyze problems whose instances are distributed with respect to a junta distribution. such a distribution must satisfy several conditions, which (arguably) guarantee that it focuses on instances that are harder to manipulate. we consider a protocol to be susceptible to manipulation when there is a polynomial time algorithm that can usually manipulate it: the probability of failure (when the instances are distributed according to a junta distribution) must be inverse-polynomial. such an algorithm is known as a heuristic polynomial time algorithm. we use these new methods to prove our main result: an important family of protocols, called scoring protocols, is susceptible to coalitional manipulation when the number of candidates is constant. specifically, we contemplate sensitive scoring protocols, which include such well-known protocols as borda and veto. to accomplish this task, we define a natural distribution º " over the instances of a well-defined coalitional manipulation problem, and show that this is a junta distribution. furthermore, we present the manipulation algorithm greedy, and prove that it usually succeeds with respect to º ". we also show that all protocols are susceptible to a certain setting of manipulation, where the manipulator is unsure about the others votes. this result depends upon a basic conjecture regarding junta distributions, but also has implications that transcend our specific definition of these distributions. in section 2, we outline some important voting protocols, and properly define the manipulation problems we shall discuss. in section 3, we formally introduce the tools for our average case analysis: junta distributions, heuristic polynomial time, and susceptibility to manipulations. in section 4 we prove our main result: sensitive scoring protocols are susceptible to coalitional manipulation with few candidates. in section 5, we discuss the case when a single manipulator is unsure about the other voters votes. finally, in section 6, we present conclusions and directions for future research. 2. preliminaries we first describe some common voting protocols and formally define the manipulation problems with which we shall deal. next, we introduce a useful lemma from probability theory.
2.1 elections and manipulations an election consists of a set c of m candidates, and a set v of n voters, who provide a total order on the candidates. an election also includes a winner determination function from the set of all possible combinations of votes to c. we note that throughout this paper, m = o(1), so the complexity results are in terms of n. different voting protocols are distinguished by their winner determination functions. the protocols we shall discuss are: " scoring protocols: a scoring protocol is defined by vector ± = ±1, ±2, . . . , ±m, such that ±1 "e ±2 "e . . . "e ±m and ±i " n "* {0}. a candidate receives ±i points for each voter which ranks it in the i th place. examples of scoring protocols are: plurality: ± = 1, 0, . . . , 0, 0. veto: ± = 1, 1, . . . , 1, 0. borda: ± = m " 1, m " 2, . . . , 1, 0. " copeland: for each possible pair of candidates, simulate an election; a candidate wins such a pairwise election if more voters prefer it over the opponent. a candidate gets 1 point for each pairwise election it wins, and "1 for each pairwise election it loses. " maximin: a candidate s score in a pairwise election is the number of voters that prefer it over the opponent. the winner is the candidate whose minimum score over all pairwise elections is highest. " single transferable vote (stv): the election proceeds in rounds. in each round, the candidate s score is the number of voters that rank it highest among the remaining candidates; the candidate with the lowest score is eliminated. remark 1.
2.1 elections and manipulations an election consists of a set c of m candidates, and a set v of n voters, who provide a total order on the candidates. an election also includes a winner determination function from the set of all possible combinations of votes to c. we note that throughout this paper, m = o(1), so the complexity results are in terms of n. different voting protocols are distinguished by their winner determination functions. the protocols we shall discuss are: " scoring protocols: a scoring protocol is defined by vector ± = ±1, ±2, . . . , ±m, such that ±1 "e ±2 "e . . . "e ±m and ±i " n "* {0}. a candidate receives ±i points for each voter which ranks it in the i th place. examples of scoring protocols are: plurality: ± = 1, 0, . . . , 0, 0. veto: ± = 1, 1, . . . , 1, 0. borda: ± = m " 1, m " 2, . . . , 1, 0. " copeland: for each possible pair of candidates, simulate an election; a candidate wins such a pairwise election if more voters prefer it over the opponent. a candidate gets 1 point for each pairwise election it wins, and "1 for each pairwise election it loses. " maximin: a candidate s score in a pairwise election is the number of voters that prefer it over the opponent. the winner is the candidate whose minimum score over all pairwise elections is highest. " single transferable vote (stv): the election proceeds in rounds. in each round, the candidate s score is the number of voters that rank it highest among the remaining candidates; the candidate with the lowest score is eliminated. remark 1. >>9125734
>>9125800 we assume that tie-breaking is always adversarial to the manipulator. 2 in the case of weighted votes, a voter with weight k " n is naturally regarded as k voters who vote unanimously. in this paper, we consider weights in [0, 1]. this is equivalent, since any set of integer weights in the range 1, . . . , polyn can be scaled down to weights in the segment [0, 1] with o(logn) bits of precision. the main results of the paper focus on scoring protocols. we shall require the following definition: definition 1. let p be a scoring protocol with parameters ± = ±1, ±2, . . . , ±m. we say that p is sensitive iff ±1 "e ±2 "e . . . "e ±m"1 ±m = 0 (notice the strict inequality on the right). In particular, Borda and Veto are sensitive scoring protocols. Remark 2. Generally, from any scoring protocol with ±m"1 > ±m, an equivalent sensitive scoring protocol can be obtained by subtracting ±m on a coordinate-by-coordinate basis from the vector ±. Moreover, observe that if a protocol is a scoring protocol but is not sensitive, and ±m = 0, then ±m"1 = 0. In this case, for three candidates it is equivalent to the plurality protocol, for which most manipulations are tractable even in the worst-case. Therefore, it is sufficient to restrict our results to sensitive scoring protocols. 2This is a standard assumption, also made, for example, in [5, 4]. We next consider some types of manipulations, state the appropriate complexity results, and introduce some notations. Remark 3. We discuss the constructive cases, where the goal is trying to make a candidate win, as opposed to destructive manipulation, where the goal is to make a candidate lose. Constructive manipulations are always at least as hard (in the worst-case sense) as their destructive counterparts, and in some cases strictly harder (if one is able to determine whether p can be made to win, one can also ask whether any of the other m " 1 candidates can be made to win, thus making p lose). Definition 2. In the Individual-Manipulation problem, we are given all the other votes, and a preferred candidate p. We are asked whether there is a way for the manipulator to cast its vote so that p wins. Bartholdi and Orlin [3] show that IM is NP-complete in Single Transferable Vote, provided the number of candidates is unbounded. However, the problem is in P for most voting schemes, and hence will not be studied here. Definition 3. In the Coalitional-WeightedManipulation (CWM) problem, we are given a set of weighted votes S, the weights of a set of votes T which have not been cast, and a preferred candidate p. We are asked whether there is a way to cast the votes in T so that p wins the election. We know [5, 4] that CWM is NP-complete in Borda, Veto and Single Transferable Vote, even with 3 candidates, and in Maximin and Copeland with at least 4 candidates. The CWM version that we shall analyze, which is specifically tailored for scoring protocols, is a slightly modified version whose analysis is more straightforward: Definition 4. In the Scoring-Coalitional-WeightedManipulation (SCWM) problem, we are given an initial score S[c] for each candidate c, the weights of a set of votes T which have not been cast, and a preferred candidate p.
>>9125813 We are asked whether there is a way to cast the votes in T so that p wins the election. S[c] can be interpreted as c s total score from the votes in S. However, we do not require that there exist a combination of votes that actually induces S[c] for all c. Definition 5. In the Uncertain-Votes-WeightedEvaluation (UVWE) problem, we are given a weight for each voter, a distribution over all the votes, a candidate p, and a number r " [0, 1]. We are asked whether the probability of p winning is greater than r. Definition 6. In the Uncertain-Votes-WeightedManipulation (UVWM) problem, we are given a single manipulative voter with a weight, weights for all other voters, a distribution over all the others votes, a candidate p, and a number r, where r " [0, 1]. We are asked whether the manipulator can cast its vote so that p wins with probability greater than r. If CWM is NP-hard in a protocol, then UVWE and UVWM are also NP-hard in it [5]. These problems will be studied in Section 5. We make the assumption that the given distributions over the others votes can be sampled in polynomial time. 2.2 Chernoff s Bounds
The following lemma will be of much use later on. Informally, it states that the average of independent identically distributed (i.i.d.) random variables is almost always close to the expectation. Lemma 1 (Chernoff s Bounds). Let X1, . . . , Xt be i.i.d. random variables such that a "d Xi "d b and E[Xi] = º. Then for any 0, it holds that: " Pr[ 1 t Pt i=1 Xi "e º + ] "d e "2t 2 (b"a)2 " Pr[ 1 t Pt i=1 Xi "d º " ] "d e "2t 2 (b"a)2 3. JUNTA DISTRIBUTIONS AND SUSCEPTIBLE MECHANISMS In this section we lay the mathematical foundations required for an average-case analysis of the complexity of manipulations. All of the definitions are as general as possible; they can be applied to the manipulation of any mechanism, not merely to the manipulation of voting protocols. We describe a distribution over the instances of a problem as a collection of distributions º1, . . . , ºn, . . ., where ºn is a distribution over the instances x such that |x| = n. We wish to analyze problems whose instances are distributed with respect to a distribution which focuses on hard-to-manipulate instances. Ideally, we would like to insure that if one manages to produce an algorithm which can usually manipulate instances according to this distinguished difficult distribution, the algorithm would also usually succeed when the instances are distributed with respect to most other reasonable distributions. Definition 7. Let º = {ºn}n"N be a distribution over the possible instances of an NP-hard manipulation problem M. º is a junta distribution if and only if º has the following properties: 1. Hardness: The restriction of M to º is the manipulation problem whose possible instances are only: [ n"N {x : |x| = n "' ºn(x) > 0}. Deciding this restricted problem is still NP-hard. 2. Balance: There exist a constant c > 1 and N " N such that for all n "e N: 1 c "d Prx"<º n[M(x) = 1] "d 1 " 1 c . 3. Dichotomy: for all n and instances x such that |x| = n: ºn(x) "e 2 "polyn "( º n(x) = 0. If M is a voting manipulation problem, we also require the following property: 4. Symmetry: Let v be a voter whose vote is given, let c1, c2 = p be two candidates, and let i " {1, . . . , m}. The probability that v ranks c1 in the i th place is the same as the probability that v ranks c2 in the i th place. If M is a coalitional manipulation problem, we also require the following property: 5. Refinement: Let x be an instance such that |x| = n and ºn(x) > 0; if all colluders voted identically, then p would not be elected. The name junta distribution comes from the idea that in such a distribution, relatively few powerful and difficult instances represent all the other problem instances. Alternatively, our intent is to have a few problematic distributions (the family of junta distributions) convincingly represent all other distributions with respect to the average-case analysis. The first three properties are basic, and are relevant to problems of manipulating any mechanism. The definition is modular, and additional properties may be added on top of the basic three, in case one wishes to analyze a mechanism which is not a voting protocol. The exact choice of properties is of extreme importance (and, as we mentioned above, may be arguable). We shall briefly explain our choices. Hardness is meant to insure that the junta distribution contains hard instances. Balance guarantees that a trivial algorithm which always accepts (or always rejects) has a significant chance of failure. The dichotomy property helps in preventing situations where the distribution gives a (positive but) negligible probability to all the hard instances, and a high probability to several easy instances. We now examine the properties that are specific to manipulation problems. The necessity of symmetry is best explained by an example. Consider CWM in STV with m "e 3. One could design a distribution where p wins if and only if a distinguished candidate loses the first round. Such a distribution could be tailored to satisfy the other conditions, but misses many of the hard instances.