Free, Fair, Accurate, and Trustworthy Elections
A Manual for Voters, Poll Workers, and Administrators
DRAFT
When I cash a check at the credit union, the teller counts the money twice, while I watch. Then I count the money while the teller watches. The teller is not insulted by this. When I count the money I am not impugning the integrity of the teller; I just want to count the money. Double-checking never hurt anybody. Redundancy is your friend.
All ballots must be accounted for, including unused ballots.
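The accounting rule above amounts to a reconciliation check, much like balancing a cash drawer. The following is a minimal sketch, not a description of any real election system; the function name `reconcile_ballots` and the three categories (`cast`, `spoiled`, `unused`) are assumptions chosen for illustration, and a real jurisdiction may track more categories.

```python
def reconcile_ballots(issued, cast, spoiled, unused):
    """Return the number of ballots unaccounted for (0 means the books balance).

    All four arguments are counts for one precinct. The categories are
    hypothetical; a real jurisdiction may track more of them.
    """
    for name, value in (("issued", issued), ("cast", cast),
                        ("spoiled", spoiled), ("unused", unused)):
        if value < 0:
            raise ValueError(f"{name} count cannot be negative")
    return issued - (cast + spoiled + unused)


if __name__ == "__main__":
    missing = reconcile_ballots(issued=1000, cast=612, spoiled=8, unused=380)
    print("unaccounted ballots:", missing)
```

A positive result means ballots are missing; a negative result means more ballots turned up than were issued. Either one calls for an explanation.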
This may sound obvious, but believe it or not there are some election boards that have established only one login to the vote-tallying machines, the "Admin" login, and all users share the password to that login.
By way of analogy, the accounting department would not be amused by a purchase order initiated by "admin" and approved by "admin".
Vote counting should be secure – which means it should not be done using secret methods or secret computer programs.
Votes should be cast in private, but they should be counted in public.
Rationale for paper ballots: The paper ballots form a permanent record. They can be machine counted on the spot. They can also be hand counted. They can be recounted as many times as desired.
Additional rationale: Mark-sense paper ballots can be scanned to detect common types of voter errors, while the voter still has a chance to make corrections. Detectable errors include overvotes and some types of undervotes.
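To make the idea of scan-time error detection concrete, here is a rough sketch of the check a scanner might run while the voter is still present. The data layout (a dictionary of contests, each with a list of marked names) and the function names are assumptions for illustration only. Note that an undervote may be deliberate, so the sketch only reports it; the voter decides whether to correct it.

```python
def check_contest(marked, max_choices):
    """Classify one contest on a scanned ballot.

    marked      -- list of candidate names whose ovals were filled in
    max_choices -- how many selections the contest allows
    """
    if len(marked) > max_choices:
        return "overvote"
    if len(marked) < max_choices:
        return "undervote"
    return "ok"


def check_ballot(ballot, rules):
    """Return {contest: problem} for every contest that needs the voter's attention."""
    problems = {}
    for contest, max_choices in rules.items():
        status = check_contest(ballot.get(contest, []), max_choices)
        if status != "ok":
            problems[contest] = status
    return problems


if __name__ == "__main__":
    rules = {"mayor": 1, "council": 2}
    ballot = {"mayor": ["Adams", "Baker"], "council": ["Clark"]}
    print(check_ballot(ballot, rules))
```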
Rationale for going to the polling place: Although polling places are not completely immune to abuse, they have been around for a long time, and people have a relatively good idea how to secure them.
Rationale for waiting until election day: Any kind of early voting (whether by mail or in person at an early-voting precinct) puts the voter at a disadvantage because less information is available. Sometimes highly persuasive information about the candidates may pop up in the last days before the election. Early voters forfeit the chance to take this information into account. For example, in a multi-way race, such as we often see in a primary election, one candidate might drop out of the race just before election day, after the “early” votes have already been cast. Early voters forfeit the chance to shift their vote to one of the remaining candidates.
Rationale: Mail-in ballots create too many opportunities for vote-buying and voter intimidation (threat 7-1). After they have been mailed in, there are too many ways for ballots to get selectively lost or changed (threat 7-3, threat 7-4, threat 7-5).
Additional rationale: Mail-in voting forfeits any chance of scanning the ballot for errors and allowing the voter to make corrections, as mentioned in item 2-2.
Note that many jurisdictions “rotate” the names on the ballot, since being first on the list is known to confer an advantage. This places a nontrivial burden on the vote-tallying system, since it must match the filled-in marks to actual names, not merely to positions on the page.
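The burden that rotation places on the tally system can be illustrated as follows. This is only a sketch under simplifying assumptions: each ballot is assumed to carry a known rotation index (`shift`), rotation is assumed to be a simple cyclic shift, and the function names are made up for the example.

```python
from collections import Counter


def rotated_order(candidates, shift):
    """Return the candidate list rotated left by `shift` positions."""
    shift %= len(candidates)
    return candidates[shift:] + candidates[:shift]


def tally_by_name(ballots, candidates):
    """Tally marks by candidate name rather than by position on the page.

    ballots -- iterable of (shift, marked_position) pairs, where `shift`
               identifies which rotation was printed on that ballot
    """
    totals = Counter()
    for shift, position in ballots:
        totals[rotated_order(candidates, shift)[position]] += 1
    return totals


if __name__ == "__main__":
    names = ["Adams", "Baker", "Clark"]
    # Three ballots, three different printed orders, three different
    # positions marked, yet all three are votes for the same person.
    print(tally_by_name([(0, 0), (1, 2), (2, 1)], names))
```

If the tally software applied the wrong rotation to a batch of ballots, every vote in the batch would be credited to the wrong name, which is one more reason to check the machine count against a hand count.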
Similar principles apply to the staff at any point where votes are aggregated or tallied. At every stage, there needs to be bipartisan oversight of any activity that could possibly affect the tally.
By way of counterexample, consider the following scenario: At each precinct, boxes of ballots are sealed with special numbered seals. The boxes are transported to the central counting facility. The boxes are opened and the ballots are counted. In this scenario, it is not sufficient to have observers confined to a gallery, watching from such a distance that they cannot tell whether the seals have been tampered with.
Keep in mind that machines per se are not the only bottleneck. It does no good to have lots of voting stations standing empty while there is a huge bottleneck at the table where voters are checked in. Parties should calculate well in advance the needed quantities of registration books and/or other check-in resources, staff, floor space, etc., and make sure everything will be available.
The waiting line should be located so that it does not interfere with efficient processing of those who make it to the front of the line. Otherwise things go from bad to worse, as congestion causes inefficiency, which causes more congestion, and so on.
There should only be one line. First come, first served. It is outrageous to have a situation where someone waits in line, only to be told that they have waited in the “wrong” line and they have to start over.
A good way to think about this is in terms of series versus parallel.
When threats can operate in parallel, security requires tremendous attention to detail. As the proverb says, a chain is only as strong as its weakest link. The same applies to fences (and gates). The same applies to walls (and doors and windows).
By way of object lesson, consider a room that has four strong doors, equipped with electronic locks that record who comes in and when. All that counts for nothing if there is a fifth door with a low-tech combination lock, where the combination is widely shared, and there is no hope of recording who came in or when. (This is not a made-up example. Just such a room was used for central election tabulating operations in 2008.)
When threats can operate in parallel, the more threats there are, the less security there is. A long, rambling fence is more vulnerable than a compact fence.
Parallelism applies to time as well as space. Defenses must be maintained at all times as well as all places. Remember the attacker has the initiative. That is, the attacker gets to choose the time and place of the attack.
It is desirable to have a layered defense. As a preliminary example, consider multiple concentric fences, such that the attacker must get through N fences before he can do any harm. In this case, the larger N is, the more security there is (assuming that each of the N fences contributes some nonzero amount).
It is preferable to have multiple layers of different kinds (rather than just adding more layers of the same kind). For example, having guards on patrol plus video surveillance is preferable to having just guards or just video. That’s because the video serves as a check on the guards. A trick that gets past the guards won’t necessarily get past the video. (In contrast, having N identical concentric fences is not optimal, because an attacker who figures out how to defeat one fence may be able to defeat all the others.)
A layered defense creates an element of helpful redundancy.
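The difference between parallel threats and layered defenses can be put in rough numerical terms. The sketch below is an idealization, not a security model: it assumes each door or layer can be assigned a probability of being breached, that an attacker facing parallel entry points simply picks the easiest one, and that layers fail independently of one another. The function names are invented for the example.

```python
from math import prod


def breach_probability_parallel(entry_points):
    """Attacker picks the easiest of several alternative entry points.

    entry_points -- breach probabilities, one per door/gate/window
    """
    return max(entry_points)


def breach_probability_layered(layers):
    """Attacker must defeat every layer in turn.

    Assumes the layers fail independently, which is the idealized case;
    identical layers that share one weakness do not satisfy it.
    """
    return prod(layers)


if __name__ == "__main__":
    # Four strong doors and one weak one: the weak door sets the risk.
    print(breach_probability_parallel([0.01, 0.01, 0.01, 0.01, 0.9]))
    # Three independent layers, each only moderately good.
    print(breach_probability_layered([0.5, 0.5, 0.5]))
```

The independence assumption is exactly what N identical fences fail to provide, which is why layers of different kinds are preferable.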
Maintaining a layered defense requires some discipline. If you get sloppy, you might not worry about layer #1 because layer #2 will take up the slack, and you might not worry about layer #2 because layer #1 will take up the slack, et cetera. If you follow this sloppy practice too long, layer after layer will fail, and you won’t notice until there is a catastrophic failure of the whole multi-layer system. Therefore the rule is: It does not suffice to test a layered system as a whole. Fastidiously maintain each layer separately. Test each layer separately. (You can also do an end-to-end test of the multilayer system if you like, but that’s not a substitute for layer-by-layer testing.)
Some goals can be achieved by spot-checking. A simple, well-known example is highway speed enforcement. The police do not need to catch every single speeder. If only a small percentage of speeders are caught, and a sufficient penalty is imposed, deterrence is achieved.
It must be emphasized that the penalty is an indispensable part of the deterrence scenario. If the penalty is too small, speeders will speed as much as they want, because they don’t mind getting caught. This may sound obvious, but there have been cases where large-scale electoral wrongdoing has been detected, but those responsible suffered no penalty. See reference 17.
As an important application of spot-checking, hand-counting 100% of the ballots may not be necessary, except in unusually close elections. Hand-counting a statistical sample, if done right, should serve as a satisfactory check on the accuracy of the machine count. See section 4.
As discussed in detail in reference 1, there are several good reasons to hand count a statistically significant sample of the ballots right there in the precinct, right after the polls close. This is in addition to the usual machine count of 100% of the ballots.
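Hand counting 10% of the ballots shouldn’t take too long. Provided the sample is large enough (the margin of error depends on the absolute number of ballots sampled, not just on the percentage), it will predict the outcome of the full count with a 1% margin of error at 99% confidence.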
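For readers who want to check such figures for a particular precinct size, here is a rough calculation. It is a sketch under textbook assumptions that are not taken from this manual: a simple random sample drawn without replacement, the normal approximation, and the worst-case vote share of 50%. The function name and parameters are illustrative.

```python
from math import sqrt

Z_99 = 2.576  # two-sided normal quantile for 99% confidence


def margin_of_error(total_ballots, sample_size, share=0.5, z=Z_99):
    """Approximate margin of error for a candidate's vote share.

    Uses the normal approximation with the finite-population correction
    for a simple random sample drawn without replacement.
    """
    if not 0 < sample_size <= total_ballots:
        raise ValueError("sample_size must be between 1 and total_ballots")
    if total_ballots == 1:
        return 0.0
    standard_error = sqrt(share * (1 - share) / sample_size)
    correction = sqrt((total_ballots - sample_size) / (total_ballots - 1))
    return z * standard_error * correction


if __name__ == "__main__":
    for total in (1_000, 10_000, 150_000):
        sample = total // 10
        print(total, sample, round(margin_of_error(total, sample), 4))
```

Under these assumptions the margin shrinks roughly as one over the square root of the number of ballots sampled, so a 10% sample of a small precinct gives a wider margin than a 10% sample of a large jurisdiction.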
Anybody who wanted to hack the election would need to hack the hand count and then hack the machine count by the same amount.
Also, whenever the machine count indicates that the race is close, within a percent or two, it should be routine to hand-count 100% of the ballots for that race, and then do a second machine count. This makes a total of three counts.
Also do a 100% hand count in any precinct where the 10% hand count differs from the machine count by more than the expected 1% margin of error. Note that this will happen in at least 1% of the precincts, due to statistical fluctuations in the 10% sample. This is a significant burden, because it means that at least 1% of the precincts will be late in reporting their results.
As a further check, it would be worthwhile to do a 100% hand count of some randomly selected precincts, even if the race is not close and even if the 10% hand count was not discrepant.
It is to be emphasized that all this happens routinely, not based on any request or challenge from any candidate.
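The routine triggers described above can be summarized as a small decision procedure. This is a sketch only: the 2% threshold stands in for "within a percent or two", the 1% figure is the expected margin of error mentioned above, and the function names are hypothetical.

```python
import random


def full_hand_count_reasons(machine_margin, sample_discrepancy,
                            close_race=0.02, expected_error=0.01):
    """List the reasons (possibly none) for a 100% hand count.

    machine_margin     -- winner's lead in the machine count, as a fraction
    sample_discrepancy -- |sample hand count - machine count|, as a fraction
    """
    reasons = []
    if machine_margin <= close_race:
        reasons.append("close race")
    if sample_discrepancy > expected_error:
        reasons.append("sample discrepancy")
    return reasons


def pick_random_precincts(precincts, how_many, rng=None):
    """Choose precincts for a routine full hand count, regardless of results."""
    rng = rng or random.SystemRandom()
    return rng.sample(list(precincts), how_many)


if __name__ == "__main__":
    print(full_hand_count_reasons(machine_margin=0.015, sample_discrepancy=0.002))
    print(pick_random_precincts(range(1, 21), 3))
```

Because the rules are fixed in advance, nobody has to request or justify a recount; the numbers alone decide.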
Keep in mind that the primary, fundamental, and overarching goal is to have free, fair, accurate, and trustworthy elections in the future.
Also keep in mind the maxim: Treat ballots like money. As a corollary, changes to the election system should be treated like purchase orders.
Suppose that someone uses county funds to purchase a ladder. A routine audit shows that the purchase was requested by "Admin" and the purchase order was approved by "Admin". Since Admin is a pseudonym, there is no accountability. No auditor would tolerate this.
This is intolerable because there is no way of knowing whether the purchase is proper or not. Let’s be clear: The burden of proof is not on the auditor to prove that the purchase is improper. The burden is on the purchaser to prove that it is proper. Prudent business practice demands keeping an audit trail, so that it is easy to demonstrate that each and every purchase was proper.
Any competent business manager will institute systematic purchasing procedures before there is a huge financial fraud. By the same token, a competent election manager will institute systematic procedures before there is a huge fraud. The controls have two purposes: (a) to make sure there is no impropriety, and (b) to make sure there is not even the appearance of the possibility of impropriety.
Procedures and software tools are available to facilitate doing such things much more systematically. We can look to the Linux kernel as an example. Do you think anybody would tolerate a patch to the official Linux kernel where we didn’t know who submitted the patch, who tested the patch, who committed the patch, or why?
The Linux project uses "git". The author of a patch can digitally sign the patch. Those who test the patch can digitally sign their test reports. The guy who commits the patch to the official repository can digitally sign the commit message. The commit message answers the question of "why" the patch was desirable. For an example of what a git log looks like, see e.g. http://mapserver.flightgear.org/git/gitweb.pl?p=fgdata;a=summary
Doing things properly comes at a cost. Having two people sign off on every patch to the election systems creates bureaucratic burden on the people who do the work. This is entirely analogous to the bureaucratic burden of filling out purchase orders whenever you want to buy something with government money. This burden is part of the cost of doing business. It is necessary to prevent impropriety and to prevent the appearance of impropriety.
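The purchase-order test can be applied mechanically to every proposed change. The sketch below is an illustration, not a feature of git or of any election system: the list of shared account names and the function `change_is_accountable` are assumptions, and a real system would verify digital signatures rather than trust typed names.

```python
SHARED_ACCOUNTS = {"admin", "administrator", "root"}  # illustrative list


def change_is_accountable(requested_by, approved_by, reason):
    """Apply the purchase-order test to a proposed change.

    A change passes only if it names two different individuals, neither
    of them a shared login, and records why the change was made.
    """
    requester = requested_by.strip().lower()
    approver = approved_by.strip().lower()
    if not requester or not approver or not reason.strip():
        return False
    if requester in SHARED_ACCOUNTS or approver in SHARED_ACCOUNTS:
        return False
    return requester != approver


if __name__ == "__main__":
    print(change_is_accountable("Admin", "Admin", "update ballot layout"))
    print(change_is_accountable("alice", "bob", "update ballot layout"))
```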
One way to protect the audit log is to make multiple copies and distribute them widely.
There should be an easy way to make backups “at the push of a button”. The backup file should have an auto-generated unique name, to minimize the chance of inadvertently overwriting an earlier backup.
It wouldn’t hurt to make the backup file read-only. This is no protection against intentional tampering, but it offers some token resistance to unintentional snafus.
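A push-button backup along these lines might look like the following sketch. The naming scheme (UTC timestamp plus a short random suffix) and the function name `make_backup` are assumptions for illustration; any scheme that guarantees a fresh name would do.

```python
import os
import shutil
import stat
import uuid
from datetime import datetime, timezone
from pathlib import Path


def make_backup(source, backup_dir):
    """Copy `source` into `backup_dir` under a fresh, unique, read-only name."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    unique = uuid.uuid4().hex[:8]
    target = backup_dir / f"{source.stem}-{stamp}-{unique}{source.suffix}"

    # Mode "xb" refuses to overwrite an existing file, so even a name
    # collision cannot clobber an earlier backup.
    with open(source, "rb") as src, open(target, "xb") as dst:
        shutil.copyfileobj(src, dst)

    # Read-only guards against accidents, not against deliberate tampering.
    os.chmod(target, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target
```

Calling `make_backup("tally.db", "backups")` twice yields two distinct files; the file names here are placeholders.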
The Diebold GEMS system keeps all three of those things in the same database. (I’m not kidding. You can’t make this stuff up.) This makes it super easy to tamper with the votes and/or tamper with the configuration, and then tamper with the audit log so as to cover your tracks.
To say the same thing the other way, it would be a Bad Idea to have one omnipotent “Admin” account that is needed for doing routine things but capable of doing non-routine things.
There exist various well-known techniques for achieving separation: Different processes on the same machine, different virtual machines on the same hardware, or even completely different machines (such as a separate, loosely-connected machine to receive immutable backups and logfile entries).
Similarly we should insist that the tally machine use ECC memory.
Partial mitigation: Rigorous supervision and chain of custody.
Possible mitigation: Immediate hand-count of a statistically-significant random sample.
Possible mitigation: Scanning and tallying each ballot in real time, as it is cast, so we know it wasn’t “spoiled” when it went into the ballot box.
Mitigation: item 2-6.
Mitigation: Any early tally must either be kept rigorously secret, or be made public, so that no one can derive any partisan advantage from it. It is paradoxical but true that complete secrecy is fair, and complete openness is also fair.
Partial mitigation: avoid DRE: item 2-3. Other mitigation SORELY LACKING in current-generation tally machines.
Mitigation: In the longer term, secure hardware and secure open software.
Sorting could be made very easy with even a teensy bit of pre-planning. For starters, if there are N precincts, obtain N distinct PO boxes and print the appropriate PO box number on the envelopes, so that the Post Office does most of the sorting for you. Similarly print big fat stripes at precinct-dependent places on the envelopes, so that you can see at a glance if a stack of envelopes contains one that doesn’t belong.
This probably never worked perfectly, but it certainly doesn’t work now, since it is too easy to create plausible-looking ballots using a laser printer. Also current inventory-control methods are, in many cases, inadequate.
In accordance with the principle of treating ballots like money, you could imagine printing ballots on special paper using special ink and other fancy security measures, but this is not the usual practice.
Similarly note that paper money has serial numbers, whereas ballots customarily do not. This is a genuine dilemma. Serial numbers could be used to compromise voter privacy (leading to coercion, vote-buying, and other onesey-twosey attacks) but the lack of serial numbers makes the system more vulnerable to stuffing and other large-scale attacks.
Actually having some sort of unique numbering of ballots might not be so bad, if certain precautions were taken. The tracking number for each ballot comprises the precinct number and a unique random nonce. The nonce is random, not sequential, so it is technically not a “serial” number. The nonce is assigned on a per-precinct basis, so that it is possible for each precinct to determine which of its ballots has been used, without requiring any wide-area communication. The tracking number is printed on the ballot in some sort of bar code that is easy for a machine to read but hard for humans to read at a glance. This plus the fact that the nonces are not sequential makes it hard for pollwatchers to know which voter got which ballot.
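As an illustration of such a tracking number, the sketch below combines a precinct number with a random nonce. The 64-bit nonce size, the text format, and the function name are assumptions; the bar-code encoding is left out entirely.

```python
import secrets


def new_tracking_number(precinct, issued, nonce_bits=64):
    """Create a tracking number: precinct number plus a random nonce.

    precinct -- integer precinct number
    issued   -- set of nonces already used in this precinct (updated in place)
    """
    while True:
        nonce = secrets.randbits(nonce_bits)
        if nonce not in issued:  # uniqueness is checked locally, per precinct
            issued.add(nonce)
            return f"{precinct:04d}-{nonce:0{nonce_bits // 4}x}"


if __name__ == "__main__":
    issued_in_precinct_12 = set()
    print(new_tracking_number(12, issued_in_precinct_12))
    print(new_tracking_number(12, issued_in_precinct_12))
```

The nonce comes from the operating system's cryptographic random source, so seeing one ballot's number tells an observer nothing about the next one.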
Option A: The ballot-printing machine generates the tracking numbers, and keeps track of which numbers have been issued. This data is given to the tally machine, which checks off each number as it is used. It alarms if a wrong number shows up, or if a number is not on the list of issued numbers. In the case where the list of issued numbers is lost, the tally machine can revert to “legacy mode” i.e. it can count ballots without regard to the tracking numbers.
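Option A can be sketched as a simple ledger kept by the tally machine. The class and method names are hypothetical, and the sketch treats a re-used number as one kind of "wrong number", which is an interpretation rather than something the text spells out.

```python
class TrackingLedger:
    """Option A sketch: the tally machine checks numbers against the issued list."""

    def __init__(self, issued_numbers):
        self.issued = set(issued_numbers)
        self.used = set()

    def check(self, tracking_number):
        """Return "ok", or the reason this ballot should raise an alarm."""
        if tracking_number not in self.issued:
            return "not issued"
        if tracking_number in self.used:
            return "already used"
        self.used.add(tracking_number)
        return "ok"

    def unused(self):
        """Issued numbers that have not yet appeared on a counted ballot."""
        return self.issued - self.used


if __name__ == "__main__":
    ledger = TrackingLedger(["0012-aa", "0012-bb", "0012-cc"])
    for number in ["0012-aa", "0012-aa", "0012-zz"]:
        print(number, ledger.check(number))
    print("unused:", sorted(ledger.unused()))
```

The `unused` set also supports the rule that unused ballots must be accounted for.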
Option B: The tracking number is digitally signed. The private key must be very tightly protected. The public key can be widely known. The public key is used by the vote-scanning machine to check the validity of the tracking number. The machine then keeps track of which numbers have been used, and alarms if a number is re-used, or if an invalidly-signed number is used. Option B (unlike option A) means the tally machine does not need any prior knowledge of which tracking numbers have been issued. This is vulnerable to ballot-stuffing if the private key leaks out.
As a defense, most check-blanks nowadays use specially treated paper which shows if they are attacked by solvent.
In the case of ballots, it may be easier (and better) to give voters pens with relatively indelible ink. Chain the pen to the voting booth. Cheap “gel” pens are reputedly solvent-resistant, but it is possible to go much farther than that, including special inks that undergo an irreversible chemical reaction with the cellulose in the paper.
So the question arises, what do you do if stray marks are found? This is a dilemma. If you invalidate ballots that have stray marks, you solve one problem but create another: It becomes too easy for someone to tamper with the election by spoiling ballots after they are cast, simply by adding stray marks.
This dilemma can be mostly solved by scanning ballots at the time they are cast. If stray marks are detected at that time, the voter must re-vote using a new, clean ballot. If stray marks are not detected at that time, stray marks detected later will not invalidate the ballot. (I know of a loophole here, but I’d rather not discuss it.)