The Oregon Constitution enshrines Oregonians' right to enact legislation directly through ballot initiatives. In simple terms, a required number of Oregon voters (112,020 for the November 2022 election) must petition the Oregon Secretary of State to place the proposed legislation on a general election ballot. To ensure that these voter petitions are valid, they must be checked manually against state voter files, a process that is time-consuming and tedious. To speed up validation, this project converts handwritten petitions into digitized data using handwritten text recognition.
Digitized petitioner data (name and address), along with the petition number, will be stored in an Excel file. The data is extracted with Google Vision, which reads each petition sheet using OCR. Extracted data is initially held in a pandas DataFrame, a convenient and efficient structure for storing and manipulating tabular data in Python.
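As a rough illustration of this storage step, the sketch below loads already-extracted records into a pandas DataFrame and tidies their whitespace. It assumes the OCR step has already produced one dictionary per petition line; the OCR call itself is not shown. The column names, the helper names `records_to_frame` and `export_frame`, and the sample values are hypothetical and chosen only for the example.

```python
import pandas as pd

# Hypothetical column layout for one signature line on a petition sheet.
COLUMNS = ["petition_number", "name", "address"]


def records_to_frame(records):
    """Build a DataFrame from OCR records (one dict per petition line)."""
    frame = pd.DataFrame(records, columns=COLUMNS)
    for column in ("name", "address"):
        # Collapse runs of whitespace and trim the ends of each text field.
        frame[column] = (
            frame[column].str.replace(r"\s+", " ", regex=True).str.strip()
        )
    return frame


def export_frame(frame, path):
    """Write the frame to an Excel file (needs an engine such as openpyxl)."""
    frame.to_excel(path, index=False)


# Made-up sample record, standing in for real OCR output.
sample = [
    {
        "petition_number": "2022-001",
        "name": "  Jane Doe ",
        "address": "123 Main St,  Salem",
    }
]
frame = records_to_frame(sample)
```

Calling `export_frame(frame, "petitions.xlsx")` would then write the table to disk. Keeping the export in its own function means the validation step can run on the DataFrame before anything is written.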
Because the OCR output is not always accurate, the data must be validated before it is exported. Validation compares the extracted data with the Oregon Voter File, which contains all registered voters in Oregon. Since accuracy is key, we use a two-pass approach: the first pass validates names and the second validates addresses. Both passes compare the OCR data with the Voter File using fuzzy matching, which measures the “closeness” of two strings by their Levenshtein distance, that is, the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. Because the Voter File holds almost 4 million unique entries, we speed up the comparison by splitting the pandas workload across multiple processes. Once the verified output is obtained, it is automatically exported as an Excel file.
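The two-pass idea can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual implementation: the `Voter` record, the `similarity` normalization, the `match_voter` helper, the 0.8 thresholds, and the sample voters are all assumptions made for the example, and the multiprocessing step is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive closeness in [0, 1]; 1.0 means identical."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


@dataclass(frozen=True)
class Voter:
    """Hypothetical voter-file row reduced to the two fields compared."""
    name: str
    address: str


def match_voter(ocr_name: str, ocr_address: str, voter_file: List[Voter],
                name_threshold: float = 0.8,
                address_threshold: float = 0.8) -> Optional[Voter]:
    """Return the best-matching voter, or None if no one passes both passes."""
    # Pass 1: keep only voters whose name is close to the OCR name.
    candidates = [v for v in voter_file
                  if similarity(ocr_name, v.name) >= name_threshold]
    # Pass 2: among those, pick the closest address above the threshold.
    best, best_score = None, address_threshold
    for voter in candidates:
        score = similarity(ocr_address, voter.address)
        if score >= best_score:
            best, best_score = voter, score
    return best


# Made-up voter-file rows for illustration only.
voter_file = [
    Voter("Jane Doe", "123 Main St, Salem"),
    Voter("John Doe", "456 Oak Ave, Portland"),
    Voter("Janet Dole", "789 Pine Rd, Eugene"),
]
# OCR misread "o" as "0" and "i" as "l"; both fields are still close enough.
matched = match_voter("Jane D0e", "123 Maln St, Salem", voter_file)
```

Filtering on names first keeps the second pass small, since address comparisons only run on the few voters whose names already resemble the OCR text. In practice the thresholds would need tuning against real petition sheets.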
The verified data will allow Solving for Progress LLC to provide consulting services, including strategic research and campaign support, for progressive political campaigns that align with these petitions. In addition, Solving for Progress LLC will be able to keep interested petitioners updated on the current status of each petition.