Source code for the 2019/2020 MSc Data Science and Analytics dissertation. See final-report.pdf for the full dissertation write-up.
Congressional Redistricting in the United States: Using Markov Chain Monte Carlo to Identify Partisan Gerrymanders
Many, if not all, representative democracies utilise geographic boundaries to define electoral populations. In the system of the United States, on which this work focuses, these so-called districts are subject to abuse. Partisan gerrymandering is the process of drawing electoral districts such that they favour the party authoring the map. Because voters change their political views, party preference, and even physical location, it can be difficult to determine whether a district is gerrymandered. We seek to answer the question: how can one identify a partisan gerrymander? To do so, we use Markov chain Monte Carlo to produce a distribution of districting plans sampled uniformly at random from the set of all legal plans. This is done using a bespoke library newly written in Python by the author. We use this distribution in our analysis of a suspected gerrymander to assemble evidence for our case. We find that, while a gerrymander is difficult to prove definitively, enough statistics and tools exist to produce a compelling body of evidence for or against a gerrymander claim. We use these tools to demonstrate that a proposed districting plan is a partisan gerrymander. We conclude with a discussion of additional options available for future work.
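To make the sampling idea concrete, the following is a minimal sketch of one way a Markov chain over districting plans can be built. It is not the library in this repository, and every name in it (grid_graph, is_legal, flip_chain, seats_won) is hypothetical. It assumes a toy 4x4 grid of unit-population cells, two districts, a legality rule of contiguity plus a size tolerance, and made-up vote counts. The proposal reassigns one uniformly chosen cell to one uniformly chosen district and stays put when the result is illegal; because that proposal is symmetric, the chain's stationary distribution is uniform over the legal plans it can reach. The dissertation's actual proposal and constraints may differ.

```python
import random
from collections import deque


def grid_graph(rows, cols):
    """Adjacency dict for a rows x cols grid; nodes are (row, col) tuples."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
            adj[(r, c)] = [(r + dr, c + dc) for dr, dc in steps
                           if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return adj


def is_contiguous(nodes, adj):
    """True if the node set is non-empty and connected (breadth-first search)."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes


def is_legal(plan, adj, n_districts, max_dev):
    """Toy legality rule (an assumption): contiguous, near-equal-size districts."""
    ideal = len(plan) / n_districts
    for d in range(n_districts):
        members = [v for v, dist in plan.items() if dist == d]
        if abs(len(members) - ideal) > max_dev or not is_contiguous(members, adj):
            return False
    return True


def flip_chain(plan, adj, n_districts, max_dev, steps, rng):
    """Yield one plan per step of a symmetric single-cell flip walk."""
    plan = dict(plan)
    nodes = list(plan)
    for _ in range(steps):
        v = rng.choice(nodes)            # uniformly chosen cell
        d = rng.randrange(n_districts)   # uniformly chosen target district
        old = plan[v]
        plan[v] = d
        if not is_legal(plan, adj, n_districts, max_dev):
            plan[v] = old                # reject the move: the chain stays put
        yield dict(plan)


def seats_won(plan, votes, n_districts):
    """Number of districts in which party A outpolls party B."""
    totals = [[0, 0] for _ in range(n_districts)]
    for v, d in plan.items():
        totals[d][0] += votes[v][0]
        totals[d][1] += votes[v][1]
    return sum(1 for a, b in totals if a > b)


adj = grid_graph(4, 4)
# Starting plan: left two columns form district 0, right two form district 1.
start = {(r, c): 0 if c < 2 else 1 for r in range(4) for c in range(4)}
# Made-up (party A, party B) votes per cell, for illustration only.
votes = {(r, c): (70, 30) if r == 0 else (45, 55)
         for r in range(4) for c in range(4)}

rng = random.Random(0)
ensemble = list(flip_chain(start, adj, 2, 2, 2000, rng))
seat_counts = [seats_won(p, votes, 2) for p in ensemble]
print("seats for A under the starting plan:", seats_won(start, votes, 2))
print("share of sampled plans with the same seat count:",
      seat_counts.count(seats_won(start, votes, 2)) / len(seat_counts))
```

A suspected plan can then be compared against the sampled seat counts: a plan whose outcome is rare among the sampled plans is one piece of evidence, not proof, that it is an outlier. In practice one would also discard early steps and thin the chain, since consecutive plans are highly correlated.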