This project is a Python application designed to decrypt EDM festival lineups encrypted with a substitution cipher. I was inspired by the Countdown NYE 2023 lineup, which was teased on Instagram in an encrypted format. This repository includes:
- Jupyter notebooks used to develop and test the decryption logic on the Countdown NYE 2023 lineup
- Python scripts for formatting and decrypting new encrypted lineup files, ensuring reusability and efficiency
Note: The decryption algorithm is based on heuristics and assumptions about the structure of the ciphertext. The algorithm is not guaranteed to work on all encrypted lineups, and may require manual intervention to decrypt certain artists.
- Clone the repository:
git clone https://github.com/stevenxngo/lineup-decoder
- Navigate to the project directory:
cd lineup-decoder
- Install the required packages:
pip install -r requirements.txt
- Place the encrypted lineup text file in the data/ciphertext directory
- Run the main script to prepare the data and decrypt the artist names with the following command (replace [filename] with the name of the encrypted lineup file, excluding the file extension):
python main.py [filename]
- The decrypted lineup will be saved in the data/decoded_plaintext directory
- Place the plaintext lineup text file in the data/plaintext directory
- Run the main script with the -e flag using the following command (replace [filename] with the name of the plaintext lineup file, excluding the file extension):
python main.py [filename] -e
- The encrypted lineup will be saved in the data/ciphertext directory
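To illustrate what the encryption step does conceptually, here is a minimal sketch of a substitution cipher over letters and digits. It is not the repository's implementation: the helper names (make_key, encrypt, invert), the random key generation, and the choice to uppercase the input are all assumptions made for the example.

```python
import random
import string

# Assumed alphabet: uppercase letters plus digits (so "B2B" can be encrypted too).
ALPHABET = string.ascii_uppercase + string.digits


def make_key(seed=None):
    """Return a random one-to-one mapping over ALPHABET (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))


def encrypt(text, key):
    """Substitute each alphanumeric character; spaces and symbols pass through."""
    return "".join(key.get(ch, ch) for ch in text.upper())


def invert(key):
    """Swap the direction of a key so it can be used for decryption."""
    return {cipher: plain for plain, cipher in key.items()}


key = make_key(seed=2023)
ciphertext = encrypt("KASKADE B2B DEADMAU5", key)
print(encrypt(ciphertext, invert(key)))  # KASKADE B2B DEADMAU5
```

Because the key is a one-to-one mapping, repeated characters in the plaintext stay repeated in the ciphertext, which is the property the decryption heuristics rely on.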
- Data Preparation: Converts raw encrypted lineup text files into a structured CSV format
- Decryption: Implements a substitution cipher decryption algorithm to convert encrypted artist names into readable formats
- Automation: Provides a streamlined pipeline for processing new encrypted files with minimal manual intervention
- Modular Code: Separates data preparation, decryption and encryption into distinct, reusable scripts
The decryption algorithm relies on heuristics and assumptions about the structure of alphanumeric text encrypted with a substitution cipher. The heuristics are:
- (-- ---), where each character is different, is (DJ SET)
- (X--XYZ XYZ) is (SUNSET SET)
- (Z-------- XYZ) is (THROWBACK SET)
- ... XYX ... is ... B2B ...
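One way to turn these heuristics into starting mappings is to compare the repetition pattern of each ciphertext tag with the pattern of a known phrase. The sketch below uses that approach; it is an illustration, not necessarily how the repository detects the tags, and the names shape, seed_mappings, and KNOWN_TAGS are hypothetical.

```python
import re

# Known stage tags taken from the heuristics above.
KNOWN_TAGS = ["DJ SET", "SUNSET SET", "THROWBACK SET"]


def shape(text):
    """Encode the repetition pattern, e.g. 'SET SET' -> (0, 1, 2, ' ', 0, 1, 2)."""
    seen = {}
    out = []
    for ch in text:
        if ch == " ":
            out.append(" ")
        else:
            # First occurrence of a character gets the next free index.
            out.append(seen.setdefault(ch, len(seen)))
    return tuple(out)


def seed_mappings(lines):
    """Collect cipher -> plain pairs from parenthesised tags and B2B-shaped words."""
    mapping = {}
    for line in lines:
        # Tags such as "(DJ SET)" appear in parentheses after the artist name.
        for tag in re.findall(r"\(([^)]*)\)", line):
            for known in KNOWN_TAGS:
                if shape(tag) == shape(known):
                    mapping.update(zip(tag.replace(" ", ""), known.replace(" ", "")))
        # A standalone three-character word shaped like XYX is assumed to be B2B.
        for word in line.split():
            if shape(word) == shape("B2B"):
                mapping.update(zip(word, "B2B"))
    return mapping


print(seed_mappings(["ZZZZ (MK QJG)", "PPP XYX RRRR"]))
```

Since a substitution cipher preserves which positions share a character, a tag whose shape equals that of a known phrase is a strong candidate. Short shapes such as XYX can still produce false positives, which is one reason manual checks may be needed.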
Assuming at least two of the heuristics are present in the ciphertext, this yields a starting dictionary of known mappings. The algorithm then iterates through each artist in the ciphertext, filling in a template with the known characters (e.g. if J -> E and G -> T and the ciphertext is HJGJD, the template is _ETE_).
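The template step can be sketched in a few lines. The function name build_template and the use of a plain dict for the mapping are assumptions for illustration; the example values come from the paragraph above.

```python
def build_template(cipher_name, mapping):
    """Replace known cipher characters with plaintext; unknown ones become '_'."""
    return "".join(
        ch if ch == " " else mapping.get(ch, "_")
        for ch in cipher_name
    )


mapping = {"J": "E", "G": "T"}  # cipher character -> plaintext character
print(build_template("HJGJD", mapping))  # _ETE_
```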
A list of possible matches is then generated by iterating through a list of known artists (located in data/artists.txt) and comparing the template to each artist name. If there is exactly one match and at least 50% of the artist's ciphertext is known, the artist is added to the plaintext and the dictionary is updated with the newly revealed mappings. This process is repeated until either all artists are decrypted or no new artists can be decrypted.
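The matching loop described above might look roughly like the following. This is a sketch under stated assumptions rather than the repository's code: the function names are hypothetical, spaces are excluded when computing the known fraction, and the artist list is passed in as a Python list instead of being read from data/artists.txt.

```python
def build_template(cipher_name, mapping):
    """Known cipher characters become plaintext; unknown ones become '_'."""
    return "".join(ch if ch == " " else mapping.get(ch, "_") for ch in cipher_name)


def fits(template, candidate):
    """True when lengths match and the candidate agrees on every known character."""
    return len(template) == len(candidate) and all(
        t == c if t != "_" else c != " "
        for t, c in zip(template, candidate)
    )


def known_ratio(template):
    """Fraction of non-space characters already decrypted (assumed definition)."""
    letters = [ch for ch in template if ch != " "]
    return sum(ch != "_" for ch in letters) / len(letters) if letters else 0.0


def decrypt_all(cipher_names, known_artists, mapping, threshold=0.5):
    """Repeat passes until every name is solved or a pass makes no progress."""
    mapping = dict(mapping)  # work on a copy so the caller's dict is untouched
    solved = {}
    progress = True
    while progress and len(solved) < len(cipher_names):
        progress = False
        for name in cipher_names:
            if name in solved:
                continue
            template = build_template(name, mapping)
            matches = [artist for artist in known_artists if fits(template, artist)]
            if len(matches) == 1 and known_ratio(template) >= threshold:
                solved[name] = matches[0]
                # Every character of a solved name reveals a new mapping.
                mapping.update(
                    (c, p) for c, p in zip(name, matches[0]) if c != " "
                )
                progress = True
    return solved, mapping


solved, final_mapping = decrypt_all(
    ["DKHGLD", "HJGJD"],
    ["PETER", "RAPTOR", "TIESTO"],
    {"J": "E", "G": "T"},
)
print(solved)
```

In this toy run, DKHGLD is skipped on the first pass because only one of its six characters is known. Solving HJGJD as PETER reveals H -> P and D -> R, which lifts DKHGLD over the 50% threshold on the second pass.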
Note: The current implementation can only process artists with alphanumeric characters and spaces. Artists with special characters (e.g. K?D, A-Trak, ABOVE & BEYOND) will require manual intervention.
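A quick way to spot names that fall outside this limitation is to test each one against the allowed character set. The helper below is a hypothetical illustration and is not part of the repository.

```python
import re

# Allowed characters per the note above: letters, digits, and spaces.
ALLOWED = re.compile(r"[A-Za-z0-9 ]+")


def needs_manual_review(name):
    """True when the name contains anything other than letters, digits, or spaces."""
    return ALLOWED.fullmatch(name) is None


for artist in ["K?D", "A-Trak", "ABOVE & BEYOND", "DEADMAU5"]:
    print(artist, needs_manual_review(artist))
```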
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
- Countdown NYE for releasing the encrypted lineup and inspiring this project
- @lycheebeach on TikTok for giving me the idea to automate the decryption process
For questions, feedback, or inquiries about the project, feel free to reach out to me:
Steven Ngo - steventxngo@gmail.com - GitHub - LinkedIn