A Python legal transcript auto-formatter and checklist

PyCheck

Introduction

This is PyCheck, a legal transcript auto-formatter and checklist inspired by Automate the Boring Stuff with Python by Al Sweigart and the Formation library by @ObaraEmmanuel.

[Screenshot: PyCheck]

Motivation

I have been proofreading legal transcripts manually (~500 pages weekly on average), following specific guidelines. I built this tool to achieve the following:

  • Accuracy: Ensure all formatting guidelines are applied correctly, such as adding a colon and two spaces after all speaker IDs in Colloquy and a tab after all Q&As.

  • Consistency: Common words like videoconference should be spelled the same way throughout the document.

  • One-Click functionality: Merge and format the document(s) in one click.

  • Ease & Speed: Simplify the time-consuming and tedious process of manual formatting, reducing it to mere seconds.
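The formatting rules named above can be sketched with a few regular expressions. This is a minimal illustration, not PyCheck's actual code: the function `format_transcript`, the patterns, and the spelling table are all hypothetical, and it assumes speaker IDs are uppercase labels ending in a colon and that Q&A lines begin with "Q." or "A.".

```python
import re

# Assumption: a speaker ID is an uppercase label such as "MR. SMITH" or
# "THE WITNESS" at the start of a line, followed by a colon.
SPEAKER_RE = re.compile(r"^([A-Z][A-Z.' ]*[A-Z.]):[ \t]*", re.MULTILINE)

# Assumption: Q&A lines start with "Q." or "A." (period included).
QA_RE = re.compile(r"^([QA]\.)[ \t]*", re.MULTILINE)

# Hypothetical spelling table; real guidelines would supply their own entries.
PREFERRED_SPELLINGS = {
    r"\bvideo[ -]conference\b": "videoconference",
}


def format_transcript(text: str) -> str:
    """Apply the colon-plus-two-spaces, tab, and spelling rules to raw text."""
    # Exactly two spaces after the colon that ends a speaker ID.
    text = SPEAKER_RE.sub(r"\1:  ", text)
    # Exactly one tab after a Q or A label.
    text = QA_RE.sub(r"\1\t", text)
    # Normalize variant spellings (the replacement is always lowercase).
    for pattern, replacement in PREFERRED_SPELLINGS.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


raw = "MR. SMITH:Good morning.\nQ.   Did you join the video conference?\nA. Yes.\n"
formatted = format_transcript(raw)
```

Keeping each rule as its own pattern makes it easy to add or adjust guidelines without touching the others.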

The use of PyCheck has improved my review scores to 100%.

For a raw sample transcript, see sample.txt; for a PyCheck-formatted document, see final.txt.

🚀 Quick Start

1. Clone the repository

git clone https://github.com/ArshavineRoy/pycheck
cd pycheck

2. Create and activate a virtual environment

  • Create the environment with virtualenv:

    virtualenv venv

  • Activate it:

    # macOS and Linux
    
    source venv/bin/activate
    
    # Windows
    
    venv\Scripts\activate

3. Install required dependencies

pip install -r requirements.txt

4. Run PyCheck

python app.py

5. Click Open and load the raw sample.txt from the project's root folder

sample.txt in its entirety is a FULLY FICTIONAL deposition transcript that stands in for a file you might have to proofread, with deliberate mistakes included to demonstrate the power of PyCheck!

📖 Usage

PyCheck has the following perks out of the box:

  • Text color for different scenarios to easily differentiate between examination headings, by-lines, Colloquy, QA, and parentheticals.

  • Highlights any inconsistencies in QA examinations, so no two Qs follow each other. The order should be Q-A-Q-A-Q-A, not Q-Q-Q-A-Q-A.

  • Edit to fix any mistakes in-app, then save as a new file or copy-paste when done.

  • A refresh button to re-run the checklist instead of loading the file every time.

  • Highlights and formats all instances of strike that, just in case.

    When an attorney says, "Strike that," the statement that follows MUST start on a new line.

  • Highlights the beginning of different files, in case you're loading several parts.

    To demonstrate, duplicate sample.txt and select both files when uploading. Note: The files are alphabetically ordered.
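Two of the checks above, the Q-A ordering and the "strike that" line break, can be sketched as plain functions. This is only an illustration under stated assumptions, not how PyCheck itself is implemented: `find_qa_order_errors` and `break_after_strike_that` are hypothetical names, Q&A lines are assumed to start with "Q." or "A.", and any repeated label (Q after Q, or A after A) is treated as an error.

```python
import re

# Assumption: Q&A lines start with "Q." or "A." followed by whitespace.
QA_LINE_RE = re.compile(r"^([QA])\.\s")

# "strike that" (any case), optional period or comma, then more text
# on the same line.
STRIKE_RE = re.compile(r"(strike that[.,]?)[ \t]+(?=\S)", re.IGNORECASE)


def find_qa_order_errors(lines):
    """Return 1-based numbers of lines whose Q/A label repeats the previous one."""
    errors = []
    previous = None
    for number, line in enumerate(lines, start=1):
        match = QA_LINE_RE.match(line)
        if not match:
            continue  # Colloquy, by-lines, and parentheticals are skipped.
        label = match.group(1)
        if label == previous:
            errors.append(number)
        previous = label
    return errors


def break_after_strike_that(text):
    """Move whatever follows "strike that" on the same line to a new line."""
    return STRIKE_RE.sub(r"\1\n", text)


lines = [
    "Q.\tWhere were you?",
    "Q.\tWere you home?",
    "A.\tYes.",
    "MR. SMITH:  Objection.",
    "Q.\tAll day?",
    "A.\tYes.",
]
errors = find_qa_order_errors(lines)
fixed = break_after_strike_that("Q.\tWhere did you -- strike that. Where were you?")
```

In a GUI, the returned line numbers could drive the highlighting, while the line break can be applied to the text before it is displayed.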

Miscellaneous

Let's go over some legal jargon:

Author & License

Author - Arshavine Waema.

Licensed under the MIT License - see the LICENSE file for details.
