Plagiarism Checker is a Python program that compares two text files and calculates the similarity score between them. It uses the cosine similarity measure and TF-IDF (term frequency-inverse document frequency) to determine the degree of similarity between the texts.
Plagiarism is a serious academic offense whose consequences can include expulsion from school or legal action. The Plagiarism Checker project is designed to help students and educators check written work for plagiarism. By comparing two texts, the program detects similarities and flags cases where plagiarism may have occurred.
The program preprocesses each text by removing non-alphabetic characters and stop words and by stemming the remaining words. It then represents the texts as TF-IDF vectors and calculates the cosine similarity between them. If the similarity score is above a certain threshold (0.8 by default), the program considers the texts to be highly similar and reports a potential case of plagiarism.
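The pipeline described above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the project's actual code: the project relies on nltk and scikit-learn for these steps, while the stop-word list, the naive suffix stripping, the smoothed IDF formula, and all function names below are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); a real list is much larger.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "on", "it"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words, strip common suffixes."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    stemmed = []
    for token in tokens:
        # Naive suffix stripping as a stand-in for a real stemmer.
        for suffix in ("ing", "ed", "es", "s"):
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        stemmed.append(token)
    return stemmed


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        # Smoothed IDF, so terms present in every document keep a non-zero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_score(text_a, text_b):
    vec_a, vec_b = tfidf_vectors([preprocess(text_a), preprocess(text_b)])
    return cosine_similarity(vec_a, vec_b)


def is_potential_plagiarism(text_a, text_b, threshold=0.8):
    """True when the score is above the threshold (0.8 by default)."""
    return similarity_score(text_a, text_b) > threshold
```

Two identical texts score approximately 1.0 and two texts with no shared terms score 0.0; everything in between depends on how much weighted vocabulary the texts have in common.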
- Clone the repository or download the ZIP file.
- Install the required packages by running the command `pip install -r requirements.txt`.
- Run the program by running the command `python plagiarism_checker.py`.
- Follow the prompts to enter the file paths of the texts to be checked.
To use the Plagiarism Checker, follow these steps:
- Open the command prompt or terminal and navigate to the directory where the program is saved.
- Run the command `python plagiarism_checker.py`.
- Enter the file paths of the texts to be checked when prompted.
- Read the report generated by the program, which indicates the similarity score and whether the texts are similar or not.
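The prompt-and-report flow in the steps above might look roughly like the following sketch. The function names, prompt wording, and report format are hypothetical and are not taken from `plagiarism_checker.py`; the similarity function is passed in as `score_fn` so the example stays independent of how the score is computed.

```python
from pathlib import Path


def read_text(path):
    """Read a text file as UTF-8 (the encoding is an assumption)."""
    return Path(path).read_text(encoding="utf-8")


def build_report(score, threshold=0.8):
    """Format the similarity score and a verdict based on the threshold."""
    if score > threshold:
        verdict = "potential plagiarism: the texts are highly similar"
    else:
        verdict = "no strong similarity detected"
    return f"Similarity score: {score:.2f}\nResult: {verdict}"


def run(score_fn, threshold=0.8):
    """Prompt for two file paths, score the texts with score_fn, print the report."""
    path_a = input("Path to the first text file: ")
    path_b = input("Path to the second text file: ")
    score = score_fn(read_text(path_a), read_text(path_b))
    print(build_report(score, threshold))
```

Calling `run` with any function that maps two strings to a score between 0 and 1 reproduces the interactive session described in the steps.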
The Plagiarism Checker project was developed by GitProSolutions. It uses the following third-party libraries:
- nltk
- scikit-learn
The Plagiarism Checker project is licensed under the MIT License. See the `LICENSE` file for more details.