This Streamlit app performs Named Entity Recognition (NER) using NLP techniques and links the identified entities to their corresponding Wikipedia pages. It also disambiguates ambiguous entities like "Apple," offering clickable links for better context.
- Extracts named entities from user input using spaCy.
- Links entities to Wikipedia articles.
- Handles disambiguation for ambiguous entities (e.g., distinguishing between Apple Inc. and apple the fruit).
- User-friendly interface built with Streamlit.
- Capitalizes lowercase input so that proper nouns can still be detected.
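The lowercase handling mentioned in the last feature could look roughly like the sketch below. This is only an illustration under the assumption that all-lowercase input is capitalized word by word before it reaches spaCy; the function name `capitalize_if_lowercase` is hypothetical and the app's actual logic may differ.

```python
def capitalize_if_lowercase(text: str) -> str:
    """Capitalize each word, but only when the whole input is lowercase.

    Assumption: mixed-case input already carries the casing the NER model
    needs, so it is returned unchanged.
    """
    if text != text.lower():
        return text  # input already contains capitals; leave it alone
    # Upper-case the first character of every space-separated word.
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


if __name__ == "__main__":
    print(capitalize_if_lowercase("microsoft and apple"))
    print(capitalize_if_lowercase("Apple and microsoft"))
```

Capitalizing every word is a blunt heuristic: it also capitalizes ordinary words such as "and", so a real implementation would likely need something more selective.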
- Clone the repository:
  git clone https://github.com/arya-io/NER-EntityLinker.git
- Navigate to the project directory:
  cd NER-EntityLinker
- Install the required dependencies:
  pip install -r requirements.txt
- Run the Streamlit app:
  streamlit run app.py
- Python 3.7 or higher
- streamlit
- spacy
- requests
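- re (regular expressions; part of the Python standard library, so no separate installation is needed)
- en_core_web_sm 3.8.0 (spaCy English model), referenced as ./en_core_web_sm-3.8.0.tar.gz; this file must be downloaded from GitHub and placed in the project directory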
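Because the spaCy model is a separate download, it can help to confirm it is installed before starting the app. spaCy models install as regular Python packages, so a minimal check, using only the standard library, might look like this. The helper name `model_installed` is hypothetical and not part of the project.

```python
import importlib.util


def model_installed(package: str = "en_core_web_sm") -> bool:
    """Return True if the given top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if model_installed():
        print("en_core_web_sm is installed.")
    else:
        print("en_core_web_sm is missing; install the downloaded model archive first.")
```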
- Open your terminal and run the Streamlit app.
- Enter text in the input field provided in the main section.
- Click on "Process Text" to see extracted entities with clickable links to Wikipedia.
- For ambiguous entities, the app will attempt to disambiguate and provide the most relevant Wikipedia link.
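To illustrate how an extracted entity might become a clickable link, the sketch below builds a Wikipedia article URL from an entity name and wraps it in a Markdown link. It assumes the standard `https://<lang>.wikipedia.org/wiki/<Title>` URL pattern; the function names `wikipedia_url` and `markdown_link` are hypothetical, and the app itself may resolve links through the Wikipedia API instead.

```python
from urllib.parse import quote


def wikipedia_url(entity: str, lang: str = "en") -> str:
    """Build a Wikipedia article URL for an entity name (no network access)."""
    # Wikipedia titles use underscores in place of spaces.
    title = entity.strip().replace(" ", "_")
    # Percent-encode anything that is not safe inside a URL path.
    return f"https://{lang}.wikipedia.org/wiki/{quote(title)}"


def markdown_link(entity: str) -> str:
    """Format an entity as a Markdown link to its presumed Wikipedia page."""
    return f"[{entity}]({wikipedia_url(entity)})"


if __name__ == "__main__":
    for name in ["Apple Inc.", "iPhone 15", "Microsoft"]:
        print(markdown_link(name))
```

A string in this form could be passed to `st.markdown` in Streamlit to render the link. Note that this only constructs a URL; it does not check that the page exists.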
Example input:
Apple Inc. launched the iPhone 15 last week. Microsoft and Apple are leading the tech industry.
The app will extract entities such as "Apple Inc.", "iPhone 15", and "Microsoft" and link them to the relevant Wikipedia pages. It will also attempt to resolve the ambiguous mention "Apple" in the second sentence to the most relevant page.
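One simple way to picture the disambiguation step is a keyword-overlap heuristic: each candidate sense has a set of cue words, and the sense whose cues overlap most with the surrounding text wins. The sketch below is an illustrative stand-in, not the app's actual method; the `CANDIDATES` table, its cue words, and the `disambiguate` function are all hypothetical.

```python
import re

# Hypothetical candidate senses and cue words, for illustration only.
CANDIDATES = {
    "apple": {
        "Apple Inc.": {"iphone", "tech", "microsoft", "company", "launched"},
        "Apple": {"fruit", "tree", "eat", "pie", "orchard"},
    }
}


def disambiguate(mention: str, context: str) -> str:
    """Pick the candidate title whose cue words overlap most with the context."""
    senses = CANDIDATES.get(mention.lower())
    if not senses:
        return mention  # not a known ambiguous mention; keep it as-is
    words = set(re.findall(r"[a-z0-9]+", context.lower()))
    # On a tie, max() returns the first candidate in the table.
    return max(senses, key=lambda title: len(senses[title] & words))


if __name__ == "__main__":
    text = "Microsoft and Apple are leading the tech industry."
    print(disambiguate("Apple", text))
```

With the example input above, the words "microsoft" and "tech" favor the company sense, while a sentence about picking an apple from a tree would favor the fruit.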
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request or raise an issue for any improvements or bug fixes.
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Commit your changes (git commit -am 'Add some feature').
- Push to the branch (git push origin feature-branch).
- Create a new Pull Request.
- spaCy for Named Entity Recognition.
- Streamlit for building the app interface.
- Wikipedia API for entity linking.