Comparison against TF-IDF Vectorizer (using sklearn) #7

Open
9 tasks
BALaka-18 opened this issue Sep 25, 2020 · 2 comments
Labels
code code based issue Hacktoberfest This issue is under Hacktoberfest 2020 medium intermediate level issues

Comments

@BALaka-18
Owner

BALaka-18 commented Sep 25, 2020

Description

TF-IDF is one of the best-known algorithms for keyword extraction from text. Your task is to create a function that extracts keywords from text using the TF-IDF algorithm and to compare its results against this library. How similar or different are the results?

For your reference, you may read these:

  1. Keyword extraction
  2. TF-IDF Vectorizer - Sklearn docs
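
One way to answer the "how similar or different" question is to measure the overlap between the two keyword lists. The sketch below is only an illustration: the function name `compare_keyword_sets` and the choice of Jaccard similarity are assumptions, not something this issue prescribes, and it expects each extractor's output to be reduced to a plain list of strings first.

```python
def compare_keyword_sets(keywords_a, keywords_b):
    """Compare two keyword lists (hypothetical helper, not part of this library).

    Returns the shared keywords, the keywords unique to each list, and the
    Jaccard similarity (size of intersection divided by size of union).
    """
    # Normalise case and whitespace so "Data" and "data " count as the same keyword.
    set_a = {keyword.lower().strip() for keyword in keywords_a}
    set_b = {keyword.lower().strip() for keyword in keywords_b}

    shared = set_a & set_b
    union = set_a | set_b

    # Two empty lists are treated as identical to avoid dividing by zero.
    jaccard = len(shared) / len(union) if union else 1.0

    return {
        "shared": sorted(shared),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        "jaccard": jaccard,
    }
```

A Jaccard score of 1.0 means both methods returned the same keywords, and 0.0 means they share none. The `only_a` and `only_b` entries show where the two methods disagree.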

Folder Structure, Function details

Create a folder tfidf_vectorizer in the root directory. The folder must contain a .py file that will contain the function for extracting the keywords from text using sklearn's TfidfVectorizer.

Structure : tfidf_vectorizer/extract_keywords_tfidf_sklearn.py

Acceptance Criteria

  • Code must be properly formatted.
  • Code must be accompanied by appropriate comments.
  • File structure must be strictly maintained.
  • Test cases must be present at the end of the code.
  • Variables and functions must be properly named.
  • IMPORTANT: Make sure the requirements.txt file is updated if you include any new library.
  • All instructions provided in the Description must be strictly followed.

Definition of Done

  • All of the required items are completed.
  • Approval by 1 mentor.

Time Estimation

1.5 hours

@BALaka-18 BALaka-18 added code code based issue Hacktoberfest This issue is under Hacktoberfest 2020 medium intermediate level issues labels Sep 25, 2020
@rexdivakar

Assign it to me @BALaka-18

@BALaka-18
Owner Author

@rexdivakar assigned.

@rexdivakar rexdivakar removed their assignment Aug 31, 2021