- Visit the Readrr web application
- Primary API Documentation
- Labs 21 Search API Documentation
Developer | GitHub | Portfolio | |
---|---|---|---|
Patrick Wolf | 🤷 | ||
Ryan Zernach | 💼 | ||
Michael Rowland | 💼 | ||
Jose Marquez | 💼 |
Developer | GitHub | Portfolio | |
---|---|---|---|
Claudia Chajon | 🤷 | ||
Enrique Collado | 🤷 | ||
Dylan Nason | 🤷 | ||
Kumar Veeravel | 🤷 |
The aim of this project is to provide a clean, uncluttered user interface that lets a user track books, much like Goodreads. More details can be found in the product vision document (PVD accessible only to team members).
The core DS role on this project is to provide recommendations. If there are other DS utilities that you would like to add, communicate with Web, UI and iOS to get UI design input on the feature and identify the necessary data.
Currently, the application uses a simple nearest-neighbors-based search engine that funnels title matches into a system that references a cosine similarity matrix. The matrix combines collaborative and content-based recommendation approaches. This method provides the best recommendations we have encountered to date. Unfortunately, the current data is limited to fewer than 10k books; to prevent empty recommendations where data for a book does not exist, the hybrid engine falls back to a purely description-based recommendation wherever necessary. This means that two recommendation engines work together to provide a seamless experience.
The hybrid engine is an aggregation of cosine similarities from a collaborative filtering method and a content-based one that uses descriptions. The standalone content-based system, by contrast, uses a combination of spaCy for tokenization, TF-IDF for vectorization and a scikit-learn nearest-neighbors model to find the closest matches to a book in question.
All of these techniques are served to Web and iOS through a Flask application, with a Gunicorn HTTP server, deployed inside a Docker container to AWS Elastic Beanstalk.
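The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `HYBRID_SIMILARITY`, `recommend`, the `description_engine` callable and the sample titles and scores are all hypothetical names and data chosen for the example.

```python
from typing import Callable, Dict, List

# Hypothetical precomputed hybrid similarities: title -> {other title: score}.
# In the real service this would come from the cosine similarity matrix.
HYBRID_SIMILARITY: Dict[str, Dict[str, float]] = {
    "Dune": {"Foundation": 0.82, "Hyperion": 0.74, "Emma": 0.05},
}


def recommend(
    title: str,
    description: str,
    description_engine: Callable[[str, int], List[str]],
    k: int = 2,
) -> List[str]:
    """Return up to k recommendations, preferring the hybrid matrix."""
    row = HYBRID_SIMILARITY.get(title)
    if row:
        # The book is covered by the hybrid data: rank neighbours by score.
        ranked = sorted(row, key=row.get, reverse=True)
        return ranked[:k]
    # No hybrid data for this book: fall back to description-based matching.
    return description_engine(description, k)
```

A caller would pass in whatever description-based recommender is available, so a book missing from the matrix still returns a non-empty result as long as it has a description.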
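To make the content-based pipeline concrete, here is a small standard-library sketch of the same idea: tokenize descriptions, weight terms with TF-IDF, and rank other books by cosine similarity. It is an assumption-laden stand-in: a regular expression replaces spaCy tokenization, a brute-force scan replaces the scikit-learn nearest-neighbors model, and the function names and sample descriptions are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Stand-in for spaCy: lowercase alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of descriptions containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({t: term_freq[t] * idf[t] for t in term_freq})
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_index: int, vectors: List[Dict[str, float]], k: int = 1) -> List[int]:
    # Brute-force search: score every other book, highest similarity first.
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]


# Hypothetical descriptions, for illustration only.
descriptions = [
    "A desert planet, giant sandworms and a struggle over spice.",
    "Explorers cross a desert planet hunted by sandworms.",
    "A witty romance about marriage and manners in an English village.",
]
vectors = tfidf_vectors(descriptions)
closest = nearest(0, vectors)
```

With these three descriptions, the first book's closest match is the second one, since they share the most distinctive terms. With a full catalogue, a fitted nearest-neighbors index would replace the brute-force loop.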
- 10k Books, 6m Ratings
- Book Crossing Dataset (Mostly used for publishers to populate database)
Description Based Recommendations
Details on how to connect to the Web API are located at the top of this document.
Currently, the domain for the data science API is dsapi.readrr.app.
The account used for the Postman collection is the betterreadslabs21@gmail.com account (sign in using Google). See your TL or SL for login credentials (you can also get login credentials for AWS, where the API is deployed).
When contributing to this repository, please first discuss the change you wish to make with the owners of this repository via issue, email, or any other method.
Please note we have a code of conduct. Please follow it in all your interactions with the project.
If you are having an issue with the existing project code, please submit a bug report under the following guidelines:
- Check first to see if your issue has already been reported.
- Check to see if the issue has recently been fixed by attempting to reproduce the issue using the latest master branch in the repository.
- Create a live example of the problem.
- Submit a detailed bug report including your environment & browser, steps to reproduce the issue, actual and expected outcomes, where you believe the issue is originating from, and any potential solutions you have considered.
We would love to hear from you about new features which would improve this app and further the aims of our project. Please provide as much detail and information as possible to show us why you think your new feature should be implemented.
If you have developed a patch, bug fix, or new feature that would improve this app, please submit a pull request. It is best to discuss your ideas with the developers before investing a great deal of time in a pull request, to ensure that it will mesh smoothly with the project.
Remember that this project is licensed under the MIT license, and by submitting a pull request, you agree that your work will be, too.
- Ensure any install or build dependencies are removed before the end of the layer when doing a build.
- Update the README.md with details of changes to the interface, including new plist variables, exposed ports, useful file locations and container parameters.
- Ensure that your code conforms to our existing code conventions and test coverage.
- Include the relevant issue number, if applicable.
- You may merge the pull request once you have the sign-off of two other developers. If you do not have permission to merge, you may ask the second reviewer to merge it for you.
These contribution guidelines have been adapted from this good-Contributing.md-template.