- Changed the model to use new features: absences, failures, and studytime.
- For absences, instead of using the original data, we cap the values: any absence count greater than or equal to 10 is set to 10.
- We tried to evaluate the information gain of each feature to decide which ones to train on. However, the typical method used for decision trees doesn't work here.
- As a result, we retrained the model on features chosen by intuition, picking those that seem likely to be useful when evaluating applicants.
- We split the data into two halves, one for training and one for testing. This gives us a hold-out set for validating the model.
- We reached an accuracy of about 80% to 85%, while the original model only reached 50%. This is a significant increase in prediction accuracy.

Please consult the homework assignment for additional context and instructions for this code.
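The preprocessing and split described in the list above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the notebook: the helper names (`cap_absences`, `to_feature_row`, `split_in_half`) and the sample records are hypothetical, and the random shuffle before splitting is an assumption.

```python
import random

ABSENCE_CAP = 10  # absences of 10 or more are all treated as 10
FEATURES = ("absences", "failures", "studytime")


def cap_absences(absences, cap=ABSENCE_CAP):
    """Clip an absence count so that values >= cap become cap."""
    return min(absences, cap)


def to_feature_row(record):
    """Build the feature vector for one applicant record (a dict)."""
    row = [record[name] for name in FEATURES]
    row[0] = cap_absences(row[0])  # absences is the first feature
    return row


def split_in_half(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) halves."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


# Hypothetical records, for illustration only.
records = [
    {"absences": 25, "failures": 0, "studytime": 2},
    {"absences": 3, "failures": 1, "studytime": 1},
    {"absences": 10, "failures": 2, "studytime": 3},
    {"absences": 0, "failures": 0, "studytime": 4},
]
rows = [to_feature_row(r) for r in records]
train, test = split_in_half(rows)
```

A fixed seed keeps the split reproducible, so accuracy figures can be compared between runs.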
pipenv is a packaging tool for Python that solves some common problems associated with the typical workflow using pip, virtualenv, and the good old requirements.txt.
- The version of Python you and your team will be using (version greater than 3.8)
- The pip package manager, updated to the latest version
- For additional resources, check out this link
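As a quick way to confirm the Python version mentioned in the list above, a small check like the following could be run before installing anything. It is only a sketch: the helper name `meets_minimum` is hypothetical, and it assumes that "greater than 3.8" means 3.8 or later.

```python
import sys

MINIMUM_PYTHON = (3, 8)  # assumption: "greater than 3.8" is read as 3.8 or later


def meets_minimum(version=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if not meets_minimum():
    print("Python %d.%d or later is expected for this project" % MINIMUM_PYTHON)
```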
To install pipenv from the command line, execute the following:
sudo -H pip install -U pipenv
The same instructions for Mac OS should work for Windows, but if they don't, follow the instructions here.
The repository contains a `Pipfile` that declares which packages are necessary to run `model_build.ipynb`.
To install the packages declared by the `Pipfile`, run `pipenv install` in the command line from the root directory.
You might want to use additional packages throughout the assignment.
To do so, run `pipenv install [PACKAGE_NAME]`, as you would install Python packages using pip.
This should also update the `Pipfile` and add the downloaded package under `[packages]`.
Note that `Pipfile.lock` will also be updated with the specific versions of the dependencies that were installed.
Any changes to `Pipfile.lock` should also be committed to your Git repository to ensure that your whole team is using the same dependency versions.
Working in teams can be a hassle since different team members might be using different versions of Python.
To avoid this issue, you can create a Python virtual environment, so you and your team will be working with the same version of Python and the same PyPI packages.
Run `pipenv shell` in your command line to activate this project's virtual environment.
If you have more than one version of Python installed on your machine, you can use pipenv's `--python` option to specify which version of Python should be used to create the virtual environment.
If you want to learn more about virtual environments, read this article.
You can also specify which version of Python you and your team should use under the `[requires]` section in the `Pipfile`.
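For orientation, a `Pipfile` with the `[packages]` and `[requires]` sections mentioned above typically looks something like the fragment below. This is an illustrative sketch, not the file from this repository: the package list is an assumption based on the tools used in this README, and the Python version is a placeholder.

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
# Assumed packages, for illustration only; check the real Pipfile.
flask = "*"
jupyter = "*"
pytest = "*"

[dev-packages]

[requires]
# Placeholder version; use the one your team has agreed on.
python_version = "3.8"
```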
You should run your notebook in the virtual environment from pipenv. To do so, run the following command from the root of your repository:
pipenv run jupyter notebook
You should also use pipenv to run your Flask API server.
To do so, execute the following commands from the `app` directory in the pipenv shell.
Set an environment variable for FLASK_APP. For Mac and Linux:
export FLASK_APP=app.py
For Windows:
set FLASK_APP=app
To run:
pipenv run flask run
Or if you're in the pipenv shell, run:
flask run
You can alter the port number used by the Flask server by changing the following line in `app/app.py`:
app.run(host="0.0.0.0", debug=True, port=80)
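If editing the file each time is inconvenient, one possible variation is to read the port from an environment variable and fall back to the current value. The sketch below is an assumption, not part of the existing `app/app.py`: the helper `resolve_port` and the `PORT` variable name are both hypothetical.

```python
import os


def resolve_port(default=80):
    """Return the port from the (hypothetical) PORT environment variable,
    or `default` when the variable is unset or empty."""
    raw = os.environ.get("PORT")
    return int(raw) if raw else default


# In app/app.py, the existing call could then become:
# app.run(host="0.0.0.0", debug=True, port=resolve_port())
```

Note that the arguments to `app.run(...)` apply when the module is executed directly with Python. When the server is started with `flask run`, the port is set with Flask's `--port` option instead. On many Unix-like systems, binding to ports below 1024 (such as 80) also requires elevated privileges.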
To run tests, execute the following command from the `app` directory:
pytest
If you're not in the pipenv shell, execute the following command from the `app` directory:
pipenv run pytest