diff --git a/.gitignore b/.gitignore
index ccc6528..be94aeb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,3 @@
 /data
 __pycache__
+.ipynb_checkpoints
\ No newline at end of file
diff --git a/README.md b/README.md
index 5a4e044..c6a7196 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,31 @@
 # EDA
 This repo serves as code reference for the following TDS post *insert link*
 
+## Set up environment
+1. Clone this repo
+
+Open terminal (bash or powershell)
+```
+git clone https://github.com/notha99y/EDA.git
+```
+
+2. Get Anaconda
+
+Anaconda (download from [here](https://anaconda.org/anaconda/python))
+
+
+3. Create conda environment
+
+```
+conda env create -f=environment.yml
+```
+
+4. Get Data
+
+The Titanic dataset can be downloaded from https://www.kaggle.com/francksylla/titanic-machine-learning-from-disaster.
+Once downloaded, copy the files into the `data` folder in the project root directory.
+
+## Get your hands dirty
+1. Create your own Jupyter notebook
+2. Import the data using pandas
+3. Start playing with data and inventing your own style of EDA!
\ No newline at end of file
diff --git a/notes.txt b/notes.txt
index 0d8799b..fd30b34 100644
--- a/notes.txt
+++ b/notes.txt
@@ -1 +1 @@
-Download data from
\ No newline at end of file
+Download data from https://www.kaggle.com/francksylla/titanic-machine-learning-from-disaster
\ No newline at end of file
diff --git a/pictures/typesofdata.png b/pictures/typesofdata.png
new file mode 100644
index 0000000..1a19ea3
Binary files /dev/null and b/pictures/typesofdata.png differ