Data Clinic Technical Fit Assessment

A public repository that introduces project partners and collaborators to open source tooling, Python libraries, and simple programs run from the command line.

How do we assess fit?

Working through the steps laid out in this README helps us jumpstart a conversation about your comfort with the open source tools and technologies that the Data Clinic could use during the partnership, such as Docker, Python-based modules, Jupyter notebooks, and command line execution.

When our partners don't have a prescribed ecosystem (i.e., a tech stack), it's important that we design a solution that you can truly own after handover. Early exposure to the design components that support our analytics and applications helps project partners know what they'll be using to do their work.

An early shared understanding of what we need to do (re: design) and what y'all need to do (re: upskilling) gives us all much more room to maneuver. For example, we can make different design choices from the get-go, spend some time throughout the project upskilling, or guide you towards free or low-cost learning materials. Or perhaps this experience stimulates conversations within your team about different platforms to explore going forward (Tableau? Google Cloud? Azure?).

What must you do?

To complete the exercise, follow the steps below. We recommend that you actively document your process as you go along.

Some thoughts to keep in mind as you journal:

Where did you get stuck? How did you unstick yourself? Did you use any online guides/materials to help you out?
What did you learn? Did anything you were asked to do surprise you?
Think about your current processes - imagine where this new tool will fold into that process. Describe that too.

Steps

1. Do Your Downloads!

Get your code editor ready

We recommend that you adopt a code editor to avoid working solely in the terminal. For the sake of this exercise, download the free version of Visual Studio Code. Note that there are other free, open source editors out there to choose from!

Get your package manager ready

We'll be using Homebrew

Open a terminal (see: How to open a terminal on Mac).
Paste the command below into the terminal. It runs an installation script that pauses to explain each step and periodically asks for your permission to continue.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Set up GitHub

Git is a distributed version control system. Every change made to the code is tracked in the repository. The entire codebase and its history are also available on every developer's or user's computer, which makes branching, merging, and running the code locally easy. More info on GitHub

Must Dos

Create a GitHub account
Use brew to install the git command line package by pasting the command below into the terminal. If you need additional guidance on setup, see the tutorial video How to Install Git on Mac OS.

brew install git

Extra Credit

Fork a repo

Get your Docker daemon ready

We aren't going to ask you to set up a Python-friendly development environment on your local machine. Instead, we'll guide you through a containerized solution.

"Docker is a tool for creating and deploying isolated environments for running applications with their dependencies. Basically, Docker makes it easy to write and run codes smoothly on other machines with different operating systems by putting together the code and all its dependencies in a container. This container makes the code self-contained and independent from the operating system." Source Article

Install Docker using Brew

brew install docker

Install Colima. Colima is an open source project that lets us work with Docker containers without needing to go through Docker Desktop.

brew install colima
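Once the downloads are done, a quick check that each tool is reachable from your terminal can save some head-scratching later. The snippet below is a minimal sketch, not part of this repository: the function name find_missing_tools and the REQUIRED_TOOLS list are made up for illustration, and it assumes you already have some Python 3.9+ interpreter available on your machine.

```python
import shutil
from typing import Callable, Iterable, Optional

# Command names this README asks you to install (Homebrew plus three formulae).
REQUIRED_TOOLS = ("brew", "git", "docker", "colima")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If something is reported as missing right after installation, opening a fresh terminal window is often enough for the new command to be picked up.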

2. Interact with tools and libraries via command line execution

Build and run the docker container

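Open Visual Studio Code and clone this repository. Then, from the repository's root directory, run the three commands below in order: the first starts the Colima virtual machine, the second builds the Docker image, and the third opens a bash shell inside a running container.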

colima start --cpu=4 --disk=100 --memory=6 --dns=1.1.1.1

docker build -t data-clinic-technical-assessment .

docker run -it -p 8000:80 -p 8888:8888 -v "$(pwd)":/technical_test data-clinic-technical-assessment bash

Generate data files via command line

make run_data

This will write a file called test.csv to the ./data/interim directory.

python -m src.data.make_census_data --group_id S1501 --geometry_level NECTA

This will call the Census API and write the following files to your local ./data directory.

./raw/NECTA
        └── tiger_zip.zip

./interim/NECTA
            ├── tl_2020_us_necta.cpg
            ├── tl_2020_us_necta.dbf
            ├── tl_2020_us_necta.prj
            ├── tl_2020_us_necta.shp
            ├── tl_2020_us_necta.shp.ea.iso.xml
            ├── tl_2020_us_necta.shp.iso.xml
            └── tl_2020_us_necta.shx

./processed/NECTA
              └── census_geom_NECTA.geojson

If you see these files, then it worked!
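If you would rather not eyeball the directory tree, the check can be scripted. This is a minimal sketch under a few assumptions: the file list is copied from the listing above, the helper name missing_outputs is hypothetical, and the script is assumed to run from the repository root so that the relative path data resolves correctly.

```python
from pathlib import Path

# Relative paths copied from the directory listing in this README.
EXPECTED_FILES = (
    "raw/NECTA/tiger_zip.zip",
    "interim/NECTA/tl_2020_us_necta.cpg",
    "interim/NECTA/tl_2020_us_necta.dbf",
    "interim/NECTA/tl_2020_us_necta.prj",
    "interim/NECTA/tl_2020_us_necta.shp",
    "interim/NECTA/tl_2020_us_necta.shp.ea.iso.xml",
    "interim/NECTA/tl_2020_us_necta.shp.iso.xml",
    "interim/NECTA/tl_2020_us_necta.shx",
    "processed/NECTA/census_geom_NECTA.geojson",
)


def missing_outputs(data_dir="data", expected=EXPECTED_FILES) -> list[str]:
    """Return the expected files that are not present under data_dir."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  ", rel)
    else:
        print("All expected files are present.")
```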

Feel free to explore census data by trying different table/geography combinations (see: List of ACS 2020 5-year subject tables).
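If command line flags are new to you, it may help to see how a Python program can turn --group_id and --geometry_level into values it can use. The sketch below is an illustration only: it assumes an argparse-style parser, and the real src.data.make_census_data module may read its flags differently. The function names build_parser and describe_request are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the two flags used in this README."""
    parser = argparse.ArgumentParser(
        description="Illustrative stand-in for src.data.make_census_data"
    )
    parser.add_argument("--group_id", required=True, help="ACS table ID, e.g. S1501")
    parser.add_argument(
        "--geometry_level", required=True, help="Geography level, e.g. NECTA"
    )
    return parser


def describe_request(argv: list[str]) -> str:
    """Parse the flags and summarize what was requested."""
    args = build_parser().parse_args(argv)
    return f"table={args.group_id} geography={args.geometry_level}"


if __name__ == "__main__":
    # Same flag values as the command shown in this README.
    print(describe_request(["--group_id", "S1501", "--geometry_level", "NECTA"]))
```

Swapping in a different table ID or geography on the command line changes only the values the program receives; the flag names stay the same.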

Modify the config.py file

Open the config.py file located under the ./src directory. Change the string assigned to WORD_FOR_PRINT and save the file. Run make run_data again and check that the printed statement changes.
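For readers new to Python, here is a minimal sketch of the pattern at work: a value is defined once in a config file and then reused by whatever code prints it. Only the name WORD_FOR_PRINT comes from this README; the value "hello", the build_message helper, and the printed wording are assumptions made for illustration, and the repository's own files may look different.

```python
# Hypothetical stand-in for ./src/config.py; only the variable name is from this README.
WORD_FOR_PRINT = "hello"


def build_message(word: str = WORD_FOR_PRINT) -> str:
    """Return the line a script might print; the exact wording is an assumption."""
    return f"The configured word is: {word}"


if __name__ == "__main__":
    # Editing WORD_FOR_PRINT above and re-running changes what gets printed.
    print(build_message())
```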

Create Geospatial Visualizations in Jupyter Notebooks

Once you've launched your local container and you're in the bash terminal, execute:

jupyter lab --ip 0.0.0.0 --allow-root --NotebookApp.notebook_dir=/technical_test

Copy the URL printed in the terminal and paste it into your web browser. We have confirmed that this works in Chrome. The link will take you to a Jupyter Lab environment whose kernel is backed by your local Docker container. Note that closing the terminal will end your Jupyter Lab session.
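If the page does not load, one thing worth checking is whether anything is listening on the port that docker run published for Jupyter (8888 in the command earlier in this README). The following is a minimal sketch to run on your own machine, outside the container, assuming the default host 127.0.0.1; the helper name port_is_open is hypothetical.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8888, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on port 8888.")
    else:
        print("Nothing is listening on port 8888; is the container running Jupyter Lab?")
```

An open port only shows that some process is listening there; it does not confirm that the process is Jupyter Lab.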

There is an example notebook in the ./notebooks directory called data_visualization.

Practice running the notebook cells. And feel free to play around with them! This is a space designed for you to explore.
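As a starting point for your own exploration, you could paste something like the following into a new notebook cell to see what the generated GeoJSON file contains. It is a minimal sketch that uses only the Python standard library rather than whatever the example notebook uses. The helper name summarize_geojson is hypothetical, and the relative path assumes your working directory is the repository root (mounted at /technical_test in the container).

```python
import json
from pathlib import Path


def summarize_geojson(path):
    """Return (feature_count, sorted property names) for a GeoJSON FeatureCollection."""
    with Path(path).open(encoding="utf-8") as handle:
        collection = json.load(handle)
    features = collection.get("features", [])
    names = set()
    for feature in features:
        # "properties" may be null in valid GeoJSON, so fall back to an empty dict.
        names.update((feature.get("properties") or {}).keys())
    return len(features), sorted(names)


if __name__ == "__main__":
    geojson_path = Path("data/processed/NECTA/census_geom_NECTA.geojson")
    if geojson_path.is_file():
        count, columns = summarize_geojson(geojson_path)
        print(f"{count} features with properties: {columns}")
    else:
        print(f"{geojson_path} not found; run the make_census_data step first.")
```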

To end your Jupyter session, press Ctrl+C in the terminal.
