forked from ukgovdatascience/Python-for-Analysts
0 parents
commit 04864a7
Showing 67 changed files with 22,560 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Python for Analysts Training" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Hi! And welcome to the Python for Analysts training course. This covers everything you need to know to start using Python for data analysis and visualisation as well as showcasing some more advanced and snazzy stuff, including Statistics, Machine Learning, Web Scraping / Interaction etc.\n", | ||
"\n", | ||
"The course assumes no prior knowledge of Python and will teach you everything you need to know in order to use Python for data analysis and visualisation, including interfacing with Python via the Jupyter interface, using Text Editors / Integrated Development Environments (IDEs), upgrading Python, working with the command line etc.\n", | ||
"\n", | ||
"Lastly, note that the course can only hope to give you an introduction to Python for Data Analysis over the 3 days. You'll no doubt want to continue your learning afterward, and the course provides links to relevant material with which to further your development." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Structure of the Course" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The course is designed to cover the best part of 2 days with time for exercises and consolidation in between.\n", | ||
"\n", | ||
"You will be expected to have a project to practice with ideally for days 1 and 2 but definitely for day 3! This will allow you to consolidate your Python training and continue to learn and develop.\n", | ||
"\n", | ||
"The structure of the course is as follows:\n", | ||
"\n", | ||
"## Day 1-2:\n", | ||
"\n", | ||
"### <b>Basics</b> \n", | ||
"\n", | ||
"* Interfacing with Python\n", | ||
"* Basic Python Sytnax\n", | ||
"* Data Structures\n", | ||
"* Coding concepts\n", | ||
"* Looping\n", | ||
"* Enhancing Python with Packages\n", | ||
"\n", | ||
"### <b>Working with data</b>\n", | ||
"\n", | ||
"* Data Analysis Libraries\n", | ||
"* Advanced Data Structures\n", | ||
"* Importing / Exporting Data\n", | ||
"* Working with DataFrames\n", | ||
"* Summary Statistics\n", | ||
"* Tables\n", | ||
"\n", | ||
"### <b>Visualisation</b>\n", | ||
"\n", | ||
"* Static Visualisation\n", | ||
"* Statistical Visualisation\n", | ||
"* Interactive Visualisation\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Those taking the course should note that the best way to consolidate their learning is via your project. Not only will this help you embed what you've leared, but it will also get you used to solving problems and continuing your learning journey in Python!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Following along" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"During the lectures, you might wish to just listen, follow along on your screen, or execute the code in your own blank notebook, make notes etc. All of this is fine so long as you pay attention!\n", | ||
"\n", | ||
"In most of the lectures the code is 'pre-baked' - we will explain what it does, execute it and show you and talk you through the output. This means we can give the class our full attention and not focus on finding typos or wondering why code didn't run properly!" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,275 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# B00: Introduction" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Welcome to the Python for analysts training course!\n", | ||
"\n", | ||
"This course is aimed at existing analysts who have some experience using analytical software (e.g. SAS, SPSS, STATA, Matlab etc.) who want to learn Python." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## So what is Python?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Great question.\n", | ||
"\n", | ||
"It's an open source programming language created by a guy called Guido van Rossum in 1989. He now works for Google and is Python's BDFL or Benevolent Dictator For Life. This essentially means he's in charge of the direction of the language. This is done (in conjunction with the user community of course!) via PEPs or Python Enhancement Protocols. You can read these at www.Python.org but two great examples are PEP 8 and PEP 20.\n", | ||
"\n", | ||
"* <a href = \"https://www.python.org/dev/peps/pep-0008/\">PEP 8</a>\n", | ||
"\n", | ||
"* <a href = \"https://www.python.org/dev/peps/pep-0020/\">PEP 20</a>\n", | ||
"\n", | ||
"\n", | ||
"Over time Python has been developed by its user community and is now recognised as one of the 'big 2' programming languages for Data Science, the other being R.\n", | ||
"\n", | ||
"Python is:\n", | ||
"\n", | ||
"* an easy and intuitive language just as powerful as major competitors\n", | ||
"\n", | ||
"* open source, so anyone can contribute to its development\n", | ||
" \n", | ||
"* code that is as understandable as plain English\n", | ||
" \n", | ||
"* suitability for everyday tasks, allowing for short development times\n", | ||
"\n", | ||
"* named after Monty Python's Flying Circus\n", | ||
"\n", | ||
"From time to time you'll also hear that Python is an Object Orientated Programming (OOP) language. You don't have to worry about this when you're starting out and for the most part it will operate in a similar manner to other statistical and analytical software that you may have used.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## What makes Python so great?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"* It's free\n", | ||
"\n", | ||
"* It's syntactically simple\n", | ||
"\n", | ||
"* It's open source\n", | ||
"\n", | ||
"* It has an energised user community\n", | ||
"\n", | ||
"* Easy to get answers to questions (e.g. Stack Overflow)\n", | ||
"\n", | ||
"* Its capability can be extended with packages (also called libraries, modules, extensions etc.)\n", | ||
"\n", | ||
"* It can do so many things (e.g. Data, Statistics, Modeling, Visualisation, Web development...)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## I hear there are different versions of Python... What's up with that?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Yes, there are two versions of Python in use: Versions 2 and 3. The differences are subtle and when you're starting out this needn't concern you too much.\n", | ||
"\n", | ||
"It's fair to say that up until recently Version 2 was probably slightly more used, however Python 3 has overtaken it in terms of popularity and since we're picking up Python from scratch with this training, it's better to use version 3.\n", | ||
"\n", | ||
"Additionally the Python Data community (Pydata for short) are pushing people toward version 3 of the language." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Extending Python" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"It's safe to say that on its own Python is pretty useless for Data Science. However, over time the user community has developed a number of packages or libraries for Python. These extend the core functionality and enbable users to do a wider variety of things with it." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Good examples of packages for Python include:\n", | ||
"\n", | ||
"* Pandas for data analysis\n", | ||
"\n", | ||
"* Numpy for Mathmatical functions\n", | ||
"\n", | ||
"* Scipy for Scientific computing and machine learning\n", | ||
"\n", | ||
"* Matplotlib for data visualisation\n", | ||
"\n", | ||
"We'll be covering some of these later in the course!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Anaconda" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The Anaconda Installation gathers together 400 or so of the most used Python packages and libraries for data analysts and scientists and is pretty much a one stop shop for everything you need to get started.\n", | ||
"\n", | ||
"One of the drawbacks to open source software is that does tend to have compatability issues sometimes and this can lead to problems installing. Anaconda gets around all of this so it will be our starting point for our Python Journey." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## The REPL" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Interacting with Base Python is via the REPL (Read, Eval, Print, Loop). You can find the Python executable file by searching for 'Python.exe'. Double clicking the icon will bring up the REPL and after a few seconds you should see the REPL prompt:\n", | ||
"\n", | ||
"'>>>'\n", | ||
"\n", | ||
"If you really wanted to you could execute everything in this course via the REPL and whilst it has uses we're not really interested in it for the time being so just be aware that it's there for now!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Jupyter" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"the Jupyter Notebook is a browser based interface in which you can type and execute code as well as visualise your results.\n", | ||
"\n", | ||
"What you're reading now is a Jupyter Notebook and it's agreat way to mix up narrative, pictures, data, results and graphs whilst also signposting you to useful information elsewhere. The majority of the exercises in this course will take place in the Jupyter notebook. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## IDEs / Text Editors" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In addition to the Jupyter Notebook, you'll sometimes here people talk about IDEs and Text Editors.\n", | ||
"\n", | ||
"#### IDE\n", | ||
"\n", | ||
"IDE stands for 'Integrated Development Environment'. Essentially this is a more advanced interface for interacting with Python than the default Python Console (REPL). IDEs allow us to create, execute, debug and save scripts.\n", | ||
"\n", | ||
"#### Text Editor\n", | ||
"\n", | ||
"Text editors are just that - tools for creating, and saving scripts with limited debugging capability and no functionality to execute.\n", | ||
"\n", | ||
"Since Jupyter is an excellent interface for executing code, I generally just use a text editor to store code. When I want to execute code, I simply copy and paste across. This isn't for everyone though and my advice is to do whatever you feel comformtable with. You can quite easily get by using just Jupyter for now.\n", | ||
"\n", | ||
"Note that if you want to save down code outside of Jupyter, you'll need to give it a .py suffix. This will tell your system that it's a Python file.\n", | ||
"\n", | ||
"Some good text editors include <a href=\"https://atom.io/\">Atom</a>, <a href=\"http://brackets.io/\">Brackets</a>, <a href=\"http://www.sublimetext.com/2\">Sublime Text</a> and <a href = \"https://notepad-plus-plus.org\">Notepad++</a> which we'll also be using in this course and will be meeting a bit later.\n", | ||
"\n", | ||
"IDEs are more complicated to use, but a good starting point is <a href =\"https://pythonhosted.org/spyder/\">Spyder</a> (Scientific PYthon Development EnviRonment) comes as part of Anaconda and is aimed at analytical and scientific users.\n", | ||
"\n", | ||
"#### Your Cookbook\n", | ||
"\n", | ||
"Learning & remembering all the syntax of Python is difficult! This gets worse as you branch out into other languages...\n", | ||
"\n", | ||
"The first file you'll create will be your 'Cookbook'. This will contain all your useful code:\n", | ||
"\n", | ||
"* Things you do on a regular basis\n", | ||
"\n", | ||
"* Things you do on a not so regular basis\n", | ||
"\n", | ||
"* Code that you 'hack' and use from others\n", | ||
"\n", | ||
"* Shortcuts, reference, tips and tricks...\n", | ||
"\n", | ||
"* Lots of supporting notes!!!\n", | ||
"\n", | ||
"I find it a lot easier to save my cookbooks in a text editor, but again this is up to you. I also keep my cookbooks on my <a href = \"http://www.Github.com/Tommo565\">Github</a> account. Feel free to take a look and 'fork' it for your own use." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### Object Orientated Programming (OOP)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"You'll hear a lot about Python being an 'Object Orientated Programming' (OOP) language. This is a computer science term and isn't something you need to worry about too much in the scope of this course.\n", | ||
"\n", | ||
"For now it's important that you understand, that everything you meet in Python is an 'Object' in the sense that anything can be assigned to a variable or passed as an argument to a function. This may not make sense now, but hopefully will become clearer as the course progresses.\n", | ||
"\n", | ||
"There is some further reading on OOP below:\n", | ||
"\n", | ||
"<a href = \"https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/\">Classes and Object Orientated Programming</a>\n", | ||
"\n", | ||
"<a href = \"http://www.python-course.eu/object_oriented_programming.php\">Object Orientated Programming</a>\n", | ||
"\n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
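The closing cell of the B00 notebook says that everything in Python is an 'Object', in the sense that anything can be assigned to a variable or passed as an argument to a function. A minimal sketch of that idea is below. The names `double`, `twice` and `apply` are made up for illustration and are not part of the course material.

```python
def double(x):
    """Return twice the input."""
    return 2 * x


# A function is an object, so it can be assigned to another variable...
twice = double


def apply(func, value):
    """Call whatever function object is passed in on the given value."""
    return func(value)


# ...and passed as an argument to another function.
result = apply(twice, 21)

# Numbers, strings and functions all report a type, because each one is an object.
kinds = [type(obj).__name__ for obj in (42, "text", double)]

print(result)
print(kinds)
```

Here `twice` and `double` are two names for the same function object, which is why `apply(twice, 21)` behaves exactly like `double(21)`.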
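The 'Extending Python' and 'Anaconda' cells of the B00 notebook name pandas, NumPy, SciPy and Matplotlib as packages that extend the core language. One way to see whether a given installation actually provides them is sketched below, using the standard library's `importlib.util.find_spec`. The helper `installed_packages` and the list `PACKAGES` are hypothetical names chosen for this example, and the check only reports whether each package can be found, not which version is installed.

```python
import importlib.util

# Import names of the packages mentioned in the notebook (assumed list for this sketch).
PACKAGES = ["pandas", "numpy", "scipy", "matplotlib"]


def installed_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = installed_packages(PACKAGES)
for name, found in status.items():
    print(name, "is installed" if found else "is missing")
```

On a full Anaconda installation all four would be expected to show as installed. On a bare Python installation some or all may be missing.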