UCSanDiegoX edX Course DSE210x Statistics and Probability in Data Science using Python https://courses.edx.org/courses/course-v1:UCSanDiegoX+DSE210x+3T2017/course/
Welcome to Statistics and Probability in Data Science using Python!
We are delighted to welcome you. In this course, you will learn the motivation, intuition, and theory behind the probabilistic and statistical foundations of data science, and you will experiment and practice with these concepts using Python programs and the Jupyter Notebook platform.
Course Staff
Instructors
● Alon Orlitsky, Professor, ECE and CSE Departments, UC San Diego
● Yoav Freund, Professor, CSE Department, UC San Diego
Teaching Assistants
● Matthew Elliot, Graduate Student, CSE, UC San Diego
● Rohit Parasnis, Graduate Student, ECE, UC San Diego
● Hanwen Yao, Graduate Student, ECE, UC San Diego
● Zhen Zhai, Graduate Student, CSE, UC San Diego
What do you need to know to succeed?
The course is intended for learners with an undergraduate degree, and for senior undergraduates, who are interested in broadening their understanding of probability and statistics. We assume basic knowledge of the following topics:
● Logic (e.g., De Morgan’s Laws)
● Set theory (e.g., what functions are)
● Calculus (e.g., calculating integrals and derivatives)
● Programming (e.g., basic experience with any programming language)
● Linear algebra (e.g., vectors and matrices)
The Python programming language will be used throughout the course. If you would like to learn Python or gain more practice with it, please consider viewing or taking the first course in this MicroMasters, Python for Data Science.
Overview
The course will cover the following topics:
● Counting and combinatorics
● Discrete and continuous probability
● Conditional probability and Bayes’ Rule
● Random variables
● Expectation, variance, and correlation
● Common distribution families
● Probabilistic inequalities and concentration
● Moments and limit theorems
● Hypothesis testing
● Sampling and confidence intervals
● PCA and regression
● Entropy and compression
Learning Objectives
The course will teach you how to visualize, understand, and reason about probabilistic and statistical concepts, and how to apply that knowledge to analyze data sets and draw meaningful conclusions from data. We will cover both theoretical and practical aspects, starting each topic with motivation and intuition and proceeding to rigorous arguments and provable techniques. Each topic will be accompanied by a Python Notebook that you can run and modify to experiment with the material and get a better feel for it.
Course Outline
The course consists of 10 units. In each of the course’s first 10 weeks, we will release one unit, and you will have six weeks to complete it.
● Week 1 - Introduction
● Week 2 - Sets
● Week 3 - Counting and Combinatorics
● Week 4 - Probability and Conditioning
● Week 5 - Random Variables, Expectation, and Variance
● Week 6 - Discrete and Continuous Distribution Families
● Week 7 - Inequalities and Concentration Theorems
● Week 8 - Sampling, Confidence Intervals, and Hypothesis Testing
● Week 9 - Regression and Principal Component Analysis
● Week 10 - Entropy and Compression