A collection of class notes and class projects from MOOCs I took.
Deep Learning notebook
- Optimizing a neural network with backward propagation
- Building deep learning models with Keras
- Fine-tuning Keras models
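
As a rough illustration of the backpropagation item above, the sketch below runs plain gradient descent on a single linear neuron with NumPy. The input values, target, learning rate, and variable names are made up for the example and are not taken from the notebook.

```python
import numpy as np

# Hypothetical toy data: one sample with three features and a scalar target.
input_data = np.array([1.0, 2.0, 3.0])
target = 0.0
weights = np.array([0.0, 2.0, 1.0])
learning_rate = 0.01


def squared_error(w):
    """Squared error of the linear prediction input_data @ w."""
    return float((input_data @ w - target) ** 2)


initial_error = squared_error(weights)
for _ in range(20):
    error = input_data @ weights - target
    # Gradient of (prediction - target)^2 with respect to the weights.
    gradient = 2 * input_data * error
    weights = weights - learning_rate * gradient
final_error = squared_error(weights)

print(initial_error, final_error)
```

Each step moves the weights against the gradient, so the squared error shrinks as long as the learning rate is small enough for the data at hand.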
Unsupervised Learning in Python notebook
- Clustering for dataset exploration: k-means clustering, Evaluating a clustering, Transforming features for better clusterings
- Visualization with hierarchical clustering (dendrogram) and t-SNE
- Decorrelating data and dimension reduction: "Principal Component Analysis" ("PCA"), PCA with sparse matrices
- Discovering interpretable features: dimension reduction technique called "Non-negative matrix factorization" ("NMF")
- Use NMF to build a recommendation system
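
To make the "decorrelating data" item above concrete, here is a minimal NumPy-only sketch of PCA via the singular value decomposition. The data is synthetic and the variable names are assumptions for illustration; the notebook itself may well use scikit-learn instead.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic correlated 2-D data (an assumption for the example).
x = rng.normal(size=500)
data = np.column_stack([x, 0.8 * x + 0.3 * rng.normal(size=500)])

# Center the data, then take the SVD; the rows of vt are the principal axes.
centered = data - data.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# Project onto the principal axes: the resulting features are uncorrelated.
transformed = centered @ vt.T
explained_variance = singular_values**2 / (len(data) - 1)

corr_before = np.corrcoef(data, rowvar=False)[0, 1]
corr_after = np.corrcoef(transformed, rowvar=False)[0, 1]
print(corr_before, corr_after)
```

The correlation between the two original columns is high, while the correlation between the projected columns is zero up to floating-point error. Dimension reduction then amounts to keeping only the leading columns of `transformed`.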
Fraud Detection in Python notebook
- Resampling methods for imbalanced data: oversampling, undersampling, SMOTE method
- Fraud detection using labeled data: supervised learning for fraud detection, Performance metrics for fraud detection, Adjusting algorithm weights, and Using ensemble methods to improve fraud detection
- Clustering methods for fraud detection (KMeans and MiniBatchKMeans), Elbow curve method to judge the right number of clusters, Assigning fraud versus non-fraud, and DBSCAN
- Incorporating text data into fraud detection
- Topic Modeling on Fraud: Latent Dirichlet Allocation (LDA)
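
The oversampling idea from the resampling item above can be sketched with NumPy alone. Note that this is plain random oversampling, which duplicates existing minority rows; SMOTE instead synthesizes new minority points by interpolating between neighbours. The data and names below are synthetic assumptions, not the notebook's dataset.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic imbalanced labels: 95 non-fraud (0) and 5 fraud (1) cases.
y = np.array([0] * 95 + [1] * 5)
X = rng.normal(size=(100, 2))

minority_idx = np.flatnonzero(y == 1)
majority_idx = np.flatnonzero(y == 0)

# Draw minority rows with replacement until both classes are the same size.
n_extra = len(majority_idx) - len(minority_idx)
extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
resampled_idx = np.concatenate([majority_idx, minority_idx, extra_idx])

X_resampled, y_resampled = X[resampled_idx], y[resampled_idx]
print(np.bincount(y_resampled))
```

Resampling is normally applied to the training split only, so that the test set keeps the original class balance.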
Data Visualization Class notebook
- Customizing 1D plots: apply ggplot style, reset style to default, add arrow to annotate a graph, rotate axis, legend
- Plotting 2D arrays: contour plot, 2D histogram, plot images, histogram and cumulative distribution function of a gray scale image, Equalizing an image histogram, Extracting bivariate histograms from a color image.
- Statistical plots with Seaborn: lmplot, residplot, regplot, jointplot, hue, violinplot, stripplot, swarmplot, pairplot, heatmap
- Analyzing time series: plot data with datetime index, multiple time slices, inset view
Interactive Visualization with Bokeh notebook
- Bokeh basics: marker options, drawing geometrical shapes using patch(), plotting a pandas dataframe in Bokeh, box_select tool, Hover tool, Colormap
- Building interactive apps with Bokeh: connect Bokeh widgets to Python code.
For example, generate a fit after the user selects a plot, or change the plotted data from a selection panel. Widget options include slider, select (dropdown), button, etc.
Time Series Analysis notebook
- Merging Time Series With Different Dates
- Correlation, autocorrelation function
- Linear Regression
- Random Walk
- Stationarity, autoregressive (AR) Models
- Moving Average (MA) Model
- ARMA model
- Cointegration Models
- A Multivariate Time Series
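
As a small illustration of the autocorrelation and AR items above, the sketch below simulates an AR(1) series with NumPy and checks its sample autocorrelation. The coefficient `phi`, the series length, and the helper `autocorr` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = 0.6   # assumed AR(1) coefficient; |phi| < 1 keeps the series stationary
n = 5000

noise = rng.normal(size=n)
series = np.zeros(n)
for t in range(1, n):
    # AR(1): each value is phi times the previous value plus fresh noise.
    series[t] = phi * series[t - 1] + noise[t]


def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


lag1 = autocorr(series, 1)
lag2 = autocorr(series, 2)
print(lag1, lag2)
```

For a stationary AR(1) process the theoretical autocorrelation at lag k is `phi ** k`, so the two printed values should sit near 0.6 and 0.36. Setting `phi = 1` would turn the same recursion into a random walk, which is not stationary.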
Machine Learning for Time Series Data in Python notebook
- Classifying heartbeat sounds: feature engineering and LinearSVC
- Regression on stock prices
- Feature engineering for time series data: envelope, tempogram, spectrogram, bandwidths, centroids
- Auto-regressive models
- Cross-validating time series data
- How to work with non-stationary data, and assessing model stability
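
Cross-validating time series differs from ordinary k-fold because the test block must come after the training block in time. The function below is a hypothetical, dependency-free sketch of an expanding-window split; the idea resembles scikit-learn's `TimeSeriesSplit`, though the exact splitter used in the notebook is not stated here.

```python
def time_series_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) where the test block follows the training block."""
    fold_size = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = i * fold_size
        test_end = train_end + fold_size
        # The training window grows each round; the test window slides forward.
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(time_series_splits(n_samples=10, n_splits=4))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

No shuffling takes place, so a model is never evaluated on observations that precede its training data.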
Analyzed data from the popular mobile game Cookie Cats. Used bootstrap analysis to compare the effect of a time pause at level 30 versus level 40 on user retention notebook
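
A bootstrap comparison of this kind can be sketched as follows. The retention flags are synthetic, and the group sizes, retention rates, and variable names are assumptions; none of the numbers are results from the Cookie Cats data.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic retention flags (True = player came back); NOT the real Cookie Cats data.
retained_a = rng.random(5000) < 0.45   # hypothetical group A
retained_b = rng.random(5000) < 0.40   # hypothetical group B

n_boot = 1000
# Resample each group with replacement and record the retention rate each time.
boot_a = rng.choice(retained_a, size=(n_boot, retained_a.size)).mean(axis=1)
boot_b = rng.choice(retained_b, size=(n_boot, retained_b.size)).mean(axis=1)

diff = boot_a - boot_b
ci_low, ci_high = np.percentile(diff, [2.5, 97.5])
prob_a_better = float((diff > 0).mean())
print(ci_low, ci_high, prob_a_better)
```

The share of bootstrap replicates in which group A retains more players than group B gives a direct, easy-to-explain measure of how consistently one variant outperforms the other.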
Statistical Analysis in Python: random number generators and hacker statistics, Bernoulli trials, Poisson distribution, normal distribution, exponential distribution, probability functions, generating bootstrap replicates, calculating bootstrap confidence intervals, pairs bootstrap, formulating and simulating a null hypothesis, pipeline for hypothesis testing, A/B testing, hypothesis test for a correlation coefficient notebook
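
Simulating a null hypothesis, as listed above, can be illustrated with a permutation test on the difference of two means. The samples are synthetic and the names are assumptions for the example; this is a sketch of the general technique, not the notebook's code.

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic samples with different means (assumed for the example).
sample_a = rng.normal(loc=0.0, scale=1.0, size=100)
sample_b = rng.normal(loc=1.0, scale=1.0, size=100)

observed = sample_b.mean() - sample_a.mean()
pooled = np.concatenate([sample_a, sample_b])

n_perm = 2000
perm_diffs = np.empty(n_perm)
for i in range(n_perm):
    # Under the null hypothesis the group labels are exchangeable, so shuffle them.
    shuffled = rng.permutation(pooled)
    perm_diffs[i] = shuffled[100:].mean() - shuffled[:100].mean()

# One-sided p-value: share of shuffled differences at least as large as the observed one.
p_value = float((perm_diffs >= observed).mean())
print(observed, p_value)
```

The pipeline is the same for other test statistics: state the null hypothesis, simulate data under it, compute the statistic for each simulation, and report how often the simulated value is at least as extreme as the observed one.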
Inferential Statistics notebook
- Variance, Covariance, and Correlation, Correlation tests: Pearson, Spearman rank, and Kendall Tau
- Chi-square Test of Independence
- McNemar test, Independent t-test, Paired Samples t-test, Welch's t-test, Wilcoxon Signed-Rank Test
- Analysis of Variance (ANOVA), ANOVA (2-way, N-way)
- Multiple Linear Regression, Logistic Regression