johnwslee/Data_Science_Libraries

Personal Collections of Data Science Libraries

This repository collects Jupyter notebooks that demonstrate Python libraries I have found useful for data science. Most of the code was collected from articles on Medium, an online publishing platform, for personal study and practice. In some cases, I modified the code to make sure it actually works.

List of Libraries

1. Python & Linux

  - General Tips, PRegEx, Pathlib, PyCircular, Decorators, OpenCV, make & Makefile, Watchdog

2. Machine Learning

  2.1. Data Preparation

   - Pandas, NumPy, FiftyOne, PySpark, Upgini, Synthetic Dataset

  2.2. Data Visualization

   - Sweetviz, Matplotlib/Plotly/Seaborn, PyGWalker

  2.3. Models and Algorithms

   - scikit-learn, Mahalanobis Distance, Open3D, PyMLPipe, Reinforcement Learning, Predictive Maintenance

  2.4. Web Apps

   - Dash, Streamlit, Gradio, Modelbit, PyScript

3. Deep Learning

  3.1. PyTorch

   - Basics

   - CNN: Binary Classification, 1D/2D Comparison, Transfer Learning, Multi-Class Classification, Multi-Label Classification

   - Vision: Image Captioning, Image Segmentation, Object Detection

   - Generative AI: DPDM

   - Advanced Topics: Temporal Fusion Transformer, Physics-Informed NN, Graph Neural Network, Transformer

  3.2. TensorFlow

   - Basics, TensorBoard, Autoencoder

  3.3. HuggingFace

   - How to Use HuggingFace

4. Models and Tools for Time Series

  - Darts, tslearn

5. Survival Analysis

  - lifelines

6. Natural Language Processing

  - NLTK

7. Hyperparameter Optimization

  - Optuna

8. Explainable AI

  - SHAP, Grad-CAM

9. Low-Code Machine Learning

  - PyCaret

10. Statistics

  10.1. General

   - Statistical Testing Flowchart, Distributions and Collinearity, Categorical Correlation, A/B Testing with Resampling/Bootstrapping, Power Analysis

  10.2. Bayesian Statistics

   - PyMC, PyStan, BNLearn

  10.3. Markov Chain

   - Markov Chain, hmmlearn

11. Web Scraping

  - BeautifulSoup/Selenium/Wordcloud

12. Large Language Models

  - Jupyter_AI/Pandas_AI/Langchain
