
This project implements in Python some common statistical analysis methods used in data analysis, including Entropy, Mutual Information, Kolmogorov–Smirnov test, Kullback-Leibler divergence (KLD), AB tests (Mann-Whitney U and t-tests)


dple/Statistical_data_analysis


Statistical Data Analysis

This project implements some common statistical data analysis methods, including:

1. T-test

Given two samples, perform a t-test to determine whether there is a significant difference between their means. This test is mostly used when both samples are approximately normally distributed. The statistic below is the unequal-variance (Welch) form.

Formula:
t = (mean1 - mean2) / sqrt(sem1**2 + sem2**2)

mean1, mean2: means of the two samples
sem1, sem2: standard errors of the means of the two samples

2. Mann-Whitney U test

The Mann-Whitney U test is a rank-based two-sample hypothesis test that can be used when the distributions of the samples are unknown.
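As a rough illustration of the two A/B tests, the sketch below computes both statistics with the standard library only. The function names `t_statistic` and `mann_whitney_u` are hypothetical and are not taken from this repository. The t statistic adds the two squared standard errors in the denominator, and the U statistic is obtained by direct pair counting, which is an assumption about one simple way to compute it.

```python
import math
import statistics


def t_statistic(a, b):
    """t = (mean1 - mean2) / sqrt(sem1**2 + sem2**2), the unequal-variance form."""
    sem_a = statistics.stdev(a) / math.sqrt(len(a))  # standard error of mean of a
    sem_b = statistics.stdev(b) / math.sqrt(len(b))  # standard error of mean of b
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sem_a**2 + sem_b**2)


def mann_whitney_u(a, b):
    """U statistic of sample a: number of pairs with a_i > b_j, ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]

t = t_statistic(group_a, group_b)
u_a = mann_whitney_u(group_a, group_b)
u_b = len(group_a) * len(group_b) - u_a  # the two U values sum to n1 * n2

print(f"t = {t:.3f}, U_a = {u_a}, U_b = {u_b}")
```

Neither function returns a p-value. Turning the statistics into a decision requires the corresponding reference distribution, which `scipy.stats.ttest_ind(..., equal_var=False)` and `scipy.stats.mannwhitneyu` provide if SciPy is available.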

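3. Kullback–Leibler divergence (KLD)

KLD, also called relative entropy, measures how much a probability distribution P differs from a reference distribution Q. It is not symmetric: KLD(P, Q) and KLD(Q, P) generally differ.

Formula:
D = sum(P(i) * log(P(i) / Q(i)))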
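A minimal sketch of relative entropy for two discrete distributions given as lists of probabilities over the same outcomes. The name `kl_divergence` is hypothetical, and the handling of zero probabilities (skip terms with P(i) = 0, return infinity when Q(i) = 0 but P(i) > 0) is the usual convention, assumed here rather than taken from this repository.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) = sum(p_i * log(p_i / q_i)), in nats."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # a term with p_i = 0 contributes nothing
        if q_i == 0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q))  # divergence of P from Q
print(kl_divergence(q, p))  # a different value: KLD is not symmetric
```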

4. Empirical Cumulative Distribution Functions (CDF) distance

Distance between the empirical CDFs of two samples, with no assumption about their underlying distributions.

Formula:
D = sum((CDF(X) - CDF(Y))**2)
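The formula above can be sketched as follows. The text does not say at which points the two CDFs are compared, so this example assumes they are evaluated at every distinct value in the pooled samples. The names `ecdf` and `cdf_distance` are hypothetical.

```python
def ecdf(sample, t):
    """Empirical CDF: fraction of the sample that is <= t."""
    return sum(1 for v in sample if v <= t) / len(sample)


def cdf_distance(x, y):
    """Sum of squared ECDF differences over the pooled distinct values."""
    points = sorted(set(x) | set(y))  # assumed evaluation grid
    return sum((ecdf(x, t) - ecdf(y, t)) ** 2 for t in points)


print(cdf_distance([1, 2, 3], [1, 2, 3]))  # identical samples give 0.0
print(cdf_distance([1, 2], [3, 4]))        # non-overlapping samples
```

Because the sum runs over the pooled values, the result grows with the number of distinct points, so distances are only comparable between pairs of samples of similar size.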

5. Entropy

Calculate entropy by binning. The number of bins is determined by different rules, depending on the variable's distribution.
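A minimal sketch of entropy by binning. The text does not name the binning rules it uses, so this example assumes Sturges' rule (`ceil(log2(n)) + 1` bins) as the default and lets the caller override it. The name `binned_entropy` and its `bins` parameter are hypothetical.

```python
import math


def binned_entropy(values, bins=None):
    """Shannon entropy (in bits) of values after equal-width binning."""
    n = len(values)
    if bins is None:
        bins = math.ceil(math.log2(n)) + 1  # Sturges' rule, an assumed default
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant variable carries no information
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


print(binned_entropy([0, 0, 1, 1], bins=2))  # two equally filled bins
```

Other rules, such as Freedman-Diaconis, choose the bin width from the spread of the data and tend to suit skewed or heavy-tailed variables better than Sturges' rule.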

6. Kolmogorov–Smirnov (KS) Test

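The KS test can be used to compare two samples without assuming a parametric distribution. Its statistic is the largest absolute difference between the two empirical CDFs.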

Formula:
D = max(abs(CDF(X) - CDF(Y)))
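The KS statistic above can be sketched with the standard library alone. Since both empirical CDFs are step functions that only change at sample values, evaluating them at the pooled distinct values is enough to find the maximum. The names `ecdf` and `ks_statistic` are hypothetical.

```python
def ecdf(sample, t):
    """Empirical CDF: fraction of the sample that is <= t."""
    return sum(1 for v in sample if v <= t) / len(sample)


def ks_statistic(x, y):
    """D = max(abs(CDF(X) - CDF(Y))) over the pooled distinct values."""
    points = sorted(set(x) | set(y))
    return max(abs(ecdf(x, t) - ecdf(y, t)) for t in points)


print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # partially overlapping samples
print(ks_statistic([1, 2], [3, 4]))              # fully separated samples
```

This returns only the statistic D. If SciPy is available, `scipy.stats.ks_2samp` also returns a p-value.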

7. Mutual Information

Calculate the mutual information between X and Y from their single and joint entropies: https://en.wikipedia.org/wiki/Mutual_information#Relation_to_conditional_and_joint_entropy

Formula: I(X, Y) = H(X) + H(Y) - H(X, Y)
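The identity above can be sketched for discrete variables by counting value frequencies. This example assumes X and Y are already discrete (continuous data would need to be binned first), and the names `entropy` and `mutual_information` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X, Y) = H(X) + H(Y) - H(X, Y), with H(X, Y) taken over paired values."""
    joint = list(zip(x, y))  # each (x_i, y_i) pair is one joint outcome
    return entropy(x) + entropy(y) - entropy(joint)


x = [0, 0, 1, 1]
print(mutual_information(x, [0, 0, 1, 1]))  # Y determines X completely
print(mutual_information(x, [0, 1, 0, 1]))  # X and Y are independent here
```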
