This project provides an in-depth analysis of Liver Cirrhosis using clinical data. The objective is to explore the relationship between various biomarkers (e.g., Bilirubin, Copper, Alkaline Phosphatase) and patient demographics (e.g., Age, Sex) to understand the progression of the disease.
- Data Exploration: Visualizing clinical indicators of liver cirrhosis.
- Biomarker Comparisons: Analysis of biomarkers across different disease stages.
- Gender Differences: Exploring male and female variations in disease progression.
- Age Trends: Statistical insights into age-related disease patterns.
- Survival Insights: Predicting survival rates and the likelihood of liver transplant for advanced stages.
- Installation Instructions
- Usage
- Data Overview
- Analysis
- Visualizations
- Contributors
- License
- Python 3.x
- Required Libraries:
  - pandas for data manipulation
  - matplotlib and seaborn for visualizations
  - scipy and numpy for statistical computations
- Clone the Repository: `git clone https://github.com/Bushra-Butt-17/Liver-Cirrhosis-Analysis.git`
- Install Dependencies: navigate to the project directory and install all required libraries with `cd Liver-Cirrhosis-Analysis && pip install -r requirements.txt`
- Run the Project: open the Jupyter Notebook or Python scripts with `jupyter notebook`
This project serves researchers, data scientists, and healthcare professionals by enabling them to:
- Examine correlations between biomarkers and liver disease progression.
- Predict outcomes like liver transplants or mortality.
- Use the provided codebase for clinical or educational purposes.
- Load the dataset (ensure the correct file path).
- Follow the structured steps in the notebook for:
  - Data cleaning.
  - Exploratory data analysis (EDA).
  - Statistical analysis and visualizations.
- Interpret the outputs to draw meaningful insights.
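The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the inline CSV stands in for the real dataset file (its path is not given here), the first two rows mirror this README's sample data while the remaining rows are made-up placeholders, and `clean` is a hypothetical helper name.

```python
import io

import pandas as pd

# Stand-in for the real file: replace with pd.read_csv("<path to your dataset>").
# Rows 1-2 mirror the README's sample data; the rest are made-up placeholders.
RAW_CSV = """Stage,Age,Sex,Bilirubin,Copper,Alk_Phos,SGOT,Albumin,Platelets,Prothrombin
1,45,M,2.0,100,150,80,3.5,300000,11.2
2,60,F,3.5,150,220,100,3.2,280000,10.8
3,52,f,4.1,,310,120,3.0,210000,11.5
3,52,f,4.1,,310,120,3.0,210000,11.5
4,67,M,,180,400,140,2.7,150000,12.3
"""


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: dedupe, normalize Sex, fill numeric gaps."""
    df = df.drop_duplicates().copy()
    df["Sex"] = df["Sex"].str.strip().str.upper()
    numeric_cols = df.select_dtypes("number").columns
    # Median imputation is one simple choice, not necessarily the notebook's.
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    return df.reset_index(drop=True)


df = clean(pd.read_csv(io.StringIO(RAW_CSV)))
print(df.describe())  # quick sanity check before moving on to EDA
```

Whether to impute or drop missing values depends on how much data is missing; median imputation is shown here only because it is short and robust to skewed lab values.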
The dataset contains clinical information for liver cirrhosis patients. Key columns include:
Feature | Description |
---|---|
Age | Patient's age. |
Sex | Gender (M/F). |
Stage | Stage of liver cirrhosis (1 to 4). |
Bilirubin | Serum Bilirubin levels (marker of liver function). |
Copper | Blood Copper levels (relevant for cirrhosis). |
Alk_Phos | Alkaline Phosphatase (enzyme indicating liver dysfunction). |
Albumin | Protein levels (lower levels suggest liver damage). |
SGOT | Serum Glutamic Oxaloacetic Transaminase (liver enzyme). |
Platelets | Platelet count (typically lower in cirrhosis). |
Prothrombin | Blood clotting time, impacted by liver function. |
Sample Data:
Stage | Age | Sex | Bilirubin | Copper | Alk_Phos | SGOT | Albumin | Platelets | Prothrombin |
---|---|---|---|---|---|---|---|---|---|
1 | 45 | M | 2.0 | 100 | 150 | 80 | 3.5 | 300,000 | 11.2 |
2 | 60 | F | 3.5 | 150 | 220 | 100 | 3.2 | 280,000 | 10.8 |
EDA techniques include:
- Descriptive statistics: Summary measures (mean, median, etc.).
- Outlier detection.
- Correlation analysis.
# Calculate average biomarkers by gender
df.groupby('Sex')[['Age', 'Bilirubin', 'Copper']].mean()
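The outlier-detection and correlation steps listed above could look something like the sketch below. It assumes the common 1.5 × IQR rule for flagging outliers (the notebook may use a different method), `iqr_outliers` is a hypothetical helper name, and the values are made up for illustration.

```python
import pandas as pd

# Made-up illustrative values; the last row is deliberately extreme.
df = pd.DataFrame({
    "Bilirubin": [0.8, 1.1, 1.4, 2.0, 2.3, 3.5, 25.0],
    "Copper": [60, 75, 90, 100, 120, 150, 400],
})


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


print(df.describe())                      # descriptive statistics
print(df[iqr_outliers(df["Bilirubin"])])  # rows flagged as Bilirubin outliers
print(df.corr(method="spearman"))         # rank correlation between biomarkers
```

Spearman correlation is used here because lab values are often skewed; the default Pearson method is a one-argument change.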
Advanced techniques to uncover patterns:
- T-tests: Comparing biomarker levels between two groups.
- ANOVA: Testing significant differences across multiple stages.
- Survival Analysis: Predicting outcomes like mortality or transplants.
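from scipy import stats
# Bilirubin values for the two stages being compared (missing values dropped)
stage_1 = df.loc[df['Stage'] == 1, 'Bilirubin'].dropna()
stage_3 = df.loc[df['Stage'] == 3, 'Bilirubin'].dropna()
# Independent two-sample t-test (assumes equal variances by default)
t_stat, p_value = stats.ttest_ind(stage_1, stage_3)
print(f"T-statistic: {t_stat:.3f}, P-value: {p_value:.4g}")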
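For comparing more than two stages at once, a one-way ANOVA is the usual counterpart to the t-test. The sketch below uses `scipy.stats.f_oneway` on made-up Bilirubin values; it illustrates the technique and is not taken from the notebook.

```python
import pandas as pd
from scipy import stats

# Made-up illustrative Bilirubin values for three stages.
df = pd.DataFrame({
    "Stage": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Bilirubin": [1.0, 1.2, 0.9, 2.1, 2.4, 1.9, 4.0, 4.5, 3.8],
})

# One array of Bilirubin values per stage, with missing values dropped.
groups = [g["Bilirubin"].dropna().to_numpy() for _, g in df.groupby("Stage")]

# One-way ANOVA: tests whether at least one stage mean differs from the others.
f_stat, p_value = stats.f_oneway(*groups)
print(f"F-statistic: {f_stat:.2f}, P-value: {p_value:.4g}")
```

A small p-value only says that some stage differs; a post-hoc test would be needed to identify which pairs of stages differ.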
Visualizations include:
- Stacked Bar Chart: Gender distribution across cirrhosis stages.
- Heatmap: Correlations between biomarkers.
- Boxplots: Variability of biomarkers by disease stage.
import matplotlib.pyplot as plt
import seaborn as sns
# Pairwise correlations between selected biomarkers
correlation_matrix = df[['Bilirubin', 'Copper', 'Alk_Phos']].corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Biomarker Correlation Heatmap')
plt.show()
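The stacked bar chart and boxplots listed above might be produced along these lines. This sketch uses pandas' built-in matplotlib plotting rather than seaborn to keep it short; the data rows are made up and the output filename is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up illustrative rows.
df = pd.DataFrame({
    "Stage": [1, 1, 2, 2, 2, 3, 3, 4],
    "Sex": ["M", "F", "F", "F", "M", "F", "M", "F"],
    "Bilirubin": [1.0, 1.3, 2.2, 2.0, 2.6, 3.9, 4.4, 6.1],
})

# Rows = stage, columns = sex, cells = patient counts.
counts = pd.crosstab(df["Stage"], df["Sex"])

fig, (ax_bar, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
counts.plot(kind="bar", stacked=True, ax=ax_bar, title="Sex distribution by stage")
ax_bar.set_ylabel("Patients")

# One box per stage showing the spread of Bilirubin values.
df.boxplot(column="Bilirubin", by="Stage", ax=ax_box)
ax_box.set_title("Bilirubin by stage")
fig.suptitle("")  # drop the automatic "Boxplot grouped by Stage" title
fig.tight_layout()
fig.savefig("cirrhosis_plots.png")  # hypothetical output filename
```

In a notebook, replacing the `savefig` call with `plt.show()` displays the figure inline.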
Feel free to contribute by:
- Reporting issues.
- Opening pull requests.
- Enhancing the analysis or visualizations.
This project is licensed under the MIT License. See the LICENSE file for details.