GitHub - SpicyChicken6/metabolomics_analysis_tools: tool development for bioinf576 class

Metabolomics analysis tools

---Introduction: Tool development for bioinf576 class---

The goal of this project is to implement a Python-based pipeline or package for metabolomics data analysis. I currently work with targeted metabolomics data in my lab, so a package that bundles common metabolomics data analysis tools will help with my work. These tools include:

data transformation
data normalization
data scaling
common statistical analyses, including PCA, MA plot, and volcano plot.

Although many packages already provide the functions mentioned above, implementing them myself will help me understand them better and, hopefully, analyze my data more effectively.

Package user guide (A sample implementation can be found here: Sample implementation)

---Package install---

Steps:

Clone or download the GitHub repository;
Open a terminal and change to the repository folder;
Run pip install dist/metabolomics_analysis_tools-0.1.0.tar.gz to install the package locally.
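
After the install step, one way to confirm that the package is visible to the current Python interpreter is a quick check with the standard library. This is only a sketch: the helper name is_installed is hypothetical and not part of the package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# Reports whether the locally installed package can be imported.
print(is_installed("metabolomics_analysis_tools"))
```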

---Sample data file---

The sample data file "human_cachexia.csv" is located at "metabolomics_analysis_tools/metabolomics_analysis_tools/resources/test_dataset/". After the package is installed, the included "read_data_file" function loads this sample data by default. The "file_path" parameter can be used to specify the location of a user-defined data file instead. A user-defined file should follow the same format as the sample data:

Column 1 is the patient ID;

Column 2 is the group assigned to each patient (note that this package focuses on single-factor analysis, which means there should be only two groups in the data);

Column 3 to the end are the levels of the individual metabolites.
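
The layout described above can be checked before loading a user-defined file. The following is a minimal sketch using pandas directly; the function check_layout and the column names in the example table are hypothetical and are not part of the package or of the sample file.

```python
import pandas as pd


def check_layout(df: pd.DataFrame) -> list:
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    if df.shape[1] < 3:
        problems.append("expected at least 3 columns (id, group, metabolites)")
        return problems
    # Column 2 holds the group label; single-factor analysis needs two groups.
    n_groups = df.iloc[:, 1].nunique()
    if n_groups != 2:
        problems.append(f"expected exactly 2 groups, found {n_groups}")
    # Columns 3 onward hold metabolite levels and should be numeric.
    metabolites = df.iloc[:, 2:]
    non_numeric = [
        col for col in metabolites.columns
        if not pd.api.types.is_numeric_dtype(metabolites[col])
    ]
    if non_numeric:
        problems.append(f"non-numeric metabolite columns: {non_numeric}")
    return problems


# Hypothetical table in the expected layout (made-up values for illustration).
example = pd.DataFrame({
    "patient_id": ["P1", "P2", "P3", "P4"],
    "group": ["case", "case", "control", "control"],
    "metabolite_a": [1.2, 3.4, 2.2, 0.9],
    "metabolite_b": [10.5, 8.1, 9.9, 12.3],
})
print(check_layout(example))  # prints []
```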

---To Use functions---

There are two groups of functions in this package, with one or more modules under each group:

"data_preprocessing":
data_reading (functions include: read_data_file)
normalization (functions include: normalize_by_sum, normalize_by_median, normalize_by_reference_sample_PQN)
scaling (functions include: data_scaling_mean_centered)
transformation (functions include: data_transformation_log)
"stats_analyses":
analyses (functions include: PCA_analysis, ma_plot, volcano_plot)

To use a function, for example read_data_file to read a data file, you can do:
import metabolomics_analysis_tools.data_preprocessing as dp
df = dp.data_reading.read_data_file()
You can then apply the other functions to the data you just loaded.
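
To make the preprocessing steps concrete, here is a standalone sketch of what sum normalization, median normalization, probabilistic quotient normalization (PQN), log transformation, and mean centering typically compute. It is written with pandas and NumPy and does not call the package: every function name ending in _sketch is hypothetical, the package's own signatures and defaults may differ, and the PQN version assumes the median profile across samples as the reference when none is given.

```python
import numpy as np
import pandas as pd


def normalize_by_sum_sketch(values: pd.DataFrame) -> pd.DataFrame:
    """Divide each sample (row) by its total signal."""
    return values.div(values.sum(axis=1), axis=0)


def normalize_by_median_sketch(values: pd.DataFrame) -> pd.DataFrame:
    """Divide each sample (row) by its median metabolite level."""
    return values.div(values.median(axis=1), axis=0)


def normalize_pqn_sketch(values: pd.DataFrame, reference=None) -> pd.DataFrame:
    """Simplified PQN: scale each sample by its median quotient to a reference."""
    if reference is None:
        # Assumption: use the median profile across samples as the reference.
        reference = values.median(axis=0)
    quotients = values.div(reference, axis=1)
    dilution = quotients.median(axis=1)  # one dilution factor per sample
    return values.div(dilution, axis=0)


def log_transform_sketch(values: pd.DataFrame, offset: float = 1.0) -> pd.DataFrame:
    """Log2-transform after adding an offset so zeros stay finite."""
    return np.log2(values + offset)


def mean_center_sketch(values: pd.DataFrame) -> pd.DataFrame:
    """Subtract each metabolite's (column's) mean."""
    return values - values.mean(axis=0)


# Made-up metabolite table: rows are samples, columns are metabolites.
values = pd.DataFrame(
    {"metabolite_a": [10.0, 20.0, 40.0], "metabolite_b": [30.0, 60.0, 120.0]},
    index=["s1", "s2", "s3"],
)
processed = mean_center_sketch(log_transform_sketch(normalize_pqn_sketch(values)))
print(processed)
```

In this toy table the three samples differ only by a dilution factor, so PQN maps them onto the same profile and the centered result is zero everywhere.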

---Expected results---

The file-reading step should return a pandas DataFrame;
After data transformation (for example, log transformation), the data should be closer to a normal distribution;
The statistical analysis step generates a plot for each analysis.
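
The quantities behind two of these plots can be sketched with NumPy alone. The example below computes PCA scores through a singular value decomposition and the M (log2 ratio) and A (mean log2 level) values used in an MA plot. It is an illustration under stated assumptions, not the package's implementation: the function names pca_scores and ma_values are hypothetical, the input data are randomly generated, and the p-values needed for a volcano plot are not computed here.

```python
import numpy as np


def pca_scores(values: np.ndarray, n_components: int = 2):
    """Project mean-centered data onto its first principal components."""
    centered = values - values.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return scores, explained[:n_components]


def ma_values(group_a: np.ndarray, group_b: np.ndarray):
    """Per-metabolite M (log2 ratio) and A (mean log2 level) of the group means."""
    log_a = np.log2(group_a.mean(axis=0))
    log_b = np.log2(group_b.mean(axis=0))
    return log_a - log_b, (log_a + log_b) / 2.0


# Randomly generated positive "metabolite levels" for two groups of samples.
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=2.0, sigma=0.5, size=(10, 4))
group_b = rng.lognormal(mean=2.0, sigma=0.5, size=(12, 4))

scores, explained = pca_scores(np.log2(np.vstack([group_a, group_b])))
m, a = ma_values(group_a, group_b)
print(scores.shape, explained.shape, m.shape, a.shape)
```

Plotting the two columns of scores against each other gives a PCA scatter plot, and plotting m against a gives an MA plot.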

