SCOP

Predicting classes of protein sequences

We are planning to develop a Structural Classification of Protein Sequences (SCOP) application. The main purpose of this project is to predict the classes of given protein sequences. The SCOP app will be used to predict the class of a given protein sequence, which will be provided in the form of a PDB ID or PSSM file. We aim to predict the top level of SCOP hierarchical classification (Protein fold classes), which is further classified into folds, superfamilies, and families.

A motivation for this classification is to determine the evolutionary relationship between proteins. Proteins with the same shapes but little sequence or functional similarity are placed in different superfamilies and are assumed to have only a very distant common ancestor. Proteins having the same shape and some sequence and/or functional similarity are placed in "families," and are assumed to have a closer common ancestor. This represents the top-level "root" of the SCOP hierarchical classification.

Conclusion

The broadest groups in SCOP are the protein fold classes. These classes group structures with similar secondary structure compositions but different overall tertiary structures and evolutionary origins. This represents the top-level "root" of the SCOP hierarchical classification.

The project is developed using the Python language and Flask. We implemented this project in Google Colab. We can predict up to 4 classes:

All alpha proteins: Domains consisting of α-helices All beta proteins: Domains consisting of β-sheets Alpha and beta proteins (a/b): Mainly parallel beta sheets (beta-alpha-beta units) Alpha and beta proteins (a+b): Mainly antiparallel beta sheets (segregated alpha and beta regions) After performing various comprehensive tests, we conclude that our project is functioning correctly. However, there is always room for improvement and learning. This was our first ML project, and indeed, we learned something new from it. We are now looking forward to overcoming the existing limitations and adding new possible extensions, which are discussed in the next section.

Project report for this SCOP application https://github.com/PrincyGajera/SCOP/blob/main/Docs/Project_Report.pdf

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Docs		Docs
Models		Models
Prediction		Prediction
Preprocessing		Preprocessing
SCOP_App		SCOP_App
Final_model.ipynb		Final_model.ipynb
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SCOP

About

Releases

Packages

Languages

License

PrincyGajera/SCOP

Folders and files

Latest commit

History

Repository files navigation

SCOP

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages