Toronto Metropolitan University
Franklin_Daniel.pdf (1.8 MB)

Structural Classification of Proteins Using Image Based Machine Learning

thesis
posted on 2022-11-03, 16:57 authored by Daniel Franklin

Classification of proteins is an important area of research that enables better grouping of proteins by their function, their evolutionary similarities, or their structural makeup. This thesis focuses on structural classification. We use visualizations of proteins to build a machine learning class prediction model that successfully classifies proteins within the Structural Classification of Proteins (SCOP) framework. SCOP is a well-researched classification, and many existing approaches represent a protein’s secondary structure as a linear chain of structures. This thesis takes a novel approach: it renders a three-dimensional visualization of the protein itself and then applies image-based machine learning to determine the protein’s SCOP classification. The resulting convolutional neural network (CNN) method achieves average accuracies in the range of 78-87% on the 25PDB dataset, which is better than or equal to existing methods.
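
As a rough illustration of the image-to-class idea described in the abstract, the sketch below runs a single forward pass of a very small CNN-style classifier (convolution, ReLU, global average pooling, linear layer, softmax) in plain Python. It is not the network from the thesis: the layer layout, the 16x16 image size, the random untrained weights, and the names `conv2d`, `relu_global_avg`, and `predict` are all assumptions made for illustration. The four class labels are the top-level SCOP structural classes commonly used with 25PDB, which is also assumed here rather than stated in the abstract.

```python
import math
import random

# Assumption: four top-level SCOP structural classes (not listed in the abstract).
SCOP_CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]

def conv2d(image, kernel):
    """Valid 2D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu_global_avg(feature_map):
    """Apply ReLU, then average the whole map down to one scalar."""
    values = [max(0.0, v) for row in feature_map for v in row]
    return sum(values) / len(values)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(image, kernels, weights, biases):
    """Forward pass: conv -> ReLU -> global average pool -> linear -> softmax."""
    features = [relu_global_avg(conv2d(image, k)) for k in kernels]
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SCOP_CLASSES[best], probs

rng = random.Random(0)
# Hypothetical stand-in for a rendered protein image: 16x16 grayscale in [0, 1].
image = [[rng.random() for _ in range(16)] for _ in range(16)]
# Untrained, randomly initialised parameters (illustrative only).
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(4)]
weights = [[rng.uniform(-1, 1) for _ in range(4)]
           for _ in range(len(SCOP_CLASSES))]
biases = [0.0] * len(SCOP_CLASSES)

label, probs = predict(image, kernels, weights, biases)
print(label, [round(p, 3) for p in probs])
```

Because the weights are random, the predicted label carries no meaning; a real model would learn the kernels and weights from labelled renderings and would typically be built with a deep learning framework rather than by hand.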

Language

eng

Degree

  • Master of Science

Program

  • Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

  • Thesis
