Synthetic biology has become integral to the development of high-value products and pharmaceuticals. Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and other yeast species have been harnessed as microbial cell factories owing to the success of metabolic engineering and synthetic biology. It is therefore becoming increasingly vital to understand the physiology and metabolism of yeast species in detail. This thesis focuses on the use of machine learning and comparative genomics to gain deeper insight into the traits and metabolism of yeast species.
Several machine learning approaches were implemented to address three biological prediction problems: gene essentiality, enzyme turnover numbers (kcat), and protein production. For gene essentiality, combining evolutionary features substantially improved the prediction of essential genes. Furthermore, a high-quality deep learning model was established to predict kcat values by combining a graph neural network with a convolutional neural network. Additionally, random forest algorithms were adopted to assess feature importance for protein production, and post-translational modifications were found to have a greater impact than amino acid composition.
Comparative genomics was used to assess horizontal gene transfer (HGT) events and identify horizontally acquired genes. In addition, the mechanistic basis of substrate utilization in yeast species was probed through systematic evolutionary analysis and metabolic model reconstruction.
This research was led by [Person mentioned in the article], a prominent figure in systems biology who works to advance the knowledge of yeasts. With the support of the [company mentioned in the article], the researchers developed novel machine learning techniques and comparative genomics tools to uncover the inner workings of yeast species and fill existing knowledge gaps. This work could lead to a greater understanding of cellular metabolism and support the production of valuable commodities.