Modeling Analyses on the Success Rate of Purification of Saccharomyces cerevisiae Proteins


Saccharomyces cerevisiae is the most widely used yeast in research and industry; however, the downstream processes for its protein production are costly. This study sought a simple way to predict the success rate of protein purification from amino acid features. Logistic regression and neural network models were used to test each of 535 amino acid features, one at a time, against the purification state of 1294 expressed proteins from S. cerevisiae, of which 870 were purified.
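The one-feature-at-a-time test can be illustrated with a minimal Python sketch. It is not the study's implementation: the data are synthetic, the two feature names ("informative" and "noise") are hypothetical, and the gradient-descent fit, learning rate, and epoch count are assumptions chosen for brevity. Only the class sizes (870 purified, 424 not purified) come from the text, and accuracy is measured on the training data itself.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(xs, ys, lr=0.5, epochs=300):
    """Fit p(purified | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, xs, ys):
    """Fraction of proteins whose predicted state (threshold 0.5) matches the label."""
    hits = sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(ys)


# Synthetic stand-in data (NOT the study's data): 870 purified, 424 not purified.
rng = random.Random(0)
labels = [1] * 870 + [0] * 424
features = {
    # Hypothetical informative feature: its mean differs between the two classes.
    "informative": [rng.gauss(1.0 if y else -1.0, 1.0) for y in labels],
    # Hypothetical uninformative feature: same distribution for both classes.
    "noise": [rng.gauss(0.0, 1.0) for _ in labels],
}

# Test each feature on its own against the purification state.
results = {}
for name, xs in features.items():
    w, b = fit_logistic_1d(xs, labels)
    results[name] = accuracy(w, b, xs, labels)

for name, acc in results.items():
    print(f"{name}: training accuracy = {acc:.3f}")
```

In this toy setting the informative feature should score clearly above the noise feature, which can do little better than always predicting the majority class. A real analysis would repeat the loop over all 535 features and evaluate on held-out proteins.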

The results show that the neural network predicts better than logistic regression. Some amino acid features are useful for predicting the purification tendency of proteins, and the varying amino acid features perform better, as shown by very high sensitivity accompanied by low specificity. Moreover, S. cerevisiae proteins with a high predictable portion of amino acid pairs are predicted more accurately than those with a low predictable portion.
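To make the terms concrete, the sketch below computes sensitivity, specificity, and accuracy from true and predicted labels. The function name is hypothetical and the example classifier is deliberately trivial: it labels every protein as purified. It is not one of the study's models; only the counts (870 purified out of 1294) are taken from the text.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels (1 = purified)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # purified proteins correctly predicted as purified
    specificity = tn / (tn + fp)  # unpurified proteins correctly predicted as unpurified
    acc = (tp + tn) / len(y_true)
    return sensitivity, specificity, acc


# Class sizes from the text: 870 purified and 1294 - 870 = 424 not purified.
y_true = [1] * 870 + [0] * 424
# Trivial illustrative classifier: predict "purified" for every protein.
y_pred = [1] * 1294

sens, spec, acc = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
# prints: sensitivity=1.000 specificity=0.000 accuracy=0.672
```

Because about two thirds of the proteins were purified, even this trivial rule reaches an accuracy of 870/1294 ≈ 0.672, so sensitivity and specificity need to be read together when judging a feature.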

Thus, the success rate of purification of S. cerevisiae proteins can be predicted with a neural network based on protein sequence information. This simple prediction process can give an estimate of the probability that a protein will be purified, which should help to avoid blind experiments and enhance the production of designed proteins.

Saccharomyces cerevisiae is the most useful yeast for humans and has been widely used since ancient times in winemaking, baking, and brewing. Over the past two decades, efforts have been made to reduce alcohol levels in wines through rational and evolutionary engineering of S. cerevisiae, in the interest of consumer health, prevention policies, fermentation effectiveness, and the sensory quality of wine.

On the other hand, genetic engineering has given non-conventional yeast species unusual tolerance so that they can produce high yields of liquid fuels and commodity chemicals from lignocellulosic biomass. As a unicellular eukaryotic model, S. cerevisiae is one of the most intensively studied organisms in molecular and cell biology and has generated major breakthroughs in the understanding of cellular and molecular mechanisms. Although it has long been used in fundamental and applied research, interest in S. cerevisiae has recently increased rather than declined. For example, S. cerevisiae is used as a model to study Alzheimer’s disease, Parkinson’s disease, and mitochondrial diseases.


Journal Submissions

Enzyme Engineering (ISSN: 2329-6674; Impact Factor: 1*) welcomes submissions of cutting-edge research in the field of enzymology. Unsolicited manuscripts, including research articles, commentaries, and other reports, will also be considered for publication and should be submitted either online or by mail.
