
Kejue Jia
- Postdoc Research Associate
Education
- B.S., Computer Science, Beijing University of Technology
- M.S., Computer Science, San Diego State University
- Ph.D., Bioinformatics and Computational Biology, Iowa State University
Research
Sequence Matching
Sequence matching sits at the very start of many computational analysis pipelines, so its accuracy determines the quality of every subsequent analysis. In his Ph.D. research, Kejue incorporated structural information into the derivation of amino acid substitution matrices and significantly improved the accuracy of sequence matching. In particular, for "twilight zone" sequences, the new substitution matrix achieves major gains in the agreement between sequence matching and structure alignment (see below).

Protein Evolution and High Order Sequence Correlations
Correlations between positions in protein sequences reflect the evolutionary dependencies among residue sites. Kejue aims to push dependence detection methods beyond pairwise correlations to higher-order correlations, which are more natural for a highly packed molecule such as a protein. This project is motivated by the immediate need, and the long-standing challenge, of revealing and understanding the complex dependencies within protein structures (see below).
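As a rough illustration of what a pairwise correlation between residue sites can look like, the sketch below computes the mutual information between two columns of a toy multiple sequence alignment. Mutual information is a commonly used pairwise measure, but the source does not say which measure Kejue's work uses. The function name `column_mutual_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Each column is a string with one residue per aligned sequence.
    Illustrative pairwise measure only; not taken from the source.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must cover the same sequences")
    n = len(col_i)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_i = Counter(col_i)               # residue counts at site i
    count_j = Counter(col_j)               # residue counts at site j
    count_ij = Counter(zip(col_i, col_j))  # joint residue-pair counts

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up): sites 0 and 1 co-vary, site 2 is fully conserved.
toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
columns = ["".join(seq[k] for seq in toy_alignment) for k in range(3)]

mi_covarying = column_mutual_information(columns[0], columns[1])
mi_conserved = column_mutual_information(columns[0], columns[2])
```

In this toy case the two co-varying sites share one bit of information, while a fully conserved site carries none. A higher-order measure would consider three or more sites jointly, which this sketch does not attempt.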

RNA Structure Prediction
As a postdoc, Kejue has extended his research to RNA. He is working to establish the connection between RNA sequences, their dynamics, and their alternative conformations (see below). This project is expected to yield improved predicted protein and RNA structures and their co-structures, as well as a deeper understanding of RNA evolution.

Machine Learning
Kejue also embraces newly developed artificial intelligence approaches. Together with fellow lab member Mesih Kilinc, he developed a fast and accurate protein homolog detection tool based on embeddings from protein sequence language models. The tool found homologs with known functions for 800 uncharacterized human proteins, and these matches were confirmed to be similar using structures predicted by AlphaFold.
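The general idea of embedding-based homolog search can be sketched as ranking database proteins by the similarity of their embedding vectors to a query embedding. This is a minimal sketch, not the tool's actual method, which the source does not describe. It assumes each protein has already been reduced to a fixed-length vector, uses cosine similarity as the comparison, and works on made-up toy vectors rather than real language model output. The names `cosine_similarity` and `rank_homolog_candidates` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_homolog_candidates(query_embedding, database, top_k=3):
    """Return up to top_k (name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings" (made up for illustration only).
toy_database = {
    "protein_A": [0.9, 0.1, 0.0],
    "protein_B": [0.0, 1.0, 0.0],
    "protein_C": [0.7, 0.7, 0.1],
}
query = [1.0, 0.0, 0.0]
hits = rank_homolog_candidates(query, toy_database, top_k=2)
```

Here `hits` lists `protein_A` first and `protein_C` second, because their toy vectors point most nearly in the same direction as the query. A real pipeline would also need a way to produce the embeddings and a similarity threshold for calling a hit, both of which are outside this sketch.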
Software
During his studies, Kejue built highly parallelizable pipeline management software that he uses in his research. The package lets him run intensive computations on very large datasets (typically hundreds of GB at a time). The software is open source and available for download at https://github.com/jkjium/contactGroups.