Alternative pre-mRNA splicing is a tightly regulated post-transcriptional process that amplifies the coding potential of the genome by co-expressing multiple transcript variants of the same gene. Its investigation at the level of single cells has been challenging thus far because of the limitations of single-cell sequencing technology, and current alternative splicing tools have inaccurately reported that most multi-exonic genes tend to express a single isoform at a time. We aim to develop new computational tools that use novel statistical approaches to accurately quantify levels of exon splicing, and to apply them to study changes in alternative splicing during subclass-specific neurogenesis.
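As a point of reference, exon splicing levels are commonly summarised as percent spliced in (PSI): the fraction of informative reads that support inclusion of an exon. The sketch below is a generic baseline, not the statistical approach we are developing. It assumes one inclusion and one exclusion junction read count per exon per cell, and it uses a Beta prior (the hypothetical parameters `prior_a` and `prior_b`) to shrink estimates from cells with very few reads.

```python
def psi_estimate(inclusion_reads, exclusion_reads, prior_a=1.0, prior_b=1.0):
    """Posterior mean of PSI under a Beta(prior_a, prior_b) prior.

    inclusion_reads: junction reads supporting exon inclusion (assumed input)
    exclusion_reads: junction reads supporting exon skipping (assumed input)
    prior_a, prior_b: illustrative prior pseudo-counts, both must be > 0
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    if prior_a <= 0 or prior_b <= 0:
        raise ValueError("prior pseudo-counts must be positive")
    total = inclusion_reads + exclusion_reads + prior_a + prior_b
    return (inclusion_reads + prior_a) / total


if __name__ == "__main__":
    # A well-covered exon and an exon with no reads in this cell.
    print(psi_estimate(9, 1))
    print(psi_estimate(0, 0))
```

With no reads, the estimate falls back to the prior mean rather than reporting an exon as fully included or fully skipped, which illustrates why low single-cell coverage can make genes appear to express a single isoform at a time.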
Nonsense-mediated mRNA decay (NMD) is a highly conserved quality control mechanism that enforces the accuracy of gene expression by clearing transcripts harboring premature termination codons. The success of many biotechnology and biomedical applications, such as CRISPR-Cas9 gene knockout systems and cancer immunotherapies, relies on optimal NMD activity. The efficacy of NMD is highly variable between biological systems, but little is known about its regulatory mechanisms. We aim to achieve a broader understanding of the predictors of NMD activity using advanced statistical and machine learning models.
We recently published {factR}, a suite of bioinformatics tools for the functional annotation of novel transcripts detected by next- and third-generation sequencing experiments. At its core, {factR} builds the coding sequences of newly identified mRNA isoforms and determines their functional consequences based on predicted protein domain architectures and NMD-triggering features. We believe that this robust and easy-to-use software will help us to better understand the complexity of the cellular transcriptome. {factR} is actively maintained on R/Bioconductor (https://bioconductor.org/packages/release/bioc/html/factR.html).
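One widely used NMD-triggering feature is the so-called 50-nucleotide rule: a stop codon lying more than roughly 50 nucleotides upstream of the last exon-exon junction is treated as premature. The Python sketch below illustrates that heuristic only. It is not {factR} code and does not use the {factR} API; the function name, its arguments, and the default threshold are assumptions made for illustration.

```python
def is_nmd_candidate(exon_lengths, stop_codon_end, threshold=50):
    """Apply the 50-nucleotide rule to one transcript (illustrative only).

    exon_lengths: exon lengths in 5'-to-3' transcript order (assumed input)
    stop_codon_end: 1-based transcript position of the last base of the
        stop codon (assumed input)
    threshold: minimum distance, in nucleotides, between the stop codon
        and the last exon-exon junction for the stop to count as premature
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # Position of the last base before the final exon, i.e. the last junction.
    last_junction = sum(exon_lengths[:-1])
    return (last_junction - stop_codon_end) > threshold


if __name__ == "__main__":
    exons = [100, 100, 100]  # last junction after transcript position 200
    print(is_nmd_candidate(exons, stop_codon_end=120))  # stop 80 nt upstream
    print(is_nmd_candidate(exons, stop_codon_end=180))  # stop 20 nt upstream
    print(is_nmd_candidate(exons, stop_codon_end=250))  # stop in the last exon
```

Working in transcript coordinates keeps the check independent of strand and genomic position; a real annotation workflow would first have to derive the coding sequence and stop codon position for each novel isoform.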