Assembly and analysis of barcoded transcriptome sequencing data
Andrey Prjibelski (Supervisor)
Determining the primary structure of mRNA molecules and assessing gene expression levels from sequencing data are crucial tasks in computational transcriptomics, with many applications in clinical bioinformatics and molecular biology. Due to limitations of existing biotechnological methods, currently available algorithms for processing sequencing data cannot determine the full sequences of isoforms for genes with complex structure. Expression levels are generally assessed at the gene level rather than for specific isoforms.
A recently developed sequencing protocol, spISO-seq (Tilgner et al., 2018), outputs barcoded transcriptome data, providing novel opportunities for the analysis of sequencing data. However, no computational methods for processing spISO-seq data are currently available.
In the course of this project, we are developing computational methods for the analysis of spISO-seq data that make it possible to determine the full sequences of alternative isoforms and to assess their expression levels. A more in-depth analysis of transcriptome data will provide novel opportunities to explore the genomes of previously unsequenced organisms and the genetics of diseases connected with impaired gene expression.
Tilgner, H., Jahanbani, F., Gupta, I., Collier, P., Wei, E., Rasmussen, M. and Snyder, M., 2018. Microfluidic isoform sequencing shows widespread splicing coordination in the human transcriptome. Genome Research, 28(2), pp. 231-242.