
    Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms

    mbandi_inferring_bmcbioinf_2015.pdf (669.2Kb)
    Date
    2015
    Author
    Mbandi, Stanley K.
    Hesse, Uljana
    Van Heusden, Peter
    Christoffels, Alan
    Abstract
    Background: De novo transcriptome assembly of short transcribed fragments (transfrags) produced by sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common-word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique transfrag annotations and propagation of mis-assemblies.

    Results: Here, we propose a structured framework, organised as a pipeline of a few steps, for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences, 2) error-tolerant CDS prediction, 3) identification of coding potential, and 4) a multiple domain architecture annotation that complements BLAST and reduces non-specific domain annotation. We demonstrate that, independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is inferred from RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags than the multiple k-mer assemblies. However, Trinity identified a number of predicted coding sequences and gene loci comparable to the Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of the cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4% and 6% are lost when orphan transfrags are excluded, which represents only a tiny fraction of the annotation derived from functional transference by sequence similarity. The median length of bona fide transfrags ranged from 1.5 kb (Trinity) to 2 kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with Gene Ontology terms ranged from 33% to 50%, which is also high for domain-based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic regions, untranslated (5′ and 3′) regions and non-coding gene loci.

    Conclusions: IFRAT simplifies post-assembly processing by providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms.
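
The first IFRAT step named in the abstract, removal of identical subsequences, can be pictured with a minimal sketch. This is not the authors' implementation: the abstract does not say how the step is carried out, so the function names (`reverse_complement`, `remove_contained`), the choice to also test the reverse strand, and the toy sequences are all assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def remove_contained(transfrags: dict[str, str]) -> dict[str, str]:
    """Drop every sequence that occurs exactly inside a longer (or equal) kept one.

    Hypothetical helper: containment is tested on both strands (an assumption).
    """
    # Longest first, so any container is examined before the sequences it contains;
    # ties are broken by name to keep the result deterministic.
    ordered = sorted(transfrags.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    kept_upper: list[str] = []       # upper-cased sequences retained so far
    result: dict[str, str] = {}      # retained records, original casing preserved
    for name, seq in ordered:
        forward = seq.upper()
        reverse = reverse_complement(forward)
        if any(forward in k or reverse in k for k in kept_upper):
            continue                 # exact subsequence of a kept transfrag
        kept_upper.append(forward)
        result[name] = seq
    return result


# Toy input (invented): t2 and t3 lie inside t1 on the forward and reverse strand,
# t4 duplicates t1, and t5 is unrelated.
example = {
    "t1": "ATGGCCTTTAAG",
    "t2": "GCCTTT",
    "t3": "AAAGGC",
    "t4": "ATGGCCTTTAAG",
    "t5": "CCCCGGGG",
}
filtered = remove_contained(example)
print(sorted(filtered))
```

The pairwise substring test is quadratic in the number of transfrags, which is acceptable for a sketch but would likely need an index-based approach for a full assembly.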
    URI
    http://hdl.handle.net/10566/1491
    Collections
    • Research Articles (SANBI) [76]
    • Prof. Alan Christoffels [19]
    • Mr. Peter van Heusden [4]

    DSpace 5.5 | Ubuntu 14.04 | Copyright © University of the Western Cape