RNA Sequencing and Analysis

Kimberly R. Kukurba; Stephen B. Montgomery

doi:10.1101/pdb.top084970

Kimberly R. Kukurba¹,² and Stephen B. Montgomery¹,²,³,⁴

¹Department of Pathology, Stanford University School of Medicine, Stanford, California 94305;
²Department of Genetics, Stanford University School of Medicine, Stanford, California 94305;
³Department of Computer Science, Stanford University School of Medicine, Stanford, California 94305

Abstract

RNA sequencing (RNA-Seq) uses the capabilities of high-throughput sequencing methods to provide insight into the transcriptome of a cell. Compared to previous Sanger sequencing- and microarray-based methods, RNA-Seq provides far higher coverage and greater resolution of the dynamic nature of the transcriptome. Beyond quantifying gene expression, the data generated by RNA-Seq facilitate the discovery of novel transcripts, identification of alternatively spliced genes, and detection of allele-specific expression. Recent advances in the RNA-Seq workflow, from sample preparation to library construction to data analysis, have enabled researchers to further elucidate the functional complexity of the transcriptome. In addition to polyadenylated messenger RNA (mRNA) transcripts, RNA-Seq can be applied to investigate different populations of RNA, including total RNA, pre-mRNA, and noncoding RNA, such as microRNA and long ncRNA. This article provides an introduction to RNA-Seq methods, including applications, experimental design, and technical challenges.

INTRODUCTION

The central dogma of molecular biology outlines the flow of information that is stored in genes as DNA, transcribed into RNA, and finally translated into proteins (Crick 1958, Crick 1970). The ultimate expression of this genetic information modified by environmental factors characterizes the phenotype of an organism. The transcription of a subset of genes into complementary RNA molecules specifies a cell's identity and regulates the biological activities within the cell. Collectively defined as the transcriptome, these RNA molecules are essential for interpreting the functional elements of the genome and understanding development and disease.

The transcriptome has a high degree of complexity and encompasses multiple types of coding and noncoding RNA species. Historically, RNA molecules were regarded as simple intermediates between genes and proteins, as encapsulated in the central dogma of molecular biology. Therefore, messenger RNA (mRNA) molecules were the most frequently studied RNA species because they encode proteins via the genetic code. In addition to protein-coding mRNA, there is a diverse group of noncoding RNA (ncRNA) molecules that are functional. Previously, most known ncRNAs fulfilled basic cellular functions, such as ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs) involved in mRNA translation, small nuclear RNAs (snRNAs) involved in splicing, and small nucleolar RNAs (snoRNAs) involved in the modification of rRNAs (Mattick and Makunin 2006). More recently, novel classes of RNA have been discovered, enhancing the repertoire of ncRNAs. For instance, one such class of ncRNAs is small noncoding RNAs, which include microRNA (miRNA) and piwi-interacting RNA (piRNA), both of which regulate gene expression at the posttranscriptional level (Stefani and Slack 2008). Another noteworthy class of ncRNAs is long noncoding RNAs (lncRNAs). As a functional class, lncRNAs were first described in mice during the large-scale sequencing of cDNA libraries (Okazaki et al. 2002). A myriad of molecular functions have been discovered for lncRNAs, including chromatin remodeling, transcriptional control, and posttranscriptional processing, although the vast majority are not fully characterized (Guttman et al. 2009; Mercer et al. 2009; Wilusz et al. 2009).

Initial gene expression studies relied on low-throughput methods, such as northern blots and quantitative polymerase chain reaction (qPCR), that are limited to measuring single transcripts. Over the last two decades, methods have evolved to enable genome-wide quantification of gene expression, better known as transcriptomics. The first transcriptomics studies were performed using hybridization-based microarray technologies, which provide a high-throughput option at relatively low cost (Schena et al. 1995). However, these methods have several limitations: the requirement for a priori knowledge of the sequences being interrogated; problematic cross-hybridization artifacts in the analysis of highly similar sequences; and limited ability to accurately quantify lowly expressed and very highly expressed genes (Casneuf et al. 2007; Shendure 2008). In contrast to hybridization-based methods, sequence-based approaches have been developed to elucidate the transcriptome by directly determining the transcript sequence. Initially, the generation of expressed sequence tag (EST) libraries by Sanger sequencing of complementary DNA (cDNA) was used in gene expression studies, but this approach is relatively low-throughput and not ideal for quantifying transcripts (Adams et al. 1991, 1995; Itoh et al. 1994). To overcome these technical constraints, tag-based methods such as serial analysis of gene expression (SAGE) and cap analysis gene expression (CAGE) were developed to enable higher throughput and more precise quantification of expression levels. By quantifying the number of tagged sequences, which directly correspond to the number of mRNA transcripts, these tag-based methods provide a distinct advantage over measuring analog-style intensities as in array-based methods (Velculescu et al. 1995; Shiraki et al. 2003). However, these assays are insensitive to the expression levels of splice isoforms and cannot be used for novel gene discovery. In addition, the laborious cloning of sequence tags, the high cost of automated Sanger sequencing, and the requirement for large amounts of input RNA have greatly limited their use.

The development of high-throughput next-generation sequencing (NGS) has revolutionized transcriptomics by enabling RNA analysis through the sequencing of complementary DNA (cDNA) (Wang et al. 2009). This method, termed RNA sequencing (RNA-Seq), has distinct advantages over previous approaches and has transformed our understanding of the complex and dynamic nature of the transcriptome. RNA-Seq provides a more detailed and quantitative view of gene expression, alternative splicing, and allele-specific expression. Recent advances in the RNA-Seq workflow, from sample preparation to sequencing platforms to bioinformatic data analysis, have enabled deep profiling of the transcriptome and the opportunity to elucidate different physiological and pathological conditions. This article provides an introduction to RNA sequencing and analysis using next-generation sequencing methods and discusses how to apply these advances for more comprehensive and detailed transcriptome analyses.

TRANSCRIPTOME SEQUENCING

The introduction of high-throughput next-generation sequencing (NGS) technologies revolutionized transcriptomics. This technological development eliminated many challenges posed by hybridization-based microarrays and Sanger sequencing-based approaches that were previously used for measuring gene expression. A typical RNA-Seq experiment consists of isolating RNA, converting it to complementary DNA (cDNA), preparing the sequencing library, and sequencing it on an NGS platform (Fig. 1). However, many experimental details, dependent on a researcher's objectives, should be considered before performing RNA-Seq. These include the use of biological and technical replicates, depth of sequencing, and desired coverage across the transcriptome. In some cases, these experimental options will have minimal impact on the quality of the data. However, in many cases the researcher must carefully design the experiment, placing a priority on the balance between high-quality results and the time and monetary investment.

Figure 1.

Overview of RNA-Seq. First, RNA is extracted from the biological material of choice (e.g., cells, tissues). Second, subsets of RNA molecules are isolated using a specific protocol, such as the poly-A selection protocol to enrich for polyadenylated transcripts or a ribo-depletion protocol to remove ribosomal RNAs. Next, the RNA is converted to complementary DNA (cDNA) by reverse transcription and sequencing adaptors are ligated to the ends of the cDNA fragments. Following amplification by PCR, the RNA-Seq library is ready for sequencing.

Isolation of RNA

The first step in transcriptome sequencing is the isolation of RNA from a biological sample. To ensure a successful RNA-Seq experiment, the RNA should be of sufficient quality to produce a library for sequencing. The quality of RNA is typically measured using an Agilent Bioanalyzer, which produces an RNA Integrity Number (RIN) between 1 and 10, with 10 indicating the highest-quality samples showing the least degradation. The RIN estimates sample integrity using gel electrophoresis and analysis of the ratios of 28S to 18S ribosomal bands. Note that RIN measures are based on mammalian organisms, and certain species with abnormal ribosomal ratios (e.g., insects) may erroneously generate poor RIN values. Low-quality RNA (RIN < 6) can substantially affect the sequencing results (e.g., uneven gene coverage, 3′–5′ transcript bias) and lead to erroneous biological conclusions. Therefore, high-quality RNA is essential for successful RNA-Seq experiments. Unfortunately, high-quality RNA samples may not be available in some cases, such as human autopsy samples or paraffin-embedded tissues, and the effect of degraded RNA on the sequencing results should be carefully considered (Tomita et al. 2004; Thompson et al. 2007; Rudloff et al. 2010).

Library Preparation Methods

Following RNA isolation, the next step in transcriptome sequencing is the creation of an RNA-Seq library, which can vary by the selection of RNA species and between NGS platforms. The construction of sequencing libraries principally involves isolating the desired RNA molecules, reverse-transcribing the RNA to cDNA, fragmenting or amplifying randomly primed cDNA molecules, and ligating sequencing adaptors. Within these basic steps, there are several choices in library construction and experimental design that must be carefully made depending on the specific needs of the researcher (Table 1). Additionally, the accuracy of detection for specific types of RNAs is largely dependent on the nature of the library construction. Although there are a few basic steps for preparing RNA-Seq libraries, each stage can be manipulated to enhance the detection of certain transcripts while limiting the ability to detect other transcripts.

Table 1.

RNA-Seq library protocols

Selection of RNA Species

Before constructing RNA-Seq libraries, one must choose an appropriate library preparation protocol that will enrich or deplete a “total” RNA sample for particular RNA species. The total RNA pool includes ribosomal RNA (rRNA), precursor messenger RNA (pre-mRNA), mRNA, and various classes of noncoding RNA (ncRNA). In most cell types, the majority of RNA molecules are rRNA, typically accounting for over 95% of the total cellular RNA. If the rRNA transcripts are not removed before library construction, they will consume the bulk of the sequencing reads, reducing the overall depth of sequence coverage and thus limiting the detection of other less-abundant RNAs. Because the efficient removal of rRNA is critical for successful transcriptome profiling, many protocols focus on enriching for mRNA molecules before library construction by selecting for polyadenylated (poly-A) RNAs. In this approach, the 3′ poly-A tail of mRNA molecules is targeted using poly-T oligos that are covalently attached to a given substrate (e.g., magnetic beads). Alternatively, researchers can selectively deplete rRNA using commercially available kits, such as RiboMinus (Life Technologies) or RiboZero (Epicentre). This latter method facilitates the accurate quantification of noncoding RNA species, which may not be polyadenylated and are thus excluded from poly-A libraries. Lastly, highly abundant RNA can be removed by denaturing and re-annealing double-stranded cDNA in the presence of duplex-specific nucleases that preferentially digest the most abundant species, which re-anneal as double-stranded molecules more rapidly than less-abundant molecules (Christodoulou et al. 2011). This method can also be used to remove other highly abundant mRNA transcripts in samples, such as hemoglobin in whole blood, immunoglobulins in mature B cells, and insulin in pancreatic beta cells.
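
The cost of leaving rRNA in a library can be illustrated with a back-of-the-envelope calculation. The sketch below is a simplification that assumes reads are sampled in proportion to the amount of each RNA class remaining in the library. The 95% rRNA fraction is taken from the text; the function name, the depletion efficiencies, and the read total are hypothetical inputs chosen for illustration.

```python
def informative_reads(total_reads, rrna_fraction=0.95, depletion_efficiency=0.0):
    """Estimate the number of reads left for non-rRNA species.

    total_reads: reads produced for the library (hypothetical input)
    rrna_fraction: fraction of total RNA that is rRNA (0.95 from the text)
    depletion_efficiency: fraction of rRNA removed before library construction

    Assumption: reads are sampled in proportion to the RNA remaining.
    """
    for name, value in (("rrna_fraction", rrna_fraction),
                        ("depletion_efficiency", depletion_efficiency)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    rrna_left = rrna_fraction * (1.0 - depletion_efficiency)
    other = 1.0 - rrna_fraction
    remaining = rrna_left + other
    if remaining == 0.0:
        return 0.0  # nothing left to sequence
    return total_reads * other / remaining


if __name__ == "__main__":
    # Illustrative values only: a 30-million-read library at three efficiencies.
    for eff in (0.0, 0.90, 0.99):
        reads = informative_reads(30_000_000, depletion_efficiency=eff)
        print(f"depletion {eff:.0%}: ~{reads / 1e6:.1f} million non-rRNA reads")
```

Under these assumptions, an undepleted library leaves only about 5% of its reads for non-rRNA species, which is why even imperfect depletion or poly-A selection greatly increases the effective sequencing depth.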

A comprehensive understanding of the technical biases and limitations surrounding each methodological approach is essential for selecting the best method for library preparation. For example, poly-A libraries are the superior choice if one is solely interested in coding RNA molecules. Conversely, ribo-depletion libraries are a more appropriate choice for accurately quantifying noncoding RNA as well as pre-mRNA that has not been posttranscriptionally modified. Furthermore, moderate differences exist between ribo-depletion protocols, such as the efficiency of rRNA removal and differential coverage of small genes, which should be investigated before selecting a method (Huang et al. 2011).

In addition to the selective depletion of specific RNA species, new approaches have been developed to selectively enrich for regions of interest. These approaches include methods employing PCR-based approaches, hybrid capture, in-solution capture, and molecular inversion probes (Querfurth et al. 2012). The hybridization-based in-solution capture involves a set of biotinylated RNA baits transcribed from DNA template oligo libraries that contain sequences corresponding to particular genes of interest. The RNA baits are combined with the RNA-Seq library, where they hybridize to sequences that are complementary to the baits, and the bound complexes are recovered using streptavidin-coated beads. The resulting RNA-Seq library is now enriched for sequences corresponding to the baits and yet retains its gene expression information despite the removal of other RNA species (Levin et al. 2009). The approach enables researchers to reduce sequencing costs by sequencing selected regions in a greater number of samples.

Selection of Small RNA Species

Complementing the library preparation protocols discussed above, more specific protocols have been developed to selectively target small RNA species, which are key regulators of gene expression. Small RNA species include microRNA (miRNA), small interfering RNA (siRNA), and piwi-interacting RNA (piRNA). Because small RNAs are of low abundance, short in length (15–30 nt), and lack polyadenylation, a separate strategy is often preferred to profile these RNA species (Morin et al. 2010). Similar to total RNA isolation, commercially available extraction kits have been developed to isolate small RNA species. Most kits involve isolation of small RNAs by size fractionation using gel electrophoresis. Size fractionation of small RNAs involves running the total RNA on a gel, cutting a gel slice in the 14–30 nucleotide region, and purifying the gel slice. For higher concentrations of small RNAs, the excised gel slice can be concentrated by ethanol precipitation. An alternative to gel electrophoresis is the use of silica spin columns, which bind small RNAs and then elute them. After isolation of small RNA species from total RNA, the RNA is ready for cDNA synthesis and primer ligation.

cDNA Synthesis

Universal to all RNA-Seq preparation methods is the conversion of RNA into cDNA because most sequencing technologies require DNA libraries. Most protocols for cDNA synthesis create libraries that are uniformly derived from each cDNA strand, thus representing the parent mRNA strand and its complement. In this conventional approach, which maximizes the efficiency of reverse transcription, the strand orientation of the original RNA is lost because the sequencing reads derived from each cDNA strand are indistinguishable. However, strand information can be particularly valuable for distinguishing overlapping transcripts on opposite strands, which is critical for de novo transcript discovery (Parkhomchuk et al. 2009; Vivancos et al. 2010; Mills et al. 2013). Therefore, alternative library preparation protocols have since been developed that yield strand-specific reads. One strategy to preserve strand information is to ligate adapters in predetermined directions to single-stranded RNA or the first strand of cDNA (Lister et al. 2008). Unfortunately, this approach is laborious and results in coverage bias at both the 5′ and 3′ ends of cDNA molecules. The preferred strategy to preserve strandedness is to incorporate a chemical label such as deoxy-UTP (dUTP) during synthesis of the second-strand cDNA that can be specifically removed by enzymatic digestion (Parkhomchuk et al. 2009). During library construction, this facilitates distinguishing the second-strand cDNA from the first strand. Although this approach is favored, antisense transcripts detected near highly expressed genes should be interpreted with caution because a small number of reads (∼1%) have been observed from the opposite strand (Zeng and Mortazavi 2012).
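
To illustrate how strand information from a dUTP library is used downstream, the following sketch assigns a transcript strand to an aligned read. It rests on an assumption not spelled out above: because the dUTP-labeled second strand is removed, the sequenced molecule corresponds to first-strand cDNA, so read 1 of a pair aligns antisense to the original RNA and read 2 aligns in the sense orientation. The function name and its arguments are hypothetical.

```python
OPPOSITE = {"+": "-", "-": "+"}


def infer_transcript_strand(alignment_strand, is_read1, stranded=True):
    """Return the strand of the originating transcript for one aligned read.

    alignment_strand: "+" or "-", the genomic strand the read aligned to
    is_read1: True for read 1 (or a single-end read), False for read 2
    stranded: False for a conventional library, where strand is unknown

    Assumption: dUTP second-strand marking, so read 1 is antisense to the
    RNA and read 2 is sense.
    """
    if alignment_strand not in OPPOSITE:
        raise ValueError("alignment_strand must be '+' or '-'")
    if not stranded:
        return None  # conventional libraries lose strand orientation
    return OPPOSITE[alignment_strand] if is_read1 else alignment_strand


if __name__ == "__main__":
    print(infer_transcript_strand("+", is_read1=True))
    print(infer_transcript_strand("+", is_read1=False))
```

Protocols that ligate adapters directly to the RNA typically use the opposite convention, so the orientation rule should be confirmed for the specific kit before it is applied.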

Multiplexing

Another consideration for constructing cost-effective RNA-Seq libraries is assaying multiple indexed samples in a single sequencing lane. The large number of reads that can be generated per sequencing run (e.g., a single lane of an Illumina HiSeq 2500 generates up to 750 million paired-end reads) permits the analysis of increasingly complex samples. However, increasingly high sequencing depths provide diminishing returns for lower complexity samples, resulting in oversampling with minimal improvement in data quality (Smith et al. 2010). Therefore, an affordable and efficient solution is to introduce unique 6-bp indices, also known as “barcodes,” to each RNA-Seq library. This enables the pooling and sequencing of multiple samples in the same sequencing reaction because the barcodes identify the sample from which each read originated. Depending on the application, adequate transcriptome coverage can be attained for 2–20 samples (Birney et al. 2007; Blencowe et al. 2009). To accurately quantify the expression of transcripts of moderate to high abundance, ∼30–40 million reads are required. To obtain coverage over the full-sequence diversity of complex transcript libraries, including rare and lowly expressed transcripts, up to 500 million reads are required (Fu et al. 2014). As such, for any given study it is important to consider the level of sequencing depth required to answer experimental questions with confidence while efficiently using NGS resources.
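
The barcode logic described above can be sketched in a few lines. This is a minimal illustration, not the demultiplexing procedure of any particular instrument software: the barcode sequences, sample names, and the one-mismatch tolerance are assumptions made for the example, and reads matching no barcode or more than one barcode are set aside as undetermined.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Assign reads to samples by their index sequence.

    reads: iterable of (index_sequence, read_sequence) pairs
    barcodes: dict mapping sample name to its 6-bp index (hypothetical values)
    """
    bins = defaultdict(list)
    for index_seq, read_seq in reads:
        hits = [sample for sample, barcode in barcodes.items()
                if len(barcode) == len(index_seq)
                and hamming(barcode, index_seq) <= max_mismatches]
        # Accept only unambiguous assignments.
        sample = hits[0] if len(hits) == 1 else "undetermined"
        bins[sample].append(read_seq)
    return dict(bins)


def samples_per_lane(lane_reads, reads_per_sample):
    """Whole number of samples that fit in a lane at a target depth."""
    return lane_reads // reads_per_sample


if __name__ == "__main__":
    barcodes = {"sampleA": "ACGTAC", "sampleB": "TGCATG"}
    reads = [("ACGTAC", "read1"), ("ACGTAA", "read2"),
             ("TGCATG", "read3"), ("GGGGGG", "read4")]
    print(demultiplex(reads, barcodes))
    print(samples_per_lane(750_000_000, 40_000_000))
```

With the lane yield and depth quoted above (750 million reads and 40 million reads per sample), the integer division gives 18 samples per lane, consistent with the 2–20 range cited in the text.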

Quantitative Standards

Although RNA-Seq is a widely used technique for transcriptome profiling, the rapid development of sequencing technologies and methods raises questions about the performance of different platforms and protocols. Variation in RNA-Seq data can be attributed to an assortment of factors, ranging from the NGS platform used to the quality of input RNA to the individual performing the experiment. To control for these sources of technical variability, many laboratories use positive controls or “spike-ins” for sequencing libraries. The External RNA Controls Consortium (ERCC) developed a set of universal RNA synthetic spike-in standards for microarray and RNA-Seq experiments (Jiang et al. 2011; Zook et al. 2012). The spike-ins consist of a set of 96 DNA plasmids with 273–2022 bp standard sequences inserted into a vector of ∼2800 bp. The spike-in standard sequences are added to sequencing libraries at different concentrations to assess coverage, quantification, and sensitivity. These RNA standards serve as an effective quality control tool for separating technical variability from biological variability detected in differential transcriptome profiling studies.
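
One common way to use spike-ins is to compare the read counts observed for each standard with the concentration at which it was added. The sketch below fits a straight line on a log2–log2 scale by ordinary least squares; a slope near 1 and a high R² would suggest that quantification is roughly proportional to input across the tested range. The function name, the pseudocount, and the example numbers are illustrative assumptions, not values from the ERCC documentation.

```python
import math


def fit_dose_response(concentrations, counts, pseudocount=1.0):
    """Least-squares fit of log2(count + pseudocount) on log2(concentration).

    Returns (slope, intercept, r_squared). Inputs are parallel sequences,
    one entry per spike-in standard.
    """
    if len(concentrations) != len(counts) or len(counts) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log2(c) for c in concentrations]
    ys = [math.log2(n + pseudocount) for n in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError("concentrations must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # If every count is identical the fit is a flat line that explains nothing.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0.0 else 0.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Made-up example: counts double whenever the input concentration doubles.
    slope, intercept, r2 = fit_dose_response([1, 2, 4, 8], [10, 20, 40, 80],
                                             pseudocount=0.0)
    print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
```

The concentration at which the fit breaks down, or below which spike-ins receive no reads, gives a rough lower limit of detection for the library.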

Selection of Tissue or Cell Populations

When beginning an RNA-Seq experiment, one of the initial considerations is the choice of biological material to be used for library construction and sequencing. This choice is not trivial considering there are hundreds of cell types in over 200 different tissues that make up greater than 50 unique organs in humans alone. In addition to spatial (e.g., cell- and tissue-type) specificity, gene expression shows temporal specificity, such that different developmental stages will show unique expression signatures. Ultimately, the biological material chosen will be dependent on both the experimental goals and feasibility. For example, for an investigation of unique gene expression signatures in colon cancer, the tissue choice is clear. However, for research studies investigating variation in gene expression across individuals in a population, the choice of biological material is less apparent and will likely depend on the feasibility of obtaining the biological samples (e.g., blood draws are less invasive and easier to perform than tissue biopsies).

Handling Tissue Heterogeneity

Another consideration when selecting the biological source of RNA is the heterogeneity of tissues. The accuracy of gene expression quantification is dependent on the purity of samples. In fact, the heterogeneity can substantially impact estimations of transcript abundances in samples composed of multiple cell types. Most tissue samples isolated from the human body are heterogeneous by nature. Furthermore, pathological tissue samples are often composed of disease-state cells surrounded by normal cells. To isolate distinct cell types, experimental methods have been developed, including laser-capture microdissection and cell purification. Laser-capture microdissection enables the isolation of cell types that are morphologically distinguishable under direct microscopic visualization (Emmert-Buck et al. 1996). Although this technique yields high-quality RNA, the total yield is low and requires PCR amplification, thereby introducing amplification biases and creating less distinguishable expression profiles across different cell types (Kube et al. 2007). Cell purification and enrichment protocols are also available, such as differential centrifugation and fluorescence-activated cell sorting (Cantor et al. 1975). In conjunction with RNA-Seq, these experimental methods have overcome previous technical limitations and enable researchers to uncover unique expression signatures across specific cell-types and developmental stages (Moran et al. 2012; Nica et al. 2013). In addition to these experimental methods, in silico probabilistic models can be applied in downstream analysis to differentiate the transcript abundances of distinct cells from RNA-Seq data of heterogeneous tissue samples (Erkkila et al. 2010; Li and Xie 2013). Interestingly, in some cases, the sample heterogeneity can have advantages in transcriptome profiling by identifying novel pathways, implicating cellular origins of disease, or identifying previously unknown pathological sites (Alizadeh et al. 2000; Khan et al. 2001; Sorlie et al. 2001).
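
The idea behind in silico deconvolution can be conveyed with a deliberately simple model. The sketch below is not the probabilistic method of the studies cited above; it assumes only two cell types with known reference profiles and models the mixed sample as a linear combination, mixed = p·A + (1 − p)·B, solving for p by least squares. All names and numbers are hypothetical.

```python
def estimate_mixing_fraction(mixed, profile_a, profile_b):
    """Least-squares estimate of the fraction of cell type A in a mixture.

    mixed, profile_a, profile_b: per-gene expression values in the same
    gene order. Model (an assumption): mixed = p * A + (1 - p) * B.
    """
    if not (len(mixed) == len(profile_a) == len(profile_b)) or not mixed:
        raise ValueError("profiles must be non-empty and the same length")
    # Rearranged model: (mixed - B) = p * (A - B), a regression through the origin.
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    resid = [m - b for m, b in zip(mixed, profile_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("reference profiles are identical; p is not identifiable")
    p = sum(r * d for r, d in zip(resid, diff)) / denom
    return min(1.0, max(0.0, p))  # clip to a valid proportion


if __name__ == "__main__":
    tumor = [10.0, 0.0, 5.0]    # hypothetical reference profile A
    normal = [0.0, 10.0, 5.0]   # hypothetical reference profile B
    sample = [3.0, 7.0, 5.0]    # 30% A and 70% B by construction
    print(estimate_mixing_fraction(sample, tumor, normal))
```

Genes with similar expression in both references contribute nothing to the estimate, which is why deconvolution methods generally rely on marker genes that differ strongly between cell types.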

Single-Cell Transcriptomics

Beyond tissue heterogeneity, considerable evidence indicates that cell-to-cell variability in gene expression is ubiquitous, even within phenotypically homogeneous cell populations (Huang 2009). Unfortunately, conventional RNA-Seq studies do not capture the transcriptomic composition of individual cells. The transcriptome of a single cell is highly dynamic, reflecting its functionality and responses to ever-changing stimuli. In addition to cellular heterogeneity resulting from regulation, individual cells show transcriptional “noise” that arises from the kinetics of mRNA synthesis and decay (Yang et al. 2003; Sun et al. 2012). Furthermore, genes that show mutually exclusive expression in individual cells may be observed as genes showing co-expression in expression analyses of bulk cell populations.

To uncover cell-to-cell variation within populations, significant efforts have been invested in developing single-cell RNA-Seq methods. The biggest challenge has been extending the limits of library preparation to accommodate extremely low input RNA. A human cell contains <1 pg of mRNA (Kawasaki 2004), whereas most sequencing protocols, such as Illumina's TruSeq RNA-Seq kit, recommend 400 ng to 1 µg of input RNA material. Various single-cell RNA amplification methods have been developed to accommodate less input RNA (Tang et al. 2009, 2010; Hashimshony et al. 2012; Islam et al. 2012; Picelli et al. 2013; Sasagawa et al. 2013; Shalek et al. 2013). The key limiting factors in the detection of transcripts in single cells are cDNA synthesis and PCR amplification. The efficiency of RNA-to-cDNA conversion is imperfect, estimated to be as low as 5%–25% of all transcripts (Islam et al. 2012). In addition, PCR amplification methods do not linearly amplify transcripts and are prone to introduce biases based on the nucleic acid composition of different transcripts, ultimately altering the relative abundance of these transcripts in the sequencing library. Methods that avoid PCR amplification steps, such as CEL-Seq, through linear in vitro amplification of the transcriptome can avoid these biases (Hashimshony et al. 2012). In addition, the use of nanoliter-scale reaction volumes with microfluidic devices as opposed to microliter-scale reactions can reduce biases that arise during sample preparation (Wu et al. 2014). Although single-cell methods are still under active development, quantitative assessments of these techniques indicate that obtaining accurate transcriptome measurements by single-cell RNA-Seq is possible after accounting for technical noise (Brennecke et al. 2013; Wu et al. 2014). These methods will undoubtedly be important for uncovering oscillatory and heterogeneous gene expression within single-cell types, as well as identifying cell-specific biomarkers that further our understanding of biology across many physiological and pathological conditions.
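
The consequence of inefficient RNA-to-cDNA conversion for lowly expressed genes can be made concrete with a simple probability model. The sketch assumes that each mRNA molecule is converted independently with the same probability, so a gene present at n copies is detected with probability 1 − (1 − e)^n. This ignores amplification and sequencing depth and is only an illustration; the function names are hypothetical, and the efficiencies are the 5% and 25% bounds quoted above.

```python
import math


def detection_probability(copies, efficiency):
    """Probability that at least one of `copies` molecules is converted to cDNA.

    Assumption: molecules are converted independently with equal probability.
    """
    if copies < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("copies must be >= 0 and efficiency within [0, 1]")
    return 1.0 - (1.0 - efficiency) ** copies


def copies_needed(target_probability, efficiency):
    """Smallest copy number detected with at least the target probability."""
    if not 0.0 < target_probability < 1.0 or not 0.0 < efficiency < 1.0:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_probability)
                     / math.log(1.0 - efficiency))


if __name__ == "__main__":
    for eff in (0.05, 0.25):
        p10 = detection_probability(10, eff)
        n95 = copies_needed(0.95, eff)
        print(f"efficiency {eff:.0%}: P(detect 10 copies)={p10:.2f}, "
              f"copies for 95% detection={n95}")
```

Under this model, transcripts present at only a few copies per cell are frequently missed, which contributes to the technical noise that single-cell analyses must account for.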

Sequencing Platforms for Transcriptomics

When designing an RNA-Seq experiment, the selection of a sequencing platform is important and dependent on the experimental goals. Currently, several NGS platforms are commercially available and other platforms are under active technological development (Metzker 2010). The majority of high-throughput sequencing platforms use a sequencing-by-synthesis method to sequence tens of millions of sequence clusters in parallel. The NGS platforms can often be categorized as either ensemble-based (i.e., sequencing many identical copies of a DNA molecule) or single-molecule-based (i.e., sequencing a single DNA molecule). The differences between these sequencing techniques and platforms can affect downstream analysis and interpretation of the sequencing data.

In recent years, the sequencing industry has been dominated by Illumina, which applies an ensemble-based sequencing-by-synthesis approach (Bentley et al. 2008). DNA molecules are clonally amplified while immobilized on the surface of a glass flow cell and are then sequenced using fluorescently labeled reversible-terminator nucleotides. Because molecules are clonally amplified, this approach provides the relative RNA expression levels of genes. To remove potential PCR-amplification biases, PCR controls and specific steps in the downstream computational analysis are required. One major benefit of ensemble-based platforms is low sequencing error rates (<1%) dominated by single mismatches. Low error rates are particularly important for sequencing miRNAs, whose relatively small sizes result in misalignment or loss of reads if error rates are too high. Currently, the Illumina HiSeq platform is the most commonly applied next-generation sequencing technology for RNA-Seq and has set the standard for NGS. The platform has two flow cells, each providing eight separate lanes for sequencing reactions to occur. The sequencing reactions can take between 1.5 and 12 d to complete, depending on the total read length of the library. Even more recently, Illumina released the MiSeq, a desktop sequencer with lower throughput but faster turnaround (generates ∼30 million paired-end reads in 24 h). The simplified workflow of the MiSeq instrument offers rapid turnaround time for transcriptome sequencing on a smaller scale.

Single-molecule-based platforms such as PacBio enable single-molecule real-time (SMRT) sequencing (Eid et al. 2009). This approach uses DNA polymerase to perform uninterrupted template-directed synthesis using fluorescently labeled nucleotides. As each base is enzymatically incorporated into a growing DNA strand, a distinctive pulse of fluorescence is detected in real time by zero-mode waveguide nanostructure arrays. An advantage of SMRT is that it does not include a PCR amplification step, thereby avoiding amplification bias and improving uniform coverage across the transcriptome. Another advantage of this sequencing approach is the ability to produce extraordinarily long reads with average lengths of 4200 to 8500 bp, which greatly improves the detection of novel transcript structures (Au et al. 2013; Sharon et al. 2013). A critical disadvantage of SMRT is a high rate of errors (∼5%) that are predominantly insertions and deletions (Carneiro et al. 2012); the high error rate results in misalignment and loss of sequencing reads due to the difficulty of matching erroneous reads to the reference genome.
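
The practical impact of the error rates quoted for the two platform classes can be approximated with a short calculation. The sketch assumes that errors occur independently and at a uniform rate at every base, which is a simplification (real error profiles vary with position and sequence context). Under that assumption, a read of length L is error-free with probability (1 − e)^L. The 22-nt read length is a hypothetical, miRNA-sized example.

```python
def error_free_probability(read_length, per_base_error):
    """Probability that a read contains no errors.

    Assumption: independent errors at a uniform per-base rate.
    """
    if read_length < 0 or not 0.0 <= per_base_error <= 1.0:
        raise ValueError("read_length must be >= 0 and error rate within [0, 1]")
    return (1.0 - per_base_error) ** read_length


def expected_errors(read_length, per_base_error):
    """Expected number of erroneous bases in a read under the same model."""
    return read_length * per_base_error


if __name__ == "__main__":
    for rate in (0.01, 0.05):  # error rates quoted in the text
        p = error_free_probability(22, rate)
        print(f"error rate {rate:.0%}: P(error-free 22-nt read) = {p:.2f}")
```

Because short reads have little redundancy to anchor an alignment, even one or two errors can cause a miRNA-sized read to be misplaced or discarded.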

Another important consideration for choosing a sequencing platform is transcriptome assembly. Transcriptome assembly, which is discussed in greater detail later, is necessary to transform a collection of short sequencing reads into a set of full-length transcripts. In general, longer sequencing reads make it simpler to accurately and unambiguously assemble transcripts, as well as identify splicing isoforms. The extremely long reads generated by the PacBio platform are ideal for de novo transcriptome assembly in which the reads are not aligned to a reference transcriptome. The longer reads will facilitate an accurate detection of alternative splice isoforms, which may not be discovered with shorter reads. Moleculo, a company acquired by Illumina, has developed long-read sequencing technology capable of producing 8500 bp reads. Although it has yet to be widely adopted for transcriptome sequencing, the long reads aid transcriptome assembly. Lastly, Illumina has developed protocols for its desktop MiSeq to sequence slightly longer reads (up to 350 bp). Although much shorter than PacBio and Moleculo reads, the longer MiSeq reads can also be used to improve both de novo and reference transcriptome assembly.

TRANSCRIPTOME ANALYSIS

Gene expression profiling by RNA-Seq provides an unprecedented high-resolution view of the global transcriptional landscape. As the sequencing technologies and protocol methodologies continually evolve, new informatics challenges and applications develop. Beyond surveying gene expression levels, RNA-Seq can also be applied to discover novel gene structures, alternatively spliced isoforms, and allele-specific expression (ASE). In addition, genetic studies of gene expression using RNA-Seq have observed genetically correlated variability in expression, splicing, and ASE (Montgomery et al. 2010; Pickrell et al. 2010; Battle et al. 2013; Lappalainen et al. 2013). This section will introduce how expression data are analyzed to provide greater insight into the extensive complexity of transcriptomes.

RNA-Sequencing Data Analysis Workflow

The conventional pipeline for RNA-Seq data includes generating FASTQ-format files containing reads sequenced from an NGS platform, aligning these reads to an annotated reference genome, and quantifying expression of genes (Fig. 2). Although basic sequencing analysis tools are more accessible than ever, RNA-Seq analysis presents unique computational challenges not encountered in other sequencing-based analyses and requires specific consideration of the biases inherent in expression data.
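
As a concrete picture of the starting point of this pipeline, the sketch below reads FASTQ records with the Python standard library. It assumes the common four-line record layout (header, sequence, separator, qualities) and Phred+33 quality encoding; files with wrapped sequence lines or older quality encodings would need different handling. The function name and the example records are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (read_name, sequence, phred_scores) from a four-line FASTQ stream.

    Assumptions: one line each for sequence and qualities, Phred+33 encoding.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Phred+33: the quality score is the ASCII code minus 33.
        yield header[1:], sequence, [ord(ch) - 33 for ch in quality]


if __name__ == "__main__":
    example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!5\n")
    for name, seq, quals in read_fastq(example):
        print(name, seq, quals)
```

In practice, the per-base scores parsed here are what quality-trimming and filtering steps operate on before reads are passed to an aligner.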

Figure 2.

Overview of RNA-Seq data analysis. Following typical RNA-Seq experiments, reads are first aligned to a reference genome. Second, the reads may be assembled into transcripts using reference transcript annotations or de novo assembly approaches. Next, the expression level of each gene is estimated by counting the number of reads that align to each exon or full-length transcript. Downstream analyses with RNA-Seq data include testing for differential expression between samples, detecting allele-specific expression, and identifying expression quantitative trait loci (eQTLs).

Read Alignment

Mapping RNA-Seq reads to the genome is considerably more challenging than mapping DNA sequencing reads because many reads map across splice junctions. In fact, conventional read mapping algorithms, such as Bowtie (Langmead et al. 2009) and BWA (Li and Durbin 2009), are not recommended for mapping RNA-Seq reads to the reference genome because of their inability to handle spliced transcripts. One approach to resolving this problem is to supplement the reference genome with sequences derived from exon–exon splice junctions acquired from known gene annotations (Mortazavi et al. 2008). A preferred strategy is to map reads with a “splicing-aware” aligner that can recognize the difference between a read aligning across an exon–intron boundary and a read with a short insertion. As RNA-Seq data have become more widely used, a number of splicing-aware mapping tools have been developed specifically for mapping transcriptome data. The more commonly used RNA-Seq alignment tools include GSNAP (Wu and Nacu 2010), MapSplice (Wang et al. 2010a), RUM (Grant et al. 2011), STAR (Dobin et al. 2013), and TopHat (Trapnell et al. 2009) (Table 2). Each aligner has different advantages in terms of performance, speed, and memory utilization. Selecting the best aligner to use depends on these metrics and the overall objectives of the RNA-Seq study. Efforts to systematically evaluate the performance of RNA-Seq aligners have been initiated by GENCODE's RNA-Seq Genome Annotation Assessment Project 3 (RGASP3), which has found major performance differences between alignment tools on numerous benchmarks, including alignment yield, basewise accuracy, mismatch and gap placement, and exon junction discovery (Engstrom et al. 2013).
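
To illustrate what a splicing-aware alignment encodes, the following minimal Python sketch extracts intron coordinates from the CIGAR string of an alignment record. It assumes SAM-format output, in which an N operation marks a region skipped on the reference (an intron) and a D operation marks a deletion. The function name splice_junctions and the example coordinates are hypothetical and are not taken from any of the aligners named above.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def splice_junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs implied by N operations.

    pos is the 1-based leftmost reference position of the alignment.
    Returned coordinates are 1-based and inclusive.
    """
    junctions = []
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Skipped region on the reference: interpreted as an intron.
            junctions.append((ref, ref + length - 1))
            ref += length
        elif op in "MD=X":
            # Match, mismatch, and deletion all consume reference bases.
            ref += length
        # I, S, H, and P do not consume reference bases.
    return junctions


# A 100-base read split across two exons separated by a 1000-base intron.
spliced = splice_junctions(100, "50M1000N50M")
# A read with a 2-base deletion is not reported as spliced.
unspliced = splice_junctions(100, "50M2D50M")
```

Counting how many reads support each reported junction is a common first check on whether the aligner is recovering annotated exon boundaries.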

Table 2.

Widely used RNA-Seq software packages

Transcript Assembly and Quantification

After RNA-Seq reads are aligned, the mapped reads can be assembled into transcripts. The majority of computational programs infer transcript models from the accumulation of read alignments to the reference genome (Trapnell et al. 2010; Li et al. 2011; Roberts et al. 2011a; Mezlini et al. 2013) (Table 2). An alternative approach for transcript assembly is de novo reconstruction, in which contiguous transcript sequences are assembled without the use of a reference genome or annotations (Robertson et al. 2010; Grabherr et al. 2011; Schulz et al. 2012). The reconstruction of transcripts from short-read data is a major challenge, and a gold standard method for transcript assembly does not exist. The nature of the transcriptome (e.g., gene complexity, degree of polymorphisms, alternative splicing, dynamic range of expression), common technological challenges (e.g., sequencing errors), and features of the bioinformatics workflow (e.g., gene annotation, inference of isoforms) can substantially affect transcriptome assembly quality. RGASP3 has initiated efforts to evaluate computational methods for transcriptome reconstruction and has found that most algorithms can identify discrete transcript components, but the assembly of complete transcript structures remains a major challenge (Steijger et al. 2013).

A common downstream feature of transcript reconstruction software is the estimation of gene expression levels. Computational tools such as Cufflinks (Trapnell et al. 2010), FluxCapacitor (Montgomery et al. 2010; Griebel et al. 2012), and MISO (Katz et al. 2010) quantify expression by counting the number of reads that map to full-length transcripts (Table 2). Alternative approaches, such as HTSeq, can quantify expression without assembling transcripts by counting the number of reads that map to an exon (Anders et al. 2013). To accurately estimate gene expression, read counts must be normalized to correct for systematic variability, such as library fragment size, sequence composition bias, and read depth (Oshlack and Wakefield 2009; Roberts et al. 2011b). To account for these sources of variability, the reads per kilobase of transcript per million mapped reads (RPKM) metric normalizes a transcript's read count by both the transcript length and the total number of mapped reads in the sample. For paired-end reads, the corresponding metric is fragments per kilobase of transcript per million mapped reads (FPKM), which accounts for the dependency between the two reads of a pair in the RPKM estimate (Trapnell et al. 2010). Another technical challenge for transcript quantification is the mapping of reads to multiple transcripts, which results from genes with multiple isoforms or close paralogs. One solution to correct for this “read assignment uncertainty” is to exclude all reads that do not map uniquely, as in Alexa-Seq (Griffith et al. 2010). However, this strategy is far from ideal for genes lacking unique exons. An alternative strategy used by Cufflinks (Trapnell et al. 2012) and MISO (Katz et al. 2010) is to construct a likelihood function that models the sequencing experiment and estimates the maximum likelihood that a read maps to a particular isoform.
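
The normalization and read-assignment ideas above can be sketched in a few lines of Python. The first function computes RPKM as read count × 10^9 / (transcript length in bp × total mapped reads); substituting fragment counts for read counts gives FPKM. The second function is a deliberately simplified expectation-maximization (EM) routine that splits ambiguous reads among compatible isoforms in proportion to the current abundance estimates. It is an illustrative assumption, not the likelihood model implemented in Cufflinks or MISO: it ignores transcript length, fragment size, and sequence bias, and the names rpkm and em_isoform_fractions are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def em_isoform_fractions(read_compat, n_isoforms, n_iter=200):
    """Estimate the fraction of reads originating from each isoform.

    read_compat holds one non-empty collection per read, listing the
    indices of the isoforms that read is compatible with. Simplifying
    assumption: every compatible isoform is equally likely to generate
    a given read, apart from its relative abundance.
    """
    theta = [1.0 / n_isoforms] * n_isoforms
    n_reads = len(read_compat)
    for _ in range(n_iter):
        expected = [0.0] * n_isoforms
        # E-step: split each read among its compatible isoforms.
        for compat in read_compat:
            total = sum(theta[i] for i in compat)
            for i in compat:
                expected[i] += theta[i] / total
        # M-step: re-estimate the read fractions from the expected counts.
        theta = [count / n_reads for count in expected]
    return theta


# 500 reads on a 2000-bp transcript in a library of 20 million mapped reads.
example_rpkm = rpkm(500, 2000, 20_000_000)

# 6 reads unique to isoform 0, 2 unique to isoform 1, 4 shared by both.
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
fractions = em_isoform_fractions(reads, n_isoforms=2)
```

In this toy example the shared reads are divided according to the evidence from the unique reads, which is the intuition behind likelihood-based assignment; discarding the four shared reads instead would leave the ratio unchanged here but would fail for an isoform with no unique exons.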

Considerations for miRNA Sequencing Analysis

The general approach for analysis of miRNA sequencing data is similar to approaches discussed for mRNA. To identify known miRNAs, the sequencing reads can be mapped to a specific database, such as miRBase, a repository containing over 24,500 miRNA loci from 206 species in its latest release (v21) in June 2014 (Kozomara and Griffiths-Jones 2014). In addition, several tools have been developed to facilitate analysis of miRNAs including the commonly used tools miRanalyzer (Hackenberg et al. 2011) and miRDeep (An et al. 2013). MiRanalyzer can detect known miRNAs annotated on miRBase as well as predict novel miRNAs using a machine-learning approach based on the random forest method with a broad range of features. Similarly, miRDeep is able to identify known miRNAs and predict novel miRNAs using properties of miRNA biogenesis to score the compatibility of the position and frequency of sequenced RNA from the secondary structure of precursor miRNAs. Although miRDeep and miRanalyzer contain modules for target prediction, expression quantification, and differential expression, the methods developed for mRNA quantification and differential expression can also be applied to miRNA data (Eminaga et al. 2013).

Quality Assessment and Technical Considerations

At each stage in the RNA-Seq analysis pipeline, careful consideration should be applied to identifying and correcting for various sources of bias. Bias can arise throughout the RNA-Seq experimental pipeline, including during RNA extraction, sample preparation, library construction, sequencing, and read mapping (Kleinman and Majewski 2012; Lin et al. 2012; Pickrell et al. 2012; 't Hoen et al. 2013). First, the quality of the raw sequence data in FASTQ-format files should be evaluated to ensure high-quality reads. User-friendly software tools designed to generate quality overviews include the FASTX-toolkit (http://hannonlab.cshl.edu/fastx_toolkit), the FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc), and the RobiNA package (Lohse et al. 2012). Several important parameters that should be evaluated include the sequence diversity of reads, adaptor contamination, base qualities, nucleotide composition, and percentage of called bases. Technical artifacts affecting these parameters can arise at the sequencing stage or during the construction of the RNA-Seq library. For example, the 5′ read end, derived from either end of a double-stranded cDNA fragment, shows a higher error rate due to mispriming events introduced by the random oligos during the RNA-Seq library construction protocol (Lin et al. 2012). If possible, actions to correct for these biases should be performed, such as trimming the ends of reads, to speed up alignment and improve the quality of the read alignments.
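
As a rough illustration of the base-quality checks described above, the following Python sketch reads FASTQ records, converts the quality string to Phred scores, and trims low-quality bases from the 3′ end of a read. It assumes four-line FASTQ records and Phred+33 quality encoding; the quality threshold of 20 and all function names are arbitrary, hypothetical choices. It is not a substitute for the dedicated tools listed above.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Convert an ASCII quality string to integer Phred scores."""
    return [ord(char) - offset for char in qual]


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# One hypothetical read whose last two bases have very low quality.
example = "@read1\nACGTACGT\n+\nIIIIII##\n"
for name, seq, qual in read_fastq(io.StringIO(example)):
    scores = phred_scores(qual)
    mean_quality = sum(scores) / len(scores)
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
```

The same loop could tabulate per-position mean quality or nucleotide composition across all reads, which is essentially what the graphical quality reports summarize.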

After aligning the reads, additional parameters should be assessed to account for biases that arise at the read mapping stage. These parameters include the percentage of reads mapped to the transcriptome, the percentage of reads with a mapped mate pair, the coverage bias at the 5′- and 3′-ends, and the chromosomal distribution of reads. One of the most common sources of mapping errors for RNA-Seq data occurs when a read spans the splicing junction of an alternatively spliced gene. A misalignment can be easily introduced due to ambiguous mapping of the read end to one of the two (or more) possible exons and is especially common when reads are mapped to a reference transcriptome that contains an incomplete annotation of isoforms (Kleinman and Majewski 2012; Pickrell et al. 2012). If genotype information is available, the integrity of the samples should also be evaluated by investigating the correlation of single-nucleotide variants (SNVs) between the DNA and RNA reads ('t Hoen et al. 2013). The concordance between the DNA and RNA sequencing data may provide insight into sample swaps or sample mixtures caused accidentally as a result of personnel or equipment error. In the case of a swapped sample, more discordant variants would be observed between the DNA and RNA sequencing data. In the case of a mixture of samples, more significant patterns of allele-specific expression would be observed than expected for a single individual as a result of more combinations of heterozygous and homozygous sites that would skew the alleles beyond the expected 1:1 allelic ratio.
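
The sample-identity check described above can be sketched as a simple concordance calculation. This Python example assumes genotype calls are already available as dictionaries keyed by site, which is a hypothetical data layout chosen for illustration; the function name and the example calls are likewise invented. Note that a real comparison would need to allow for heterozygous sites that appear homozygous in RNA because of allele-specific expression or low coverage.

```python
def genotype_concordance(dna_calls, rna_calls):
    """Fraction of shared sites where DNA and RNA genotype calls agree.

    Both arguments map a site label (e.g., "chr1:100") to a genotype
    string (e.g., "A/G"). Returns None if no sites are shared.
    """
    shared = dna_calls.keys() & rna_calls.keys()
    if not shared:
        return None
    matches = sum(1 for site in shared if dna_calls[site] == rna_calls[site])
    return matches / len(shared)


dna = {"chr1:100": "A/G", "chr1:200": "C/C", "chr2:50": "T/T"}
rna_same_sample = {"chr1:100": "A/G", "chr1:200": "C/C", "chr2:50": "T/T"}
rna_other_sample = {"chr1:100": "A/A", "chr1:200": "C/T", "chr2:50": "T/T"}

matched = genotype_concordance(dna, rna_same_sample)
swapped = genotype_concordance(dna, rna_other_sample)
```

Comparing each RNA sample against every DNA sample in a study, rather than only its labeled partner, makes it possible to identify the correct match when a swap has occurred.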

Differential Gene Expression

A primary objective of many gene expression experiments is to detect transcripts showing differential expression across various conditions. Extensive statistical approaches have been developed to test for differential expression with microarray data, where the continuous probe intensities across replicates can be approximated by a normal distribution (Cui and Churchill 2003; Smyth 2004; Grant et al. 2005). Although in principle these approaches are also applicable to RNA-Seq data, different statistical models must be considered for discrete read counts that do not fit a normal distribution. Early RNA-Seq studies suggested that the distribution of read counts across replicates fit a Poisson distribution, which formed the basis for modeling RNA-Seq count data (Marioni et al. 2008). However, further studies indicated that biological variability is not captured by the Poisson assumption, resulting in high false-positive rates due to underestimation of sampling error (Anders and Huber 2010; Langmead et al. 2010; Robinson and Oshlack 2010). Hence, negative binomial distribution models that take into account overdispersion or extra-Poisson variation have been shown to best fit the distribution of read counts across biological replicates.
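
The difference between the two count models can be made concrete. Under a Poisson model the variance of a gene's read count across replicates equals its mean μ, whereas a common negative binomial parameterization gives variance μ + αμ², where α is the dispersion. The Python sketch below estimates α for one gene by the method of moments. This is a simple illustrative estimator under that assumed parameterization, not the shrinkage-based estimators used by the packages cited in this section, and the replicate counts are invented.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion for one gene across replicates.

    Solves variance = mean + alpha * mean**2 for alpha and truncates at
    zero, so a value of 0.0 means no evidence of extra-Poisson variation.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if m == 0:
        return 0.0
    return max(0.0, (v - m) / (m * m))


# Hypothetical read counts for one gene in five biological replicates.
replicate_counts = [80, 100, 120, 140, 60]
alpha = moment_dispersion(replicate_counts)
```

Here the sample variance (1000) is ten times the mean (100), so a Poisson model would understate the variability of this gene and overstate the significance of differences between conditions.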

To model the count-based nature of RNA-Seq data, complex statistical models have been developed to handle sources of variability that model overdispersion across technical and biological replicates. One source of variability is differences in sequencing read depth, which can artificially create differences between samples. For instance, differences in read depth will result in the samples appearing more divergent if raw read counts between genes are compared. To correct for this, it is advantageous to transform raw read count data to FPKM or RPKM values in differential expression analyses. Although this correction metric is commonly used in place of read counts, the presence of several highly expressed genes in a particular sample can significantly alter the RPKM and FPKM values. For example, a highly expressed gene can “absorb” many reads, consequently repressing the read counts for other genes and artificially inflating gene expression variation. To account for this bias, several statistical models have been proposed that use the highly expressed genes as model covariates (Robinson and Oshlack 2010). Another source of variability that has been observed is that the distribution of sequencing reads is unequal across genes. Therefore, a two-parameter generalized Poisson model that simultaneously considers read depth and sequencing bias as independent parameters was developed and shown to improve RNA-Seq analysis (Srivastava and Chen 2010). More complex normalization methods have also been developed to account for hidden covariates without removing significant biological variability. For example, the probabilistic estimation of expression residuals (PEER) framework (Stegle et al. 2012) and the hidden covariates with prior (HCP) framework (Mostafavi et al. 2013) are methods that use a Bayesian approach to infer hidden covariates and remove their effects from expression data.
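
The “absorption” effect described above can be reproduced with a small numeric example. In the Python sketch below, two hypothetical samples have identical counts for four genes and differ only in one highly expressed gene. Scaling by total read count makes the four unchanged genes appear fivefold lower in the second sample. For contrast, the sketch also computes median-of-ratios size factors, a scaling approach in the style of DESeq; this is a swapped-in technique chosen because it is short to write down, and it is not the covariate-based model of Robinson and Oshlack (2010). All names and counts are invented.

```python
from math import exp, log
from statistics import median


def library_fraction(sample):
    """Each gene's share of the sample's total read count."""
    total = sum(sample)
    return [count / total for count in sample]


def median_of_ratios_size_factors(samples):
    """Per-sample scaling factors that are robust to a few extreme genes.

    samples is a list of per-sample count lists with genes in the same
    order. Genes with a zero count in any sample are skipped.
    """
    n_genes = len(samples[0])
    log_geo_means = []
    for g in range(n_genes):
        values = [sample[g] for sample in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)
    factors = []
    for sample in samples:
        log_ratios = [
            log(sample[g]) - log_geo_means[g]
            for g in range(n_genes)
            if log_geo_means[g] is not None
        ]
        factors.append(exp(median(log_ratios)))
    return factors


sample_a = [100, 200, 300, 400, 1000]
sample_b = [100, 200, 300, 400, 9000]  # only the last gene differs

share_a = library_fraction(sample_a)[0]
share_b = library_fraction(sample_b)[0]
size_factors = median_of_ratios_size_factors([sample_a, sample_b])
```

Because the median ignores the single outlying gene, both size factors equal 1 and the four unchanged genes remain unchanged after scaling.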

To detect differential expression, a variety of statistical methods have been designed specifically for RNA-Seq data. A popular tool to detect differential expression is Cuffdiff, which is part of the Tuxedo suite of tools (Bowtie, TopHat, and Cufflinks) developed to analyze RNA-Seq data (Trapnell et al. 2013). In addition to Cuffdiff, several other packages support testing differential expression, including baySeq (Hardcastle and Kelly 2010), DESeq (Anders and Huber 2010), DEGseq (Wang et al. 2010b), and edgeR (Robinson et al. 2010) (Table 2). Although these packages can assign significance to differentially expressed transcripts, the biological observations should be carefully interpreted. Each model makes specific assumptions that may be violated in the context of the observed data; therefore, an understanding of the model parameters and their constraints is critical for drawing meaningful and accurate biological conclusions (Bullard et al. 2010). Furthermore, replicates in RNA-Seq experiments are crucial for measuring variability and improving estimations for the model parameters (Tarazona et al. 2011; Glaus et al. 2012). Biological replicates (e.g., cells grown on two different plates under the same conditions) are preferred to technical replicates (e.g., one RNA-Seq library sequenced on two different lanes), which show little variation. Although the number of replicates required per condition is an open research question, a minimum of three replicates per condition has been suggested (Auer and Doerge 2010). In many cases, multiplexed RNA-Seq libraries can be used to add biological replicates without increasing sequencing costs (if sequenced at a lower depth) and will greatly improve the robustness of the experimental design (Liu et al. 2014). Additionally, the accuracy of measurements of differential gene expression can be further improved by using ERCC spike-in controls to distinguish technical variation from biological variation.

Allele-Specific Expression

A major advantage of RNA-Seq is the ability to profile transcriptome dynamics at a single-nucleotide resolution. Therefore, the sequenced transcript reads can provide coverage across heterozygous sites, representing transcription from both the maternal and paternal alleles. If a sufficient number of reads cover a heterozygous site within a gene, the null hypothesis is that the ratio of maternal to paternal alleles is balanced. Significant deviation from this expectation suggests allele-specific expression (ASE). Potential mechanisms for ASE include genetic variation (e.g., single-nucleotide polymorphism in a cis-regulatory region upstream of a gene) and epigenetic effects (e.g., genomic imprinting, methylation, histone modifications, etc.). Early studies showed that allele-specific differences can affect up to 30% of loci within an individual (Ge et al. 2009) and are caused by both common and rare genetic variants (Pastinen 2010). Studies have also applied ASE to identify expression modifiers of protein-coding variation (Lappalainen et al. 2011; Montgomery et al. 2011), effects of loss-of-function variation (MacArthur et al. 2012), and differences between pathogenic and healthy tissues (Tuch et al. 2010). Furthermore, ASE studies using single-cell transcriptomics have uncovered a stochastic pattern of allelic expression that may contribute to variable expressivity, a novel perspective which may have fundamental implications for variable disease penetrance and severity (Deng et al. 2014).

Conventional workflows to detect ASE involve counting reads containing each allele at heterozygous sites and applying a statistical test, such as the binomial test or Fisher's exact test (Degner et al. 2009; Rozowsky et al. 2011; Wei and Wang 2013). However, more rigorous statistical approaches are necessary to overcome technical challenges involved in ASE detection. These challenges include read-mapping bias, sampling variance, overdispersion at extreme read depths, alternatively spliced alleles, insertions and deletions (indels), and genotyping errors. To account for overdispersion, one approach is to model allelic read counts using a beta-binomial distribution at individual loci (Sun 2012); however, accurate estimation of the overdispersion parameter requires replicates and, in our experience, a major source of bias comes from site-specific mapping differences. Another strategy is to use a hierarchical Bayesian model that combines information across loci, as well as across replicates and technologies, to make global and site-specific inferences for ASE (Skelly et al. 2011). To assess reference-allele mapping bias, the number of mismatches in reads containing the nonreference allele should be assessed, as increased bias is observed with greater sequence divergence between alleles (Stevenson et al. 2013). To correct for read-mapping bias, an enhanced reference genome can be constructed that masks all SNP positions or includes the alternative alleles at polymorphic loci (Degner et al. 2009; Satya et al. 2012). Statistical methods to better address these technical biases are under active development and are expected to foster further improvements in ASE detection.
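
The conventional binomial test for ASE at a single heterozygous site can be sketched as follows. This Python example computes an exact two-sided P value by summing the probabilities of all outcomes that are no more likely than the observed allele count. The expected reference-allele fraction p defaults to 0.5 but could be set to an empirically estimated value to allow for reference-mapping bias; that use, the function name, and the example counts are assumptions made for illustration. The sketch does not model overdispersion and is intended only for moderate read depths.

```python
from math import comb


def binomial_two_sided_p(ref_count, alt_count, p=0.5):
    """Exact two-sided binomial P value for allelic imbalance at one site.

    Sums the probabilities of all outcomes that are at most as likely as
    the observed reference-allele count. Intended for moderate depths;
    very deep sites would need log-space arithmetic.
    """
    n = ref_count + alt_count
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob for prob in pmf if prob <= observed * (1 + 1e-7))
    return min(1.0, total)


balanced = binomial_two_sided_p(10, 10)   # equal allelic counts
imbalanced = binomial_two_sided_p(18, 2)  # strong skew toward reference
```

With 20 reads, an 18:2 split gives a P value of roughly 4 × 10^-4, whereas a 10:10 split gives a P value of 1. Because thousands of sites are typically tested, such P values would still require multiple-testing correction.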

Expression Quantitative Trait Loci

Another prominent direction of RNA-Seq studies has been the integration of expression data with other types of biological information, such as genotyping data. The combination of RNA-Seq with genetic variation data has enabled the identification of genetic loci correlated with gene expression variation, also known as expression quantitative trait loci (eQTLs). This expression variation caused by common and rare variants is postulated to contribute to phenotypic variation and susceptibility to complex disease across individuals (Majewski and Pastinen 2011). The goal of eQTL analysis is to identify associations that will uncover underlying biological processes, discover genetic variants causing disease, and determine causal pathways. Initial eQTL studies using RNA-Seq data identified a greater number of statistically significant eQTLs than had been identified by microarray studies (Montgomery et al. 2010; Pickrell et al. 2010). Most of the eQTLs identified directly influenced gene expression in an allele-specific manner and were located near transcriptional start sites, indicating that eQTLs could modulate expression directly, or in cis. Later studies identified trans-eQTLs, which are variants that affect the expression of a distant gene (>1 Mb) by modifying the activity or expression of upstream factors that regulate the gene (Fehrmann et al. 2011; Battle et al. 2013; Westra et al. 2013). Although trans-eQTLs show weaker effects and present validation difficulties, they can potentially reveal previously unknown pathways in gene regulation networks.

RNA-Seq has revolutionized QTL analyses because it enables association analyses of more than just gene expression levels alone. For example, RNA-Seq provides an unprecedented opportunity to investigate variations in splicing by profiling alternatively spliced isoforms of a gene. This has enabled the identification of variants influencing the quantitative expression of alternatively spliced isoforms, commonly referred to as splicing-QTLs (sQTLs) (Lalonde et al. 2011). In addition, specific RNA-Seq library constructions (e.g., ribo-depleted) have enabled the detection of eQTLs affecting other RNA species; recent studies have identified variants affecting the expression of various ncRNAs, including long intergenic noncoding RNAs (Montgomery et al. 2010; Gamazon et al. 2012; Kumar et al. 2013; Popadin et al. 2013). The expanding potential of RNA-Seq to associate phenotypic variations with genetic variation offers an enhanced understanding of gene regulation.

Traditional eQTL mapping methods that were developed for microarray data use linear models such as linear regression and ANOVA to associate genetic variants with gene expression (Kendziorski and Wang 2006). These methods have been directly applied to RNA-Seq data following appropriate normalization of total read counts. Most eQTL studies perform separate testing for each transcript-SNP pair using linear regression and ANOVA models to detect significant association. Nonlinear approaches have also been developed to test associations, such as generalized linear and mixed models and Bayesian regression (Servin and Stephens 2007). Alternative models, such as Merlin, have also been developed to detect eQTLs from expression data that include related individuals using pedigree data (Abecasis et al. 2002). In addition, several methods have been developed to simultaneously test the effect of multiple SNPs on the expression of a single gene using Bayesian methods (Lee et al. 2008). To further improve on the detection of causal regulatory variants, several studies have integrated ASE information with eQTL analysis. These studies showed that genetic variants showing allele-specific effects and identified as eQTLs show higher enrichment in functional annotations and provide stronger evidence of cis-regulatory impact (Battle et al. 2013; Lappalainen et al. 2013; Sun and Hu 2013). Because high-throughput sequencing has created genotype data sets featuring millions of SNPs and expression data sets featuring tens of thousands of transcripts, the task of testing billions of transcript-SNP pairs in eQTL analysis can be computationally intensive. To mitigate this computational burden, software such as Matrix eQTL has been developed to efficiently test the associations by modeling the effect of genotype as either additive linear (least squares model) or categorical (ANOVA model) (Shabalin 2012). Because of the large number of tests performed, it is important to correct for multiple testing by calculating the false discovery rate (Benjamini and Hochberg 1995; Yekutieli and Benjamini 1999) or resampling using bootstrap or permutation procedures (Karlsson 2006; Zhang et al. 2012).
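
Two pieces of the eQTL workflow described above are easy to sketch in Python: the additive linear model for a single transcript-SNP pair and the Benjamini-Hochberg adjustment applied across many tests. The first function assumes genotypes are coded as the number of alternative alleles (0, 1, or 2) and returns the least-squares effect size; computing its P value is left out to keep the example within the standard library. The second function returns Benjamini-Hochberg adjusted P values. Function names and all example values are hypothetical, and this is not the implementation used by Matrix eQTL.

```python
def additive_effect(genotypes, expression):
    """Least-squares slope of expression on allele dosage (0, 1, or 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(genotypes, expression))
    return sxy / sxx


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical dosages and normalized expression for six individuals.
dosage = [0, 1, 2, 0, 1, 2]
expr = [1.0, 2.0, 3.0, 1.2, 2.1, 2.9]
beta = additive_effect(dosage, expr)

# Hypothetical P values from four transcript-SNP tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Transcript-SNP pairs whose adjusted P value falls below the chosen false discovery rate threshold (for example, 0.05) would be reported as significant under this procedure.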

However, the design and interpretation of eQTL studies are not straightforward. Many complications result from the complexity of gene regulation, which shows both spatial (cell and tissue location) and temporal (developmental stage) specificity. For instance, several studies have performed eQTL analysis across multiple tissues, indicating that genetic regulatory elements can have tissue-specific effects (Petretto et al. 2006; Schadt et al. 2008; Dimas et al. 2009; Kwan et al. 2009; Grundberg et al. 2012; Flutre et al. 2013). Therefore, future eQTL analyses should test for SNP-transcript associations in well-defined cell types that are relevant to the trait of interest (Lonsdale et al. 2013). For example, a study detecting eQTLs in cardiovascular disease should use heart tissue, whereas a study interested in autoimmune disease should use whole blood. Another major consideration for eQTL studies is accounting for population structure and elucidating the causal variants (Stranger et al. 2012). The structure of genomic variation can vary significantly between populations and will influence the resolution of any genetic association study (Frazer et al. 2007; Altshuler et al. 2010). Furthermore, if substantial linkage disequilibrium (LD) exists within the genome, the associated genetic variant is often “tagging” the causal variant rather than acting as the causal regulatory variant itself. As eQTL studies integrate data across different populations and use population-scale genome sequencing, the ability to elucidate causal variants will greatly improve (Montgomery et al. 2010; Lappalainen et al. 2013).

FUTURE PROSPECTS

As sequencing technologies advance, computational tools will need to evolve in parallel to solve new technical challenges and support novel applications. For example, as the ability of sequencing platforms to produce longer reads becomes a reality, new mapping methods are required to accurately and efficiently align long reads. Because longer reads can span multiple exon–exon junctions, the identification and quantification of alternative isoforms will improve significantly with the extra information encoded in longer reads. Furthermore, as laboratory methods mature to enable sequencing of minute quantities of RNA, complex statistical approaches will be needed to discriminate between technical noise and meaningful biological variation. These advances will facilitate the analysis of transcriptomes in rare cell types and cell states, enabling researchers to reconstruct biological networks active at the cellular level. In addition, these advancements will allow transcriptome analysis to move into the field of clinical diagnostics; for example, earlier cancer screening and pregnancy monitoring could be accomplished by sequencing cancerous RNA or fetal RNA in the maternal blood. Furthermore, the integration of whole-genome sequencing with RNA-Seq in larger samples will provide greater insight into genetic regulatory variation. These experimental and bioinformatic advances will provide a powerful toolbox for fully characterizing the transcriptome as it relates to basic biological questions, as well as its rising impact on personalized medicine.

ACKNOWLEDGMENTS

The authors gratefully acknowledge their colleagues, Tuuli Lappalainen and Jin Billy Li, as well as fellow laboratory members, including Zach Zappala, Kevin Smith, Marianne DeGorter and Mauro Pala, for their valuable comments. K.R.K. is supported by the National Defense Science and Engineering Graduate (NDSEG) Fellowship from the U.S. Department of Defense, and S.B.M. is funded by the Edward Mallinckrodt, Jr. Foundation.

Footnotes

4 Correspondence: smontgom@stanford.edu

© 2015 Cold Spring Harbor Laboratory Press

REFERENCES

↵
1. Abecasis GR,
2. Cherny SS,
3. Cookson WO,
4. Cardon LR
Abecasis GR, Cherny SS, Cookson WO, Cardon LR. 2002. Merlin—Rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30: 97–101.
CrossRef Medline Google Scholar
↵
1. Adams MD,
2. Kelley JM,
3. Gocayne JD,
4. Dubnick M,
5. Polymeropoulos MH,
6. Xiao H,
7. Merril CR,
8. Wu A,
9. Olde B,
10. Moreno RF
Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, et al. 1991. Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252: 1651–1656.
FREE Full Text
↵
1. Adams MD,
2. Kerlavage AR,
3. Fleischmann RD,
4. Fuldner RA,
5. Bult CJ,
6. Lee NH,
7. Kirkness EF,
8. Weinstock KG,
9. Gocayne JD,
10. White O
Adams MD, Kerlavage AR, Fleischmann RD, Fuldner RA, Bult CJ, Lee NH, Kirkness EF, Weinstock KG, Gocayne JD, White O, et al. 1995. Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature 377: 3–174.
Medline Google Scholar
↵
1. Alizadeh AA,
2. Eisen MB,
3. Davis RE,
4. Ma C,
5. Lossos IS,
6. Rosenwald A,
7. Boldrick JC,
8. Sabet H,
9. Tran T,
10. Yu X
Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, et al. 2000. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503–511.
CrossRef Medline Google Scholar
↵
1. Altshuler DM,
2. Gibbs RA,
3. Peltonen L,
4. Dermitzakis E,
5. Schaffner SF,
6. Yu FL,
7. Bonnen PE,
8. de Bakker PIW,
9. Deloukas P,
10. Gabriel SB
Altshuler DM, Gibbs RA, Peltonen L, Dermitzakis E, Schaffner SF, Yu FL, Bonnen PE, de Bakker PIW, Deloukas P, Gabriel SB, et al. 2010. Integrating common and rare genetic variation in diverse human populations. Nature 467: 52–58.
CrossRef Medline Google Scholar
↵
1. An J,
2. Lai J,
3. Lehman ML,
4. Nelson CC
An J, Lai J, Lehman ML, Nelson CC. 2013. miRDeep*: An integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res 41: 727–737.
FREE Full Text
↵
1. Anders S,
2. Huber W
Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome Biol 11: R106.
CrossRef Medline Google Scholar
↵
1. Anders S,
2. McCarthy DJ,
3. Chen Y,
4. Okoniewski M,
5. Smyth GK,
6. Huber W,
7. Robinson MD
Anders S, McCarthy DJ, Chen Y, Okoniewski M, Smyth GK, Huber W, Robinson MD. 2013. Count-based differential expression analysis of RNA sequencing data using R and bioconductor. Nature protocols 8: 1765–1786.
CrossRef Medline Google Scholar
↵
1. Au KF,
2. Sebastiano V,
3. Afshar PT,
4. Durruthy JD,
5. Lee L,
6. Williams BA,
7. van Bakel H,
8. Schadt EE,
9. Reijo-Pera RA,
10. Underwood JG
Au KF, Sebastiano V, Afshar PT, Durruthy JD, Lee L, Williams BA, van Bakel H, Schadt EE, Reijo-Pera RA, Underwood JG, et al. 2013. Characterization of the human ESC transcriptome by hybrid sequencing. Proc Natl Acad Sci 110: E4821–E4830.
FREE Full Text
↵
1. Auer PL,
2. Doerge RW
Auer PL, Doerge RW. 2010. Statistical design and analysis of RNA sequencing data. Genetics 185: 405–416.
FREE Full Text
↵
1. Battle A,
2. Mostafavi S,
3. Zhu X,
4. Potash JB,
5. Weissman MM,
6. McCormick C,
7. Haudenschild CD,
8. Beckman KB,
9. Shi J,
10. Mei R
Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, et al. 2013. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 24: 14–24.
Medline Google Scholar
↵
1. Benjamini Y,
2. Hochberg Y
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate—A practical and powerful approach to multiple testing. J Roy Stat Soc B Met 57: 289–300.
Google Scholar
↵
1. Bentley DR,
2. Balasubramanian S,
3. Swerdlow HP,
4. Smith GP,
5. Milton J,
6. Brown CG,
7. Hall KP,
8. Evers DJ,
9. Barnes CL,
10. Bignell HR
Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456: 53–59.
CrossRef Medline Google Scholar
↵
1. Birney E,
2. Stamatoyannopoulos Ja,
3. Dutta A,
4. Guigó R,
5. Gingeras TR,
6. Margulies EH,
7. Weng Z,
8. Snyder M,
9. Dermitzakis ET,
10. Thurman RE
Birney E, Stamatoyannopoulos Ja, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799–816.
CrossRef Medline Google Scholar
↵
1. Blencowe BJ,
2. Ahmad S,
3. Lee LJ
Blencowe BJ, Ahmad S, Lee LJ. 2009. Current-generation high-throughput sequencing: Deepening insights into mammalian transcriptomes. Genes Dev 23: 1379–1386.
Brennecke P, Anders S, Kim JK, Kolodziejczyk AA, Zhang X, Proserpio V, Baying B, Benes V, Teichmann SA, Marioni JC, et al. 2013. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10: 1093–1095.
Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11: 94.
Cantor H, Simpson E, Sato VL, Fathman CG, Herzenberg LA. 1975. Characterization of subpopulations of T lymphocytes. I. Separation and functional studies of peripheral T-cells binding different amounts of fluorescent anti-Thy 1.2 (theta) antibody using a fluorescence-activated cell sorter (FACS). Cell Immunol 15: 180–196.
Carneiro MO, Russ C, Ross MG, Gabriel SB, Nusbaum C, DePristo MA. 2012. Pacific Biosciences sequencing technology for genotyping and variation discovery in human data. BMC Genomics 13: 375.
Casneuf T, Van de Peer Y, Huber W. 2007. In situ analysis of cross-hybridisation on microarrays and the inference of expression correlation. BMC Bioinformatics 8: 461.
Christodoulou DC, Gorham JM, Herman DS, Seidman JG. 2011. Construction of normalized RNA-seq libraries for next-generation sequencing using the crab duplex-specific nuclease. Curr Protoc Mol Biol Chapter 4: Unit 4.12.
Crick F. 1970. Central dogma of molecular biology. Nature 227: 561–563.
Crick FH. 1958. On protein synthesis. Symp Soc Exp Biol 12: 138–163.
Cui X, Churchill GA. 2003. Statistical tests for differential expression in cDNA microarray experiments. Genome Biol 4: 210.
Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK. 2009. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25: 3207–3212.
Deng Q, Ramskold D, Reinius B, Sandberg R. 2014. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343: 193–196.
Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H, Ingle C, Beazley C, Gutierrez Arcelus M, Sekowska M, et al. 2009. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science 325: 1246–1250.
Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21.
Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, et al. 2009. Real-time DNA sequencing from single polymerase molecules. Science 323: 133–138.
Eminaga S, Christodoulou DC, Vigneault F, Church GM, Seidman JG. 2013. Quantification of microRNA expression with next-generation sequencing. Curr Protoc Mol Biol Chapter 4: Unit 4.17.
Emmert-Buck MR, Bonner RF, Smith PD, Chuaqui RF, Zhuang Z, Goldstein SR, Weiss RA, Liotta LA. 1996. Laser capture microdissection. Science 274: 998–1001.
Engstrom PG, Steijger T, Sipos B, Grant GR, Kahles A, RGASP Consortium, Alioto T, Behr J, Bertone P, Bohnert R, et al. 2013. Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods 10: 1185–1191.
Erkkila T, Lehmusvaara S, Ruusuvuori P, Visakorpi T, Shmulevich I, Lahdesmaki H. 2010. Probabilistic analysis of gene expression measurements from heterogeneous tissues. Bioinformatics 26: 2571–2577.
Fehrmann RSN, Jansen RC, Veldink JH, Westra HJ, Arends D, Bonder MJ, Fu JY, Deelen P, Groen HJM, Smolonska A, et al. 2011. Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA. PLoS Genet 7: e1002197.
Flutre T, Wen X, Pritchard J, Stephens M. 2013. A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genet 9: e1003486.
Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, et al. 2007. A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851–861.
Fu GK, Xu W, Wilhelmy J, Mindrinos M, Davis RW, Xiao W, Fodor SPA. 2014. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations. Proc Natl Acad Sci 111: 1891–1896.
Gamazon ER, Ziliak D, Im HK, LaCroix B, Park DS, Cox NJ, Huang RS. 2012. Genetic architecture of microRNA expression: Implications for the transcriptome and complex traits. Am J Hum Genet 90: 1046–1063.
Ge B, Pokholok DK, Kwan T, Grundberg E, Morcos L, Verlaan DJ, Le J, Koka V, Lam KC, Gagne V, et al. 2009. Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Nat Genet 41: 1216–1222.
Glaus P, Honkela A, Rattray M. 2012. Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics 28: 1721–1728.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29: 644–652.
Grant GR, Farkas MH, Pizarro AD, Lahens NF, Schug J, Brunk BP, Stoeckert CJ, Hogenesch JB, Pierce EA. 2011. Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics 27: 2518–2528.
Grant GR, Liu J, Stoeckert CJ Jr. 2005. A practical false discovery rate approach to identifying patterns of differential expression in microarray data. Bioinformatics 21: 2684–2690.
Griebel T, Zacher B, Ribeca P, Raineri E, Lacroix V, Guigo R, Sammeth M. 2012. Modelling and simulating generic RNA-Seq experiments with the flux simulator. Nucleic Acids Res 40: 10073–10083.
Griffith M, Griffith OL, Mwenifumbo J, Goya R, Morrissy AS, Morin RD, Corbett R, Tang MJ, Hou YC, Pugh TJ, et al. 2010. Alternative expression analysis by RNA sequencing. Nat Methods 7: 843–847.
Grundberg E, Small KS, Hedman AK, Nica AC, Buil A, Keildson S, Bell JT, Yang TP, Meduri E, Barrett A, et al. 2012. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat Genet 44: 1084–1089.
Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, et al. 2009. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458: 223–227.
Hackenberg M, Rodriguez-Ezpeleta N, Aransay AM. 2011. miRanalyzer: An update on the detection and analysis of microRNAs in high-throughput sequencing experiments. Nucleic Acids Res 39: W132–W138.
Hardcastle TJ, Kelly KA. 2010. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics 11: 422.
Hashimshony T, Wagner F, Sher N, Yanai I. 2012. CEL-Seq: Single-cell RNA-Seq by multiplexed linear amplification. Cell Reports 2: 666–673.
Huang R, Jaritz M, Guenzl P, Vlatkovic I, Sommer A, Tamir IM, Marks H, Klampfl T, Kralovics R, Stunnenberg HG, et al. 2011. An RNA-Seq strategy to detect the complete coding and non-coding transcriptome including full-length imprinted macro ncRNAs. PLoS ONE 6: e27288.
Huang S. 2009. Non-genetic heterogeneity of cells in development: More than just noise. Development 136: 3853–3862.
Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S. 2012. Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nat Protocols 7: 813–828.
Itoh K, Matsubara K, Okubo K. 1994. Identification of an active gene by using large-scale cDNA sequencing. Gene 140: 295–296.
Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. 2011. Synthetic spike-in standards for RNA-seq experiments. Genome Res 21: 1543–1551.
Karlsson A. 2006. Review of “Permutation, parametric, and bootstrap tests of hypotheses.” J R Stat Soc A Stat 169: 171–171.
Katz Y, Wang ET, Airoldi EM, Burge CB. 2010. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods 7: 1009–1015.
Kawasaki ES. 2004. Microarrays and the gene expression profile of a single cell. Ann N Y Acad Sci 1020: 92–100.
Kendziorski C, Wang P. 2006. A review of statistical methods for expression quantitative trait loci mapping. Mamm Genome 17: 509–517.
Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, Berthold F, Schwab M, Antonescu CR, Peterson C, et al. 2001. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 7: 673–679.
Kleinman CL, Majewski J. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335: 1302.
Kozomara A, Griffiths-Jones S. 2014. miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 42: D68–D73.
Kube DM, Savci-Heijink CD, Lamblin AF, Kosari F, Vasmatzis G, Cheville JC, Connelly DP, Klee GG. 2007. Optimization of laser capture microdissection and RNA amplification for gene expression profiling of prostate cancer. BMC Mol Biol 8: 25.
Kumar V, Westra HJ, Karjalainen J, Zhernakova DV, Esko T, Hrdlickova B, Almeida R, Zhernakova A, Reinmaa E, Vosa U, et al. 2013. Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet 9: e1003201.
Kwan T, Grundberg E, Koka V, Ge B, Lam KC, Dias C, Kindmark A, Mallmin H, Ljunggren O, Rivadeneira F, et al. 2009. Tissue effect on genetic control of transcript isoform variation. PLoS Genet 5: e1000608.
Lalonde E, Ha KC, Wang Z, Bemmo A, Kleinman CL, Kwan T, Pastinen T, Majewski J. 2011. RNA sequencing reveals the role of splicing polymorphisms in regulating human gene expression. Genome Res 21: 545–554.
Langmead B, Hansen KD, Leek JT. 2010. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol 11: R83.
Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10: R25.
Lappalainen T, Montgomery SB, Nica AC, Dermitzakis ET. 2011. Epistatic selection between coding and regulatory variation in human evolution and disease. Am J Hum Genet 89: 459–463.
Lappalainen T, Sammeth M, Friedlander MR, 't Hoen PA, Monlong J, Rivas MA, Gonzalez-Porta M, Kurbatova N, Griebel T, Ferreira PG, et al. 2013. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501: 506–511.
Lee SH, van der Werf JHJ, Hayes BJ, Goddard ME, Visscher PM. 2008. Predicting unobserved phenotypes for complex traits from whole-genome SNP data. PLoS Genet 4: e1000231.
Levin JZ, Berger MF, Adiconis X, Rogov P, Melnikov A, Fennell T, Nusbaum C, Garraway LA, Gnirke A. 2009. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Genome Biol 10: R115.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760.
Li JJ, Jiang CR, Brown JB, Huang H, Bickel PJ. 2011. Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation. Proc Natl Acad Sci 108: 19867–19872.
Li Y, Xie X. 2013. A mixture model for expression deconvolution from RNA-seq in heterogeneous tissues. BMC Bioinformatics 14: S11.
Lin W, Piskol R, Tan MH, Li JB. 2012. Response to comments on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335: 1302.
Lister R, O'Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, Ecker JR. 2008. Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133: 523–536.
Liu Y, Zhou J, White KP. 2014. RNA-seq differential expression studies: More sequence or more replication? Bioinformatics 30: 301–304.
Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B. 2012. RobiNA: A user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res 40: W622–W627.
Lonsdale J, Thomas J, Salvatore M, Philips R, Lo E, et al. 2013. The Genotype-Tissue Expression (GTEx) project. Nat Genet 45: 580–585.
MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, et al. 2012. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335: 823–828.
Majewski J, Pastinen T. 2011. The study of eQTL variations by RNA-seq: From SNPs to phenotypes. Trends Genet 27: 72–79.
Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. 2008. RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 18: 1509–1517.
Mattick JS, Makunin IV. 2006. Non-coding RNA. Hum Mol Genet 15 Spec No 1: R17–R29.
Mercer TR, Dinger ME, Mattick JS. 2009. Long non-coding RNAs: Insights into functions. Nat Rev Genet 10: 155–159.
Metzker ML. 2010. Sequencing technologies—The next generation. Nat Rev Genet 11: 31–46.
Mezlini AM, Smith EJM, Fiume M, Buske O, Savich GL, Shah S, Aparicio S, Chiang DY, Goldenberg A, Brudno M. 2013. iReckon: Simultaneous isoform discovery and abundance estimation from RNA-seq data. Genome Res 23: 519–529.
Mills JD, Kawahara Y, Janitz M. 2013. Strand-specific RNA-Seq provides greater resolution of transcriptome profiling. Curr Genomics 14: 173–181.
Montgomery SB, Lappalainen T, Gutierrez-Arcelus M, Dermitzakis ET. 2011. Rare and common regulatory variation in population-scale sequenced human genomes. PLoS Genet 7: e1002144.
Montgomery SB, Sammeth M, Gutierrez-Arcelus M, Lach RP, Ingle C, Nisbett J, Guigo R, Dermitzakis ET. 2010. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464: 773–777.
Moran I, Akerman I, van de Bunt M, Xie R, Benazra M, Nammo T, Arnes L, Nakic N, Garcia-Hurtado J, Rodriguez-Segui S, et al. 2012. Human beta cell transcriptome analysis uncovers lncRNAs that are tissue-specific, dynamically regulated, and abnormally expressed in type 2 diabetes. Cell Metab 16: 435–448.
Morin RD, Zhao YJ, Prabhu AL, Dhalla N, McDonald H, Pandoh P, Tam A, Zeng T, Hirst M, Marra M. 2010. Preparation and analysis of microRNA libraries using the Illumina massively parallel sequencing technology. Methods Mol Biol 650: 173–199.
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628.
Mostafavi S, Battle A, Zhu X, Urban AE, Levinson D, Montgomery SB, Koller D. 2013. Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. PLoS ONE 8: e68141.
Nica AC, Ongen H, Irminger JC, Bosco D, Berney T, Antonarakis SE, Halban PA, Dermitzakis ET. 2013. Cell-type, allelic, and genetic signatures in the human pancreatic beta cell transcriptome. Genome Res 23: 1554–1562.
Okazaki Y, Furuno M, Kasukawa T, Adachi J, Bono H, Kondo S, Nikaido I, Osato N, Saito R, Suzuki H, et al. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563–573.
Oshlack A, Wakefield MJ. 2009. Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14.
Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A. 2009. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res 37: e123.
Pastinen T. 2010. Genome-wide allele-specific analysis: Insights into regulatory variation. Nat Rev Genet 11: 533–538.
Petretto E, Mangion J, Dickens NJ, Cook SA, Kumaran MK, Lu H, Fischer J, Maatz H, Kren V, Pravenec M, et al. 2006. Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet 2: e172.
Picelli S, Bjorklund AK, Faridani OR, Sagasser S, Winberg G, Sandberg R. 2013. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10: 1096–1098.
Pickrell JK, Gilad Y, Pritchard JK. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335: 1302.
Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras JB, Stephens M, Gilad Y, Pritchard JK. 2010. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464: 768–772.
Popadin K, Gutierrez-Arcelus M, Dermitzakis ET, Antonarakis SE. 2013. Genetic and epigenetic regulation of human lincRNA gene expression. Am J Hum Genet 93: 1015–1026.
Querfurth R, Fischer A, Schweiger MR, Lehrach H, Mertes F. 2012. Creation and application of immortalized bait libraries for targeted enrichment and next-generation sequencing. Biotechniques 52: 375–380.
Roberts A, Pimentel H, Trapnell C, Pachter L. 2011a. Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics 27: 2325–2329.
Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. 2011b. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol 12: R22.
Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, Mungall K, Lee S, Okada HM, Qian JQ, et al. 2010. De novo assembly and analysis of RNA-seq data. Nat Methods 7: 909–912.
Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140.
Robinson MD, Oshlack A. 2010. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 11: R25.
Rozowsky J, Abyzov A, Wang J, Alves P, Raha D, Harmanci A, Leng J, Bjornson R, Kong Y, Kitabayashi N, et al. 2011. AlleleSeq: Analysis of allele-specific expression and binding in a network framework. Mol Syst Biol 7: 522.
Rudloff U, Bhanot U, Gerald W, Klimstra DS, Jarnagin WR, Brennan MF, Allen PJ. 2010. Biobanking of human pancreas cancer tissue: Impact of ex-vivo procurement times on RNA quality. Ann Surg Oncol 17: 2229–2236.
Sasagawa Y, Nikaido I, Hayashi T, Danno H, Uno KD, Imai T, Ueda HR. 2013. Quartz-Seq: A highly reproducible and sensitive single-cell RNA sequencing method, reveals non-genetic gene-expression heterogeneity. Genome Biol 14: R31.
Satya RV, Zavaljevski N, Reifman J. 2012. A new strategy to reduce allelic bias in RNA-Seq readmapping. Nucleic Acids Res 40: e127.
Schadt EE, Molony C, Chudin E, Hao K, Yang X, Lum PY, Kasarskis A, Zhang B, Wang S, Suver C, et al. 2008. Mapping the genetic architecture of gene expression in human liver. PLoS Biol 6: e107.
Schena M, Shalon D, Davis RW, Brown PO. 1995. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270: 467–470.
Schulz MH, Zerbino DR, Vingron M, Birney E. 2012. Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28: 1086–1092.
Servin B, Stephens M. 2007. Imputation-based analysis of association studies: Candidate regions and quantitative traits. PLoS Genet 3: e114.
Shabalin AA. 2012. Matrix eQTL: Ultra-fast eQTL analysis via large matrix operations. Bioinformatics 28: 1353–1358.
Shalek AK, Satija R, Adiconis X, Gertner RS, Gaublomme JT, Raychowdhury R, Schwartz S, Yosef N, Malboeuf C, Lu D, et al. 2013. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498: 236–240.
Sharon D, Tilgner H, Grubert F, Snyder M. 2013. A single-molecule long-read survey of the human transcriptome. Nat Biotechnol 31: 1009–1014.
Shendure J. 2008. The beginning of the end for microarrays? Nat Methods 5: 585–587.
Shiraki T, Kondo S, Katayama S, Waki K, Kasukawa T, Kawaji H, Kodzius R, Watahiki A, Nakamura M, Arakawa T, et al. 2003. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc Natl Acad Sci 100: 15776–15781.
Skelly DA, Johansson M, Madeoy J, Wakefield J, Akey JM. 2011. A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-seq data. Genome Res 21: 1728–1737.
Smith AM, Heisler LE, St Onge RP, Farias-Hesson E, Wallace IM, Bodeau J, Harris AN, Perry KM, Giaever G, Pourmand N, et al. 2010. Highly-multiplexed barcode sequencing: An efficient method for parallel analysis of pooled samples. Nucleic Acids Res 38: e142.
Smyth GK. 2004. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statist Appl Genetics Mol Biol 3: Article 3.
Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al. 2001. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci 98: 10869–10874.
Srivastava S, Chen L. 2010. A two-parameter generalized Poisson model to improve the analysis of RNA-seq data. Nucleic Acids Res 38: e170.
Stefani G, Slack FJ. 2008. Small non-coding RNAs in animal development. Nat Rev Mol Cell Biol 9: 219–230.
Stegle O, Parts L, Piipari M, Winn J, Durbin R. 2012. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat Protocols 7: 500–507.
Steijger T, Abril JF, Engstrom PG, Kokocinski F, RGASP Consortium, et al. 2013. Assessment of transcript reconstruction methods for RNA-seq. Nat Methods 10: 1177–1184.
Stevenson KR, Coolon JD, Wittkopp PJ. 2013. Sources of bias in measures of allele-specific expression derived from RNA-seq data aligned to a single reference genome. BMC Genomics 14: 536.
Stranger BE, Montgomery SB, Dimas AS, Parts L, Stegle O, Ingle CE, Sekowska M, Smith GD, Evans D, Gutierrez-Arcelus M, et al. 2012. Patterns of cis regulatory variation in diverse human populations. PLoS Genet 8: e1002639.
Sun M, Schwalb B, Schulz D, Pirkl N, Etzold S, Lariviere L, Maier KC, Seizl M, Tresch A, Cramer P. 2012. Comparative dynamic transcriptome analysis (cDTA) reveals mutual feedback between mRNA synthesis and degradation. Genome Res 22: 1350–1359.
Sun W. 2012. A statistical framework for eQTL mapping using RNA-seq data. Biometrics 68: 1–11.
Sun W, Hu Y. 2013. eQTL mapping using RNA-seq data. Statist Biosci 5: 198–219.
Tang F, Barbacioru C, Nordman E, Li B, Xu N, Bashkirov VI, Lao K, Surani MA. 2010. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nat Protocols 5: 516–535.
Tang F, Barbacioru C, Wang Y, Nordman E, Lee C, Xu N, Wang X, Bodeau J, Tuch BB, Siddiqui A, et al. 2009. mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6: 377–382.
Tarazona S, Garcia-Alcalde F, Dopazo J, Ferrer A, Conesa A. 2011. Differential expression in RNA-seq: A matter of depth. Genome Res 21: 2213–2223.
't Hoen PA, Friedlander MR, Almlof J, Sammeth M, Pulyakhina I, Anvar SY, Laros JF, Buermans HP, Karlberg O, Brannvall M, et al. 2013. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat Biotechnol 31: 1015–1022.
Thompson KL, Pine PS, Rosenzweig BA, Turpaz Y, Retief J. 2007. Characterization of the effect of sample quality on high density oligonucleotide microarray data using progressively degraded rat liver RNA. BMC Biotechnol 7: 57.
Tomita H, Vawter MP, Walsh DM, Evans SJ, Choudary PV, Li J, Overman KM, Atz ME, Myers RM, Jones EG, et al. 2004. Effect of agonal and postmortem factors on gene expression profile: Quality control in microarray analyses of postmortem human brain. Biol Psychiatry 55: 346–352.
Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol 31: 46–53.
Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: Discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111.
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. 2012. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protocols 7: 562–578.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28: 511–515.
Tuch BB, Laborde RR, Xu X, Gu J, Chung CB, Monighetti CK, Stanley SJ, Olsen KD, Kasperbauer JL, Moore EJ, et al. 2010. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations. PLoS ONE 5: e9317.
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. 1995. Serial analysis of gene expression. Science 270: 484–487.
Vivancos AP, Guell M, Dohm JC, Serrano L, Himmelbauer H. 2010. Strand-specific deep sequencing of the transcriptome. Genome Res 20: 989–999.
Wang K, Singh D, Zeng Z, Coleman SJ, Huang Y, Savich GL, He X, Mieczkowski P, Grimm SA, Perou CM, et al. 2010a. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res 38: e178.
Wang L, Feng Z, Wang X, Wang X, Zhang X. 2010b. DEGseq: An R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics 26: 136–138.
Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: A revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63.
Wei X, Wang X. 2013. A computational workflow to identify allele-specific expression and epigenetic modification in maize. Genomics Proteomics Bioinformatics 11: 247–252.
Westra HJ, Peters MJ, Esko T, Yaghootkar H, Schurmann C, Kettunen J, Christiansen MW, Fairfax BP, Schramm K, Powell JE, et al. 2013. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet 45: 1238–1243.
Wilusz JE, Sunwoo H, Spector DL. 2009. Long noncoding RNAs: Functional surprises from the RNA world. Genes Dev 23: 1494–1504.
Wu AR, Neff NF, Kalisky T, Dalerba P, Treutlein B, Rothenberg ME, Mburu FM, Mantalas GL, Sim S, Clarke MF, et al. 2014. Quantitative assessment of single-cell RNA-sequencing methods. Nat Methods 11: 41–46.
Wu TD, Nacu S. 2010. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics 26: 873–881.
Yang E, van Nimwegen E, Zavolan M, Rajewsky N, Schroeder M, Magnasco M, Darnell JE Jr. 2003. Decay rates of human mRNAs: Correlation with functional characteristics and sequence attributes. Genome Res 13: 1863–1872.
Yekutieli D, Benjamini Y. 1999. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J Stat Plan Infer 82: 171–196.
Zeng W, Mortazavi A. 2012. Technical considerations for functional sequencing assays. Nat Immunol 13: 802–807.
Zhang X, Huang SP, Sun W, Wang W. 2012. Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study. Genetics 190: 1511–1520.
Zook JM, Samarov D, McDaniel J, Sen SK, Salit M. 2012. Synthetic spike-in standards improve run-specific systematic error analysis for DNA and RNA sequencing. PLoS ONE 7: e41356.

[1] ↵

Abecasis GR,

Cherny SS,

Cookson WO,

Cardon LR

Abecasis GR, Cherny SS, Cookson WO, Cardon LR. 2002. Merlin—Rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30: 97–101.

CrossRef Medline Google Scholar

[2] Abecasis GR,

[3] Cherny SS,

[4] Cookson WO,

[5] Cardon LR

[6] ↵

Adams MD,

Kelley JM,

Gocayne JD,

Dubnick M,

Polymeropoulos MH,

Xiao H,

Merril CR,

Wu A,

Olde B,

Moreno RF

Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, et al. 1991. Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252: 1651–1656.

FREE Full Text

[7] Adams MD,

[8] Kelley JM,

[9] Gocayne JD,

[10] Dubnick M,

[11] Polymeropoulos MH,

[12] Xiao H,

[13] Merril CR,

[14] Wu A,

[15] Olde B,

[16] Moreno RF

[17] ↵

Adams MD,

Kerlavage AR,

Fleischmann RD,

Fuldner RA,

Bult CJ,

Lee NH,

Kirkness EF,

Weinstock KG,

Gocayne JD,

White O

Adams MD, Kerlavage AR, Fleischmann RD, Fuldner RA, Bult CJ, Lee NH, Kirkness EF, Weinstock KG, Gocayne JD, White O, et al. 1995. Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature 377: 3–174.

Medline Google Scholar

[18] Adams MD,

[19] Kerlavage AR,

[20] Fleischmann RD,

[21] Fuldner RA,

[22] Bult CJ,

[23] Lee NH,

[24] Kirkness EF,

[25] Weinstock KG,

[26] Gocayne JD,

[27] White O

[28] ↵

Alizadeh AA,

Eisen MB,

Davis RE,

Ma C,

Lossos IS,

Rosenwald A,

Boldrick JC,

Sabet H,

Tran T,

Yu X

Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, et al. 2000. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503–511.

CrossRef Medline Google Scholar

[29] Alizadeh AA,

[30] Eisen MB,

[31] Davis RE,

[32] Ma C,

[33] Lossos IS,

[34] Rosenwald A,

[35] Boldrick JC,

[36] Sabet H,

[37] Tran T,

[38] Yu X

[39] ↵

Altshuler DM,

Gibbs RA,

Peltonen L,

Dermitzakis E,

Schaffner SF,

Yu FL,

Bonnen PE,

de Bakker PIW,

Deloukas P,

Gabriel SB

Altshuler DM, Gibbs RA, Peltonen L, Dermitzakis E, Schaffner SF, Yu FL, Bonnen PE, de Bakker PIW, Deloukas P, Gabriel SB, et al. 2010. Integrating common and rare genetic variation in diverse human populations. Nature 467: 52–58.

CrossRef Medline Google Scholar

[40] Altshuler DM,

[41] Gibbs RA,

[42] Peltonen L,

[43] Dermitzakis E,

[44] Schaffner SF,

[45] Yu FL,

[46] Bonnen PE,

[47] de Bakker PIW,

[48] Deloukas P,

[49] Gabriel SB

[50] ↵

An J,

Lai J,

Lehman ML,

Nelson CC

An J, Lai J, Lehman ML, Nelson CC. 2013. miRDeep*: An integrated application tool for miRNA identification from RNA sequencing data. Nucleic Acids Res 41: 727–737.

FREE Full Text

[51] An J,

[52] Lai J,

[53] Lehman ML,

[54] Nelson CC

[55] ↵

Anders S,

Huber W

Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome Biol 11: R106.

CrossRef Medline Google Scholar

[56] Anders S,

[57] Huber W

[58] ↵

Anders S,

McCarthy DJ,

Chen Y,

Okoniewski M,

Smyth GK,

Huber W,

Robinson MD

Anders S, McCarthy DJ, Chen Y, Okoniewski M, Smyth GK, Huber W, Robinson MD. 2013. Count-based differential expression analysis of RNA sequencing data using R and bioconductor. Nature protocols 8: 1765–1786.

CrossRef Medline Google Scholar

[59] Anders S,

[60] McCarthy DJ,

[61] Chen Y,

[62] Okoniewski M,

[63] Smyth GK,

[64] Huber W,

[65] Robinson MD

[66] ↵

Au KF,

Sebastiano V,

Afshar PT,

Durruthy JD,

Lee L,

Williams BA,

van Bakel H,

Schadt EE,

Reijo-Pera RA,

Underwood JG

Au KF, Sebastiano V, Afshar PT, Durruthy JD, Lee L, Williams BA, van Bakel H, Schadt EE, Reijo-Pera RA, Underwood JG, et al. 2013. Characterization of the human ESC transcriptome by hybrid sequencing. Proc Natl Acad Sci 110: E4821–E4830.

FREE Full Text

[67] Au KF,

[68] Sebastiano V,

[69] Afshar PT,

[70] Durruthy JD,

[71] Lee L,

[72] Williams BA,

[73] van Bakel H,

[74] Schadt EE,

[75] Reijo-Pera RA,

[76] Underwood JG

[77] ↵

Auer PL,

Doerge RW

Auer PL, Doerge RW. 2010. Statistical design and analysis of RNA sequencing data. Genetics 185: 405–416.

FREE Full Text

[78] Auer PL,

[79] Doerge RW

[80] ↵

Battle A,

Mostafavi S,

Zhu X,

Potash JB,

Weissman MM,

McCormick C,

Haudenschild CD,

Beckman KB,

Shi J,

Mei R

Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, et al. 2013. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 24: 14–24.

Medline Google Scholar

[81] Battle A,

[82] Mostafavi S,

[83] Zhu X,

[84] Potash JB,

[85] Weissman MM,

[86] McCormick C,

[87] Haudenschild CD,

[88] Beckman KB,

[89] Shi J,

[90] Mei R

[91] ↵

Benjamini Y,

Hochberg Y

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate—A practical and powerful approach to multiple testing. J Roy Stat Soc B Met 57: 289–300.

Google Scholar

[92] Benjamini Y,

[93] Hochberg Y

[94] ↵

Bentley DR,

Balasubramanian S,

Swerdlow HP,

Smith GP,

Milton J,

Brown CG,

Hall KP,

Evers DJ,

Barnes CL,

Bignell HR

Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456: 53–59.

CrossRef Medline Google Scholar

[95] Bentley DR,

[96] Balasubramanian S,

[97] Swerdlow HP,

[98] Smith GP,

[99] Milton J,

[100] Brown CG,

[101] Hall KP,

[102] Evers DJ,

[103] Barnes CL,

[104] Bignell HR

[105] ↵

Birney E,

Stamatoyannopoulos Ja,

Dutta A,

Guigó R,

Gingeras TR,

Margulies EH,

Weng Z,

Snyder M,

Dermitzakis ET,

Thurman RE

Birney E, Stamatoyannopoulos Ja, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799–816.

CrossRef Medline Google Scholar

[106] Birney E,

[107] Stamatoyannopoulos Ja,

[108] Dutta A,

[109] Guigó R,

[110] Gingeras TR,

[111] Margulies EH,

[112] Weng Z,

[113] Snyder M,

[114] Dermitzakis ET,

[115] Thurman RE

[116] ↵

Blencowe BJ,

Ahmad S,

Lee LJ

Blencowe BJ, Ahmad S, Lee LJ. 2009. Current-generation high-throughput sequencing: Deepening insights into mammalian transcriptomes. Genes Dev 23: 1379–1386.

FREE Full Text

[117] Blencowe BJ,

[118] Ahmad S,

[119] Lee LJ

[120] ↵

Brennecke P,

Anders S,

Kim JK,

Kolodziejczyk AA,

Zhang X,

Proserpio V,

Baying B,

Benes V,

Teichmann SA,

Marioni JC

Brennecke P, Anders S, Kim JK, Kolodziejczyk AA, Zhang X, Proserpio V, Baying B, Benes V, Teichmann SA, Marioni JC, et al. 2013. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10: 1093–1095.

CrossRef Medline Google Scholar

[121] Brennecke P,

[122] Anders S,

[123] Kim JK,

[124] Kolodziejczyk AA,

[125] Zhang X,

[126] Proserpio V,

[127] Baying B,

[128] Benes V,

[129] Teichmann SA,

[130] Marioni JC

[131] ↵

Bullard JH,

Purdom E,

Hansen KD,

Dudoit S

Bullard JH, Purdom E, Hansen KD, Dudoit S. 2010. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11: 94.

CrossRef Medline Google Scholar

[132] Bullard JH,

[133] Purdom E,

[134] Hansen KD,

[135] Dudoit S

[136] ↵

Cantor H,

Simpson E,

Sato VL,

Fathman CG,

Herzenberg LA

Cantor H, Simpson E, Sato VL, Fathman CG, Herzenberg LA. 1975. Characterization of subpopulations of T lymphocytes. I. Separation and functional studies of peripheral T-cells binding different amounts of fluorescent anti-Thy 1.2 (theta) antibody using a fluorescence-activated cell sorter (FACS). Cell Immunol 15: 180–196.

CrossRef Medline Google Scholar

[137] Cantor H,

[138] Simpson E,

[139] Sato VL,

[140] Fathman CG,

[141] Herzenberg LA

[142] ↵

Carneiro MO,

Russ C,

Ross MG,

Gabriel SB,

Nusbaum C,

DePristo MA

Carneiro MO, Russ C, Ross MG, Gabriel SB, Nusbaum C, DePristo MA. 2012. Pacific biosciences sequencing technology for genotyping and variation discovery in human data. BMC Genomics 13: 375.

CrossRef Medline Google Scholar

[143] Carneiro MO,

[144] Russ C,

[145] Ross MG,

[146] Gabriel SB,

[147] Nusbaum C,

[148] DePristo MA

[149] ↵

Casneuf T,

Van de Peer Y,

Huber W

Casneuf T, Van de Peer Y, Huber W. 2007. In situ analysis of cross-hybridisation on microarrays and the inference of expression correlation. BMC Bioinformatics 8: 461.

CrossRef Medline Google Scholar

[150] Casneuf T,

[151] Van de Peer Y,

[152] Huber W

[153] ↵

Christodoulou DC,

Gorham JM,

Herman DS,

Seidman JG

Christodoulou DC, Gorham JM, Herman DS, Seidman JG. 2011. Construction of normalized RNA-seq libraries for next-generation sequencing using the crab duplex-specific nuclease. Current Protocols in Molecular Biology/edited by Frederick M Ausubel, [et al.] Chapter 4: Unit 4 12.

Google Scholar

[154] Christodoulou DC,

[155] Gorham JM,

[156] Herman DS,

[157] Seidman JG

[158] ↵

Crick F

Crick F. 1970. Central dogma of molecular biology. Nature 227: 561–563.

CrossRef Medline Google Scholar

[159] Crick F

[160] ↵

Crick FH

Crick FH. 1958. On protein synthesis. Symp Soc Exp Biol 12: 138–163.

Medline Google Scholar

[161] Crick FH

[162] ↵

Cui X,

Churchill GA

Cui X, Churchill GA. 2003. Statistical tests for differential expression in cDNA microarray experiments. Genome Biol 4: 210.

CrossRef Medline Google Scholar

[163] Cui X,

[164] Churchill GA

[165] ↵

Degner JF,

Marioni JC,

Pai AA,

Pickrell JK,

Nkadori E,

Gilad Y,

Pritchard JK

Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK. 2009. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25: 3207–3212.

FREE Full Text

[166] Degner JF,

[167] Marioni JC,

[168] Pai AA,

[169] Pickrell JK,

[170] Nkadori E,

[171] Gilad Y,

[172] Pritchard JK

[173] ↵

Deng Q,

Ramskold D,

Reinius B,

Sandberg R

Deng Q, Ramskold D, Reinius B, Sandberg R. 2014. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343: 193–196.

FREE Full Text

[174] Deng Q,

[175] Ramskold D,

[176] Reinius B,

[177] Sandberg R

[178] ↵

Dimas AS,

Deutsch S,

Stranger BE,

Montgomery SB,

Borel C,

Attar-Cohen H,

Ingle C,

Beazley C,

Gutierrez Arcelus M,

Sekowska M

Dimas AS, Deutsch S, Stranger BE, Montgomery SB, Borel C, Attar-Cohen H, Ingle C, Beazley C, Gutierrez Arcelus M, Sekowska M, et al. 2009. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science 325: 1246–1250.

FREE Full Text

[179] Dimas AS,

[180] Deutsch S,

[181] Stranger BE,

[182] Montgomery SB,

[183] Borel C,

[184] Attar-Cohen H,

[185] Ingle C,

[186] Beazley C,

[187] Gutierrez Arcelus M,

[188] Sekowska M

[189] ↵

Dobin A,

Davis CA,

Schlesinger F,

Drenkow J,

Zaleski C,

Jha S,

Batut P,

Chaisson M,

Gingeras TR

Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21.

FREE Full Text

[190] Dobin A,

[191] Davis CA,

[192] Schlesinger F,

[193] Drenkow J,

[194] Zaleski C,

[195] Jha S,

[196] Batut P,

[197] Chaisson M,

[198] Gingeras TR

[199] ↵

Eid J,

Fehr A,

Gray J,

Luong K,

Lyle J,

Otto G,

Peluso P,

Rank D,

Baybayan P,

Bettman B

Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, et al. 2009. Real-time DNA sequencing from single polymerase molecules. Science 323: 133–138.

FREE Full Text

[200] Eid J,

[201] Fehr A,

[202] Gray J,

[203] Luong K,

[204] Lyle J,

[205] Otto G,

[206] Peluso P,

[207] Rank D,

[208] Baybayan P,

[209] Bettman B

[210] ↵

Eminaga S,

Christodoulou DC,

Vigneault F,

Church GM,

Seidman JG

Eminaga S, Christodoulou DC, Vigneault F, Church GM, Seidman JG. 2013. Quantification of microRNA expression with next-generation sequencing. Current Protocols in Molecular Biology/edited by Frederick M Ausubel [et al.] Chapter 4: Unit 4 17.

Google Scholar

[211] Eminaga S,

[212] Christodoulou DC,

[213] Vigneault F,

[214] Church GM,

[215] Seidman JG

[216] ↵

Emmert-Buck MR,

Bonner RF,

Smith PD,

Chuaqui RF,

Zhuang Z,

Goldstein SR,

Weiss RA,

Liotta LA

Emmert-Buck MR, Bonner RF, Smith PD, Chuaqui RF, Zhuang Z, Goldstein SR, Weiss RA, Liotta LA. 1996. Laser capture microdissection. Science 274: 998–1001.

FREE Full Text

[217] Emmert-Buck MR,

[218] Bonner RF,

[219] Smith PD,

[220] Chuaqui RF,

[221] Zhuang Z,

[222] Goldstein SR,

[223] Weiss RA,

[224] Liotta LA

[225] ↵

Engstrom PG,

Steijger T,

Sipos B,

Grant GR,

Kahles A,

Consortium R,

Alioto T,

Behr J,

Bertone P,

Bohnert R

Engstrom PG, Steijger T, Sipos B, Grant GR, Kahles A, Consortium R, Alioto T, Behr J, Bertone P, Bohnert R, et al. 2013. Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods 10: 1185–1191.

CrossRef Medline Google Scholar

[226] Engstrom PG,

[227] Steijger T,

[228] Sipos B,

[229] Grant GR,

[230] Kahles A,

[231] Consortium R,

[232] Alioto T,

[233] Behr J,

[234] Bertone P,

[235] Bohnert R

[236] ↵

Erkkila T,

Lehmusvaara S,

Ruusuvuori P,

Visakorpi T,

Shmulevich I,

Lahdesmaki H

Erkkila T, Lehmusvaara S, Ruusuvuori P, Visakorpi T, Shmulevich I, Lahdesmaki H. 2010. Probabilistic analysis of gene expression measurements from heterogeneous tissues. Bioinformatics 26: 2571–2577.

FREE Full Text

[237] Erkkila T,

[238] Lehmusvaara S,

[239] Ruusuvuori P,

[240] Visakorpi T,

[241] Shmulevich I,

[242] Lahdesmaki H

[243] ↵

Fehrmann RSN,

Jansen RC,

Veldink JH,

Westra HJ,

Arends D,

Bonder MJ,

Fu JY,

Deelen P,

Groen HJM,

Smolonska A

Fehrmann RSN, Jansen RC, Veldink JH, Westra HJ, Arends D, Bonder MJ, Fu JY, Deelen P, Groen HJM, Smolonska A, et al. 2011. Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA. PLoS Genet 7: e1002197.

CrossRef Medline Google Scholar

[244] Fehrmann RSN,

[245] Jansen RC,

[246] Veldink JH,

[247] Westra HJ,

[248] Arends D,

[249] Bonder MJ,

[250] Fu JY,

[251] Deelen P,

[252] Groen HJM,

[253] Smolonska A

[254] ↵

Flutre T,

Wen X,

Pritchard J,

Stephens M

Flutre T, Wen X, Pritchard J, Stephens M. 2013. A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genet 9: e1003486.

CrossRef Medline Google Scholar

[255] Flutre T,

[256] Wen X,

[257] Pritchard J,

[258] Stephens M

[259] ↵

Frazer KA,

Ballinger DG,

Cox DR,

Hinds DA,

Stuve LL,

Gibbs RA,

Belmont JW,

Boudreau A,

Hardenbol P,

Leal SM

Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, et al. 2007. A second generation human haplotype map of over 3.1 million SNPs. Nature 449: U851–U853.

Google Scholar

[260] Frazer KA,

[261] Ballinger DG,

[262] Cox DR,

[263] Hinds DA,

[264] Stuve LL,

[265] Gibbs RA,

[266] Belmont JW,

[267] Boudreau A,

[268] Hardenbol P,

[269] Leal SM

[270] ↵

Fu GK,

Xu W,

Wilhelmy J,

Mindrinos M,

Davis RW,

Xiao W,

Fodor SPA

Fu GK, Xu W, Wilhelmy J, Mindrinos M, Davis RW, Xiao W, Fodor SPA. 2014. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations. Proc Natl Assoc Sci 111: 1891–1896.

Google Scholar

[271] Fu GK,

[272] Xu W,

[273] Wilhelmy J,

[274] Mindrinos M,

[275] Davis RW,

[276] Xiao W,

[277] Fodor SPA

[278] ↵

Gamazon ER,

Ziliak D,

Im HK,

LaCroix B,

Park DS,

Cox NJ,

Huang RS

Gamazon ER, Ziliak D, Im HK, LaCroix B, Park DS, Cox NJ, Huang RS. 2012. Genetic architecture of microRNA expression: Implications for the transcriptome and complex traits. Am J Hum Genet 90: 1046–1063.

CrossRef Medline Google Scholar

[279] Gamazon ER,

[280] Ziliak D,

[281] Im HK,

[282] LaCroix B,

[283] Park DS,

[284] Cox NJ,

[285] Huang RS

[286] ↵

Ge B,

Pokholok DK,

Kwan T,

Grundberg E,

Morcos L,

Verlaan DJ,

Le J,

Koka V,

Lam KC,

Gagne V

Ge B, Pokholok DK, Kwan T, Grundberg E, Morcos L, Verlaan DJ, Le J, Koka V, Lam KC, Gagne V, et al. 2009. Global patterns of cis variation in human cells revealed by high-density allelic expression analysis. Nat Genet 41: 1216–1222.

CrossRef Medline Google Scholar

[287] Ge B,

[288] Pokholok DK,

[289] Kwan T,

[290] Grundberg E,

[291] Morcos L,

[292] Verlaan DJ,

[293] Le J,

[294] Koka V,

[295] Lam KC,

[296] Gagne V

[297] ↵

Glaus P,

Honkela A,

Rattray M

Glaus P, Honkela A, Rattray M. 2012. Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics 28: 1721–1728.

FREE Full Text

[298] Glaus P,

[299] Honkela A,

[300] Rattray M

[301] ↵

Grabherr MG,

Haas BJ,

Yassour M,

Levin JZ,

Thompson DA,

Amit I,

Adiconis X,

Fan L,

Raychowdhury R,

Zeng Q

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29: 644–652.

CrossRef Medline Google Scholar

[302] Grabherr MG,

[303] Haas BJ,

[304] Yassour M,

[305] Levin JZ,

[306] Thompson DA,

[307] Amit I,

[308] Adiconis X,

[309] Fan L,

[310] Raychowdhury R,

[311] Zeng Q

[312] ↵

Grant GR,

Farkas MH,

Pizarro AD,

Lahens NF,

Schug J,

Brunk BP,

Stoeckert CJ,

Hogenesch JB,

Pierce EA

Grant GR, Farkas MH, Pizarro AD, Lahens NF, Schug J, Brunk BP, Stoeckert CJ, Hogenesch JB, Pierce EA. 2011. Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics 27: 2518–2528.

FREE Full Text

[313] Grant GR,

[314] Farkas MH,

[315] Pizarro AD,

[316] Lahens NF,

[317] Schug J,

[318] Brunk BP,

[319] Stoeckert CJ,

[320] Hogenesch JB,

[321] Pierce EA

[322] ↵

Grant GR,

Liu J,

Stoeckert CJ Jr

Grant GR, Liu J, Stoeckert CJ Jr. 2005. A practical false discovery rate approach to identifying patterns of differential expression in microarray data. Bioinformatics 21: 2684–2690.

FREE Full Text

[323] Grant GR,

[324] Liu J,

[325] Stoeckert CJ Jr

[326] ↵

Griebel T,

Zacher B,

Ribeca P,

Raineri E,

Lacroix V,

Guigo R,

Sammeth M

Griebel T, Zacher B, Ribeca P, Raineri E, Lacroix V, Guigo R, Sammeth M. 2012. Modelling and simulating generic RNA-Seq experiments with the flux simulator. Nucleic Acids Res 40: 10073–10083.

FREE Full Text

[327] Griebel T,

[328] Zacher B,

[329] Ribeca P,

[330] Raineri E,

[331] Lacroix V,

[332] Guigo R,

[333] Sammeth M

[334] ↵

Griffith M,

Griffith OL,

Mwenifumbo J,

Goya R,

Morrissy AS,

Morin RD,

Corbett R,

Tang MJ,

Hou YC,

Pugh TJ

Griffith M, Griffith OL, Mwenifumbo J, Goya R, Morrissy AS, Morin RD, Corbett R, Tang MJ, Hou YC, Pugh TJ, et al. 2010. Alternative expression analysis by RNA sequencing. Nat Methods 7: 843–847.

CrossRef Medline Google Scholar

[335] Griffith M,

[336] Griffith OL,

[337] Mwenifumbo J,

[338] Goya R,

[339] Morrissy AS,

[340] Morin RD,

[341] Corbett R,

[342] Tang MJ,

[343] Hou YC,

[344] Pugh TJ

[345] ↵

Grundberg E,

Small KS,

Hedman AK,

Nica AC,

Buil A,

Keildson S,

Bell JT,

Yang TP,

Meduri E,

Barrett A

Grundberg E, Small KS, Hedman AK, Nica AC, Buil A, Keildson S, Bell JT, Yang TP, Meduri E, Barrett A, et al. 2012. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat Genet 44: 1084–1089.

CrossRef Medline Google Scholar

[346] Grundberg E,

[347] Small KS,

[348] Hedman AK,

[349] Nica AC,

[350] Buil A,

[351] Keildson S,

[352] Bell JT,

[353] Yang TP,

[354] Meduri E,

[355] Barrett A

[356] ↵

Guttman M,

Amit I,

Garber M,

French C,

Lin MF,

Feldser D,

Huarte M,

Zuk O,

Carey BW,

Cassady JP

Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, et al. 2009. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458: 223–227.

CrossRef Medline Google Scholar

[357] Guttman M,

[358] Amit I,

[359] Garber M,

[360] French C,

[361] Lin MF,

[362] Feldser D,

[363] Huarte M,

[364] Zuk O,

[365] Carey BW,

[366] Cassady JP

[367] ↵

Hackenberg M,

Rodriguez-Ezpeleta N,

Aransay AM

Hackenberg M, Rodriguez-Ezpeleta N, Aransay AM. 2011. miRanalyzer: An update on the detection and analysis of microRNAs in high-throughput sequencing experiments. Nucleic Acids Res 39: W132–W138.

FREE Full Text

[368] Hackenberg M,

[369] Rodriguez-Ezpeleta N,

[370] Aransay AM

[371] ↵

Hardcastle TJ,

Kelly KA

Hardcastle TJ, Kelly KA. 2010. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics 11: 422.

CrossRef Medline Google Scholar

[372] Hardcastle TJ,

[373] Kelly KA

[374] ↵

Hashimshony T,

Wagner F,

Sher N,

Yanai I

Hashimshony T, Wagner F, Sher N, Yanai I. 2012. CEL-Seq: Single-cell RNA-Seq by multiplexed linear amplification. Cell Reports 2: 666–673.

Huang R, Jaritz M, Guenzl P, Vlatkovic I, Sommer A, Tamir IM, Marks H, Klampfl T, Kralovics R, Stunnenberg HG, et al. 2011. An RNA-Seq strategy to detect the complete coding and non-coding transcriptome including full-length imprinted macro ncRNAs. PLoS ONE 6: e27288.

Huang S. 2009. Non-genetic heterogeneity of cells in development: More than just noise. Development 136: 3853–3862.

Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S. 2012. Highly multiplexed and strand-specific single-cell RNA 5′ end sequencing. Nat Protoc 7: 813–828.

Itoh K, Matsubara K, Okubo K. 1994. Identification of an active gene by using large-scale cDNA sequencing. Gene 140: 295–296.

Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. 2011. Synthetic spike-in standards for RNA-seq experiments. Genome Res 21: 1543–1551.

Karlsson A. 2006. Review of “Permutation, parametric, and bootstrap tests of hypotheses.” J R Stat Soc A Stat 169: 171–171.

Katz Y, Wang ET, Airoldi EM, Burge CB. 2010. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods 7: 1009–1015.

Kawasaki ES. 2004. Microarrays and the gene expression profile of a single cell. Ann N Y Acad Sci 1020: 92–100.

Kendziorski C, Wang P. 2006. A review of statistical methods for expression quantitative trait loci mapping. Mamm Genome 17: 509–517.

Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, Berthold F, Schwab M, Antonescu CR, Peterson C, et al. 2001. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 7: 673–679.

Kleinman CL, Majewski J. 2012. Comment on “Widespread RNA and DNA sequence differences in the human transcriptome.” Science 335: 1302.

Kozomara A, Griffiths-Jones S. 2014. miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 42: D68–D73.

Kube DM, Savci-Heijink CD, Lamblin AF, Kosari F, Vasmatzis G, Cheville JC, Connelly DP, Klee GG. 2007. Optimization of laser capture microdissection and RNA amplification for gene expression profiling of prostate cancer. BMC Mol Biol 8: 25.

Kumar V, Westra HJ, Karjalainen J, Zhernakova DV, Esko T, Hrdlickova B, Almeida R, Zhernakova A, Reinmaa E, Vosa U, et al. 2013. Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet 9: e1003201.

Kwan T, Grundberg E, Koka V, Ge B, Lam KC, Dias C, Kindmark A, Mallmin H, Ljunggren O, Rivadeneira F, et al. 2009. Tissue effect on genetic control of transcript isoform variation. PLoS Genet 5: e1000608.

Lalonde E, Ha KC, Wang Z, Bemmo A, Kleinman CL, Kwan T, Pastinen T, Majewski J. 2011. RNA sequencing reveals the role of splicing polymorphisms in regulating human gene expression. Genome Res 21: 545–554.

Langmead B, Hansen KD, Leek JT. 2010. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol 11: R83.

Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10: R25.

Lappalainen T, Montgomery SB, Nica AC, Dermitzakis ET. 2011. Epistatic selection between coding and regulatory variation in human evolution and disease. Am J Hum Genet 89: 459–463.

Lappalainen T, Sammeth M, Friedlander MR, 't Hoen PA, Monlong J, Rivas MA, Gonzalez-Porta M, Kurbatova N, Griebel T, Ferreira PG, et al. 2013. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501: 506–511.
