The transcriptome is the complete set of RNA transcripts present in a cell. RNA-seq data can be used to explore and/or quantify the transcriptome of an organism, which supports the following types of experiments:

  • differential gene expression
    • quantitative evaluation and comparison of transcript levels between conditions
      • biological samples/library preparation
      • sequence reads
      • mapping/quantification
      • DGE with R
      • functional analysis with R
  • transcriptome assembly: building the profile of transcribed regions of the genome, a qualitative evaluation
  • refinement of gene models: building better gene models and verifying them using transcriptome assembly
  • metatranscriptomics: community transcriptome analysis
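
The "comparison of transcript levels between conditions" in the first bullet can be illustrated with a toy calculation. The following is only a minimal Python sketch, assuming made-up gene names and read counts, a simple counts-per-million scaling, and a pseudocount of 1. It is not the R-based DGE workflow listed above, and real analyses use replicates and statistical models.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

def log2_fold_change(control, treated, pseudocount=1.0):
    """Per-gene log2(treated / control) on CPM values.

    The pseudocount (an assumption of this sketch) avoids taking log of zero.
    """
    cpm_c = counts_per_million(control)
    cpm_t = counts_per_million(treated)
    return {
        gene: math.log2((cpm_t[gene] + pseudocount) / (cpm_c[gene] + pseudocount))
        for gene in cpm_c
    }

# Hypothetical raw counts for three invented genes in two conditions.
control = {"geneA": 100, "geneB": 500, "geneC": 400}
treated = {"geneA": 400, "geneB": 500, "geneC": 100}

lfc = log2_fold_change(control, treated)
for gene, value in sorted(lfc.items()):
    print(f"{gene}: {value:+.2f}")
```

A positive value means the gene has relatively more reads in the treated sample, a negative value means fewer, and a value near zero means no change.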

omics

High-throughput data, even from a single sample, is considered 'omics data.

  • genomics = the study of the complete set of DNA in an organism, a single cell, or a group of cells
  • likewise: transcriptomics = RNA, proteomics = proteins, metabolomics = metabolites

NOTE

The human genome is about 3.2 billion base pairs long, where each base is one of the nucleotides A, T, C, or G in DNA.

Any two humans are about 99.9% genetically identical and differ at roughly 3-4 million base pairs. These differences include:

  • SNPs (Single Nucleotide Polymorphisms): single letter changes
  • insertions/deletions: adding or removing a few bases
  • structural variants: larger chunks moved or repeated
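
The numbers in this note can be checked with a little arithmetic: 0.1% of 3.2 billion base pairs is 3.2 million, which falls in the quoted ~3-4 million range. The Python sketch below does that calculation and also sorts a variant into the three categories by comparing allele lengths. The function name `classify_variant` and the 50 bp cutoff for "structural variant" are assumptions made for illustration (the cutoff is a commonly used convention, not something these notes define).

```python
GENOME_SIZE_BP = 3_200_000_000   # approximate genome size quoted in the note
DIFFERING_PER_1000_BP = 1        # 99.9% identical means about 1 in 1000 differs

# Integer arithmetic keeps the result exact.
differing_bp = GENOME_SIZE_BP * DIFFERING_PER_1000_BP // 1000
print(f"Expected differing base pairs: {differing_bp:,}")

def classify_variant(ref, alt, sv_threshold=50):
    """Label a variant from its reference and alternate allele strings.

    sv_threshold is an assumed cutoff (in bp) above which a length
    change is called a structural variant instead of a small indel.
    """
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"
    size_change = abs(len(alt) - len(ref))
    if size_change >= sv_threshold:
        return "structural variant"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # e.g. identical alleles or a multi-base substitution

for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("A", "A" * 60)]:
    print(ref, "->", alt[:10], ":", classify_variant(ref, alt))
```

Length alone cannot tell whether a large segment was moved or repeated, so this covers only the simplest reading of each category.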

High-throughput sequencing (HTS) data is also called next-generation sequencing (NGS) data.