DNA (gene) → RNA (messenger RNA) → protein
RNA uses uracil (U) in place of thymine (T), so its bases are A, U, C, G rather than A, T, C, G.

| Term | Description |
|---|---|
| DNA | the molecule that stores genetic information; includes all genes and non-coding regions |
| gene | a segment of DNA that encodes instructions that can be transcribed |
| RNA | a temporary copy of a gene, used to make proteins |
| protein | functional molecule built from amino acids following RNA instructions; defined by its sequence of amino acids (polypeptide chain), each amino acid having its own R group |
| cell | the living unit where all this happens; different cells express different genes |
| exome | just the protein-coding regions (exons) of the DNA |

The human genome contains about 20,000-25,000 genes. A typical gene is from a few hundred to several thousand nucleotides long.

DNA
Genes provide instructions to make proteins, which perform some function within the cell. Although all cells contain the same DNA sequence, muscle cells are different from nerve cells and other types of cells because of the different genes that are turned on in these cells and the different RNAs and proteins produced.
To make proteins, the DNA is transcribed into messenger RNA (or mRNA), which is translated by the ribosome into protein. However, some genes encode RNA that does not get translated into protein; these RNAs are called non-coding RNAs (or ncRNAs). Often these RNAs have a function in and of themselves and include rRNAs, tRNAs, and siRNAs, among others.
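The two steps above (transcription, then translation) can be sketched in a few lines of Python. This is a toy illustration, not a model of the real machinery. It assumes the input is the coding strand of the DNA, whose sequence matches the mRNA except that T becomes U. The names `CODON_TABLE`, `transcribe`, and `translate` are made up for this note, and the codon table is only a small subset of the standard genetic code.

```python
# Partial codon table (illustrative subset of the standard genetic code).
CODON_TABLE = {
    "AUG": "Met",  # also serves as the start codon
    "GCU": "Ala", "GCC": "Ala",
    "GGU": "Gly", "GGC": "Gly",
    "UCU": "Ser", "AGC": "Ser",
    "CUG": "Leu", "UUA": "Leu",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand: same letters, with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read codons from position 0 until a stop codon or the end of the sequence."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE.get(codon, "???")  # "???" = codon missing from the partial table
        if residue == "Stop":
            break
        residues.append(residue)
    return residues


mrna = transcribe("ATGGCTGGTTCTCTGTAA")
peptide = translate(mrna)
print(mrna)               # AUGGCUGGUUCUCUGUAA
print("-".join(peptide))  # Met-Ala-Gly-Ser-Leu
```

A non-coding RNA would go through `transcribe` but never through `translate`, which is the distinction the paragraph above draws.
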
| Term | Description |
|---|---|
| genes | segments of DNA that provide instructions for making proteins (or functional RNAs) |
| regulatory regions | turn genes on/off (e.g. promoters, enhancers) |
| introns | parts of genes that are not used to make proteins (removed during RNA processing) |
| intergenic regions | "in between" genes: some may regulate gene expression, others are non-functional |
| repetitive DNA | repeats of certain sequences with structural or unknown functions |
| untranslated regions | not translated into protein; control how long the mRNA molecule survives in the cell, how effectively it is translated, and where in the cell it is located |
| open reading frame | translated into protein, specifying the sequence of amino acids of a polypeptide |

The four DNA bases are A, T, C, G: adenine, thymine, cytosine, guanine.

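To make the open reading frame and untranslated region entries concrete, here is a minimal sketch that scans a DNA string for the first ATG and reads triplets until a stop codon. It looks at one strand only and ignores introns. The function name `find_first_orf` and the example sequence are invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna: str):
    """Return (start, end, orf) for the first ATG-to-stop reading frame, or None.

    `start` and `end` are 0-based, end-exclusive indices into `dna`.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through the sequence codon by codon, in frame with this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3, dna[start:i + 3]
        # No in-frame stop codon: try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


example = "CCGATGGCTGGTTAAGGC"
result = find_first_orf(example)
print(result)  # (3, 15, 'ATGGCTGGTTAA')
```

In this toy sequence, everything before index 3 plays the role of the 5' untranslated region and everything from index 15 onward the 3' untranslated region.
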
RNA (Ribonucleic Acid)
All RNAs transcribed from genes are called transcripts. When a gene is transcribed, the resulting transcript covers only that gene, not the rest of the DNA.
To be translated into proteins, the RNA must undergo processing to generate the mRNA. In the figure below, the top strand in the image represents a gene in the DNA, composed of the untranslated regions (UTRs) and the open reading frame (ORF).
- The order is always 5’ UTR, start codon, open reading frame, stop codon, 3’ UTR. A codon is a triplet of nucleotides. Genes are transcribed into pre-mRNA, which still contains the intronic sequences. During post-transcriptional processing, a 5’ cap and polyA tail are added and the introns are spliced out to yield mature mRNA transcripts, which can be translated into proteins.
- The polyA tail provides more “buffer” for enzymes to chew through before they reach the actual coding sequence of the mRNA. This increases the lifespan of the mRNA molecule, allowing more protein to be produced from it. The tail is typically 100-250 adenosines long in mammalian cells.
While mRNA transcripts have a polyA tail, many of the non-coding RNA transcripts do not as the post-transcriptional processing is different for these transcripts.
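The processing steps described above can be mimicked on strings. This is a rough sketch under stated assumptions: intron positions are given explicitly as (start, end) index pairs rather than detected, the 5’ cap is not modeled because it is a chemical modification rather than extra sequence, and the default tail length of 150 is just a value picked from the 100-250 range mentioned above. The function name `process_pre_mrna` and the sequences are invented for this note.

```python
def process_pre_mrna(pre_mrna: str, introns: list[tuple[int, int]], poly_a_length: int = 150) -> str:
    """Splice out introns (0-based, end-exclusive index pairs) and append a polyA tail."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons) + "A" * poly_a_length


# Toy pre-mRNA: exons in upper case, introns in lower case for readability.
exon1, intron1, exon2, intron2, exon3 = "AUGGCU", "guaaguuuucag", "GGUUCU", "gugaguccuag", "CUGUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

intron1_start = len(exon1)
intron2_start = intron1_start + len(intron1) + len(exon2)
introns = [
    (intron1_start, intron1_start + len(intron1)),
    (intron2_start, intron2_start + len(intron2)),
]

mature = process_pre_mrna(pre_mrna, introns)
print(mature[:18])       # AUGGCUGGUUCUCUGUAA
print(len(mature) - 18)  # 150 (length of the polyA tail)
```
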
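Using RNA as an intermediate ensures efficiency and amplification: one gene can be transcribed into many mRNA copies, and each copy can be translated many times. Main components of RNA: ribose sugar + phosphate + bases (A, U, C, G).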
| Component | Description |
|---|---|
| ribose sugar | the sugar in RNA |
| phosphate group | links the nucleotides together |
| nitrogenous bases | adenine, uracil, cytosine, guanine |
Protein
There are 20 standard amino acids used in human proteins.
Proteins are made as long chains of amino acids. The 3D shape of the protein determines its function.
4 levels of protein structure:
- primary structure
  - the sequence of amino acids, e.g. Met-Ala-Gly-Ser-Leu-...
- secondary structure
  - local folding patterns
    - α-helix: a spiral shape
    - β-sheet: a folded or pleated sheet
  - stabilized by hydrogen bonds between backbone atoms
- tertiary structure
  - the full 3D shape of one polypeptide
  - formed by interactions between side chains (R-groups)
    - hydrophobic interactions
    - hydrogen bonds
    - ionic bonds
    - disulfide bridges (covalent bonds between cysteines)
- quaternary structure
  - if a protein has multiple chains (subunits), this level describes how they come together
  - e.g. hemoglobin has 4 subunits
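Only the primary structure is simple enough to handle as plain text. As a small sketch, the snippet below converts a three-letter sequence such as the one in the list above into the usual one-letter codes, and lists cysteine positions, since cysteines are the residues that can form disulfide bridges. The names `THREE_TO_ONE`, `to_one_letter`, and `cysteine_positions` are made up for this note, and the second example peptide is invented.

```python
# The 20 standard amino acids: three-letter code -> one-letter code.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(primary_structure: str) -> str:
    """Convert 'Met-Ala-Gly' style notation to one-letter codes ('MAG')."""
    return "".join(THREE_TO_ONE[residue] for residue in primary_structure.split("-"))


def cysteine_positions(one_letter_sequence: str) -> list[int]:
    """Return 1-based positions of cysteines, the candidates for disulfide bridges."""
    return [i for i, aa in enumerate(one_letter_sequence, start=1) if aa == "C"]


one_letter = to_one_letter("Met-Ala-Gly-Ser-Leu")
print(one_letter)                                                # MAGSL
print(cysteine_positions(to_one_letter("Met-Cys-Gly-Cys-Leu")))  # [2, 4]
```

Whether two cysteines actually form a bridge depends on the folded 3D shape, which this sketch does not attempt to predict.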
Folding is driven by the hydrophobic effect, hydrogen bonding, ionic interactions, disulfide bridges, and van der Waals forces (weak but numerous stabilizers).
Misfolded proteins can lose function or become toxic.
Enzymes: proteins that act as catalysts, speeding up chemical reactions in the cell.