Bioinformatics—

The science of managing and analyzing biological data, including genomic research data, using advanced computing techniques.


Brassicaceae—

The mustard family. Members include Arabidopsis, canola, broccoli, rape, cabbage, kale, and cauliflower.

Cis-acting elements—

DNA sequences in the vicinity of the structural portion of a gene that regulate gene expression.

cDNA (complementary DNA) libraries—

A collection of DNA clones copied from a population of messenger RNA. Because the messenger RNA has already been spliced, the clones lack the non-coding intron sequences.
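
As a rough illustration of why a cDNA clone lacks introns, the sketch below joins the exon segments of a toy gene sequence. The function name `spliced_sequence`, the sequence, and the exon coordinates are hypothetical and chosen only for illustration; real cDNA synthesis involves reverse transcription of mRNA, which this sketch does not model.

```python
from typing import List, Tuple


def spliced_sequence(gene: str, exons: List[Tuple[int, int]]) -> str:
    """Join the exon segments of a gene, leaving out everything between them.

    `exons` holds half-open (start, end) index pairs, in order along the gene.
    Both the coordinates and the sequence are illustrative assumptions.
    """
    return "".join(gene[start:end] for start, end in exons)


# Toy gene: two exons (AAA, GGG) separated by an intron (CCC), plus trailing TTT.
toy_gene = "AAACCCGGGTTT"
toy_exons = [(0, 3), (6, 9)]
print(spliced_sequence(toy_gene, toy_exons))
```

The output contains only the exon bases, which is the portion of the gene a cDNA clone would represent.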

Comparative Genomics—

The comparison of gene and genome structure, function and evolution across taxa.


Coverage—

A representation of the accuracy of sequencing, expressed as the number of times each base has been read. For 5x coverage, a given base has been examined, or “covered,” 5 times on average.
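
The arithmetic behind a coverage figure can be sketched as follows. The formula used here (total bases read divided by the length of the region) is a common convention and an assumption on our part, not something stated in this glossary; the function names and numbers are hypothetical.

```python
from typing import List, Tuple


def mean_coverage(read_count: int, read_length: int, region_length: int) -> float:
    """Average coverage, assuming equal-length reads spread over the region."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return read_count * read_length / region_length


def per_base_coverage(reads: List[Tuple[int, int]], region_length: int) -> List[int]:
    """Count how many reads cover each position.

    `reads` holds half-open (start, end) intervals along the region.
    """
    depth = [0] * region_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, region_length)):
            depth[pos] += 1
    return depth


# 1,000 reads of 500 bases over a 100,000-base region give 5x coverage.
print(mean_coverage(1000, 500, 100_000))
# Two overlapping reads: the overlapping positions are covered twice.
print(per_base_coverage([(0, 4), (2, 6)], 6))
```

The per-base view shows that an average figure such as 5x can hide positions that were read more or fewer times.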


Diploid—

A state in which each type of chromosome is present as a pair of homologous chromosomes.

DNA (deoxyribonucleic acid)—

The fundamental molecule encoding genetic information. DNA is a double-stranded molecule whose two strands are held together by weak bonds between paired bases. The four nucleotides in DNA contain the bases adenine (A), guanine (G), cytosine (C), and thymine (T); A pairs with T, and G pairs with C.
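
Because A pairs with T and G pairs with C, one strand determines the other. A minimal sketch of that relationship is shown below; the function name `reverse_complement` and the example sequence are illustrative choices, and the code handles only the four unambiguous bases.

```python
# Standard base-pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction.

    Raises KeyError for any character other than A, C, G, or T.
    """
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


print(reverse_complement("ATGC"))
```

The sequence is reversed as well as complemented because the two strands of the double helix run in opposite directions.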

DNA sequence—

The relative order of the nucleotide bases making up the DNA along the chromosomes.

Draft sequence—

The determined order of base pairs in a chromosomal region, sequenced to a level of 4x to 5x coverage.

Copyright © National Academy of Sciences. All rights reserved.