introns within eukaryotic genes, and catalyzes their removal and the ligation of the flanking exons. Spliceosomal introns are very rare in trypanosomes (Berriman et al., 2005), and available evidence suggests they are relatively rare in dinoflagellates as well (Bachvaroff and Place, 2008). In contrast, every mRNA in both groups has a 5′ spliced leader (SL) sequence that is added by trans-splicing. The SL, also called a miniexon, is a short conserved sequence that is encoded by a high-copy-number family of genes throughout the genome. In dinoflagellates, the same 22-bp fragment is added to all transcripts, and the sequence is also conserved across the entire group (Zhang et al., 2007; Slamovits and Keeling, 2008). In kinetoplastids, the SLs are conserved within a given genome, but vary in size and sequence between species (Campbell et al., 2003). The SL is expressed as a short RNA consisting of the leader sequence followed by a GT dinucleotide and a short stretch of sequence. Complementing this, mRNAs for protein-coding genes begin with a short stretch of sequence ending with an AG dinucleotide, followed by the 5′ untranslated region and the coding region. The spliceosome brings these 2 elements together and mediates the removal of the 2 intronic fragments and ligation of the SL to the 5′ end of the mRNA (Fig. 4.3) (Campbell et al., 2003).
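The logic of SL trans-splicing described above can be illustrated with a deliberately simplified sketch. The function name, its parameters, and both sequences are hypothetical and chosen only for illustration; they are not real SL or mRNA sequences. The sketch also assumes that the first AG in the pre-mRNA is the acceptor site, whereas real splice-site selection depends on additional sequence context.

```python
def trans_splice(sl_rna: str, leader_length: int, pre_mrna: str) -> str:
    """Toy model of SL trans-splicing on DNA-alphabet strings.

    sl_rna        -- leader, then a GT dinucleotide, then a short tail
    leader_length -- length of the leader portion (assumed known)
    pre_mrna      -- short upstream stretch ending in AG, then 5' UTR and CDS
    """
    # The leader must be followed immediately by the GT donor dinucleotide.
    if sl_rna[leader_length:leader_length + 2] != "GT":
        raise ValueError("no GT dinucleotide directly after the leader")

    # Simplifying assumption: the first AG in the pre-mRNA is the acceptor.
    acceptor = pre_mrna.find("AG")
    if acceptor == -1:
        raise ValueError("no AG dinucleotide found in the pre-mRNA")

    leader = sl_rna[:leader_length]      # retained: becomes the new 5' end
    body = pre_mrna[acceptor + 2:]       # retained: 5' UTR plus coding region
    # Discarded: the GT + tail of the SL RNA and the stretch up to and
    # including AG in the pre-mRNA (the two intron-like fragments).
    return leader + body


# Made-up sequences for illustration only.
toy_sl_rna = "AACCTTGGAACC" + "GT" + "ATCCA"
toy_pre_mrna = "TTCTCTCAG" + "CACC" + "ATGGCTTAA"
mature = trans_splice(toy_sl_rna, 12, toy_pre_mrna)
print(mature)
```

In this toy case the mature transcript consists of the 12-nucleotide leader joined directly to the 5′ untranslated region and coding region, with both intron-like fragments removed.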
The second major oddity shared by kinetoplastid and dinoflagellate nuclear gene expression is the presence of polycistronic messages. Once again, the canonical view of nuclear gene expression in eukaryotes centers on the transcription of a single gene at a time; this stands in contrast to prokaryotes, where multiple genes can be expressed on a single, multifunctional mRNA and many genes can be coregulated in operons. Complete genomic sequences from trypanosomatids demonstrate an organization where genes are distributed in contiguous clusters, ranging in size from a handful of genes to several hundred. In these clusters, which can exceed 1 Mb in length, genes are oriented on the same strand, usually toward the telomeres, with adjacent clusters located on opposite strands (Berriman et al., 2005). All of the genes within a contiguous cluster are transcribed as a single, sometimes very long, polycistronic mRNA. Relatively short AT-rich regions separate the clusters and are considered to contain the sites for transcription initiation and termination. Comparison of trypanosomatid genomes shows a high degree of conservation in gene order, even within clusters between flagellates that diverged 200–500 Mya (Ghedin et al., 2004).
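The cluster organization described above, with runs of genes on one strand alternating with runs on the opposite strand, can be recovered from a simple annotation table. The following is a minimal sketch under stated assumptions: the function name, the tuple layout (name, start coordinate, strand), and the gene records are all hypothetical, and strand switches alone are taken to delimit clusters, ignoring the AT-rich inter-cluster regions.

```python
from itertools import groupby


def directional_clusters(genes):
    """Group genes into runs of consecutive same-strand genes.

    genes -- iterable of (name, start, strand) tuples, strand being "+" or "-"
    Returns a list of (strand, [gene names]) in chromosomal order.
    """
    # Order genes along the chromosome by their start coordinate.
    ordered = sorted(genes, key=lambda gene: gene[1])
    # groupby merges only adjacent items, so each strand switch opens a new run.
    return [
        (strand, [name for name, _, _ in run])
        for strand, run in groupby(ordered, key=lambda gene: gene[2])
    ]


# Invented annotation for a small stretch of chromosome.
toy_genes = [
    ("g1", 100, "+"), ("g2", 900, "+"),
    ("g3", 2000, "-"), ("g4", 3100, "-"), ("g5", 4000, "-"),
    ("g6", 5200, "+"),
]
clusters = directional_clusters(toy_genes)
print(clusters)
```

Each run returned by the sketch corresponds to one candidate polycistronic transcription unit; on real trypanosomatid chromosomes such runs may contain hundreds of genes.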
It is important to point out that, in contrast to prokaryotes, these clusters do not contain genes of related function (Berriman et al., 2005) and they are not coordinately regulated like bacterial operons, so they should not be considered operons. These polycistronic messages are not even translated intact, but are processed to monomeric mRNAs before translation; these monomeric mRNAs are the substrate for trans-splicing by the