Background:
Euglenozoa (Discoba) are known for unorthodox rDNA organization: rDNA may be located on extrachromosomal circles, and 28S rRNA is fragmented into smaller molecules separated by additional internal transcribed spacers (ITSs). Diplonemea is one of the main groups of Euglenozoa, and its members are among the most abundant and diverse protists in the oceans. Despite this, the rRNA of only one diplonemid species, Diplonema papillatum, has been examined so far, and it was found to have a continuous 28S rRNA. The rDNA organization of diplonemids has not yet been studied. To fill this gap in knowledge and allow for a better understanding of the evolution of the fragmented rDNA structure in Euglenozoa, we investigate the structure of rRNA genes in classical (Diplonemidae) and deep-sea (Eupelagonemidae) diplonemids, which together represent the majority of known diplonemid diversity.
Methods:
Raw reads from genome sequencing of various euglenozoans were retrieved from the Sequence Read Archive (SRA) and reassembled. Read quality was evaluated using FastQC v0.11.5, and reads were trimmed with Trimmomatic v0.36. Processed reads were assembled using metaSPAdes v3.10.1. The resulting assemblies were searched by blastn using rRNA sequences of Diplonema papillatum, Euglena gracilis and Crithidia fasciculata as queries. High-scoring hits with high coverage (more than 5x the genome average) were kept. Assembly graphs were manually inspected in Bandage to identify potential misassemblies. Where misassemblies were found, contigs containing rDNA were manually corrected and replaced in the assemblies.
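The coverage filter described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that contig names follow the SPAdes convention (NODE_<n>_length_<L>_cov_<C>), that the genome average is the length-weighted mean of the per-contig coverage values in those names, and that blastn results are in tabular format (-outfmt 6) with the rRNA reference as query and the contig as subject. The function names and the fold parameter are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# SPAdes-style contig names look like: NODE_12_length_3456_cov_78.9
HEADER_RE = re.compile(r"length_(\d+)_cov_(\d+(?:\.\d+)?)")


def parse_contig(name: str) -> Tuple[int, float]:
    """Return (length, coverage) encoded in a SPAdes-style contig name."""
    match = HEADER_RE.search(name)
    if match is None:
        raise ValueError(f"not a SPAdes-style contig name: {name!r}")
    return int(match.group(1)), float(match.group(2))


def mean_coverage(contig_names: Iterable[str]) -> float:
    """Length-weighted mean coverage over all contigs (assumed 'genome average')."""
    total_length = 0
    weighted_sum = 0.0
    for name in contig_names:
        length, coverage = parse_contig(name)
        total_length += length
        weighted_sum += length * coverage
    if total_length == 0:
        raise ValueError("no contigs supplied")
    return weighted_sum / total_length


def high_coverage_hits(blast_lines: Iterable[str],
                       contig_names: Iterable[str],
                       fold: float = 5.0) -> List[str]:
    """Contigs with a blastn hit whose coverage exceeds fold x the average."""
    threshold = fold * mean_coverage(contig_names)
    kept = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        subject = line.rstrip("\n").split("\t")[1]  # column 2 = subject id
        _, coverage = parse_contig(subject)
        if coverage > threshold:
            kept.add(subject)
    return sorted(kept)
```

Filtering on bit score or alignment length would be applied separately; the sketch covers only the coverage criterion.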
Sequences of rRNA operons were aligned using MAFFT einsi (deposited as euglenozoa_einsi.fasta). The resulting alignment was further edited manually in Geneious v10.2.2 on the basis of annotated secondary structures. The secondary structure of E. gracilis rRNA has been predicted, while for several trypanosomatid ribosomes, cryo-electron microscopy structures have been obtained. The RNApdbee 2.0 web server was used to extract secondary structures from the available cryo-electron microscopy models. Secondary structures of E. gracilis, T. cruzi and L. major were annotated following previously published structures. The identified helices were marked on the alignment. Using this profile, it was possible to predict, describe and mark the secondary structure of all other species. In parallel, homologous helices were manually aligned to prepare a structure-based alignment, which was used to identify irregularities in the lengths of the analysed structures. The structures are deposited as euglenozoa_einsi_manual_for_lengths.gb, and the alignment as euglenozoa_einsi_manual_for_lengths.fasta.
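To illustrate how helices and their lengths can be read from a secondary structure, the sketch below groups base pairs of a dot-bracket string into helices. It is a simplified example under stated assumptions, not the annotation workflow used here: only round brackets are interpreted, any other character (including pseudoknot brackets such as [ ] or { }) is treated as unpaired, positions are 0-based, and a helix is defined as an uninterrupted run of stacked pairs. The function names are hypothetical.

```python
from typing import List, Tuple


def pair_table(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return base pairs (i, j), i < j, sorted by opening position."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for position, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.append((stack.pop(), position))
        # every other symbol is treated as unpaired (simplifying assumption)
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def helices(dot_bracket: str) -> List[Tuple[int, int, int]]:
    """Group stacked pairs (i, j), (i+1, j-1), ... into helices.

    Each helix is reported as (outermost 5' position, outermost 3' position,
    number of base pairs).
    """
    result: List[Tuple[int, int, int]] = []
    current: List[Tuple[int, int]] = []
    for i, j in pair_table(dot_bracket):
        stacked = bool(current) and i == current[-1][0] + 1 and j == current[-1][1] - 1
        if stacked:
            current.append((i, j))
        else:
            if current:
                result.append((current[0][0], current[0][1], len(current)))
            current = [(i, j)]
    if current:
        result.append((current[0][0], current[0][1], len(current)))
    return result
```

With this definition, a bulge or internal loop splits a stem into two helices, so helix lengths from different species are comparable only if the same convention is applied to all of them.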
An alignment produced by MAFFT einsi was also used for phylogenetic analyses. Regions with very high variability and no conserved secondary structure (e.g., ITSs) were manually removed. The alignment is deposited as euglenozoa_einsi_for_phylogeny.fasta.
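The removal of variable regions was done manually, guided by secondary structure. Purely as an illustration of how candidate regions might be flagged before such manual curation, the sketch below computes the Shannon entropy of each alignment column and reports the columns above a cutoff. The entropy measure, the cutoff of 1.0 bit and the function names are assumptions made for this example and were not part of the original analysis.

```python
import math
from collections import Counter
from typing import Iterable, List

GAP_SYMBOLS = "-."


def column_entropy(column: Iterable[str]) -> float:
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [symbol.upper() for symbol in column if symbol not in GAP_SYMBOLS]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def variable_columns(alignment: List[str], threshold: float = 1.0) -> List[int]:
    """0-based indices of columns whose entropy exceeds the threshold."""
    if len({len(sequence) for sequence in alignment}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        index
        for index, column in enumerate(zip(*alignment))
        if column_entropy(column) > threshold
    ]
```

Columns flagged this way would still need to be checked against the annotated helices, since a variable column can lie within a conserved structural element.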
(2021-11-13)