Manual annotation learning and training resources
We’ve put together resources to help you learn more about manual annotation, in particular via the Apollo software – whether you’re new to manual annotation, are looking to train undergraduates in manual annotation, or just need more information.
What is manual annotation, and how does Apollo help me do it?
Definitions of manual annotation vary. In the Apollo context, it means manually improving a gene prediction, covering both its structural components (e.g. the protein and transcript sequences) and its functional components (e.g. the name and other metadata). Apollo is a software package that greatly simplifies the manual annotation process via a user-friendly web interface. The i5k Workspace@NAL provides the Apollo software for all of the genome projects that we host.
How do I get an account to annotate in Apollo?
You can register for an i5k Workspace Apollo account here: https://i5k.nal.usda.gov/web-apollo-registration
Tutorials on manual annotation and Apollo
- A general overview of manual annotation, with a worked example (contact i5k@ars.usda.gov if you’d like a recording): https://i5k.nal.usda.gov/sites/default/files/presentations/Apollo_training_8-2019.pdf
- An overview of the various functions of Apollo (contact i5k@ars.usda.gov if you’d like a recording): https://i5k.nal.usda.gov/sites/default/files/presentations/Apollo_webinar_8-18-2021.pdf
- The i5k Workspace’s gene and protein nomenclature guidelines: https://i5k.nal.usda.gov/i5k-workspace-gene-and-protein-naming-guidelines
- Monica Munoz-Torres, an Apollo project alumna, provides an excellent overview and dives into the details: https://www.slideshare.net/MonicaMunozTorres/presentations
- The NESCent-funded Genome Train has a number of resources to get you started: http://genomecuration.github.io/genometrain/d-feature-curation-crossing/index.html
- Alexie Papanicolaou from the Hawkesbury Institute for the Environment developed a useful tutorial: https://stressedfruitfly.com/themes/stressedfruitfly/assets/videos/snmp1_webapollo1_960.mp4
- Rob Waterhouse’s Apollo training session with BIPAA, the BioInformatics Platform for Agroecosystem Arthropods: https://www.youtube.com/watch?v=BMeSwdKiO_E
Manual annotation with undergraduate students
- Manual annotation with Apollo can be an excellent way to engage undergraduates. We encourage and support undergraduate annotation at the i5k Workspace@NAL. Get in touch with us if you are planning undergraduate annotation with i5k Workspace@NAL resources.
- For smaller groups of undergraduates with in-depth capstone projects, we recommend the guidelines proposed in “A quick guide for student-driven community genome annotation” by Prashant Hosmani et al.: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006682
- For larger undergraduate courses, we recommend the Genomics Education Partnership. This organization provides a number of excellent materials for undergraduate training: http://gep.wustl.edu/curriculum/introducing_genes; http://gep.wustl.edu/curriculum/workshop_materials
- Genome Solver is focused on microbial genomics and provides some useful background: https://qubeshub.org/community/groups/genomesolver/
Additional documentation
- The official Apollo user guide: https://genomearchitect.readthedocs.io/en/latest/UsersGuide.html
- The i5k Workspace’s annotation guidelines: https://i5k.nal.usda.gov/content/rules-web-apollo-annotation-i5k-pilot-project
- The i5k Workspace’s gene and protein nomenclature guidelines: https://i5k.nal.usda.gov/i5k-workspace-gene-and-protein-naming-guidelines
- Mapped RNA-Seq reads are incredibly useful for manual annotation. Here’s how to map your RNA-Seq reads to an i5k Workspace genome: https://i5k.nal.usda.gov/mapping-rna-seq-reads-manual-curation-faq
- Functional annotations:
  - Webinar slides on how to incorporate functional annotations into your manual annotation workflow: https://i5k.nal.usda.gov/sites/default/files/presentations/func-annot-webinar-12-2022.pdf
  - The AgBase functional annotation pipeline that we use to generate functional annotations: https://agbase-docs.readthedocs.io/en/latest/agbase/workflow.html
- The pipeline we use to map RNA-Seq reads to genomes and load them to JBrowse: https://github.com/NAL-i5K/NAL_RNA_seq_annotation_pipeline
Other software
The i5k Workspace@NAL offers manual annotation resources via Apollo. Feel free to get in touch with us if you need Apollo set up for your genome! That said, our resources don’t always fit everyone’s needs. If you need to branch out, here are some resources that may be useful to you:
- G-OnRamp: A Galaxy-based alternative to Apollo. https://g-onramp.org/index.html
- GenSAS: A portal to predict genes on a genome assembly and curate them with Apollo. https://www.gensas.org/
- Just need your own genome browser?
  - CoGe lets you set up a private browser quickly: https://genomevolution.org/wiki/index.php/Tutorials#How_to_load_a_private_genome_with_annotations_and_experimental_data.2C_and_view_in_JBrowse_in_under_three_minutes