Ephemera danica annotations ephdan_OGSv1.0
Resource Type | Genome Annotation |
---|---|
Name | Ephemera danica annotations ephdan_OGSv1.0 |
Program, Pipeline, Workflow or Method Name | MAKER2, manual annotations, GFF3toolkit, remap-gff3 |
Program Version | NA |
Data Source | |
Organism | |
Publication | Thomas GWC, Dohmen E, Hughes DST, Murali SC, Poelchau M, Glastad K, Anstead CA, Ayoub NA, Batterham P, Bellair M, Binford GJ, Chao H, Chen YH, Childers C, Dinh H, Doddapaneni HV, Duan JJ, Dugan S, Esposito LA, Friedrich M, Garb J, Gasser RB, Goodisman MAD, Gundersen-Rindal DE, Han Y, Handler AM, Hatakeyama M, Hering L, Hunter WB, Ioannidis P, Jayaseelan JC, Kalra D, Khila A, Korhonen PK, Lee CE, Lee SL, Li Y, Lindsey ARI, Mayer G, McGregor AP, McKenna DD, Misof B, Munidasa M, Munoz-Torres M, Muzny DM, Niehuis O, Osuji-Lacy N, Palli SR, Panfilio KA, Pechmann M, Perry T, Peters RS, Poynton HC, Prpic NM, Qu J, Rotenberg D, Schal C, Schoville SD, Scully ED, Skinner E, Sloan DB, Stouthamer R, Strand MR, Szucsich NU, Wijeratne A, Young ND, Zattara EE, Benoit JB, Zdobnov EM, Pfrender ME, Hackett KJ, Werren JH, Worley KC, Gibbs RA, Chipman AD, Waterhouse RM, Bornberg-Bauer E, Hahn MW, Richards S. Gene content evolution in the arthropods. Genome Biology. 2020 01 23; 21(1):15. |
Description | This dataset presents the Ephemera danica Official Gene Set (OGS) v1.0. The OGS integrates automatic gene predictions from Ephemera danica genome annotations v0.5.3 (https://doi.org/10.15482/USDA.ADC/1503792) with manual annotations by the research community (https://data.nal.usda.gov/dataset/ephemera-danica-manual-annotations-genome-assembly-edan10), which were made with the Apollo manual curation software (http://genomearchitect.org/). Manual and automated annotations were lifted over from Ephemera danica genome assembly v1.0 (https://doi.org/10.15482/USDA.ADC/1503791) to genome assembly Edan_2.0 (https://www.ncbi.nlm.nih.gov/assembly/GCA_000507165.2) using the coordinates_conversion and remap-gff3 programs (https://github.com/NAL-i5K/coordinates_conversion/; https://github.com/NAL-i5K/remap-gff3). During this process, 1,035 annotations were removed from the original datasets, either because of changes in the new genome assembly or because of problems with the original gene models. Protein pages for the manual annotations can be accessed at NCBI: https://www.ncbi.nlm.nih.gov/protein?LinkName=nuccore_protein_wgs&from_u... The full dataset is accessible at the Ag Data Commons: https://doi.org/10.15482/USDA.ADC/1518589 |
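
The lift-over step in the description can be illustrated with a minimal Python sketch. This is not the remap-gff3 or coordinates_conversion implementation; it only shows the general idea under simplifying assumptions: the two assemblies are related by ungapped, same-orientation aligned blocks, and a feature is kept only if it lies entirely within one block. The `Block` type, the `lift_feature` function, and the sequence names `scaffold1` and `chrA` are hypothetical and do not come from the dataset.

```python
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    """Hypothetical ungapped alignment block between two assemblies.

    Coordinates are 1-based and inclusive, as in GFF3. Both assemblies
    are assumed to have the same orientation within a block.
    """
    old_seq: str
    old_start: int
    old_end: int
    new_seq: str
    new_start: int


def lift_feature(gff3_line: str, blocks: List[Block]) -> Optional[str]:
    """Return the GFF3 line with lifted coordinates, or None if dropped."""
    cols = gff3_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return None  # not a 9-column GFF3 feature line
    seqid, start, end = cols[0], int(cols[3]), int(cols[4])
    for block in blocks:
        contained = (
            block.old_seq == seqid
            and block.old_start <= start
            and end <= block.old_end
        )
        if contained:
            offset = block.new_start - block.old_start
            cols[0] = block.new_seq
            cols[3] = str(start + offset)
            cols[4] = str(end + offset)
            return "\t".join(cols)
    # No single block covers the feature, so it cannot be lifted over.
    return None


# Illustrative data only: one block mapping scaffold1:1-1000 to chrA:501-1500.
blocks = [Block("scaffold1", 1, 1000, "chrA", 501)]
kept = lift_feature("scaffold1\tMAKER\tgene\t100\t200\t.\t+\t.\tID=g1", blocks)
dropped = lift_feature("scaffold1\tMAKER\tgene\t950\t1100\t.\t+\t.\tID=g2", blocks)
print(kept)
print(dropped)
```

In this sketch, the second feature extends past the end of the aligned block and is dropped, which is one way an annotation can be lost when the underlying assembly changes.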