The file reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten contains the reprocessed reconsile_OGSs_to_master_OGS_primary.ID.gff3.

First, the CURATIONS have been re-processed from the original WebApollo to remove redundant and wrong annotations.
Second, NCBI 101 IDs have been kept if no JAMg model supports them, and discarded if a curation is in that region.
Third, JAMg IDs have been kept if neither of the other two has gene models in that region.
(Not the most scientific determination, but it's what we did for the paper to avoid nasty reviewers.)

IDs are unique. Names are not guaranteed to be unique. To count how often each Name occurs:

    grep ' gene' reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten | grep -oP 'Name=[^;]+' | sort | uniq -c | sort -rnk1,1 | less
    grep ' mRNA' reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten | grep -oP 'Name=[^;]+' | sort | uniq -c | sort -rnk1,1 | less

NCBI names were used as IDs. In a few (fewer than ten) instances the Names were not unique, so a .digit was appended.

The file reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten.gff3 is a re-written version that excludes all attributes from the GFF3, and the data is URL-encoded.

.gene contains the whole gene including introns (case-encoded).
.pep, .mRNA and .cds should be self-explanatory.
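
A variant of the grep pipelines above that lists only the Names occurring more than once can be sketched as below. This is an illustration, not part of the original processing: it assumes a standard tab-separated GFF3 with the feature type in column 3 and the attributes in column 9, and the function name count_names is made up for this example.

```shell
# count_names FEATURE FILE
# Hypothetical helper: prints "count<TAB>Name" for every Name attribute that
# occurs more than once among rows whose type column equals FEATURE.
count_names() {
    feature=$1
    gff=$2
    awk -F'\t' -v feat="$feature" '
        # Skip comment lines; match the type column (3) exactly.
        $0 !~ /^#/ && $3 == feat {
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++)
                if (attrs[i] ~ /^Name=/)
                    count[substr(attrs[i], 6)]++   # drop the "Name=" prefix
        }
        END {
            for (name in count)
                if (count[name] > 1)
                    printf "%d\t%s\n", count[name], name
        }
    ' "$gff" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Usage (assumes the file is in the current directory):
#   count_names gene reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten
#   count_names mRNA reconsile_OGSs_to_master_OGS_primary.ID.gff3.rewritten
```

Matching on the type column exactly avoids hits on other columns or on feature types that merely contain the word, and empty output means every Name of that feature type is unique.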