Accepted data types: Genome assemblies

  1. Accepted file formats

    1. FASTA (https://en.wikipedia.org/wiki/FASTA_format)

    2. AGP (https://www.ncbi.nlm.nih.gov/assembly/agp/AGP_Specification/)

  2. File provenance

    1. We accept only assemblies that are archived in an INSDC repository (ENA, GenBank or DDBJ) or in RefSeq. We have found that users benefit from the additional contamination screen that INSDC repositories provide. Using official sequence and assembly identifiers also prevents confusion about sequence content and versioning. See http://ensemblgenomes.org/info/about/legal/browser_agreement for details.

  3. Acknowledgement of data source

    1. We list the genome assembly data source on each i5k Project page (e.g. https://i5k.nal.usda.gov/Cimex_lectularius).

    2. We require contact information (name, valid email address and affiliation) for the genome submitter (or other primary contact). This information is not currently listed on our pages, but we reserve the right to share it with those who have questions about the assembly.
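
Submitters may want to sanity-check a FASTA file before sending it. The Python sketch below is illustrative only and does not describe the checks applied to submissions; the function name `read_fasta_lengths` and the specific rules it enforces (non-empty, unique identifiers and at least one base per record) are assumptions made for this example.

```python
def read_fasta_lengths(lines):
    """Return {sequence_id: length} for FASTA-formatted lines.

    Raises ValueError on malformed input. Hypothetical helper for
    illustration; the rules checked here are assumptions.
    """
    lengths = {}
    current = None
    for n, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError(f"line {n}: header has no identifier")
            current = fields[0]  # identifier is the first word of the header
            if current in lengths:
                raise ValueError(f"line {n}: duplicate identifier {current!r}")
            lengths[current] = 0
        elif current is None:
            raise ValueError(f"line {n}: sequence data before first header")
        else:
            lengths[current] += len(line)
    empty = [name for name, length in lengths.items() if length == 0]
    if empty:
        raise ValueError(f"records with no sequence: {empty}")
    return lengths


example = [">scaffold1 example", "ACGT", "NNAC", "", ">scaffold2", "GG"]
print(read_fasta_lengths(example))  # {'scaffold1': 8, 'scaffold2': 2}
```

Because an open file is an iterable of lines, the same function could be called as `read_fasta_lengths(open("assembly.fa"))`, where `assembly.fa` is a placeholder file name.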