A major challenge of the sunflower genome project has been dealing with the large and repetitive nature of the
genome. Below is a description of our custom annotation procedures and the results. You may skip to the results or download sections using the menu to the left. Note the references for each data source at the bottom of the page.
Annotation files
Separate transposon and gene annotation files are listed below. Combined annotation files that contain
all features are also provided (under the section "Combined annotations").
Genes
Release version | Filename | Description | Format | Download |
---|
v1.0 | Ha412v1r1_CDS_v1.0.fasta.gz | Nucleotide CDS for each gene | FASTA | Download |
v1.0 | Ha412v1r1_CDS_v1.0.fasta.gz.md5 | MD5 checksum of nucleotide CDS for each gene | MD5 checksum | Download |
v1.0 | Ha412v1r1_CDS_iprscan_v1.0.tsv.gz | Functional annotation table | TSV | Download |
v1.0 | Ha412v1r1_CDS_iprscan_v1.0.tsv.gz.md5 | MD5 checksum of functional annotation table | MD5 checksum | Download |
v1.0 | Ha412v1r1_prot_v1.0.faa.gz | Translated CDS, or protein sequence | FASTA | Download |
v1.0 | Ha412v1r1_prot_v1.0.faa.gz.md5 | MD5 checksum of translated CDS | MD5 checksum | Download |
v1.0 | Ha412v1r1_genes_v1.0.gff3.gz | Annotated gene features in GFF3 format | GFF3 | Download |
v1.0 | Ha412v1r1_genes_v1.0.gff3.gz.md5 | MD5 checksum of annotated gene features in GFF3 | MD5 checksum | Download |
v1.0 | Ha412v1r1_genes_v1.0.gtf.gz | Annotated gene features in GTF format | GTF | Download |
v1.0 | Ha412v1r1_genes_v1.0.gtf.gz.md5 | MD5 checksum of annotated gene features in GTF | MD5 checksum | Download |
v1.0 | Ha412v1r1_genes_v1.0.fasta.gz | Full-length nucleotide sequence for each gene | FASTA | Download |
v1.0 | Ha412v1r1_genes_v1.0.fasta.gz.md5 | MD5 checksum of full-length nucleotide sequence for each gene | MD5 checksum | Download |
v1.0 | Ha412v1r1_genes_v1.0_HanXRQr1.0-20151230_genes_id-mapping.tsv.gz | Mapping file to convert HA412 gene IDs to XRQ gene IDs | TSV | Download |
v1.0 | Ha412v1r1_genes_v1.0_HanXRQr1.0-20151230_genes_id-mapping.tsv.gz.md5 | MD5 checksum of mapping file to convert HA412 gene IDs to XRQ gene IDs | MD5 checksum | Download |
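Each data file above is paired with an MD5 checksum file, which can be used to confirm that a download completed without corruption. The following is a minimal sketch of that check in Python. It assumes the `.md5` file follows the common `md5sum` layout (hex digest first, optionally followed by the filename); the actual layout of the files on this page is not stated here, so check one before relying on this. The function names are illustrative only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_expected_md5(md5_path):
    """Read the expected digest from a checksum file.

    Assumption: the digest is the first whitespace-separated token,
    as written by the md5sum utility.
    """
    text = Path(md5_path).read_text().strip()
    return text.split()[0].lower()


def verify_download(data_path, md5_path):
    """Return True if the file's MD5 digest matches the checksum file."""
    return md5_of_file(data_path) == read_expected_md5(md5_path)
```

For example, after downloading both files into the working directory, `verify_download("Ha412v1r1_CDS_v1.0.fasta.gz", "Ha412v1r1_CDS_v1.0.fasta.gz.md5")` should return `True` for an intact download.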
Transposons
The transposon annotations below were generated with
Tephra, a software package developed for this project. If you use these annotations in your work, please cite the software as described at the link provided.
Release version | Filename | Description | Format | Download |
---|
v1.0 | Ha412v1r1_transposons_v1.0.gff3.gz | Annotated transposons in GFF3 format | GFF3 | Download |
v1.0 | Ha412v1r1_transposons_v1.0.gff3.gz.md5 | MD5 checksum of annotated transposons in GFF3 format | MD5 checksum | Download |
v1.0 | Ha412v1r1_transposons_v1.0.fasta.gz | Full-length nucleotide sequence for each transposon | FASTA | Download |
v1.0 | Ha412v1r1_transposons_v1.0.fasta.gz.md5 | MD5 checksum of full-length nucleotide sequence for each transposon | MD5 checksum | Download |
Combined annotations
Release version | Filename | Description | Format | Download |
---|
v1.0 | Ha412v1r1_genes_transposons_v1.0.gff3.gz | Annotated gene and transposon features in GFF3 format | GFF3 | Download |
v2.0 | Ha412HO_V2.onlychr.fasta.mod.EDTA.TEanno.gff.gz | Annotated transposons in GFF3 format V2 | GFF3 | Download |
v1.0 | Ha412v1r1_genes_transposons_v1.0.fasta.gz | Full-length gene and transposon sequences | FASTA | Download |
v1.0 | Ha412v1r1_genes_transposons_v1.0.fasta.gz.md5 | MD5 checksum of full-length gene and transposon sequences | MD5 checksum | Download |
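As a quick sanity check on a downloaded GFF3 file, it can help to tally how many features of each type it contains. The sketch below relies only on the standard GFF3 layout (nine tab-separated columns, with the feature type in the third column, `#` comment lines, and an optional `##FASTA` section at the end). The function name is illustrative, and the feature type names actually present in these files are not assumed here; see FILE_SPECIFICATIONS.txt for the authoritative description.

```python
import gzip
from collections import Counter


def count_feature_types(gff3_gz_path):
    """Count features by type (column 3) in a gzip-compressed GFF3 file."""
    counts = Counter()
    with gzip.open(gff3_gz_path, "rt") as handle:
        for line in handle:
            # An embedded sequence section, if present, follows this directive.
            if line.startswith("##FASTA"):
                break
            # Skip directives, comments, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # Ignore lines that do not have the nine GFF3 columns.
            if len(fields) != 9:
                continue
            counts[fields[2]] += 1
    return counts
```

For example, `count_feature_types("Ha412v1r1_genes_transposons_v1.0.gff3.gz").most_common()` lists the feature types in the combined annotation file from most to least frequent.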
Eugene Curated Annotations (HA412HO Version 2)
Release version | Filename | Description | Format | Download |
---|
v2.0 | HAN412_Eugene_curated_v1_1.gff3.gz | Curated gene predictions for assembly HAN412 provided by Eugene (INRA) | GFF3 | Download |
v2.0 | PSC8_Eugene_curated_v1_1.gff3.gz | Curated gene predictions for assembly PSC8 provided by Eugene (INRA) | GFF3 | Download |
v2.0 | XRQv2_Eugene_curated_v1_1.gff3.gz | Curated gene predictions for assembly XRQv2 provided by Eugene (INRA) | GFF3 | Download |
File Specifications
Release version | Filename | Description | Format | Download |
---|
v1.0 | FILE_SPECIFICATIONS.txt | Detailed description of all annotation file contents and formats | Plain Text | Download |
v1.1 | Version_1_1_Description.txt | Detailed curation history of the Eugene annotations (version 1.1) | Plain Text | Download |
References
- Badouin H, Gouzy J, Grassa CJ, Murat F, Staton SE, et al. 2017. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature 546(7656): 148-152.