Welcome to the Sunflower Genome Database

Food production must increase by 70 to 100% by 2050 to keep pace with predicted population growth and changes in diet. This task is made harder by ongoing changes in climate, heightened competition for land and water, and continued deceleration of yield increases from conventional breeding. To meet this challenge, crops must be developed that combine high yield with resistance to biotic and abiotic stress, and that require lower inputs. Sunflower (Helianthus annuus L.) can partly fulfill this need because of its diverse extremophile and cross-compatible wild relatives, whose alleles can be exploited for breeding. It is also a major crop in its own right, ranking 12th in terms of area harvested (http://faostat3.fao.org/home/E), with a gross value of $20 billion per year. Worldwide, it is an important food security crop in developing countries, with global use increasing rapidly (Khoury et al. 2014; PNAS 111:4001-4006). In addition to its value as a source of healthy vegetable oil, sunflower has become increasingly important as a model for ecological and evolutionary studies.

Sunflower belongs to the daisy family Compositae, which is one of the largest and most ecologically diverse families of flowering plants. However, genomic characterization of sunflower and other Compositae species has been slow, in part because Compositae crops have very large genomes. A reference genome for sunflower is not yet published, and the organization and structure of the sunflower genome remain poorly understood. This has impeded research in sunflower and other Compositae species, and hinders the facile application of molecular approaches to sunflower breeding and improvement, as well as to evolutionary studies. Here we provide a reference genome for sunflower, along with a number of other genomic resources, including high-density genetic and physical maps and transcriptome and sequence data for a diverse array of wild and cultivated genotypes.

The sunflower genome is fairly large and complex. It contains between 3.5 and 3.6 billion bases, making it slightly larger than the human genome. The majority of the sunflower genome is composed of repetitive sequences, mainly transposable elements. Therefore, we have employed a hybrid approach for assembly of the sunflower genome, which combines whole-genome shotgun (WGS) sequencing using the Illumina and 454 platforms with the generation of high-density genetic and physical maps that serve as scaffolds for the linear assembly of WGS sequences. The performance of this approach is enhanced by the nature of our physical maps, which employ the whole genome profiling method of Keygene, Inc., thereby providing unique sequence-based markers every 2-6 kb across the genome. More recently, our partners at INRA Toulouse have provided Illumina-based BAC end sequences, and are currently generating very long reads from the PacBio platform, both of which are expected to improve the sunflower reference sequence.
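
To give a sense of scale, the figures above (a genome of 3.5 to 3.6 billion bases and one sequence-based marker every 2-6 kb) imply a rough range for the number of markers on the physical map. The short Python sketch below is only a back-of-envelope illustration: it assumes uniform marker spacing, and the function name `marker_count_range` is a hypothetical helper, not part of any project tool.

```python
# Back-of-envelope estimate only; assumes markers are spaced uniformly,
# which is a simplification and not a claim about the actual maps.
GENOME_SIZE_BP = (3_500_000_000, 3_600_000_000)  # 3.5-3.6 billion bases (from the text)
MARKER_SPACING_BP = (2_000, 6_000)               # one marker every 2-6 kb (from the text)


def marker_count_range(genome_bp, spacing_bp):
    """Return (low, high) estimates of the number of markers.

    low  = smallest genome size / widest spacing
    high = largest genome size / narrowest spacing
    """
    low = min(genome_bp) // max(spacing_bp)
    high = max(genome_bp) // min(spacing_bp)
    return low, high


low, high = marker_count_range(GENOME_SIZE_BP, MARKER_SPACING_BP)
print(f"Roughly {low:,} to {high:,} markers")
```

Under these assumptions the estimate falls between about 0.6 and 1.8 million markers across the genome.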

This site provides access to data and tools generated by the sunflower genome project. The data hosted on this site were generated through a collaboration of numerous institutions. The main participants are listed below:

University of British Columbia
University of Georgia
INRA

The resources generated by this project will enable the improvement of sunflower for a variety of uses.

The seeds would be harvested for food and oil, while the stalks would be utilized for wood or converted to ethanol. As a dual-use crop it wouldn't be in competition with food crops for land.¹
Dr. Loren Rieseberg
Project leader, University of British Columbia

1. Excerpt from a media release in Science Daily.