Tiange Lang, Pierre Abadie, Valérie Léger, Thibaut Decourcelle, Jean-Marc Frigerio, Christian Burban, Catherine Bodénès, Erwan Guichoux, Grégoire Le Provost, Cécile Robin, Naoki Tani, Patrick Léger, Camille Lepoittevin, Veronica A. El Mujtar, François Hubert, Josquin Tibbits, Jorge Paiva, Alain Franc, Frédéric Raspail, Stéphanie Mariette, Marie-Pierre Reviron, Christophe Plomion, Antoine Kremer, Marie-Laure Desprez-Loustau, Pauline Garnier-Géré
<p>In the post-genomics era, non-model species like most Fagaceae still lack operational diversity resources for population genomics studies. Sequence data were produced from over 800 gene fragments covering ~530 kb across the genic partition of European oaks, in a discovery panel of 25 individuals from western and central Europe (11 *Quercus petraea*, 13 *Q. robur*, one *Q. ilex* as an outgroup). The targeted regions represented broad functional categories potentially involved in the species' ecological preferences, together with a random set of genes. Using a high-quality dedicated pipeline, we provide a detailed characterization of these genic regions, which included over 14500 polymorphisms, with ~12500 SNPs (218 of them triallelic), over 1500 insertion-deletions, and ~200 novel di- and tri-nucleotide SSR loci. This catalog also provides various summary statistics within and among species, gene ontology information, and standard formats to assist the choice of loci for genotyping projects. The distributions of nucleotide diversity (θπ) and differentiation (FST) across genic regions are also described for the first time in these species, with a mean θπ of ~0.0049 in *Q. petraea* and ~0.0045 in *Q. robur* across random regions, and a mean FST of ~0.13 across SNPs. The magnitude of diversity across genes is within the range estimated for long-term perennial outcrossers, and can be considered relatively high in the plant kingdom, with an estimated 41 to 51 million SNPs expected across the genome in both species. Individuals with typical species morphology were more easily assigned to their corresponding genetic cluster for *Q. robur* than for *Q. petraea*, revealing higher or more recent introgression in *Q. petraea* and a stronger species integration in *Q. robur* in this particular discovery panel. We also observed robust patterns of slightly but significantly higher diversity in *Q. petraea*, across a random gene set and in the abiotic stress functional category, and a heterogeneous landscape of both diversity and differentiation. To explain these patterns, we discuss an alternative and non-exclusive hypothesis of stronger selective constraints in *Q. robur*, the more pioneering species in oak forest stand dynamics, in addition to the recognized and documented introgression history in both species despite their strong reproductive barriers. The quality of the data provided here and their representativeness in terms of species genomic diversity make them useful for possible applications in medium-scale landscape and molecular ecology projects. Moreover, they can serve as reference resources for validation purposes in larger-scale resequencing projects. This type of project is recommended in oaks in preference to SNP array development, given the large nucleotide variation and the low levels of linkage disequilibrium revealed.</p>
SNPs, functional candidate genes, Quercus robur, Q. petraea, Sanger amplicon resequencing, introgression, species differentiation