The program Structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.
fastSTRUCTURE for large SNP datasets is out now! Links to the preprint and software (beta release) by Anil Raj, Matthew Stephens and Jonathan Pritchard.
What to cite: The basic algorithm was described by Pritchard, Stephens and Donnelly (2000). Extensions to the method were published by Falush, Stephens and Pritchard (2003, 2007) and by Hubisz, Falush, Stephens and Pritchard (2009).
Contributors: Daniel Falush, Melissa Hubisz, Matthew Stephens, Jonathan Pritchard, Peter Donnelly, William Wen, Mike Trienis, Pall Melsted.
Questions and Discussion: There is a Structure discussion forum to which you can direct questions. Many thanks to Vikram Chhatre, who moderates this discussion group.
Plotting programs and other resources: CLUMPP and distruct from Noah Rosenberg's lab can automatically sort the cluster labels and produce nice graphical displays of structure results. Other plots are produced directly by the software package itself. Structure Harvester by Dent Earl provides additional tools for visualizing Structure output. Xavier Didelot's program xmfa2struct converts files in eXtended Multi-Fasta (XMFA) format into Structure input format.
Genome-wide SNP data: TreeMix by Joe Pickrell and Jonathan Pritchard uses large numbers of SNPs to estimate the historical relationships among populations, using a graph representation that allows both population splits and migration events. [Note: Joe's latest release now allows microsatellite data too.] fastSTRUCTURE by Anil Raj, Matthew Stephens and Jonathan Pritchard runs Structure-style analyses on very large SNP datasets (preprint). fineSTRUCTURE by Daniel Lawson and colleagues enables analyses of very fine-scale structure in genome-wide SNP data.
Sample data sets: available here.
Taita thrush: An example of MCMC convergence based on the original paper is shown here.
Some miscellaneous applications: Structure has been widely used to interpret the population structure of humans and other organisms. A selection of interesting references (mainly applications) is shown below.
Traces of human migrations in Helicobacter pylori populations. D. Falush, T. Wirth, B. Linz, J.K. Pritchard, M. Stephens and 13 others, 2003. Science, 299: 1582-1585. [PDF]
The genetic structure of human populations. N.A. Rosenberg, J.K. Pritchard, J.L. Weber, H.M. Cann, K.K. Kidd, L.A. Zhivotovsky and M.W. Feldman, 2002. Science, 298: 2381-2385. (and technical comment, 2003) [PDF]
Dwarf8 polymorphisms associate with variation in flowering time. Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES. Nat Genet. 2001 28:286-9. [PubMed Abstract]
Origin of extant domesticated sunflowers in eastern North America. Harter AV, Gardner KA, Falush D, Lentz DL, Bye RA, Rieseberg LH. Nature. 2004 430:201-5. [PubMed Abstract]
Emerging vectors in the Culex pipiens complex. Fonseca DM, Keyghobadi N, Malcolm CA, Mehmet C, Schaffner F, Mogi M, Fleischer RC, Wilkerson RC. Science. 2004 303:1535-8. [PubMed Abstract]
Empirical evaluation of genetic clustering methods using multilocus genotypes from 20 chicken breeds. Rosenberg NA et al. Genetics. 2001 159:699-713. [PubMed Abstract]