Population Structure (PCA & ADMIXTURE)
Decompose your germplasm into ancestral subpopulations before GWAS or selection.
How it works
Principal Component Analysis (PCA) on the genotype matrix gives a fast, model-free view of population stratification. The top PCs are essential covariates in GWAS: without them, allele-frequency differences between subpopulations inflate false positives. We complement PCA with an ADMIXTURE-style ancestry analysis that estimates each individual's fractional ancestry from K subpopulations and uses cross-validation error to choose K, taking the value with the lowest error.
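The PCA step can be sketched in a few lines of NumPy. This is a minimal illustration, not the module's actual implementation: the function name `genotype_pca`, the 0/1/2 allele-count coding, the absence of missing genotypes, and the toy matrix are all assumptions made for the example.

```python
import numpy as np

def genotype_pca(G, n_components=4):
    """PCA of an (individuals x markers) genotype matrix coded 0/1/2.

    Hypothetical helper: assumes no missing genotypes.
    Returns (scores, explained) where explained holds the fraction of
    total variance captured by each returned component.
    """
    G = np.asarray(G, dtype=float)
    sd = G.std(axis=0)
    keep = sd > 0  # monomorphic markers have zero variance and cannot be scaled
    Z = (G[:, keep] - G[:, keep].mean(axis=0)) / sd[keep]
    # The SVD of Z yields the same components as an eigendecomposition of Z Z^T
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    eigvals = S ** 2
    explained = eigvals / eigvals.sum()
    k = min(n_components, len(S))
    scores = U[:, :k] * S[:k]  # coordinates of each individual on the top PCs
    return scores, explained[:k]

# Toy panel (made-up values): individuals 0-2 and 3-5 come from two gene pools
G = np.array([
    [0, 0, 2, 2, 1],
    [0, 1, 2, 2, 0],
    [0, 0, 2, 1, 1],
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 2],
    [2, 2, 0, 1, 1],
])
scores, explained = genotype_pca(G, n_components=2)
```

In this toy panel, PC1 places the two groups on opposite sides of zero, which is the pattern the scatter plots are meant to reveal. The columns of `scores` are what would be passed to a GWAS model as covariates.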
Formula
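PCA: eigendecomposition of the covariance matrix computed from the centered, scaled genotype matrix (equivalently, a singular value decomposition of that matrix). ADMIXTURE: maximum-likelihood estimation of the Q (per-individual ancestry proportions) and P (per-population allele frequencies) matrices under a model of K ancestral populations.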
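Written out, the two methods take roughly the following form. The notation is an assumption for illustration and is not taken from the module: G is an n × m matrix whose entry g_ij counts copies (0, 1, or 2) of a reference allele for individual i at marker j, Z is its centered and scaled version, and the ADMIXTURE part is the standard binomial admixture log-likelihood, which the module's "ADMIXTURE-style" analysis may or may not match exactly.

```latex
% PCA: standardize each marker, then decompose the individual-by-individual covariance
z_{ij} = \frac{g_{ij} - \bar{g}_j}{s_j},
\qquad
\frac{1}{m} Z Z^{\top} = U \Lambda U^{\top}

% ADMIXTURE: maximize the log-likelihood over Q (n x K) and P (K x m)
\ell(Q, P) = \sum_{i=1}^{n} \sum_{j=1}^{m}
  \left[
    g_{ij} \ln\!\Big( \sum_{k=1}^{K} q_{ik}\, p_{kj} \Big)
    + (2 - g_{ij}) \ln\!\Big( 1 - \sum_{k=1}^{K} q_{ik}\, p_{kj} \Big)
  \right]

% Constraints
q_{ik} \ge 0, \qquad \sum_{k=1}^{K} q_{ik} = 1, \qquad 0 \le p_{kj} \le 1
```

Here the columns of U are the principal components, the diagonal of Λ gives the variance each one explains, and each row of Q is the set of ancestry fractions for one individual.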
What you get
- PC1–PC4 scatter plots with percent variance explained
- Ancestry-proportion stacked bars for K = 2 to 8
- Cross-validation error curve to select K
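Reading K off the cross-validation curve amounts to taking the K with the lowest error. The sketch below assumes the errors are available as a mapping from K to CV error. The helper `select_k` is hypothetical and the error values are made up for illustration, not output from the module.

```python
def select_k(cv_errors):
    """Return the K with the lowest cross-validation error.

    cv_errors: dict mapping K (int) to CV error (float).
    Ties are broken in favor of the smaller, more parsimonious K.
    """
    if not cv_errors:
        raise ValueError("cv_errors is empty")
    # Iterating K in ascending order makes min() return the smallest K on ties
    return min(sorted(cv_errors), key=cv_errors.get)

# Illustrative (made-up) CV errors for K = 2 to 8
cv = {2: 0.62, 3: 0.55, 4: 0.51, 5: 0.52, 6: 0.54, 7: 0.57, 8: 0.60}
best_k = select_k(cv)
```

When the curve is nearly flat around its minimum, as between K = 4 and K = 5 here, it is worth comparing the stacked-bar plots for the neighboring values before settling on one.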
When to use it
- Before any GWAS run on a diverse panel
- When sampling parents for a breeding program from multiple gene pools
- To verify dataset composition before genomic selection
Run Population Structure on your data
Open the module and upload a CSV.