FST outlier tests in genome scans for local adaptation: when do they tell the truth and what are we missing?

K.E. Lotterhos and M.C. Whitlock

Next-generation sequencing technology has made it possible to obtain large amounts of genomic data. But how do we find locally adapted genes in this mountain of data? One approach is to look for outliers in the distribution of FST, as implemented in FST outlier tests such as FDIST (Beaumont and Nichols 1996) and BayeScan (Foll and Gaggiotti 2008). Using a large-scale landscape genetics simulator, we compared these two programs under three demographic histories that are common in tree populations: isolation by distance, expansion from one refugium, and expansion from two refugia with secondary contact. The latter two are non-equilibrium scenarios. We found a large number of false-positive FST outliers under these demographic histories, especially under the refugia scenarios. We show that BayeScan with its default parameters produces more false positives than FDIST, but that the number of false positives can be reduced by adjusting the prior-odds parameter without affecting power. We also propose that the FDIST method can be improved by simulating the actual number of populations in the data, rather than the number of samples collected.
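
To make the outlier idea concrete, the following is a minimal sketch of a naive FST outlier scan. It is not the FDIST or BayeScan algorithm: it assumes biallelic loci, uses Wright's FST computed from per-population allele frequencies, and flags loci above an empirical quantile. The function names, the example frequencies, and the quantile cutoff are all hypothetical choices made for illustration.

```python
from statistics import mean, pvariance


def locus_fst(freqs):
    """Wright's FST for one biallelic locus: Var(p) / (p_bar * (1 - p_bar)).

    `freqs` holds the frequency of one allele in each population.
    """
    p_bar = mean(freqs)
    if p_bar in (0.0, 1.0):
        return 0.0  # monomorphic locus: no variation to partition
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))


def flag_outliers(fst_values, quantile=0.95):
    """Return indices of loci whose FST exceeds an empirical quantile cutoff.

    Assumption: a plain empirical cutoff stands in for the simulated or
    model-based null distributions that real outlier tests use.
    """
    ranked = sorted(fst_values)
    cutoff = ranked[int(quantile * (len(ranked) - 1))]
    return [i for i, f in enumerate(fst_values) if f > cutoff]


# Hypothetical allele frequencies: five loci (rows) in four populations.
allele_freqs = [
    [0.50, 0.50, 0.50, 0.50],
    [0.40, 0.50, 0.60, 0.50],
    [0.10, 0.90, 0.10, 0.90],  # strongly differentiated locus
    [0.45, 0.55, 0.50, 0.50],
    [0.30, 0.30, 0.40, 0.40],
]
fst_per_locus = [locus_fst(row) for row in allele_freqs]
outlier_loci = flag_outliers(fst_per_locus, quantile=0.75)
```

A cutoff of this kind flags the upper tail of FST regardless of what produced it, which is why demographic histories that widen the neutral FST distribution can generate false positives.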