SC Colloquium: "Fun with selecting population genetics models, or Goldilocks’ principle"
Department of Scientific Computing
Florida State University
Over the last few years I have tried to convince users of my software MIGRATE not to simply accept the results obtained with the program-supplied default population model. Users now routinely test whether populations are panmictic or truly structured, or evaluate specific immigration hypotheses conditioned on the data. An additional dimension of model complexity can be added by considering not only the population structure but also the structure of the sequence data, for example by comparing a contiguous stretch of DNA with many small chunks. Setting up such model comparisons on a computer cluster allows 'quick' calculations of many potential partitionings of the data jointly with the population structure. Results of such experiments show that parameter estimates from extreme models differ considerably from parameter estimates from models that are 'optimal' for the data. I show examples from simulated data and two real datasets (a small human LPL dataset; 10000 bp from chromosome 2L of D. melanogaster populations in Zambia and Uganda).