Similar selection pressures on fluid g and educational attainment-related SNPs

Author: Davide Piffer. Email:

A recent GWAS examined the additive genetic variance underlying variation in general cognitive function, or fluid g (Davies et al., 2015). Cognitive function was assessed using a battery of information-processing tests including memory, block design, matrix reasoning, reaction time, and letter-number sequencing (Davies et al., 2015).

Since my use of an educational attainment GWAS has been criticized for being affected by environmental variables and for not being strictly an intelligence measure, I decided to see whether I could replicate this result on an independent sample using different measures, hopefully tapping into a more “culture-free” construct such as fluid g. The typical objection to educational attainment is that it could be influenced by environmental variables correlated with genetic variation (see for example the comments by this reviewer:

Davies et al. (2015) identified 13 SNPs with genome-wide significance (p < 5×10^-8). Of the 13 hits (i.e. the alleles with a positive effect on the phenotype), 10 were derived and 3 were ancestral alleles. Table 1 reports, for the 26 populations in 1000 Genomes, the average frequency of these 13 SNPs and the average frequency of the top 10 SNPs with an effect on years of education from Rietveld et al. (2013). The correlation between the two polygenic scores (i.e. the average population frequencies of the GWAS hits) is very high: r = 0.964. Their correlations with population IQ are also substantial: r = 0.817 and 0.715 for Davies et al. (2015) and Rietveld et al. (2013), respectively.

Table 1. Average frequency of intelligence (fluid g) and education (years of education)-increasing alleles from two independent GWAS.

Population  Davies et al., 2015 (top 13 SNPs)  Rietveld et al., 2013 (top 10 SNPs)  IQ
Afr.Car.Barbados 0.262 0.317 83
US Blacks 0.320 0.360 85
Bengali Bangladesh 0.371 0.368 81
Chinese Dai 0.498 0.463
Utah Whites 0.521 0.534 99
Chinese, Beijing 0.509 0.468 105
Chinese, South 0.477 0.448 105
Colombian 0.471 0.476 83.5
Esan, Nigeria 0.286 0.341 71
Finland 0.556 0.573 101
British, GB 0.529 0.548 100
Gujarati Indian, Tx 0.391 0.403
Gambian 0.262 0.325 62
Iberian, Spain 0.533 0.566 97
Indian Telugu, UK 0.280 0.293
Japan 0.425 0.399 105
Vietnam 0.511 0.491 99.4
Luhya, Kenya 0.228 0.292 74
Mende, Sierra Leone 0.311 0.355 64
Mexican in L.A. 0.358 0.370 88
Peruvian, Lima 0.300 0.288 85
Punjabi, Pakistan 0.324 0.357 84
Puerto Rican 0.476 0.483 83.5
Sri Lankan, UK 0.308 0.323 79
Toscani, Italy 0.553 0.562 99
Yoruba, Nigeria 0.270 0.340 71
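As a rough illustration of how a population-level polygenic score and the correlation between two such scores can be computed, here is a minimal Python sketch. The function names are my own, the per-SNP frequencies in the last line are made up, and only six rows of Table 1 are used, so the resulting correlation is not the r = 0.964 reported for all 26 populations.

```python
from math import sqrt

def polygenic_score(allele_freqs):
    """Mean frequency of the trait-increasing alleles in one population."""
    return sum(allele_freqs) / len(allele_freqs)

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Six rows copied from Table 1: (Davies score, Rietveld score).
table1_subset = {
    "Afr.Car.Barbados": (0.262, 0.317),
    "Utah Whites": (0.521, 0.534),
    "Finland": (0.556, 0.573),
    "Japan": (0.425, 0.399),
    "Luhya, Kenya": (0.228, 0.292),
    "Peruvian, Lima": (0.300, 0.288),
}
davies = [pair[0] for pair in table1_subset.values()]
rietveld = [pair[1] for pair in table1_subset.values()]
r_subset = pearson_r(davies, rietveld)

# Hypothetical per-SNP frequencies for a single population (not real data).
example_score = polygenic_score([0.20, 0.35, 0.50])
```

Running the same `pearson_r` over all 26 rows, and against the IQ column for the populations that have one, would be the way to check the correlations quoted in the text.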

Overrepresentation of derived alleles among GWAS hits is a potential confound, because derived allele frequencies differ among populations owing to drift, bottlenecks, or GWAS artifacts (see my previous posts for an explanation). A baseline derived allele frequency (DAF) was therefore estimated using the 693 SNPs significant for human stature in the largest GWAS to date (Wood et al., 2014).

A multiple regression of population IQ on two predictors (baseline DAF and polygenic score) was run separately for each of the two sets of GWAS hits.

Table 2. Standardized beta coefficients. DAF = derived allele frequency. DP = derived alleles with positive effect.

  Baseline DAF Davies DP
Rietveld et al., 2013 0.406 0.464
Davies et al, 2015 0.307 0.587
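For readers who want to reproduce this kind of analysis, the sketch below shows one way to obtain standardized betas for a two-predictor regression, using the textbook closed form based on the three pairwise correlations. It is not necessarily the software route used for Table 2, the function names are my own, and the five data points are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def standardized_betas(y, x1, x2):
    """Standardized coefficients of y ~ x1 + x2 from pairwise correlations."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    denom = 1.0 - r_12 ** 2  # zero if the predictors are perfectly collinear
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Hypothetical values for five populations (not taken from the tables).
iq = [70.0, 85.0, 90.0, 100.0, 105.0]
baseline_daf = [0.29, 0.31, 0.36, 0.38, 0.37]
polygenic = [0.27, 0.32, 0.37, 0.52, 0.51]
beta_daf, beta_ps = standardized_betas(iq, baseline_daf, polygenic)
```

The closed form only covers the two-predictor case; with more predictors a general least-squares routine on z-scored variables would be needed.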

Both polygenic scores emerged as better predictors than baseline DAF. A DAF-calibrated score was calculated by subtracting baseline DAF from the frequency of derived hits. This likely represents selection signal on derived alleles as it controls for evolutionary dynamics such as random drift and population bottlenecks. Since the two population-level polygenic scores were highly correlated (r= 0.953), an average score was computed and is reported in table 3, ranked in descending order. This score is highly correlated to the average of the two polygenic scores obtained using all the SNPs (table 1), r= 0.971. However, the correlation with population IQ is slightly lower, at r= 0.687.

Table 3. DAF-calibrated polygenic scores for derived alleles and average polygenic score. Ranked in descending order. DAF= derived allele frequency.

Population  DAF-free derived hits, Rietveld et al., 2013  DAF-free derived hits, Davies et al., 2015  Average
Toscani, Italy 0.186 0.178 0.182
Finland 0.188 0.173 0.180
Iberian, Spain 0.188 0.155 0.171
British, GB 0.171 0.149 0.160
Utah Whites 0.160 0.140 0.150
Vietnam 0.114 0.160 0.137
Chinese, Beijing 0.092 0.155 0.123
Chinese Dai 0.087 0.145 0.116
Puerto Rican 0.104 0.116 0.110
Colombian 0.103 0.107 0.105
Chinese, South 0.064 0.127 0.096
Japan 0.025 0.072 0.049
Gujarati Indian, Tx 0.033 0.038 0.036
Mende, Sierra Leone 0.021 0.038 0.029
US Blacks 0.014 0.026 0.020
Esan, Nigeria 0.007 0.013 0.010
Bengali Bangladesh -0.011 0.022 0.006
Yoruba, Nigeria 0.005 0.002 0.004
Mexican in L.A. -0.007 0.000 -0.003
Gambian -0.022 -0.015 -0.018
Afr.Car.Barbados -0.036 -0.022 -0.029
Punjabi, Pakistan -0.036 -0.028 -0.032
Luhya, Kenya -0.046 -0.050 -0.048
Sri Lankan, UK -0.057 -0.042 -0.050
Peruvian, Lima -0.087 -0.045 -0.066
Indian Telugu, UK -0.090 -0.064 -0.077


We can see that genetic variants increasing fluid intelligence and educational attainment are highly correlated at the population level, suggesting one of two things: 1) there are common selection pressures on the two phenotypes, or 2) educational attainment is a good proxy for g and the SNPs found by Rietveld et al. (2013) are actually g-related (as was suggested by their replication on g in a sub-sample). The findings of the present study debunk two criticisms of my work: 1) that the observed allele frequency differences were “specific” to educational attainment and not really about intelligence, and 2) that derived allele differences caused by GWAS artifacts or random drift could mediate the effects. I showed that the observed effects are not due to different baseline derived allele frequencies, thus ruling this out as a possible confound. A discrepancy with IQ estimates is that East Asians lag behind Europeans and that South Asians and Hispanics do not score higher than sub-Saharan Africans, a finding that is difficult to explain at present.

Again, we observe a tendency for derived alleles (human-specific mutations, i.e. those not shared with non-human primates) to be overrepresented among the most significant intelligence GWAS hits, confirming the prediction that follows from the fact that intelligence has increased dramatically during human evolution.


Davies, G., Armstrong, N., Bis, J. C., et al. (2015). Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53949). Molecular Psychiatry, 20:183-192. doi: 10.1038/mp.2014.188

Rietveld, C.A., Medland, S.E., Derringer, J., Yang, J., Esko, T., Martin, N.W., et al. (2013). GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340, 1467-1471. doi:

Wood, A.R., Esko, T., Yang, J., et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics, 46(11), 1173-1186.







Using derived alleles to amplify selection signatures on intelligence

Author: Davide Piffer

The aim of this study is to identify polygenic selection signatures on intelligence across 26 populations from 1000 Genomes. In the next post, I will expand on this to include more populations (at the expense of the number and reliability of SNPs).

Derived allele frequencies and background calibration

At a theoretical level, an ancestral allele is the allele that was carried by the last common ancestor between humans and other primates whereas an allele is derived when it arose in the human lineage after the split from other primates. In practice, this allele is usually ascertained via comparison with chimpanzees. One limitation of this procedure is that if a mutation arose in chimpanzees after the split from humans, then the ancestral allele is not the chimp allele. Thus, 1000 Genomes infers ancestral alleles via alignment with 6 primate species (Ensembl, 2015).

Frequencies of derived alleles are not the same for all populations. Substantial DAF (derived allele frequency) differences across populations have been found, largely due to random drift and population bottlenecks but in part also shaped by different selection pressures (Henn et al., 2015). Non-African populations tend to have higher frequencies of derived alleles, and DAF is positively correlated with distance from Africa (Henn et al., 2015). There are also potential issues with GWAS. For example, a reviewer of a previous submission suggested that the minor alleles picked by a GWAS carried out on European subjects tend to have higher frequencies in the GWAS reference population (i.e. Europeans) than the average genome-wide frequencies of minor alleles. Minor alleles are more likely to be derived alleles, hence these derived alleles will have higher frequencies among Europeans than among other populations. If derived alleles tend to have a positive effect, the frequency of alleles with positive effect may be higher among Europeans than among other populations.

A novel methodology suggested here to deal with this confound is to create a variable which represents a good approximation to the average frequencies of derived alleles picked up by GWA studies. For this purpose, the significant hits (N= 693) from the largest GWAS of human stature to date (Wood et al., 2014) were grouped by allele status. The average frequency of derived alleles (including both alleles with a positive and a negative effect) was computed and then averaged into a single variable, henceforth the DAF index (table 1). Negative and positive alleles were given equal weight to avoid positive selection bias on the index.

Table 1. Mean derived allele frequencies and country IQ.

Population Height Derived IQ
Afr.Car.Barbados 0.298 83
US Blacks 0.309 85
Bengali Bangladesh 0.363 81
Chinese Dai 0.359
Utah Whites 0.382 99
Chinese, Beijing 0.365 105
Chinese, South 0.362 105
Colombian 0.372 83.5
Esan, Nigeria 0.286 71
Finland 0.385 101
British, GB 0.381 100
Gujarati Indian, Tx 0.365
Gambian 0.291 62
Iberian, Spain 0.378 97
Indian Telugu, UK 0.362
Japan 0.366 105
Vietnam 0.360 99.4
Luhya, Kenya 0.291 74
Mende, Sierra Leone 0.283 64
Mexican in L.A. 0.376 88
Peruvian, Lima 0.373 85
Punjabi, Pakistan 0.366 84
Puerto Rican 0.369 83.5
Sri Lankan, UK 0.362 79
Toscani, Italy 0.376 99
Yoruba, Nigeria 0.285 71
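The DAF index described above can be sketched as follows. This reflects my reading of the procedure (the mean derived-allele frequency is computed separately for derived alleles with positive and with negative effect, and the two means are then averaged with equal weight); the function name and the four example SNPs are hypothetical.

```python
def daf_index(snps):
    """Equal-weight average of two group means of derived-allele frequency.

    `snps` is a list of (derived_allele_frequency, effect_sign) pairs for one
    population, where effect_sign is +1 or -1 for the derived allele's effect.
    """
    positive = [freq for freq, sign in snps if sign > 0]
    negative = [freq for freq, sign in snps if sign < 0]
    mean_positive = sum(positive) / len(positive)
    mean_negative = sum(negative) / len(negative)
    # Averaging the group means keeps the larger group from dominating.
    return (mean_positive + mean_negative) / 2

# Hypothetical height SNPs for one population (not real data).
height_snps = [(0.40, +1), (0.30, +1), (0.20, +1), (0.50, -1)]
index = daf_index(height_snps)
```

With these made-up values the index is 0.40, whereas a plain mean over all four SNPs would be 0.35; the difference is what the equal weighting is meant to remove.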

Using the DAF from the GWAS on human stature, we note that derived alleles (col. 2) tend to be at lower frequencies among African than non-African populations, confirming the findings of a recent study (Henn et al., 2015) on different mutational load at common variants. The hypothesis that this phenomenon could mediate the association between IQ and polygenic scores is also supported by DAF’s positive correlation with population IQ (r=0.767). Note that the confounding effect would be present only when there are more derived positive than ancestral positive alleles. If these are represented in equal proportions, the overrepresentation of derived alleles in some populations will be perfectly balanced by the underrepresentation of ancestral alleles and vice versa. However, in cases where there is a dramatic overrepresentation of derived alleles (such as the top significant hits in Rietveld et al., 2013), it is necessary to control for background DAF. Moreover, a larger sample of SNPs (such as the 693 SNPs from the height GWAS) yields a more accurate estimate of the background DAF than a smaller subset of SNPs would.

A DAF-calibrated polygenic score is then created by subtracting the DAF index from the average frequency of derived alleles with positive effect from GWAS SNPs. Table 2 reports standardized scores, in descending order (sorted by the mean value of the two scores).
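A minimal sketch of the calibration and standardization steps follows. The DAF index values are copied from Table 1, while the derived-positive frequencies are made up for illustration. The use of the population standard deviation for the Z scores is an assumption on my part, as are the function names.

```python
from statistics import mean, pstdev

def calibrated_scores(derived_positive, daf_index):
    """Subtract each population's DAF index from its mean frequency of
    derived alleles with positive effect."""
    return {pop: derived_positive[pop] - daf_index[pop]
            for pop in derived_positive}

def z_standardize(scores):
    """Convert raw scores to Z scores across populations."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)  # population SD: an assumption
    return {pop: (value - mu) / sigma for pop, value in scores.items()}

# DAF index values from Table 1; derived-positive values are hypothetical.
daf_index = {"Toscani, Italy": 0.376, "Japan": 0.366, "Yoruba, Nigeria": 0.285}
derived_positive = {"Toscani, Italy": 0.500, "Japan": 0.400, "Yoruba, Nigeria": 0.300}

calibrated = calibrated_scores(derived_positive, daf_index)
z = z_standardize(calibrated)
ranked = sorted(z, key=z.get, reverse=True)  # descending order, as in Table 2
```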

Note that we could also apply the reverse procedure and calculate a background frequency of ancestral alleles (1-DAF). Then one could subtract that from the average frequency of ancestral alleles with positive effect. This is perhaps justified for traits such as height which were not subject to a dramatic increase during human evolution. However, since intelligence has been subject to a sharp increase and most intelligence-enhancing mutations are likely to be human-specific and not shared with our primate ancestors, by focusing on derived alleles one likely amplifies the signal of selection.

Table 2. Background “DAF-free” polygenic scores (P.S.), reported as Z scores and sorted in descending order of the average.

Population  P.S., Rietveld et al., 2014  P.S., p<5×10^-8  P.S., 5×10^-8≤p<5×10^-7  Average
Toscani, Italy 1.671 1.620 1.496 1.596
Iberian, Spain 1.567 1.646 1.391 1.535
Finland 1.358 1.645 1.113 1.372
British, GB 0.886 1.446 1.397 1.243
Vietnam 0.481 0.798 1.679 0.986
Japan 1.667 -0.230 1.124 0.854
Utah Whites 0.239 1.319 0.908 0.822
Chinese, Beijing 0.462 0.536 0.736 0.578
Chinese, South 0.494 0.221 0.893 0.536
Chinese Dai -0.229 0.485 0.414 0.223
Gujarati Indian, Tx -0.267 -0.135 -0.159 -0.187
Mende, Sierra Leone 0.133 -0.276 -0.453 -0.199
Colombian -0.847 0.672 -0.433 -0.202
Yoruba, Nigeria 0.309 -0.456 -0.551 -0.233
Puerto Rican -1.178 0.683 -0.220 -0.239
US Blacks -0.245 -0.353 -0.600 -0.399
Gambian 0.233 -0.770 -0.709 -0.415
Afr.Car.Barbados 0.187 -0.931 -0.922 -0.555
Esan, Nigeria -0.626 -0.444 -0.746 -0.605
Punjabi, Pakistan -0.760 -0.928 -0.164 -0.618
Bengali Bangladesh 0.262 -0.646 -1.532 -0.639
Luhya, Kenya -0.947 -1.044 -0.356 -0.782
Indian Telugu, UK -0.389 -1.558 -0.702 -0.883
Sri Lankan, UK -0.230 -1.177 -1.293 -0.900
Mexican in L.A. -2.045 -0.602 -0.508 -1.052
Peruvian, Lima -2.187 -1.523 -1.804 -1.838

The correlation between this score and that obtained using the raw frequencies (total polygenic score= derived and ancestral alleles with positive effect) is r=0.889. These are reported in table 3.

The calibrated scores are correlated with population IQ: r = 0.462, 0.628 and 0.752 for the Rietveld et al. (2014) hits, the GWAS-significant hits (p < 5×10^-8) and the other hits (5×10^-8 ≤ p < 5×10^-7), respectively.

The correlations of the mean calibrated and mean uncalibrated scores with IQ are r = 0.68 and r = 0.79, respectively.

Table 3. Total polygenic scores (Ancestral and derived alleles with positive effect), reported in descending order.

Population  Rietveld et al., 2014 (N=67)  p<5×10^-8 (N=10)  5×10^-8≤p<5×10^-7 (N=49)  Average
Iberian, Spain 0.468 0.566 0.569 0.534
Toscani, Italy 0.467 0.562 0.568 0.532
Finland 0.465 0.573 0.530 0.523
British, GB 0.458 0.548 0.560 0.522
Utah Whites 0.459 0.534 0.530 0.507
Vietnam 0.459 0.491 0.565 0.505
Chinese, Beijing 0.471 0.468 0.555 0.498
Chinese, South 0.466 0.448 0.543 0.485
Puerto Rican 0.449 0.483 0.520 0.484
Colombian 0.445 0.476 0.519 0.480
Chinese Dai 0.454 0.463 0.520 0.479
Japan 0.474 0.399 0.554 0.476
Gujarati Indian, Tx 0.449 0.403 0.493 0.448
Mexican in L.A. 0.431 0.370 0.515 0.439
Punjabi, Pakistan 0.453 0.357 0.490 0.433
US Blacks 0.451 0.360 0.468 0.426
Mende, Sierra Leone 0.458 0.355 0.462 0.425
Yoruba, Nigeria 0.458 0.340 0.468 0.422
Esan, Nigeria 0.455 0.341 0.461 0.419
Bengali Bangladesh 0.450 0.368 0.435 0.418
Gambian 0.456 0.325 0.456 0.412
Afr.Car.Barbados 0.459 0.317 0.460 0.412
Sri Lankan, UK 0.458 0.323 0.445 0.409
Peruvian, Lima 0.427 0.288 0.498 0.404
Luhya, Kenya 0.450 0.292 0.463 0.402
Indian Telugu, UK 0.451 0.293 0.457 0.400

We can apply the reverse procedure to determine whether ancestral alleles contain signal above and beyond the background AAF (ancestral allele frequency) distribution. We can carry this out using the Rietveld et al. (2014) hits and the Rietveld et al. (2013) hits with 5×10^-8 ≤ p < 5×10^-7, but not the top 10 SNPs, because they contain only 1 ancestral allele with positive effect. Table 4 reports the difference between AP (ancestral alleles with positive effect) for Rietveld et al. (2014 and 2013) and the background AAF (AP-AAF), together with population IQ.

Table 4. Ancestral alleles with positive effect – AAF.


Population  AP-AAF, Rietveld et al., 2014  AP-AAF, Rietveld et al., 2013 (5×10^-8≤p<5×10^-7)  IQ
Afr.Car.Barbados -0.003 -0.079 83
US Blacks -0.025 -0.079 85
Bengali Bangladesh -0.074 -0.105 81
Chinese Dai -0.052 -0.025
Utah Whites -0.062 -0.030 99
Chinese, Beijing -0.022 0.030 105
Chinese, South -0.035 0.000 105
Colombian -0.075 0.012 83.5
Esan, Nigeria 0.007 -0.087 71
Finland -0.067 -0.039 101
British, GB -0.075 0.008 100
Gujarati Indian, Tx -0.069 -0.051
Gambian -0.009 -0.096 62
Iberian, Spain -0.058 0.026 97
Indian Telugu, UK -0.061 -0.096
Japan -0.034 0.012 105
Vietnam -0.051 0.008 99.4
Luhya, Kenya -0.005 -0.099 74
Mende, Sierra Leone 0.005 -0.098 64
Mexican in L.A. -0.097 0.011 88
Peruvian, Lima -0.104 0.038 85
Punjabi, Pakistan -0.052 -0.055 84
Puerto Rican -0.056 0.006 83.5
Sri Lankan, UK -0.043 -0.092 79
Toscani, Italy -0.061 0.019 99
Yoruba, Nigeria 0.000 -0.079 71

The correlation between AP-AAF (Rietveld et al., 2014) and IQ is negative: r = -0.472. The correlation between AP-AAF (Rietveld et al., 2013) and IQ is positive: r = 0.742.


Controlling for different population DAFs does not substantially alter the overall pattern, although there is a slight reduction in fit (the correlation with population IQ drops from 0.79 to 0.68), which may or may not be a fluke. The far from perfect correlation with population IQ is due to the top place occupied by Europeans instead of East Asians and a tendency for Latin Americans and South Asians (Indians, Bangladeshi) to score as low as sub-Saharan Africans. We also notice that ancestral positive alleles (r = -0.472 and 0.742; table 4) do not have as strong a correlation with population IQ as derived positive alleles. This is expected on evolutionary grounds, as selection on intelligence should have acted on human-specific mutations rather than on ancestral variants shared with non-human primates.


Ensembl, 2015:

Davies, G., Armstrong, N., Bis, J. C., et al. (2015). Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53949). Molecular Psychiatry, 20:183-192. doi: 10.1038/mp.2014.188

Henn, B.M., Botigué, L.R., Peischl, S., Dupanloup, I., Lipatov, M., Maples, B.K., Martin, A.R., Musharoff, S., Cann, H., Snyder, M.P., Excoffier, L., Kidd, J.M., Bustamante, C.D. (2015). Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. PNAS; published ahead of print December 28, 2015. doi:10.1073/pnas.1510805112

Rietveld, C.A., Medland, S.E., Derringer, J., Yang, J., Esko, T., Martin, N.W., et al. (2013). GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340, 1467-1471. doi:

Rietveld, C.A., Esko, T., Davies, G., Pers, T.H., Turley, P., Benyamin, B., et al. (2014). Common genetic variants associated with cognitive performance identified using the proxy-phenotype method. Proceedings of the National Academy of Sciences, USA, 111, 13790-13794. doi:10.1073/pnas.1404623111

Wood, A.R., Esko, T., Yang, J., et al. (2014). Defining the role of common variation in the genomic and biological architecture of adult human height. Nature Genetics, 46(11), 1173-1186.






The forbidden paper on the population genetics of IQ

Author: Davide Piffer

I submitted a paper to Intelligence in December, 2015. After about three weeks, I received a rejection letter from the new editor (Richard Haier). What was particularly irritating about one of the reviewers was the recommendation to reject without opportunity for revision. In my opinion, this stance is justified only in extreme instances of fatal flaws, otherwise it just reveals a hidden agenda or a general close-minded attitude. My policy has been for some time to post the reviews of rejected papers, because I do not believe that reviews should be hidden. Transparency is very important, particularly in science. Let the general public decide whose arguments provide a better fit to the data, not the dismissive attitude of a reviewer. The reviews are attached in the appendix and the paper can be downloaded from here

This review was obviously written by an expert in the field, although it is not devoid of some generic comments that are irritating because they leave the question that they raise unanswered and they do not provide any references to back up their claims (e.g. “one research group with control of an extremely large family cohort is currently working on a manuscript documenting that years of education is subject to a very peculiar form of confounding”). Really? Which “peculiar form of confounding”? Which large family cohort and which manuscript? Which research group?

Isn’t it funny how a reviewer can afford to be generic and not provide any justification or references to back up their claims, while the authors have to take extreme pains to make sure that everything is backed by sound evidence? Why this double standard? Perhaps because reviewers work for free, and nobody likes to do unpaid work.

There are a few serious comments that deserve consideration. For example: “A GWAS of Europeans is more likely to detect SNPs with high minor allele frequencies. The minor allele is usually the derived allele, and thus the use of SNPs ascertained to have low p-values in a GWAS of Europeans will lead to an overrepresentation of SNPs with high derived allele frequencies specifically in Europeans. If the derived allele tends to have a positive effect (as the authors claim), this is certainly an issue that needs to be carefully addressed.”

This argument is not explained very clearly. It’s another example of how reviewers expect crystal-clear writing from authors, yet can get away with obscure comments that leave room for different interpretations.

This could mean two things. The first is that the GWAS tends to select trait-increasing alleles that are derived. However, upon closer inspection this turns out to be fallacious. It wrongly assumes that GWAS hits always have a positive beta, which is not the case: positive and negative betas are randomly distributed across GWAS hits. Thus, when the GWAS selects a hit with a negative beta, which should be more likely to be the minor allele and hence derived, the allele with a positive beta (in this case, IQ-enhancing) is more likely to be the major allele and hence ancestral.

Nonetheless, I counted the number of derived alleles among the alleles increasing height in the latest and biggest GWAS meta-analysis of variation in human stature. This gives us an estimate of the GWAS bias towards picking derived alleles.

The derived to total allele count ratio is 370/691 or 53.5%. Assessing the statistical significance of this result is problematic because many SNPs are in linkage disequilibrium and violate the assumption that they represent independent observations. It’s likely that this is just a statistical fluke but, nonetheless, we can give the reviewer’s fallacious reasoning the benefit of the doubt.

The derived to total allele count ratio for intelligence-enhancing alleles is 42/66 or 63.6%. The good news is that here we can apply the binomial distribution to calculate statistical significance, because the SNPs were pruned for LD by the authors (Rietveld et al., 2014). Under a null proportion of 50%, the probability that X (the number of derived alleles) >= 42 is p = 0.0179.

However, to be fair we have got to include the knowledge acquired by the height GWAS and assume that there is a bias for derived alleles in the GWAS results. The best estimate of this bias is equal to the percentage of derived alleles in excess of 50% in the height meta-analysis, that is 3.5%.

A binomial calculation assuming a background frequency of 53.5% will yield a p value of 0.062, which is not extremely strong but not too shabby either.
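The two binomial calculations above can be checked with a few lines of standard-library Python. This is a sketch of a one-sided (upper-tail) exact test, which is how I read "the probability that X >= 42"; the function name is my own.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

# 42 derived alleles out of 66 intelligence-enhancing alleles.
p_null_half = binomial_upper_tail(42, 66, 0.5)      # null: 50% derived
p_null_height = binomial_upper_tail(42, 66, 0.535)  # null: 53.5%, from the height GWAS
```

The first value should come out close to the 0.0179 quoted above and the second close to 0.062. A two-sided test would give larger p values.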

The second, and more likely, interpretation of the reviewer’s comment is that the minor alleles picked by the GWAS tend to have higher frequencies in the GWAS reference population (i.e. Europeans) than the average genome-wide frequencies of minor alleles. Minor alleles are more likely to be derived alleles, hence these derived alleles will have higher frequencies among Europeans than among other populations. Since derived alleles tend to have a positive effect, the frequency of alleles with positive effect will tend to be higher among Europeans than among other populations. It was hard work translating the reviewer’s obscure words into an understandable sentence.

We can again give the reviewer the benefit of the doubt and see whether derived alleles with a positive effect have higher frequencies among Europeans than ancestral alleles with a positive effect, and whether their average frequencies are still correlated with population IQs.

Table 1. Top 69 cognitive performance significant SNPs in Rietveld et al. (2014).


Population Derived positive Derived Negative Total PS IQ DP-DN
Afr.Car.Barbados 0.317 0.301 0.459 83 0.015
US Blacks 0.324 0.334 0.451 85 -0.010
Bengali Bangladesh 0.383 0.437 0.450 81 -0.054
Chinese Dai 0.373 0.411 0.454 -0.037
Utah Whites 0.401 0.444 0.459 99 -0.043
Chinese, Beijing 0.387 0.388 0.471 105 -0.001
Chinese, South 0.384 0.398 0.465 105 -0.014
Colombian 0.381 0.447 0.445 83.5 -0.066
Esan, Nigeria 0.297 0.279 0.455 71 0.018
Finland 0.415 0.451 0.465 101 -0.037
British, GB 0.406 0.455 0.458 100 -0.049
Gujarati Indian, Tx 0.380 0.435 0.449 -0.055
Gambian 0.310 0.300 0.456 62 0.010
Iberian, Spain 0.410 0.436 0.468 97 -0.026
Indian Telugu, UK 0.375 0.423 0.450 -0.048
Japan 0.399 0.400 0.474 105 -0.001
Vietnam 0.382 0.412 0.459 99.4 -0.030
Luhya, Kenya 0.299 0.296 0.450 74 0.003
Mende, Sierra Leone 0.301 0.277 0.458 64 0.024
Mexican in L.A. 0.374 0.473 0.431 88 -0.100
Peruvian, Lima 0.370 0.477 0.427 85 -0.108
Punjabi, Pakistan 0.376 0.418 0.453 84 -0.042
Puerto Rican 0.375 0.425 0.449 83.5 -0.050
Sri Lankan, UK 0.377 0.405 0.458 79 -0.028
Toscani, Italy 0.409 0.437 0.466 99 -0.028
Yoruba, Nigeria 0.305 0.285 0.458 71 0.020
r x IQ 0.833 0.654 0.413 -0.297


It is indeed the case, as the reviewer had predicted, that derived alleles have a higher frequency among Europeans, whether they have a positive effect or not. But the question is: Are derived alleles with a positive effect better predictors of population IQ than derived alleles with a negative effect? If the alleles contain signal that goes above and beyond that produced by being derived, the correlation between derived positive and country IQ should be stronger than that between derived negative and country IQ. In other words, this would tell us that the GWAS found signal above and beyond that provided simply by (ancestral vs derived) allele status.

The correlation between DP (derived alleles with positive effect) and country IQ is r= 0.83. The correlation between country IQ and AP (ancestral alleles with positive effect) is r=-0.65.

This implies that the signal in the total polygenic score (average frequency of all derived and ancestral alleles together) is partly driven by the derived alleles. However, a closer inspection of the matrix will tell us that the correlation between derived alleles with negative effect and IQ is r=0.65, which is lower than that between derived alleles with positive effect and population IQ (r=0.83).

Clearly, more SNPs are required to validate this picture.

Let’s look at the hits found by Rietveld et al. (2013). To avoid post-hoc classifications, I employed the same classification that I used for the analysis in my paper. There were 10 genome-wide significant SNPs (p < 5×10^-8). However, 9/10 alleles with positive effect were derived, so there were not enough ancestral positive alleles to make a comparison. The SNPs with a p value between 5×10^-8 and 5×10^-7 numbered N = 99. Among these, derived and ancestral alleles are equally represented (DA:AA = 49:50).

The same procedure applied to the Rietveld et al. (2014) SNPs to control for differential distribution of derived alleles due to GWAS artifact or bottleneck effects (Henn et al., 2015) will be employed here. Alleles with a positive effect are divided into two sub-groups: those that are derived and those that are ancestral. Reversing their frequencies (1-n) yields the frequencies of derived negative and ancestral negative alleles, respectively. These are shown in table 2.

Table 2. Educational attainment SNPs with a p value between 5×10^-8 and 5×10^-7 from Rietveld et al. (2013).

Population Derived Positive Derived Negative IQ DP-DN
Afr.Car.Barbados 0.302 0.377 83 -0.075
US Blacks 0.328 0.387 85 -0.059
Bengali Bangladesh 0.339 0.468 81 -0.129
Chinese Dai 0.425 0.384 0.041
Utah Whites 0.471 0.412 99 0.059
Chinese, Beijing 0.446 0.336 105 0.111
Chinese, South 0.451 0.362 105 0.088
Colombian 0.399 0.360 83.5 0.039
Esan, Nigeria 0.298 0.372 71 -0.074
Finland 0.483 0.423 101 0.060
British, GB 0.492 0.372 100 0.120
Gujarati Indian, Tx 0.405 0.417 -0.012
Gambian 0.305 0.387 62 -0.082
Iberian, Spain 0.490 0.352 97 0.138
Indian Telugu, UK 0.376 0.458 -0.082
Japan 0.466 0.355 105 0.111
Vietnam 0.485 0.353 99.4 0.132
Luhya, Kenya 0.321 0.390 74 -0.069
Mende, Sierra Leone 0.309 0.381 64 -0.072
Mexican in L.A. 0.400 0.365 88 0.035
Peruvian, Lima 0.337 0.335 85 0.001
Punjabi, Pakistan 0.405 0.421 84 -0.016
Puerto Rican 0.406 0.363 83.5 0.043
Sri Lankan, UK 0.349 0.454 79 -0.105
Toscani, Italy 0.492 0.357 99 0.136
Yoruba, Nigeria 0.306 0.364 71 -0.058
r x IQ 0.891 -0.255 0.848
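The split into derived positive (DP) and derived negative (DN) frequencies used for this table can be sketched as below, for a single population. The input format, the function name and the four example SNPs are hypothetical; the only step taken from the text is that, at a biallelic SNP whose trait-increasing allele is ancestral, the derived allele is the trait-decreasing one and has frequency 1 - f.

```python
def derived_scores(positive_alleles):
    """Return (DP, DN, DP - DN) for one population.

    `positive_alleles` is a list of (frequency, is_derived) pairs, one per
    SNP, describing the trait-increasing allele at that SNP.
    """
    derived_pos = [f for f, is_derived in positive_alleles if is_derived]
    ancestral_pos = [f for f, is_derived in positive_alleles if not is_derived]
    dp = sum(derived_pos) / len(derived_pos)
    # Reversing the ancestral-positive frequencies gives derived-negative ones.
    dn = sum(1 - f for f in ancestral_pos) / len(ancestral_pos)
    return dp, dn, dp - dn

# Hypothetical SNPs for one population (not real data).
example = [(0.6, True), (0.4, True), (0.7, False), (0.5, False)]
dp, dn, diff = derived_scores(example)
```

With these made-up values DP is 0.5, DN is 0.4 and DP-DN is 0.1.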

First, we can see that the reviewer’s claim that derived alleles have higher frequencies among Europeans is debunked: this is true only for derived alleles with a positive effect, not for those with a negative effect, which actually reach higher frequencies among South Asians (e.g. Indian Telugu: 0.458) but are otherwise equally distributed across Africans (e.g. Esan, Nigeria: 0.372) and Europeans (e.g. British: 0.372). If GWAS hits really had higher frequencies among Europeans than Africans simply because of a methodological artifact, as the reviewer suggests, this should apply irrespective of the effect on educational attainment. In other words, both positive- and negative-effect derived alleles should be found at higher frequencies among Europeans. What about the polygenic scores’ correlations with population IQ? Again, if these correlations were driven only by derived allele status, alleles with a positive effect on educational attainment should not be more strongly correlated with population IQ than alleles with a negative effect.

We can see that the correlation between the derived positive polygenic score and IQ is 0.89, much higher than that between derived negative and IQ (-0.25). This suggests that the alleles pick up a selection signal that goes above and beyond random drift or GWAS artifacts. Another interesting result is that ancestral alleles with a positive effect do not seem to predict population IQ (r=0.25), confirming my prediction that intelligence-enhancing alleles should be overrepresented among human-specific mutations. If we assume that human-specific mutations with a positive effect on IQ at the individual level are the least likely to contain false positives, we can consider this the best measure of selection pressure strength across populations. This index peaks among Europeans (highest scores for Italians and British = 49%) and East Asians (e.g. Chinese, Beijing = 44.6%). South Asians have lower scores (Bangladesh = 33.9%), and sub-Saharan African populations lower still (around 30%).

Perhaps another measure of selection would be the difference between derived positive and derived negative (DP-DN) allele frequencies. This would take into account the DAF (derived allele frequency) distributions due to population bottlenecks and drift. Even this measure is substantially correlated with population IQ (r=0.85). Applied to the 69 SNPs from Rietveld et al. (2014), however, the DP-DN score is weakly and negatively correlated with population IQ (r=-0.297).

Another way to validate a measure is to see how well it replicates across datasets: are derived allele frequencies from one dataset correlated to derived allele frequencies in the other? What we are interested in here is whether derived alleles with a positive effect on intelligence have similar frequencies across datasets. If they do, this suggests that they are picking up more than random noise.

It turns out that the correlation between derived positive allele frequencies in the two datasets (Rietveld et al., 2013 and Rietveld et al., 2014) is positive (r= 0.88). On the other hand, the correlation between the derived negative alleles is near zero (r= 0.08). This suggests that alleles with a positive effect on IQ pick up a selection signal, whereas the alleles with a negative effect on IQ represent noise. If the latter were mere noise, the method of subtracting dn from dp would not be sound either. Again, more data are needed to shed light on this issue.


A somewhat puzzling finding is the dramatic drop in the percentage of derived alleles with a positive effect when the p value goes above the conventional GWAS significance threshold (p<5*10-8). 9/10 of the GWAS significant hits were derived. However, only about 50% of those belonging to the second group (p value between 5*10-7 and 5*10-8) were derived. The dramatic drop is perhaps an artifact of adopting a dichotomous approach, dividing the groups by a conventional threshold. One would have to correlate the p value to the derived vs ancestral allele status. This was done in my paper using the 67 alleles found by Rietveld et al. (2014) to increase cognitive performance, and a slightly positive effect was found. Using the 109 SNPs (top 10 + 99 making up the second group) yields a correlation r= -0.019. Since derived alleles are coded as 1 and ancestral ones as 0, this implies that there is a very weak association between derived status and low p value. However, this is driven entirely by the top 10 SNPs. A limitation of this analysis is that the SNPs are not independent (they are in LD), hence if there are clusters of SNPs around a certain p value, this will bias the derived allele count, giving undue weight to alleles in that p value range. Bigger samples of SNPs pruned for LD will be required to replicate the association between derived status and positive effect found in the Rietveld et al. (2014) data set.
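As a rough sketch of this kind of check, the R snippet below correlates a 0/1 derived-status variable with the p value and compares the share of derived alleles on each side of the threshold. The eight SNPs are invented for illustration, and the -log10 transformation is my own addition here, not part of the analysis reported above.

```r
# Hypothetical SNP table (made-up values, not the Rietveld et al. data)
snps <- data.frame(
  p_value = c(1e-9, 3e-9, 2e-8, 6e-8, 1e-7, 2e-7, 3e-7, 4e-7),
  derived = c(1, 1, 1, 0, 1, 0, 0, 1)  # 1 = enhancing allele is derived, 0 = ancestral
)

# Point-biserial correlation is Pearson's r with a 0/1 variable.
# A negative r means derived status goes with lower p values.
r_raw <- cor(snps$derived, snps$p_value)

# p values span orders of magnitude, so -log10(p) is an alternative scale;
# here a positive r means derived status goes with stronger signals.
r_log <- cor(snps$derived, -log10(snps$p_value))

# Share of derived alleles on each side of the conventional threshold
share_derived <- tapply(snps$derived, snps$p_value < 5e-8, mean)
print(round(c(r_raw = r_raw, r_log = r_log), 3))
print(share_derived)
```

Looking at the correlation alongside the two shares shows how much of an apparent drop depends on where the threshold is placed.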


The reviewer thinks that the derived alleles are not necessarily enriched for intelligence enhancing signal and stated:  “it is not necessarily the case that an association between derived status and a positive effect points toward selection increasing the mean of the trait. Such selection can actually lead to the opposite association (between derived status and a negative effect) at certain allele frequencies.”

I must confess that I do not understand this argument. Surely if a mutation unique to the human lineage (arisen after the split from the most recent common ancestor of humans and other primates) had been detrimental, making humans less intelligent than other primates, it would have been selected against and hence would have disappeared from the genome? Purifying selection is much more common than positive selection because random mutations are usually deleterious.

Selection increasing the mean of the trait does actually produce an increase in derived alleles when there has been positive directional selection for the trait in a species. We know that this is the case for humans, as cranial capacity and behavioral complexity have dramatically increased in the last 4 million years and modern humans are much more intelligent than non-human primates. Selection must necessarily have increased the frequency of intelligence-enhancing mutations, hence of the derived alleles.

The reviewer’s argument would apply to height, as there has not really been an increase in stature, at least from Homo erectus to Homo sapiens sapiens. And that is indeed what we found: height-increasing alleles are only marginally enriched for derived alleles (53%), a finding that is likely a fluke.

Another comment worthy of consideration is this: “the extrapolation to non-European populations is still problematic because the accuracy of the polygenic score declines in such populations as a result of differing LD patterns (Scutari et al., 2015).”

Differences in LD should simply reduce the frequency differences at the tag SNPs between populations, compared to the real causal SNPs. This is due to a phenomenon called “attenuation”. Indeed, correction for attenuation is used “to rid a correlation coefficient from the weakening effect of measurement error” (Jensen, 1998). This scenario works in the case that the frequency differences between tag and causal SNPs are due to random error, so that the mean frequency of the cognitive ability alleles is equal to the (genome-wide) background frequency (which, for mathematical reasons, is 50%). If instead there is a systematic bias, so that the mean frequency of the causal alleles is lower than the background frequency, then attenuation will reduce observed population-level frequency differences at tag alleles. As the reviewer says, “A GWAS of Europeans is more likely to detect SNPs with high minor allele frequencies”. Hence, the average frequency of the causal alleles identified by the GWAS tends to be lower than 50%. That this is true can be seen from the tables displaying the average frequencies of educational attainment increasing alleles, which tend to be much lower than 50%, especially at the lowest p values.

For example, let the average frequency of causal alleles be 40% in the reference European population. We also know that the average genome-wide frequency of alleles is 50% in all populations (the frequencies of the two alleles at a site always sum to 100%). If LD breaks down at some loci so that the tag SNP is uncorrelated to the causal SNP, the tag SNP in the non-European population will be biased towards a higher frequency than in the European population. Hence, differences in LD should cause non-European populations to have higher frequencies at the tag SNPs (that is, the “GWAS hits”) than European populations and should reduce frequency differences among these populations, as all of them tend to be closer to 50%.
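The arithmetic behind this argument can be sketched as follows. The sketch assumes a simple mixture model of my own choosing for illustration: a fraction b of tag SNPs has lost LD in the non-European population and sits, on average, at the 50% background frequency, while the rest still track the causal SNPs. The 30% figure and the names b and expected_tag_freq are hypothetical.

```r
# Illustrative numbers only; the mixture model itself is an assumption
p_causal_eur   <- 0.40  # mean causal-allele frequency in Europeans (from the example above)
p_causal_other <- 0.30  # hypothetical true mean frequency in a non-European population
p_background   <- 0.50  # mean genome-wide allele frequency assumed in the text

# Expected mean tag-SNP frequency when a fraction b of tags have lost LD
expected_tag_freq <- function(p_causal, b, background = p_background) {
  (1 - b) * p_causal + b * background
}

b <- c(0, 0.25, 0.5)                      # fraction of loci where LD has broken down
tag_other <- expected_tag_freq(p_causal_other, b)
observed_gap <- p_causal_eur - tag_other  # European tags assumed to track causal SNPs fully
print(data.frame(b, tag_other, observed_gap))
```

Under these assumptions the observed gap shrinks as b grows, which is the attenuation described above: the tag frequencies in the non-European population drift towards 50%.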

So this is the opposite of what the reviewer said: “Now suppose that in a different population the SNPs are uncorrelated, the reference allele at the causal SNP has a somewhat higher frequency, and the reference allele at the tag SNP has a much lower frequency. Then the inference made from comparing the polygenic scores of the two populations is exactly the opposite of the truth.”

The reviewer’s argument can perhaps apply to a single SNP, but there is no reason why there should be a systematic bias in the direction predicted by that argument, and sadly the reviewer just assumes that this is so without providing any justification.


Jensen, A.R. (1998). The g Factor: The Science of Mental Ability. Praeger, Connecticut, USA.

Henn et al. (2015). Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. PNAS




Reviewers’ comments:


Reviewer #1:I recommend the rejection of this manuscript without opportunity for revision. It does not meet the very high standards for demonstrations of natural selection acting to differentiate modern human populations that have been set by recent publications (Turchin et al., 2012; Robinson et al., 2015). Here I will only detail a few of the manuscript’s shortcomings.


The authors do not address the possibility that the GWAS results of Rietveld et al. (2013) are contaminated by confounding (cognition- or education-affecting environmental variables that happen to be correlated with genetic variation). Although the original publications of the SSGAC deal with this issue to some extent, they do not come up to the standards set in the papers that I have cited in the previous paragraph. Furthermore, one research group with control of an extremely large family cohort is currently working on a manuscript documenting that years of education is subject to a very peculiar form of confounding. Until these results are published and well absorbed, any naive inferences regarding the basis of racial differences should be regarded with skepticism.


The authors also do not address the issue of ascertainment bias. A GWAS of Europeans is more likely to detect SNPs with high minor allele frequencies. The minor allele is usually the derived allele, and thus the use of SNPs ascertained to have low p-values in a GWAS of Europeans will lead to an overrepresentation of SNPs with high derived allele frequencies specifically in Europeans. If the derived allele tends to have a positive effect (as the authors claim), this is certainly an issue that needs to be carefully addressed.


True, it may be that ascertainment bias is less of an issue when all SNPs regardless of p-value are used to construct a polygenic score. But the extrapolation to non-European populations is still problematic because the accuracy of the polygenic score declines in such populations as a result of differing LD patterns (Scutari et al., 2015). An example will make this clear. Suppose that two SNPs in perfect LD in Europeans have quantitatively close positive reference betas. Now suppose that in a different population the SNPs are uncorrelated, the reference allele at the causal SNP has a somewhat higher frequency, and the reference allele at the tag SNP has a much lower frequency. Then the inference made from comparing the polygenic scores of the two populations is exactly the opposite of the truth. We can conclude from this that the use of polygenic scores to infer the causes of intercontinental differences requires much more care than given to it here.


Because stabilizing selection (favoring the “golden mean,” as the authors put it) also eliminates genetic variation, higher dispersion of allele frequencies across populations is by itself not diagnostic of directional selection.


The fact that a large fraction of the enhancing alleles reported by Rietveld et al. (2014) SNPs are derived does not mean very much. First, as it is likely that many of the SNPs are not causal, the relationship between derived alleles at different polymorphic sites must be addressed. Second, even if it be assumed that these are the causal SNPs, it is not necessarily the case that an association between derived status and a positive effect points toward selection increasing the mean of the trait. Such selection can actually lead to the opposite association (between derived status and a negative effect) at certain allele frequencies.


A general comment is that the appropriateness of much of the hypothesis testing in this paper is difficult to judge. The stochastic model justifying a particular statistical test is usually unclear. Is the source of randomness inaccuracy in the GWAS estimates? The inherent stochasticity of evolution?


Robinson, M. R., Hemani, G., Medina-Gomez, C., et al. (2015). Population differentiation of height and body mass index across Europe. Nature Genetics, 47, 1357-1362.


Scutari, M., Mackay, I., & Balding, D. J. (2015). Using genetic distance to infer the accuracy of genomic prediction. arXiv:1509.00415.


Turchin, M. C., Chiang, C. W. K., Palmer, C. D., Sankararaman, S., Reich, D., GIANT Consortium, & Hirschhorn, J. N. (2012). Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nature Genetics, 44, 1015-1019.


Reviewer #2: I found this paper extremely reader-unfriendly.  Starting from the title – which is cumbersome, to the tables – which lack meaningful explanations and notes, to references to specialist concepts – that require much further explanation for non-expert reader, to the general structure of the write up – the paper needs extensive revisions before it can be considered for publication for Intelligence.

The paper is full of poorly justified conclusions.  For example, in the abstract, the author claims: ‘Cognitive-enhancing SNPs were significantly enriched for derived alleles (64%), that is human-specific mutations that originated after the split from the most recent common ancestor between humans and other primates.’  However, the Derived vs ancestral alleles section on page 9 does not present the relevant analyses in details, and therefore the conclusion is not justified.

The paper is full of sentences that would require further clarifications for non-expert audience.  For example: ‘Differences in allele frequencies between populations can be created by directional selection when the strength and/or direction of selection on the phenotype differs among populations. In this case it is also characterized as diversifying selection, in contrast to stabilizing selection which tends to favor the “golden mean.”’


‘Diversifying selection is most commonly measured using the Fst index at or around single loci (Holsinger & Weir, 2009).’  This needs to be explained further.


‘Some SNPs had opposite betas on the two outcome variables (yes/no college completion and total years of education).’ This requires further discussion.


I could go on giving examples of unclear sentences, but I believe that the paper needs to be worked on- the author should consult with non-expert (in this specific area) intelligence researchers – to arrive at a clearer, more streamlined and better explained manuscript.  All analyses require further explanations, perhaps, with specific examples, that would talk the reader through every step of the analyses.





Agreement between Q-Q plot and Shapiro-Wilk test of normality

Davide Piffer – 03/08/2015

Q-Q plots are commonly used to detect deviations from the normal distribution. This can be done visually or – more formally – by calculating the correlation between the theoretical and the empirical quantiles.

Another widely used test of normality is the Shapiro-Wilk test. This produces a coefficient W with a value of 1 corresponding to perfect normality (no deviation from the theoretical distribution) and lower values representing deviations from normality.

My goal was to determine the degree of agreement between the estimates produced by these two methods. In order to achieve this, I computed the correlation between the theoretical quantiles (x axis) and the empirical quantiles (y axis) of the Q-Q plots and carried out the Shapiro-Wilk test on several continuous variables. Then, I correlated the W value with the Q-Q plot correlation coefficient.
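The procedure can be sketched on simulated data as follows. The four vectors are simulated stand-ins (an assumption for illustration, not the allele-frequency vectors used in the analysis), and the sample size of 26 and the seed are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated stand-ins for the real vectors
vars <- list(
  normal  = rnorm(26),
  uniform = runif(26),
  skewed  = rexp(26),
  heavy   = rt(26, df = 3)
)

# Q-Q correlation: correlate theoretical (x) and empirical (y) quantiles
qq_cor <- sapply(vars, function(v) {
  qq <- qqnorm(v, plot.it = FALSE)
  cor(qq$x, qq$y)
})

# Shapiro-Wilk W statistic for the same vectors
sw_w <- sapply(vars, function(v) unname(shapiro.test(v)$statistic))

print(round(cbind(qq_cor, sw_w), 3))
cor(qq_cor, sw_w)  # agreement between the two methods across variables
```

Setting plot.it = FALSE keeps qqnorm() from drawing each plot while still returning the x and y coordinates needed for the correlation.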


Variables were taken from two files (NineHitsBetaFst_B.csv and Factors.csv) in the data set I used for the population genetics study of intelligence. The vectors represent allele frequencies or factors derived from allele frequencies via factor analysis (Piffer, 2015).

Data files containing the vectors can be downloaded from:

Results of the analysis are reported in this spreadsheet:

R was used to carry out the analysis.

R Code is in the appendix.


The correlation between Q-Q xy and Shapiro-Wilk W was r=0.993 (N=19; p<0.001).

Figure 1. Relationship between Q-Q plot xy correlation and Shapiro-Wilk W.


The relationship between the two variables can be approximately described by this formula:

1 - W ≈ 2 × (1 - Q-Q plot correlation),

e.g. 9SNPsGIDist: Q-Q corr= 0.952 and Shapiro-Wilk W= 0.905. This can be seen from Table 1.

Table 1. Relationship between the two methods (1-x).

1-Corr Q-Q 1-W (1-W)/(1-Corr Q-Q)
0.0322736 0.0661 2.048113628
0.0355605 0.07253 2.039622615
0.0231317 0.04782 2.067292936
0.0230315 0.04779 2.074984261
0.0471659 0.09458 2.005262276
0.0264781 0.05437 2.05339507
0.0881257 0.16912 1.919076955
0.0270037 0.05495 2.034906328
0.0243319 0.04994 2.052449665
0.0221553 0.04577 2.065871372
0.0267268 0.05474 2.048131464
0.0651654 0.12811 1.965920565
0.0228832 0.04754 2.077506642
0.0308334 0.06289 2.039671266
0.0267176 0.05426 2.030871036
0.0276686 0.05681 2.053230015
0.0328761 0.07879 2.396573803
0.0728126 0.15363 2.109937016
0.0384661 0.08824 2.293967935

There is indeed a slight tendency for the ratio to fall as departures from normality get bigger (i.e. with strong departures from normality, 1-W is slightly less than twice as big as 1-corr Q-Q, whereas it is slightly more than twice as big when departures from normality are small).
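The ratio column and the approximate factor of 2 can be rechecked directly from the values in Table 1. The sketch below uses only the numbers printed in the table; the no-intercept regression and the Spearman correlation are my own additions for illustration, not part of the original analysis.

```r
# Values copied from Table 1 above
one_minus_qq <- c(0.0322736, 0.0355605, 0.0231317, 0.0230315, 0.0471659,
                  0.0264781, 0.0881257, 0.0270037, 0.0243319, 0.0221553,
                  0.0267268, 0.0651654, 0.0228832, 0.0308334, 0.0267176,
                  0.0276686, 0.0328761, 0.0728126, 0.0384661)
one_minus_w  <- c(0.0661, 0.07253, 0.04782, 0.04779, 0.09458,
                  0.05437, 0.16912, 0.05495, 0.04994, 0.04577,
                  0.05474, 0.12811, 0.04754, 0.06289, 0.05426,
                  0.05681, 0.07879, 0.15363, 0.08824)

ratio <- one_minus_w / one_minus_qq  # third column of the table
summary(ratio)

# Slope of a no-intercept fit, i.e. the k in (1 - W) ~ k * (1 - Q-Q corr)
k <- unname(coef(lm(one_minus_w ~ 0 + one_minus_qq)))
k

# Does the ratio shrink as the departure from normality grows?
cor(ratio, one_minus_qq, method = "spearman")
```

A rank correlation is used for the last step because only 19 values are available and a few large departures could dominate a Pearson coefficient.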


There is a very strong agreement between two commonly used methods to test for normality of data. An advantage of the Shapiro-Wilk test is that it provides a test of the null hypothesis that the population is normally distributed. However, p values have many issues, one of which is that they are affected by sample size, such that a very large sample will always result in rejection of the null hypothesis even in the presence of tiny deviations from normality (Kirkegaard, 2014).


Kirkegaard, E. (2014). W values from the Shapiro-Wilk test visualized with different datasets.

Piffer, D. (2015). A review of intelligence GWAS hits: their relationship to country IQ and the issue of spatial autocorrelation. Figshare,


#Dataset NineHitsBetaFst_B

#newdata3 is assumed to be the data frame already loaded from NineHitsBetaFst_B.csv

qqChr21Fst=qqnorm(newdata3$Chr21.Fst) #creates Q-Q plot and stores its x (theoretical) and y (empirical) quantiles

cor(qqChr21Fst$x,qqChr21Fst$y) #computes correlation between x and y axes of Q-Q plot

shapiro.test(newdata3$Chr21.Fst) #Shapiro-Wilk test

#Dataset Factors

#Scatterplot (Q-Q cor vs Shapiro-Wilk W)

library(car) #provides scatterplot()

newdatascatterplot=na.omit(qqplots..BetaFst) #qqplots..BetaFst is the data frame read from the results .csv file (download from Google Docs link); rows with missing values are dropped

scatterplot(newdatascatterplot$SHAPIRO.WILKS.W~newdatascatterplot$CORR.Q.Q.PLOT,main="Q-Q Plot xy cor vs Shapiro-Wilk W (r=0.99)",xlab="Q-Q Plot xy cor",ylab="Shapiro-Wilk W",smooth=FALSE) #creates regression scatterplot with Q-Q plot correlation (x axis) and Shapiro-Wilk W (y axis); smooth=FALSE is the argument name in recent versions of car, older versions used smoother=FALSE

cor(newdatascatterplot$SHAPIRO.WILKS.W,newdatascatterplot$CORR.Q.Q.PLOT) #computes correlation between the two methods

Ice cream, anyone?

I submitted my paper to a journal which is famous for publishing second class papers, thinking that my paper, which is obviously superior to the average paper published in Evolutionary Psychology (the little journal of just so stories for kids), would get a fair hearing. However, my hopes were shattered when a board of experts decided that it was not suitable for the journal. Of course, in the best of totalitarian tradition, reasons were not provided for the rejection. This shows how a few gatekeepers decide to stop ideas from becoming public if they do not conform to their tastes. You, the know-it-all who rejected my paper, do you know that science is not like buying an ice cream? You cannot simply say “I want chocolate, I do not want strawberry”. You need to justify your decisions. Otherwise, how can you expect authors to justify their hypotheses with sound empirical evidence? But of course you do not know what science really is. Ice cream, Bern Hard Funk?

Dear …..,

Thank you for your submission to Evolutionary Psychology. We have given your submission full attention. However, after consultation with the Editorial board, we have decided that your manuscript is not suitable for publication in Evolutionary Psychology, and thus won’t be sent out for in-depth review. I am sorry for being the bearer of what must be negative news. The Editors of Evolutionary Psychology aim to give quick feedback particularly with submissions, which are unlikely to get accepted even after in depth review and/or revision. Alas your submission falls into this category and was therefore rejected at this stage.

Best of luck with your work.



Fitness or longevity? Life as a photocopier or a time machine?

The standard view of evolutionary biology is that life’s aim is to maximise the number of genes (strictly speaking, “alleles” is the correct term) that are passed on from one generation to the next. In this model, popularized by R. Dawkins, life is a photocopier which focuses on making as many copies of an allele as possible. This disregards the importance of time.

In fact, a more accurate formulation of fitness would be this: the total amount of time that an allele is present in the DNA of a living organism. Thus, number of alleles x time or f= N x t.

If an organism generates 2 long-lived offspring that die when they’re 100 years old, its fitness would be 200, the same level of fitness as an organism that has 200 offspring each dying after 1 year, or 2400 offspring each with a lifespan of 1 month.
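The three cases can be checked with a short calculation. This is a minimal sketch that takes t in years and counts one allele copy per offspring, as the example above implicitly does; the function name fitness is just a label for f = N x t.

```r
# f = N x t, with t in years (the unit used in the example above)
fitness <- function(n_offspring, lifespan_years) n_offspring * lifespan_years

f_long   <- fitness(2, 100)        # 2 offspring living 100 years
f_medium <- fitness(200, 1)        # 200 offspring living 1 year
f_short  <- fitness(2400, 1 / 12)  # 2400 offspring living 1 month
c(f_long, f_medium, f_short)
```

All three strategies give the same value of f, which is the point of the comparison.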

This accounts for organisms’ tendency to extend their life well past their reproductive age – a phenomenon that classical evolutionary biology cannot explain without recourse to just-so stories (e.g. the fitness benefits of the elderly to subsequent generations) – or for the tendency of evolution to produce long-lived species which generate few offspring (K-selection).