January 18, 2018

Race compared to Family and Gender

Two ways to put the magnitude of race differences into perspective are to compare them to gender differences and to family relations. Both comparisons use Fst distances. Fst is simply the proportion of total genetic variance that lies between populations, as opposed to within them.

For example, if the Fst distance between population A and population B were 0.15, that would mean that 15% of the variance is between the populations, while 85% is within the populations, or “common to both populations”.
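To make the “between versus within” idea concrete, here is a minimal sketch of one common textbook form of Fst, computed from expected heterozygosity at a single two-allele locus. The function names and the allele frequencies are hypothetical, the two populations are assumed to be the same size, and this is not necessarily how the figures quoted in this post were computed.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a two-allele locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p_a, p_b):
    """Fst = (H_T - H_S) / H_T for two equally sized populations (an assumption)."""
    # H_S: average heterozygosity within the two populations.
    h_s = (expected_heterozygosity(p_a) + expected_heterozygosity(p_b)) / 2.0
    # H_T: heterozygosity of the pooled population, using the mean allele frequency.
    p_bar = (p_a + p_b) / 2.0
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        # The locus does not vary at all, so there is no variance to apportion.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, chosen only for illustration.
example_fst = fst_two_populations(0.2, 0.8)
print(f"Fst = {example_fst:.2f}, within-population share = {1 - example_fst:.2f}")
```

Identical frequencies in both populations give an Fst of 0, and a locus fixed for different alleles in each population gives an Fst of 1.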

Family Relations

An Fst value is, in practice, a kind of “inverse kinship coefficient”, or “anti-kinship”. You can multiply the Fst value by 2 and that is the “anti-kinship”.

For example, the kinship coefficient (strictly speaking, the coefficient of relatedness, which is twice the coefficient of kinship) of a parent to a child is 0.5, while that of a grandparent to a grandchild is 0.25, and the coefficient between two siblings is 0.5.

The Fst distance between Europeans and Africans is 0.166, which can be modeled as a kinship coefficient of -0.332.

Relationship                      Kinship coefficient
Parent-Child                       0.5
Siblings                           0.5
Aunt, Uncle, Niece, Nephew         0.25
Grandparent                        0.25
First Cousin                       0.125
European-African                  -0.332
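The conversion described above is just a change of sign and a factor of two. A small sketch of that arithmetic, with a hypothetical function name and the figures taken from this post:

```python
def anti_kinship(fst):
    """Model an Fst distance as a negative kinship value by doubling it and flipping the sign."""
    return -2.0 * fst


# Family values as listed in the post, plus the Fst-derived value for comparison.
relations = {
    "Parent-Child": 0.5,
    "Siblings": 0.5,
    "Aunt, Uncle, Niece, Nephew": 0.25,
    "Grandparent": 0.25,
    "First Cousin": 0.125,
    "European-African": anti_kinship(0.166),
}

for name, value in relations.items():
    print(f"{name:<30}{value:>8.3f}")
```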

An explanation of why Fst functions as an “inverse kinship coefficient divided by 2” can be found in Henry Harpending’s paper, “Kinship and Population Subdivision”.

Race vs. Gender

The next way to put race differences into perspective is to model gender differences as an Fst distance.

Males and females differ due to the Y chromosome. Only males have it, and for this comparison each Y-chromosome gene is treated as deactivating an analogous gene on the X chromosome. The Y chromosome has only 458 genes, compared to approximately 22,500 genes in the whole genome.

Assuming every single gene on the Y chromosome replaces a gene on the X chromosome, that can be modeled as an Fst distance between males and females of 1.00, but only on 458 genes. This is because 100% of the genes on the Y chromosome are assumed to deactivate an analogous gene on the X chromosome. It’s an allele frequency difference of 100% but only on those genes.

When the fact that males and females are identical on the rest of the genome is taken into consideration, the Fst distance between males and females can be modeled as 0.021. The point of doing so is not that this is a useful figure in itself, but to compare the magnitude of this difference to race differences.

This assumes that every single gene on the Y chromosome differs from the analogous X-chromosome gene it deactivates. Every gene for which this is not true (where the Y-chromosome gene is the same as the X-chromosome gene) makes the Fst distance between males and females smaller. So 0.021 should really be seen as a MAXIMUM.
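The post does not spell out the calculation, so the following is only one reading of it: a gene-count-weighted average in which the Y-chromosome genes get an Fst of 1.00 and every other gene gets 0. The function name is hypothetical. With the gene counts quoted above, this reading gives roughly 0.020, close to the 0.021 figure.

```python
Y_GENES = 458           # Y-chromosome gene count quoted in the post
GENOME_GENES = 22_500   # approximate whole-genome gene count quoted in the post


def genome_wide_fst(differing_genes, total_genes, fst_at_differing=1.0):
    """Average Fst over all genes, assuming Fst is 0 everywhere except the differing genes."""
    return fst_at_differing * differing_genes / total_genes


maximum = genome_wide_fst(Y_GENES, GENOME_GENES)
print(f"All Y genes differ:  {maximum:.4f}")

# If only half of the Y genes actually differed (a hypothetical scenario),
# the value would drop, which is why the figure is described as a maximum.
half = genome_wide_fst(Y_GENES // 2, GENOME_GENES)
print(f"Half of them differ: {half:.4f}")
```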

Comparison                        Fst Distance
Europeans – Africans              0.166
Europeans – East Asians           0.097
Europeans – Amerindians           0.095
Males – Females (maximum)         0.021

In terms of magnitude, the race differences completely dwarf gender differences. However, gender differences are a “hard break” at specific sites, while the race differences are spread over the whole genome.

While the genders have a 100% allele (gene variant) frequency difference at 458 locations, the races will have something like an average 30% allele frequency difference at 7,500 locations.

So while race differences completely swamp gender differences in total genes differing, they are more continuous in nature, whereas the gender differences are smaller in total but typological.

But the point here is not to say one is more or less important than the other. It is just to show how silly the claim “race doesn’t exist” is by putting it on the same scale as other commonly recognized genetic relations: family and gender.

Facebook Comments
  • Gorter Did Nothing Wrong

    “When the fact that males and females are identical on the rest of the genome”
    Except each of them will differ by about 3 billion base pairs from each other.

    “DNA studies do not indicate that separate classifiable subspecies (races) exist within modern humans. While different genes for physical traits such as skin and hair color can be identified between individuals, no consistent patterns of genes across the human genome exist to distinguish one race from another. There also is no genetic basis for divisions of human ethnicity. People who have lived in the same geographic region for many generations may have some alleles in common, but no allele will be found in all members of one population and in no members of any other.”

    “the races will have something like an average 30% allele frequency difference at 7,500 locations.”
    “While it is possible to find biological and genetic variation roughly corresponding to race, this is true for almost all geographically distinct populations: the cluster structure of genetic data is dependent on the initial hypotheses of the researcher and the populations sampled. When one samples continental groups, the clusters become continental; with other sampling patterns, the clusters would be different. If one sampled only Icelanders, Mayans and Maoris, three distinct clusters would form; all other populations would be composed of genetic admixtures of Maori, Icelandic and Mayan material. While differences in particular allele frequencies can be used to identify populations that loosely correspond to the racial categories common in Western social discourse, the differences are of no more biological significance than the differences found between any human populations (e.g., the Spanish and Portuguese)”

    “Genetic distances generally increase continually with geographic distance, which makes a dividing line arbitrary. Any two neighboring settlements will exhibit some genetic difference from each other, which could be defined as a race. Therefore, attempts to classify races impose an artificial discontinuity on a naturally occurring phenomenon.”

    Fst measurements are useless in this context for populations with almost no structure, like humans. We don’t branch off from a common ancestor in the way you are thinking. You are pretty much just measuring geographical distance.

  • L.Q. Cincinnatus

    If you’re using the 20k number when you talk about the total number of genes in the human genome, you’re referring to protein-coding genes. In that case, there are only around 70 protein-coding genes on the Y chromosome, while 458 would be coding genes + pseudogenes, and if you include noncoding genes the number goes up by 100. Moreover, a few of the protein-coding genes, around 20, are in pseudoautosomal regions, which means they don’t count as differences between the X and Y chromosomes.

    The estimate is still in the ballpark and the overall point stands but I just wanted to point this out.

    • Ryan Faulk

      Would you be willing to rewrite this to give a better comparison?

      • L.Q. Cincinnatus

        To do that I’d have to know where the 30% allelic frequency difference at 7,500 loci comes from. Also, as a general point, I’m not sure I agree with using Fst when it comes to male vs. female populations. I mean, heterozygosity comparisons kinda break down since one of the sexes is hemizygous when it comes to the sex chromosomes.

        Again, the estimate is still more or less there: using coding genes we’d have a 100% difference on ~50 genes out of ~20k, including noncoding/pseudo genes it would be a 100% difference on ~500 genes out of ~50k. It’s still very little.