Family Tree DNA

THE GAP QUICK INTERPRETATION GUIDE



I - General Concepts

1. How do I compare results?

2. What is TMRCA or MRCA?

3. What are Haplotypes and Haplogroups?

4. What is the Genetic Distance Report on the GAP/Members page?

II - Relatedness

5. We have discovered that a branch of the XXXXX family has a match on the 12 marker test with a branch of the ZZZZZ family. In exchanging information, we have discovered that both the XXXXX and the ZZZZZ families lived in the vicinity of each other in the 1600s. However, there is no known historical connection between the two families.

6. What do I tell my group members when they match a person with a different surname?

III - Specific Markers Technical Questions

7. How do I score DYS 389 1 & 2? It seems that these 2 markers are 13, 29 and 14, 30, but your Genetic Distance Report only shows these two people as being 1 point away. I’m confused.

8. I hear about ‘fast moving markers’…what does that mean?

9. I don’t understand how you calculated the difference at DYS 464.

10. I don’t understand how you calculated the difference at YCA II a/b.

Answers

1. How do I compare results?

The anthropology community uses 2 models to explain how mutations produce polymorphism, the scientific name for variation at a marker. These models are known as the stepwise model and the infinite allele model, and each is explained below. All of the information below is supplied by Dr. Bruce Walsh, University of Arizona, who serves on FTDNA’s Advisory Board as our population genetics consultant. A more in-depth explanation, prepared by Dr. Walsh especially for FTDNA customers, is available at: http://nitro.biosci.arizona.edu/ftdna/TMRCA.html

Stepwise model-- The stepwise mutational model tries to better account for the actual mutational process that occurs at microsatellite markers. What is scored is marker length.

The stepwise mutation model looks at the frequency spectrum (0,1,2,3 ..) of the mismatches, namely how many loci show no mismatches, 1 mismatch, 2 mismatches, etc. Its simplest form is the one-step, symmetric model, which assumes only one step per mutation, with equal probability of increasing (+1) or decreasing (-1). More complex stepwise mutational models can be constructed, but this is a little premature until more information on the mutational process is available. Currently, there is very good evidence for the one-step model, as only 1/30 to 1/50 of all (the small number of) observed mutations are two-steps.

If most mutations are one-step, but a few two-step mutants exist, then the true distribution of the TMRCA (see Q #2) falls between the infinite allele and stepwise models. For example, suppose we observed an exact match at 24 markers, and a two-step mismatch at one marker. This is most probably a single mutation as opposed to two (or more) one-step mutations. If the latter were true, we would expect other loci to have mutated as well, unless they had lower mutation rates.


Infinite Allele Model-- The infinite allele model is just a fancy term that population geneticists use for the case where each mutation may be of a different length (i.e., there is an infinite number of states that an allele can mutate to, hence each mutation is assumed to be unique).

This is the simplest mutation model, simply scoring loci as match/no match.

There is a risk of undercounting the total number of mutations, and hence underestimating the actual TMRCA. If individuals are identical at a large fraction of the markers, this risk is very small. As individuals differ at more and more markers, the undercounting can become more severe.
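
The difference between the two scoring rules can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names and the five-marker haplotypes are made up for the example, every marker is treated alike, and none of the special handling described later for DYS 389, DYS 464 or YCA II is applied.

```python
def stepwise_distance(a, b):
    """Stepwise scoring: every repeat of difference counts as one step."""
    if len(a) != len(b):
        raise ValueError("both haplotypes must list the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))


def infinite_allele_distance(a, b):
    """Infinite allele scoring: a marker either matches or it does not."""
    if len(a) != len(b):
        raise ValueError("both haplotypes must list the same markers")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical repeat counts at five markers (not real test results).
person_a = [13, 24, 14, 11, 12]
person_b = [13, 25, 14, 11, 14]

print(stepwise_distance(person_a, person_b))         # 3 (one step + two steps)
print(infinite_allele_distance(person_a, person_b))  # 2 (two markers differ)
```

The two-step difference at the last marker is what separates the models: the stepwise rule scores it as 2, while the infinite allele rule scores it as a single event.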



2. What is TMRCA or MRCA?

Population geneticists focus on WHEN two individuals most likely shared a common ancestor. That ancestor is the Most Recent Common Ancestor (MRCA), and the time back to the MRCA, often written t, is the TMRCA. The TMRCA is always given as a number of years or generations, but it is coupled with a probability: for example, 50% of the time the MRCA would have lived X number of generations ago or less, or 80% of the time the MRCA would have lived Y number of years ago or less.

It is a fallacy to attempt to tie TMRCA to a particular year, i.e. “I have an exact match, 12/12, therefore we have a connection that occurred in 1675.”
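
To show why the answer is a probability rather than a date, here is a minimal Python sketch. It rests on assumptions that are not spelled out in this guide: an exact match at every marker, the infinite allele model, the same mutation rate at every marker (the 0.002 figure quoted in Question 8), and no prior knowledge about the two people. Under those assumptions the chance that the MRCA lived within t generations is 1 - exp(-2 x rate x markers x t). The function name is hypothetical.

```python
import math


def prob_mrca_within(generations, markers, mutation_rate=0.002):
    """Chance that the MRCA of two exactly matching people lived within
    the given number of generations (simplified model, see assumptions)."""
    return 1.0 - math.exp(-2.0 * mutation_rate * markers * generations)


# A 12/12 exact match gives a spread of probabilities, never a single year.
for generations in (7, 14, 30, 60):
    chance = prob_mrca_within(generations, markers=12)
    print(f"within {generations:2d} generations: {chance:.0%}")
```

Even at 60 generations the probability stays below 100%, which is why a 12/12 match cannot be pinned to a year such as 1675.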



3. What are Haplotypes and Haplogroups?

Haplogroups are clusters of Haplotypes (expressed as exact or near exact 12 or 25 marker matches) that are in tight proximity to each other. Expressed another way, Haplotypes are subsets of a Haplogroup. Think of the Haplotypes as the leaves of a tree and the Haplogroups as its limbs. In fact, the Haplogroups are the limbs of the tree of Homo sapiens sapiens, our unique branch of humanity. The Haplogroups have been arranged into what is called a phylogenetic network, and the male version can be seen here: http://www.familytreedna.com/snps-r-us.aspx.

Please note that people in different Haplogroups cannot be related within many thousands of years, and that each male test result provides a prediction of the Haplogroup currently about 90% of the time. In general the following rule of thumb may be used:

 

Haplogroup   Designation

R1b          Western Europe
R1a          Eastern Europe
I            Nordic
J2           Semitic
E3b          Semitic
Q3           Native American



4. What is the Genetic Distance Report on the GAP/Members page?

Genetic distance is the method FTDNA uses to describe the relationship between 2 people as accurately as possible. The method follows either the stepwise model or the infinite allele model of distance calculation. FTDNA has adopted the methods of calculation recommended by the Hammer Lab at the University of Arizona.

We have written an algorithm that takes into account the peculiar nature of two markers: DYS 389, whose nomenclature is odd, so that a direct comparison of 2 samples will tend to overstate or understate the actual difference (see Question # 7), and DYS 464, a multi-copy marker generally found at 4 different places on the Y-Chromosome (see Question # 9). The corrected distance is presented when you run the FTDNA Genetic Distance Report.



5. We have discovered that a branch of the XXXXX family has a match on the 12 marker test with a branch of the ZZZZZ family. In exchanging information, we have discovered that both the XXXXX and the ZZZZZ families lived in the vicinity of each other in the 1600s. However, there is no known historical connection between the two families.

This is a good example of why we brought out the FTDNA 25 marker test. 

Because Europe was settled recently, from a historical time perspective (about 10,000 years ago), and by just a few homogeneous peoples, the DNA in Europe is less diverse than anywhere else on the planet. This means that people living in the same general area of Europe could be quite related, and because DNA doesn't change quickly (i.e. is stable), random matches occur with regularity. To separate the wheat from the chaff, so to speak, you can have a single representative of each of the two groups refine their test to our 25 marker test. If they diverge, then you know that the match was a random match or dates from before the time that people adopted surnames. However, if they match exactly, or perhaps are 24/25, then the connection is most likely close.



6. What do I tell my group members when they match a person with a different surname?

A match between two people is interpreted through statistical probability, which means that the time when their common ancestor lived can only be given as a range. The MRCA might fall within the period quoted at 50% probability, or it might not.

A shared surname is a good piece of prior knowledge: if you match a person with your surname, you are likely related to them within, say, 14 generations, and this will be true 50% of the time. If you want or need a higher confidence level, you must either increase the # of markers tested or extend the # of generations back; for example, 80% of the time the common ancestor lived within a greater # of generations.
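
The trade-off between confidence, markers and generations can be sketched in Python. This is an illustration under stated assumptions, not FTDNA's calculation: an exact match at every marker, the infinite allele model, a uniform mutation rate of 0.002 per marker per generation (the figure quoted in Question 8), and no credit given for the shared surname. The function name is hypothetical.

```python
import math


def generations_for_confidence(confidence, markers, mutation_rate=0.002):
    """Number of generations within which the MRCA of two exactly matching
    people lived, at the requested confidence (simplified model)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence) / (2.0 * mutation_rate * markers)


for markers in (12, 25):
    for confidence in (0.50, 0.80):
        gens = generations_for_confidence(confidence, markers)
        print(f"{markers} markers, {confidence:.0%} confidence: "
              f"within about {gens:.1f} generations")
```

Under these assumptions a 12/12 match at 50% comes out a little over 14 generations, in line with the figure quoted above. Asking for 80% pushes the range further back, and testing 25 markers pulls it closer again.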



7. How do I score DYS 389 1 & 2? It seems that these 2 markers are 13, 29 and 14, 30, but your Genetic Distance Report only shows these two people as being 1 point away. I’m confused.

The following illustration is provided by Mr. Kevin Duerinck, and this section is largely a direct quote from his web site.

One note regarding loci 389i and 389ii, reported by Doug Mumma: "the number of "repeats" reported as DYS389II includes the number of "repeats" reported at DYS389I. As a result, if a one-step mutation is observed at DYS389I, then DYS389II will also show a one-step mutation. The "double reporting" of the repeats observed at DYS389I is not desirable. To eliminate this "doubling", many researchers report DYS389II as either 389II-I or as 389AB, which then allows you to distinguish whether a mutation occurred on the 389I portion of the locus or on the other portion of the 389 locus. All you need to do is subtract the number of repeats reported for 389I from those reported for 389II and identify it as either 389AB or 389ii-i for clarity."

Below are 6 examples. To calculate the total mutations at the 389 locus, you need to figure out the mutations at 389i and at 389ii. First, figure out the mutations for DYS389i by subtracting one kit's 389i value from the other's (e.g. 15-14 = 1). Second, to see how many mutations are at 389ii, subtract each kit's 389i value from that same kit's 389ii value, then compare the two results. So here goes:

Example 1
(These are the real values on Kevin’s site. The other examples below are fake.)
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    15        32         1
Calculations: 14 vs. 15 is one mutation. In both cases 389ii - 389i = 17, so no
mutation in 389ii. So total is 1 mutation.
 
Example 2
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    13        30         1
Calculations: same as above, only different direction; still one mutation.
 
Example 3 (389i one lower, 389ii one higher than in kit 281)
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    13        32         3
Calculations: There is one mutation in 389i (14 vs. 13), and there are
additional changes in 389ii. In this case, 31-14=17 vs. 32-13=19. So
there are 2 changes in 389ii and 1 in 389i, so 3 mutations total.
 
Example 4 (389i one higher, 389ii one lower than in kit 281)
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    15        30         3
Calculations: Same as above, only different direction. One mutation in 389i, and then
for 389ii (31-14=17 vs. 30-15=15, so two mutations there). So 3 total.
 
Example 5 (389i one higher, 389ii equal to kit 281. This example was from the list
a while ago: 2 mutations. Compare to example 6 below.)
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    15        31         2
Calculations: One mutation at 389i, and for 389ii (31-14=17 vs. 31-15=16, i.e., 1
mutation), so 2 mutations total.
 
Example 6 (389i equal, 389ii one higher than in kit 281. The "opposite" of example 5.)
                   DYS389i   DYS389ii   Mutations
kit 281 (FTDNA)    14        31
kit 288 (FTDNA)    14        32         1
Calculations: One mutation. 31-14=17 vs. 32-14=18.

Let's give another example. Let's say there is a mutation at 389i but it is not showing up at 389ii. What does that mean? It means that there are two mutations in the DNA. You see, 389ii has two contributions that are added together. One of those is reported separately under 389i. If the second part happens to mutate in the opposite direction from the change in 389i, then the two mutations will appear to "cancel" in the sum reported as 389ii. By the way, thanks to Greg Bonner and FTDNA for their advice on these issues.”

(quoted from Mr. Kevin Duerinck’s web site)
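
The subtraction rule used in the six examples can be written as a short Python function. This is a sketch of the hand calculation quoted above, not the actual code behind the FTDNA Genetic Distance Report, and the function name is hypothetical.

```python
def dys389_mutations(kit_a, kit_b):
    """Count mutations between two kits given as (DYS389i, DYS389ii) pairs.

    DYS389ii includes the repeats already reported as DYS389i, so the
    second part of the locus is compared as 389ii minus 389i.
    """
    a_i, a_ii = kit_a
    b_i, b_ii = kit_b
    first_part = abs(a_i - b_i)                      # changes at 389i
    second_part = abs((a_ii - a_i) - (b_ii - b_i))   # changes in the rest of 389ii
    return first_part + second_part


kit_281 = (14, 31)
# kit 288 values from examples 1 to 6, in order.
examples = [(15, 32), (13, 30), (13, 32), (15, 30), (15, 31), (14, 32)]
for number, kit_288 in enumerate(examples, start=1):
    print(f"Example {number}: {dys389_mutations(kit_281, kit_288)} mutation(s)")
```

Run on the values in the question itself, 13, 29 against 14, 30, the same function returns 1: the single change at 389i is carried into 389ii, so it is counted only once.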



8. I hear about ‘fast moving markers’…what does that mean?

The current estimate by the anthropological community of the mutation rate of Y chromosomal markers used for genealogy is 0.002 per marker per generation, which means that any one marker is expected to mutate about once every 500 generations. While this number may be accurate for unrelated males within a population, it appears to understate the actual mutation rate when comparisons are made within a family. Therefore we highlight in red the markers which appear to be more volatile, as you can see when you click the GENERATE page from within your GAP. The volatility rate of these markers hasn’t been established. We also believe that a single standard rate of change across the entire panel is not likely.

If you have 2 people who match exactly except on a single marker, and that marker is one that is highlighted in RED within your surname group, then the current estimate of the distance between the 2 people is probably overstated, and they are more closely related than a standard single marker deviation would suggest.

  A comprehensive evaluation of marker by marker volatility rates is currently being organized by the Molecular Lab for Science and Evolution at the University of Arizona and FTDNA surname groups with verified documented genealogies will be able to participate in this study which began during August 2003.



9. I don’t understand how you calculated the difference at DYS 464.

DYS 464 was discovered at the University of Arizona by Dr. Alan Redd. This highly volatile (fast mutating) marker is included in the panel to help show changes even within family groups that are closely related. DYS 464 is replicated 4 times in 98.5% of people from Europe and the Middle East (the balance having 5 or 6 copies). Because the location of each copy on the Y-Chromosome cannot be determined, we sort the values from smallest to largest (385a/b is treated the same way), and therefore it is possible to overstate or understate the actual genetic distance when making a comparison by eye.

The FTDNA Genetic Distance Report solves this potential problem as the actual difference is calculated for you.

  Markers 464a-d are copies found at different locations on the Y chromosome. In about 1.5% of the test subjects, more than 4 copies will be present, representing Markers 464e, 464f, 464g. If those additional Markers are found, they are now shown on the individual's Y-DNA DYS Values page, and on the Group Administrator's Generate Y-DNA Scores page. If these additional Markers are not found, the columns for them in the reports will not appear. At the current time, 7 copies are the maximum number of Markers found for this DYS.

Results are always reported from low to high, when reading from left to right. When a mismatch occurs, you must consider whether the number of apparent mismatches is a result of the order in which the Markers are presented. The order of the results for these Markers may make it appear as if there are more mismatches than are actually present. Here are two examples:


Results          Genetic Distance

13 15 16 16
13 15 15 16      1

16 16 17 18
15 16 16 18      1

In the first example, the mutation did not cause the results to be reordered, so it is very clear that there is one mismatch. In the second example, a 2 point mutation occurred (17-2 = 15), and the loss of 2 repeats caused the results to be reordered. On the surface, it looks like there are two mismatches, but this illusion is caused by the reordering; there is only 1 mismatch, the 17 becoming the 15. NOTE: 464 shows more 2 step mutations than can be explained by the stepwise model. Remember, DYS 464a-d is a highly polymorphic Marker; in fact it appears to be the fastest moving marker in our entire test. (Polymorphic means that it occurs in many different forms, which is what rapid change produces!) Evidence strongly suggests that DYS 464 follows the infinite allele model, and therefore the visible 2 step change from 17 to 15 is treated as a single mutation at a single marker when the genetic distance report makes its comparison.

Group Administrators: when comparing two individuals’ results, if it appears that there is more than 1 mismatch for Markers 464a-d, be sure to check the genetic distance report (Members Page of your GAP) to verify the number of mismatches. The report follows guidelines set by the folks at the Lab for Molecular Science and Evolution at the University of Arizona.
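
The reordering effect in the two examples above can be reproduced with a short Python sketch. It is an illustration only and not the algorithm behind the FTDNA report: it assumes that the copies can be paired in any order and that each unmatched copy counts as one mutation whatever its size, in keeping with the infinite allele treatment described above. Both function names are hypothetical.

```python
from collections import Counter


def positional_mismatches(a, b):
    """Naive comparison by eye: compare sorted values column by column."""
    return sum(1 for x, y in zip(sorted(a), sorted(b)) if x != y)


def multicopy_mismatches(a, b):
    """Order-free comparison: cancel out values the two people share,
    then count the copies left over on the larger side."""
    only_a = Counter(a) - Counter(b)   # copies seen only in the first person
    only_b = Counter(b) - Counter(a)   # copies seen only in the second person
    return max(sum(only_a.values()), sum(only_b.values()))


first = ([13, 15, 16, 16], [13, 15, 15, 16])
second = ([16, 16, 17, 18], [15, 16, 16, 18])

for a, b in (first, second):
    print(a, b, "by eye:", positional_mismatches(a, b),
          "order-free:", multicopy_mismatches(a, b))
```

For the second pair the column by column comparison reports 2 mismatches, while the order-free comparison reports 1, matching the genetic distance shown in the table.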



10. I don’t understand how you calculated the difference at YCA II a/b.

YCA II a/b are the only markers in our system that use a bi (2) base pair repeat motif. These 2 base pair repeats are typically more volatile and, more to the point, they tend to 'move' by 1 repeat or by many repeats at a time, while longer repeat motifs are generally more stable and move following the stepwise model. Therefore we treat the markers YCA II a/b under the infinite allele model (http://nitro.biosci.arizona.edu/ftdna/models.html#Infinite) rather than the stepwise model of population genetics. In simple terms, this means that a move of 1 repeat or of 4 repeats is counted as a single evolutionary event; a 4 repeat move is not counted as 4 evolutionary events when calculating genetic distance.

====================================

Contact us if you need any assistance or have any questions.

 

