FAQ
An overview of Y-DNA testing & surname projects will be found after any FAQs, below.
Y-DNA Testing and Surname Projects:
An Overview
Dave Nicolson – May 2018
The Y chromosome is found in males and passed down with few changes from father to son, generation after generation. It contains over 59 million base pairs, which is about 2% of the human genome, although large portions of the Y chromosome are difficult to get accurate data from. Mutations (changes) can and do occur over time and are passed down with the modified Y chromosome to all descendants of the line with the mutation, so they can be used to identify people who are related in that they will share recent mutations or patterns of mutations.
Genealogically useful Y mutations that are tested include “STRs” and “SNPs.”
You may not recognize it by the name “Short Tandem Repeats” (STRs), but STR tests are the standard Y-DNA tests (Y-37, etc.) that many of us have taken since they were introduced by FTDNA well over a decade ago. Y-STR tests examine a number of particular positions (loci, singular is locus) on the Y chromosome where repeats are known to occur, to count how many short repeats (like repeats of AAC, as in AACAACAACAACAACAACAACAAC) are found there in the tested person; the number of repeats is the value returned for that STR (e.g., 8 in the example I just gave). Each generation, there is a small chance that any given STR will gain or lose one or more repeats, just by random chance. Over a number of generations, such changes tend to accumulate as slowly growing differences between two descendants of the same Y ancestor. We can compare two living people’s STRs and get an insight as to how closely related they are likely to be, and what STR values the common ancestor likely had.
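To make the idea of an STR value concrete, here is a minimal Python sketch that counts the longest uninterrupted run of a repeat motif in a short sequence. The function name count_tandem_repeats is made up for this illustration, and real lab analysis is considerably more involved; this only shows what "the number of repeats" means.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest uninterrupted run of `motif` found in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Walk forward one motif at a time while the repeat continues.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# The example from the text: AAC repeated eight times.
example = "AAC" * 8
print(count_tandem_repeats(example, "AAC"))  # 8
```

If a descendant gained one repeat at this locus, the same count would come back as 9, and that is the kind of difference an STR comparison picks up.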
Each locus has its own probability of changing in each generation; some tend to change more often, some less often. A comparison of two people who differ only at three ‘faster changing’ loci won’t look as significant as one between two people who differ at three of the ‘slower changing’ loci. Overall, these changes (mutations) happen randomly, and sometimes a later mutation at the same locus can undo a prior change (10 repeats in a distant ancestor changed to 11 in his descendant, then back to 10 in a later descendant). Or the ancestors of two unrelated people might have converged randomly (a 15 became 14 then 13 for one line, and an 11 became 12 then 13 for the other). These sorts of things can skew attempts to estimate the time to the most recent common ancestor, as we generally assume that mutations happened at an ‘average’ rate; we cannot see the actual history of the mutations that led to the current haplotypes, we can only see the current haplotypes themselves.
FTDNA offers STR tests for 12, 25, 37, 67, or 111 markers, at prices ranging from $59 to $339 (5/2018 prices via projects; list prices are a bit higher, occasional sales are lower). Tests for fewer markers are harder to interpret with confidence, and they typically get many false matches (‘matches’ that look closer than they really are). Increasing the number of markers weeds out a portion of the false matches, as more loci typically give a stronger signal-to-noise ratio. For the same reason, tests with fewer markers are difficult to group in the project with confidence, and adding more markers generally helps with grouping. Many admins suggest 37 markers as a sweet spot, but ‘more is better.’ Those members with an unusually large number of matches should consider upgrading to more STRs, but all 12s and most 25s really should upgrade, if possible.
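As a rough illustration of how two sets of STR results can be compared, the Python sketch below adds up the repeat-count differences over the markers both people tested. This simple step counting is an assumption made for the example, not necessarily how any testing company scores a match, and the marker names and values are hypothetical. The second half reuses the convergence numbers from the paragraph above to show how the history is hidden from us.

```python
from typing import Dict


def genetic_distance(hap_a: Dict[str, int], hap_b: Dict[str, int]) -> int:
    """Sum of absolute repeat-count differences over markers both people tested."""
    shared_markers = hap_a.keys() & hap_b.keys()
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared_markers)


# Hypothetical marker names and values, for illustration only.
person_1 = {"marker_1": 13, "marker_2": 24, "marker_3": 14}
person_2 = {"marker_1": 13, "marker_2": 25, "marker_3": 12}
print(genetic_distance(person_1, person_2))  # 3

# Convergence, using the numbers from the text: one line went 15 -> 14 -> 13,
# the other went 11 -> 12 -> 13, so today's values match at this locus.
line_one_history = [15, 14, 13]
line_two_history = [11, 12, 13]
ancestral_gap = abs(line_one_history[0] - line_two_history[0])
present_gap = abs(line_one_history[-1] - line_two_history[-1])
print(ancestral_gap, present_gap)  # 4 0
```

The two lines started four repeats apart, yet a comparison of living testers sees no difference at all at that locus, which is why a small distance on few markers can be a false match.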
We continue to rely almost exclusively on STR test results to group members into lineages in our project’s Y report pages, but there are other options for grouping, as well...
The Rise of “SNP Tests”
More recently, we have seen the rise of a new kind of test, one based on “Single Nucleotide Polymorphisms” (SNPs). These are loci that are just one base pair (scored as an A, C, T, or G), rather than a string composed of a repeating motif (those were STRs). The individual base pairs of interest rarely mutate, and don’t generally mutate additional times (whether “back” or to a different value). For that reason, two people with a matching non-standard value at a SNP will probably have gotten that value from a common Y ancestor, although it could be quite far back (thousands of years) for most of the well-known SNPs. From previous sequencing efforts, we have a good idea of which values are the ‘standard’ (reference, or ‘ancestral’) for each locus, and that helps us identify mutations. By combining knowledge of who has which sets of SNP mutations and who has the ancestral type, it is possible to build & fill out a tree showing how the world’s Y chromosomes have passed down and changed, a “phylogeny” of how males are all related on the Y lineages. For example, separate trees are maintained on this principle, by FTDNA, by ISOGG.org, and by YFull.com.
There are several testing options that are based on SNPs, but they can all be divided into two fundamental groups: discovery testing vs. confirmation testing. In confirmation testing, we have a particular known SNP or SNPs of interest, and we test people for just that to see if they belong to the group with the mutated (or derived) value(s). In discovery testing, however, much of the subject’s Y chromosome is sequenced to determine all the base pairs in large portions of the chromosome; that includes known SNPs and any not-yet-discovered (‘new’) SNPs, and the latter may define branches in the Y tree we don’t even know about yet (branches where YOUR recent SNPs will be found).
The general state of knowledge of the Y tree, particularly at FTDNA, often ends with ‘terminal’ SNPs on branch tips that are still thousands of years old, so confirmation testing has its limits (it is typically most useful following discovery testing in the same group of likely relatives). Still, there is definitely a place for it, as when we have a newly-detected sub-branch candidate, and we test a number of specific people (suspected or known relatives, etc.) hoping to confirm that the sub-branch is real, and to learn how the group is divided by it. But again, if the final SNPs we know for a branch are, say, 4000 years old, it may be gratifying to know we are on that branch and not another, but it typically has no value for genealogy, even for an old surname group!
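The ancestral vs. mutated comparison can be sketched in a few lines of Python. Everything here is hypothetical: the SNP names (snp_1 and so on), the alleles, and the two testers are invented for illustration and do not correspond to real markers or to any company's data format.

```python
from typing import Dict, Set


def derived_snps(calls: Dict[str, str], ancestral: Dict[str, str]) -> Set[str]:
    """SNPs where the tester's base differs from the ancestral (reference) base."""
    return {
        snp
        for snp, allele in calls.items()
        if snp in ancestral and allele != ancestral[snp]
    }


def shared_derived(
    calls_a: Dict[str, str], calls_b: Dict[str, str], ancestral: Dict[str, str]
) -> Set[str]:
    """SNPs where both testers carry the same non-ancestral base."""
    both = derived_snps(calls_a, ancestral) & derived_snps(calls_b, ancestral)
    return {snp for snp in both if calls_a[snp] == calls_b[snp]}


# Hypothetical reference values and test results.
ancestral = {"snp_1": "C", "snp_2": "G", "snp_3": "A", "snp_4": "T"}
tester_a = {"snp_1": "T", "snp_2": "A", "snp_3": "A", "snp_4": "T"}
tester_b = {"snp_1": "T", "snp_2": "G", "snp_3": "A", "snp_4": "C"}

print(sorted(derived_snps(tester_a, ancestral)))              # ['snp_1', 'snp_2']
print(sorted(shared_derived(tester_a, tester_b, ancestral)))  # ['snp_1']
```

In this made-up case the two testers share the mutation at snp_1, so they would sit on the same branch defined by it, while snp_2 and snp_4 would mark where their lines later went separate ways.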
"Confirmation testing" options include individual SNP tests ($39 at FTDNA, $18 atYSeq.net), and SNP Packs or"Panels." The latter are combined tests of a large set of SNPs (a hundred or more in most cases) for a cost close to $100 (give or take).Obviously, the panels are more cost effective if you pick the right SNP Pack.,but they do not lead to new knowledge, nor do they generally give us any genealogically-relevant information, other than dividing us into groups who might be related vs. cannot be related in the last few thousand years.Additionally, if individual SNP testing is used to explore one’s tree placement, it can easily end up with a dozen tests, and that puts you over $200 at YSeq, and at FTDNA it’d be well over $400! Those who think they might be that curious would do well to instead consider discovery testing...
At FTDNA, "discovery testing"means the Big Y test (or the Y Elite test from Full Genomes Corp, or even whole genome sequencing from a number of providers). These tests are truly the 'gold standard,' as they sequence most of the useful Y chromosome (10 million base pairs or more), and so yield several tens of thousands calls for known SNPs and additional calls for ‘novel variants’ (new mutations not yet seen in other testers, they define your branch beyond the ‘terminal’ SNP); this isn’t cheap (hundreds of dollars, the exact amount depending on how many STRs you’ve tested, since it requires and includes an upgrade to Y-DNA111), but still it can reveal almost all your potentially-useful SNPs, known and unknown, meaning that you should never have to test your Y DNA again. Then it is a question of waiting for haplogroup admins and FTDNA to recognize all your SNPs.The volume of sequencing from such “Next Generation” tests as Big Y and Y Elite have generated a tremendous amount of data relating to SNPs without any built-in tree structure, and this “SNP Tsunami” as it is sometimes called initially left FTDNA unable to keep up with haplogroup admins (who had less volume to deal with), but FTDNA has done a great job in overcoming the “Tsunami” in the last 1-2 years.
Note also that confirmation testing can be done more cheaply at YSeq, and unlike at FTDNA, new SNPs can be quickly added to their catalog, but it is critical that you tell the admins of all your results from outside FTDNA (and you’re asking for guidance before SNP testing, right?).
Have You Had a Big Y Test Done?
If you have had a Big Y test done, you may wish to consider additional analysis of your raw Big Y data to identify all the likely ‘novel SNPs,’ some of which could even be genealogically relevant (and some of which FTDNA might not have recognized yet). This can be a free step (if you happen to be in haplogroup R1b-P312, you can get this by giving the Big Y data to the ytree.net project, “The Big Tree”), or a $49 step if Full Genomes Corp or YFull is used (they do provide additional value; I personally prefer YFull). With the Big Y, we often count on new SNPs occurring at a rate around 140 years per mutation (give or take, since it is random!) on any given person’s ancestral Y line. That means that in a larger subgroup/lineage of a project, there could be multiple markers shared by all in the group (these may define the larger group), and one or more that are shared by subsets only (these may define branching within the larger group); identification of these can help show the branching of our Y “family trees” within the time of surnames generally (roughly in the last 500-1000 years). As we accumulate members who have taken the Big Y test (or the better but more expensive Y test from Full Genomes Corp), we expect to develop some understanding of the true branching pattern of our various Y lines, even when we don’t have a paper trail showing it.
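Here is a small Python sketch of both ideas in that paragraph: sorting a group's novel SNPs into those shared by all, those shared by a subset, and those seen in one tester only, and turning a count of SNPs into a very rough span of years at 140 years per mutation. The tester and SNP names are hypothetical, and because mutations are random the year figure is only a ballpark, not a date.

```python
from collections import Counter
from typing import Dict, Set

YEARS_PER_SNP = 140  # rough average quoted in the text; real lines vary


def estimate_years(n_snps: int, years_per_snp: int = YEARS_PER_SNP) -> int:
    """Ballpark time span represented by `n_snps` mutations on one line."""
    return n_snps * years_per_snp


def classify_snps(testers: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split SNPs by how many testers in the group carry them."""
    counts = Counter(snp for snps in testers.values() for snp in snps)
    n = len(testers)
    return {
        "shared_by_all": {s for s, c in counts.items() if c == n},
        "shared_by_subset": {s for s, c in counts.items() if 1 < c < n},
        "private": {s for s, c in counts.items() if c == 1 and n > 1},
    }


# Hypothetical group of three Big Y testers.
group = {
    "tester_a": {"snp_1", "snp_2", "snp_3"},
    "tester_b": {"snp_1", "snp_2", "snp_3", "snp_4"},
    "tester_c": {"snp_1", "snp_2", "snp_5"},
}
result = classify_snps(group)
print(sorted(result["shared_by_all"]))     # ['snp_1', 'snp_2']
print(sorted(result["shared_by_subset"]))  # ['snp_3']
print(sorted(result["private"]))           # ['snp_4', 'snp_5']
print(estimate_years(5))                   # 700
```

In this invented group, snp_1 and snp_2 would define the whole lineage, snp_3 would mark a branch containing tester_a and tester_b, and five mutations on a line would suggest something on the order of 700 years, which falls within the surname era mentioned above.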
IN SUMMARY... Upgrade your STRs, if it is feasible... if Big Y isn’t an option for you due to its cost, but a few SNP tests or a SNP Pack might be an option, please ask your haplogroup admin (you did join the project for your haplogroup, right?) and your surname project admin to discuss options.