Why Y DNA testing?
The Y chromosome can only be passed from father to son and as such is passed almost completely unchanged. Changes or mutations occur every couple of generations but those mutations are also passed on giving each male lineage a unique signature that grows in uniqueness over the generations. The mutations can be thought of as a path of bread crumbs or footprints in the snow that never wash away. As we follow the path back through the woods we can find common ancestors with other people and potentially origins information.
Comparisons between potentially related people allow us to build a descendants tree back to a common ancestor. This can be used as proof for genealogical records, but it can also reach far back into ancient times to help us understand where our people came from, and how, as we compare with historical and archaeological records. Your Y DNA has tremendous potential to reliably track lineages from the head to the toe, from the ancient to the present. This is well beyond the capabilities of autosomal DNA testing, where the mixing between parents is so great that it is hard to reliably call relationships after just a few generations and there are few guarantees at all.
The comparison method for tree building is simple: just compare validated mutations across potential relatives. If you have two brothers who share a mutation unique to them, then you can legitimately assume their most recent common ancestor, their father, had it. If you check a male cousin and he has the same mutation, you can back up a step and know the grandfather had it. However, you need apples to apples comparisons of test results. It is of little value to do tests that few others are doing. Think of this as a team sport. You have to get others who are potentially related to take the same tests as you do, or find them in the matching database. This is why the size of the matching database and recruiting efforts are so important.
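The brother and cousin comparison above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any testing company builds its tree; the tester labels and the mutation names (M1, M2, M3) are made up for the example.

```python
def shared_mutations(results):
    """Return the mutations carried by every tester in `results`."""
    sets = [set(mutations) for mutations in results.values()]
    return set.intersection(*sets) if sets else set()

# Hypothetical validated mutations per tester (names are invented).
testers = {
    "brother_1": {"M1", "M2", "M3"},
    "brother_2": {"M1", "M2", "M3"},
    "cousin": {"M1", "M2"},
}

# Mutations shared by both brothers are inferred for their father.
father = shared_mutations(
    {name: testers[name] for name in ("brother_1", "brother_2")}
)
# Mutations shared by the brothers and the cousin are inferred for the grandfather.
grandfather = shared_mutations(testers)

print(sorted(father))
print(sorted(grandfather))
# Mutations the father had that are not inferred for the grandfather.
print(sorted(father - grandfather))
```

Each additional tester either confirms a mutation one step further back or marks the point where a new branch splits off, which is why comparable results from more relatives matter so much.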
Y DNA tracking is both very granular, sensitive to the generations, and consistent, due to the father-son inheritance staying intact. As a comparison, Big Y looks for SNPs across about 700 times more base locations than full mitochondrial DNA sequencing does. The granularity on the Y side means refined branching support for genetic genealogy. However, Y DNA only tracks a paternal lineage, your father's father's father, etc. Still, this can be applied across your family tree. Every mother has a father, every grandmother had a father and so on. The key is to find male relatives, male cousins and uncles, and get them deeply Y DNA tested while you can.
Why DNA testing at a full platform genetic genealogy company?
FTDNA provides much more than bare bones DNA tests. They have a true laboratory and are not just a test reseller. They have no annual subscription charges, but provide a comprehensive platform for genetic genealogy including:
1. A robust project management web based platform for sharing and categorizing information that is critical to the newbies, future generations and non-technicians.
2. The largest accessible matching database for Y genetic genealogy. This is a database of real people who’ve shown some inclination for spending their own money on DNA testing.
3. Privacy protection and compliance with the EU General Data Protection Regulation (GDPR) and pertinent state laws. Some DNA test vendors are using your personal results for general health/medical research and may not be able to protect your privacy. Some vendors are purely resellers and your DNA sample and results may not be subject to western international business norms.
4. Integrated and backward-compatible data management, which is the foundation of a large matching database. Old Deep Clade, Advanced Test SNPs, SNP Packs and Big Y results are all used for haplotree and haplogroup assignment and project display.
5. Convenience because of multiple test types (Big Y700, STR panels, SNP Packs, mtDNA, Family Finder, etc.) available based on DNA samples that are safely stored.
6. Family tree and GEDCOM support integrated to DNA matching results.
7. Reasonable pricing which helps attract the most genetic genealogy testers.
8. Faster test processing time for SNP Pack and Big Y tests because they own their lab and test in-house.
9. They are an early feeder of your SNPs into a broader set of offerings including products that are used all over the world.
10. Excellent record and prospects as a proven on-going, self-funding concern.
Why Y STRs and how are they used?
STRs - Short Tandem Repeats:
STRs are very useful in a multi-pronged approach, but should not be used out of context with SNPs. STRs support the formation of a useful matching database because:
1) They have a long legacy of usage and standardization, even beyond genealogy into forensic sciences.
2) They are consistently measured in standard STR based panels.
3) For the legacy STRs (Y111), everyone gets measured and issues like no calls are rare.
STRs are useful in several ways:
1) Team or cluster building and finding people who might be closely related to you.
2) Time to Most Recent Common Ancestor (TMRCA) estimates.
3) As a guide (only) for advanced and deeper SNP testing.
4) Cross-checking and validation of newly discovered SNPs.
5) As differentiators at the tips of the branches (the leaves) in tree building, as fenced in by SNPs. However, STRs should not be used to assume branch placement as a replacement for SNPs.
Why Y SNPs and how are they used?
SNPs - Single Nucleotide Polymorphisms
SNPs support the discovery of the paternal lineage tree of mankind because:
1) They have a very strict father-son inheritance property
2) They are generally very stable, making for a high reliability tree
3) They have a very high opportunity for mutation, providing great resolution in the branching
SNPs are critical because of the benefits they provide:
1) SNPs can document a Y DNA tree that is very accurate, granular and comprehensive from the ancient to current genealogy.
2) Time to Most Recent Common Ancestor (TMRCA) estimations.
3) Eliminating false matches caused by STR convergence.
Whether based on SNP counting or STR variance, TMRCA estimates are subject to error ranges and anomalies. All mutations can occur in fits and starts. There is a lot we don't know.
Why Big Y Next Generation Sequencing for discovering SNPs?
Big Y700 is probably the most important Y DNA test that you can take because it goes beyond testing for public and known SNPs. Big Y700 discovers your own line of SNPs rather than just the known SNPs. Here's an analogy to help explain Big Y700 using the Lewis and Clark Expedition with an aviation twist.
Lewis and Clark's primary objectives were to explore and map the newly acquired territory and to find a practical route across the western half of the continent. They left St. Louis in 1804 and arrived at the Pacific Ocean in late 1805. In this analogy, we think of the Pacific Coast as our genetic genealogy homeland, a place or status where our genealogically known family connects to the Y DNA tree of mankind. It is not the same for all of us, as each of our families has a distinctive location. Lewis and Clark founded what would be Fort Clatsop on the edge of Astoria, Oregon. From the Astoria Column, a tower, you can see the Pacific Ocean, the Cascade Mountains and the Columbia River.
Single SNP testing is like flying a two seater from St. Louis and hoping to land in Astoria without knowing where Astoria is. The plane is low priced and reliable but has bad gas mileage. More importantly, Astoria may not even have coordinates on the map yet or a landing strip. This kind of approach is most applicable when someone who is highly probable to be on the edge of your genealogically known family has already done a Big Y test and has built a very tall tower or lighthouse to go with a new landing strip. That tower in Astoria could be thought of as a super version of the Astoria Column, and it is built with 111 Y STR markers.
Fixed SNP panel/pack testing is like flying the two seater from St. Louis hopscotching across the country, landing at a handful of small airports and getting out and taking a good set of photos at each location and then deciding the next location to fly to. Fixed SNP packs/panels are a low entry price way to go, but suffer the same problem any fixed SNP test suffers. What if your Astoria hasn't been discovered? Perhaps, even your State of Oregon has not been discovered. You also might have troubles if your eyesight or navigation system isn't so good. For good navigation you'd want to have at least 111 Y STR markers.
Big Y Discovery testing is like having a super high speed, fuel efficient jet traversing back and forth on multiple paths high across the sky on mostly clear days taking special photos of the countryside between St. Louis and the Pacific Coast. It is scanning over 14.5 million locations. If your Astoria turns out to be San Diego, Long Beach or Tacoma, that's okay. Big Y is accomplishing what Lewis and Clark were doing, mapping the route for settlers to follow in the form of lower entry price tests. Unfortunately, your family of genealogical record might not even be on the maps for the mass migration of settlers to come, that is without SNP discovery testing like Big Y700. It's just a fact of the Y chromosome just as it is of the geography. The settlers won't go to a place when they don't know where it is or even know it exists.
Only a member of your genealogical family can discover your Astoria and erect the Astoria Column of 111 markers for the settlers. We need leaders from each family.
I'm asking you to start thinking about Big Y if you haven't already. Be a lead-explorer! There are now several thousand Big Y results completed for the R1b haplogroup. It works. Big Y results can come in quickly (the vast majority come in within 10-12 weeks, but this varies). Pooling of resources at the project/family/surname levels can help share the cost, but be on the lookout for holiday, DNA Day and Father's Day sales promotions.
Big Y700 Learning Center at FTDNA
Big Y700 Help and Guidance Facebook forum
Lewis and Clark Expedition of the Louisiana Purchase description
Astoria Column at destination of Lewis and Clark Expedition
What should I do with my Big Y700 results?
FTDNA will update the haplotree and your haplogroup label automatically. This may happen in a few days to several weeks after receiving your results. The results are phased in, not all at one time. You can see your SNP results and where you fit in the haplotree under the "HAPLOTREE & SNPs" button on your FTDNA account. There is also an external version of the FTDNA Public Haplotree where you can browse up and down the branches or search for your haplogroup or a specific SNP. You can view analysis by surname and country there. You can learn more about it at the FTDNA Public Haplotree Learning Center.
Very Important: Contact your Y111 and Y67 matches and ask them to consider Big Y700 too. This will help identify the branching in the genealogical timeframe.
Work with your haplogroup project administrators. Many do extensive free analyses. As long as you are in the R1b All Subclades Project and provide Limited Access, we will join you to the major subclade haplogroup project that can most help you. There are Raw Results files that your administrators are likely to ask for.
The VCF Raw Results include both a VARIANTS.VCF and a REGIONS.BED file that can be opened with a spreadsheet. Consider uploading your VCF Raw Results zipped folder to the Y DNA Warehouse at the web site below. This is the general community's free file sharing vehicle and makes your results eligible for the Big Tree and the SNP Age Estimates web pages that volunteers produce. Submit your VCF folder using the Y DNA Warehouse Raw Data Submission form. The output is shown on a very visual tree, the Big Tree for R1b.
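If you would rather look at the variants file with a script than a spreadsheet, here is a minimal Python sketch. It assumes the file follows the general tab-delimited VCF layout (meta lines starting with "##", then a "#CHROM" header row, then one row per position); the exact columns and values in your own download may differ. The sample rows below are invented for illustration and are not real results, and `read_vcf_variants` is just a helper name chosen for this example.

```python
import io

def read_vcf_variants(handle):
    """Yield one dict per data row of a VCF-style, tab-delimited file."""
    columns = None
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#"):
            # Header row: drop the leading '#' from the first column name.
            columns = [fields[0].lstrip("#")] + fields[1:]
            continue
        if columns is None:
            raise ValueError("data row found before the #CHROM header row")
        yield dict(zip(columns, fields))

# Invented sample rows, for illustration only.
sample = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chrY\t1000001\t.\tG\tA\t900\tPASS\t.\n"
    "chrY\t1000002\t.\tC\t.\t850\tPASS\t.\n"
    "chrY\t1000003\t.\tT\tC\t12\tLowQual\t.\n"
)

records = list(read_vcf_variants(io.StringIO(sample)))
# Keep rows that passed filtering and report an alternate allele.
derived = [r for r in records if r["FILTER"] == "PASS" and r["ALT"] != "."]
print([(r["POS"], r["REF"], r["ALT"]) for r in derived])
```

To try it on a real download, replace `io.StringIO(sample)` with `open("variants.vcf")`, adjusting the file name to match what is in your unzipped folder.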
FTDNA provides on-line tools. They include the Big Y Matches, Terminal SNP Guide and the Big Y Chromosome Browser described here at the FTDNA Big Y Learning Center. A great tutorial on these tools is the DNA-eXplained article Working with the new Big Y. Big Y700 bonus STRs are used on your matches display as explained by the DNA-eXplained article Big Y 700 STR matching. Another excellent DNA-eXplained article describes the new Big Y Block Tree diagram. This diagram cross-analyzes autosomal DNA Family Finder origins with terminal Y haplogroups. A scale to estimate the age of branches is provided.
You can ask for help on the R1b-YDNA Yahoo group. There are also a large number of Big Y testers on the Big Y Help forum on Facebook. This is a great place to ask questions and receive help without solicitations to purchase additional services or products. Big Y Help and Guidance Facebook forum
Another type of Raw Results file is the BAM file. It is a very large and complex file that requires special software utilities to view and manipulate. This is recommended only for people who are willing to become experts. There are also third party interpretation services. Some are volunteer or project administrator run and some are professional services that require small fees.
Should I upgrade Y STRs, even if I have limited or no matches?
Yes, upgrading can still be of benefit. There is no magic in the STR panels and what one panel, such as 1-12, shows does not guarantee that another panel, such as 68-111, will not have new matches. The STR panels are essentially a random arrangement of STRs.
Here is my personal case (as of June 2015). Every case will be different. We probably don't care about the averages, we probably just care about our individual case and the related individuals. I went through this by incrementally upgrading over the years and realized I would have been smarter to have bitten the bullet from the get-go. I try not to add up the money I spent nickel and diming myself to death.
This is generally what I saw along the way, although there are more people in the mix now than when I actually went through the incremental STR upgrades from 12 to 25, then from 25 to 37, then from 37 to 67, then to 96 (to fill out Ysearch) and then finally to 111, which included a rebuy of some in the 96.
A) 48 people at 12 STRs; 6 at GD=0, 41 at GD=1.
I remember the time when I had only 12 STRs. I was frustrated as there were no patterns and there were people with Italian names and people from Latin America and Central Asia. There was no one with my last name and very few that looked Irish (my lineage) at all. It just didn't make any sense. I do have a surname variant in the mix today, but he wasn't there when I went through this step and he has never replied to my emails.
After encouragement from others and the recommendation received to go to the gold standard (I was told back then) of 67 STRs, I instead took a small step and upgraded to 25 STRs. Here is what I now see on that screen.
B) 7 matches at 25 STRs, but only two of those were on the 12 STR matches report so there are five new individuals. (0 at GD=0, 0 at GD=1, 7 at GD=2).
It was apparent to me there was no magic in the first 12 STRs. They were not scientifically selected to support genetic genealogy. I think they were just happenstance of the early academic studies and what worked for them. At 25 STRs I still saw no patterns, although I got rid of some of the strangest geographies. I probably stewed for a year before going to 37 STRs.
C) 4 matches at 37 STRs, but only two of those were on the 25 STR matches report, so I have two new individuals at 37 STRs. (0 at GD=0, 0 at GD=1, 0 at GD=2, 2 at GD=3, 2 at GD=4).
I thought I was getting somewhere at 37 STRs as I now was down to British Isles sounding names, but nothing that seemed to have a relationship to my family's genealogy and history. A GD of 3 at 37 STRs is not that good, so I knew I should have gone to 67 STRs earlier. I am a slow learner, but I went ahead to the next step. I resigned myself that I probably wasn't getting anywhere, but at worst I would be a leader and establish where I fit in the matching database so others could find me in the future.
D) 19 matches at 67, but I dropped one of my 37 STR matches while gaining 16 new matches. (0 at GD=0, 0 at GD=1, 0 at GD=2, 0 at GD=3, 2 at GD=4, 2 at GD=5, 4 at GD=6, 11 at GD=7).
What's up with that? I have 19 people on my 67 STR matches screen while only 4 on my 37 STR matches screen. The answer is there is no magic in the STRs included by panel at 1-12, 13-25, 26-37 and 38-67. It turns out I have a couple of red herring STRs in my 1-12 and 13-25 panel results which were throwing things off. These were very recent mutations that made me look more distantly related to people I was related to 300-500 and 500-1100 years ago.
At 67 STRs, a clear pattern emerged, FINALLY. The clear majority of matches were either residents of Wales or had Welsh surnames. There was one new person with a surname that just didn't fit at all. I pretty much ignored him, which was a mistake, and started contacting the Welsh related people. Since my surname is Walsh this all started to make sense. Another clear pattern emerged that the veteran hobbyists pointed out to me. I had a pair of slow moving STRs that was emerging as a distinctive pattern within the matches.
E) 3 matches at 111: one appeared for the first time at 111 STRs, and the other two were previously only on the 67 STR matches screen. None of these three 111 matches were on my 12, 25 or 37 STR matches screens. (0 at GDs 0 thru 5, 1 at GD=6, 2 at GD=10).
The GD=6 at 111 is pretty good, but it was the unexpected surname. I knew I had to contact him. It turns out he had not (at the time) updated his MDKA and he knew there was a surname change. The ancestor was of a surname variant of mine and from the same county in Ireland as my most distant known ancestor. We've now both taken Big Y and have identified an SNP that only we two share that is in the range of 200 to 500 years old. It's now included in SNP packs and panels so someone else may stumble upon us without having to go through the expense we did. These people in my 111 STR matches group are ones that I would be willing to contribute/pool money with for discovery testing like Big Y. Ironically, there is some magic (rhyme or reason) in the panels. The 68-111 panel was organized last and was specifically organized for smoothly behaving STRs that are useful for genetic genealogy. The problem is you have to buy the other STRs to get to the 68-111 panel anyway. There are over ten thousand R1b people in the database at 111 STRs. It has been the new gold standard for some time.
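To see how a couple of "red herring" STRs in an early panel can hide a true relative, here is a small Python sketch. It assumes genetic distance (GD) is simply the sum of the absolute differences in repeat counts, which is a simplification; the vendor's actual GD rules treat some markers differently. The marker names (S01 to S08), the repeat values and the two short "panels" are all invented stand-ins.

```python
def genetic_distance(a, b, markers):
    """Sum of absolute repeat-count differences over the given markers."""
    return sum(abs(a[m] - b[m]) for m in markers)

# Invented stand-ins for an early panel and a later panel.
early_panel = ["S01", "S02", "S03", "S04"]
later_panel = ["S05", "S06", "S07", "S08"]
all_markers = early_panel + later_panel

me = {"S01": 13, "S02": 25, "S03": 14, "S04": 11,
      "S05": 12, "S06": 15, "S07": 9, "S08": 17}
# A true relative with two recent mutations, both in the early panel.
relative = {"S01": 13, "S02": 24, "S03": 15, "S04": 11,
            "S05": 12, "S06": 15, "S07": 9, "S08": 17}
# An unrelated tester who happens to coincide on the early panel.
stranger = {"S01": 13, "S02": 25, "S03": 14, "S04": 11,
            "S05": 14, "S06": 13, "S07": 10, "S08": 16}

for name, other in (("relative", relative), ("stranger", stranger)):
    print(name,
          genetic_distance(me, other, early_panel),
          genetic_distance(me, other, all_markers))
```

On the early panel alone the stranger looks like the better match; across all the markers the relative comes out clearly closer. That is the same effect I saw as my match lists changed from 12 through 111 STRs.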
Will you see patterns like I saw as I upgraded? Maybe, maybe not. Your individual case might be entirely different. The key is that it depends on your curiosity and interest in genetic genealogy. You don't know what you will find until you test, which is why we test. Essentially, we are on an exploration and discovery mission. We don't know what's on the other side. There are no guarantees, and for goodness sake, don't spend money you need for something else, but if you are interested I think advanced testing can be very valuable.
STRs are additive to SNPs at the end of the branches and twigs of the tree. In Big Y, a new SNP occurs in about one of every four generations (strictly, father-son transmissions). If you go all the way to 111 STRs, a change somewhere in the 1-111 STR panels occurs about once every three generations. Multiplying the probabilities of no change (3/4 for SNPs and 2/3 for STRs) gives 1/2, so there is an opportunity for a mutation in about one of every two father-son transmissions, or 50% of the time on average. This is very granular. We are moving towards discerning grandfathers from grandsons. This is very helpful when your genealogical records trail hits a brick wall.
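The arithmetic behind the 50% figure can be checked with a short Python sketch. It assumes the two rough rates quoted above can be treated as per-transmission probabilities and that SNP and STR changes happen independently of each other; both are simplifications.

```python
from fractions import Fraction

# Rough rates from the text, treated as per-transmission probabilities.
p_snp = Fraction(1, 4)  # a new Big Y SNP in a father-son transmission
p_str = Fraction(1, 3)  # a change somewhere in the 111 STR markers

# Assuming independence, multiply the chances of seeing no change at all.
p_no_change = (1 - p_snp) * (1 - p_str)
p_any_change = 1 - p_no_change

print(p_no_change, p_any_change)
```

Under those assumptions, about half of all father-son transmissions carry at least one detectable change.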
What are SNP Packs and which should I consider?
I cannot recommend SNP Packs right now as there have been too many changes to the haplotree and many packs need updating.
What are terminal SNPs? novel SNPs? private and public SNPs?
Terminal SNPs are the SNPs that mark an individual's placement on the haplotree on their most youthful known branch. An individual's terminal SNP is not permanent, as updates to the haplotree may change the known branching. This is particularly apparent when new Big Y results show up as good matches for an existing Big Y result. Big Y testers invariably have their own list of novel SNPs.
Novel SNPs are just SNPs that are not known and/or documented. We find novel SNPs from Big Y discovery testing. The new SNPs are oftentimes unnamed, but even if named they are private to just that tester. These are also called singleton SNPs. The first Big Y tester with a set of new or novel SNPs may find they sit there, latent.
Novel SNPs often become public as additional people test. As new Big Y results come in and some of the novel SNPs are found in another individual, a new subclade or branch on the haplotree is discovered. When the branch is documented and submitted to those who maintain haplotrees, the novel SNPs are no longer novel. They are now public SNPs on the haplotree, shared by more than one person. This will cause a new haplogroup label to be generated for both the new Big Y tester and the original Big Y tester within whom the then novel SNPs were discovered. This is why Big Y testing is so important. It's how we build out the tree. See above, "Why Big Y Next Generation Sequencing for discovering SNPs?"
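Here is a small Python sketch of how novel SNPs turn into a public branch when a new result arrives. It is only an illustration of the bookkeeping, not the process the haplotree maintainers actually use; the tester labels and SNP names (X1, X2, ...) are invented, and `find_shared_novel` is a helper name chosen for this example.

```python
def find_shared_novel(existing_novel, new_novel):
    """Map each earlier tester to the novel SNPs they share with the newcomer."""
    shared = {}
    for tester, snps in existing_novel.items():
        overlap = snps & new_novel
        if overlap:
            shared[tester] = overlap
    return shared

# Invented novel (private) SNP lists from earlier Big Y testers.
existing_novel = {
    "tester_A": {"X1", "X2", "X3", "X4"},
    "tester_B": {"X9"},
}
# Invented novel SNP list for a newly arrived Big Y result.
new_novel = {"X1", "X2", "X7"}

branches = find_shared_novel(existing_novel, new_novel)
for tester, snps in branches.items():
    print("new branch shared with", tester, "defined by", sorted(snps))
    print("still private to", tester, sorted(existing_novel[tester] - snps))
    print("still private to the new tester", sorted(new_novel - snps))
```

The shared SNPs become the new public branch for both testers, while the leftovers on each side stay private until yet another tester matches them.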
What myFTDNA Account Profile Settings are important and what should they be?
Please take some time to review and update the information in your kit at Family Tree DNA.
Go to https://www.familytreedna.com/
Click the LOGIN TAB on the top of the screen and enter your kit number and password to open your MyFTDNA page.
Click the MANAGE PERSONAL INFORMATION link on the left of the page. Or, use the drop-down menu at top right to open your profile.
There is a lot you can configure. Please take some time to click around and set up your profile.
In particular, please check and consider the following.
1. On the CONTACT INFORMATION tab:
Input your current mailing address. This is used in case FTDNA needs to send you a new test kit to upgrade your kit. It is also useful if a DNA Project Administrator wants to contact you and your email is not working.
Input multiple email addresses if you can. This is helpful if your email address stops working for any reason. If you have a beneficiary or relative that you might want to take over your kit someday, input their email address too. If you want, input the email address of your DNA Project Administrators. Anyone whose email you input here could someday take over management of the kit if you are no longer able to do so.
If the contact person is not the person who gave the DNA sample, then please input the name of the DNA donor and put the contact person as c/o (Care Of) in the address line. For example, John James Smith, c/o Donna Smith Jones.
2. On the ACCOUNT SETTINGS tab:
Change the Personal Information default from Private to Basic or Full. This allows others with FTDNA login access to view the information you share in your profile. This is useful for people who match you and for others in any DNA project groups you join.
If you have a web site or family tree online, you can show the link in the ABOUT ME box. If the DNA donor has passed, or is no longer able to donate additional DNA, then you might want to include a note explaining this in the ABOUT ME box.
3. On the GENEALOGY tab - FAMILY TREE:
Even if you have not yet created a family tree on FTDNA, please change the default Family Tree privacy settings. Hopefully, someday you will create or upload a tree. Or, a project admin might do it for you. So, it will help if these settings are configured. In order to use DNA for genealogy, you want people to check your tree. I set my tree to Public for deceased people, but individually select those I wish to retain as Private (because they are living). If you have a gedcom file of your family tree, please upload it by clicking on the FAMILY TREE button on your kit's main page.
If you don't have a gedcom file, and can't make one, then you can manually create a tree by clicking on FAMILY TREE then clicking the profile icon. If you have a tree on Ancestry.com or elsewhere then you can get a gedcom.
Or, if someone else has you in their tree, they might be able to give you a gedcom. If you need help creating a gedcom or extracting your tree from Ancestry, go to http://www.nixternal.com/export-gedcom-file-from-ancestry-com/
4. On the GENEALOGY Tab - MOST DISTANT ANCESTORS:
Input your Most Distant Ancestors. These should be the most distant known ancestors you have in your direct paternal and direct maternal lines. Only input names that you know with high confidence. It helps if you include dates and location info with the name, although you may have to abbreviate words.
If the date is approximate, use the letter "c" as abbreviation for circa in front of the date. Circa is the standard term meaning around or about.
For example, John Henry Smith, b. c.1822, Scotland
5. On the GENEALOGY tab - SURNAMES:
Input all the surnames of your known ancestors on all branches of your lineage. This is very useful because the matching tools allow people to search matches for surnames. If you have a surname with variations in spelling, it can help to input each variation. That way you will show up whichever variation someone uses to search their matches.
6. On the BENEFICIARY INFORMATION tab:
Input the name and contact information of someone you want to take over the kit should you pass away or become unable to manage it. If you don't have anyone to make your beneficiary, then ask one of your DNA Project Administrators for their contact information to make them your kit beneficiary. It is very sad that many people pass away without designating a kit beneficiary. That makes their DNA kit of limited use for future researchers.
7. On the PRIVACY & SHARING tab:
Change most of the default settings here. Make sure to have all levels of matching selected and to opt in to sharing your origins.
Matching Preferences/Y DNA - All Levels
Origin Sharing - Opt in to Sharing
Project Sharing/Group Profile - Opt in to Sharing
mtDNA Coding Region - Opt in to Sharing (if you want your maternal side lineage shared too)
8. On the PROJECT PREFERENCES tab:
Select Advanced or, at a minimum, Limited Access for the project administrators.
Note: These project web pages and the project in general are geared towards genetic genealogy as a hobby and should not be considered a platform for forensic, legal or academic research.