FAQ – Frequently Asked Questions - W&N2a mtDNA haplogroup project
Q: What is mtDNA?
A: mtDNA is mitochondrial DNA. It resides outside the cell nucleus, in the mitochondria, which control the energy metabolism (power consumption) of the cell. There are 37 known genes in mtDNA, but a substantial part of the mtDNA is non-coding.
Q: How is mtDNA inherited?
A: mtDNA is inherited from mother to child, which means that mtDNA can tell the story of your matrilineal ancestors (mother's mother's mother's mother's ... mother) in an unbroken line of females to the dawn of time.
Q: What is a mutation on mtDNA?
A: mtDNA consists of approximately 16,569 positions arranged in a ring of nucleotides, each position coded by one of the four letters A, C, G and T. Long strings of these nucleotides form genes. Whenever one of these positions changes its coding letter, we call this a mutation, which may change the function of that position. Since malfunctioning genes normally reduce the individual's chance of survival, most lasting mutations are found in the non-coding regions of mtDNA. The simplest form of a mutation is described by two letters and a number, e.g. C16292T, where the first letter is the original coding letter, the number is the position on the string and the last letter is the new coding letter. C16292T means that at position 16292 the mtDNA has changed from C to T.
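The notation can be read mechanically. The following is a minimal sketch in Python; the function name parse_substitution and the layout of the returned tuple are illustrative assumptions, not part of any FTDNA tool.

```python
import re

# Pattern for a simple substitution such as "C16292T":
# original letter, position number, new letter.
SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_substitution(label):
    """Split a label like 'C16292T' into (original, position, new)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    original, position, new = match.groups()
    return original, int(position), new


print(parse_substitution("C16292T"))  # ('C', 16292, 'T')
```

Labels that are not simple substitutions, such as insertions or deletions, are rejected with a ValueError rather than guessed at.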
Q: Are there other forms of mutations on mtDNA?
A: Yes, we also find backmutations, insertions, deletions, frameshifts and heteroplasmies in mtDNA.
Q: What is a backmutation?
A: A backmutation occurs when a pre-existing mutation mutates back to its initial state. We denote it by an exclamation mark, e.g. C16292T!, which indicates that this particular kit has a C at position 16292 even though some of its matrilineal ancestors carried the T mutation at this position at some time in the past. FTDNA sometimes calls backmutations missing mutations.
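As a small illustration, the letter a kit actually carries can be derived from the label. This is a sketch in Python that assumes a label with at most one trailing exclamation mark; the function name current_letter is hypothetical.

```python
def current_letter(label):
    """Return the letter the kit carries at the position named in the label.

    'C16292T'  -> 'T' (the mutation is present)
    'C16292T!' -> 'C' (the mutation has reverted to the original letter)
    """
    if label.endswith("!"):
        core = label[:-1]   # strip the backmutation mark
        return core[0]      # first letter is the original coding letter
    return label[-1]        # last letter is the new coding letter


print(current_letter("C16292T"))   # T
print(current_letter("C16292T!"))  # C
```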
Q: What is an insertion?
A: An insertion is one or more extra coding positions inserted into the mtDNA string. These are typically named e.g. 522.1C, where the first number is the position on the mtDNA, the second number is the sequence number of the insertion at that position and the letter is the coding letter. Sometimes a string of coding letters is inserted at one position, typically either of the same type (e.g. CCCC) or alternating (e.g. ACAC).
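To show how several insertion labels at one position combine into an inserted string, here is a minimal Python sketch. The function names and the four example labels are illustrative assumptions chosen to reproduce the alternating ACAC pattern; they are not taken from a real kit.

```python
import re
from collections import defaultdict

# Pattern for an insertion such as "522.1C":
# position, sequence number of the insertion, inserted letter.
INSERTION = re.compile(r"^(\d+)\.(\d+)([ACGT])$")


def parse_insertion(label):
    """Split a label like '522.1C' into (position, order, letter)."""
    match = INSERTION.match(label)
    if match is None:
        raise ValueError(f"not an insertion: {label!r}")
    position, order, letter = match.groups()
    return int(position), int(order), letter


def inserted_strings(labels):
    """Join the letters inserted at each position, in insertion order."""
    by_position = defaultdict(list)
    for label in labels:
        position, order, letter = parse_insertion(label)
        by_position[position].append((order, letter))
    return {
        position: "".join(letter for _, letter in sorted(items))
        for position, items in by_position.items()
    }


# Hypothetical labels, for illustration only.
print(inserted_strings(["522.1A", "522.2C", "522.3A", "522.4C"]))  # {522: 'ACAC'}
```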
Q: What is a deletion?
A: A deletion means that one position in the mtDNA ring is missing. It is typically denoted e.g. 7869d, where the number is the position on the mtDNA ring and the "d" identifies a deletion. Sometimes deletions occur in pairs or triplets, or even in multiples of triplets. In subclade W1+C16295T+C8270T we have an example of three triplets that are deleted.
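Neighbouring deletions can be grouped into runs to see whether they form triplets. The sketch below is in Python; the function names are assumptions, and the positions 100 to 102 are hypothetical values used only to illustrate a triplet next to the single deletion 7869d.

```python
import re

# Pattern for a deletion such as "7869d": position followed by "d".
DELETION = re.compile(r"^(\d+)d$")


def parse_deletion(label):
    """Return the deleted position from a label like '7869d'."""
    match = DELETION.match(label)
    if match is None:
        raise ValueError(f"not a deletion: {label!r}")
    return int(match.group(1))


def deletion_runs(labels):
    """Group deleted positions into runs of consecutive positions."""
    positions = sorted(parse_deletion(label) for label in labels)
    runs = []
    for position in positions:
        if runs and position == runs[-1][-1] + 1:
            runs[-1].append(position)   # extends the current run
        else:
            runs.append([position])     # starts a new run
    return runs


# Hypothetical labels, for illustration only.
for run in deletion_runs(["100d", "101d", "102d", "7869d"]):
    is_triplet_multiple = len(run) % 3 == 0
    print(run[0], len(run), is_triplet_multiple)
# 100 3 True
# 7869 1 False
```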
Q: What is a frameshift?
A: Sometimes an insertion or deletion is followed by a matching deletion or insertion at another position in the mtDNA ring. This is probably due to some kind of error correction scheme in the copying of mtDNA strings in the cell. The result is something we call a frameshift, where the coding letters at the positions within the frame delimited by the insertion and deletion are shifted. In the notation of FTDNA this shows up as a cascade of insertions, deletions and mutations, but it probably reflects only one or a few genetic events. We have found frameshifts in haplogroups W5a1a1a and W5b/W5b1.
Q: What is a heteroplasmy?
A: A heteroplasmy is an ongoing mutation. Since every cell contains many mitochondria, we sometimes find mtDNA strings that are coded with different letters at the same position within the same cell. It means that you have two different coding letters at this position. A heteroplasmy is typically denoted by a letter other than ACGT or d, e.g. C16292Y, where the Y indicates that the bearer carries both the original C at position 16292 and the mutation to T. Most heteroplasmies cannot survive many generations and are therefore normally quite recent, often the last mutation to occur. Many heteroplasmies are therefore private.
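The letter Y belongs to the standard IUPAC nucleotide ambiguity codes, each of which stands for a pair of letters. The Python sketch below looks up that pair; only Y is mentioned in this FAQ, and the function name heteroplasmy_letters is an illustrative assumption.

```python
# IUPAC codes that stand for exactly two nucleotides.
IUPAC_TWO_BASE = {
    "R": "AG",
    "Y": "CT",
    "S": "CG",
    "W": "AT",
    "K": "GT",
    "M": "AC",
}


def heteroplasmy_letters(label):
    """Return the two letters present at a heteroplasmic position, or None.

    The last character of the label is looked up in the IUPAC table;
    an ordinary substitution such as 'C16292T' yields None.
    """
    return IUPAC_TWO_BASE.get(label[-1])


print(heteroplasmy_letters("C16292Y"))  # CT
print(heteroplasmy_letters("C16292T"))  # None
```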
Q: What do we mean by a private mutation?
A: When we describe a mutation as private, we mean a mutation that has been seen in only one kit tested so far, hence it is private to that particular kit. The opposite is a public mutation, which is shared among several members of a branch. We never publicize private mutations, since these can be used to identify individuals and hence intrude on your privacy.
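The distinction can be expressed as a simple count over kits. The following Python sketch uses hypothetical kit names and mutation lists; it is not how the project stores its data.

```python
from collections import Counter


def classify_mutations(kits):
    """Split mutation labels into (private, shared) sets.

    `kits` maps a kit name to the list of mutation labels it carries.
    A label seen in exactly one kit is private; one seen in several is shared.
    """
    counts = Counter(
        label for labels in kits.values() for label in set(labels)
    )
    private = {label for label, n in counts.items() if n == 1}
    shared = {label for label, n in counts.items() if n > 1}
    return private, shared


# Hypothetical kits, for illustration only.
kits = {
    "kit1": ["C16292T", "T5442C"],
    "kit2": ["C16292T"],
    "kit3": ["C16292T", "522.1C"],
}
private, shared = classify_mutations(kits)
print(sorted(private))  # ['522.1C', 'T5442C']
print(sorted(shared))   # ['C16292T']
```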
Q: How are mutations used to identify relationship between our members?
A: Since many generations typically pass between one mtDNA mutation and the next, we can build a tree from the order in which the mutations have occurred. This tree is called a phylogenetic tree. Individuals on the same branch of this tree share common ancestors; the closer to the twigs of the tree the shared branch is, the closer the relationship.
Q: What is the genetic distance (GD)?
A: FTDNA uses a measure called genetic distance (GD), which is equal to the number of mutations that separate two kits, on the assumption that a low GD indicates a closer relationship. GD0 is no mutations, GD1 is one mutation and so forth. If you have a multiple insertion, a multiple deletion, a frameshift or even a heteroplasmy, this counting algorithm fails to identify relationships, since e.g. a multiple deletion may count as a large genetic distance even though it is one mutational event.
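The weakness of plain counting can be seen in a small example. The Python sketch below counts the labels found in exactly one of two kits; this naive count is an assumption made for illustration and is not claimed to be FTDNA's actual algorithm. The kits and the deletion positions are hypothetical.

```python
def naive_genetic_distance(kit_a, kit_b):
    """Count mutation labels present in exactly one of the two kits."""
    return len(set(kit_a) ^ set(kit_b))


# Hypothetical kits, for illustration only.
kit_a = ["C16292T"]
kit_b = ["C16292T", "T5442C"]                # one extra substitution
kit_c = ["C16292T", "100d", "101d", "102d"]  # one triplet deletion

print(naive_genetic_distance(kit_a, kit_b))  # 1
print(naive_genetic_distance(kit_a, kit_c))  # 3
```

The triplet deletion in the third kit is a single mutational event, yet it adds three to the count, so that kit looks more distant from the first kit than the second kit does.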
Q: Where can I read more about the basics of mtDNA?
A: One good introduction is https://dna-explained.com/2017/05/09/mitochondrial-dna-your-moms-story/
Q: What is a haplogroup?
A: A haplogroup consists of a collection of mtDNA kits that share a common ancestor way back in time. Haplogroups and their subclades are typically denoted by a string of alternating letters and numbers, e.g. W3a1d. W is the mother clade of W3, which is the mother of W3a, which is the mother of W3a1, which is the mother of W3a1d.
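Because each added letter or number marks one level of the tree, the chain of mother clades can be read directly from the name. The Python sketch below does this; the function name lineage is an assumption, and the result only covers the lineage encoded in the name itself (it does not know, for instance, that W descends from N2).

```python
import re


def lineage(name):
    """List the clades from the root haplogroup down to `name`.

    The name is split into its levels: the leading capital letter(s),
    then each number and each lowercase letter in turn.
    """
    parts = re.findall(r"[A-Z]+|\d+|[a-z]", name)
    return ["".join(parts[:i]) for i in range(1, len(parts) + 1)]


print(lineage("W3a1d"))  # ['W', 'W3', 'W3a', 'W3a1', 'W3a1d']
```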
Q: Why are some subclades described by a number of mutations?
A: All known branches of haplogroup W form the W phylogeny. Official branches found on www.phylotree.org are named with their short form, like W3a1d. When we identify a new subclade in our project, we use its mutations to describe its exact position in the phylogeny. For example, W3a1d+T5442C is a subbranch under W3a1d which is uniquely defined by a mutation from T to C at position 5442.
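Such a description can be split into the official branch and the defining mutations. This is a minimal Python sketch; the function name split_subclade is hypothetical.

```python
def split_subclade(description):
    """Split 'W3a1d+T5442C' into the official branch and its extra mutations."""
    base, *mutations = description.split("+")
    return base, mutations


print(split_subclade("W3a1d+T5442C"))       # ('W3a1d', ['T5442C'])
print(split_subclade("W1+C16295T+C8270T"))  # ('W1', ['C16295T', 'C8270T'])
```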
Q: How old is haplogroup W?
A: All members of haplogroup W probably share a common matrilineal ancestor who lived sometime during the Ice Age, when a kilometre-thick ice sheet covered parts of the Northern Hemisphere, including Northern Europe and Siberia. South of this ice sheet there was a barren, arid tundra landscape with few surviving animals. Anatomically Modern Humans had arrived in Europe 45,000 years ago, and barely survived in a few pockets in Southern Europe and the Near East during the Last Glacial Maximum of the Last Ice Age.
Q: Are there many branches under W?
A: Yes, from the original W ancestor, W has split into many branches and subbranches. During 2018 we identified more than a hundred new subclades under W, and we are starting to get information on the geographic spread of many subbranches, sometimes hinting at an origin and migration route for a particular subbranch.
Q: What is the geographic spread of haplogroup W?
A: Haplogroup W is spread over most of Europe, Northern Africa, South-West Asia, Central Asia, Siberia, Northern India and Pakistan, with some subbranches reaching as far as Thailand and Laos. In modern times many W descendants have emigrated to North America, South America, South Africa and Australia, but haplogroup W does not originate from these continents and is not present in their indigenous peoples, with the exception of Africa north of the Sahara.
Q: Why is the project called W&N2a haplogroup project?
A: N2a is the sister haplogroup of W, with only a very few tested kits in three known branches. W and N2a have a common maternal ancestor in haplogroup N2, which is estimated to be 44,500 years old. This is around the time when Anatomically Modern Humans started to move from the Near East west into Europe and north into the Pontic steppes. We cannot pinpoint an exact origin of haplogroup N2, as we have no ancient DNA from this branch, but N2a is most common in Armenians.
Q: What can ancient DNA tell us?
A: Recent advances in DNA sequencing have allowed analysis of some hundred ancient mtDNA samples from prehistoric times. Even though haplogroup W has always been a minority haplogroup, there is a fast-growing number of known ancient samples. These tell the story that W1 and W5 probably arrived in Europe from Anatolia (present-day Turkey) with Neolithic farmers about 7,000-8,000 years ago, while W3 and W6 arrived in Europe by a northern route across the Pontic steppes north of the Black Sea at the brink of the Bronze Age, about 5,000-6,000 years ago. The number of samples is still small, so we must expect new discoveries in the future as a wealth of ancient samples is analysed.
Q: What are the oldest known ancient W samples?
A: The oldest known W1 individual, BAR271, is an 8,500-8,200-year-old Neolithic male from Barcin in Turkey [Mathieson et al 2015]. Barcin is located close to the origin of agriculture. Over the next 3,000 years, we have a trail of 5 Neolithic W1 samples from different ancient grave sites and caves in present-day Germany, Hungary and Scotland. One of them belongs to the W1c subclade. The most famous ancient W1 is called the "Amesbury Archer's Companion", a Chalcolithic Bell Beaker individual that is about 4,500-4,100 years old [Olalde et al 2017]. The Bell Beaker culture is associated with metalwork in gold and copper, as well as secondary products from animal herds. This can be interpreted as the Neolithic farmer W1 females adapting to the newly arriving Bronze Age herder cultures.
Q: What about ancient findings from other subclades?
A: There are also two known Neolithic W5 individuals: one from the Early Neolithic 7,800-7,500 years ago in southern Hungary and one from the Late Neolithic 5,300-5,000 years ago in Poland. W5 is also found in Chalcolithic Corded Ware and Bell Beaker cultures in present-day Germany. The two subclades W3 and W6 seem to have arrived in Europe later, with the Bronze Age Secondary Products Revolution. The oldest known W3a1, W6 and W6c are from the Late Neolithic Yamnaya culture, which probably carried the Indo-European languages to Europe and was the predecessor of the Bell Beakers. W3a1 is found in Bronze Age Bell Beaker and Unetice cultures in present-day Germany, while W6a is found in Baltic Late Neolithic and Chalcolithic Corded Ware cultures. W3a1 is also found during the Bronze Age along the Dalmatian coast of Croatia. W3b is found in the hillfort of the Lchashen Metsamor culture in present-day Armenia, where the largest collection of Bronze Age horse-drawn carriages has been found, and in a later Iron Age steppe herder in present-day Kazakhstan. Based on the geographic spread of some specific subbranches, we also assume that many branches of W migrated with their Indo-European language and secondary products culture south and east into Iran, Pakistan and India during the Bronze Age. W3a1b even ventured as far as Thailand and Laos in South-East Asia.
Q: Are there any online resources related to haplogroup W?
A: One of the best resources on haplogroup W is www.thecid.com
Q: What is GenBank?
A: GenBank is the genetic sequence database of the National Institutes of Health (NIH). It is a collection of all publicly available DNA sequences, including full sequence mitochondrial DNA. Most of the mtDNA sequences come from published research projects. A considerable number are also privately submitted by individuals who have full sequence tests at FTDNA. These individual submissions have been important in defining the current mtDNA tree at PhyloTree.
Q: How do I submit my mtDNA to build a new version of the PhyloTree?
A: See Ian Logan's mtDNA web pages for a brief description of GenBank. There is also a description of how to upload your own full sequence mtDNA results to GenBank at http://www.ianlogan.co.uk/submission.htm. Please consider this if you have a full sequence test and your subclade is rare or under-represented at GenBank.
Q: What is the Most Distant Maternal Ancestor?
A: Your Most Distant Maternal Ancestor is the earliest known female ancestor on your maternal line: your mother, her mother (your maternal grandmother), her mother again (your maternal great-grandmother) and so on, each recorded under her maiden name. There should be no males between you and your Most Distant Maternal Ancestor. When you enter your Most Distant Maternal Ancestor at FTDNA, please do not enter a living person without their consent and do not enter speculative ancestors (you should be able to document the steps between each generation).
Q: How do I enter my Most Distant Maternal Ancestor at FTDNA?
A: Log into your account at FTDNA and highlight your name in the upper right corner to see «My Profile». Click on «My Profile», select the Genealogy tab, then select «Earliest Known Ancestors».
· The «Country of Origin» should be the country where your Most Distant Maternal Ancestor was born or first documented. If your Most Distant Maternal Ancestor was born in the USA, her Country of Origin is United States. Answer NO to the question: Are you Native American? Since N2a and W are Eurasian haplogroups, you should not select United States (Native American) even if some of your ancestors are known to be Native Americans.
· The «Name of Most Distant Maternal Ancestor» should include her maiden name, year and place of birth [Example: Alvarette Holmes b.1912 Syracuse, NY, USA]. Remember to click SAVE.
· The «Ancestral Location» allows you to add or update the location marker for the birthplace of your Most Distant Maternal Ancestor. This is very helpful for determining the Place of Origin for the haplogroup subclades on our public website for both Old World and New World origins, as your ancestor's location will show on the maps.