H4 mtGenome

  • 710 members


FAQ Frequently Asked Questions

I am Finnish and I have many matches from the British Isles (especially Ireland) and the USA, OR I am American of Irish ancestry and I have many Finnish matches. Why so many?

A very, very good question.

In part it is because a larger percentage of people with British (especially Irish) ancestry have tested, and a relatively large percentage of the Finnish population have tested too. But USA-based people still dominate testing and hence their ancestry is seen more often. Regions with low test rates, such as Eastern Europe (Poland, Czech Republic, Slovakia, Belarus and Russia) and the Baltic States (Estonia, Lithuania, Latvia), will appear much less often simply because fewer people have tested. The imbalance between the USA and some parts of Europe is likely to continue in line with the average wage level of people in these countries.


But there are more exact matches than we have an explanation for, and this is one of the puzzles in our results that is intriguing the project.

How can I contribute?

There are many ways. First join this project; then the simplest way is to enter your most distant maternal ancestor (mother’s mother’s mother etc.) into your profile details. At present only about half of us have entered this information. The space is under your name at the top right-hand side, on the tab labelled My Profile, near the bottom, just above projects.

Use your enthusiasm for family history to take that maternal line back further and update it as you make progress.


If you have any matches on the full mtDNA test then make contact with them to see if you have any geographical areas in common. You will have many more matches in the full database than the people who have chosen to join the project, since only about one third have joined. As a project we learn more by having more members because more patterns may become visible in our mutations data. So please invite your matches to join this project as well as working with them on genealogy. You may like to use a message such as:

“Dear (name), My name is (name) and we are an exact match on our mtDNA and share the same sub group of (H4 etc.) in the H4 haplogroup. I am a member of the FTDNA H4 project, which allows me to share ideas about the deep ancestry of the maternal lines of our family trees. As members we can allow our mutations to be visible to the volunteer administrators, and they may see patterns which can help us as individuals and as a project group.

You can also enter your most distant maternal ancestor and that is useful as well as joining the project.

Please join: you will learn a lot too, it will cost you no money and you may spend as much or as little time on the project as you wish.

Best regards, (name)”

Fill in your name and sub group details to make sense of this message. 

Because we have many exact (zero GD) matches on H4a1a1a, please do not flood all of the potential recipients. Select only the ones nearest in geography to your most distant maternal ancestor so that you both benefit.

When did the mutations that define the haplogroups occur, OR when was haplogroup H4a1a1a created?

DNA mutations are random events, which means that they do not occur at regular intervals. The median (the value with a 50% chance of the true date being earlier and a 50% chance of it being later) can be calculated from details of our DNA, but there is no way, looking back from the present, to give an exact answer. Each mutation clearly did happen at a very precise time, but we cannot measure it accurately using our modern DNA. Ancient DNA from graves across Europe will help to give better time pointers and also better geographical clues to our deep common ancestor.


The dates used here come from the supplementary data (page 91) of Behar 2012 [1]. The date of the last mutation for a sub-clade is also the median time to the most recent common ancestor (TMRCA) of any two people in that sub-clade: there is a 50% chance that the common ancestor is more recent than this date and a 50% chance that she is longer ago.



[1] Behar 2012, “A Copernican Reassessment of the Human Mitochondrial DNA Tree from its Root.” http://www.ncbi.nlm.nih.gov/pubmed/22482806


Times of each sub-clade

The supplementary data on page 91 of Behar 2012 [1] gives the following table of the date of the last mutation for each sub-clade of the H4 haplogroup. SD stands for standard deviation, the usual way of expressing the uncertainty in a measurement.



[Table: for each H4 sub-clade, the median number of years ago (50% chance) and the uncertainty (standard deviation, SD)]

So if you are lucky enough to be H4a1a1a1a1 then your last mutation has a 66% chance of having occurred between 0 AD and now, with a median time (50% chance) predicted to be 1300 years ago. But if you are a pure H4 with no extra mutations then the last common ancestor could be 12,000 years ago (or more).


If you take each of the sub-clades of H4 and calculate the average time from the parent clade to the next mutation that defines a sub-clade, the range runs from a minimum of 700 years to a maximum of 5000 years. That is quite a wide range, and the average is 2700 years. Our more recent mutations, which will eventually form more sub-clades as more people test, would possibly bring this average down a little. For example, I have 6 mutations beyond the H4a1a1a recipe, of which three are the “hot spots” 16519T, 309.1C and 315.1C, one is a heteroplasmy and two are unique. So my average for these extra mutations is 1000 years per mutation or 3000 years per mutation, depending on how many you include in the assessment.

The answer to the particular question of when H4a1a1a arose is that there is a 66% chance that it was somewhere between 4000 and 7000 years ago, a 17% chance that it was longer ago than that and a 17% chance that it was more recent. That is also the time to our most recent common ancestor.
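The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the median of 5500 years and the SD of 1500 years are not quoted from the published table but are inferred here from the 4000 to 7000 year range (reading the 66% range as the median plus or minus one SD), and the function names are made up for this sketch. It is one possible reading of how the per-mutation averages for the 6 extra mutations were reached.

```python
# Assumed values, inferred from the 4000-7000 year range quoted in the text
# (not copied from the published table).
H4A1A1A_MEDIAN_YEARS = 5500
H4A1A1A_SD_YEARS = 1500


def one_sd_range(median_years, sd_years):
    """Return the (most recent, oldest) ends of the roughly 66% range."""
    return (median_years - sd_years, median_years + sd_years)


def years_per_mutation(clade_age_years, extra_mutations):
    """Average time per mutation since the clade's defining mutation."""
    return clade_age_years / extra_mutations


recent, oldest = one_sd_range(H4A1A1A_MEDIAN_YEARS, H4A1A1A_SD_YEARS)
print(f"66% range: {recent} to {oldest} years ago")

# Counting all 6 extra mutations, or only the 2 unique ones
# (leaving out the three hot spots and the heteroplasmy).
all_six = years_per_mutation(H4A1A1A_MEDIAN_YEARS, 6)
unique_two = years_per_mutation(H4A1A1A_MEDIAN_YEARS, 2)
print(f"All 6 mutations: about {all_six:.0f} years per mutation")
print(f"2 unique mutations: about {unique_two:.0f} years per mutation")
```

With these assumed inputs the two averages come out at roughly 900 and 2750 years, which round to the 1000 and 3000 years per mutation mentioned above.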

The older the clade, the slower the phylogenetic rate seems to be, so some of the basic splits have times of 18,000 years per mutation.


FTDNA says there is a 50% chance of an exact match having a most recent common ancestor in the last 125 years or 5 generations. Where does this value come from?

This stated 50% chance does not match our combined experience, nor that of other genetic genealogists. Roberta Estes writes on her blog (https://dna-explained.com/2016/01/06/we-matchbut-are-we-related/): “I personally think that the 5 generation estimate of a 100% match for the full sequence is overly optimistic. In fact, a lot overly optimistic”.

FTDNA does not say where their estimate comes from, either in their learning centre or in the papers they list as useful reading. The earliest scientific studies in the 1990s measuring mutations using pedigrees found one mutation per 40 generations. That work was looking at a disease mutation and not the hot spots with more frequent mutations, which many of us have, so perhaps they justified it for us since for pedigree matching a neutral hot spot is a good match. But the science has moved on, and 100 generations per mutation on average is now what the scientists are saying. So it takes 100 generations, or 2500 years, for a 50% chance of an MRCA with one genetic difference (GD).



Can you tell me more about the 50% chance of a match within 5 generations (125 years), since I and some of my zero-difference matches have genealogies which go back 200-300 years and we are still in different countries?


Basically it is difficult to measure the mutation rate accurately, and different methods of measuring it give different answers. Original estimates varied by a factor of 100. What the science has found is complicated, and after intense debate it is still not completely understood. At present it is accepted that the phylogenetic rate gives a time between mutations up to 10 times longer than that measured from pedigrees. Different parts of the mtDNA mutate at different rates and the impact on the person and their descendants is different. In mitochondrial DNA the mutations may not be neutral, i.e. they may have an impact on health. As an extreme example, if a mutation caused a woman to be infertile (the worst case) then we would not see it today, since she would have no descendants. Other less serious impairments would also be selected against over time if the woman had fewer children. In biology this is known as purifying selection.

The result as measured now is a time-dependent mutation rate, since harmful mutations get filtered out by purifying selection and thus cannot be seen in modern populations. So the very old mutations appear to have a longer mutation time than the modern ones.

A later paper, Howell 2003, pools the mutations measured by 11 different teams using slightly different methods. In total they found 28 mutations in 2633 generations, so about 1 mutation per 100 generations, with a range from 50 to 150 generations.


One of the papers, Sigurdottir 2001, used the known ancestral lines of more than 800 women from Iceland and compared their mutations. This is exactly what we are trying to use our test results for and so may be the best comparison. Their particular result gives 232 generations for a 50% chance of a common ancestor, but with a huge uncertainty of between 15 and 500 generations. A large number of mutations are needed to get a lower uncertainty, and mtDNA clearly does not mutate that often.

It is interesting to see that the average time between mutations using the phylogenetic approach of Behar agrees very well with the 100 generations of the Howell review since 27 years is a typical generation time.
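The comparison between the two approaches is simple division, and a minimal Python sketch makes it easy to check. All four input numbers are the ones quoted in this FAQ; the variable names are only illustrative.

```python
# Pedigree approach: pooled figures quoted from the Howell 2003 review.
HOWELL_MUTATIONS = 28
HOWELL_GENERATIONS = 2633

# Phylogenetic approach: average time between H4 sub-clade mutations
# (from the Behar data) and the typical generation time quoted in the text.
BEHAR_AVERAGE_YEARS = 2700
YEARS_PER_GENERATION = 27

howell_generations = HOWELL_GENERATIONS / HOWELL_MUTATIONS
behar_generations = BEHAR_AVERAGE_YEARS / YEARS_PER_GENERATION

print(f"Howell: about {howell_generations:.0f} generations per mutation")
print(f"Behar:  about {behar_generations:.0f} generations per mutation")
```

The pooled pedigree figure works out at about 94 generations per mutation and the phylogenetic average at 100, which is the agreement described above.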

Even more recently, Kivisild 2015 has reviewed the findings from ancient DNA, which comes from human burials that can be accurately dated by radiocarbon testing. These have given some calibration points for the mutation time: 1 mutation per 2500 years on average, so again 100 generations.


Yet despite these long times some people have different mutations from their mothers and their sisters, and an example of this was found in the U haplogroup. The degree of heteroplasmy changes; heteroplasmy is the start of a mutation and leads to different versions of mtDNA in different parts of the body, but of course only those in the ovaries get passed on to children.

The exact frequency of mutations is still not completely known, and sites do mutate back and forth. The new sequencing machines can see subtleties that could not be seen even a few years ago, and we will see more developments as the state of the art progresses. The latest information says, for example, that children of older mothers tend to have more mutations.


Howell 2003 “The Pedigree Rate of Sequence Divergence in the Human Mitochondrial Genome”

Sigurdottir 2001 “The Mutation Rate in the Human mtDNA Control Region”

Kivisild 2015 “Maternal ancestry and population history from whole mitochondrial genomes” (review) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367903/


Is it better to have a hundred exact FGS matches or 2 for genealogy purposes?

Somewhat surprisingly, it is better to have 2 exact matches, because that implies that the mutation is more recent and hence may be within the written history of peoples or even the range of genealogy records. If you have a hundred exact matches there are two options: either a more recent ancestor had a huge family and many of her descendants are keen on genetic genealogy, or the mutation was much longer ago and has had time to be carried by more people. This last case is sadly more likely.


Which of my exact matches do I start with?

Start with the ones in the same geographical area as your ancestor. Then work back as far as you can go with the paper trail, working together with your matches.

I am not mathematically minded but is there a calculator that I can use?

Calculators are available for YDNA and one can be adapted for use with mtDNA.

See this web page: http://clandonaldusa.org/index.php/tmrca-calculator

Its settings need to be changed before you can use it for mtDNA, so that they reflect the most recent measurements of the mutation time for mtDNA:

Put 16500 into the number of markers box

Put 1 into the number of non-matching markers box

Put 5.2e-7 into the mutation rate box (just write it as it looks)

Tick the cumulative probability box

Create graph or generate list


Then check that you have put all the figures in correctly by seeing that the curve shows about 0.5 on the vertical axis at about 100 generations on the horizontal axis.

You can then use it to look at the TMRCA for your matches.

If you put in zero for the non-matching markers then you will also see that the relationship between time and probability is not a straight line.
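For anyone who would rather check the numbers offline, the same kind of curve can be sketched in Python. This is not the calculator's own code, and its exact formula is not given here. The sketch assumes a simple textbook model: mutations arrive at random (Poisson) along both maternal lines, and no number of generations is favoured in advance. The function name and defaults are made up for this example; the 16500 markers and the 5.2e-7 rate are the values entered in the steps above.

```python
import math

MARKERS = 16500   # number of mtDNA positions, as entered in the calculator
RATE = 5.2e-7     # mutations per position per generation, as entered


def cumulative_probability(generations, mismatches=0,
                           markers=MARKERS, rate=RATE):
    """Chance that the common ancestor lived within `generations`.

    Assumed model (not taken from the calculator): mutations are Poisson
    along both lines of descent, with a flat prior on the number of
    generations, so the TMRCA follows a gamma distribution with shape
    mismatches + 1.
    """
    # Expected number of differences between two people after this time.
    expected = 2 * markers * rate * generations
    # Gamma CDF for a whole-number shape, written as a finite sum.
    tail = sum(expected ** i / math.factorial(i)
               for i in range(mismatches + 1))
    return 1 - math.exp(-expected) * tail


# One mismatch at 100 generations: the check suggested for the calculator.
print(f"{cumulative_probability(100, mismatches=1):.2f}")

# Exact match: the curve bends over instead of rising in a straight line.
for g in (5, 15, 40, 100):
    print(g, f"{cumulative_probability(g, mismatches=0):.2f}")
```

Under these assumptions one mismatch gives about 0.5 at 100 generations, matching the check above, and an exact match gives a little over 0.2 at 15 generations.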



[Table: probability of a match within a given number of generations]

FTDNA are re-writing some of their guidance with the phrase “In general those who share the same direct maternal ancestor within the last 15 generations should have mtDNA results which match exactly”. Using this model, 15 generations is about a 20% chance, and that corresponds pretty well.

So “should have” means an 80% chance of no mutation.