comparing DNA match data - redux #dna
I had two known paternal cousins who had matches in that region, and none of the other 28 triangulated with both of them. So I had only two valid paternal matches.
There were five who matched each other (triangulated) and they were presumed Maternal matches. None of them were known relatives and the total match was too small to expect to find the relationship through genealogical trees.
The other 23 large matches did not triangulate with anyone. These were all false matches or, as I prefer to call them, pseudomatches.
The final determination is that in MyHeritage, over 76% (23 of 30) of the large (bigger than 25 cM) matches are false matches. They are not real. This is undeniable data.
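The percentage can be rechecked with a few lines of Python. This is only a sketch of the arithmetic, assuming the pool of large matches is the two known cousins plus the 28 others described in this post; the variable names are made up for illustration.

```python
# Counts as given in the post (variable names are illustrative only).
paternal = 2          # known paternal cousins
maternal = 5          # matches that triangulated with each other
untriangulated = 23   # matches that triangulated with no one

total = paternal + maternal + untriangulated
false_share = untriangulated / total

print(f"{untriangulated} of {total} = {false_share:.1%}")  # 23 of 30 = 76.7%
```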
--
Bob Smiley
Kirkland, Washington USA
Dorann Jacobs Cafaro
Charleston, SC
732 687-5318
Researching Grauer, Dray, Marcus, May, Weisbach, Hecht, Cohen, Kahn
(Dray being a brick wall)
Yellow - Largest estimate for a given segment
Blue - Segment only reported by one site
MyHeritage
Chromosome | Start Location | End Location | Start RSID | End RSID | Centimorgans | SNPs |
6 | 1.056E+08 | 112518090 | rs1417736 | rs7738951 | 8.7 | 3584 |
6 | 1.516E+08 | 1.603E+08 | rs4489165 | rs220729 | 13.5 | 5632 |
7 | 4776780 | 46390788 | rs75662232 | rs898930 | 62.2 | 26368 |
7 | 67594586 | 82373405 | rs7777871 | rs35847872 | 17.3 | 6656 |
16 | 88165 | 2908703 | rs2541696 | rs8046218 | 6.1 | 1920 |
22 | 36604605 | 41736267 | rs132744 | rs5751080 | 7.8 | 2816 |
FTDNA
Chromosome | Start Location | End Location | Centimorgans | Matching SNPs |
6 | 1.063E+08 | 112621250 | 7.94 | 2370 |
6 | 1.51E+08 | 1.601E+08 | 14.86 | 3984 |
7 | 4940038 | 47102645 | 64.32 | 18302 |
7 | 68622989 | 83429355 | 16.03 | 4575 |
13 | 81485364 | 91104009 | 6.02 | 2546 |
14 | 35853388 | 44331104 | 6.50 | 2412 |
22 | 36639742 | 44744200 | 11.96 | 3146 |
GEDmatch
Chr | B37 Start Pos'n | B37 End Pos'n | Centimorgans (cM) | SNPs | Segment threshold | Bunch limit | SNP Density Ratio |
2 | 2.274E+08 | 2.319E+08 | 7.2 | 722 | 201 | 120 | 0.28 |
6 | 1.056E+08 | 112345811 | 8.3 | 1177 | 222 | 133 | 0.36 |
6 | 1.515E+08 | 1.599E+08 | 12.7 | 1704 | 190 | 114 | 0.33 |
7 | 4646190 | 46618644 | 61.2 | 8710 | 181 | 108 | 0.34 |
7 | 75059630 | 83311410 | 10.5 | 1470 | 191 | 114 | 0.35 |
14 | 30714764 | 33709003 | 8.1 | 418 | 199 | 119 | 0.25 |
22 | 36570688 | 45118533 | 14.7 | 1174 | 192 | 115 | 0.25 |
--
Steven Usdansky
usdanskys@...
USDANSKY (Узданский): Turec, Kapyl, Klyetsk, Nyasvizh, Slutsk, Grosovo
SINIENSKI: Karelichy, Lyubcha, Navahrudak
NAMENWIRTH: Bobowa, Rzepiennik
SIGLER: "Minsk"
Check out Kitty Cooper's blog on Ashkenazi DNA matches. Because of endogamy within the Ashkenazi population, we need to use a much higher threshold.
"Finding relatives is difficult because all of us Ashkenazim are related multiple times, both way back when and more recently. Most AJs look like 4th or 5th cousins to each other even when that is not the case. Cousin marriages, uncle-niece marriages, and other close family marriages abound in our trees. In my own family, on my Jewish line, my great grandmother fixed up her sister with her husband’s brother to get that dowry for the family business so I have double third cousins. Click here for my article from back in 2014 that suggested that we are all descended from 350 people in the 1300s." https://blog.
She also has an older one you can check out at her blog site.
Carolyn Lea (nee Schwarzbaum)
OKC, OK
Schwarzbaum(Posen/NY/Georgia, US)
Lewisohn/Levison
Rothschild
Wittkowski
Basch
MyHeritage
Chromosome | Start Location | End Location | Start RSID | End RSID | Centimorgans | SNPs |
1 | 109759998 | 151341570 | rs2924 | rs41284998 | 25.2 | 7552 |
2 | 79935344 | 89125131 | rs76042919 | rs2012201 | 8.3 | 3712 |
4 | 37349537 | 55823638 | rs13120150 | rs7684211 | 14.1 | 7168 |
5 | 149420698 | 167290121 | rs2304070 | rs116611957 | 20.2 | 9600 |
10 | 72680375 | 79844239 | rs7079039 | rs7100515 | 6.9 | 3968 |
10 | 108448200 | 115522272 | rs1252035 | rs2419878 | 7.6 | 3968 |
11 | 36423369 | 63143137 | rs12290256 | rs201479912 | 15.2 | 10368 |
12 | 5906873 | 12094208 | rs6489659 | rs78137435 | 11.7 | 2688 |
18 | 11225618 | 19261413 | rs12457940 | rs11662721 | 6.3 | 2048 |
20 | 53357605 | 56460394 | rs74422090 | rs6099816 | 8.3 | 2304 |
FamilyTree DNA
Chromosome | Start Location | End Location | Centimorgans | Matching SNPs |
1 | 109727832 | 113810931 | 6.38 | 1123 |
2 | 79803287 | 88999248 | 8.68 | 1436 |
4 | 37250289 | 55984959 | 15.47 | 2816 |
5 | 149294495 | 159073110 | 9.55 | 2197 |
10 | 108316278 | 115620108 | 8.01 | 1649 |
11 | 36376603 | 44312566 | 6.71 | 1419 |
12 | 5775875 | 12208613 | 11.94 | 1718 |
18 | 11229655 | 19485946 | 6.91 | 847 |
20 | 54409617 | 56478627 | 6.17 | 831 |
GEDMatch
Chr | B37 Start Pos'n | B37 End Pos'n | Centimorgans (cM) | SNPs | Segment threshold | Bunch limit | SNP Density Ratio |
4 | 37,250,289 | 49,061,848 | 12.4 | 1,721 | 202 | 121 | 0.32 |
5 | 149,301,335 | 158,958,254 | 10.2 | 1,630 | 195 | 117 | 0.32 |
10 | 108,859,104 | 115,614,740 | 8.1 | 1,164 | 198 | 118 | 0.33 |
11 | 36,412,655 | 44,311,968 | 7 | 1,095 | 196 | 117 | 0.3 |
12 | 6,009,595 | 12,208,578 | 12.1 | 1,226 | 184 | 110 | 0.37 |
20 | 53,438,513 | 56,478,015 | 7.8 | 884 | 195 | 117 | 0.4 |
Some idea about where the discrepancy occurs came from a note in the FTDNA report: the section on Chr 1 where MyHeritage reported a 25.2cM segment match was marked by FTDNA as a "SNP poor region: not tested for Family Finder."
The bottom line for my purposes is that the MH report in this case was misleading and a more realistic reading (3 out of 4 test sites) suggests this match falls below the threshold where I could reasonably expect to find a family tree connection. I do want to point out that in the other matches I reviewed, MyHeritage was often within range of the other services' results and my experience in this case does not mean that MH testing should be deprecated generally. I also want to note that I was fortunate to have the cooperation of the other test taker, who engaged in an exchange to explore a possible connection, shared pointers to other sites where she was tested and provided access to her family tree. Thanks also to those who responded to my post.
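For anyone who wants to repeat this kind of side-by-side reading, here is a rough Python sketch. It uses only the chromosome and cM columns transcribed from the three segment tables above and adds up what each site reports overall and per chromosome. The names `SEGMENTS`, `totals`, `largest` and `by_chr` are illustrative, and start/end positions are ignored, so this compares sizes only and does not check whether segments actually overlap.

```python
from collections import defaultdict

# (chromosome, cM) pairs transcribed from the three segment tables above.
SEGMENTS = {
    "MyHeritage": [(1, 25.2), (2, 8.3), (4, 14.1), (5, 20.2), (10, 6.9),
                   (10, 7.6), (11, 15.2), (12, 11.7), (18, 6.3), (20, 8.3)],
    "FTDNA": [(1, 6.38), (2, 8.68), (4, 15.47), (5, 9.55), (10, 8.01),
              (11, 6.71), (12, 11.94), (18, 6.91), (20, 6.17)],
    "GEDmatch": [(4, 12.4), (5, 10.2), (10, 8.1), (11, 7.0), (12, 12.1), (20, 7.8)],
}

# Total shared cM and largest single segment per site.
totals = {site: round(sum(cm for _, cm in segs), 2) for site, segs in SEGMENTS.items()}
largest = {site: max(cm for _, cm in segs) for site, segs in SEGMENTS.items()}

# cM per chromosome per site, so gaps such as Chr 1 stand out.
by_chr = defaultdict(dict)
for site, segs in SEGMENTS.items():
    for chrom, cm in segs:
        by_chr[chrom][site] = round(by_chr[chrom].get(site, 0) + cm, 2)

for chrom in sorted(by_chr):
    print(chrom, by_chr[chrom])
print("totals:", totals)
print("largest:", largest)
```

With the figures as transcribed, the totals come to 123.8 cM (MyHeritage), 79.82 cM (FTDNA) and 57.6 cM (GEDmatch), and Chr 1 accounts for the single biggest gap.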
Lee David Jaffe
===============
Surnames / Towns: Jaffe / Suchowola and Bialystok, Poland ; Stein (Sztejnsapir) / Bialystok and Rajgrod ; Roterozen / Rajgrod ; Joroff (Jaroff, Zarov) / Chernigov, Ukraine ; Schwartz (Schwarzstein) / Ternivka, Ukraine ; Weinblatt / Brooklyn, Perth Amboy, NJ ; Koshkin / Snovsk, Ukraine ; Rappoport / ? ; Braun / Wizajny, Suwalki, Ludwinowski / Wizajny, Suwalki
"MyHeritage have updated their matching algorithms since I wrote that post. However, there is still a very high false positive rate because they accept uploads in different file formats and there is very little overlap in the SNPs used for matching on the different chips. See:
https://isogg.org/wiki/Autosomal_SNP_comparison_chart
I don’t trust matches sharing below about 40 cM at MyHeritage and I’ve also found some problematic matches sharing between 40 and 50 cM. The matches at AncestryDNA are much more reliable.
I have two kits at MyHeritage. I did a new test on the Global Screening Array and with that kit 47% of my matches don’t match either of my parents. My other kit is a transfer from Ancestry and with that kit 38% of my matches don’t match either of my parents. I also get over a thousand more matches with the transfer kit.
With Jewish matches at MyHeritage I would suggest you ignore all the low and medium confidence matches and only work with matches where the largest segment is 30 cM or more and where there are preferably at least two reasonable-sized segments. If you look in the chromosome browser you’ll probably find that a lot of the matches where you appear to share high cM totals are actually made up of lots of really tiny segments that are more likely to be false. You may want to recalibrate the total cM shared and remove all the segments under 10 cM to get a more realistic estimate of the total cM shared."
I am aware that a lot of people mistakenly put faith in smaller segments, when many of these are false positives and should be ignored; some of these are the ones MH joins together to make larger segments. Debbie recommends looking only at matches where the largest segment is 30 cM or more and there are other reasonably sized segments, and ignoring the smaller ones under 10 cM. That is what I normally do anyway, though I have personally looked for one segment of 25 cM plus, rather than 30 cM, as these seem to be more prevalent. Again, though, I am also aware that some chromosomes are more likely to produce a larger segment of around 25 cM, which has a skewing effect, so 30 cM would avoid that.
This strategy would avoid those matches where clearly no recent family tree relationship can be found, and help in resolving the problem of endogamy.
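The recalibration described above can be sketched in Python. This is an illustration only: the segment sizes are the ten MyHeritage segments listed elsewhere in this thread (total 123.8 cM), the 10 cM and 30 cM cut-offs are the ones quoted above, and the function and variable names are invented for the example.

```python
# cM sizes of the ten MyHeritage segments listed in this thread.
SEGMENTS_CM = [25.2, 8.3, 14.1, 20.2, 6.9, 7.6, 15.2, 11.7, 6.3, 8.3]

MIN_SEGMENT_CM = 10   # drop segments smaller than this
MIN_LARGEST_CM = 30   # suggested minimum size for the largest segment


def recalibrate(segments, min_segment=MIN_SEGMENT_CM):
    """Return the segments kept and their total after dropping small ones."""
    kept = [cm for cm in segments if cm >= min_segment]
    return kept, round(sum(kept), 1)


reported_total = round(sum(SEGMENTS_CM), 1)
kept, adjusted_total = recalibrate(SEGMENTS_CM)

# Suggested filter: largest segment of 30 cM or more, plus at least
# two segments that survive the 10 cM cut.
passes_filter = max(SEGMENTS_CM) >= MIN_LARGEST_CM and len(kept) >= 2

print(f"reported {reported_total} cM, adjusted {adjusted_total} cM")
print("worth pursuing under the suggested filter:", passes_filter)
```

For this match the total drops from 123.8 cM to 86.4 cM, and the match fails the 30 cM test because its largest segment is 25.2 cM.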
Jill Whitehead, Surrey, UK
The dramatic differences between My Heritage and Ancestry are most noticeable when Ancestry thinks the total is <90cM. Above that threshold, TIMBER is not in effect. For Nancy's match to Mary, Ancestry calculates this as 67cM so TIMBER is in effect and they downweight this to 21cM. [Most of the matches you list above are well above 90cM.] Ancestry is basically saying that 2/3 of the DNA shared is too common to be genealogically meaningful. Some folks don't like TIMBER but my experience is that it is very useful, and Ancestry numbers are far more accurate than My Heritage. I doubt this is a traceable match.
The main question is why Ancestry says the total is 67cM and My Heritage says 124cM. If Mary doesn't transfer to GEDmatch, we're left guessing, but given the problems My Heritage has with imputing data, I wouldn't be surprised if that accounts for the bulk of the difference. One researcher who tested both her parents found that 32% of all matches had no match with either of her parents. https://cruwys.blogspot.com/2018/01/a-chromosome-browser-and-new-matching.html That's a lot of bad data!
--
Steve Toub
My original post featured two DNA match reports, one from MyHeritage and another from Ancestry, for my mother Nancy and an unknown person Mary. MyHeritage reported total shared DNA of 123.8 cM (1.8%), in 10 segments, the largest being 25.2cM (and 3 other segments larger than 10cM). In turn, Ancestry reported 21cM (66cM unweighted) across 7 segments with the longest 15cM. I asked the list for help understanding such a wide discrepancy in these reports.
I received several responses, here and privately, most sharing a consistent view that Ancestry's algorithms are more conservative while MyHeritage often inflates its reports. I don't want to go into a lot of detail here (you can see my original post and the full exchange at https://groups.jewishgen.org/g/main/message/673482) but here is one representative quote:
These responses were alarming. Anyone who has spent any time trying to use DNA to further their research – especially in Ashkenazi inheritance where endogamy muddies the waters – will recognize that a DNA match of 123.8 with a 25.2 segment is considered significant while a match of 21cM (or even 66cM) and a 15cM longest segment is to be given lower priority. Processing what to do with this information, I wondered whether other matches I'd been exploring also exhibited such variation and were correspondingly questionable.
I reviewed a table of my mother's DNA matches collected from Ancestry, MyHeritage, GEDMatch and FTDNA (see below). For the 22 instances where I found matches from two or more testing sites, the average (mean) difference between the highest and lowest reported total was 18 cM (when the 102.8 cM discrepancy reported for Mary was removed from the calculation). Seven of the 21 (again, excluding Mary) posted above-average variations: the largest was 71.6 cM and the smallest 1.2 cM. In all but a few cases the variation was not significant and would not affect a determination about which matches to pursue.
At the same time, a review of the data strongly suggests that assertions that MyHeritage inflates its reports and Ancestry is more conservative don't hold up. In fact, Ancestry reported a higher total cM than MyHeritage did in 7 of the 9 instances where a head-to-head comparison was available. And Ancestry posted the highest total cM, compared to any other test site, in half (11 of 22) of the cases.
For my part, this leaves me still looking for an answer to my original question: What can I make of two tests between my mother and Mary reporting such extremely different results? If I can't explain it away with the algorithms used by the two testing services, what is the explanation? And, on the practical side, which test do I trust? I understand that a 3rd party comparison is most likely to answer that question, if Mary is willing to copy her results to GEDMatch. I've asked her but she hasn't responded so far. Even if this individual case is resolved eventually what do we make of such discrepancies? This reminds me of something I found in a fortune cookie: "A person with one watch is certain of the time. A person with two watches isn't sure." Testing with more than one service is supposed to help with our family history research, not create more uncertainty.
Lee David Jaffe
Match | Site | Total cM |
A.S. | FTDNA | 582 |
A.S. | MyHeritage | 590.4 |
A.S. | Ancestry | 598 |
A.S. | GEDmatch | 603.3 |
B-D | GEDMatch | 99.4 |
B-D | FTDNA | 105 |
B.F. | FTDNA | 62 |
B.F. | MyHeritage | 63.2 |
B.G. | GEDMatch | 118.8 |
B.G. | FTDNA | 130 |
B.G. | MyHeritage | 131.5 |
B.H. | GEDmatch | 57.3 |
B.H. | MyHeritage | 128.9 |
B.W. | GEDMatch 5 | 60.1 |
B.W. | MyHeritage | 69.8 |
D.H. | GEDMatch | 175.7 |
D.H. | Ancestry | 205 |
G.L. | MyHeritage | 183.3 |
G.L. | Ancestry | 197 |
H.R. | GEDMatch | 192.9 |
H.R. | Ancestry | 195 |
J.F. | MyHeritage | 85 |
J.F. | Ancestry | 106 |
J.P. | MyHeritage | 138.6 |
J.P. | Ancestry | 148 |
J.Q. | FTDNA | 85 |
J.Q. | GEDMatch | 92.2 |
J.Q. | MyHeritage | 96 |
M.B | FTDNA | 287 |
M.B | GEDmatch | 287 |
M.B | MyHeritage | 300.2 |
M.L. | GEDMatch | 139.2 |
M.L. | Ancestry | 143 |
M.R. | FTDNA | 220.9 |
M.R. | GEDMatch | 221.8 |
M.R. | MyHeritage | 223.5 |
M.R. | Ancestry | 234 |
M.R. | FTDNA | 239 |
Mary | Ancestry | 21 |
Mary | MyHeritage | 123.8 |
N.T. | FTDNA | 125 |
N.T. | GEDMatch | 130.2 |
N.T. | Ancestry | 150 |
N.T. | MyHeritage | 153.5 |
P.G. | GEDMatch | 168.8 |
P.G. | GEDMatch | 172.1 |
P.G. | GEDMatch | 172.2 |
P.G. | MyHeritage | 197.9 |
P.G. | Ancestry | 207 |
R.A. | GEDMatch | 221.4 |
R.A. | Ancestry | 258 |
R.F. | GEDmatch | 315.4 |
R.F. | Ancestry | 324 |
S.H. | GEDMatch | 105.3 |
S.H. | MyHeritage | 115 |
S.H. | Ancestry | 122 |
S.R. | GEDMatch | 104.3 |
S.R. | Ancestry | 118 |
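The summary figures quoted in the post (mean spread, above-average cases, Ancestry versus MyHeritage) can be rechecked from the table with a short Python sketch. The totals are transcribed from the table above; site spellings are normalized to a single form (including the "GEDMatch 5" entry, which is an assumption about what that label means), and all names in the code are illustrative.

```python
# Total shared cM per match and site, transcribed from the table above.
# Site spellings are normalized ("GEDMatch", "GEDMatch 5" -> "GEDmatch").
TOTALS = {
    "A.S.": [("FTDNA", 582), ("MyHeritage", 590.4), ("Ancestry", 598), ("GEDmatch", 603.3)],
    "B-D": [("GEDmatch", 99.4), ("FTDNA", 105)],
    "B.F.": [("FTDNA", 62), ("MyHeritage", 63.2)],
    "B.G.": [("GEDmatch", 118.8), ("FTDNA", 130), ("MyHeritage", 131.5)],
    "B.H.": [("GEDmatch", 57.3), ("MyHeritage", 128.9)],
    "B.W.": [("GEDmatch", 60.1), ("MyHeritage", 69.8)],
    "D.H.": [("GEDmatch", 175.7), ("Ancestry", 205)],
    "G.L.": [("MyHeritage", 183.3), ("Ancestry", 197)],
    "H.R.": [("GEDmatch", 192.9), ("Ancestry", 195)],
    "J.F.": [("MyHeritage", 85), ("Ancestry", 106)],
    "J.P.": [("MyHeritage", 138.6), ("Ancestry", 148)],
    "J.Q.": [("FTDNA", 85), ("GEDmatch", 92.2), ("MyHeritage", 96)],
    "M.B": [("FTDNA", 287), ("GEDmatch", 287), ("MyHeritage", 300.2)],
    "M.L.": [("GEDmatch", 139.2), ("Ancestry", 143)],
    "M.R.": [("FTDNA", 220.9), ("GEDmatch", 221.8), ("MyHeritage", 223.5),
             ("Ancestry", 234), ("FTDNA", 239)],
    "Mary": [("Ancestry", 21), ("MyHeritage", 123.8)],
    "N.T.": [("FTDNA", 125), ("GEDmatch", 130.2), ("Ancestry", 150), ("MyHeritage", 153.5)],
    "P.G.": [("GEDmatch", 168.8), ("GEDmatch", 172.1), ("GEDmatch", 172.2),
             ("MyHeritage", 197.9), ("Ancestry", 207)],
    "R.A.": [("GEDmatch", 221.4), ("Ancestry", 258)],
    "R.F.": [("GEDmatch", 315.4), ("Ancestry", 324)],
    "S.H.": [("GEDmatch", 105.3), ("MyHeritage", 115), ("Ancestry", 122)],
    "S.R.": [("GEDmatch", 104.3), ("Ancestry", 118)],
}


def spread(rows):
    """Highest minus lowest reported total for one match."""
    values = [cm for _, cm in rows]
    return round(max(values) - min(values), 1)


def best(rows, site):
    """Highest total reported by one site, or None if the site is absent."""
    values = [cm for s, cm in rows if s == site]
    return max(values) if values else None


spreads = {name: spread(rows) for name, rows in TOTALS.items()}
others = {n: s for n, s in spreads.items() if n != "Mary"}  # exclude the outlier
mean_spread = sum(others.values()) / len(others)
above = [n for n, s in others.items() if s > mean_spread]

# Head-to-head: matches that have both an Ancestry and a MyHeritage total.
pairs = [(best(r, "Ancestry"), best(r, "MyHeritage")) for r in TOTALS.values()]
head_to_head = [(a, m) for a, m in pairs if a is not None and m is not None]
ancestry_higher = sum(a > m for a, m in head_to_head)

# Matches where Ancestry reported the highest total of any site.
ancestry_top = sum(1 for rows in TOTALS.values()
                   if max(rows, key=lambda r: r[1])[0] == "Ancestry")

print(f"mean spread excluding Mary: {mean_spread:.1f} cM; above average: {above}")
print(f"Ancestry higher than MyHeritage in {ancestry_higher} of {len(head_to_head)}")
print(f"Ancestry highest overall in {ancestry_top} of {len(TOTALS)}")
```

Run against the table as transcribed, this reproduces the numbers in the post: a mean spread of about 18 cM across 21 matches, seven above-average cases, Ancestry higher than MyHeritage in 7 of 9 head-to-head comparisons, and Ancestry highest overall in 11 of 22.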