Monday, August 4, 2014

How to Stop the Circulation of Bad Family Tree Data

When I first got into genealogy, I was astounded and delighted by the amount of information I found in already-created family trees online.  With the click of a few buttons, someone else's entire GEDCOM could be merged into my own tree.  Hundreds - or even thousands - of years of ancestry scored in a matter of a few minutes!  Online genealogy research sites have made it easy to research, publish, share and copy family trees, even if you have zero genealogy research experience.  They have also made it terribly easy to amplify the errors of the inexperienced exponentially.

So, then came the part where I realized most of what I got from others was trash.  I spent days - nay, weeks - weeding out individuals and relationships that were not possible or were not borne out by proof.  Amidst all my grumbling and complaining and kicking myself for ever having used those trees to begin with, I started to see some common threads among the inaccuracies.  I think much of it comes from a fundamental misunderstanding of what constitutes proof versus evidence; some of it is wishful thinking, and some of it is just honest mistakes.

Whatever combination of these it is, some "genealogists" leap over impossible hurdles to make connections that are not there.  Mothers give birth to babies posthumously or at the age of two.  Fathers father children years after their own deaths.  Mothers have children a mere two months apart.  Mothers and fathers have babies at the age of a hundred.  There was a king or queen or famous person once, who lived in the same city, and they had an unknown child when they were 205 years old who was my 32nd great-grandfather.  What?
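Most of these impossibilities can be caught with simple date checks before a copied individual ever lands in your tree.  Here is a minimal sketch in Python.  The `Person` record, the `check_parent_child` function and every threshold below are my own assumptions for illustration, not features of any genealogy site or program, and it works with years only, so it would not catch siblings born two months apart.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds for illustration; adjust them to taste.
MIN_PARENT_AGE = 13
MAX_MOTHER_AGE = 55
MAX_FATHER_AGE = 80
MAX_LIFESPAN = 110


@dataclass
class Person:
    name: str
    birth: Optional[int] = None  # year, if known
    death: Optional[int] = None  # year, if known


def check_parent_child(parent: Person, child: Person, is_mother: bool) -> List[str]:
    """Return human-readable warnings for one parent-child link."""
    problems = []
    if parent.birth is not None and parent.death is not None:
        lifespan = parent.death - parent.birth
        if lifespan > MAX_LIFESPAN:
            problems.append(f"{parent.name} lived {lifespan} years")
    if child.birth is None:
        return problems  # nothing more can be checked without a birth year
    if parent.birth is not None:
        age = child.birth - parent.birth
        oldest = MAX_MOTHER_AGE if is_mother else MAX_FATHER_AGE
        if age < MIN_PARENT_AGE:
            problems.append(f"{parent.name} was only {age} when {child.name} was born")
        elif age > oldest:
            problems.append(f"{parent.name} was {age} when {child.name} was born")
    if parent.death is not None:
        # A father can die up to a year before the birth; a mother cannot.
        grace = 0 if is_mother else 1
        if child.birth > parent.death + grace:
            problems.append(f"{child.name} was born after {parent.name} died")
    return problems


# Hypothetical people, not taken from any real tree.
mother = Person("Hypothetical mother", birth=1100, death=1150)
print(check_parent_child(mother, Person("Child A", birth=1102), is_mother=True))
# ['Hypothetical mother was only 2 when Child A was born']
```

An empty list does not prove a relationship.  It only means the dates are not obviously impossible.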

I have, in my own family, several examples of these myths becoming fact without any real logical foundation. One of them is the myth of Jordanus de Sheppey who is, 90% of the time, mistakenly reported to be the son of King Harold II.

The Norwood line can be traced fairly unerringly back to a man named Jordanus de Sheppey.  The myth in the Norwood clan (in short) is that during his life, Harold Godwinson (King Harold II) had a son who hid away on the Isle of Sheppey and changed his name to Jordanus, becoming Jordanus de Sheppey.  There is absolutely nothing in the way of proof tying King Harold to the Norwood family and it is actually mathematically impossible that Harold's son is 'our' Jordanus.  [3]  But despite the story failing basic proof and logic tests, it continues to be propagated - and even argued - mostly because, I think, being related to a king is better than being related to a... brick wall.

Question Books and Authorities

Although someone will sometimes cite a Wikipedia article, in my experience, where genealogy research is concerned, books and 'authorities' are preferred sources over websites or hearsay.  Often, family trees will cite a book or what is thought to be an authority on the subject at hand.  However, genealogy books and 'authorities' are sometimes in error too.

Genealogy books, in particular, are often self-published.  This means that they have no editor or publisher to answer to for the veracity of their facts (not that this makes a book error-proof either, by any stretch).  The author presents, as is, the results of their research for others to use.  This, in large part, is a good thing because it makes available a wealth of information and documentation on families that might otherwise not be accessible.  The other side of this windfall of published information is that it can be wrong.

For example, in the family myth I mentioned above, there is a book, The Norwoods III by G. Marion Norwood Callam, that is often cited as the source for the information linking King Harold to Jordanus de Sheppey.  This book in and of itself is an amazingly rich resource for Norwood genealogists.  Taken at face value, it seems to prove the connection between Jordanus and King Harold II.  However, when one digs into the dates the book presents for the lives and deaths of the various individuals, a problem appears.  Harold did have children, and we do have a Jordanus in our family.  But if Harold's son were really the husband of our Cicely and father of our Stephen, Cicely and Stephen would have had to be born much earlier and live much longer than the evidence about them suggests, or than is realistic.  The math just doesn't work.  [4]
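The arithmetic behind "the math just doesn't work" is short enough to spell out.  This sketch uses only the years given in footnote [4]; the constant and function names are mine, and the ages 13 and 16 are the footnote's assumptions about the youngest plausible age for a mother.

```python
# Years as reported in footnote [4].
STEPHEN_BIRTH_PER_BOOK = 1120   # Stephen's birth year according to the book
CHARTER_YEAR = 1200             # year of Cicely's charter
JORDANUS_DEATH_PER_BOOK = 1126  # Jordanus's death year according to the book


def min_age_at_charter(age_at_stephens_birth: int) -> int:
    """Cicely's minimum age in 1200 if she bore Stephen at the given age."""
    latest_birth_year = STEPHEN_BIRTH_PER_BOOK - age_at_stephens_birth
    return CHARTER_YEAR - latest_birth_year


print(min_age_at_charter(13))                  # 93
print(min_age_at_charter(16))                  # 96
print(CHARTER_YEAR - JORDANUS_DEATH_PER_BOOK)  # 74 years as a widow
```

These are minimums.  If Stephen was not her first child, or she was older than 16 when he was born, she would have been older still when the charter was written.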

Mrs. Callam researched this family to the best of her ability and, I'm sure, vetted her information in the best ways possible.  But she, like other genealogy authors, is a fallible researcher just like any of us.  It is possible, and common, for the research of authorities to be inaccurate.  In and of itself, information from a book does not qualify as proof.

Popularity Doesn't Indicate Truth

Often, a fact is repeated or trusted because it is so widely accepted - it must be true!  Surely someone more knowledgeable than me would have spoken up before now, were it not true!  Unfortunately, the clamor of the inaccuracies often drowns out the voices of reason.  It's just math - and Google algorithms.  One person posts their family tree on a genealogy website, and it has an error in it.  Over a couple of years, 10 people copy that erroneous individual (or the whole tree) into their own trees.  Over the next couple of years, 10 more people copy each of those erroneous individuals or trees.  A couple more years go by and 10 more people copy each of those errors into their own trees.  Now, over the course of 6 years, you have amplified one mistake to 1,000 mistakes, which is most likely a conservative figure, given how frequently I encounter some of my known family myth errors.
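For anyone who wants to play with the numbers, here is a small sketch of that growth.  The function name is mine, and it assumes every erroneous tree is copied exactly 10 times in each two-year round, which is a simplification.

```python
def erroneous_trees(rounds: int, copies_per_tree: int = 10):
    """Return (trees created in the latest round, all erroneous trees so far)."""
    newest = 1  # the original tree containing the error
    total = 1
    for _ in range(rounds):
        newest *= copies_per_tree  # every tree from the last round is copied
        total += newest
    return newest, total


# Three rounds of copying, two years each, as in the scenario above.
print(erroneous_trees(3))  # (1000, 1111)
```

The 1,000 figure counts only the newest copies.  Add the earlier copies and the original, and 1,111 trees carry the error.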

Meanwhile, you have a person or a few people - heck, even a hundred people - who have called out this error in a forum post or in a blog entry like this one.  First off, thanks to Google algorithms, which factor in popularity, it is HARD to find these dissents.  I found one very well-reasoned dissent on the error I mentioned above - and it has long since been removed from GeoCities, where it once was.  Secondly, unless every one of those thousand people with the error in their published tree reads that error report and takes action to remove the error from their tree, it will continue to get copied over and over for years to come.  Unfortunately, many of the trees that have the error in them are dead, either because their owner has forgotten about them or because the owner is otherwise unable to maintain them.  But those trees are still accessible and are regularly copied.

Insist Upon Proof

In short, evidence is the bits of documents and information a person gathers that can become proof of a conclusion.  A proof argument or statement is a written explanation giving logical reasons why the evidence bears out the conclusion.

In the example I used above, a piece of evidence might be Cicely's signed charter.  In and of itself, it is not proof of anything.  But when placed together with other pieces of evidence, it becomes possible to put together a proof argument for or against birth or death dates, or relationships with other individuals who have their own evidence associated with them.

When confronted by someone else's evidence, put together your own argument for proof to determine if their evidence holds up for you.  Do the math and check it against all available evidence, even if the source is an author or an authority.  Use this process to unearth more evidence that you might not have - or to debunk evidence that you might have associated with an individual in error.

Use Primary Sources

Primary sources are original sources of information that have not been altered.  A birth certificate is a primary source.  A book citing a birth certificate is not a primary source.  Until you have laid eyes on the birth certificate itself, it's hearsay.  Hearsay is questionable evidence, at best.  If you read it in a book or saw it in a family tree, go to the bibliography and find out where the author got the information, and then go to that source - and its source - until you see the original or a legible photo or photocopy of the original.

Therein lies the sticking point for many a genealogist.  Original documents are expensive to order when they exist, and they often do not exist at all.  The original of the above-mentioned charter from Cicely is a great example of an original that I can't get my hands on, if it even exists.  Worse, I have lost count of how many courthouses that held records of my family have burned down, destroying years of originals that will never be available to anyone.  Sometimes all we have available to us are secondary sources or even best guesses based upon what sources we have.  The lack of primary sources does not elevate secondary sources to primary sources or negate the need for proving your conclusions.  It just makes proving out that individual that much more difficult and should make us more cautious about drawing conclusions.

Responsible Publishing

If and when you publish your family tree, consider doing the following:

  • Cite all of your sources, all of the time, in their entirety.  Give someone else the chance to do their own due diligence - and potentially even turn up unintentional errors in your tree.  
  • Consider limiting who you share your tree with to only people who you have a research relationship with.  This will help limit the impact of any potential mistakes you might have made on the research 'community', as it were.
  • Make sure your contact information appears in the tree and potentially on every individual.  Depending upon the site you are using to share your tree, it might be possible for only individuals to be copied to other trees.  You want your information to follow them in those instances.
  • Don't publish individuals or facts you are unsure about - or indicate them clearly.  Others might not do their own due diligence - they might just trust you.  Make your data something they can trust and indicate to them where you aren't completely sure.  
  • Keep your trees updated.  If you cannot reasonably update family trees in three places at the same speed at which you uncover new information, do not publish three trees in three places.
  • Respond to people with questions.  If you published it, be ready to explain, elaborate or otherwise support what you have published.    
  • Be open to corrections.  If someone else finds an error, be open to modifying your tree to reflect the truth, even if it means not being related to that king you thought you were related to.

  1. Registrum Roffense, or a Collection of Ancient Records, Charters, and Instruments of Divers Kinds.  Transcribed by John Thorpe.  1769.  Printed by W and J Richardson.
  2. Callam, G. Marion Norwood. The Norwoods III. A Chronological History. GMNC, Pub. Printed by Gadds Printers, Ltd., Wenban Road, Worthing, BN11 1HZ. 1997. pp 22, 24
  3. King Harold lived 1022 – 1066.  If we work backwards up the family line from you/me and up, which is the proper way to determine ancestry, Jordanus would have lived about 1135 - 1197.  That's mostly guesswork because there is no actual documentation of his life - only of his children and wife.  But we can be reasonably sure of the range of possible death years based upon some tax and court records from Cicely and his son, Stephen, and based upon his children's birth dates, have a pretty good idea of the range of possible birth years.  Harold Godwinson was long dead by the time our Jordanus was born.  Jordanus is the brick wall of our family.
  4. As part of the case for Jordanus being the son of King Harold II, in the book, GMNC states that Sir Stephen de Northwode was born in 1120 and died in 1196 (76 years old) [2].  However, there are existing records of a charter that Stephen's mother, Cicely, wrote in the year 1200 naming her father and her sons, including Stephen [1], which are also cited by the book.  If Cicely was of childbearing age in 1120, Stephen's purported birth year (at least 13, but more than likely over 16), and was still alive in 1200 to write the charter, she would have been almost, if not fully, 100 years old when she signed the charter, which is unlikely.  Further, in the same book, GMNC says that Jordanus de Sheppey died at Salisbury in 1126 [2], which makes sense if he is the son of Harold - but that would mean that his wife Cicely and his children outlived him by 70-some-odd years, which does not make as much sense.


  1. ...and i am marie of roumania

  2. Thanks Carrie. Just came to same conclusion. If Jordanus was a descendant of Harold II, he'd probably have to be a grandson.

  3. Even though the King-to-Jordanus de Sheppey connection isn't likely, what a wonderful ancestral tapestry we Norwoods have back a thousand years. Most people I know don't even know who their two sets of grandparents are/were. We are so lucky to know our ancestors.

  4. my name is Simon Norwood in nc usa I am proud to be able to track back to even this man. I dont need to be attached to a king. Thank you to all the gene trackers in the family for putting together these charts.

  5. Thanks for getting something that refutes the connection between King Harold and Jordanus. When I first started my research 11 years ago I was thrilled to find such a rich history. But when I started to look at the dates I realized that something was definitely wrong. I've talked with medieval historians and they have said that King Harold died without having had children with Edith "LongNeck". If they did, no record has been found linking children to them. Another thing I learned: the Baron de Northwode line didn't continue with fathers and sons...about 1400 something a nephew inherited the title. I don't have that reference book title off hand but I remember finding it in my search for proving my Norwoods' connection to the de Northwodes, which I still haven't been able to do.