There are many frustrations in assembling data from many sources to create a pedigree database of this size. One of the major stumbling blocks we’ve hit is the problem of identifying dogs.
The root of the problem is that dogs generally don’t have a universal registration number. Irish wolfhounds have travelled the earth for many years, and these days it’s quite common to import dogs from other countries or show dogs in other countries. In some cases the dog will retain its original registration number. In other cases, it will get a new one. So how do we know whether it’s the same dog or not? In most cases we can check date of birth and parents to get a conclusive answer. Although it’s time-consuming, it’s possible. It’s quite another thing when we inherit someone’s database with tens of thousands of entries. Having one registration number that followed the dog would make this much easier.
There’s no foolproof method of handling this. Commercial pedigree software like Breedmate will use the dog’s name as an identifier. However, that’s not good practice either. We have several generic names which appear many times in our dataset. For instance, we have 14 Brans, 9 Taras and 9 Wolfs in the dataset. Which of these 14 Brans is the sire of a given dog? It gets even harder to tell now that we also have artificial insemination.
Are duplicate names something we used to have, but which has disappeared in these days of kennel names? No. The last of our 14 Brans was born in 1994. In addition, some breeders seem to recycle names, so they can have two dogs with the same kennel name and the same given name within a period of less than ten years. This obviously creates problems when it comes to finding out what the pedigree of a given dog actually is, especially when both dogs are bred from.
Another problem we’ve run into is a question of grammar. Many kennel prefixes are used with the genitive s, as in Pitlochry’s. In English, this is correct grammar. In many other languages, the apostrophe isn’t used, so Pitlochrys would be the correct form. Breeders aren’t language experts and neither are Kennel Club registrars, so sometimes things go wrong. Thus dogs are recorded both with and without the genitive apostrophe in different sources. Some well-meaning keepers of databases try to change these variants into what they believe is correct, which leads to further problems. Adding to this, there’s no established practice for handling the genitive apostrophe, even for the same breeder. For some litters, we will see two dogs listed with a genitive apostrophe, and two without.
The same goes for obvious misspellings. There are many dogs out there which are unfortunate enough to have their name misspelled by their breeder or the person registering it. This problem gets even worse when we try to transcribe from other alphabets into a standard Latin alphabet. Well-meaning show secretaries and keepers of databases try to correct the spelling, resulting in two distinctly different names, making it hard to see whether the dog has been registered previously or not.
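One way to spot name variants without "correcting" anybody's records is to compare names by a derived key and leave the stored names alone. The following Python sketch is only an illustration of that idea, not a description of how our database works: the function names name_key and candidate_duplicates are hypothetical, and the rule chosen (drop apostrophes and accents, ignore case and extra spaces) is an assumption. It will not catch real misspellings, and every group it returns still has to be checked against date of birth and parents.

```python
import unicodedata
from collections import defaultdict


def name_key(name: str) -> str:
    """Return a comparison key; the recorded name itself is never changed."""
    # Decompose accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop straight and typographic apostrophes (assumed rule).
    for apostrophe in ("'", "\u2019", "`"):
        stripped = stripped.replace(apostrophe, "")
    # Ignore case and collapse runs of whitespace.
    return " ".join(stripped.casefold().split())


def candidate_duplicates(names):
    """Group names sharing a key; each group is a candidate, not a verdict."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Hypothetical names, used only to show the grouping.
    sample = ["Pitlochry\u2019s Bran", "Pitlochrys Bran", "Tara"]
    for group in candidate_duplicates(sample):
        print(group)
```

Keeping the key separate from the name means both spellings survive as they were recorded in their sources.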
Back to where we started. All these problems would be gone if we had a common registration system, with kennel clubs respecting each other’s registration numbers. We are grateful that this is being worked on, but we have to live with these problems for the time being.
So how do we identify a dog once it has been entered into the database? By using standard practices in database design. Every dog has a unique identifier in the database. When selecting sire and dam, we add their unique identification numbers to the dog in question. So the relationship is expressed solely in numbers. A dog is thus represented like this:
ID: 18457, SireId: 10620, DamId: 18464.
When we discover a duplicate in our dataset, all references to the duplicate are changed into references to the correct dog.
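As a rough illustration of that repointing step, here is a minimal Python sketch. It assumes the dogs are held in a dictionary keyed by ID; the Dog class, the merge_duplicate function, the placeholder names and the duplicate ID 99999 are all hypothetical. Removing the duplicate record afterwards is also an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Dog:
    id: int
    name: str
    sire_id: Optional[int] = None
    dam_id: Optional[int] = None


def merge_duplicate(dogs: Dict[int, Dog], duplicate_id: int, correct_id: int) -> None:
    """Repoint every sire/dam reference from the duplicate to the correct dog."""
    if duplicate_id == correct_id:
        raise ValueError("duplicate and correct dog must differ")
    if duplicate_id not in dogs or correct_id not in dogs:
        raise KeyError("both IDs must exist in the dataset")
    for dog in dogs.values():
        if dog.sire_id == duplicate_id:
            dog.sire_id = correct_id
        if dog.dam_id == duplicate_id:
            dog.dam_id = correct_id
    # Assumption: the duplicate record is dropped once nothing refers to it.
    del dogs[duplicate_id]


if __name__ == "__main__":
    dogs = {
        10620: Dog(10620, "Sire"),
        99999: Dog(99999, "Sire (duplicate entry)"),
        18464: Dog(18464, "Dam"),
        18457: Dog(18457, "Pup", sire_id=99999, dam_id=18464),
    }
    merge_duplicate(dogs, duplicate_id=99999, correct_id=10620)
    print(dogs[18457])
```

Because relationships are stored as numbers only, the merge never has to touch or compare names.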
When creating relationship-statistics and looking up a dog’s pedigree, we only ever use these numbers.
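To show what a lookup by numbers alone can look like, here is a small Python sketch that builds a pedigree tree to a given number of generations. It assumes a plain mapping from a dog's ID to its (SireId, DamId) pair; the pedigree function and the layout of the returned tree are hypothetical and are not taken from our actual code.

```python
from typing import Dict, Optional, Tuple

# Maps a dog's ID to its (sire ID, dam ID); None marks an unknown parent.
Parents = Tuple[Optional[int], Optional[int]]


def pedigree(parents: Dict[int, Parents], dog_id: Optional[int], generations: int):
    """Return a nested dict of ancestor IDs, `generations` levels above the dog."""
    if dog_id is None or generations < 0:
        return None
    # A dog missing from the mapping is treated as having unknown parents.
    sire_id, dam_id = parents.get(dog_id, (None, None))
    return {
        "id": dog_id,
        "sire": pedigree(parents, sire_id, generations - 1),
        "dam": pedigree(parents, dam_id, generations - 1),
    }


if __name__ == "__main__":
    parents = {
        18457: (10620, 18464),
        10620: (None, None),
        18464: (None, None),
    }
    print(pedigree(parents, 18457, generations=1))
```

The generation limit also keeps the lookup from running forever if a data-entry error ever made a dog its own ancestor.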