How Research Happens
“To define the terms is to win the argument.”
I first heard that saying in my childhood, and it impressed me greatly; how could terms have such a large impact on success? As I have grown I have learned that the words we use tend to infiltrate our thoughts, guiding them in particular directions; they do not prevent our thinking in other ways, but they do change the effort required to do so. The words themselves suggest a story, and to use them to tell a different story requires effort.
Many of the terms used in genealogy today do not lend themselves to the way I see this endeavour. “Source”, “evidence”, “conclusion”, “fact”, “proof”: these are words that illuminate particular aspects of research, and not the aspects that I find most useful.
In this article I wish to suggest a different kind of story about research, and a different set of words to support it.
A Research Narrative
Once upon a time our ancestors lived lives, met people, and did things. Nearly all of that past is completely and irretrievably lost, but some small portions of those lives created artifacts that still survive today: government and church documents, personal letters and diaries; monuments, footprints, photographs, eponyms, traditions, and so on. These artifacts were created freely by everyone without much thought about our future use of them.
Reconstructing a picture of the past from the limited contents of these artifacts is the job of a historian.
When I discover an artifact, I generally find it makes a number of claims. A photograph makes the claim “a person existed who looked like this and posed for a photograph”; a birth record makes many claims, including the existence of a birth event and of three people, the roles those people played in the event, the names by which they were known, and the date and location of the event.
There are two important things to keep in mind about the claims made by artifacts. The first is that they are not always correct, a truth that words like “information” and “fact” disguise. Mistakes happen, memory fades, and there are many circumstances where people simply lie. The decision to believe any given claim is just that: a decision, one left to the researcher’s judgment.
The second important property of artifacts’ claims is that, with very rare exceptions, artifacts do not refer to one another. The decision to believe that two artifacts are making claims about the same person is, again, a *decision*; a decision whose existence can be hidden by words like “evidence” and “source”. Artifact X may claim “a person named Tom Jones was born 1800-03-05” and Artifact Y may claim “a person named Tom Jones died 1880-07-09 in his eighty-first year”, but it is the *researcher* who claims that these two people were the same person; the researcher, and not either of the artifacts. “Matches” are claims of cross-artifact equivalence created by the researcher.
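The Tom Jones example can be sketched in code to make the division of labour visible. This is a minimal illustration, not a proposed data model: the names `Claim`, `Match`, and `age_in_years` are hypothetical, introduced only for this sketch. The artifacts supply the two claims; the arithmetic check and the match itself are the researcher's additions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """A claim made by an artifact (hypothetical structure for illustration)."""
    artifact: str  # which artifact makes the claim
    subject: str   # artifact-local label for the person described
    text: str      # the claim as the artifact states it


@dataclass(frozen=True)
class Match:
    """A researcher-created claim that two subjects are the same person."""
    a: str
    b: str
    rationale: str


def age_in_years(born: date, on: date) -> int:
    """Completed years of age on a given day."""
    had_birthday = (on.month, on.day) >= (born.month, born.day)
    return on.year - born.year - (0 if had_birthday else 1)


birth = Claim("X", "X:Tom Jones", "born 1800-03-05")
death = Claim("Y", "Y:Tom Jones", "died 1880-07-09 in his eighty-first year")

# Someone "in his eighty-first year" has completed 80 years.
age = age_in_years(date(1800, 3, 5), date(1880, 7, 9))
consistent = (age + 1 == 81)

# Neither artifact says this; the researcher does, and records why.
match = Match(birth.subject, death.subject,
              rationale="same name; stated age at death fits the birth date")
```

The dates being consistent does not establish the match; it only means the rationale is not contradicted by the claims at hand.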
Once we have a set of claims about a common topic, either from one or several sources, we can often infer additional claims from them. Each of these inferences is based on our understanding of what kinds of claims “make sense”: people are only born once, British parents with a shared surname generally married prior to the birth, people named Elisabeth are almost certainly female, etc. Some people are more liberal with these inferences than others, but almost all researchers end up with a mix of claims whose source is an artifact, and claims whose source is an inference that the researchers created.
Taking the set of matched claims from artifacts and the additional claims we have inferred from them, we arrive at our “belief” about how the past was. That belief is nothing more nor less than a set of claims we chose to accept as true. Which claims we believe is, at its core, our decision.
Putting all the pieces together, we have the following research narrative:
1. We discover artifacts;
2. We parse the claims those artifacts make;
3. We select a set of claims as “relevant”;
4. We match up the things those selected claims discuss;
5. We infer additional ideas from those claims, often including that some claims are wrong;
6. We accept a set of claims as our belief of how the past was.
These steps are also illustrated in Figure 1.
Of Two Minds
The result of the research process is a belief about the past. Some like to call this belief a “conclusion,” but that phrase has always bothered me, suggesting as it does some level of finality. We start with a belief (e.g., “I have ancestors”) and as we research that belief changes.
One common state of research is to have multiple conflicting beliefs. Some of these take the form of dichotomies: “either there were two cousins sharing a surprising number of vital statistics, or there was one person who often claimed that his uncle was his father.” Others are unresolved contradictions: “she can’t have died three years before her last child was born, but everything else I believe points to her doing just that.” Being of two minds is the natural state of research, the almost inevitable side-effect of inquiry.
Having several equally-likely but contradictory beliefs may also be the “most correct” interpretation of the available artifacts. In a community with widespread name reuse and not a lot of record-keeping, how likely is it that the surviving artifacts make enough claims to clearly indicate even how many people with the same name there were, let alone which one of them was engaged in each event the artifacts claim occurred? There may never be enough evidence to defend any one “conclusion” as significantly more probable than another; the desire to force a single conclusion from inconclusive data is one of the most common paths to sloppy research.
The conclusion fallacy is also one of the most common causes of tension in cooperative research. When you combine the naturally inconclusive nature of the extant claims with the natural differences in the priorities and perspectives between researchers, it is almost inevitable that researchers will disagree on which alternative is “most likely” or “the best alternative”. The more researchers are involved in your research, the more important it is to accept multiple contradictory beliefs simultaneously.
Researchers Create Claims
In the research narrative I presented earlier, I asserted that researchers create claims both in the matching of other claims and as a result of inference. I have found that researcher-created claims are not widely recognized by genealogists, so in this section I intend to provide an introductory overview to the topic of historical inference.
First, let me emphasize the difference between an artifact *making* (or claiming) a claim and a researcher *creating* (or constructing) a claim. When a document claims “John was 15 when he died,” that claim stands on the authority of the document itself. If I have reason to doubt the document, I likewise lose faith in the claim. Conversely, when a researcher creates the claim “John was 15 when he died,” the researcher also creates a supporting argument or inference, such as “based on the birth and death dates we already know.” The claim that the researcher created stands or falls based on the strength of the inference created to support it, not on the reputation of the researcher.
In genealogy we often speak of the “source” of a claim and of “citing” our sources. When we speak of sources we generally mean “that which leads us to believe;” hence, the source of a claim is either an artifact or an inference. The inference may be *attributed* to the researcher who created it, but to *cite* an individual as a source is to treat the individual as a witness, not a researcher.
The Structure of an Inference
I believe that inferences are first-class citizens of genealogical research and should be discussed on an equal footing with artifacts and claims. To help reach that point, we need to explore what an inference is. The logicians and mathematicians among our readers will recognize that I am glossing over many details, but at its core each inference contains three parts:
* A set of consequents. These are the claims that cite the inference as their source.
* A set of antecedents. These are the claims from which we derive the newly created claims. If one or more of our antecedents is false, the inference no longer holds and we have no reason to believe any of the consequents.
* A rationale. This can be anything from a law of nature (e.g. “each person has exactly one biological mother”) to a general trend (e.g. “the more time passes between an event and the creation of an artifact describing the event, the more likely it is that the claims of that artifact are incorrect”). It explains why we believe that the antecedents are sufficient evidence to infer the consequents.
Every researcher has created many inferences. To see some of your own, identify a claim in your belief that is not explicitly claimed by any artifact and ask “why do I believe this?” For example, if you ask “why did I say this person was male?” and answer “because Benjamin is a boy’s name,” you have identified the consequent (being male), the antecedent (being named Benjamin), and the rationale (Benjamin is a boy’s name).
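The three parts of an inference map naturally onto a small record type. The following is only a sketch under my own assumptions: the class name `Inference` and the method `holds` are hypothetical, and claims are represented as plain strings for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inference:
    """An inference with its three parts (hypothetical representation)."""
    antecedents: frozenset  # claims the inference is derived from
    consequents: frozenset  # claims that cite this inference as their source
    rationale: str          # why the antecedents suffice for the consequents

    def holds(self, believed: set) -> bool:
        """The inference holds only while every antecedent is believed."""
        return self.antecedents <= believed


# The "Benjamin" example, written out explicitly.
benjamin = Inference(
    antecedents=frozenset({"this person was named Benjamin"}),
    consequents=frozenset({"this person was male"}),
    rationale="Benjamin is a boy's name",
)
```

Writing the rationale down as data, rather than leaving it in the researcher's head, is what lets another researcher inspect it and disagree with it.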
An aside about the words “source” and “evidence.” Taking the most straightforward linguistic application of the words, the evidence that supports a particular inference is its set of antecedent claims, but the source of the inference is its rationale. That use of the words is not universal among genealogists I have spoken with, nor does there seem to be a consensus of use at all. Because of this, I prefer to use different terms: antecedent, rationale, and “support.” Something (artifact, claim, inference, or rationale) *supports* a claim if disbelieving that thing is enough to no longer have reason to believe the supported claim. Thus, what I have most commonly seen called “sources” I would term “supporting artifacts.” I prefer that longer term because “supporting” does not suggest an exhaustive character as “source” does.
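The definition of “support” given here can be tested mechanically: withdraw belief in one thing and see whether any reason to believe the claim remains. The sketch below is one possible reading, with several assumptions of my own: the function names `supported_claims` and `supports` are hypothetical, inferences are plain tuples of (antecedents, consequents, rationale), and the birth and death dates are invented for illustration.

```python
def supported_claims(artifact_claims, inferences, disbelieved=frozenset()):
    """Return every claim we still have reason to believe.

    artifact_claims: dict mapping artifact name -> set of claims it makes
    inferences: list of (antecedents, consequents, rationale) tuples
    disbelieved: artifacts, claims, or rationales we decline to accept
    """
    believed = set()
    for artifact, claims in artifact_claims.items():
        if artifact not in disbelieved:
            believed |= set(claims) - set(disbelieved)
    # Apply inferences repeatedly until no new claims appear.
    changed = True
    while changed:
        changed = False
        for antecedents, consequents, rationale in inferences:
            if rationale in disbelieved or not set(antecedents) <= believed:
                continue
            new = set(consequents) - set(disbelieved) - believed
            if new:
                believed |= new
                changed = True
    return believed


def supports(thing, claim, artifact_claims, inferences):
    """True if disbelieving `thing` removes our reason to believe `claim`."""
    before = supported_claims(artifact_claims, inferences)
    after = supported_claims(artifact_claims, inferences, {thing})
    return claim in before and claim not in after


# Invented example data: the dates are made up for this sketch.
artifacts = {
    "birth record": {"John was born 1850-01-10"},
    "gravestone": {"John died 1865-06-02"},
}
rules = [(
    ["John was born 1850-01-10", "John died 1865-06-02"],
    ["John was 15 when he died"],
    "age at death is the time between birth and death",
)]

gravestone_supports_age = supports(
    "gravestone", "John was 15 when he died", artifacts, rules)
```

Under this reading the gravestone, the birth record, both dated claims, and the rationale each support the age claim, which illustrates why “supporting” carries no suggestion of being the only source.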
By far, the most common inferences I see are matches, inferences with the consequent “person X in claim A and person Y in claim B are the same individual.” There are rationales behind these inferences, often of the form “it is unlikely that two individuals would be this similar” where the antecedents are the similar claims. These inferences are arguably *the* core component of research; their existence is what turns an archive into a history. Too often I see these absolutely central inferences being glossed over as if they were self-evident or even non-existent. Do not forget to communicate how and why you assembled your view of the past from the claims made by the various artifacts you considered.
Proofs
As part of my graduate work, I took several courses on mathematical proofs. In those courses, we defined a “proof” as a social construct, a form of persuasive writing. The goal of a proof is to convince the reader that
1. the author has constructed a chain of inferences that lead to the claim being made;
2. the original antecedent claims of the inferences (called “axioms” in logic) are believable; and
3. the rationale associated with each inference along the way supports its inference.
Insofar as I can tell, that is a good description of proofs in genealogy too. A genealogical proof is a document whose purpose is to convince the reader that inferences with strong rationale lead from the set of relevant artifacts to the researcher’s belief.
I am of two minds when it comes to genealogical proofs. They are the only aspect of genealogy as commonly discussed that gives inferences their proper due, but they also help to enshrine conclusions as the ultimate objective of research. An additional mark against them is the colloquial understanding of “proven = true” as opposed to the more accurate “proven = convincingly defended.” Let us all focus on the good of proofs – the sharing of our inferences and their rationale – but leave the persuasive writing to those who have a legitimate need to persuade.
Summary
(I was going to title this section “conclusion,” but what kind of example would that set?)
Genealogical research is the process of discovering artifacts, parsing the claims they make, selecting and matching those claims, inferring new claims, and selecting a set of the resulting claims to believe. Every step along this path is uncertain and error-prone, and the natural state of research is not one conclusive belief but a set of alternative beliefs. One key activity of research is the creation of inferences to support the matching of claims and to make explicit what other claims suggest. Expressing these inferences and their supporting claims and rationales clearly is key to helping other researchers understand your work.
There are many common terms today that obscure various aspects of sound research practices. “Information” and “fact” disguise the uncertain nature of claims; “evidence” and “source” hide the inferences that bridge artifacts and beliefs; “conclusion” and “proof” ignore the uncertainty that is inescapably present in the very incomplete picture of the past that surviving artifacts can provide. I do not claim that the terms I have suggested in the place of these terms are themselves ideal; no terminology is perfect, and some of my readers have probably already identified aspects of research that my terms obscure. However, I offer them to you, my fellow researchers, with the hope that the different perspective they suggest will help us all to recognize and correct some of the weaknesses in our own research practices.
Barbara says
I believe that much of what you wrote above is implicit in the genealogical work I do. Some of your statements are a matter of semantics, whether I refer to a “fact”, a hypothesis, or an “inference.”
In order for our work to be intellectually honest and to reach the correct conclusion or summary or decision, we must always question our assumptions and inferences, e.g., Do these two records really mean that this man was the son of that man? Is it possible that something else is indicated by them?
So my corollary to your essay is “Always question your inferences and conclusions. Continue seeking contradictory or confirming evidence.” In almost every field that requires rigor, a theorem is almost always a theorem; it is rarely “proved” to 100% certainty.
Luther Tychonievich says
@Barbara
I’m glad you find much of what I say implicit in your work, and hope by making it explicit we can help frame the discussion and spread good research practices more widely.
I agree with your corollary, and I would emphasise that you can question your reasoning without new evidence. New theories do not always follow from the accrual of new evidence.
(As an aside, I believe you meant theories, which are supported by evidence, and not theorems, which are deduced from axioms.)
Annick H. says
Here is an example of a situation where an “artifact” refers to another: when you research birth registers for certain periods in France, you will find “notes marginales” written across someone’s birth record. Such a birth record can give you the date and place of the marriage and the name of the spouse, as well as the date and place of death. These are helpful hints in your search for more “supporting artifacts”.
Luther Tychonievich says
As you point out, artefacts often contain unexpected claims, and those “off topic” claims are often useful in locating other artefacts. But I would not go so far as to state that they refer to other artefacts directly.
I have occasionally encountered truly artefact-referencing-artefacts; for example, a journal entry might say “my driver’s license says I have blond hair, but it is actually brown.” The subject of this claim in the journal (artefact 1) is another document (artefact 2). But I still don’t know if the driver’s license I’ve found (artefact 3) is the one the journal writer intended to describe or not; matching artefacts 2 and 3 remains an inference on my part.