The Reference Class Problem

Let’s say John Smith gets married and we want to calculate the probability that he gets a divorce. Is it possible to find a fixed, objective probability that his marriage will end in divorce? Some interpretations of probability would say yes: marriage is no different from flipping a coin. Over an arbitrary time frame, say 20 years, we can take the number of married couples in the US that get a divorce and divide it by the total number of married couples:

    P(Divorce) = (Married couples that get a divorce) / (Total number of married couples)

But not so fast. As it turns out, this is just one of many interpretations of probability. This approach (which is what we’re typically taught in school) is called frequentism (1), which says that probabilities are defined by relative frequencies (2). However, as we’ll soon see, frequentism isn’t without its problems.

Let’s go back to John Smith and his marriage. Suppose that 50% of marriages in the US end in divorce (that’s not really correct, but let’s just assume it is). Can we really say that John Smith’s marriage has a fixed, objective 50% probability of ending in divorce? Even on a cursory inspection, this doesn’t seem right, for the simple reason that lots of things affect divorce rates, e.g.:

  • Individuals who marry at a young age have a greater likelihood of getting a divorce (3).
  • People with high levels of educational attainment have lower rates of divorce (4).
  • Divorce rates also differ by race (5).

These are just some of the things that need to be considered. The economy, income, religion, and many other factors also have large effects on divorce rates.

So what class of traits should we use to calculate the probability that John Smith will get a divorce? Let’s consider a few:

  1. The population of married couples in the US
  2. The class of married couples who are white in the US
  3. The class of married couples who are white and have bachelor’s degrees in the US
  4. The class of married couples who are white, have bachelor’s degrees, and who married at the age of 24 in the US
  5. The class of married couples who are white, have bachelor’s degrees, who are religious, and who married at the age of 24 in the US
  6. The class of married couples who are white, have bachelor’s degrees, who are religious, have a high yearly income, and who married at the age of 24 in the US

But this just raises another question: which class should we use? Why use [5] rather than [6]? Or why not expand [6] and calculate a probability for each subsequent class?

In probability theory, this is known as the reference class problem:

Let us then suppose, instead, that John Smith presents himself, how should we in this case set about obtaining a series for him? In other words, how should we collect the appropriate statistics? It should be borne in mind that when we are attempting to make real inferences about things as yet unknown, it is in this form that the problem will practically present itself.

At first sight the answer to this question may seem to be obtained by a very simple process, viz. by counting how many men of the age of John Smith, respectively do and do not live for eleven years. In reality however the process is far from being so simple as it appears. For it must be remembered that each individual thing has not one distinct and appropriate series, to which, and to which alone, it properly belongs. We may indeed be practically in the habit of considering it under such a single aspect, and it may therefore seem to us more familiar when it occupies a place in one series rather than in another; but such a practice is merely customary on our part, not obligatory. It is obvious that every individual thing or event has an indefinite number of properties or attributes observable in it, and might therefore be considered as belonging to an indefinite number of different classes of things. By belonging to any one class it of course becomes at the same time a member of all the higher classes, the genera, of which that class was a species. But, moreover, by virtue of each accidental attribute which it possesses, it becomes a member of a class intersecting, so to say, some of the other classes. John Smith is a consumptive man say, and a native of a northern climate. Being a man he is of course included in the class of vertebrates, also in that of animals, as well as in any higher such classes that there may be. The property of being consumptive refers him to another class, narrower than any of the above; whilst that of being born in a northern climate refers him to a new and distinct class, not conterminous with any of the rest, for there are things born in the north which are not men.

When therefore John Smith presents himself to our notice without, so to say, any particular label attached to him informing us under which of his various aspects he is to be viewed, the process of thus referring him to a class becomes to a great extent arbitrary. If he had been indicated to us by a general name, that, of course, would have been some clue; for the name having a determinate connotation would specify at any rate a fixed group of attributes within which our selection was to be confined. But names and attributes being connected together, we are here supposed to be just as much in ignorance what name he is to be called by, as what group out of all his innumerable attributes is to be taken account of; for to tell us one of these things would be precisely the same in effect as to tell us the other. In saying that it is thus arbitrary under which class he is placed, we mean, of course, that there are no logical grounds of decision; the selection must be determined by some extraneous considerations. Mere inspection of the individual would simply show us that he could equally be referred to an indefinite number of classes, but would in itself give no inducement to prefer, for our special purpose, one of these classes to another. This variety of classes to which the individual may be referred owing to his possession of a multiplicity of attributes, has an important bearing on the process of inference which was indicated in the earlier sections of this chapter, and which we must now examine in more special reference to our particular subject. (Venn 1876: 194–195)

In the example above, our list is hardly exhaustive. John Smith’s marriage essentially has an indefinite number of measurable variables, and we can always narrow our reference class (6). By changing our reference class, we also change the relative frequency with which our event occurs. In the context of frequentism, this makes finding a fixed, objective probability (that John Smith’s marriage will end in divorce) an arbitrary endeavor (7). But frequentism isn’t alone in facing this problem. As Alan Hájek puts it, most interpretations of probability “face their own version of the reference class problem” (8). And the ones that don’t “say precious little about what probability is.”
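To make the effect of narrowing concrete, here is a minimal Python sketch. The records are invented toy data, not real divorce statistics, and the names `ROWS`, `COUPLES`, and `relative_frequency` are hypothetical helpers introduced only for this illustration. It computes the relative frequency of divorce within successively narrower reference classes, in the spirit of the list above.

```python
from fractions import Fraction

# Hypothetical toy records, invented for illustration (NOT real data).
# Columns: degree, religious, married_at_24, divorced
FIELDS = ("degree", "religious", "married_at_24", "divorced")
ROWS = [
    (True,  True,  True,  False),
    (True,  True,  True,  False),
    (True,  True,  False, True),
    (True,  False, True,  False),
    (True,  False, False, True),
    (False, True,  True,  True),
    (False, True,  False, False),
    (False, False, True,  True),
    (False, False, False, True),
    (False, False, False, False),
]
COUPLES = [dict(zip(FIELDS, row)) for row in ROWS]


def relative_frequency(records, **traits):
    """Relative frequency of divorce within the reference class
    picked out by the given traits (no traits = whole population)."""
    members = [r for r in records
               if all(r[name] == value for name, value in traits.items())]
    if not members:
        return None  # empty reference class: no frequency to report
    divorced = sum(1 for r in members if r["divorced"])
    return Fraction(divorced, len(members))


if __name__ == "__main__":
    # Each class is a narrowing of the one before it.
    nested_classes = [
        {},
        {"degree": True},
        {"degree": True, "religious": True},
        {"degree": True, "religious": True, "married_at_24": True},
    ]
    for traits in nested_classes:
        label = ", ".join(traits) or "all couples"
        print(f"{label}: {relative_frequency(COUPLES, **traits)}")
```

On this toy data the four nested classes give 1/2, 2/5, 1/3, and 0. The same couple can be assigned any of these values depending on which class we place it in, and nothing in the data itself says which one is correct.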

As we can see, the reference class problem gives rise to some serious difficulties in probability theory. While it certainly doesn’t make frequentism and other interpretations useless, it does show their limitations. This problem won’t stop scientists, journalists, and others from using statistical probabilities, nor should it, but it should make us aware of how difficult it is to find objective probabilities that are actually useful.


1. See page 221 of David Freedman’s popular statistics textbook:

People talk loosely about chance all the time, without doing harm. What are the chances of getting a job? of meeting someone? of rain tomorrow? But for scientific purposes, it is necessary to give the word chance a definite clear interpretation. This turns out to be hard, and mathematicians have struggled with the job for centuries. They have developed some careful and rigorous theories, but these theories cover just a small range of the cases where people ordinarily speak of chance. This book will present the frequency theory, which works best for processes which can be repeated over and over again, independently and under the same conditions. Many games fall into this category, and the frequency theory was originally developed to solve gambling problems.

2. See (Hájek 2007):

Actual frequentists such as Venn (1876) in at least some passages and, apparently, various scientists even today, identify the probability of an attribute or event A in a reference class B with the relative frequency of actual occurrences of A within B. Note well: in a reference class B.




[Figure: divorce and age]

[Figure: divorce and race]

6. This doesn’t even take into account the infinite number of subjective variables that can cause or prevent a divorce.

7. See (Hájek 2007):

By changing the reference class we can typically change the relative frequency of A, and thus the probability of A. In Venn’s example, the probability that John Smith, a consumptive Englishman aged fifty, will live to sixty-one, is the frequency of people like him who live to sixty-one, relative to the frequency of all such people. But who are the people “like him”? It seems there are indefinitely many ways of classifying him, and many of these ways will yield conflicting verdicts as to the relative frequency.

8. These include the classical, logical, propensity, and subjectivist interpretations.


  1. John Venn on the Reference Class Problem by LK
  2. The Truth About the Divorce Rate is Surprisingly Optimistic by Brittany Wong
  3. Divorce and Demographic by State by Jacob Langenfeld
  4. Is Marriage Under Siege? by Karen Sternheimer


1. Bramlett MD, Mosher WD. First marriage dissolution, divorce, and remarriage: United States. Advance data from vital and health statistics; no. 323. Hyattsville, Maryland: National Center for Health Statistics. 2001.

2. Elliott, Diana B., and Tavia Simmons. “Marital Events of Americans: 2009.” Census Bureau Home Page. 2011.

3. Hájek, Alan (2007). The reference class problem is your problem too. Synthese 156 (3):563-585.

4. Venn, John. The Logic of Chance. An Essay on the Foundations and Province of the Theory of Probability, with Especial Reference to Its Logical Bearings and Its Application to Moral and Social Science. London: Macmillan, 1876. Print.



Filed under History, Home, Philosophy

2 responses to “The Reference Class Problem”

  1. Nice post! The problem of finding the right reference class, or conditioning on the right variables, is probably a big reason people distrust statistical inferences, since the results can change so much depending on what you’re conditioning on. And generally, your decision to include or exclude is always reasonably open to criticism.

    However, I think a frequentist would say the probability of him getting a divorce is either 0 or 1, much the same way they treat hypotheses as either true or false in hypothesis testing, i.e. it’s not a random variable. Any sort of one-off event doesn’t have a probability in frequentism. You could talk about the broader population of people who share certain attributes with John Smith, and those relative frequencies.

    It’s also important to note that a lot of statistical inferences go in the opposite direction: from the particular (sample) to the general (population), which is an easier inference in many cases!

    • You could talk about the broader population of people who share certain attributes with John Smith, and those relative frequencies.

      Would a frequentist think that it’s valid to make an inference from this population to John Smith? It makes sense that frequentists don’t assign probabilities to one-off events, but how do they make inferences about human populations then? Also, have you read the Hájek paper? You might find that interesting.
