Recall from my previous post that causal inference techniques are largely concerned with distinguishing associative relationships (relationships between two variables for which a change in one variable coincides with a change in the other) from causal relationships (relationships between the data describing a cause and an effect, for which the cause is an event that contributes to the production of another event, the effect). To complete these causal inference tasks, it is helpful to develop a concise language for describing causal and associative relationships. A well-designed analytical language provides the descriptive tools necessary to construct and validate hypothesized causal relationships. Before we go any further in our exploration of causal inference, we must first describe a simple yet expressive notation for hypothesized causal relationships between variables. This notation, referred to in the causal inference literature as Structural Causal Models (SCMs), will simplify our further discussion of the relationship between causality and probability.
The earliest known version of SCMs was introduced by geneticist Sewall Wright [1] around 1918, originally for inferring the relative importance of the factors that determine the birth weight of guinea pigs. He used the construction to develop the methodology of path analysis, a technique commonly used for causal inference tasks over layered and complex processes, such as phenotypic inheritance.
Directed Acyclic Graphs
Before diving into the definition of a structural causal model, one must be familiar with directed acyclic graphs (DAGs), which are commonly used to describe the relationships between causes and their corresponding effects. A DAG is a graph, composed of nodes and edges, in which every edge has a direction that determines the relationship between the two nodes it connects. A DAG also contains no cycles, that is, no paths of at least one edge that start and end at the same node. Below are some examples of DAGs and of graphs that cannot be classified as DAGs:
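The acyclicity condition can also be checked programmatically. The following is a minimal sketch, not part of the original post, that uses Kahn's algorithm (repeatedly removing nodes with no incoming edges). The function name `is_dag` and the node labels are hypothetical choices made for illustration.

```python
from collections import deque


def is_dag(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains no cycle."""
    # Count incoming edges and record outgoing neighbours for every node.
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Repeatedly remove nodes that have no remaining incoming edges.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes that sit on a cycle never reach in-degree zero,
    # so the graph is acyclic only if every node was removed.
    return removed == len(nodes)


# A DAG: every edge points "forward" and no path returns to its start.
print(is_dag(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")]))
# Not a DAG: A -> B -> C -> A is a cycle.
print(is_dag(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")]))
# Not a DAG: a single edge from a node to itself is already a cycle.
print(is_dag(["A"], [("A", "A")]))
```

The first call describes a DAG, while the second and third describe graphs with a three-edge cycle and a one-edge cycle respectively.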
Now that you are armed with an understanding of the structures that define a DAG, I can describe exactly how a structural causal model is constructed.
Constructing A Structural Causal Model
A structural causal model is comprised of three components:
- A set of variables describing the state of the universe and how it relates to a particular data set we are provided. These variables are explanatory variables, outcome variables, and unobserved variables. Both outcome and explanatory variables are observed variables, or variables describing processes measured in our data set, while unobserved variables are “background processes” for which we do not have observational data. For practical causal inference, it’s helpful to make a distinction between outcome variables, which an analyst is interested in changing via intervention, and explanatory variables, which an analyst believes can be altered in order to cause a desired change. In an SCM, observed variables are represented by an arbitrary single-letter variable name, while unobserved variables are represented by the letter $U$ with an arbitrary single-letter subscript. For example, for the analysis of the effect of Ice Cream Consumption on Drownings described in my previous post, we can represent the explanatory variable Ice Cream Consumption as $I$, the outcome variable Drownings as $D$, and the unobserved variable Temperature as $U_T$.
- Causal relationships, which describe the causal effect variables have on one another. Specifically, causal relationships extend from observed and unobserved variables to observed variables. Such relationships are written using the assignment operator ($:=$) and function notation ($f$), with a subscript labelling the variable which they affect. For example, we can represent the causal relationship of an unobserved variable Temperature ($U_T$) on an explanatory variable Ice Cream Consumption ($I$) as $I := f_I(U_T)$.
- A probability distribution defined over the unobserved variables in the model, describing the likelihood that each variable takes a particular value.
Structural causal models are tightly linked with directed acyclic graphs, in that the relationships between the observed variables included within an SCM adhere to the same set of restrictions that define DAGs. All causal relationships between said variables must be one-directional, and no variable can have causal influence on itself as the result of a cycle, commonly referred to as a feedback loop. Why must we place such a restriction on SCMs? Hold that question; I will revisit it towards the end of the post.
The SCM of the example presented in my previous post can be represented as follows, in conjunction with an arbitrary probability distribution defined over the unobserved variable Temperature (describing the likelihood that a given month has a particular average monthly temperature).
$$
\begin{aligned}
I &:= f_I(U_T) \\
D &:= f_D(I, U_T)
\end{aligned}
$$
It describes a hypothesized causal effect of Temperature ($U_T$) on Ice Cream Consumption ($I$) as well as an effect of Temperature and Ice Cream Consumption on Drownings ($D$).
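To make the three components concrete, here is a minimal simulation sketch of this SCM. Everything numeric in it is an assumption made for illustration: the normal distribution over Temperature, the linear forms of the hypothetical functions `f_I` and `f_D`, and the `ice_cream_effect` parameter (set to zero here, so that Drownings respond to Temperature only) do not come from the post.

```python
import random


def sample_temperature(rng):
    # Assumed probability distribution over the unobserved variable
    # Temperature: average monthly temperature in degrees Celsius.
    return rng.gauss(15.0, 10.0)


def f_I(u_t):
    # Assumed structural function: Ice Cream Consumption rises with Temperature.
    return max(0.0, 2.0 * u_t)


def f_D(i, u_t, ice_cream_effect=0.0):
    # Assumed structural function: Drownings depend on Temperature and,
    # through the hypothetical ice_cream_effect, on Ice Cream Consumption.
    return max(0.0, 0.5 * u_t + ice_cream_effect * i)


def sample_scm(rng):
    u_t = sample_temperature(rng)  # draw the unobserved variable
    i = f_I(u_t)                   # I := f_I(U_T)
    d = f_D(i, u_t)                # D := f_D(I, U_T)
    # Only the observed variables end up in the data set.
    return {"I": i, "D": d}


rng = random.Random(0)
data = [sample_scm(rng) for _ in range(1000)]
print(data[0])
```

Because Temperature is drawn first and then passed through both functions, the simulated Ice Cream Consumption and Drownings move together even though `ice_cream_effect` is zero, which mirrors the associative relationship discussed in the previous post.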
For now, we will focus on how the first two of these components interact to comprise a structural causal model. We will discuss how structural causal models allow us to use probability to infer causal relationships in a future post.
As previously mentioned, the relationships between observed variables in a structural causal model adhere to the same set of restrictions which define a directed acyclic graph. Thus, structural causal models are commonly represented with causal graphs, extensions of directed acyclic graphs used to thoroughly communicate hypotheses of causal relationships between variables. The rules defining the construction of these causal graphs are as follows.
Causal Graphs Are The DAG Representations Of Structural Causal Models
Every SCM can be represented as a DAG, with variables represented as nodes and relationships between variables represented as edges. Hypothesized causal relationships amongst outcome and explanatory variables are represented by solid arrows in the direction of causality. For example, the SCM defining a single causal relationship between an explanatory variable $X$ and an outcome variable $Y$:
$$Y := f_Y(X)$$
can be represented by the following causal graph.
Causal Graphs Use Dashed Arrows To Represent Causal Relationships From Unobserved Variables
As previously mentioned, unobserved variables represent processes we cannot see in our data and for which we cannot test hypotheses of their causal effect. Thus, we cannot use unobserved variables to explain changes in explanatory and outcome variables. To represent this restricted utility of unobserved variables, we use a dashed arrow to represent a possible causal relationship from an unobserved variable to an observed variable. For example, the SCM defining a single causal relationship between an unobserved variable $U_Y$ and an outcome variable $Y$:
$$Y := f_Y(U_Y)$$
can be represented by the following causal graph.
To illustrate the utility of this notation, let’s use a new example. Consider the impact of Aptitude on Years Of Higher Education and Income, with Aptitude being a catch-all term for the traits that influence students to spend more time in school and earn more later in life. This is a structural model commonly analyzed by labor economics researchers interested in quantifying the value of additional education after high school. Aptitude cannot be easily measured, as there are a variety of factors that affect both educational and socioeconomic outcomes (possible explanations include innate intelligence, work ethic, cultural values, or greed). Writing Aptitude as $U_A$, Years Of Higher Education as $E$, and Income as $Y$, and in conjunction with an arbitrary probability distribution over Aptitude, the SCM describing said causal relationships is as follows.
$$
\begin{aligned}
E &:= f_E(U_A) \\
Y &:= f_Y(E, U_A)
\end{aligned}
$$
This SCM can also be represented by the following causal graph.
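The translation from an SCM to its causal graph is mechanical: each argument of a structural function becomes an arrow into the variable that the function assigns. The sketch below illustrates this under an assumed structure in which Aptitude influences both Years Of Higher Education and Income, and education in turn influences income. The helper `causal_graph_edges` and the labels `U_A`, `E`, and `Y` are hypothetical names chosen for illustration.

```python
def causal_graph_edges(parents, unobserved):
    """Turn an SCM's parent sets into (cause, effect, arrow style) triples."""
    edges = []
    for child, child_parents in parents.items():
        for parent in child_parents:
            # Arrows leaving an unobserved variable are drawn dashed,
            # arrows between observed variables are drawn solid.
            style = "dashed" if parent in unobserved else "solid"
            edges.append((parent, child, style))
    return edges


# Assumed structure: E := f_E(U_A) and Y := f_Y(E, U_A), where
# U_A is Aptitude, E is Years Of Higher Education, and Y is Income.
parents = {"E": ["U_A"], "Y": ["E", "U_A"]}
edges = causal_graph_edges(parents, unobserved={"U_A"})
for edge in edges:
    print(edge)
```

Keeping the SCM as a mapping from each variable to its direct causes means the graph never has to be maintained separately; it can always be regenerated from the model.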
Causal Graphs Use Bidirectional Arrows To Represent Possible Associative Relationships Between Unobserved Variables
For some analytic strategies over causal graphs, it is helpful to represent a possible associative relationship between two unobserved variables. Since these variables are unobserved and describe processes for which we have no data, it is impossible to infer a causal direction for this associative relationship, or even to ensure that an associative relationship exists. To visualize this ambiguity, we represent these “possible” relationships with a dashed bidirectional arrow when drawing a causal graph. Consider an SCM describing a process in which two unobserved variables, $U_X$ and $U_Y$, have a possible associative relationship, each having an effect on one of two observed variables, $X$ and $Y$, which are hypothesized to have a causal relationship:
$$
\begin{aligned}
X &:= f_X(U_X) \\
Y &:= f_Y(X, U_Y) \\
U_X &\not\perp\!\!\!\perp U_Y
\end{aligned}
$$
The last line of this SCM represents the possible associative relationship between $U_X$ and $U_Y$, as $\not\perp\!\!\!\perp$ is the mathematical symbol for “not independent”. The causal graph of this SCM is as follows.
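One way to draw such a graph is to emit Graphviz DOT text, where the edge attributes `style=dashed` and `dir=both` correspond to the dashed and bidirectional arrows described above. This is only a sketch: the function `to_dot` and the node names `X`, `Y`, `U_X`, and `U_Y` are hypothetical, and the post does not say which tool produced its figures.

```python
def to_dot(solid, dashed, bidirected):
    """Render a causal graph as Graphviz DOT text."""
    lines = ["digraph causal_graph {"]
    # Solid arrows: hypothesized causal relationships between observed variables.
    for src, dst in solid:
        lines.append(f'  "{src}" -> "{dst}";')
    # Dashed arrows: possible causal relationships from unobserved variables.
    for src, dst in dashed:
        lines.append(f'  "{src}" -> "{dst}" [style=dashed];')
    # Dashed bidirectional arrows: possible associations between unobserved variables.
    for a, b in bidirected:
        lines.append(f'  "{a}" -> "{b}" [style=dashed, dir=both];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(
    solid=[("X", "Y")],
    dashed=[("U_X", "X"), ("U_Y", "Y")],
    bidirected=[("U_X", "U_Y")],
)
print(dot)
```

The printed text can be saved to a file and passed to the Graphviz `dot` command to produce an image.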
Why Must An SCM Define A DAG?
Now that I have presented the structures that define a causal graph, I can answer the question posed when I first introduced the concept of an SCM. To many, the requirement that edges be one-directional is intuitive, as causal relationships similarly flow in one direction. However, it is less clear why SCMs must be represented by acyclic graphs. This becomes clearer after analyzing a familiar example. Consider a hypothesized causal relationship between three explanatory variables, Buyers ($B$), Sellers ($S$), and Marketing Spend ($M$), and an outcome variable, Revenue ($R$), described by the following causal relationships.
$$
\begin{aligned}
R &:= f_R(B) \\
M &:= f_M(R) \\
S &:= f_S(M) \\
B &:= f_B(S)
\end{aligned}
$$
These relationships correspond to the following causal graph.
Such a cycle is an example of the virtuous cycle of marketplace dynamics, describing the many moving parts that must be aligned to kick-start a successful marketplace business (please check out Lenny Rachitsky’s amazing blog series for more on this topic).
Note that within this graph, which consists of a four-edge cycle, there exists an edge from Buyers to Revenue, implying that a change in the number of buyers on a platform causes a change in that platform’s monthly revenue. In addition, note that there exist edges from Revenue to Marketing Spend, from Marketing Spend to Sellers, and from Sellers to Buyers; thus, a change in monthly revenue can cause a business to change its marketing spend, attracting more sellers and eventually more buyers to its platform. However, these two opposite causal relationships over the same variables, Buyers and Revenue, contradict the definition of a causal relationship presented in my previous post as a one-directional relationship from a cause to an effect. Thus, we cannot define an SCM from these hypothesized causal relationships.
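The offending cycle can be located automatically with a depth-first search that remembers the path it is currently exploring. The sketch below is illustrative only; the function `find_cycle` and the adjacency-list encoding of the marketplace graph are assumptions, not something taken from the post.

```python
def find_cycle(graph):
    """Return one directed cycle as a list of nodes, or None if the graph is acyclic."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    nodes = set(graph) | {child for children in graph.values() for child in children}
    state = {node: UNVISITED for node in nodes}
    path = []  # nodes on the path currently being explored

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for child in graph.get(node, []):
            if state[child] == IN_PROGRESS:
                # The edge leads back onto the current path: a cycle.
                return path[path.index(child):] + [child]
            if state[child] == UNVISITED:
                found = visit(child)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNVISITED:
            found = visit(node)
            if found:
                return found
    return None


marketplace = {
    "Buyers": ["Revenue"],
    "Revenue": ["Marketing Spend"],
    "Marketing Spend": ["Sellers"],
    "Sellers": ["Buyers"],
}
print(find_cycle(marketplace))

# Dropping the edge from Sellers to Buyers leaves a graph without a cycle.
without_feedback = dict(marketplace, Sellers=[])
print(find_cycle(without_feedback))
```

For the marketplace graph the search reports the path from Buyers through Revenue, Marketing Spend, and Sellers back to Buyers, and it reports no cycle once the edge from Sellers to Buyers is removed.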
Why Is This Useful?
As we will see in future posts, structural causal models provide a powerful representation of causal relationships, enabling the abstract analyses that often yield powerful practical methodologies for determining causal effects. Causal graphs, the graph-based counterparts of SCMs, are similarly useful to analysts; they facilitate visualization as well as the use of graph theory for causal inference tasks. For example, in future posts I will discuss algorithms for automatically identifying structural causal models from undirected graphs that represent solely associative relationships. In my next post, I will use structural causal models to explain confounding bias, a term used to describe processes for which unobserved variables have a direct effect on both explanatory and outcome variables.
1. Wright, S. (1921). “Correlation and causation.” Journal of Agricultural Research, 20: 557–585.
2. Lacerda, G., et al. (2012). “Discovering Cyclic Causal Models by Independent Components Analysis.” arXiv:1206.3273 [cs, stat]. http://arxiv.org/abs/1206.3273
3. Richardson, T. S. (2013). “A Discovery Algorithm for Directed Cyclic Graphs.” arXiv:1302.3599 [cs]. http://arxiv.org/abs/1302.3599