# Structural Causal Models

May 27, 2020

Hey! This post references terminology and material covered in previous blog posts. If any concept below is new to you, I **strongly** suggest you check out its corresponding post.

### Getting In To A Causal Flow

May 20, 2020. What is causal inference? Why is it useful? How can you use it to amplify your decision-making capabilities?

## Describing Causality

Recall from my previous post that causal inference techniques are largely concerned with distinguishing **associative relationships** (relationships between two variables for which a change in one variable tends to accompany a change in the other) from **causal relationships** (relationships between the data describing a **cause** and an **effect**, for which the cause is an event that contributes to the production of another event, the effect). To complete these causal inference tasks, it helps to develop a concise language for describing causal and associative relationships. A well-designed analytical language provides the descriptive tools necessary to construct and validate hypothesized causal relationships. Before we go any further in our exploration of causal inference, we must first describe a simple yet expressive notation for hypothesized causal relationships between variables. This notation, referred to in the causal inference literature as **Structural Causal Models** (SCMs), will simplify our further discussion of the relationship between causality and probability.

The earliest known version of SCMs was introduced by geneticist Sewall Wright^{1} around 1918, originally for inferring the relative importance of the factors that determine the birth weight of guinea pigs. He used the construction to develop the methodology of path analysis, a technique commonly used for causal inference tasks over layered and complex processes, such as phenotypic inheritance.

**Figure 1:** Drawings from Wright's 1921 paper, *Correlation and Causation*. The bottom image presents an ancestor to **causal graphs**, a representation of structural causal models, describing the relationships between a variety of genetic factors and a guinea pig's birth weight. Wright's *path tracing rules* defined how to use a set of associative relationships (top image) to generate a causal graph (bottom image).

## Directed Acyclic Graphs

Before diving into the definition of a structural causal model, one must be familiar with **directed acyclic graphs** (DAGs), which are commonly used to describe the relationships between causes and their corresponding effects. A DAG is a graph, comprised of nodes and edges, for which the direction of an edge determines the relationship between the two nodes on either side. DAGs also do not have any cycles, that is, paths comprised of at least one edge that start and end at the same node. Below are some examples of DAGs and of graphs that cannot be classified as DAGs:

**Figure 2:** One graph that can be classified as directed and acyclic and two that cannot. Note that the undirected graph lacks directed edges (represented by arrows) and thus cannot represent concepts such as causality, in which a cause has influence on an effect, and not the other way around. Also note the purple arrows in the directed cyclic graph, denoting a path from the red node back to itself, comprised of 3 edges.
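The acyclicity condition can also be checked mechanically. Below is a minimal Python sketch, assuming a graph is given as a list of node names plus a list of `(source, target)` pairs; the function name `is_dag` and the example graphs are my own illustration, not taken from the figure. It uses Kahn's algorithm: repeatedly remove nodes that have no incoming edges, and if some nodes can never be removed, they must sit on a cycle.

```python
from collections import deque

def is_dag(nodes, edges):
    """Return True if the directed graph has no cycles (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Start from the nodes that no edge points to.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes on a cycle never reach in-degree zero, so they are never visited.
    return visited == len(nodes)

# Hypothetical three-node graphs: one acyclic, one containing a 3-edge cycle.
acyclic = is_dag(["a", "b", "c"], [("a", "b"), ("a", "c"), ("b", "c")])
cyclic = is_dag(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")])
print(acyclic, cyclic)  # True False
```

Because edges are stored as ordered pairs, every graph passed to this sketch is directed by construction; an undirected graph simply cannot be expressed in this representation.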

Now that you are armed with an understanding of the structures which define a DAG, I can easily describe how exactly a structural causal model is constructed.

## Constructing A Structural Causal Model

A structural causal model is comprised of three components:

- A set of variables describing the state of the universe and how it relates to a particular data set we are provided. These variables are **explanatory variables**, **outcome variables**, and **unobserved variables**. Both outcome and explanatory variables are **observed variables**, or variables describing processes measured in our data set, while unobserved variables are “background processes” for which we do not have observational data. For practical causal inference, it's helpful to make a distinction between outcome variables, which an analyst is interested in changing via intervention, and explanatory variables, which an analyst believes can be altered in order to cause a desired change. In an SCM, observed variables are represented by an arbitrary single-letter variable name, while unobserved variables are represented by the letter $\textcolor{#A93F55}{u}$, with an arbitrary single-letter subscript. For example, for the analysis of the effect of Ice Cream Consumption on Drownings described in my previous post, we can represent explanatory variable **Ice Cream Consumption** as $\textcolor{#7A28CB}{i}$, outcome variable **Drownings** as $\textcolor{#EF3E36}{d}$, and unobserved variable **Temperature** as $\textcolor{#A93F55}{u_t}$.
- **Causal relationships**, which describe the causal effect variables have on one another. Specifically, causal relationships extend from observed and unobserved variables to observed variables. Such relationships are written using the assignment operator ($\textcolor{#52414C}{\leftarrow}$) and function notation ($\textcolor{#52414C}{f}$) with a subscript labelling the variable which they affect. For example, we can represent the causal relationship of an unobserved variable **Temperature** on an explanatory variable **Ice Cream Consumption** as follows.

  $\color{#52414C}\textcolor{#7A28CB}{i} \leftarrow f_i(\textcolor{#A93F55}{u_t})$

- A probability distribution defined over the unobserved variables in the model, describing the likelihood that each variable takes a particular value.

Structural causal models are tightly linked with directed acyclic graphs, in that the relationships between the observed variables included within an SCM adhere to the same set of restrictions defining DAGs. All causal relationships between said variables must be one-directional, and no variable can have causal influence on itself as the result of a cycle, commonly referred to as a **feedback loop**. Why must we place such a restriction on SCMs? Hold that question; I will revisit it towards the end of the post.

The SCM of the example presented in my previous post can be represented as follows, in conjunction with an arbitrary probability distribution defined over the unobserved variable **Temperature** (describing the likelihood that a given month has a particular average monthly temperature). It describes a hypothesized causal effect of **Temperature** on **Ice Cream Consumption** as well as an effect of **Temperature** and **Ice Cream Consumption** on **Drownings**.

$\color{#52414C}\textcolor{#7A28CB}{i} \leftarrow f_i(\textcolor{#A93F55}{u_t})$

$\color{#52414C}\textcolor{#EF3E36}{d} \leftarrow f_d(\textcolor{#A93F55}{u_t}, \textcolor{#7A28CB}{i})$

For now, we will focus on how the first two of these components interact to comprise a structural causal model. We will discuss how structural causal models allow us to use probability to infer causal relationships in a future post.
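To make the three components concrete, here is a minimal Python sketch of the ice cream SCM. Everything numeric in it is an assumption made purely for illustration: the Gaussian distribution over $u_t$, the linear forms chosen for $f_i$ and $f_d$, and their coefficients are hypothetical and do not come from any data set. The point is only the shape of the model: a distribution over the unobserved variable, plus one assignment function per observed variable, evaluated in causal order.

```python
import random

def sample_u_t(rng):
    # Assumed distribution over the unobserved variable u_t:
    # average monthly temperature in degrees Celsius.
    return rng.gauss(15.0, 8.0)

def f_i(u_t):
    # Hypothetical mechanism: ice cream consumption rises with temperature.
    return max(0.0, 2.0 * u_t)

def f_d(u_t, i):
    # Hypothetical mechanism: drownings depend on temperature and,
    # per the hypothesis being modelled, on ice cream consumption.
    return max(0.0, 0.5 * u_t + 0.01 * i)

def sample_scm(rng):
    # Evaluate the assignments in causal order: u_t, then i, then d.
    u_t = sample_u_t(rng)
    i = f_i(u_t)
    d = f_d(u_t, i)
    return {"u_t": u_t, "i": i, "d": d}

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
samples = [sample_scm(rng) for _ in range(5)]
for s in samples:
    print(s)
```

Because the model is acyclic, there is always an order in which each assignment only needs values that have already been computed, which is what `sample_scm` relies on.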

## Causal Graphs

As previously mentioned, the relationships between observed variables in a structural causal model adhere to the same set of restrictions which define a directed acyclic graph. Thus, structural causal models are commonly represented with **causal graphs**, extensions of directed acyclic graphs used to thoroughly communicate hypotheses of causal relationships between variables. The rules defining the construction of these causal graphs are as follows.

### Causal Graphs Are The DAG Representations Of Structural Causal Models

Every SCM can be represented as a DAG, with variables represented as nodes, and relationships between variables represented as edges. Hypothesized causal relationships amongst outcome and explanatory variables are represented by solid arrows in the direction of causality. For example, the SCM defining a single causal relationship between an **explanatory variable** and an **outcome variable**:

$\color{#52414C}\textcolor{#EF3E36}{o} \leftarrow f_o(\textcolor{#7A28CB}{e})$

can be represented by the following causal graph.

**Figure 3:** A causal graph representing a single causal relationship.

### Causal Graphs Use Dashed Arrows To Represent Causal Relationships From Unobserved Variables

As previously mentioned, unobserved variables represent processes we cannot see in our data and for which we cannot test hypotheses of their causal effect. Thus, we cannot use unobserved variables to explain changes in explanatory and outcome variables. To represent this restricted utility of unobserved variables, we use a dashed line to represent a possible causal relationship from an unobserved variable to an observed variable. For example, the SCM defining a single causal relationship between an **unobserved variable** and an **outcome variable**:

$\color{#52414C}\textcolor{#EF3E36}{o} \leftarrow f_o(\textcolor{#A93F55}{u})$

can be represented by the following causal graph.

**Figure 4:** A causal graph representing an unobserved variable's effect on an observed variable.

To illustrate the utility of this notation, let’s use a new example. Consider the impact of **Aptitude** on **Years Of Higher Education** and **Income**, with **Aptitude** being a catch-all term for the traits that influence students to spend more time in school and earn more later in life. This is a structural model commonly analyzed by labor economics researchers interested in quantifying the value of additional education after high school. Aptitude cannot be easily measured, as there are a variety of factors that affect both educational and socioeconomic outcomes (possible explanations include innate intelligence, work ethic, cultural values, or greed). In conjunction with an arbitrary probability distribution over **Aptitude**, the SCM describing said causal relationships is as follows.

$\color{#52414C}\textcolor{#7A28CB}{y} \leftarrow f_y(\textcolor{#A93F55}{u_a})$

$\color{#52414C}\textcolor{#EF3E36}{i} \leftarrow f_i(\textcolor{#A93F55}{u_a}, \textcolor{#7A28CB}{y})$

This SCM can also be represented by the following causal graph.

**Figure 5:** A causal graph representing a hypothesis for the causal relationships that define the effect that higher education has on income.

### Causal Graphs Use Bidirectional Arrows To Represent Possible Associative Relationships Between Unobserved Variables

For some analytic strategies over causal graphs, it is helpful to represent a possible associative relationship between two unobserved variables. Since these variables are unobserved, and describe processes for which we have no data, it is impossible to infer a causal direction of this associative relationship, or to ensure that an associative relationship even exists. To visualize this ambiguity, we represent these “possible” relationships with a dashed bidirectional arrow when drawing a causal graph. Consider an SCM describing a process in which two unobserved variables have a possible associative relationship, each having an effect on one of two observed variables, which are themselves hypothesized to have a causal relationship:

$\color{#52414C}\textcolor{#7A28CB}{e} \leftarrow f_e(\textcolor{#A93F55}{u_a})$

$\color{#52414C}\textcolor{#EF3E36}{o} \leftarrow f_o(\textcolor{#A93F55}{u_b}, \textcolor{#7A28CB}{e})$

$\color{#52414C}\textcolor{#A93F55}{u_a} \not\!\perp\!\!\!\perp \textcolor{#A93F55}{u_b}$

The last line of this SCM represents the possible associative relationship between $\textcolor{#A93F55}{u_a}$ and $\textcolor{#A93F55}{u_b}$, as $\not\!\perp\!\!\!\perp$ is the mathematical symbol for “not independent”. The causal graph of this SCM is as follows.

**Figure 6:** A causal graph representing a structural model containing two unobserved variables and two observed variables.
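The three drawing rules above can be summarized in a few lines of Python. This is only a sketch under my own assumptions: the SCM is encoded as a dictionary mapping each observed variable to the list of variables appearing in its assignment, and the function name `causal_graph_edges` and the style labels are hypothetical, not part of any causal inference library.

```python
def causal_graph_edges(parents, unobserved, dependent_pairs=()):
    """Translate an SCM's parent sets into styled causal-graph edges."""
    edges = []
    for child, causes in parents.items():
        for cause in causes:
            # Dashed arrows leave unobserved variables; solid arrows
            # connect observed variables in the direction of causality.
            style = "dashed" if cause in unobserved else "solid"
            edges.append((cause, child, style))
    for a, b in dependent_pairs:
        # Possible association between two unobserved variables.
        edges.append((a, b, "dashed-bidirectional"))
    return edges

# The SCM from this section: e <- f_e(u_a), o <- f_o(u_b, e), u_a not independent of u_b.
parents = {"e": ["u_a"], "o": ["u_b", "e"]}
edges = causal_graph_edges(
    parents,
    unobserved={"u_a", "u_b"},
    dependent_pairs=[("u_a", "u_b")],
)
for edge in edges:
    print(edge)
```

The resulting edge list could be handed to any graph-drawing tool; the styling itself is just a convention for communicating which relationships can be tested against data.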

## Why Must An SCM Define A DAG?

Now that I have presented the structures which define a causal graph, I can illustrate an answer to this question, posed when I first introduced the concept of an SCM. To many, the requirement that edges have a one-directional representation is intuitive, as causal relationships similarly flow in one direction. However, it is not as clear exactly why SCMs must be represented by acyclic graphs. This becomes clearer after analyzing a familiar example. Consider a hypothesized causal relationship between three explanatory variables, **Buyers** ($\textcolor{#7A28CB}{b}$), **Sellers** ($\textcolor{#7A28CB}{s}$), and **Marketing Spend** ($\textcolor{#7A28CB}{m}$), and an outcome variable, **Revenue** ($\textcolor{#EF3E36}{r}$), described by the following causal relationships.

$\color{#52414C}\textcolor{#7A28CB}{m} \leftarrow f_m(\textcolor{#EF3E36}{r})$

$\color{#52414C}\textcolor{#7A28CB}{s} \leftarrow f_s(\textcolor{#7A28CB}{m})$

$\color{#52414C}\textcolor{#7A28CB}{b} \leftarrow f_b(\textcolor{#7A28CB}{s})$

$\color{#52414C}\textcolor{#EF3E36}{r} \leftarrow f_r(\textcolor{#7A28CB}{b})$

These relationships correspond to the following causal graph.

**Figure 7:** A directed graph representing a cyclical causal relationship.

Such a cycle is an example of the virtuous cycle of marketplace dynamics, describing the many moving parts which must be aligned to kick-start a successful marketplace business (please check out Lenny Rachitsky’s amazing blog series for more on this topic).

Note that within this graph, which consists of a four-edge cycle, there exists an edge from **Buyers** to **Revenue**, implying that a change in the number of buyers on a platform causes a change in that platform's monthly revenue. In addition, note that there exist edges from **Revenue** to **Marketing Spend**, from **Marketing Spend** to **Sellers**, and from **Sellers** to **Buyers**; thus, a change in monthly revenue can cause businesses to change their marketing spend, eventually attracting more buyers to their platform. However, these two opposite causal relationships over the same variables, **Buyers** and **Revenue**, contradict the definition of a causal relationship presented in my previous post as a one-directional relationship from a cause to an effect. Thus, we cannot define an SCM from these hypothesized causal relationships.
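The feedback loop in the marketplace example can be surfaced programmatically. The sketch below is my own illustration rather than an established API: it assumes the hypothesized relationships are encoded as a dictionary mapping each variable to the list of variables in its assignment, and uses a depth-first search to return one directed cycle if any exists.

```python
def find_cycle(parents):
    """Return one directed cycle as a list of nodes, or None if acyclic."""
    # Turn "child <- causes" assignments into "cause -> effects" lists.
    children = {}
    for child, causes in parents.items():
        children.setdefault(child, [])
        for cause in causes:
            children.setdefault(cause, []).append(child)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {n: WHITE for n in children}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in children[node]:
            if color[nxt] == GRAY:
                # Reached a node already on the current path: a cycle.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        color[node] = BLACK
        path.pop()
        return None

    for n in children:
        if color[n] == WHITE:
            found = visit(n)
            if found:
                return found
    return None

# m <- f(r), s <- f(m), b <- f(s), r <- f(b)
marketplace = {"m": ["r"], "s": ["m"], "b": ["s"], "r": ["b"]}
cycle = find_cycle(marketplace)
print(cycle)  # ['m', 's', 'b', 'r', 'm']
```

Running the same function on the ice cream model (`{"i": ["u_t"], "d": ["u_t", "i"]}`) returns `None`, which is what makes that set of relationships a valid SCM and the marketplace one not.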

There is significant scholarship regarding the analysis of a variation of causal models which allows for cyclic causal graphs^{2}^{3}, and hopefully I’ll get to cover this in a future post.

## Why Is This Useful?

As we will see in future posts, structural causal models provide a powerful representation of causal relationships, enabling the abstract analyses that often yield practical methodologies for determining causal effects. Causal graphs, the graph-based counterparts of SCMs, are similarly useful to analysts; they facilitate visualization as well as the use of graph theory for causal inference tasks. For example, in future posts I will discuss algorithms for automatically identifying structural causal models from undirected graphs which represent solely associative relationships. In my next post, I will use structural causal models to explain *confounding bias*, a term used to describe processes for which unobserved variables have a direct effect on both explanatory and outcome variables.

1. Wright, S. (1921). “Correlation and causation”. *Journal of Agricultural Research*. 20: 557–585.
2. Lacerda, Gustavo, et al. “Discovering Cyclic Causal Models by Independent Components Analysis.” *arXiv:1206.3273* [cs, stat], June 2012. http://arxiv.org/abs/1206.3273.
3. Richardson, Thomas S. “A Discovery Algorithm for Directed Cyclic Graphs.” *arXiv:1302.3599* [cs], Feb. 2013. http://arxiv.org/abs/1302.3599.