In machine learning, causality concerns how one variable affects another. For example, chocolate consumption is famously correlated with Nobel prizes, yet the relationship is not causal: eating more chocolate won’t lead to a Nobel Prize. Understanding the dynamics of cause and effect can nevertheless be an important factor in the success of certain machine learning projects.
In this tutorial, readers will find an introduction to causality and how to use it in their machine learning projects. The article discusses the correlation between two events, what causal inference means, and how to recognize when correlation does not imply causation. It also covers examples of counterfactual thinking and why it matters, along with a list of tools and models for answering causal questions.
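To make the point about correlation and causation concrete, here is a minimal sketch on simulated data. It is not taken from the tutorial: the confounder `wealth`, the coefficients, and the helper `residualize` are hypothetical choices made for illustration. A hidden common cause drives both chocolate consumption and Nobel prizes, so the two are strongly correlated even though neither affects the other, and the correlation largely disappears once the common cause is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder (e.g. national wealth) that drives both variables.
wealth = rng.normal(size=n)

# Neither variable appears in the other's equation: there is no causal link.
chocolate = 2.0 * wealth + rng.normal(size=n)
nobel = 3.0 * wealth + rng.normal(size=n)

# Raw correlation is strong; in theory 6 / sqrt(5 * 10), roughly 0.85.
raw_corr = np.corrcoef(chocolate, nobel)[0, 1]


def residualize(y, x):
    """Remove the linear effect of x from y and return the residuals."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Correlation of the residuals (a partial correlation controlling for wealth).
partial_corr = np.corrcoef(
    residualize(chocolate, wealth), residualize(nobel, wealth)
)[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```

In real data the confounder is often unobserved, which is why the dedicated causal inference methods discussed in the article are needed rather than a simple adjustment like this one.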
The article mentions the contributions of Judea Pearl and Donald Rubin to causal inference, introducing graphical causal models and the Rubin causal model, respectively. Judea Pearl is an Israeli-American computer scientist and philosopher, currently a professor of computer science and statistics at the University of California, Los Angeles, best known for his work on Bayesian networks and on causal and counterfactual reasoning. Donald Rubin is an American statistician who developed the Rubin causal model, also known as the potential outcomes framework.
Counterfactual thinking is essential to understanding causation, and the article presents different methods for answering counterfactual questions. At the top of its list are inverse probability weighting, instrumental variable methods, and the potential outcomes framework. At the bottom are deep learning and supervised ML, which the article considers the least valid and robust for this purpose.
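As an illustration of the first method on that list, the following is a minimal sketch of inverse probability weighting on simulated data. Everything in it is an assumption made for the example rather than the tutorial's own implementation: the binary confounder `x`, the treatment probabilities, the true effect of 1.0, and the use of the normalized (Hajek) form of the estimator. Because `x` is binary, the propensity score is estimated simply as the treated fraction within each stratum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical binary confounder that affects both treatment and outcome.
x = rng.integers(0, 2, size=n)

# Units with x == 1 are far more likely to be treated (assumed probabilities).
p_treat = np.where(x == 1, 0.8, 0.2)
t = rng.binomial(1, p_treat)

# Outcome: the true treatment effect is 1.0, the confounder adds 2.0.
y = 1.0 * t + 2.0 * x + rng.normal(size=n)

# Naive comparison of group means is biased by the confounder.
naive = y[t == 1].mean() - y[t == 0].mean()

# Estimated propensity score: treated fraction within each stratum of x.
e_hat = np.where(x == 1, t[x == 1].mean(), t[x == 0].mean())

# Normalized inverse probability weights for treated and control units.
w1 = t / e_hat
w0 = (1 - t) / (1 - e_hat)

# Weighted difference in means estimates the average treatment effect.
ipw = np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

print(f"naive difference in means: {naive:.3f}")
print(f"IPW estimate:              {ipw:.3f}")
```

The naive difference overstates the effect because treated units disproportionately have `x == 1`, while the weighted estimate should land close to the simulated effect of 1.0. With continuous or high-dimensional confounders the propensity score would have to come from a fitted model, and the estimate is only as good as that model and the assumption that no confounder is left unmeasured.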
Understanding and correctly interpreting causality is essential for machine learning projects, and this tutorial provides an informative overview of the topic. Knowing the different methods and how to formulate counterfactual questions will help readers properly evaluate their data and build better projects that account for hidden causes and their effects.