Autumn: Causal Discovery Through Program Synthesis
February 1, 2023
Humans develop theories of how the world works through a process of observation and experimentation. This process is mediated by a number of cognitive tools, such as the ability to infer causal connections from observations, build theories to structure our knowledge, and seek out information to validate or disprove our theories. We use these tools in everyday life and, more recently in our evolutionary history, in the practice of science.
If we could build systems that automate scientific discovery, scientific progress itself could be transformed. Such systems could tirelessly explore novel theories, limited only by computational resources, and subject to fewer of the biases, silos, and constraints that exist in scientific practice today. In fields such as climate science, they could discover climate models that also incorporate economic and social systems, informing more effective decision-making. They could aid in the discovery of new laws in physics, leading to a deeper understanding of phenomena such as superconductivity, advancing the design of new materials. In neuroscience, they could extract general principles of neural computation from functional or connectivity data, bridging the gap between our ability to measure the brain and our theoretical understanding of it. They could even be applied to the problem of science itself, leading to better meta-scientific theories of the principles and good practices of science, accelerating progress even further.
Automating intuitive theory discovery has equally transformative potential. Intuitive theories, such as the fact that objects fall when dropped and that icy streets are slippery, are the informal characterizations of everyday reality that we maintain in our minds and learn from experience. Many of the common-sense failures in even the most advanced AI systems, such as those seen in language models, in robotics, and in self-driving vehicles, can be blamed in large part to an absence of intuitive models of how the world works. It’s untenable to anticipate in advance all the models an AI system might need; hence, the ability to discover intuitive models from observations will be a critical component of artificial intelligence systems with greater common sense.
Theory discovery as program synthesis
Theory discovery requires exploration through some space of possible theories, but what is the structure of this space, and what exactly is a theory? At Basis, we adopt the view that programs are good candidate representations of both formal and intuitive models and theories. In other words, abstract theories such as Einstein’s special relativity, concrete models such as those used to predict CO2 change, and the intuitive models of heat that chefs rely on for culinary success, can be represented as symbolic structures that describe computations, i.e., as programs.
The relationship between theories and programs is not merely an analogy—modern programming languages have several features that make them well-suited to represent theories. They are expressive; they allow us to simulate, and hence model, phenomena in areas as diverse as physics, biology, and economics. There is also mounting evidence that humans use mental simulation, particularly probabilistic mental simulation, to do common sense reasoning about our environments1 and each other2. Programs have inherent causal structure3, which allows us to represent causal relationships in the world and imagine counterfactual “what-if” scenarios. Programs are built compositionally—programming languages possess several features that allow us to build programs of staggering complexity incrementally, starting from simple parts. Even then, programs are often interpretable, both engendering and expressing human understanding.
AutumnSynth
In light of this, we're developing techniques of program synthesis—algorithms that generate programs—for theory discovery. Can we create an algorithm that takes in some observations and automatically generates the program (i.e. code) that produced them? Such a program would be a model or theory of the data in the sense that it describes the mechanisms or laws which govern how the data came about. The programming languages community, and more recently the ML community, has explored automated program synthesis and produced instances of human-level programming capabilities, e.g. in Sketch4, AlphaCode5, and AlphaTensor6. However, these systems are designed to be tools to improve programmer productivity, whereas our goal is to advance toward novel scientific discovery by producing algorithms that can perform theory discovery broadly.
In collaboration with Columbia, MIT, and Stanford, our latest paper7 presented at POPL 2023 develops a new algorithm, AutumnSynth, that can synthesize programs in a new programming language called Autumn. We show that AutumnSynth can generate the source code for a novel suite of interactive grid-world games. Autumn is a reactive language—it allows us to succinctly express the temporal dynamics of objects and specify changes that occur as consequences of events. Solving the problem of causal theory discovery in the context of Autumn games, which are visually simple yet contain complex causal dependencies, is an important step toward systems that can learn to discover causal theories in complex real-world environments.
In contrast to Atari-style games, which are often used as benchmarks for Reinforcement Learning agents, Autumn games often lack any notion of winning or reward, and we are not aiming to develop systems that win the game. Instead, the goal is to figure out how everything in the world works, i.e., to build a model of the game. After all, humans often learn models without a clear external reward for doing so.
How AutumnSynth works
AutumnSynth is able to induce the source code of an Autumn program by observing an episode of gameplay by a human player. This gameplay data includes the visual input (what happens on the screen at each time step), and the player’s inputs (button presses and mouse clicks). It does so by uniting two previously disparate techniques in program synthesis: functional and automata synthesis. The first method, functional synthesis, attempts to attribute each observed change in an object across time steps to other on-screen objects or player actions. For example, in the Mario-style game, the “Mario” object moves around the grid-world and can collect coins by moving onto them. The method will notice that each time the player presses the left arrow, Mario moves one space to the left, and that each time Mario intersects with a coin object, that coin disappears. It will correctly attribute the causes of those state changes and define a functional mapping between them.
However, Autumn games often possess causal dependencies that functional synthesis cannot induce. For example, possessing a coin allows Mario to shoot a bullet, which can kill an enemy. But the number of coins currently in Mario’s purse is not displayed anywhere on the screen. Functional synthesis is now faced with an unsolvable problem: there are some frames in which pressing to shoot a bullet will succeed, and others in which it will fail, with no sign to distinguish them. The second method, automata synthesis, endows the algorithm with the ability to invent and track invisible variables, called latent variables, to explain such observations. In this case, the algorithm will learn to track the number of coins Mario possesses as part of the overall game state. AutumnSynth implements these two synthesis techniques as independent modules. Combining state learning (automata synthesis) with nonstate learning (functional synthesis) is a novel approach, despite the two methods being well-developed in the literature.
From here to human-level causal discovery
We evaluated AutumnSynth on the 30 CISC programs, as well as the Exploration, Modeling, and Planning Agent (EMPA) benchmark suite of Atari-style grid-world games8. It was able to learn models for 27 of the 30 CISC programs within a 24-hour time limit, as well as 21 of the 27 EMPA games. Program synthesis is a notoriously hard problem since the number of programs grows astronomically with program-size, and hence, synthesizing relatively large programs from a small amount of data in a reasonable time represents a significant advance. Its ability to learn models for EMPA games is also notable, as they are not written in the Autumn language and often involve large numbers of latent variables, as well as non-deterministic events.
This work provides a framework for combining functional and automata synthesis for causal theory induction across a variety of domains. It also suggests many dimensions for expansion toward our ultimate goal of building algorithms that can perform human-level scientific theory discovery. One important direction is active learning. Instead of being given a complete set of training data to study at one time, humans learn through an incremental process in which they take an active role. They decide how to interact with the world and which questions to ask in order to best update their internal model. Creating an algorithm that can perform this type of process would be a significant step toward understanding human intelligence, as well as toward more capable AI systems. Another direction is to treat program synthesis as a problem of probabilistic inference, and build upon Basis’ ongoing work developing general systems for probabilistic reasoning. Finally, we are looking at broader applications, moving from fully simulated worlds into the real world, and handling the increased complexity that this entails.
Contributors
Research: Ria Das, Armando Solar-Lezama, Joshua Tenenbaum, Zenna Tavares
Article: Karen Schroeder, Zenna Tavares
Illustration: Doug John Miller
Acknowledgments: Janine Kwoh and Rafal Urbaniak for comments and feedback on this article.
References
P. W. Battaglia, J. B. Hamrick, and J. B. Tenenbaum, “Simulation as an engine of physical scene understanding,” Proc. Natl. Acad. Sci., vol. 110, no. 45, pp. 18327–18332, Nov. 2013, doi: 10.1073/pnas.1306572110. ↩︎
C. L. Baker, J. Jara-Ettinger, R. Saxe, and J. B. Tenenbaum, “Rational quantitative attribution of beliefs, desires and percepts in human mentalizing,” Nat. Hum. Behav., vol. 1, no. 4, Art. no. 4, Mar. 2017, doi: 10.1038/s41562-017-0064. ↩︎
Z. Tavares, J. Koppel, X. Zhang, R. Das, and A. Solar-Lezama, “A Language for Counterfactual Generative Models,” in Proceedings of the 38th International Conference on Machine Learning, Jul. 2021, pp. 10173–10182. Accessed: Jan. 27, 2023. Available: https://proceedings.mlr.press/v139/tavares21a.html ↩︎
A. Solar-Lezama, L. Tancau, R. Bodik, S. Seshia, and V. Saraswat, “Combinatorial sketching for finite programs,” in Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, New York, NY, USA, Oct. 2006, pp. 404–415. doi: 10.1145/1168857.1168907. ↩︎
Y. Li et al., “Competition-level code generation with AlphaCode,” Science, vol. 378, no. 6624, pp. 1092–1097, Dec. 2022, doi: 10.1126/science.abq1158. ↩︎
A. Fawzi et al., “Discovering faster matrix multiplication algorithms with reinforcement learning,” Nature, vol. 610, no. 7930, Art. no. 7930, Oct. 2022, doi: 10.1038/s41586-022-05172-4. ↩︎
R. Das, J. B. Tenenbaum, A. Solar-Lezama, and Z. Tavares, “Combining Functional and Automata Synthesis to Discover Causal Reactive Programs,” Proc. ACM Program. Lang., vol. 7, no. POPL, p. 56:1628-56:1658, Jan. 2023, doi: 10.1145/3571249. ↩︎
P. A. Tsividis et al., “Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning.” arXiv, Jul. 26, 2021. doi: 10.48550/arXiv.2107.12544. ↩︎