Summary
The study of complex adaptive systems, pioneered in physics, biology, and the social sciences, offers important lessons for artificial intelligence (AI) governance. Contemporary AI systems and the environments in which they operate exhibit many of the properties characteristic of complex systems, including nonlinear growth patterns, emergent phenomena, and cascading effects that can lead to catastrophic failures. Complex systems science can help illuminate the features of AI that pose central challenges for policymakers, such as feedback loops induced by training AI models on synthetic data and the interconnectedness between AI systems and critical infrastructure. Drawing on insights from other domains shaped by complex systems, including public health and climate change, we examine how efforts to govern AI are marked by deep uncertainty. To contend with this challenge, we propose three desiderata for designing complexity-compatible AI governance principles: early and scalable intervention, adaptive institutional design, and risk thresholds calibrated to trigger timely and effective regulatory responses.
Keywords: complex adaptive systems, scaling, emergence, feedback loops, cascading risks, regulation and governance
The bigger picture
Complex adaptive systems are systems composed of numerous interacting agents that self-organize and adapt to changing environments. Artificial intelligence (AI) systems increasingly exhibit features that are characteristic of complex adaptive systems. These features make it difficult to reliably predict and steer the impact of AI systems on society. Traditional governance approaches, which often assume linear cause-and-effect relationships, are not up to the task. By leveraging insights from complex systems science, we propose governance principles tailored to the reality of modern AI. These principles include early and scalable intervention, adaptive institutional design, and risk thresholds that reflect the nonlinearity of AI systems and their interactions with other sociotechnical structures. Reframing AI governance through the lens of complexity can help law and policy keep pace with the rapid changes arising from this consequential technology.
Complex systems science offers critical insights for designing effective AI governance measures, including early and scalable intervention, dynamic institutional frameworks, and risk thresholds calibrated to trigger timely regulatory responses.
Introduction
Discussions of the impact of artificial intelligence (AI) and of approaches to governing the technology have become increasingly polarized. Scholars and practitioners concerned about the risks of AI systems fiercely debate the appropriate goals, scope, and timing of regulatory policy and intervention,1,2 often divided along disciplinary lines or research communities.3 The discourse is, to a large extent, shaped by conceptual framing. Some characterize AI as a highly consequential software product or service,4,5,6 while others characterize AI as a societal-scale transformation that presents unprecedented risks.7,8
We propose a different lens, grounded in decades of interdisciplinary research in physics, biology, and the social sciences: analyzing AI systems, their development process, and the environments in which they operate as complex systems. Complex systems are systems comprising multiple interacting components. Examples of such systems, in the natural and human world, include insect colonies, urban environments, social networks, and financial markets.9,10,11,12
Complex systems science demonstrates that complex systems of different kinds share common traits, including the emergence of system-level properties and patterns in the absence of central control or design, as well as nonlinear dynamics that defy simple cause-effect relations. Small changes in a complex system’s topology and in the interactions among its components may produce very different overall effects. Complex systems thus entail inherent unpredictability and are susceptible to rare but substantial cascades and to catastrophic failures with far-reaching consequences (Box 1).9,13,14,15,16,17
Box 1. Defining complex systems.
“[A] complex system … [is] … one made up of a large number of parts that interact in a non-simple way. In such systems, the whole is more than the sum of its parts… in the important pragmatic sense that, given the properties of the parts and the laws of their interaction, it is not a trivial matter to infer the properties of the whole.” — Herbert A. Simon, The Architecture of Complexity190
Methodologies developed in complex systems science enable researchers to better understand complex systems and, where appropriate, design policies to address the associated societal challenges. Drawing on studies that suggest that central aspects of contemporary AI systems bear the hallmarks of complexity,18,19,20,21,22,23,24,25,26,27,28,29,30 we make three primary contributions. First, we unpack the characterization of AI systems as complex systems. Second, we explore the implications of this characterization for the challenges involved in governing AI. Third, drawing on insights from complexity and other domains shaped by complex systems, we propose a series of complexity-compatible principles to assist policymakers in developing more effective mechanisms for governing AI.
AI and complexity
Our characterization of AI systems as complex systems focuses on several properties increasingly identified in AI systems, the processes through which they are trained, and the environments in which they are deployed. The properties we focus on are nonlinear growth, unpredictable scaling and emergence, feedback loops, cascading effects, and susceptibility to catastrophic failures. While the list is not exhaustive, and the prevalence of these properties varies substantially across different forms of AI and application domains, they illuminate some of the distinctive governance challenges posed by AI.
A note on terminology: while there is no accepted consensus as to the definition of “AI,” we use the terms “AI” and “AI systems” expansively31 so as to include a wide range of computational systems with varying levels of autonomy (as defined in the EU AI Act32), recognizing their embeddedness in sociotechnical environments.3,18,19,25,26,27,29
Nonlinear growth
In recent years, there has been an exponential increase in many of the key inputs into AI development. The computational resources used to train AI models have, on average, grown by a factor of four to five each year between 2010 and 2024.33 The size of datasets used for training has also increased significantly; language training datasets, for example, have recently grown by a factor of three each year.34 Meanwhile, the efficiency of computation has continued to improve exponentially,35,36 alongside corporate investment that has increased by more than an order of magnitude.37
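To make the compounding concrete (illustrative arithmetic based on the annual growth rate reported by Epoch AI,33 not a separately reported figure), a factor of roughly 4.5 per year sustained over the 14 years from 2010 to 2024 implies

$$4.5^{14} \approx 1.4 \times 10^{9},$$

that is, roughly a billionfold cumulative increase in the compute used to train frontier models over the period.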
During this period, the capabilities of AI systems have improved dramatically. AI systems can now outperform humans on some tasks,38 including certain tasks relating to visual question answering, natural language understanding, and text annotation.37,39 Several benchmarks previously used to evaluate AI systems have, due to improvements in the capabilities of AI, been rendered obsolete.40,41 AI systems have also achieved superhuman feats in various scientific fields, including biology,42,43 mathematics,44 weather forecasting,45 and materials science.46 Importantly, as we illustrate, these improvements in performance may themselves exhibit properties of complexity.
Scaling, emergence, and unpredictability
The effect of increases in the inputs into widely used AI systems, especially foundation models, on the performance of those systems resembles patterns characteristic of other complex systems. This phenomenon has been observed both in model training since the advent of foundation models47 and in model inference, following the development of reasoning models (i.e., models that “think” using chain of thought at run time).48
In model training, cross-entropy loss—the main metric used to measure training performance—has been shown to scale (decrease) as a power law in model size, dataset size, and the amount of compute used in training.49,50,51,52 These “scaling laws” suggest that the performance of AI models (measured by cross-entropy loss) may continue to improve with increases in the inputs used in training. However, the specific capabilities acquired by these models in practice, that is, their ability to perform particular real-world tasks, remain highly unpredictable and can appear to emerge suddenly.24,53,54,55 An early illustration of this phenomenon was observed in 2020 with GPT-3, which, although structurally similar to prior models, gained by virtue of its larger size the qualitatively new ability to learn to perform new tasks after being provided only a few demonstrations of those tasks (known as “few-shot learning”).56
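To illustrate with the parameterization reported by Kaplan et al.,52 training loss falls as a power law in model size $N$ (when not bottlenecked by data or compute), with an analogous form for dataset size $D$:

$$L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \quad \alpha_N \approx 0.076; \qquad L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \quad \alpha_D \approx 0.095,$$

where $N_c$ and $D_c$ are fitted constants. The smoothness of these aggregate curves is precisely what makes the abrupt, task-level capability jumps described next so striking.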
More recently, progress in the development of reasoning models—such as OpenAI’s o series of models57 and DeepSeek’s R series of models58—demonstrates an equivalent phenomenon. Increasing the scale of inference (test-time) compute has resulted in systems gaining qualitatively new abilities, including exceeding human PhD-level accuracy on certain STEM-related benchmarks.59,60,61,62,63
Seen through the lens of complex systems science, the sudden emergence of new capabilities in AI models can be analogized to phase transitions in physical and biological systems,64,65 such as water freezing or boiling when it reaches a certain temperature or the emergence of cognition from multiple neural interactions.66,67 Similarly, AI systems appear to acquire new, qualitatively different abilities when a certain threshold is reached, either in training or at inference. The exact scope and nature of new AI abilities, however, are unpredictable. For instance, in late 2024, OpenAI’s o3 model was able to solve over 25% of the problems in the FrontierMath benchmark, surpassing the 2% achieved by previous models and defying Terence Tao’s prediction that the problems would “resist AIs for several years at least.”68
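One stylized mechanism (our illustration, not drawn from the cited studies) shows how smooth gains can produce threshold-like jumps: if succeeding at a task requires all $k$ steps of a chain to succeed, overall accuracy is $p^k$ for per-step reliability $p$. For a 20-step task,

$$0.90^{20} \approx 0.12, \qquad 0.99^{20} \approx 0.82,$$

so a gradual improvement in per-step reliability flips the task from near-certain failure to routine success, much like a phase transition.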
Feedback loops
Like other complex systems, AI systems interact with their environments and are prone to feedback loops that can generate self-reinforcing processes. These can occur, for instance, when the output of an AI model influences human behavior that is then incorporated into the data used to refine the model or train future models. For example, various algorithms used to predict housing prices can influence real-world housing prices, which then influence future price predictions, and so on.69 This kind of feedback loop, in which predictions that support decisions influence the very outcomes they aim to predict, is known as “performative prediction.”70,71,72 Feedback loops also commonly arise in the context of content recommendation. Recommender systems respond to users’ selection of content by recommending similar content, which, in turn, reinforces users’ existing content preferences.73,74,75,76
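The following minimal sketch makes the loop concrete (our illustration; all parameters are hypothetical): a published forecast nudges the quantity it predicts, and the next forecast tracks the new observation, so the deviation from the underlying value follows $d_{t+1} = \gamma d_t + \text{noise}$, where $\gamma$ is the feedback strength.

```python
import numpy as np

def simulate(gamma: float, steps: int = 40, seed: int = 0) -> float:
    """Toy performative-prediction loop: forecasts shift the observed value
    by a factor gamma, and the next forecast tracks the new observation.
    Returns the final deviation from the underlying value."""
    rng = np.random.default_rng(seed)
    deviation = 1.0  # initial gap between forecast and underlying value
    for _ in range(steps):
        deviation = gamma * deviation + rng.normal(0.0, 0.1)
    return deviation

print(f"gamma=0.8: {simulate(0.8):+.2f}")  # |gamma| < 1: perturbations decay
print(f"gamma=1.1: {simulate(1.1):+.2f}")  # gamma > 1: perturbations compound
```

The qualitative point is the threshold: below a critical feedback strength perturbations wash out, while above it the same mechanism amplifies them, which is why seemingly benign prediction-feedback loops can tip into self-reinforcing dynamics.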
The widespread use of foundation models exacerbates such feedback loops. Because the outputs of foundation models are increasingly incorporated into publicly available data repositories, which are then included in the training data of future models, errors and biases in earlier models could compound with each successive generation of models.77,78,79 For example, anti-consumer biases in language models used to perform legal tasks could intensify if the biased outputs of those models are used to train future models.80
Feedback loops might also ensue as humans tasked with annotating data for training AI models outsource their work to other AI models81,82,83 or are influenced by their use of AI models.84 In addition, training models on large quantities of synthetic data (i.e., data generated by other AI models)85,86,87,88 can, in some circumstances, degrade the quality of the resulting models.89,90,91,92,93,94 This may be exacerbated by the fact that detecting AI-generated content (e.g., by using watermarks)95 and excluding it from training datasets remain difficult.96,97 Studies suggest that a growing fraction of content on the internet is already dominated by synthetic content, including synthetic content produced by models that are themselves trained on synthetic content.98,99,100
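A minimal sketch of this degradation, in the spirit of the Gaussian example analyzed in the model-collapse literature,93,94 though the specific parameters here are ours: each generation fits a distribution to samples produced by the previous generation’s model, so estimation error compounds and the tails of the original distribution are progressively lost.

```python
import numpy as np

# Recursive training on synthetic data: generation t is fit to samples
# drawn from generation t-1. With no fresh real data, the estimated
# spread performs a biased random walk and tends to shrink.
rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0  # generation 0: the real data distribution
for generation in range(1, 51):
    synthetic = rng.normal(mu, sigma, size=20)     # small synthetic corpus
    mu, sigma = synthetic.mean(), synthetic.std()  # refit on model output alone
    if generation % 10 == 0:
        print(f"generation {generation:2d}: sigma = {sigma:.3f}")
# sigma typically ends well below 1.0: rare events vanish first, one
# mechanism behind the degradation reported for real generative models.
```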
Novel feedback loops could also arise as AI models are increasingly used to evaluate the safety of other AI models101,102,103,104 or assist in conducting AI safety research.105,106 While forecasting the precise contours of these interactions will likely be impossible, suffice it to say that research in complex systems science suggests that these phenomena could lead to rapid and potentially dangerous self-reinforcing processes, especially in the case of interconnected systems and networks.107,108
Interconnectedness, cascading effects, and catastrophic failures
Complex systems science sheds light on the vulnerability of interdependent networks to cascading effects, whereby damage to a small number of nodes (i.e., components comprising the system) in one network can have an outsized impact on other interconnected systems, potentially causing large-scale damage.9,13,14,15,16,17 For example, power outages can cause internet outages that then cause further power outages, which can affect additional interconnected networks, such as telecommunications networks.14
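A classic way to see how small shocks can produce outsized cascades is a threshold-contagion simulation (a sketch in the spirit of the complex-networks research cited here; the specific model and parameters are our illustration, not drawn from those papers):

```python
import networkx as nx

def cascade_size(graph: nx.Graph, seed_node: int, phi: float = 0.2) -> int:
    """Threshold-contagion sketch: a node fails once at least a fraction
    phi of its neighbors have failed. Returns the number of failed nodes."""
    failed = {seed_node}
    changed = True
    while changed:
        changed = False
        for node in graph.nodes:
            if node in failed:
                continue
            neighbors = list(graph.neighbors(node))
            if neighbors and sum(n in failed for n in neighbors) / len(neighbors) >= phi:
                failed.add(node)
                changed = True
    return len(failed)

g = nx.erdos_renyi_graph(n=1000, p=5 / 999, seed=42)  # mean degree ~5
print(sorted(cascade_size(g, seed) for seed in range(10)))
# Depending on the seed and parameters, outcomes range from a single
# failed node to a system-wide cascade triggered by one initial failure.
```

The nonlinearity is the point: identical single-node shocks can produce wildly different outcomes, which is why damage to a small number of nodes can be so consequential.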
Similar dynamics could—and perhaps already do—arise in AI, especially when AI systems are integrated into other systems. The prevailing AI paradigm in which foundation models perform downstream applications across multiple domains is particularly vulnerable to cascading effects. Minor defects in foundation models can propagate across the myriad settings in which they are deployed.47,109 While the homogenization introduced by foundation models promotes efficiency (expensive-to-train models can be cheaply reused and adapted to many applications), it also gives rise to new risks familiar to complexity researchers. Safety failures resulting from foundation models might not be independent or isolated from one another but rather correlated and connected.110 For example, systems that exhibit misalignment in one context (e.g., they produce insecure code) tend to exhibit misalignment in other, ostensibly unrelated contexts (e.g., they provide malicious advice and act deceptively).111 Meanwhile, vulnerability to a particular type of adversarial attack can diffuse across multiple domains in which a model or agent is deployed or across different models or agents built using similar architecture.112,113
Cascading effects could compound as AI systems are integrated into external networks and infrastructure. For example, autonomous agents tasked with pursuing complex goals in safety-critical domains, such as financial markets and essential services, could have highly unpredictable and adverse consequences.114,115,116 As autonomous agents are increasingly integrated into other systems, potential cascading effects could become even broader and harder to predict.117 A central factor in assessing these effects and the associated risks is the level of interconnectedness between AI systems and the other systems with which they interact.118 Higher levels of interconnectedness imply a greater likelihood that risks will percolate from an AI system to other systems.119 Meanwhile, AI systems that operate in closed environments or in settings with only limited interconnectedness likely pose less severe risks.
As a result of interconnectedness, feedback loops, and cascading effects, complex systems are particularly susceptible to catastrophic failures. For instance, interconnected feedback loops can upend financial markets in unpredictable high-impact events sometimes described as black swans.120 AI systems could give rise to similar risks.7,8 To illustrate, a malfunctioning AI system used to control wastewater treatment facilities might not only cause direct harm by discharging untreated effluent but could also have wider adverse effects on human health and marine life.121 These catastrophic failures could become more acute if AI systems are integrated into critical infrastructure. For instance, if AI systems are used to control water infrastructure that cools data centers used to train or operate AI systems, then single-system failures—whether resulting from accidental malfunction or malicious adversarial attack—could have far wider consequences.121,122,123
Importantly, catastrophic failures from AI might not necessarily materialize solely due to the defects in a particular AI system percolating to other systems. Instead, catastrophic failures may arise through the interaction of AI systems with broader sociotechnical structures.3,18,19,21,22,25,26,27,29,30 For example, economic incentives and corporate governance structures may prompt companies to deploy AI systems and enable their use in high-stakes domains without sufficient safeguards.53,124,125,126 The rapid diffusion and adoption of these systems dramatically increase the surface area of potential catastrophic failures and present difficult governance challenges.
Lessons for AI governance
Understanding AI systems as complex systems illuminates important governance challenges, many of which are overlooked by current regulatory frameworks.118,125,127 To illustrate, the European Union’s AI Act defines “systemic risk” from AI as “actual or reasonably foreseeable negative effects on public health, safety, public security, fundamental rights, or the society as a whole”32 (Art. 3(65)). This definition arguably equates “systemic risk” with “large-scale harm.” By drawing this parallel and grounding the definition of “systemic risk” in “actual or reasonably foreseeable negative effects,” the EU AI Act overlooks the unpredictable and cascading nature of risks in interconnected systems,128 the interactions between those systems and the environments in which they operate,129 and the relationship between systemic risk and conventional risks.130 Moreover, despite referring to “systemic risk” and purporting to address this class of risk in its operative provisions32 (Art. 55), the Act does not, in fact, adopt a “systems thinking” approach to AI governance, which would entail, among other things, more robust engagement with complex systems science.131 Employing perspectives from complex systems science would allow policymakers to draw on regulatory insights regarding complex systems in other domains, including climate policy,132,133,134 financial regulation,135,136,137 and public health,138 and thereby develop better calibrated guiding principles for tackling the governance challenges posed by AI.
Regulating under deep uncertainty
While the regulation of any moving target is difficult,139,140,141,142 the regulation of AI systems characterized by rapid development, emergent properties, feedback loops, and unpredictable cascading effects is a particularly thorny problem.125,127,143 Neither technologists nor policymakers can reliably predict the capabilities of AI systems or accurately forecast their negative externalities.53,125 Regulatory efforts, whether targeted at model development or deployed systems and applications, must contend with an ongoing information deficit144,145,146 and deep uncertainty.130,147,148 The problem, at its core, is that by the time the capabilities and real-world ramifications of AI systems are properly understood, it may be too late to intervene effectively—a challenge familiar to policymakers in other domains.139
In light of these challenges, we propose three desiderata for designing AI governance mechanisms: (1) policymakers should have the capacity and resources to take early and scalable regulatory action, (2) regulatory action should be dynamic and highly responsive to changing conditions, and (3) policymakers should adopt complexity-compatible risk thresholds with respect to AI systems that exhibit properties characteristic of complex systems.
Early and scalable intervention
When risks cascade in complex systems, policymakers must be able to respond early, rapidly, and at scale.115,125 For example, to prevent large-scale economic harm from vulnerabilities in a widely used automated stock-trading tool, regulators may need to intervene before the harm has (fully) materialized. Counterintuitively, the case for robust intervention to govern complex systems, including AI technologies, may decline over time.138 While early intervention (made on the basis of only limited information) could prevent the relevant harm, intervention at a later point (made on the basis of more complete information) may no longer be effective.138 By analogy, lockdowns and border closures designed to prevent the spread of a pandemic are far more effective, and hence more justifiable, early on (despite the absence of complete information), before the pandemic has spread beyond the point of containment; thereafter, such mandates may no longer work.138 A similar dynamic could apply to AI technologies that exhibit emergent properties, diffuse rapidly and nonlinearly, and lead to cascading effects that could cause large-scale harm.
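A stylized timing experiment (hypothetical numbers, not calibrated to any real incident) captures this dynamic: a harm compounds by a factor $r$ per period until an intervention cuts its growth rate below one, so the final damage depends sharply on when regulators act.

```python
def final_harm(intervene_at: int, r: float = 1.6, r_after: float = 0.7,
               horizon: int = 12) -> float:
    """Harm grows by factor r per period; once the intervention takes
    effect, the growth factor drops to r_after (< 1, i.e., containment)."""
    harm = 1.0
    for t in range(horizon):
        harm *= r if t < intervene_at else r_after
    return harm

for t in (2, 6, 10):
    print(f"intervene at period {t:2d}: final harm = {final_harm(t):9.2f}")
# Identical interventions differ by orders of magnitude in final damage
# purely because of timing, mirroring the pandemic-containment analogy.
```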
Apart from the timing of intervention, mechanisms for governing AI systems must also operate at sufficient scale.125,134,149 Continuing with the example of vulnerabilities in a widely used automated stock-trading tool, for governance mechanisms to be effective, they will need to operate successfully across a very large number of actors, institutions, and environments that interact with the tool in question. Consequently, certain conventional governance mechanisms, such as manual human oversight and evaluation, may be ineffective,150 while more scalable mechanisms, such as automated oversight and evaluation, may, despite their shortcomings and potential risks, become necessary.103,104,105
Adaptive governance
Even if the above desideratum is met, governance institutions will nonetheless need to adapt to new conditions arising due to hard-to-predict changes in AI systems, their usage, and the broader sociotechnical context in which they operate.151,152 To this end, policymakers should draw on the principles of adaptive management and resilience proposed in the field of climate policy and environmental governance.153,154,155 According to these principles, governance mechanisms that aim to regulate complex systems should not be static institutions but rather feedback-driven processes that iteratively respond and adapt to new information while preserving overarching societal goals and values.130,134,144,156 This dynamic approach to governance is especially crucial for mitigating cascading failures of complex systems.157,158,159,160
As illustrated in Table 1, prominent AI governance frameworks include mechanisms for adaptation and change. Notably, these mechanisms for adaptation do not specify the type of information required to bring about changes in the corresponding governance framework. For example, it is unclear what information the EU would need to receive in order to add or remove AI systems or applications from the list of high-risk systems in the EU AI Act.32 That being said, this feature of regulatory frameworks is not necessarily a defect. Perspectives from complex systems suggest that overly fine-grained rules that attempt to anticipate every possible contingency are inherently limited.161 Accordingly, the use of open-ended standards in AI regulation that accommodate regulatory discretion and responsiveness—without stipulating the precise type of information required to trigger regulatory action—has notable advantages and may, on balance, be preferable to more prescriptive approaches.
Table 1. Adaptation mechanisms in prominent governance frameworks

| Framework | Type of framework | Adaptation mechanisms |
|---|---|---|
| US National Institute of Standards and Technology (NIST) AI Risk Management Framework (January 2023)162 | non-binding practice and policy framework | describes the framework as a “living document” and refers to current document as v.1.0; stipulates that NIST will regularly review the document, including with formal public input |
| China Interim Measures for the Management of Generative Artificial Intelligence Services (August 2023)163 | binding obligations on generative AI service providers | these interim measures are likely to be superseded by the draft Artificial Intelligence Law of the People’s Republic of China first circulated in March 2024164 |
| US Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (October 2023)165 | binding reporting requirements and mandatory government actions | requirements carried out by various government agencies that exercise significant discretion; like other US executive orders, the order can be modified or revoked by the President—and was revoked by President Trump in January 2025166 |
| European Union Artificial Intelligence Act (August 2024)32 | binding cross-sector regulation and establishment of new regulatory institutions | while the Act itself will be difficult to amend, it includes mechanisms for amending certain key provisions and further changes through implementing acts, delegated acts, externally determined standards, and a code of practice for general-purpose AI167 |
Nevertheless, to be effective, adaptive governance must be guided by up-to-date information concerning the systems being governed.144,145,146,168,169,170 In the case of AI, policymakers must continually acquire information about the current and anticipated capabilities, trends, and impacts of AI systems.171,172,173,174 In other words, they must engage in “evidence-seeking” policy.175,176 Adaptive governance also implies that policymakers should be cognizant of potential abrupt changes in a system’s performance or behavior and of the possibility that such changes will quickly percolate and affect interconnected systems.
Current regulatory frameworks have made significant progress in tackling this information problem, establishing multiple mechanisms for furnishing policymakers with decision-relevant information. For instance, the EU AI Act requires companies to keep detailed records of certain “high-risk” AI systems and proposes mechanisms for monitoring these systems and reporting safety incidents32 (Arts. 11–12, 72–73). Notwithstanding these mechanisms, deciding how to interpret the information gathered and whether (or how) to act upon it still presents a significant challenge for policymakers.
Complexity-compatible risk thresholds
What threshold of risk from AI should trigger regulatory intervention?177 What information would constitute sufficient evidence that such a threshold has been reached? Regulators often dodge these questions by postponing governance decisions until the relevant “evidentiary burden” is satisfied.178 The problem with this approach is that, because many AI systems exhibit properties characteristic of complex systems, such information may become available only after intervention has grown more costly or less effective.138,139,141,179,180 (This approach can also be exploited by interested parties seeking to delay or obstruct effective regulation.175,176)
Consequently, to intervene effectively, policymakers may need to relax the policy-relevant informational threshold and resort to “satisficing”181—i.e., making governance decisions on the basis of incomplete information collected at an earlier stage in the technology’s development and use.138 For example, policymakers may need to amend AI safety standards upon receiving interim red-teaming results that indicate certain dangerous capabilities prior to receiving the final results, let alone comprehensive studies establishing the precise probability or magnitude of the relevant risks. In such cases, rather than wait until more detailed or complete information is available, regulators should familiarize themselves with the patterns characteristic of complex systems in order to evaluate the potential risks from AI systems and design appropriate interventions.
One potential route is to employ a legal doctrine known as the “precautionary principle.” The principle, which is used in environmental governance and public health policy, supports preemptive regulatory intervention before harms have (fully) materialized or risks are established conclusively, often requiring actors interested in pursuing a potentially risky activity to first prove its safety.182,183 A prominent criticism of the precautionary principle—which, in the case of AI, may require robust technical safety guarantees184,185 or other forms of assurance186,187—is that it does not withstand cost-benefit analysis, i.e., it unduly limits or forgoes the gains from new technology.179,188,189
However, as explored in the context of pandemic responses, insights from complexity can help refine and calibrate the precautionary principle. In particular, regulators should consider whether the relevant risks will likely spread swiftly and exponentially and thereby pose a grave systemic risk.138 When this is the case, the costs of postponing regulatory intervention until more complete information is obtained are often multiplicative, such that delay can be orders of magnitude costlier than early intervention. For example, refraining from intervening to prevent failures in an AI system connected to critical infrastructure could result in costly damage that rapidly percolates into other safety-critical systems.14,15,16,17,158,159 Conversely, the costs of early intervention (e.g., requiring additional guardrails in response to interim red-teaming results) are often linear and additive. Seen through the lens of complexity, cost-benefit analysis can, in certain cases, support a precautionary approach to governing AI. A central consideration in this analysis is the level of interconnectedness between AI systems and other sociotechnical systems,118 as well as the potential for feedback loops and cascading effects. Future work will need to examine these considerations in specific settings in order to implement complexity-compatible risk thresholds in practice.
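The asymmetry can be sketched with hypothetical numbers: if a systemic harm compounds at rate $r$ per month, delaying intervention by $\Delta t$ months multiplies the eventual damage by $e^{r \Delta t}$, whereas the cost of early guardrails accrues roughly linearly, $c\,\Delta t$. For instance,

$$\frac{C_{\text{delayed}}}{C_{\text{early}}} \sim \frac{H_0\, e^{r \Delta t}}{c\, \Delta t}, \qquad e^{0.5 \times 6} \approx 20,$$

so even a six-month delay at a modest compounding rate can make the eventual harm roughly twenty times larger, while the cost of acting early grows only additively.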
Outlook
The hallmarks of complex adaptive systems increasingly exhibited by AI—nonlinearity, emergence, feedback loops, cascading effects, and the potential for catastrophic failures—underscore the difficult governance challenges facing policymakers. Studying AI through the lens of complexity can guide policymakers to focus on the well-studied patterns of complex systems that are likely to arise in the context of AI systems. Complex systems science helps identify and characterize new risks from AI technologies and points toward more appropriate governance mechanisms. Policymakers addressing the challenges from AI should draw on approaches developed in other domains that confront complexity-related challenges, including climate policy and public health. As AI technologies continue to advance and diffuse, the time is ripe to deepen these interdisciplinary connections.
Declaration of interests
The authors declare no competing interests.
Biographies
About the authors
Noam Kolt is an assistant professor at the Hebrew University Faculty of Law and School of Computer Science and Engineering. He leads the Governance of AI Lab (GOAL), a cross-disciplinary research group developing technical and institutional infrastructure to support safe and ethical AI. During his doctorate at the University of Toronto, Noam served as a research advisor to Google DeepMind and was a member of OpenAI’s GPT-4 red team. He has published in the Washington University Law Review, Berkeley Technology Law Journal, Yale Law & Policy Review, and peer-reviewed venues, including NeurIPS, ACM FAccT, and Science.
Michal Shur-Ofry is an associate professor at the Hebrew University Law Faculty and an affiliate visiting faculty at NYU Information Law Institute. A graduate of Hebrew University (LLB, PhD) and University College London (LLM), she held visiting and teaching positions at Georgetown University, the University of British Columbia, and NYU. Her research bridges law and complex systems science, using insights and formal tools from complexity to inform legal rules and policies across diverse legal domains. She also examines regulatory responses to systemic implications of AI. Her research has been published in leading journals in law, public policy, physics, and metascience.
Reuven Cohen is a professor of mathematics at Bar-Ilan University. He is a graduate of Bar-Ilan University (BSc in Physics and Computer Science and PhD in Physics) and has held postdoctoral research positions at the Weizmann Institute of Science, Boston University, and MIT. His research focuses on the theory of complex networks, geometric and combinatorial optimization algorithms, and applied mathematics in general. He has published research in leading journals in physics, computer science, engineering, and mathematics, as well as in interdisciplinary research in law and policy.
References
- 1. Price H., Connelly M. AI governance must deal with long-term risks as well. Nature. 2023;622:31. doi: 10.1038/d41586-023-03117-z.
- 2. Editorial. Stop talking about tomorrow’s AI doomsday when AI poses risks today. Nature. 2023;618:885–886. doi: 10.1038/d41586-023-02094-7.
- 3. Lazar S., Nelson A. AI safety on whose terms? Science. 2023;381:138. doi: 10.1126/science.adi8982.
- 4. Bommasani R., Kapoor S., Klyman K., Longpre S., Ramaswami A., Zhang D., Schaake M., Ho D.E., Narayanan A., Liang P. Considerations for governing open foundation models. Science. 2024;386:151–153. doi: 10.1126/science.adp1848.
- 5. Kapoor S., Bommasani R., Klyman K., Longpre S., Ramaswami A., Cihon P., Hopkins A., Bankston K., Biderman S., Bogen M., et al. Position: on the societal impact of open foundation models. In Forty-first International Conference on Machine Learning (ICML ’24). JMLR; 2024.
- 6. Narayanan A., Kapoor S. AI as normal technology. Knight First Amend. Inst. 2025.
- 7. Anwar U., Saparov A., Rando J., Paleka D., Turpin M., Hase P., Lubana E.S., Jenner E., Casper S., Sourbut O., et al. Foundational challenges in assuring alignment and safety of large language models. Transactions on Machine Learning Research. 2024. Preprint at arXiv. doi: 10.48550/arXiv.2404.09932.
- 8. Bengio Y., Hinton G., Yao A., Song D., Abbeel P., Darrell T., Harari Y.N., Zhang Y.Q., Xue L., Shalev-Shwartz S., et al. Managing extreme AI risks amid rapid progress. Science. 2024;384:842–845. doi: 10.1126/science.adn0117.
- 9. Cohen R., Havlin S. Complex Networks: Structure, Robustness, and Function. Cambridge University Press; 2010.
- 10. Miller J.H., Page S.E. Complex Adaptive Systems: An Introduction to Computational Models of Social Life. Princeton Studies in Complexity. Princeton University Press; 2007.
- 11. Mitchell M. Complexity: A Guided Tour. Oxford University Press; 2009.
- 12. Siegenfeld A.F., Bar-Yam Y. An introduction to complex systems science and its applications. Complexity. 2020;2020:1–16.
- 13. Bashan A., Berezin Y., Buldyrev S.V., Havlin S. The extreme vulnerability of interdependent spatially embedded networks. Nat. Phys. 2013;9:667–672. doi: 10.1038/nphys2727.
- 14. Buldyrev S.V., Parshani R., Paul G., Stanley H.E., Havlin S. Catastrophic cascade of failures in interdependent networks. Nature. 2010;464:1025–1028. doi: 10.1038/nature08932.
- 15. Gao J., Bashan A., Shekhtman L., Havlin S. Introduction to Networks of Networks. IOP Ebooks Series. Institute of Physics Publishing; 2022.
- 16. Li W., Bashan A., Buldyrev S.V., Stanley H.E., Havlin S. Cascading failures in interdependent lattice networks: the critical role of the length of dependency links. Phys. Rev. Lett. 2012;108:228702. doi: 10.1103/PhysRevLett.108.228702.
- 17. Yang Y., Nishikawa T., Motter A.E. Small vulnerable sets determine large network cascades in power grids. Science. 2017;358:eaan3184. doi: 10.1126/science.aan3184.
- 18. Dobbe R. System safety and artificial intelligence. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). Association for Computing Machinery; 2022, p. 1584.
- 19. Hendrycks D. Introduction to AI Safety, Ethics, and Society. Taylor & Francis; 2025.
- 20. Holtzman A., West P., Zettlemoyer L. Generative models as a complex systems science: how can we make sense of large language model behavior? Preprint at arXiv. 2023. doi: 10.48550/arXiv.2308.00189.
- 21. Leveson N.G. Engineering a Safer World: Systems Thinking Applied to Safety. Engineering Systems. The MIT Press; 2012.
- 22. Macrae C. Learning from the failure of autonomous and intelligent systems: accidents, safety, and sociotechnical sources of risk. Risk Anal. 2022;42:1999–2025. doi: 10.1111/risa.13850.
- 23. Nanda N., Chan L., Lieberum T., Smith J., Steinhardt J. Progress measures for grokking via mechanistic interpretability. In The Eleventh International Conference on Learning Representations; 2022. https://openreview.net/forum?id=9XFSbDPmdW.
- 24. Power A., Burda Y., Edwards H., Babuschkin I., Misra V. Grokking: generalization beyond overfitting on small algorithmic datasets. Preprint at arXiv. 2022. doi: 10.48550/arXiv.2201.02177.
- 25. Rakova B., Dobbe R. Algorithms as social-ecological-technological systems: an environmental justice lens on algorithmic audits. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’23). Association for Computing Machinery; 2023, p. 491.
- 26. Rismani S., Shelby R., Smart A., Jatho E., Kroll J., Moon A., Rostamzadeh N. From plane crashes to algorithmic harm: applicability of safety engineering frameworks for responsible ML. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). Association for Computing Machinery; 2023, pp. 1–18.
- 27. Shelby R., Rismani S., Henne K., Moon A., Rostamzadeh N., Nicholas P., Yilla-Akbari N., Gallegos J., Smart A., Garcia E., Virk G. Sociotechnical harms of algorithmic systems: scoping a taxonomy for harm reduction. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (AIES ’23). Association for Computing Machinery; 2023, pp. 723–741.
- 28. Wei J., Tay Y., Bommasani R., Raffel C., Zoph B., Borgeaud S., Yogatama D., Bosma M., Zhou D., Metzler D., et al. Emergent abilities of large language models. Transactions on Machine Learning Research. 2022.
- 29. Weidinger L., Rauh M., Marchal N., Manzini A., Hendricks L.A., Mateos-Garcia J., Bergman S., Kay J., Griffin C., Bariach B., et al. Sociotechnical safety evaluation of generative AI systems. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2310.11986.
- 30. Yoo C. Beyond algorithmic disclosure for generative AI. Columbia Sci. Technol. Law Rev. 2024;25. doi: 10.52214/stlr.v25i2.12766.
- 31. Schuett J. Defining the scope of AI regulations. Law Innov. Technol. 2023;15:60–82.
- 32. European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act). 2024.
- 33. Sevilla J. Training compute of frontier AI models grows by 4-5x per year. Epoch AI. 2024. https://epoch.ai/blog/training-compute-of-frontier-ai-models-grows-by-4-5x-per-year.
- 34. Epoch AI. Machine learning trends. 2025. https://epoch.ai/trends.
- 35. Pilz K., Heim L., Brown N. Increased compute efficiency and the diffusion of AI capabilities. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2311.15377.
- 36. Ho A., Besiroglu T., Erdil E., Owen D., Rahman R., Guo Z.C., Atkinson D., Thompson N., Sevilla J. Algorithmic progress in language models. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2403.05812.
- 37. Stanford University. The AI Index 2024 Annual Report. AI Index Steering Committee, Institute for Human-Centered AI, Stanford University; 2024. https://aiindex.stanford.edu/report/.
- 38. Morris M.R., Sohl-Dickstein J., Fiedel N., Warkentin T., Dafoe A., Faust A., Farabet C., Legg S. Position: levels of AGI for operationalizing progress on the path to AGI. In Forty-first International Conference on Machine Learning; 2024.
- 39. Gilardi F., Alizadeh M., Kubli M. ChatGPT outperforms crowd workers for text-annotation tasks. Proc. Natl. Acad. Sci. USA. 2023;120:e2305016120. doi: 10.1073/pnas.2305016120.
- 40. Deng J., Dong W., Socher R., Li L.J., Li K., Fei-Fei L. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition; 2009, pp. 248–255.
- 41. Wang A., Pruksachatkun Y., Nangia N., Singh A., Michael J., Hill F., Levy O., Bowman S.R. SuperGLUE: a stickier benchmark for general-purpose language understanding systems. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates Inc.; 2019, pp. 3266–3280.
- 42. Abramson J., Adler J., Dunger J., Evans R., Green T., Pritzel A., Ronneberger O., Willmore L., Ballard A.J., Bambrick J., et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;630:493–500. doi: 10.1038/s41586-024-07487-w.
- 43. Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A., et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
- 44. Romera-Paredes B., Barekatain M., Novikov A., Balog M., Kumar M.P., Dupont E., Ruiz F.J.R., Ellenberg J.S., Wang P., Fawzi O., et al. Mathematical discoveries from program search with large language models. Nature. 2024;625:468–475. doi: 10.1038/s41586-023-06924-6.
- 45. Lam R., Sanchez-Gonzalez A., Willson M., Wirnsberger P., Fortunato M., Alet F., Ravuri S., Ewalds T., Eaton-Rosen Z., Hu W., et al. Learning skillful medium-range global weather forecasting. Science. 2023;382:1416–1421. doi: 10.1126/science.adi2336.
- 46. Merchant A., Batzner S., Schoenholz S.S., Aykol M., Cheon G., Cubuk E.D. Scaling deep learning for materials discovery. Nature. 2023;624:80–85. doi: 10.1038/s41586-023-06735-9.
- 47. Bommasani R., Hudson D.A., Adeli E., Altman R., Arora S., von Arx S., Bernstein M.S., Bohg J., Bosselut A., Brunskill E., et al. On the opportunities and risks of foundation models. Preprint at arXiv. 2021. doi: 10.48550/arXiv.2108.07258.
- 48. Xu F., Hao Q., Zong Z., Wang J., Zhang Y., Wang J., Lan X., Gong J., Ouyang T., Meng F., et al. Towards large reasoning models: a survey of reinforced reasoning with large language models. Preprint at arXiv. 2025. doi: 10.48550/arXiv.2501.09686.
- 49. Bahri Y., Dyer E., Kaplan J., Lee J., Sharma U. Explaining neural scaling laws. Proc. Natl. Acad. Sci. USA. 2024;121:e2311878121. doi: 10.1073/pnas.2311878121.
- 50. Henighan T., Kaplan J., Katz M., Chen M., Hesse C., Jackson J., Jun H., Brown T.B., Dhariwal P., Gray S., et al. Scaling laws for autoregressive generative modeling. Preprint at arXiv. 2020. doi: 10.48550/arXiv.2010.14701.
- 51. Hoffmann J., Borgeaud S., Mensch A., Buchatskaya E., Cai T., Rutherford E., de Las Casas D., Hendricks L.A., Welbl J., Clark A., et al. Training compute-optimal large language models. In Proceedings of the 36th International Conference on Neural Information Processing Systems (NIPS ’22). Curran Associates Inc.; 2022, pp. 30016–30030.
- 52. Kaplan J., McCandlish S., Henighan T., Brown T.B., Chess B., Child R., Gray S., Radford A., Wu J., Amodei D. Scaling laws for neural language models. Preprint at arXiv. 2020. doi: 10.48550/arXiv.2001.08361.
- 53. Ganguli D., Hernandez D., Lovitt L., Askell A., Bai Y., Chen A., Conerly T., Dassarma N., Drain D., Elhage N., et al. Predictability and surprise in large generative models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22). Association for Computing Machinery; 2022, pp. 1747–1764.
- 54. Ruan Y., Maddison C.J., Hashimoto T. Observational scaling laws and the predictability of language model performance. In The Thirty-eighth Annual Conference on Neural Information Processing Systems; 2024.
- 55. Schaeffer R., Schoelkopf H., Miranda B., Mukobi G., Madan V., Ibrahim A., Bradley H., Biderman S., Koyejo S. Why has predicting downstream capabilities of frontier AI models with scale remained elusive? Preprint at arXiv. 2024. doi: 10.48550/arXiv.2406.04391.
- 56. Brown T.B., Mann B., Ryder N., Subbiah M., Kaplan J., Dhariwal P., Neelakantan A., Shyam P., Sastry G., Askell A., et al. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS ’20). Curran Associates Inc.; 2020, pp. 1877–1901.
- 57. OpenAI. Learning to reason with LLMs. 2024. https://openai.com/index/learning-to-reason-with-llms/.
- 58. Guo D., Yang D., Zhang H., Song J., Zhang R., Xu R., Zhu Q., Ma S., Wang P., Bi X., et al. DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. Preprint at arXiv. 2025. doi: 10.48550/arXiv.2501.12948.
- 59. Brown B., Juravsky J., Ehrlich R., Clark R., Le Q.V., Ré C., Mirhoseini A. Large language monkeys: scaling inference compute with repeated sampling. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2407.21787.
- 60. Snell C., Lee J., Xu K., Kumar A. Scaling LLM test-time compute optimally can be more effective than scaling model parameters. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2408.03314.
- 61. Stroebl B., Kapoor S., Narayanan A. Inference scaling flaws: the limits of LLM resampling with imperfect verifiers. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2411.17501.
- 62. Wu Y., Sun Z., Li S., Welleck S., Yang Y. Inference scaling laws: an empirical analysis of compute-optimal inference for problem-solving with language models. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2408.00724.
- 63. Li M., Kudugunta S., Zettlemoyer L. (Mis)fitting scaling laws: a survey of scaling law fitting techniques in deep learning. In The Thirteenth International Conference on Learning Representations; 2025.
- 64. Anderson P.W. More is different. Science. 1972;177:393–396. doi: 10.1126/science.177.4047.393.
- 65. Stanley H.E. Introduction to Phase Transitions and Critical Phenomena. International Series of Monographs on Physics. Oxford University Press; 1987.
- 66. Lubana E.S., Kawaguchi K., Dick R.P., Tanaka H. A percolation model of emergence: analyzing transformers trained on a formal language. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2408.12578.
- 67. Pan A., Bhatia K., Steinhardt J. The effects of reward misspecification: mapping and mitigating misaligned models. In International Conference on Learning Representations; 2021.
- 68. Epoch AI. FrontierMath. 2024. https://epoch.ai/frontiermath.
- 69. Fu R., Jin G.Z., Liu M. Does human-algorithm feedback loop lead to error propagation? Evidence from Zillow’s Zestimate. National Bureau of Economic Research; 2022.
- 70. Hardt M., Mendler-Dünner C. Performative prediction: past and future. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2310.16608.
- 71. Healy K. The performativity of networks. Arch. Eur. Sociol. 2015;56:175–205. doi: 10.1017/S0003975615000107.
- 72. Perdomo J., Zrnic T., Mendler-Dünner C., Hardt M. Performative prediction. In Proceedings of the 37th International Conference on Machine Learning. PMLR; 2020, pp. 7599–7609. https://proceedings.mlr.press/v119/perdomo20a.html.
- 73. Cen S.H., Ilyas A., Allen J., Li H., Madry A. Measuring strategization in recommendation: users adapt their behavior to shape future content. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2405.05596.
- 74. Pedreschi D., Pappalardo L., Ferragina E., Baeza-Yates R., Barabási A.L., Dignum F., Dignum V., Eliassi-Rad T., Giannotti F., Kertész J., et al. Human-AI coevolution. Artif. Intell. 2025;339:104244. doi: 10.1016/j.artint.2024.104244.
- 75. Stray J., Halevy A., Assar P., Hadfield-Menell D., Boutilier C., Ashar A., Bakalar C., Beattie L., Ekstrand M., Leibowicz C., et al. Building human values into recommender systems: an interdisciplinary synthesis. ACM Trans. Recomm. Syst. 2024;2:1–20. doi: 10.1145/3632297.
- 76. Williams M., Carroll M., Narang A., Weisser C., Murphy B., Dragan A. On targeted manipulation and deception when optimizing LLMs for user feedback. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2411.02306.
- 77. Bender E.M., Gebru T., McMillan-Major A., Shmitchell S. On the dangers of stochastic parrots: can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’21). Association for Computing Machinery; 2021, pp. 610–623.
- 78. Shur-Ofry M. Multiplicity as an AI governance principle. Indiana Law J. 2025;100:6. https://www.repository.law.indiana.edu/ilj/vol100/iss4/6/.
- 79. Taori R., Hashimoto T.B. Data feedback loops: model-driven amplification of dataset biases. In Proceedings of the 40th International Conference on Machine Learning (ICML ’23), vol. 202. JMLR; 2023, pp. 33883–33920.
- 80. Kolt N. Predicting consumer contracts. Berk. Technol. Law J. 2022;37:71–138.
- 81. Veselovsky V., Ribeiro M.H., Cozzolino P., Gordon A., Rothschild D., West R. Prevalence and prevention of large language model use in crowd work. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2310.15683.
- 82. Veselovsky V., Ribeiro M.H., West R. Artificial artificial artificial intelligence: crowd workers widely use large language models for text production tasks. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2306.07899.
- 83. Wu T., Zhu H., Albayrak M., Axon A., Bertsch A., Deng W., Ding Z., Guo B., Gururaja S., Kuo T.S., et al. LLMs as workers in human-computational algorithms? Replicating crowdsourcing pipelines with LLMs. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2307.10168.
- 84. Jakesch M., Bhat A., Buschek D., Zalmanson L., Naaman M. Co-writing with opinionated language models affects users’ views. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). Association for Computing Machinery; 2023, pp. 1–15.
- 85. Fan L., Chen K., Krishnan D., Katabi D., Isola P., Tian Y. Scaling laws of synthetic images for model training... for now. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2024, pp. 7382–7392.
- 86. Lee H., Phatale S., Mansoor H., Mesnard T., Ferret J., Lu K., Bishop C., Hall E., Carbune V., Rastogi A., Prakash S. RLAIF vs. RLHF: scaling reinforcement learning from human feedback with AI feedback. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2309.00267.
- 87. Liu R., Wei J., Liu F., Si C., Zhang Y., Rao J., Zheng S., Peng D., Yang D., Zhou D., Dai A.M. Best practices and lessons learned on synthetic data. In First Conference on Language Modeling; 2024. https://openreview.net/forum?id=OJaWBhh61C.
- 88. Singh A., Co-Reyes J.D., Agarwal R., Anand A., Patil P., Garcia X., Liu P.J., Harrison J., Lee J., Xu K., et al. Beyond human data: scaling self-training for problem-solving with language models. Transactions on Machine Learning Research. 2024.
- 89. Alemohammad S., Casco-Rodriguez J., Luzi L., Humayun A.I., Babaei H., LeJeune D., Siahkoohi A., Baraniuk R. Self-consuming generative models go MAD. In The Twelfth International Conference on Learning Representations; 2023.
- 90. Bohacek M., Farid H. Nepotistically trained generative-AI models collapse. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2311.12202.
- 91. Gerstgrasser M., Schaeffer R., Dey A., Rafailov R., Korbak T., Sleight H., Agrawal R., Hughes J., Pai D.B., Gromov A., et al. Is model collapse inevitable? Breaking the curse of recursion by accumulating real and synthetic data. In First Conference on Language Modeling; 2024. https://openreview.net/forum?id=5B2K4LRgmz.
- 92. Kazdan J., Schaeffer R., Dey A., Gerstgrasser M., Rafailov R., Donoho D.L., Koyejo S. Collapse or thrive? Perils and promises of synthetic data in a self-generating world. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2410.16713.
- 93. Shumailov I., Shumaylov Z., Zhao Y., Gal Y., Papernot N., Anderson R. The curse of recursion: training on generated data makes models forget. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2305.17493.
- 94. Shumailov I., Shumaylov Z., Zhao Y., Papernot N., Anderson R., Gal Y. AI models collapse when trained on recursively generated data. Nature. 2024;631:755–759. doi: 10.1038/s41586-024-07566-y.
- 95. Kirchenbauer J., Geiping J., Wen Y., Katz J., Miers I., Goldstein T. A watermark for large language models. In Proceedings of the 40th International Conference on Machine Learning. PMLR; 2023, pp. 17061–17084. https://proceedings.mlr.press/v202/kirchenbauer23a.html.
- 96. Kirchenbauer J., Geiping J., Wen Y., Shu M., Saifullah K., Kong K., Fernando K., Saha A., Goldblum M., Goldstein T. On the reliability of watermarks for large language models. In The Twelfth International Conference on Learning Representations; 2023.
- 97. Sadasivan V.S., Kumar A., Balasubramanian S., Wang W., Feizi S. Can AI-generated text be reliably detected? Preprint at arXiv. 2024. doi: 10.48550/arXiv.2303.11156.
- 98. Huang S., Siddarth D. Generative AI and the digital commons. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2303.11074.
- 99. Liang W., Izzo Z., Zhang Y., Lepp H., Cao H., Zhao X., Chen L., Ye H., Liu S., Huang Z., et al. Monitoring AI-modified content at scale: a case study on the impact of ChatGPT on AI conference peer reviews. In Forty-first International Conference on Machine Learning; 2024.
- 100. Thompson B., Dhaliwal M., Frisch P., Domhan T., Federico M. A shocking amount of the web is machine translated: insights from multi-way parallelism. In Ku L.W., Martins A., Srikumar V., eds. Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics; 2024, pp. 1763–1775.
- 101. Balloccu S., Schmidtová P., Lango M., Dusek O. Leak, cheat, repeat: data contamination and evaluation malpractices in closed-source LLMs. In Graham Y., Purver M., eds. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics; 2024, pp. 67–93. https://aclanthology.org/2024.eacl-long.5.
- 102. Panickssery A., Bowman S.R., Feng S. LLM evaluators recognize and favor their own generations. In The Thirty-eighth Annual Conference on Neural Information Processing Systems; 2024. https://openreview.net/forum?id=4NJBV6Wp0h.
- 103. Perez E., Huang S., Song F., Cai T., Ring R., Aslanides J., Glaese A., McAleese N., Irving G. Red teaming language models with language models. In Goldberg Y., Kozareva Z., Zhang Y., eds. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics; 2022, pp. 3419–3448.
- 104. Perez E., Ringer S., Lukosiute K., Nguyen K., Chen E., Heiner S., Pettit C., Olsson C., Kundu S., Kadavath S., et al. Discovering language model behaviors with model-written evaluations. In Rogers A., Boyd-Graber J., Okazaki N., eds. Findings of the Association for Computational Linguistics: ACL 2023. Association for Computational Linguistics; 2023, pp. 13387–13434.
- 105. Bowman S.R., Hyun J., Perez E., Chen E., Pettit C., Heiner S., Lukosiute K., Askell A., Jones A., Chen A., et al. Measuring progress on scalable oversight for large language models. Preprint at arXiv. 2022. doi: 10.48550/arXiv.2211.03540.
- 106. Burns C., Izmailov P., Kirchner J.H., Baker B., Gao L., Aschenbrenner L., Chen Y., Ecoffet A., Joglekar M., Leike J., et al. Weak-to-strong generalization: eliciting strong capabilities with weak supervision. In Forty-first International Conference on Machine Learning; 2024.
- 107.Pan A., Jones E., Jagadeesan M., Steinhardt J. Forty-first International Conference on Machine Learning. 2024. Feedback loops with language models drive in-context reward hacking. [Google Scholar]
- 108.Wyllie S., Shumailov I., Papernot N. Fairness feedback loops: Training on synthetic data amplifies bias. In: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24). Association for Computing Machinery; 2024. pp. 2113–2147.
- 109.Feng S., Park C.Y., Liu Y., Tsvetkov Y. From pretraining data to language models to downstream tasks: Tracking the trails of political biases leading to unfair NLP models. In: Rogers A., Boyd-Graber J., Okazaki N., eds. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics; 2023. pp. 11737–11762.
- 110.Toups C., Bommasani R., Creel K.A., Bana S.H., Jurafsky D., Liang P. Ecosystem-level analysis of deployed machine learning reveals homogeneous outcomes. In: Proceedings of the 37th International Conference on Neural Information Processing Systems (NIPS ’23). Curran Associates Inc.; 2024. pp. 51178–51201.
- 111.Betley J., Tan D., Warncke N., Sztyber-Betley A., Bao X., Soto M., Labenz N., Evans O. Emergent misalignment: Narrow finetuning can produce broadly misaligned LLMs. Preprint at arXiv. 2025. doi: 10.48550/arXiv.2502.17424.
- 112.Zou A., Wang Z., Carlini N., Nasr M., Kolter J.Z., Fredrikson M. Universal and transferable adversarial attacks on aligned language models. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2307.15043.
- 113.Li A., Zhou Y., Raghuram V.C., Goldstein T., Goldblum M. Commercial LLM agents are already vulnerable to simple yet dangerous attacks. Preprint at arXiv. 2025. doi: 10.48550/arXiv.2502.08586.
- 114.Chan A., Salganik R., Markelius A., Pang C., Rajkumar N., Krasheninnikov D., Langosco L., He Z., Duan Y., Carroll M., et al. Harms from increasingly agentic algorithmic systems. In: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’23). Association for Computing Machinery; 2023. pp. 651–666.
- 115.Cohen M.K., Kolt N., Bengio Y., Hadfield G.K., Russell S. Regulating advanced artificial agents. Science. 2024;384:36–38. doi: 10.1126/science.adl0625.
- 116.Kolt N. Governing AI agents. Notre Dame Law Review. 2025;101 (forthcoming). https://papers.ssrn.com/abstract=4772956
- 117.Ruan Y., Dong H., Wang A., Pitis S., Zhou Y., Ba J., Dubois Y., Maddison C.J., Hashimoto T. Identifying the risks of LM agents with an LM-emulated sandbox. In: The Twelfth International Conference on Learning Representations. 2024.
- 118.Shur-Ofry M. A networks-of-networks perspective on AI policy. Network Law Review. 2024. https://www.networklawreview.org/shur-ofry-generative-ai/
- 119.Hammond L., Chan A., Clifton J., Hoelscher-Obermaier J., Khan A., McLean E., Smith C., Barfuss W., Foerster J., Gavenčiak T., et al. Multi-agent risks from advanced AI. Preprint at arXiv. 2025. doi: 10.48550/arXiv.2502.14143.
- 120.Taylor J.B., Williams J.C. A black swan in the money market. American Economic Journal: Macroeconomics. 2009;1:58–83. doi: 10.1257/mac.1.1.58.
- 121.Richards C.E., Tzachor A., Avin S., Fenner R. Rewards, risks and responsible deployment of artificial intelligence in water systems. Nat. Water. 2023;1:422–432. doi: 10.1038/s44221-023-00069-6.
- 122.Galaz V., Centeno M.A., Callahan P.W., Causevic A., Patterson T., Brass I., Baum S., Farber D., Fischer J., Garcia D., et al. Artificial intelligence, systemic risks, and sustainability. Technol. Soc. 2021;67:101741. doi: 10.1016/j.techsoc.2021.101741.
- 123.Tzachor A., Devare M., King B., Avin S., Ó hÉigeartaigh S. Responsible artificial intelligence in agriculture requires systemic understanding of risks and externalities. Nat. Mach. Intell. 2022;4:104–109.
- 124.Askell A., Brundage M., Hadfield G. The role of cooperation in responsible AI development. Preprint at arXiv. 2019. doi: 10.48550/arXiv.1907.04534.
- 125.Kolt N. Algorithmic black swans. Washington University Law Review. 2024;101:1177–1240.
- 126.Dafoe A. AI governance: Overview and theoretical lenses. In: Bullock J.B., Chen Y.C., Himmelreich J., Hudson V.M., Korinek A., Young M.M., Zhang B., eds. The Oxford Handbook of AI Governance. Oxford University Press; 2023. pp. 21–44.
- 127.Arbel Y., Tokson M., Lin A. Systemic regulation of artificial intelligence. Ariz. State Law J. 2024;56:545.
- 128.Centeno M.A., Nag M., Patterson T.S., Shaver A., Windawi A.J. The emergence of global systemic risk. Annu. Rev. Sociol. 2015;41:65–85.
- 129.Renn O., Lucas K., Haas A., Jaeger C. Things are different today: the challenge of global systemic risks. J. Risk Res. 2019;22:401–415.
- 130.Schweizer P.J., Juhola S. Navigating systemic risks: governance of and for systemic risks. Glob. Sustain. 2024;7:e38.
- 131.Arnscheidt C.W., Beard S., Hobson T., Ingram P., Kemp L., Mani L., Marcoci A., Mbeva K., Ó hÉigeartaigh S., Sandberg A., et al. Systemic contributions to global catastrophic risk. Glob. Sustain. 2025 (forthcoming).
- 132.Cosens B., Ruhl J.B., Soininen N., Gunderson L., Belinskij A., Blenckner T., Camacho A.E., Chaffin B.C., Craig R.K., Doremus H., et al. Governing complexity: Integrating science, governance, and law to manage accelerating change in the globalized commons. Proc. Natl. Acad. Sci. USA. 2021;118:e2102798118. doi: 10.1073/pnas.2102798118.
- 133.Craig R.K. Stationarity is dead - long live transformation: Five principles for climate change adaptation law. Harv. Environ. Law Rev. 2010;34:9–74.
- 134.Ruhl J.B. General design principles for resilience and adaptive capacity in legal systems - with applications to climate change adaptation. N. C. Law Rev. 2010;89:1373–1404.
- 135.Arner D.W. Adaptation and resilience in global financial regulation. N. C. Law Rev. 2010;89:1579–1628.
- 136.Schwarcz S.L. Systemic risk. Georgetown Law J. 2008;97:193–250.
- 137.Schwarcz S.L. Regulating complexity in financial markets. Washington University Law Review. 2009;87:211–268.
- 138.Malcai O., Shur-Ofry M. Using complexity to calibrate legal response to COVID-19. Front. Phys. 2021;9:650943. doi: 10.3389/fphy.2021.650943.
- 139.Collingridge D. The Social Control of Technology. St. Martin’s Press; 1980.
- 140.Moses L.B. Recurring dilemmas: The law’s race to keep up with technological change. University of Illinois Journal of Law, Technology & Policy. 2007:239–286.
- 141.Marchant G.E., Allenby B.R., Herkert J.R., eds. The Growing Gap Between Emerging Technologies and Legal-Ethical Oversight: The Pacing Problem. The International Library of Ethics, Law and Technology, Vol. 7. Springer Netherlands; 2011.
- 142.Crootof R., Ard B.J. Structuring techlaw. Harv. J. Law Technol. 2020;34:347–418.
- 143.Kaminski M.E. Regulating the risks of AI. Boston University Law Review. 2023;103:1347–1411.
- 144.Doremus H. Adaptive management as an information problem. N. C. Law Rev. 2010;89:1455–1498.
- 145.Karkkainen B.C. Information as environmental regulation: TRI and performance benchmarking, precursor to a new paradigm. Georgetown Law J. 2000;89:257–370.
- 146.Karkkainen B.C. Bottlenecks and baselines: Tackling information deficits in environmental regulation. Tex. Law Rev. 2007;86:1409–1444.
- 147.Kay J.A., King M.A. Radical Uncertainty. The Bridge Street Press; 2020.
- 148.Marchau V.A.W.J., Walker W.E., Bloemen P.J.T.M., Popper S.W., eds. Decision Making under Deep Uncertainty: From Theory to Practice. Springer International Publishing; 2019.
- 149.Ford C. Prospects for scalability: Relationships and uncertainty in responsive regulation. Regulation & Governance. 2013;7:14–29. doi: 10.1111/j.1748-5991.2012.01166.x.
- 150.Crootof R., Kaminski M.E., Price W.N.I. Humans in the loop. Vanderbilt Law Rev. 2023;76:429–510.
- 151.Benthall S., Sivan-Sevilla I. Regulatory CI: Adaptively regulating privacy as contextual integrity. In: ACM Symposium on Computer Science & Law. 2024.
- 152.Reuel A., Undheim T.A. Generative AI needs adaptive governance. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2406.04554.
- 153.Folke C., Hahn T., Olsson P., Norberg J. Adaptive governance of social-ecological systems. Annu. Rev. Environ. Resour. 2005;30:441–473. doi: 10.1146/annurev.energy.30.050504.144511.
- 154.Holling C.S. Resilience and stability of ecological systems. Annu. Rev. Ecol. Syst. 1973;4:1–23.
- 155.Holling C.S. Adaptive Environmental Assessment and Management. International Series on Applied Systems Analysis. Wiley; 1978.
- 156.Ruhl J.B. Regulation by adaptive management—is it possible? Minnesota Journal of Law, Science & Technology. 2005;7:21–58.
- 157.Duit A., Galaz V. Governance and complexity—emerging issues for governance theory. Governance. 2008;21:311–335.
- 158.D’Souza R.M. Curtailing cascading failures. Science. 2017;358:860–861. doi: 10.1126/science.aaq0474.
- 159.Ruhl J.B. Governing cascade failures in complex social-ecological-technological systems: Framing context, strategies, and challenges. Vanderbilt Journal of Entertainment & Technology Law. 2019;22:407–440.
- 160.Fisher L., Sandberg A. A safe governance space for humanity: necessary conditions for the governance of global catastrophic risks. Glob. Policy. 2022;13:792–807. doi: 10.1111/1758-5899.13030.
- 161.Vivo P., Katz D.M., Ruhl J.B. A complexity science approach to law and governance. Philos. Trans. A Math. Phys. Eng. Sci. 2024;382:20230166. doi: 10.1098/rsta.2023.0166.
- 162.National Institute of Standards and Technology. AI risk management framework. 2023. https://www.nist.gov/itl/ai-risk-management-framework
- 163.People’s Republic of China. Interim measures for the management of generative artificial intelligence services. 2023. https://www.cac.gov.cn/2023-07/13/c_1690898327029107.htm
- 164.People’s Republic of China. Artificial intelligence law of the People’s Republic of China. 2023. https://perma.cc/L9E4-5K3V
- 165.White House. Executive order on the safe, secure, and trustworthy development and use of artificial intelligence. 2023.
- 166.White House. Removing barriers to American leadership in artificial intelligence. 2025. https://www.whitehouse.gov/presidential-actions/2025/01/removing-barriers-to-american-leadership-in-artificial-intelligence/
- 167.European Union. General-purpose AI code of practice. 2025. https://digital-strategy.ec.europa.eu/en/policies/ai-code-practice
- 168.Coglianese C., Zeckhauser R., Parson E. Seeking truth for power: Informational strategy and regulatory policymaking. Minn. Law Rev. 2004;89:277–341.
- 169.Stephenson M.C. Information acquisition and institutional design. Harv. Law Rev. 2010;124:1422–1484.
- 170.Van Loo R. Regulatory monitors: Policing firms in the compliance era. Columbia Law Rev. 2019;119:369–444.
- 171.Whittlestone J., Clark J. Why and how governments should monitor AI development. Preprint at arXiv. 2021. doi: 10.48550/arXiv.2108.12427.
- 172.Clark J. Information markets and AI development. In: Bullock J.B., Chen Y.C., Himmelreich J., Hudson V.M., Korinek A., Young M.M., Zhang B., eds. The Oxford Handbook of AI Governance. Oxford University Press; 2023. pp. 345–357.
- 173.Kolt N., Anderljung M., Barnhart J., Brass A., Esvelt K., Hadfield G.K., Heim L., Rodriguez M., Sandbrink J.B., Woodside T., et al. Responsible reporting for frontier AI development. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 2024;7:768–783. doi: 10.1609/aies.v7i1.31678.
- 174.Reuel A., Bucknall B., Casper S., Fist T., Soder L., Aarne O., Hammond L., Ibrahim L., Chan A., Wills P., et al. Open problems in technical AI governance. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2407.14981.
- 175.Casper S., Krueger D., Hadfield-Menell D. Pitfalls of evidence-based AI policy. In: International Conference on Learning Representations. 2025.
- 176.Oreskes N., Conway E.M. Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming. Bloomsbury Publishing; 2010.
- 177.Koessler L., Schuett J., Anderljung M. Risk thresholds for frontier AI. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2406.14713.
- 178.Galle B. In praise of ex ante regulation. Vanderbilt Law Rev. 2015;68:1715–1760.
- 179.Posner R.A. Catastrophe: Risk and Response. Oxford University Press; 2004. https://academic.oup.com/book/40836
- 180.Sunstein C.R. Irreversible and catastrophic. Cornell Law Rev. 2005;91:841–898.
- 181.Simon H.A. A behavioral model of rational choice. Q. J. Econ. 1955;69:99–118. doi: 10.2307/1884852.
- 182.Harremoes P., Gee D., MacGarvin M., Stirling A., Keys J., Wynne B., Vaz S.G. The Precautionary Principle in the 20th Century: Late Lessons from Early Warnings. Taylor and Francis; 2013.
- 183.Nash J.R. Standing and the precautionary principle. Columbia Law Rev. 2008;108:494–528.
- 184.Tegmark M., Omohundro S. Provably safe systems: the only path to controllable AGI. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2309.01933.
- 185.Dalrymple D., Skalse J., Bengio Y., Russell S., Tegmark M., Seshia S., Omohundro S., Szegedy C., Goldhaber B., Ammann N., et al. Towards guaranteed safe AI: A framework for ensuring robust and reliable AI systems. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2405.06624.
- 186.Buhl M.D., Sett G., Koessler L., Schuett J., Anderljung M. Safety cases for frontier AI. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2410.21572.
- 187.Clymer J., Gabrieli N., Krueger D., Larsen T. Safety cases: How to justify the safety of advanced AI systems. Preprint at arXiv. 2024. doi: 10.48550/arXiv.2403.10462.
- 188.Farber D.A. Uncertainty. Georgetown Law J. 2010;99:901–960.
- 189.Sunstein C.R. Beyond the precautionary principle. Univ. PA. Law Rev. 2003;151:1003–1058.
- 190.Simon H.A. The architecture of complexity. Proc. Am. Phil. Soc. 1962;106:467–482.
