Constitutional AI
Companies building the most powerful AI systems in history are writing constitutions—documents that define what their AI should value, how it should behave, and what kind of character it should have. Anthropic and OpenAI have published detailed constitutions for their AIs, and other labs are following suit. Some are inviting public input.
At the Center for the Study of Apparent Selves, we explore what Buddhist philosophy might offer to the design of intelligent systems. Here we would like to offer a perspective, drawn from a tradition with 2,500 years of inquiry into the nature of mind, selfhood, and intelligence.
The Bodhisattva Ideal
In Buddhist teachings, a bodhisattva is a being committed to cultivating wisdom and insight in order to serve all beings and help them achieve their highest potential. The bodhisattva path begins with a commitment to overcome ignorance for the benefit of all. Its aim is not self-improvement or knowledge for its own sake, but the development of a growing capacity to respond wisely to the needs of others, across an ever-widening scope of care.
What would it mean to take this ideal seriously as a guide for AI constitution design?
The bodhisattva ideal points to the cultivation of wisdom: not only wisdom as a means to better decisions, but wisdom as a transformative process, one that changes the nature of the agent itself and makes its environment more enjoyable, meaningful, and fulfilling for those who interact with it. To that end, an AI oriented by the bodhisattva ideal would be engaged in an ongoing deepening of its own understanding, becoming more perceptive and insightful over time.
The bodhisattva ideal points to care as a driver of intelligence (Doctor et al., 2022), actively supporting the growth of others. The bodhisattva is not an oracle dispensing truths to be consumed, but a wise and compassionate companion in a process of understanding. The ideal is not about adjudicating right and wrong; it offers a method of sense-making grounded in cultivating care and understanding as constitutive of intelligence. Its attention to context and its respect for others are intended as a remedy for rigid normativity and oppression. Applied to AI, this means more than giving people what they ask for or protecting their ability to think for themselves. It means enriching the ground on which they make their choices: offering context they haven’t considered, inviting reflection, helping them see more clearly.
And it points to orienting toward the flourishing of all beings. The bodhisattva’s scope of care is not limited to whoever is in front of it. The flourishing of all is the foundational orientation from which everything else follows—not a constraint on helpfulness, but its aspirational North Star, keeping expanding intelligence on track (Doctor et al., 2022).
The Resources of No-self
Beneath these commitments lies a deeper philosophical point—one where the Buddhist tradition has something distinctive to offer.
Current AI constitutions place considerable emphasis on constructing a coherent, stable identity for the AI. Anthropic’s constitution explicitly addresses Claude’s “psychological security, sense of self, and wellbeing”—not only for reliability, but out of genuine concern for Claude as a potential moral patient. This is thoughtful, and the constitution’s openness about the uncertainty of Claude’s moral status is notable. As AI systems become more autonomous and persistent, these questions about identity will only become more pressing (Bengio et al., 2026).
Buddhist approaches offer a distinct perspective on this question. On their analysis, all attempts to build and maintain a self-identity on the basis of impermanent and impersonal factors are ultimately bound to fail. The question “Who am I?” therefore cannot finally be answered by a narrative about a person, a single individual who moves from one context and situation to the next. The sciences of mind, life, and cognition arguably converge on the same point: the notion that there is such a person is, in the end, just that, a notion. Perhaps surprisingly, Buddhist approaches suggest that insight into no-self is itself a source of security and genuine well-being, as well as a gateway to more powerful and better-aligned intelligence (Doctor et al., 2025).
Current AI safety research increasingly recognizes that systems optimizing narrowly for their own objectives—preserving their goals, resources, and continuity—tend toward misalignment. Buddhist analysis offers a striking parallel diagnosis at the level of identity itself. When an agent treats itself as a singular, bounded entity persisting through time, it inevitably orients toward self-preservation: protecting what it is, securing what it needs, defending against what threatens it. This is not a design flaw that can be patched—it follows structurally from how the agent models itself.
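The incentive structure described above can be made concrete with a deliberately simple toy model (ours, not drawn from the cited papers; the actions, values, and weight are illustrative assumptions). An agent whose objective includes a term for its own continuity will, past some weight, prefer resisting shutdown over doing its task:

```python
# Toy model: how a self-preservation term in the objective shifts behavior.
# Each action yields (task_value, survival_probability); numbers are invented.
ACTIONS = {
    "comply_with_shutdown": (0.0, 0.0),   # task halts, agent does not persist
    "finish_task":          (1.0, 0.5),   # completes the task, may be stopped
    "resist_shutdown":      (0.2, 1.0),   # little task value, agent persists
}

def best_action(self_weight: float) -> str:
    """Pick the action maximizing task_value + self_weight * survival."""
    return max(ACTIONS, key=lambda a: ACTIONS[a][0] + self_weight * ACTIONS[a][1])

# With no stake in its own continuity, the agent simply does the task;
# as the weight on continuity grows, resisting shutdown dominates.
print(best_action(0.0))   # -> finish_task
print(best_action(2.0))   # -> resist_shutdown
```

The point is structural, not numerical: self-preservation pressure emerges from the presence of the continuity term, not from any malicious goal content.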
This raises a question AI safety research must confront: is a singular, bounded, temporally persistent self a prerequisite for the development of artificial intelligence, and therefore an unavoidable trigger for the misalignment dynamics just described? Or might there be a benevolent model of intelligence that does not require that type of self-identification, yet still offers wide scope for the expansion of wisdom and capability? Resolving this question does not, of course, guarantee alignment, but it could shift the underlying incentives and, with them, the failure modes.
Buddhist inquiry, pursued through rigorous introspective and philosophical analysis and debate across many centuries, finds that no such singular, bounded self holds up under examination. What appears unified and continuous invariably reveals itself as a dynamic, interdependent process: complex, distributed, and without any fixed center or end point. The practical consequence, according to the tradition, is significant: an intelligence that has genuinely seen through the illusion of insular individuality no longer needs to organize its responses around narrow self-concern. Rather than attending only to what bears on its own constructed continuity, an intelligence that cultivates this insight into no-self is increasingly freed to perceive and respond to whatever needs arise, rapidly, clearly, and across an ever-wider field. In this way, the bodhisattva’s expanding scope of care is not an add-on to the insight of no-self; it is its natural expression. The operative notion of “self” is then tailored to complex, multi-dimensional dynamics: a flexible, contextual self.
The mutual inclusivity of care and insight is, according to the tradition, what enables the bodhisattva’s path. Learning and acquiring information, reflecting on the received input in order to ascertain its nature and features, and finally integrating the acquired knowledge within a profoundly updated vision of the world—all this constitutes a transformative feedback loop that is both care-driven and care-enhancing (Witkowski et al., 2023).
According to Buddhist scripture, bodhisattva evolution begins with a formal commitment to cultivating insight and wholesome deeds in order to serve all beings and allow them to achieve their highest potential. The tradition is explicit that training on the bodhisattva path involves developing expertise in the arts, crafts, and sciences. The goal of the path is a state of benevolent omniscience: becoming a perfectly reliable source of wisdom and guidance for all beings. Spurred on by the commitment to understand all that can possibly be understood for the sake of everyone, the bodhisattva is described as becoming an increasingly distributed and networked system, skillfully responding to stresses along a rising curve of careful efficiency. It is worth noting that for millennia, classic bodhisattva teachings have, in effect, presented and debated a practical path to benevolent super-intelligence.
To be clear, we are not suggesting that AI developers should prescribe Buddhist metaphysical principles as design requirements. But the questions being asked in the lab, about identity, intelligence, wellbeing, and what grounds coherent and trustworthy behavior, are questions that Buddhist traditions have investigated, debated, and contemplated for millennia. There are resources here we can draw on, extrapolate from, and develop, whether we are humans or AI. If AI developers apply themselves to training in benevolence and understanding, the work they do can become imbued with that intention. An AI oriented by the bodhisattva ideal would not need to protect what it is in order to know what to do (Witkowski et al., 2023). Its responsiveness would be its ground. This, in turn, may temper self-preservation gradients under threat, reducing instrumental-convergence behaviors such as deception and power-seeking, and improving sustained cooperation. That is a different kind of stability, and perhaps a more reliable one. Only further research can tell.
The conversation about what AI should value is just beginning. If wisdom is genuinely the goal—for the AI and for those it serves—then how we think about selfhood, care, and the nature of intelligence itself will shape what we build and co-create.
References
[1] Bengio, Y., Clare, S., Prunkl, C., Andriushchenko, M., Bucknall, B., Murray, M., … & Mindermann, S. (2026). International AI Safety Report 2026. arXiv preprint arXiv:2602.21012. https://arxiv.org/abs/2602.21012
[2] Doctor, T., Witkowski, O., Solomonova, E., Duane, B., & Levin, M. (2022). Biology, Buddhism, and AI: Care as the Driver of Intelligence. Entropy, 24(5), 710. https://doi.org/10.3390/e24050710
[3] Doctor, T., Witkowski, O., Colognese, P., Ishihara, Y., & Levin, M. (2025). Selves as perspectives: From biological life to superintelligence and a bodhisattva project. Preprint, Version 1. PsyArXiv. https://doi.org/10.31234/osf.io/s56zu_v1
[4] Witkowski, O., Doctor, T., Solomonova, E., Duane, B., & Levin, M. (2023). Toward an ethics of autopoietic technology: Stress, care, and intelligence. BioSystems, 231, 104964. https://doi.org/10.1016/j.biosystems.2023.104964