Alignment via opposed values

Feb 04, 2026

It may seem like aligning intelligent agents with each other requires that they share the same values. But in some cases, alignment is achieved via opposed values. For example, you pick something up and hold it by squeezing your fingers around it so that they’re all pushing against each other. The opposed values allow you to pick up the object. The opposable thumb is one of our greatest assets as humans precisely because it can oppose the other fingers.

This phenomenon whereby great abilities are achieved via aligned opposition is seen all over. Antagonistic muscle pairs are crucial for motor behavior. If one muscle pulls in one direction, its antagonist can pull in the opposite direction to slow down and smooth out the motion. Learning to slow down movements is a vital part of how infants become competent at organizing their bodies to achieve their goals. In the legal system, we pit a prosecutor against a defense attorney, each of whom is pushing for an opposite goal. In the economy, businesses compete with each other, and the result is the discovery of consumer preferences. Consumers themselves compete with each other to obtain scarce goods and services. And buyers and sellers oppose each other: buyers want to buy low while sellers want to sell high. In general, negative feedback control might be an example of this type of thing: a system is made to function well by introducing some forces that oppose certain tendencies.
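To make the negative feedback idea concrete, here is a minimal sketch of a proportional correction loop. The function name, the gain value, and the setpoint are all illustrative assumptions rather than anything taken from a real control system; the only point is that a force that always opposes the current deviation pulls the system toward its target.

```python
def simulate_negative_feedback(setpoint, start, gain=0.5, steps=20):
    """Illustrative only: apply a correction that opposes the current error.

    `gain` (assumed to lie between 0 and 1) sets how hard the opposing
    force pushes back on each step.
    """
    value = start
    history = [value]
    for _ in range(steps):
        error = value - setpoint   # how far we have drifted from the target
        value -= gain * error      # push in the direction opposite to the error
        history.append(value)
    return history


history = simulate_negative_feedback(setpoint=10.0, start=0.0)
print(round(history[-1], 3))
```

With a gain between 0 and 1 the remaining error shrinks by a constant factor on every step, so the opposing force settles the system instead of merely fighting it.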

Opposition is helpful to a system when it is highly coordinated, as if controlled by a virtual governor to enable the production of developmental/behavioral/cognitive phenomena. But how do you coordinate opposing forces? The answer is by getting them to agree on abstract conditions, such as marginal benefit and marginal cost. Two opposing basketball teams end up highly coordinated—offense and defense converging on the same spots on the floor—because they each agree about the highest value spots on the floor to score or defend. Another example is how buyers and sellers end up converging on an equilibrium price because even though their concrete goals are different—buy low vs sell high—each shares the same abstract goal, which is to not waste any of their budget. Buyers will bid up the price of a good in order to make effective use of their money, while sellers will lower their asking price in order to make effective use of their wares.
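The buyer and seller story can be sketched as a toy price-adjustment loop. Everything specific here is an assumption made for illustration: the linear demand and supply curves, their coefficients, and the adjustment rate are hypothetical and not a model of any real market. When buyers want more than sellers offer, the price gets bid up; when sellers offer more than buyers want, the asking price comes down.

```python
def demand(price, a=100.0, b=2.0):
    """Hypothetical linear demand: buyers want less as the price rises."""
    return a - b * price


def supply(price, c=3.0):
    """Hypothetical linear supply: sellers offer more as the price rises."""
    return c * price


def find_price(start_price, rate=0.1, steps=50):
    """Nudge the price up under excess demand and down under excess supply."""
    price = start_price
    for _ in range(steps):
        excess_demand = demand(price) - supply(price)
        price += rate * excess_demand
    return price


# With these assumed curves, demand equals supply at 100 / (2 + 3) = 20.
print(round(find_price(5.0), 4))
print(round(find_price(40.0), 4))
```

Under these assumptions the loop lands on the same price whether it starts too low or too high. Neither side is aiming at that price; it falls out of the two opposed pressures.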

The value of opposition creates incentives for the system as a whole to keep either side from getting too strong, enabling a world where each basketball team tries as hard as possible to win while salary limitations and draft lotteries prevent one team from dominating all others indefinitely. Additionally, even if a team were to achieve such dominance, the result might be death via boredom. No individual player on the team would have any fun playing basketball, and so the team itself might collapse.

The shared abstract conditions that opposed agents may agree on don’t have to be programmed into the agents but can emerge organically from optimization within the rules of the game—basketball teams figure out over time that they should try to take higher-value shots on the margin and defend higher-value shots on the margin.

This kind of alignment via opposition may even show up in mathematics. Free and forgetful adjoint functors, for example, oppose each other in that one produces some structure while the other throws it away. Both agree on an abstract condition, however, which is to be as lazy about their tasks as possible while still getting the job done. The opposed processes end up becoming something like dual descriptions of the same structure.
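For readers who want the formal statement behind this, the standard example is the free monoid functor and the forgetful functor from monoids to sets. The symbols F, U, X, and M below are just labels chosen here: F sends a set to the free monoid on it, and U sends a monoid to its underlying set. The adjunction says that the two functors describe the same maps from opposite sides:

```latex
F \dashv U
\quad\Longleftrightarrow\quad
\mathrm{Hom}_{\mathbf{Mon}}\bigl(F(X),\, M\bigr)
\;\cong\;
\mathrm{Hom}_{\mathbf{Set}}\bigl(X,\, U(M)\bigr)
\quad \text{naturally in } X \text{ and } M .
```

One way to read the "laziness" is that F(X) adds only the structure a monoid is forced to have and nothing more, so a monoid homomorphism out of F(X) is determined entirely by where it sends the elements of X.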

The simple combination of opposed goals plus agreement on abstract conditions can create highly dynamic and evolving systems. Basketball games, for example, exhibit evolving patterns of play as offense and defense shift in response to each other. With enough time, teams exhibit new strategies that were never specified in the rules of basketball or in the organization of the team, and individual players develop new competencies, such as dunking, that were neither built by evolution nor waiting latent in the brain to be unleashed.
