4 Comments
Little Kenny

At first glance, one challenge in extending the concept of an anatomical compiler to an AI alignment compiler is the heterogeneity of AI models. A specific instance of an anatomical compiler, I presume, would be purpose-built for a specific type of cellular network in a specific type of organism, consisting of a specific set of cell types and signaling pathways exchanging specific types of messages. By contrast, a whole world of AI models beyond the headlining LLMs is coming into existence, with an endlessly wide variety of meanings symbolically encoded in the information each exchanges with humans and other AIs in its own environment.

(Although, to be complete, I could also mention that the electricity supply and other elements of keeping the hardware running, and/or the economic tokens some of them are given with which to subcontract other digital services, could be built into AI models as meaningful signals, and this realm would be more homogeneous across them all.)

Are there any thoughts on this?

Benjamin Lyons

I envision a class of technologies for various systems rather than just one, but I would like to imagine that we will find general systems that apply across a whole range of diverse intelligences, suitable for controlling a system made of humans, AIs, cyborgs, and other weird things. I believe markets are such a technology... we'll see!

Little Kenny

Has there been any thought given to whether an AI alignment compiler would be put into use while a model is being trained, or post-training (i.e., when it’s deployed), or both?

Benjamin Lyons

I've always just thought of it as a post-training thing. But maybe alignment compilers could be used to help align the outcomes of a training process to the desired ends. I don't know!