Discussion about this post

Mike X Cohen, PhD:

Fascinating, thought-provoking, and troubling. Thanks for the great write-up!

LLMs learn complex patterns in token sequences -- patterns far too complex for humans to identify or understand -- and then generate new token sequences that match those patterns.

My guess is that these preferences and other behavioral characteristics are embedded in the training token sequences in ways we don't anticipate or understand.