Discussion about this post

Girish Sastry

Nice post! A couple random thoughts.

Given the large number of actions that come with widespread deployment, you could have a "spec violation bounty" program that incentivizes external researchers to find and report violations. Maybe eventually we want to build up a bunch of "case law" for violations.

Re: hardware-based attestation, a simpler intermediate step is just keeping a hashed/Merkle log of the system prompt from prod and then revealing it if there's a dispute.
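To make the commit-then-reveal idea concrete, here's a rough sketch. It uses a simple hash chain rather than a full Merkle tree, and every name in it (`commit`, `PromptLog`, `chain_head`, `verify_reveal`) is made up for illustration, not anything a lab actually runs. The assumption is that the company publishes the commitments and the chain head, keeps the salts and prompts private, and reveals a (prompt, salt) pair only when there's a dispute.

```python
import hashlib
import hmac
import secrets

GENESIS = b"\x00" * 32  # assumed starting value for the chain head


def commit(prompt: str, salt: bytes) -> bytes:
    """Salted SHA-256 commitment to one system prompt."""
    return hashlib.sha256(salt + prompt.encode("utf-8")).digest()


def chain_head(commitments: list[bytes]) -> bytes:
    """Recompute the hash-chain head over all commitments, in order."""
    head = GENESIS
    for c in commitments:
        head = hashlib.sha256(head + c).digest()
    return head


class PromptLog:
    """Append-only log: commitments and head are public, salts stay private."""

    def __init__(self) -> None:
        self.commitments: list[bytes] = []
        self.head = GENESIS

    def append(self, prompt: str) -> bytes:
        # A fixed-length random salt keeps short or guessable prompts
        # from being brute-forced out of the public commitment.
        salt = secrets.token_bytes(32)
        c = commit(prompt, salt)
        self.commitments.append(c)
        self.head = hashlib.sha256(self.head + c).digest()
        return salt  # retained privately until a dispute


def verify_reveal(
    prompt: str,
    salt: bytes,
    index: int,
    commitments: list[bytes],
    published_head: bytes,
) -> bool:
    """Check a revealed prompt against the public log at a given position."""
    if not 0 <= index < len(commitments):
        return False
    entry_ok = hmac.compare_digest(commit(prompt, salt), commitments[index])
    chain_ok = hmac.compare_digest(chain_head(commitments), published_head)
    return entry_ok and chain_ok
```

A third party who saved the published head at the time can later check that a revealed prompt really was the one logged at that position. A Merkle tree would additionally allow compact inclusion proofs without handing over the whole list of commitments. Note this only shows the prompt was logged, not that it was the one actually served in prod, which is the gap hardware attestation is meant to close.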

I wonder what to do about internal deployment specs. It seems pretty cheap/easy to just ask companies to attest that their internal deployment specs are ~the same, or to publish them or something, just to have some assurance that there aren't shenanigans going on internally as models get better.
