#ai safety efforts seem pretty useless as long as we can't inspect and interpret what happens during inference.

It may seem like an #llm recognizes it is being tested and responds differently, but how can we really know?

The prompt would have to be different for the model to distinguish between when it is and is not being tested. How do we know the output isn't simply the statistically most likely response to a prompt we intend to be a test?

🔗 https://tonysull.co/notes/ai-safety-efforts-seem-pretty-useless
