#ai safety efforts seem pretty useless as long as we can't inspect and interpret what happens during inference.

It may seem like an #llm recognizes that it is being tested and responds differently, but how can we really know?

The prompt would have to be different for the model to distinguish between when it is and is not being tested. How do we know the output isn't simply the statistically most likely response to a prompt we intend to be a test?

🔗 https://tonysull.co/notes/ai-safety-efforts-seem-pretty-useless