Mastodon discussion Discussions Mar 25 4 views

by tonysull

#ai safety efforts seem pretty useless as long as we can't inspect and interpret what happens during inference.

It may seem like an #llm recognizes that it is being tested and responds differently, but how can we really know?

The prompt would have to be different to distinguish between when it is or is not being tested. How do we know the output isn't simply the statistically most likely response to a prompt we intend to be a test?

🔗 https://tonysull.co/notes/ai-safety-efforts-seem-pretty-useless

Safety/Alignment

Metadata

Reblogs Count: 1
Account: tonysull@indieweb.social

