Mastodon discussion Apr 30 2 views

by arihak

#AI jailbreak expert had spent much of the previous two years testing and prodding large language models such as Claude and ChatGPT, always with the aim of making them say things they shouldn’t. But this was one of his most advanced hacks yet: a sophisticated plan of manipulation, which involved him being cruel, vindictive, sycophantic, even abusive. #security https://www.theguardian.com/technology/2026/apr/29/meet-the-ai-jailbreakers-i-see-the-worst-things-humanity-has-produced

Anthropic OpenAI Safety/Alignment

Metadata

Reblogs Count: 1
Account: arihak@techhub.social

Mastodon discussion 29m ago

Voice conversations with "Meta AI" now possible, powered by the new AI model "Muse Spark" (Keitai Watch) | d-menu News (NTT Docomo) https://www.yayafa.com/2802410/ #「MetaAI」と声で会話できるように、新AIモデル「MuseSpark」搭載 #AgenticAi #AI #ArtificialGeneralIn...

Mastodon discussion 30m ago

#Objectionai promises an AI tribunal for truth in #Journalismus. But what happens when a private system automatically reviews critical articles, publicly as "under investi...

Mastodon discussion 32m ago

That makes me wonder: if you do go that route, why not a European #LLM provider such as #Lumo or #LeChat (both #Mistral)? —— Starting this month, #Malta is going to ... all residents an AI course ...
