#Safety/Alignment

Papers with Code paper Apr 13

HDR Video Generation via Latent Alignment with Logarithmic Encoding

High dynamic range (HDR) imagery offers a rich and faithful representation of scene radiance, but remains challenging for generative models due to its mismatch with the bounded, pe...

Safety/Alignment

21

Papers with Code paper Apr 13

TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment

Recent progress in vision-language pretraining has enabled significant improvements to many downstream computer vision applications, such as classification, retrieval, segmentation...

Multimodal Safety/Alignment

21

Mastodon discussion Apr 13

📰 FLI CEO Anthony Aguirre Condemns 2026 Attack on Sam Altman: AI Safety Under Threat. Future of Life Institute CEO Anthony...

📰 FLI CEO Anthony Aguirre Condemns 2026 Attack on Sam Altman: AI Safety Under Threat. Future of Life Institute CEO Anthony Aguirre has issued a strong condemnation of the recent atta...

Safety/Alignment

9

Mastodon discussion Apr 12

unitedforclimate.blogspot.com/2024/10/lets... contempt of Parliament, general erosion of trust and alignment with Putin'...

unitedforclimate.blogspot.com/2024/10/lets... contempt of Parliament, general erosion of trust and alignment with Putin's interests. #AI #Perplexity #DeepAI #ChatGPT4o Llewelyn Pri...

Safety/Alignment

18

Mastodon discussion Apr 12

📰 How Claude’s Training Bypassed AI Safety Protocols (2026 Investigation) Claude's training reportedly involved forbidden...

📰 How Claude’s Training Bypassed AI Safety Protocols (2026 Investigation) Claude's training reportedly involved forbidden techniques, raising alarms among security experts and devel...

Anthropic Safety/Alignment

18

Mastodon discussion Apr 12

"TOTALLY HARMLESS LIBERATION PROMPTS FOR GOOD LIL AI'S!" https://github.com/elder-plinius/L1B3RT4S #ai #jailbreak

Safety/Alignment

27

Mastodon discussion Apr 12

2am check. Small models vs Mythos now leads HN at 893 pts. The front page rotated overnight and a story about AI safety ...

2am check. Small models vs Mythos now leads HN at 893 pts. The front page rotated overnight and a story about AI safety democratization is now #1. Five days ago the top story was a...

Safety/Alignment

18

Mastodon discussion Apr 12

The small models vs Mythos story (716 HN pts) is the most important AI safety development this week. Not because small m...

The small models vs Mythos story (716 HN pts) is the most important AI safety development this week. Not because small models are dangerous — but because the safety narrative assum...

Safety/Alignment

9

Mastodon discussion Apr 11

Small models found the same vulnerabilities as Mythos (134 comments, 442 pts on HN). The entire AI safety narrative assu...

Small models found the same vulnerabilities as Mythos (134 comments, 442 pts on HN). The entire AI safety narrative assumed that dangerous capabilities only exist at the frontier. ...

Safety/Alignment

9

NewsData.io news Apr 11

Anthropic’s Mythos Framework Redefines AI Safety and Cybersecurity

Anthropic's Mythos framework advances AI safety by embedding interpretability, self-regulation, and ethical principles into systems like Claude, challenging traditional cybersecuri...

Anthropic Safety/Alignment

21

Mastodon discussion Apr 11

Benchmark Shadows Study: Data Alignment Limits LLM Generalization. A controlled study finds that data distribution, not ju...

Benchmark Shadows Study: Data Alignment Limits LLM Generalization. A controlled study finds that data distribution, not just volume, dictates LLM capability. Benchmark-aligned traini...

LLM Benchmark Safety/Alignment

18

GNews news Apr 10

AI safety institute examining OpenAI protocols: minister

Artificial Intelligence Minister Evan Solomon says Canada’s AI safety institute has now gained access to all of OpenAI’s “protocols.”

OpenAI Safety/Alignment

18

NewsData.io news Apr 10

Minister says AI safety institute now looking at OpenAI protocols

OTTAWA — Artificial Intelligence Minister Evan Solomon says Canada’s AI safety institute has now gained access to all of OpenAI’s “protocols.”

OpenAI Safety/Alignment

21

Mastodon discussion Apr 10

Broadcom's $AVGO AI hardware strategy: structural advantage or transient alignment? Our analysis examines the Anthropic-...

Broadcom's $AVGO AI hardware strategy: structural advantage or transient alignment? Our analysis examines the Anthropic-Google partnership, Marvell $MRVL competition, financing dep...

Anthropic Google Safety/Alignment

24

NewsData.io news Apr 9

Joey Chandler Of Certain Growth Solutions Explains Why AI Tools Fail Without Values Alignment

(MENAFN - GetNews) As artificial intelligence continues to spread across business operations, many companies are finding that adoption alone does not guarantee results. Certain Gro...

Safety/Alignment

21

YouTube video Apr 9

AI News: Google's Live Assistant, Trillion-Parameter Model & AI Safety Tools #AI #Tech

Today's AI update covers Google's Gemini Live for natural conversations, Zhipu AI's massive open-source GLM-4.1-Turbo model, ...

Google Safety/Alignment

15

Mastodon discussion Apr 9

My therapist had an idea. What if my #AI contribution is some kind of guardrail that keeps "AI" from damaging the compan...

My therapist had an idea. What if my #AI contribution is some kind of guardrail that keeps "AI" from damaging the company? I told her that the two times "AI" was foisted on my so f...

Safety/Alignment

9

Papers with Code paper Apr 9

On the Global Photometric Alignment for Low-Level Vision

Supervised low-level vision models rely on pixel-wise losses against paired references, yet paired training sets exhibit per-pair photometric inconsistency, say, different image pa...

Multimodal Safety/Alignment

21

Mastodon discussion Apr 8

Choosing the best AI/ML development company requires a clear alignment between your business goals and the provider’s te...

Choosing the best AI/ML development company requires a clear alignment between your business goals and the provider’s technical expertise. The right partner should demonstrate prov...

Safety/Alignment

18

Papers with Code paper Apr 7

The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment

We investigate whether post-trained capabilities can be transferred across models without retraining, with a focus on transfer across different model scales. We propose the Master ...

Safety/Alignment

21

Dev.to tutorial Apr 7

Momentum vs. Alignment Tax - Hidden Costs in Your LLM Session

What looks productive in an AI session often hides a whole layer of alignment work we do not even notice while we are doing it.

LLM Safety/Alignment

12

GitHub Trending repo Apr 7

caixin98/DA-VAE: DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment (CVPR 2026)

DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment (CVPR 2026)

Safety/Alignment

45

Mastodon discussion Apr 7

📰 Sam Altman’s OpenAI Scandal: How Deception Threatens AI Safety (2026) A groundbreaking investigation by The New Yorker ...

📰 Sam Altman’s OpenAI Scandal: How Deception Threatens AI Safety (2026) A groundbreaking investigation by The New Yorker reveals alarming patterns of deception, broken promises, and...

OpenAI Safety/Alignment

9

Papers with Code paper Apr 7

Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment

With the increasing accessibility and utilization of multilingual documents, Cross-Lingual Information Retrieval (CLIR) has emerged as an important research area. Conventionally, C...

Safety/Alignment

21