Vision-language models are impressive, until you ask them something simple.

A recent study shows that state-of-the-art systems struggle with basic visual tasks like counting shapes or detecting overlaps, achieving only ~58% accuracy on average, far below human performance.

So what are they actually “seeing”? AI doesn’t perceive images the way we do. It approximates, infers, guesses. And sometimes, it fails where humans succeed instantly.

#AI #ComputerVision #AIRealityCheck
https://anhnguyen.me/2024/vlms-are-blind/
