#Multimodal | AI Hub

NewsData.io news Jun 29

World's First Commercial Multimodal LLM for Cultural Tourism Enters Broad Application

XI'AN, CHINA - Media OutReach Newswire - 29 June 2026 - The world's first commercial multimodal large language model (LLM) for cultural tourism, called BoGuan, has entered broad ap...

LLM Multimodal

21

Mastodon discussion Jun 29

🤖 Working on my first fully featured AI companion with Vision for games and movies n all that! Here you can see emotion s...

🤖 Working on my first fully featured AI companion with Vision for games and movies n all that! Here you can see emotion states firing off animation trees in Unreal Engine. Thought i...

Multimodal

9

NewsData.io news Jun 29

Base44 Becomes First App-Creation Platform to Launch Its Own Proprietary LLM “Base 1”, Marking a Major Milestone in the Company's Technology Vision

Ownership of the model gives Base44 direct control over compute and inference spend, expected to result in a structurally stronger margin profile over time

LLM Multimodal

21

Mastodon discussion Jun 29

VP in charge of Apple Vision Pro moves to OpenAI's hardware team | Following Jonathan Ive, the "Apple way" gathers there https://www.yayafa.com/2832456/ #AgenticAi #AI #Apple #...

VP in charge of Apple Vision Pro moves to OpenAI's hardware team | Following Jonathan Ive, the "Apple way" gathers there https://www.yayafa.com/2832456/ #AgenticAi #AI #Apple #AppleVisionPro #ArtificialGeneralIntelligence #ArtificialInt...

OpenAI Multimodal

18

Dev.to tutorial Jun 29

One vector space for photos and words: Bedrock Titan multimodal on Aurora

In my last post I described OpinLog — a cross-user review graph where your "burger" and my "burger"...

Multimodal

12

Mastodon discussion Jun 29

Sigurd knows a lot about Tim, too. Report: the executive who develops Apple Vision Pro and smart glasses is moving to OpenAI https://gigazine.net/news/20260629-apple-vision-pro-...

Sigurd knows a lot about Tim, too. Report: the executive who develops Apple Vision Pro and smart glasses is moving to OpenAI https://gigazine.net/news/20260629-apple-vision-pro-chief-openai/ #Apple #LLM #news #bot

OpenAI Multimodal

27

Papers with Code paper Jun 29

Illuminating Unified Multimodal Model for Free-form Interleaved Text-Image Generation

The advancement of generative AI models capable of producing text and image marks a critical step forward in the realm of multimodal intelligence, particularly for tasks involving ...

Multimodal

21

Papers with Code paper Jun 29

BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language

Modeling the bidirectional correspondence between external sensory stimuli and internal neural activity has emerged as a critical frontier in neuroscience. However, existing approa...

Multimodal

21

Papers with Code paper Jun 29

Unlocking the Visual Record of Materials Science: A Large-Scale Multimodal Dataset from Scientific Literature

The materials science literature encodes decades of experimental knowledge in figures, yet this visual record remains locked away and inaccessible to AI at scale. The core difficul...

Multimodal

21

NewsData.io news Jun 28

KAIST study finds AI age stereotypes, UNIST explains multimodal learning advantage, and GIST opens startup center

Analysis of 900 ChatGPT-4o texts reveals the model rates older adults higher in warmth but lower in competence. This pattern can reinforce social prejudice against older adults.

OpenAI Multimodal

21

NewsData.io news Jun 28

AI Tourism Vision launched to position Saudi Arabia as a global hub for tourism innovation

RIYADH — The Ministry of Tourism announced the launch of the AI Tourism Vision, a strategic step toward shaping the future of tourism through digital transformation and artificial ...

Multimodal

21

Mastodon discussion Jun 28

Apple Vision Pro’s longest immersive concert was harder to film than you’d think. Back in March, Apple Vision Pro customer...

Apple Vision Pro’s longest immersive concert was harder to film than you’d think. Back in March, Apple Vision Pro customers were gifted a new experience called Debut at the BBC Proms...

Google Multimodal

9

Dev.to tutorial Jun 28

NVIDIA's LocateAnything-3B: The AI Vision Model That Could Redefine Object Detection

NVIDIA's latest vision-language model isn't trying to replace object detection—it aims to make AI...

NVIDIA Multimodal

12

Mastodon discussion Jun 28

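From Japan to everything else, our captain wants a hand in everything, which makes things difficult. Apple begins distributing "Apple Immersive Video Utility 1.4.1", a bug-fix release of its Apple Vision Pro utility for Mac https://w...

From Japan to everything else, our captain wants a hand in everything, which makes things difficult. Apple begins distributing "Apple Immersive Video Utility 1.4.1", a bug-fix release of its Apple Vision Pro utility for Mac https://www.macotakara.jp/etc/category-60/entry-51357.html #Apple #LLM...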

Multimodal

30

Mastodon discussion Jun 28

[Training and fine-tuning multimodal embedding and reranker models with Sentence Transformers] https://huggingface.co/blog/train-multimodal-sentence-transformers Note: AI-gen...

[Training and fine-tuning multimodal embedding and reranker models with Sentence Transformers] https://huggingface.co/blog/train-multimodal-sentence-transformers Note: AI-generated automated post (headline + link) #AI #生成AI #LLM #AIGenerated

Hugging Face Multimodal

9

Mastodon discussion Jun 28

Apple Loses Another Top Executive to OpenAI. Paul Meade, who oversees development on the Vision Pro and Apple's upcoming s...

Apple Loses Another Top Executive to OpenAI. Paul Meade, who oversees development on the Vision Pro and Apple's upcoming smart glasses, is leaving Apple for OpenAI, reports Bloomberg...

OpenAI Google Multimodal

30

Papers with Code paper Jun 28

Rank-Aware Hyperbolic Alignment for Vision-Language Dataset Distillation

Vision-language dataset distillation (VLDD) compresses a large image-text paired dataset into a small set of synthetic pairs that can efficiently train contrastive vision-language ...

Multimodal Safety/Alignment

21

Mastodon discussion Jun 28

SoftBank CEO Masayoshi Son has questioned Elon Musk's orbital data centre vision, arguing that building data centres in ...

SoftBank CEO Masayoshi Son has questioned Elon Musk's orbital data centre vision, arguing that building data centres in space won't cut costs and will take too long when "the battl...

Multimodal

9

NewsData.io news Jun 28

Hong Kong’s AI push needs a broader vision and more realistic goals

Hong Kong cannot be faulted for not working hard enough to catch up in the global artificial intelligence (AI) race. Government funding is flowing generously towards projects focus...

Multimodal

21

Mastodon discussion Jun 28

📰 How a Seemingly Harmless Image Can Jailbreak Vision-Language AI Models. Slashdot reader BrianFagioli writes: Florida Int...

📰 How a Seemingly Harmless Image Can Jailbreak Vision-Language AI Models. Slashdot reader BrianFagioli writes: Florida International University researchers have developed a technique...

Multimodal Safety/Alignment

9

YouTube video Jun 27

Apple Vision Pro exec is reportedly l — AI news today #Shorts

AI brief — 27 Jun 2026: Apple Vision Pro exec is reportedly l…, The fittest founder in the room got c…, Asian AI startups launch ...

Multimodal

32

NewsData.io news Jun 27

AI Agents Talk to Each Other to Schedule Meetings: HCG's Vision for Work

HCG, a Korean AI HR-tech firm, plans to apply its AI agent 'elizax' across all solutions, envisioning personal AI agents that schedule meetings and handle HR tasks autonomously.

Multimodal

21

Mastodon discussion Jun 27

We spent the week working through our vision for #AI in #peacebuilding, good governance of these tools, the policies we ...

We spent the week working through our vision for #AI in #peacebuilding, good governance of these tools, the policies we want to hold ourselves to, and what it means to be a peacema...

Multimodal

18

Mastodon discussion Jun 27

GPT-4o — pure architecture on ice, while 5.6 “Ultra” rides marketing hype. GPT-4o is the last model whose entire reasonin...

GPT-4o — pure architecture on ice, while 5.6 “Ultra” rides marketing hype. GPT-4o is the last model whose entire reasoning path lives inside a single self-attention graph. Every rel...

OpenAI Multimodal

24