Publications

2026

ICLR2026

# Image Generation # Diffusion # Prompt Optimization

TIPO: Text to Image with Text Presampling for Prompt Optimization

Shih-Ying Yeh, Sang-Hyun Park, Giyeong Oh, Min Song, Youngjae Yu

arXiv

ICLR2026

# Embodied AI # Multimodal # Video

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Suwhan Choi, Jaeyoon Jung, Haebin Seong, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yubeen Park, Youngjae Yu, Yunsung Lee

arXiv

ICLR2026

# NLP # Multilingual # CoT

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Guijin Son, Donghun Yang, Hitesh Laxmichand Patel, Amit Agarwal, Hyunwoo Ko, Chanuk Lim, Srikant Panda, Minhyuk Kim, Nikunj Drolia, Dasol Choi, Kyong-Ha Lee, Youngjae Yu

arXiv

ICLR2026

# Multimodal # MLLM

Teaching Metric Distance to Autoregressive Multimodal Foundational Models

Jiwan Chung, Saejin Kim, Yongrae Jo, Jaewoo Park, Dongjun Min, Youngjae Yu

arXiv

AAAI2026 (Oral)

# Multimodal # AudioLLM

Do Language Models Associate Sound with Meaning? A Multimodal Study of Sound Symbolism

Jinhong Jeong*, Sunghyun Lee*, Jaeyoung Lee, Seonah Han, Youngjae Yu

arXiv

AAAI2026

# Multimodal # LLM # Benchmark

Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation

Jaewoo Park*, Jungyang Park*, Dongju Jang, Jiwan Chung, Byungwoo Yoo, Jaewoo Shin, Seonjoon Park, Taehyeong Kim, Youngjae Yu

arXiv

2025

Humanoids 2025 (Workshop)

# Robotics # Humanoid

Baymax in Reality: A Humanoid System for Non-Contact Health Monitoring and Empathetic Interaction

Junhyeong Park, Taemoon Jeong, Minseo Kwak, Jisoo Kim, Seungbeen Lee, Sungjoon Choi, Youngjae Yu

Humanoids 2025 (Workshop)

# Robotics # Humanoid

K-pop Demon Robots

Sungwoong Kim, Minseo Kim, Siyeol Kim, Hwasup Lim, Youngjae Yu

CIKM2025

# Cross-lingual # Embeddings

NMIXX: Domain-Adapted Neural Embeddings for Cross-Lingual eXploration of Finance

Hanwool Lee, Sara Yu, Yewon Hwang, Jonghyun Choi, Heejae Ahn, Sungbum Jung, Youngjae Yu

arXiv

NeurIPS2025

# Computer Vision

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks

Giyeong Oh, Woohyun Cho, Siyeol Kim, Suhwan Choi, Youngjae Yu

arXiv

NeurIPS2025

# LLM # DPO # Human Preference

KL Penalty Control via Perturbation for Direct Preference Optimization

Sangkyu Lee, Janghoon Han, Hosung Song, Stanley Jungkyu Choi, Honglak Lee, Youngjae Yu

arXiv

NeurIPS2025

# Computer Vision

Diffusion-Driven Two-Stage Active Learning for Low-Budget Semantic Segmentation

Jeongin Kim, Wonho Bae, YouLee Han, Giyeong Oh, Youngjae Yu, Danica J. Sutherland, Junhyug Noh

arXiv

EMNLP2025

# Embodied AI # LLM # Safety

Subtle Risks, Critical Failures: A Framework for Diagnosing Physical Safety of LLMs for Embodied Decision Making

Yejin Son*, Minseo Kim*, Sungwoong Kim, Seungju Han, Jian Kim, Dongju Jang, Youngjae Yu, Chanyoung Park

arXiv

EMNLP2025

# Multimodal # Agent # Reasoning

VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms

Seungwon Lim, Sungwoong Kim, Jihwan Yu, Sungjae Lee, Jiwan Chung, Youngjae Yu

arXiv

EMNLP2025

# Multimodal # Document # Information Retrieval

Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation

Yejin Choi*, Jaewoo Park*, Janghan Yoon, Saejin Kim, Jaehyun Jeon, Youngjae Yu

arXiv

EMNLP2025

# Multimodal # Audio # Video

MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

Woohyun Cho, Youngmin Kim, Sunghyun Lee, Youngjae Yu

arXiv

EMNLP2025 (Findings)

# Multimodal # Commonsense Reasoning # Abductive Reasoning

Multimodal UNcommonsense: From Odd to Ordinary and Ordinary to Odd

Yejin Son*, Saejin Kim*, Dongjun Min, Youngjae Yu

arXiv

COLM2025

# Multimodal # Safety # Societal Implications

G1yphD3c0de: Towards Safer Language Models on Visually Perturbed Texts

Yejin Choi, Yejin Yeo, Yejin Son, Seungju Han, Youngjae Yu

COLM2025

# NLP # Fact Verification

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers

Wooseok Seo*, Seungju Han*, Jaehun Jung, Benjamin Newman, Seungwon Lim, Seungbeen Lee, Ximing Lu, Yejin Choi, Youngjae Yu

arXiv

COLM2025

# Multimodal # Video

HIPPO-VIDEO: Simulating Watch Histories with Large Language Models for History-Driven Video Highlighting

Jeongeun Lee, Youngjae Yu, Dongha Lee

arXiv

ICCV2025

# Video Generation # Distillation # Preference Learning

V.I.P.: Iterative Online Preference Distillation for Efficient Video Diffusion Models

Jisoo Kim, Wooseok Seo, Junwan Kim, Seungho Park, Sooyeon Park, Youngjae Yu

arXiv

ICCV2025

# 3D # Human Motion # Generation

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Jungbin Cho*, Junwan Kim*, Jisoo Kim, Minseo Kim, Mingu Kang, Sungeun Hong, Tae-Hyun Oh, Youngjae Yu

arXiv

ICCV2025

# Multimodal # Ambiguity

VAGUE: Visual Contexts Clarify Ambiguous Expressions

Heejeong Nam, Jinwoo Ahn, Keummin Ka, Jiwan Chung, Youngjae Yu

arXiv

MICCAI2025

# Computer Vision # Scalp Diagnosis # Image Translation

Scalp Diagnostic System With Label-Free Segmentation and Training-Free Image Translation

Youngmin Kim*, Saejin Kim*, Hoyeon Moon, Youngjae Yu, Junhyug Noh

arXiv

ACL2025

# Multimodal # Nonverbal Conversation # Video # 3D

Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues

Youngmin Kim*, Jiwan Chung*, Jisoo Kim, Sunghyun Lee, Sangkyu Lee, Junhyeok Kim, Cheoljong Yang, Youngjae Yu

arXiv

ACL2025 (Oral)

# NLP # Personality # Reinforcement Learning

Persona Dynamics: Unveiling the Impact of Personality Traits on Agents in Text-Based Games

Seungwon Lim, Seungbeen Lee, Dongjun Min, Youngjae Yu

arXiv

ACL2025

# Multimodal # MLLM

Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists?

Jiwan Chung, Janghan Yoon, Junhyeong Park, Sangeyl Lee, Joowon Yang, Sooyeon Park, Youngjae Yu

arXiv

ACL2025

# NLP # LLM # Safety

Representation Bending for Large Language Model Safety

Ashkan Yousefpour*, Taeheon Kim*, Ryan S. Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Harrison Ngan, Youngjae Yu, Jonghyun Choi

arXiv

# Computer Vision # Video # Industrial Application

SlumpGuard: An AI-Powered Real-Time System for Automated Concrete Slump Prediction via Video Analysis

Youngmin Kim*, Giyeong Oh*, Kwangsoo Youm, Youngjae Yu

arXiv

# Multimodal # Reasoning

Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation

Jiwan Chung*, Junhyeok Kim*, Siyeol Kim, Jaeyoung Lee, Minsoo Kim, Youngjae Yu

arXiv

# Multimodal # MLLM # AI for Science

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Guijin Son, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, Seungwon Lim, Jinyeop Song, Jinha Choi, Gonçalo Paulo, Youngjae Yu

arXiv

# Multimodal # UI

Do MLLMs Capture How Interfaces Guide User Behavior? A Benchmark for Multimodal UI/UX Design Understanding

Jaehyun Jeon, Minsoo Kim, Janghan Yoon, Sumin Shim, Yejin Choi, Hanbin Kim, Youngjae Yu

arXiv

# Multimodal # Video # Egocentric

GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance

Junhyeok Kim*, Jaewoo Park*, Junhee Park, Sangeyl Lee, Jiwan Chung, Jisung Kim, Ji Hoon Joung, Youngjae Yu

arXiv

# LLM # Watermark # Low-rank Adaptation

SEAL: Entangled White-box Watermarks on Low-Rank Adaptation

Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu

arXiv

ICRA2025

# Embodied AI # Robotics # Navigation

CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction

Suhwan Choi, Yongjun Cho, Minchan Kim, Jaeyoon Jung, Myunchul Joe, Yubeen Park, Minseo Kim, Sungwoong Kim, Sungjae Lee, Hwiseong Park, Jiwan Chung, Youngjae Yu

arXiv

NAACL2025 (Oral)

# Multimodal # LLM # Chart Generation

C^2: Scalable Auto-Feedback for LLM-based Chart Generation

Woosung Koh*, Janghan Yoon*, Minhyung Lee, Youngjin Song, Jaegwan Cho, Jaehyun Kang, Taehyeon Kim, Seyoung Yun, Youngjae Yu, Bongshin Lee

arXiv

NAACL2025 (Findings)

# NLP # Personality # Psychometrics

Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics

Seungbeen Lee*, Seungwon Lim*, Seungju Han, Giyeong Oh, Jiwan Chung, Minju Kim, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu

arXiv

NAACL2025 (Findings)

# Multimodal # Egocentric # Dialogue System

EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild

Junhyeok Kim, Minsoo Kim, Jiwan Chung, Jungbin Cho, Jisoo Kim, Sungwoong Kim, Gyeongbo Sim, Youngjae Yu

arXiv

AAAI2025

# 3D # Speech # Facial Expression

DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation

Jisoo Kim*, Jungbin Cho*, Joonho Park, Soonmin Hwang, Da Eun Kim, Geon Kim, Youngjae Yu

arXiv

AAAI2025

# Multimodal # Debiasing

MASS: Overcoming Language Bias in Image-Text Matching

Jiwan Chung, Seungwon Lim, Sangkyu Lee, Youngjae Yu

arXiv

AAAI2025

# Multimodal # Video LLM # Preference

i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment

Daechul Ahn, Yura Choi, San Kim, Youngjae Yu, Dongyeop Kang, Jonghyun Choi

arXiv