Wooseok Seo
Hi! I am a first-year PhD student at MIRLAB, part of the School of Computing at Yonsei University, advised by Prof. Youngjae Yu.
My main research interest is in pushing the boundaries of foundational models through better evaluation frameworks and effective post-training. Recently, I have been interested in:
- Evaluating and Mitigating Hallucinations of Language Models: Is hallucination inevitable? How can we develop truthful models and correctly evaluate them?
- Defining, Generating, and Selecting High-Quality Synthetic Data: Synthetic data is key to building strong models. But how do we define good synthetic data, and how should we select or generate it?
I am also interested in leveraging models to evaluate or improve other models, or utilizing them to augment human capabilities.
I am always open to research collaborations or grabbing a cup of coffee! Please reach me via email to have a chat 🤗
CV / Email / GitHub / Google Scholar / LinkedIn / Twitter
2025.10
I am attending COLM 2025! I will be in Montreal from 10/5 to 10/11, so please reach out to have a chat ☕
2025.09
I will be joining
as a Research Intern, working on foundational language models!
2025.07
One paper on studying fact verifiers was accepted at COLM 2025!
2025.06
One paper on video diffusion distillation via preference learning was accepted at ICCV 2025!
Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers
Wooseok Seo*, Seungju Han*, Jaehun Jung, Benjamin Newman, Seungwon Lim, Seungbeen Lee, Ximing Lu, Yejin Choi, Youngjae Yu
COLM, 2025
We systematically detect ambiguous and mislabeled examples in fact-verification benchmarks and introduce Clearfacts and Grayfacts, along with a SOTA 8B fact verifier and insights on building better fact verifiers.
arxiv / code / bibtex
V.I.P. : Iterative Online Preference Distillation for Efficient Video Diffusion Models
Jisoo Kim, Wooseok Seo, Junwan Kim, Seungho Park, Sooyeon Park, Youngjae Yu
ICCV, 2025
We integrate DPO and SFT losses for distillation to build an efficient video diffusion model, using an automatic pair-curation pipeline, and outperform the teacher with only synthetic data generated by the teacher itself.
arxiv / bibtex
Layout-and-Retouch: A Dual-stage Framework for Improving Diversity in Personalized Image Generation
Kangyeol Kim*, Wooseok Seo*, Sehyun Nam, Bodam Kim, Suhyeon Jeong, Wonwoo Cho, Jaegul Choo, Youngjae Yu
Under Review, 2024
We propose a two-stage approach for personalized T2I generation that first draws the context with step-blended denoising and then enhances it with multi-source attention swapping.
arxiv / bibtex