Career Profile

Hi, I’m a Ph.D. student in Computer Science and Engineering (CSE) at Seoul National University, where I currently work in the Vision and Learning Lab. I’m deeply passionate about multimodal research, focusing on extending language models beyond text to integrate other modalities. Here are the main areas of my work:

  • Cross-modal Interaction: Investigating the dependencies between different modalities to create architectures that can seamlessly fuse textual, visual, and auditory data. Exploring innovative training methods to enhance the model’s ability to learn and leverage cross-modal correlations.
  • Multimodal LLMs: Integrating language models with non-textual inputs, such as images and audio, to enrich contextual understanding.
  • Spoken Dialogue Systems: Currently focused on transforming traditional language models into human-like, dynamic, speech-based conversational agents. Designing systems capable of real-time understanding and response, enhancing user engagement and accessibility.

Education

M.S./Ph.D. in Computer Science and Engineering

2022 - Present
Seoul National University

Advisor: Gunhee Kim

B.S. in Electrical and Computer Engineering

2015 - 2022
Seoul National University

Graduated Summa Cum Laude

Publications

Behavior-SD: Behaviorally Aware Spoken Dialogue Generation with Large Language Models
Sehun Lee*, Kang-wook Kim*, Gunhee Kim
In NAACL, 2025
Meta-Learning Approach for Joint Multimodal Signals with Multimodal Iterative Adaptation
Sehun Lee*, Wonkwang Lee*, Gunhee Kim
In TMLR, 2024
Panoramic Vision Transformer for Saliency Detection on 360° Videos
Heeseung Yun, Sehun Lee, Gunhee Kim
In ECCV, 2022