Projects

Geometric Action Model for Visuomotor Control

A geometry-aware action modeling approach for visuomotor control.

Under Review, 2026

#Robotics #Visuomotor Control #3D Vision

Repurposing Geometric Foundation Models for Multi-view Diffusion

We repurpose geometric foundation model features as the latent space for multi-view diffusion, achieving 4.4× faster training convergence and zero-shot geometry decoding.

Under Review, 2026

#3D Vision #Diffusion Models #Generative AI

Project Page Builder

A visual builder for creating academic research project pages, with drag-and-drop sections, multiple templates, LLM-powered paper extraction, and one-click GitHub deployment.

#Web Development #HCI #Developer Tools

Publication

CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models

A training technique that supervises attention maps using geometric correspondence, reducing training iterations by half while achieving superior multi-view generation quality.

CVPR 2026

#Generative AI #Diffusion Models #3D Vision

Publication

Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching

A method that learns condition-dependent source distributions for flow matching, enabling straighter transport paths and up to 3× faster convergence.

Under Review, 2026

#Generative AI #Flow Matching

Driving Practice: Optimizing Segmentation Models in Resource-Constrained Environments

A practical study on improving semantic segmentation for autonomous driving under limited data and compute, focusing on architecture choices, coordinate-aware convolutions, and optimizer/schedule tuning.

AIKU (Korea University AI Society) Internal Project

#Semantic Segmentation #Autonomous Driving #Deep Learning #Computer Vision

Revisiting Flow-Conditioned Motion Transfer via Pseudo-Flow and Consecutive Local Attention

A novel framework that redefines motion guidance in video diffusion by extracting 'pseudo-flow' from 2D attention layers, enabling more interpretable and robust motion transfer.

IEIE 2025 Undergraduate Paper Competition (Excellence Award)

#Motion Transfer #Video Generation #Computer Vision #Diffusion Models

Publication

ReMoTE: A Benchmark for Object Motion Transfer

A benchmark dataset and evaluation protocol for object motion transfer, introducing improved metrics that better correlate with human perception.

ITC-CSCC 2025 (Oral)

#Benchmark #Motion Transfer #Computer Vision

Latent Diffusion Models for Domain Adaptation

Fine-tuned semantic-map-conditioned LDMs with ControlNet for unsupervised, unpaired synthetic-to-real image translation.

#Domain Adaptation #Diffusion Models #ControlNet

Horang Studio

AI profile picture generation service for the Korea University festival: an identity-preserving Stable Diffusion pipeline serving 2,000+ students.

#Generative AI #Stable Diffusion #Web Service

IConZIC (Image-Conditioned Zero-shot Image Captioning)

Proposed faster, more stable caption generation for zero-shot image captioning using Gibbs sampling and a masked VLM.

#NLP #VLM #Image Captioning

MelitsUp / @tune

Music recommendation service using VLM and LLM-based lyrics augmentation to match user preferences and context.

#Music RecSys #LLM #VLM
