Wonjong Rhee

Soft Injection of Task Embeddings Outperforms Prompt-Based In-Context Learning

Jul 28, 2025
Via arXiv

ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation

Jul 02, 2025
Via arXiv

Task-Specific Preconditioner for Cross-Domain Few-Shot Learning

Dec 20, 2024
Via arXiv

Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models

Dec 03, 2024
Via arXiv

A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets

Oct 14, 2024
Via arXiv

Towards a Better Evaluation of Out-of-Domain Generalization

Jun 02, 2024
Via arXiv

Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis

May 18, 2024
Via arXiv

An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM

Mar 27, 2024
Via arXiv

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Mar 22, 2024
Via arXiv

Improving Forward Compatibility in Class Incremental Learning by Increasing Representation Rank and Feature Richness

Mar 22, 2024
Via arXiv