Kun-Yu Lin

CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion

Aug 10, 2025
Via arXiv

Panoptic Captioning: Seeking An Equivalency Bridge for Image and Text

May 22, 2025
Via arXiv

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

May 21, 2025
Via arXiv

ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding

Apr 25, 2025
Via arXiv

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Apr 02, 2025
Via arXiv

Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

Mar 31, 2025
Via arXiv

ViSpeak: Visual Instruction Feedback in Streaming Videos

Mar 17, 2025
Via arXiv

Task-Oriented 6-DoF Grasp Pose Detection in Clutters

Feb 24, 2025
Via arXiv

ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations

Jan 24, 2025
Via arXiv

TechCoach: Towards Technical Keypoint-Aware Descriptive Action Coaching

Nov 26, 2024
Via arXiv