Yifei Cao

EAGLE: Episodic Appearance- and Geometry-aware Memory for Unified 2D-3D Visual Query Localization in Egocentric Vision

Nov 12, 2025
Via arXiv

From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling

Oct 01, 2025
Via arXiv

MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark

Sep 26, 2025
Via arXiv

Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction

Jun 14, 2025
Via arXiv

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training

Feb 06, 2025
Via arXiv

TranStable: Towards Robust Pixel-level Online Video Stabilization by Jointing Transformer and CNN

Jan 25, 2025
Via arXiv

Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling

Jan 07, 2024
Via arXiv