Yining Hong

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Jun 18, 2025
Via arXiv

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

May 28, 2025
Via arXiv

From Hazard Identification to Controller Design: Proactive and LLM-Supported Safety Engineering for ML-Powered Systems

Feb 11, 2025
Via arXiv

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Oct 30, 2024
Via arXiv

What Is Wrong with My Model? Identifying Systematic Problems with Semantic Data Slicing

Sep 14, 2024
Via arXiv

FlexAttention for Efficient High-Resolution Vision-Language Models

Jul 29, 2024
Via arXiv

3D-VLA: A 3D Vision-Language-Action Generative World Model

Mar 14, 2024
Via arXiv

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Jan 16, 2024
Via arXiv

GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs

Nov 08, 2023
Via arXiv

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Nov 06, 2023
Via arXiv