Feng Zheng

InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction

Dec 08, 2024
Via arXiv

Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases

Dec 03, 2024
Via arXiv

LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

Nov 29, 2024
Via arXiv

PlantCamo: Plant Camouflage Detection

Oct 23, 2024
Via arXiv

MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Oct 12, 2024
Via arXiv

Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models

Oct 10, 2024
Via arXiv

CAR: Controllable Autoregressive Modeling for Visual Generation

Oct 07, 2024
Via arXiv

Unlocking Memorization in Large Language Models with Dynamic Soft Prompting

Sep 20, 2024
Via arXiv

All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents

Aug 20, 2024
Via arXiv

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Jul 16, 2024
Via arXiv