Jiashi Feng

NUS

SuperCLIP: CLIP with Simple Classification Supervision

Dec 16, 2025
Via arXiv

Depth Anything 3: Recovering the Visual Space from Any Views

Nov 13, 2025
Via arXiv

Puppeteer: Rig and Animate Your 3D Models

Aug 14, 2025
Via arXiv

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Jul 10, 2025
Via arXiv

Seed1.5-VL Technical Report

May 11, 2025
Via arXiv

The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer

Apr 14, 2025
Via arXiv

Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding

Apr 14, 2025
Via arXiv

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Apr 11, 2025
Via arXiv

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Apr 11, 2025
Via arXiv

4th PVUW MeViS 3rd Place Report: Sa2VA

Apr 01, 2025
Via arXiv