Wenxuan Xie

Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

May 13, 2024
Via arXiv

Slot-VLM: SlowFast Slots for Video-Language Modeling

Feb 20, 2024
Via arXiv

Retrieval-based Video Language Model for Efficient Long Video Question Answering

Dec 08, 2023
Via arXiv

Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API

Oct 07, 2023
Via arXiv

Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators

Jun 02, 2023
Via arXiv

Unifying Layout Generation with a Decoupled Diffusion Model

Mar 09, 2023
Via arXiv

A Semi-supervised Sensing Rate Learning based CMAB Scheme to Combat COVID-19 by Trustful Data Collection in the Crowd

Jan 17, 2023
Via arXiv

Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?

Sep 12, 2021
Via arXiv

Unsupervised Visual Representation Learning by Tracking Patches in Video

May 06, 2021
Via arXiv

Detect or Track: Towards Cost-Effective Video Object Detection/Tracking

Nov 13, 2018
Via arXiv