Xiang Bai

Huazhong University of Science and Technology

Toward Real Text Manipulation Detection: New Dataset and New Solution

Dec 12, 2023

DSText V2: A Comprehensive Video Text Spotting Dataset for Dense and Small Text

Nov 29, 2023

Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models

Nov 28, 2023

Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Nov 24, 2023

DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning

Oct 25, 2023

SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing

Oct 12, 2023

A Discrepancy Aware Framework for Robust Anomaly Detection

Oct 11, 2023

Diffusion-based 3D Object Detection with Random Boxes

Sep 05, 2023

Turning a CLIP Model into a Scene Text Spotter

Aug 21, 2023

Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition

Aug 21, 2023