

Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval

Jun 26, 2025
Via arXiv

Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration

Jun 26, 2025
Via arXiv

Logios : An open source Greek Polytonic Optical Character Recognition system

Jun 26, 2025
Via arXiv

Aligning Spoken Dialogue Models from User Interactions

Jun 26, 2025
Via arXiv

Controllable 3D Placement of Objects with Scene-Aware Diffusion Models

Jun 26, 2025
Via arXiv

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Jun 26, 2025
Via arXiv

Structuralist Approach to AI Literary Criticism: Leveraging Greimas Semiotic Square for Large Language Models

Jun 26, 2025
Via arXiv

DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images

Jun 26, 2025
Via arXiv

DidSee: Diffusion-Based Depth Completion for Material-Agnostic Robotic Perception and Manipulation

Jun 26, 2025
Via arXiv

TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence

Jun 26, 2025
Via arXiv