"Image": models, code, and papers

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

Feb 05, 2024
Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu

Pre-training of Lightweight Vision Transformers on Small Datasets with Minimally Scaled Images

Feb 06, 2024
Jen Hong Tan

LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors

Feb 07, 2024
Sheng Jin, Xueying Jiang, Jiaxing Huang, Lewei Lu, Shijian Lu

3D-2D Neural Nets for Phase Retrieval in Noisy Interferometric Imaging

Feb 08, 2024
Andrew H. Proppe, Guillaume Thekkadath, Duncan England, Philip J. Bustard, Frédéric Bouchard, Jeff S. Lundeen, Benjamin J. Sussman

Déjà Vu Memorization in Vision-Language Models

Feb 03, 2024
Bargav Jayaraman, Chuan Guo, Kamalika Chaudhuri

NeuralSentinel: Safeguarding Neural Network Reliability and Trustworthiness

Feb 12, 2024
Xabier Echeberria-Barrio, Mikel Gorricho, Selene Valencia, Francesco Zola

Unleashing the Infinity Power of Geometry: A Novel Geometry-Aware Transformer (GOAT) for Whole Slide Histopathology Image Analysis

Feb 08, 2024
Mingxin Liu, Yunzan Liu, Pengbo Xu, Jiquan Ma

Importance-Aware Image Segmentation-based Semantic Communication for Autonomous Driving

Jan 16, 2024
Jie Lv, Haonan Tong, Qiang Pan, Zhilong Zhang, Xinxin He, Tao Luo, Changchuan Yin

An Optimization Framework for Processing and Transfer Learning for the Brain Tumor Segmentation

Feb 10, 2024
Tianyi Ren, Ethan Honey, Harshitha Rebala, Abhishek Sharma, Agamdeep Chopra, Mehmet Kurt

Cacophony: An Improved Contrastive Audio-Text Model

Feb 10, 2024
Ge Zhu, Zhiyao Duan
