"Text": models, code, and papers

Towards A Better Metric for Text-to-Video Generation

Jan 15, 2024
Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, Weisi Lin, Wynne Hsu, Ying Shan, Mike Zheng Shou

Enhancing Image-Text Matching with Adaptive Feature Aggregation

Jan 18, 2024
Zuhui Wang, Yunting Yin, I. V. Ramakrishnan

A systematic investigation of learnability from single child linguistic input

Feb 12, 2024
Yulu Qin, Wentao Wang, Brenden M. Lake

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models

Feb 12, 2024
Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, Mike Zheng Shou

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Feb 13, 2024
Fei Deng, Qifei Wang, Wei Wei, Matthias Grundmann, Tingbo Hou

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

Feb 13, 2024
Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Vehicle Behavior Prediction by Episodic-Memory Implanted NDT

Feb 13, 2024
Peining Shen, Jianwu Fang, Hongkai Yu, Jianru Xue

Instilling Multi-round Thinking to Text-guided Image Generation

Jan 16, 2024
Lidong Zeng, Zhedong Zheng, Yinwei Wei, Tat-seng Chua

Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda

Jan 25, 2024
Richard Kimera, Daniela N. Rim, Joseph Kirabira, Ubong Godwin Udomah, Heeyoul Choi

Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings

Feb 05, 2024
Gonçalo Gomes, Isabel Coutinho, Bruno Martins
