"Text": models, code, and papers

Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System

Dec 21, 2023
Ryu Hirai, Shinya Iizuka, Haruhisa Iseno, Ao Guo, Jingjing Jiang, Atsumoto Ohashi, Ryuichiro Higashinaka

FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

Dec 22, 2023
Yuehao Yin, Huiyan Qi, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

Dec 22, 2023
Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, Gang Yu

Towards a Unified Multimodal Reasoning Framework

Dec 22, 2023
Abhinav Arun, Dipendra Singh Mal, Mehul Soni, Tomohiro Sawada

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

Dec 21, 2023
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou

DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)

Dec 21, 2023
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer

Experimenting with Large Language Models and vector embeddings in NASA SciX

Dec 21, 2023
Sergi Blanco-Cuaresma, Ioana Ciucă, Alberto Accomazzi, Michael J. Kurtz, Edwin A. Henneken, Kelly E. Lockhart, Felix Grezes, Thomas Allen, Golnaz Shapurian, Carolyn S. Grant, Donna M. Thompson, Timothy W. Hostetler, Matthew R. Templeton, Shinyi Chen, Jennifer Koch, Taylor Jacovich, Daniel Chivvis, Fernanda de Macedo Alves, Jean-Claude Paquin, Jennifer Bartlett, Mugdha Polimera, Stephanie Jarmak

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

Dec 13, 2023
Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer

Nov 14, 2023
Huashan Sun, Yixiao Wu, Yinghao Li, Jiawei Li, Yizhe Yang, Yang Gao

A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization

Dec 24, 2023
Jinchao Zhu, Yuxuan Wang, Xiaobing Tu, Siyuan Pan, Pengfei Wan, Gao Huang
