"Text": models, code, and papers

FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

Dec 22, 2023
Yuehao Yin, Huiyan Qi, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

Dec 22, 2023
Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, Gang Yu

Towards a Unified Multimodal Reasoning Framework

Dec 22, 2023
Abhinav Arun, Dipendra Singh Mal, Mehul Soni, Tomohiro Sawada

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

Nov 30, 2023
Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

Dec 21, 2023
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou

DP-AdamBC: Your DP-Adam Is Actually DP-SGD (Unless You Apply Bias Correction)

Dec 21, 2023
Qiaoyue Tang, Frederick Shpilevskiy, Mathias Lécuyer

Experimenting with Large Language Models and vector embeddings in NASA SciX

Dec 21, 2023
Sergi Blanco-Cuaresma, Ioana Ciucă, Alberto Accomazzi, Michael J. Kurtz, Edwin A. Henneken, Kelly E. Lockhart, Felix Grezes, Thomas Allen, Golnaz Shapurian, Carolyn S. Grant, Donna M. Thompson, Timothy W. Hostetler, Matthew R. Templeton, Shinyi Chen, Jennifer Koch, Taylor Jacovich, Daniel Chivvis, Fernanda de Macedo Alves, Jean-Claude Paquin, Jennifer Bartlett, Mugdha Polimera, Stephanie Jarmak

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

Dec 13, 2023
Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization

Dec 24, 2023
Jinchao Zhu, Yuxuan Wang, Xiaobing Tu, Siyuan Pan, Pengfei Wan, Gao Huang

TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer

Nov 14, 2023
Huashan Sun, Yixiao Wu, Yinghao Li, Jiawei Li, Yizhe Yang, Yang Gao
