"Text": models, code, and papers

Detecting and Preventing Hallucinations in Large Vision Language Models

Aug 11, 2023
Anisha Gunjal, Jihan Yin, Erhan Bas


Consistent model selection in the spiked Wigner model via AIC-type criteria

Jul 24, 2023
Soumendu Sundar Mukherjee


HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic Joint Infection Diagnosis Using CT Images and Text

May 29, 2023
Ruiyang Li, Fujun Yang, Xianjie Liu, Hongwei Shi


MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

Aug 12, 2023
Zixiang Wang, Linzheng Chai, Jian Yang, Jiaqi Bai, Yuwei Yin, Jiaheng Liu, Hongcheng Guo, Tongliang Li, Liqun Yang, Hebboul Zine el-abidine, Zhoujun Li


Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation

Aug 14, 2023
Wei Ji, Xiangyan Liu, An Zhang, Yinwei Wei, Yongxin Ni, Xiang Wang


Orthogonal Temporal Interpolation for Zero-Shot Video Recognition

Aug 14, 2023
Yan Zhu, Junbao Zhuo, Bin Ma, Jiajia Geng, Xiaoming Wei, Xiaolin Wei, Shuhui Wang


MedMine: Examining Pre-trained Language Models on Medication Mining

Aug 08, 2023
Haifa Alrdahi, Lifeng Han, Hendrik Šuvalov, Goran Nenadic


The Visual Language of Fabrics

Jul 25, 2023
Valentin Deschaintre, Julia Guerrero-Viu, Diego Gutierrez, Tamy Boubekeur, Belen Masia


ChatGPT for GTFS: From Words to Information

Aug 04, 2023
Saipraneeth Devunuri, Shirin Qiam, Lewis Lehe


ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

May 22, 2023
Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao
