
"Information": models, code, and papers

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document

Mar 07, 2024
Yuliang Liu, Biao Yang, Qiang Liu, Zhang Li, Zhiyin Ma, Shuo Zhang, Xiang Bai


That's My Point: Compact Object-centric LiDAR Pose Estimation for Large-scale Outdoor Localisation

Mar 07, 2024
Georgi Pramatarov, Matthew Gadd, Paul Newman, Daniele De Martini


Debiasing Large Visual Language Models

Mar 08, 2024
Yi-Fan Zhang, Weichen Yu, Qingsong Wen, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan


Federated Learning Method for Preserving Privacy in Face Recognition System

Mar 08, 2024
Enoch Solomon, Abraham Woubie


Towards a Psychology of Machines: Large Language Models Predict Human Memory

Mar 08, 2024
Markus Huff, Elanur Ulakçı


Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models

Mar 03, 2024
Yuchen Wu, Minshuo Chen, Zihao Li, Mengdi Wang, Yuting Wei


SGD with Partial Hessian for Deep Neural Networks Optimization

Mar 05, 2024
Ying Sun, Hongwei Yong, Lei Zhang


Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model

Mar 03, 2024
Rui Yang, Shunpu Zhang


Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer

Mar 04, 2024
Haoxu Wang, Ming Cheng, Qiang Fu, Ming Li


SSTKG: Simple Spatio-Temporal Knowledge Graph for Intepretable and Versatile Dynamic Information Embedding

Feb 19, 2024
Ruiyi Yang, Flora D. Salim, Hao Xue
