"Image": models, code, and papers

CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding

Nov 15, 2023
Jianzong Wang, Yimin Deng, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao

MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

Nov 15, 2023
Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

Nov 15, 2023
Naoki Sato, Hideaki Iiduka

EyeLS: Shadow-Guided Instrument Landing System for Intraocular Target Approaching in Robotic Eye Surgery

Nov 15, 2023
Junjie Yang, Zhihao Zhao, Siyuan Shen, Daniel Zapp, Mathias Maier, Kai Huang, Nassir Navab, M. Ali Nasseri

CogVLM: Visual Expert for Pretrained Language Models

Nov 06, 2023
Weihan Wang, Qingsong Lv, Wenmeng Yu, Wenyi Hong, Ji Qi, Yan Wang, Junhui Ji, Zhuoyi Yang, Lei Zhao, Xixuan Song, Jiazheng Xu, Bin Xu, Juanzi Li, Yuxiao Dong, Ming Ding, Jie Tang

InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists

Sep 30, 2023
Yulu Gan, Sungwoo Park, Alexander Schubert, Anthony Philippakis, Ahmed M. Alaa

LT-ViT: A Vision Transformer for multi-label Chest X-ray classification

Nov 13, 2023
Umar Marikkar, Sara Atito, Muhammad Awais, Adam Mahdi

FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design

Nov 13, 2023
Zhen Huang, Yihao Li, Dong Pei, Jiapeng Zhou, Xuliang Ning, Jianlin Han, Xiaoguang Han, Xuejun Chen

Multi-domain improves out-of-distribution and data-limited scenarios for medical image analysis

Oct 10, 2023
Ece Ozkan, Xavier Boix

NeuroQuantify -- An Image Analysis Software for Detection and Quantification of Neurons and Neurites using Deep Learning

Oct 19, 2023
Ka My Dang, Yi Jia Zhang, Tianchen Zhang, Chao Wang, Anton Sinner, Piero Coronica, Joyce K. S. Poon
