Alexander G. Hauptmann

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

Oct 09, 2023
Lijun Yu, José Lezama, Nitesh B. Gundavarapu, Luca Versari, Kihyuk Sohn, David Minnen, Yong Cheng, Agrim Gupta, Xiuye Gu, Alexander G. Hauptmann, Boqing Gong, Ming-Hsuan Yang, Irfan Essa, David A. Ross, Lu Jiang

Via arXiv

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

Jul 03, 2023
Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

Via arXiv

Document Entity Retrieval with Massive and Noisy Pre-training

Jun 15, 2023
Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander G. Hauptmann, Hanjun Dai, Wei Wei

Via arXiv

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Apr 05, 2023
Zhi-Qi Cheng, Qi Dai, Siyao Li, Jingdong Sun, Teruko Mitamura, Alexander G. Hauptmann

Via arXiv

MAGVIT: Masked Generative Video Transformer

Dec 10, 2022
Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang

Via arXiv

Rethinking Spatial Invariance of Convolutional Networks for Object Counting

Jun 10, 2022
Zhi-Qi Cheng, Qi Dai, Hong Li, JingKuan Song, Xiao Wu, Alexander G. Hauptmann

Via arXiv

Argus++: Robust Real-time Activity Detection for Unconstrained Video Streams with Overlapping Cube Proposals

Jan 14, 2022
Lijun Yu, Yijun Qian, Wenhe Liu, Alexander G. Hauptmann

Via arXiv

Subspace Representation Learning for Few-shot Image Classification

May 05, 2021
Ting-Yao Hu, Zhi-Qi Cheng, Alexander G. Hauptmann

Via arXiv

Pose Guided Person Image Generation with Hidden p-Norm Regression

Feb 19, 2021
Ting-Yao Hu, Alexander G. Hauptmann

Via arXiv