Siwen Luo

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

Apr 19, 2024
Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han

Via arXiv

Workshop on Document Intelligence Understanding

Jul 31, 2023
Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden

Via arXiv

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

Apr 24, 2023
Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Via arXiv

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

Dec 16, 2022
Siwen Luo, Feiqi Cao, Felipe Nunez, Zean Wen, Josiah Poon, Caren Han

Via arXiv

PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

Dec 01, 2022
Zhihao Zhang, Siwen Luo, Junyi Chen, Sijia Lai, Siqu Long, Hyunsuk Chung, Soyeon Caren Han

Via arXiv

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

Aug 22, 2022
Siwen Luo, Yihao Ding, Siqu Long, Soyeon Caren Han, Josiah Poon

Via arXiv

Local Interpretations for Explainable Natural Language Processing: A Survey

Mar 20, 2021
Siwen Luo, Hamish Ivison, Caren Han, Josiah Poon

Via arXiv

Deep Structured Feature Networks for Table Detection and Tabular Data Extraction from Scanned Financial Document Images

Feb 20, 2021
Siwen Luo, Mengting Wu, Yiwen Gong, Wanying Zhou, Josiah Poon

Via arXiv

VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks

Oct 25, 2020
Soyeon Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon

Via arXiv