Jun Wang

IBM T. J. Watson Research Center

Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering

Dec 30, 2024

ECG-guided individual identification via PPG

Dec 30, 2024

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios

Dec 21, 2024

STAIR: Manipulating Collaborative and Multimodal Information for E-Commerce Recommendation

Dec 16, 2024

Why Not Together? A Multiple-Round Recommender System for Queries and Items

Dec 14, 2024

MEATRD: Multimodal Anomalous Tissue Region Detection Enhanced with Spatial Transcriptomics

Dec 14, 2024

CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels

Dec 11, 2024

StyleMark: A Robust Watermarking Method for Art Style Images Against Black-Box Arbitrary Style Transfer

Dec 10, 2024

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models

Dec 09, 2024

CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels

Dec 05, 2024