Fengjie Li

Beyond Text: Aligning Vision and Language for Multimodal E-Commerce Retrieval

Mar 05, 2026

Qujiaheng Zhang, Guangyue Xu, Fengjie Li

Abstract: Modern e-commerce search is inherently multimodal: customers make purchase decisions by jointly considering product text and visual information. However, most industrial retrieval and ranking systems rely primarily on textual information, underutilizing the rich visual signals available in product images. In this work, we study unified text-image fusion for two-tower retrieval models in the e-commerce domain. We demonstrate that domain-specific fine-tuning and two-stage alignment of the query with the product text and image modalities are both crucial for effective multimodal retrieval. Building on these insights, we propose a novel modality fusion network that fuses image and text information and captures cross-modal complementary information. Experiments on large-scale e-commerce datasets validate the effectiveness of the proposed approach.
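For readers unfamiliar with the setup, the following is a minimal sketch of two-tower retrieval over fused product embeddings. It is not the authors' method: the abstract does not describe the fusion network, so a fixed scalar gate stands in for it here, and every name (`fuse`, `score`, `retrieve`, `gate`, the toy catalog) is a hypothetical chosen for illustration. The embeddings are assumed to be precomputed by separate query, text, and image encoders that share one vector space.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_emb, image_emb, gate):
    """Hypothetical fusion: a convex mix of text and image embeddings.

    `gate` in [0, 1] is the weight on the text embedding. A learned fusion
    network would replace this fixed gate; this is only a stand-in.
    """
    if len(text_emb) != len(image_emb):
        raise ValueError("text and image embeddings must have the same size")
    mixed = [gate * t + (1.0 - gate) * i for t, i in zip(text_emb, image_emb)]
    return l2_normalize(mixed)


def score(query_emb, product_emb):
    """Dot product of the normalized query with a fused product embedding."""
    query = l2_normalize(query_emb)
    return sum(q * p for q, p in zip(query, product_emb))


def retrieve(query_emb, catalog, gate=0.5, k=2):
    """Return the ids of the k products whose fused embedding scores highest.

    `catalog` maps a product id to a (text_embedding, image_embedding) pair.
    """
    scored = [
        (score(query_emb, fuse(text_emb, image_emb, gate)), product_id)
        for product_id, (text_emb, image_emb) in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [product_id for _, product_id in scored[:k]]
```

In this toy form, a product whose image embedding disagrees with its text embedding scores lower than one where both modalities match the query, which is the kind of signal a text-only tower cannot see.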

Evaluating the Generalizability of LLMs in Automated Program Repair

Mar 12, 2025

Fengjie Li, Jiajun Jiang, Jiajun Sun, Hongyu Zhang

Abstract: LLM-based automated program repair methods have attracted significant attention for their state-of-the-art performance. However, they were primarily evaluated on a few well-known datasets like Defects4J, raising questions about their effectiveness on new datasets. In this study, we evaluate 11 top-performing LLMs on DEFECTS4J-TRANS, a new dataset derived from transforming Defects4J while maintaining the original semantics. Results from experiments on both Defects4J and DEFECTS4J-TRANS show that all studied LLMs have limited generalizability in APR tasks, with the average number of correct and plausible patches decreasing by 49.48% and 42.90%, respectively, on DEFECTS4J-TRANS. Further investigation into incorporating additional repair-relevant information in repair prompts reveals that, although this information significantly enhances the LLMs' capabilities (increasing the number of correct and plausible patches by up to 136.67% and 121.82%, respectively), performance still falls short of their original results. This indicates that prompt engineering alone is insufficient to substantially enhance LLMs' repair capabilities. Based on our study, we also offer several recommendations for future research.

* 5 pages, 1 figure, to be published in ICSE2025-NIER
