Jiabao Li

Text-to-Image Generation for Projector-Camera System Registration

Jul 03, 2026

Xinyu Chen, Yuqi Li, Jiabao Li, Pinyan Tang, Chong Wang, Aditi Majumder

Abstract: Establishing correspondence between projector and camera images in a procam (projector + camera) system is essential for achieving high-resolution pixel matching, referred to as procam registration. The highest accuracy is typically obtained using structured light patterns (e.g., stripes or blobs). However, these methods are often inefficient and lack meaningful information for human viewers. Although some have explored the use of natural images, these often fail to provide a sufficient distribution of features to achieve comparable accuracy. Additionally, existing methods struggle to cope with environmental factors such as surface textures and variations in brightness due to ambient light or changes in camera exposure. To address these limitations, we propose a method based on deep neural networks. Our approach aims to generate a single natural image from text-based prompts that not only appears realistic but also possesses rich spatial features to enhance registration accuracy in procam applications. We have developed a deep neural network trained on a synthesized dataset that simulates potential geometric and photometric distortions encountered in a procam system illuminating a relatively smooth object (see Figure 1). Our trained network predicts the correspondence between projector and camera images, significantly improving registration accuracy across various procam configurations. By jointly considering the naturalness and feature richness of the projector images, our method minimizes visual disruptions in projected content without sacrificing precision. A user study confirms that our technique enhances perceived naturalness and usability compared to existing methods, validating its practical utility in real-world applications.

KAT-Coder-V2 Technical Report

Mar 29, 2026

Fengxiang Li, Han Zhang, Haoyang Huang, Jinghui Wang, Jinhua Hao, Kun Yuan, Mengtong Li, Minglei Zhang, Pengcheng Xu, Wenhao Zhuang(+36 more)

Abstract: We present KAT-Coder-V2, an agentic coding model developed by the KwaiKAT team at Kuaishou. KAT-Coder-V2 adopts a "Specialize-then-Unify" paradigm that decomposes agentic coding into five expert domains - SWE, WebCoding, Terminal, WebSearch, and General - each undergoing independent supervised fine-tuning and reinforcement learning, before being consolidated into a single model via on-policy distillation. We develop KwaiEnv, a modular infrastructure sustaining tens of thousands of concurrent sandbox instances, and scale RL training along task complexity, intent alignment, and scaffold generalization. We further propose MCLA for stabilizing MoE RL training and Tree Training for eliminating redundant computation over tree-structured trajectories with up to 6.2x speedup. KAT-Coder-V2 achieves 79.6% on SWE-bench Verified (vs. Claude Opus 4.6 at 80.8%), 88.7 on PinchBench (surpassing GLM-5 and MiniMax M2.7), ranks first across all three frontend aesthetics scenarios, and maintains strong generalist scores on Terminal-Bench Hard (46.8) and tau^2-Bench (93.9). Our model is publicly available at https://streamlake.com/product/kat-coder.

* 22 pages, 7 figures

T2MAC: Targeted and Trusted Multi-Agent Communication through Selective Engagement and Evidence-Driven Integration

Jan 19, 2024

Chuxiong Sun, Zehua Zang, Jiabao Li, Jiangmeng Li, Xiao Xu, Rui Wang, Changwen Zheng

Abstract: Communication stands as a potent mechanism to harmonize the behaviors of multiple agents. However, existing works primarily concentrate on broadcast communication, which not only lacks practicality but also leads to information redundancy. This surplus of one-size-fits-all information can adversely impact communication efficiency. Furthermore, existing works often resort to basic mechanisms to integrate observed and received information, impairing the learning process. To tackle these difficulties, we propose Targeted and Trusted Multi-Agent Communication (T2MAC), a straightforward yet effective method that enables agents to learn selective engagement and evidence-driven integration. With T2MAC, agents can craft individualized messages, pinpoint ideal communication windows, and engage with reliable partners, thereby improving communication efficiency. After receiving messages, the agents integrate information observed and received from different sources at an evidence level. This process enables agents to collectively use evidence garnered from multiple perspectives, fostering trusted and cooperative behaviors. We evaluate our method on a diverse set of cooperative multi-agent tasks of varying difficulty and scale, ranging from Hallway and MPE to SMAC. The experiments indicate that the proposed model not only surpasses the state-of-the-art methods in terms of cooperative performance and communication efficiency, but also exhibits impressive generalization.

* AAAI24
