Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Junyi Cao

Patch-Discontinuity Mining for Generalized Deepfake Detection

Dec 26, 2025

Huanhuan Yuan, Yang Ping, Zhengqin Xu, Junyi Cao, Shuai Jia, Chao Ma

Abstract:The rapid advancement of generative artificial intelligence has enabled the creation of highly realistic fake facial images, posing serious threats to personal privacy and the integrity of online information. Existing deepfake detection methods often rely on handcrafted forensic cues and complex architectures, achieving strong performance in intra-domain settings but suffering significant degradation when confronted with unseen forgery patterns. In this paper, we propose GenDF, a simple yet effective framework that transfers a powerful large-scale vision model to the deepfake detection task with a compact and neat network design. GenDF incorporates deepfake-specific representation learning to capture discriminative patterns between real and fake facial images, feature space redistribution to mitigate distribution mismatch, and a classification-invariant feature augmentation strategy to enhance generalization without introducing additional trainable parameters. Extensive experiments demonstrate that GenDF achieves state-of-the-art generalization performance in cross-domain and cross-manipulation settings while requiring only 0.28M trainable parameters, validating the effectiveness and efficiency of the proposed framework.

* Our paper was accepted by the IEEE Transactions on Multimedia

Via

Access Paper or Ask Questions

Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Oct 01, 2025

Minhui Zhu, Minyang Tian, Xiaocheng Yang, Tianci Zhou, Penghao Zhu, Eli Chertkov, Shengyan Liu, Yufeng Du, Lifan Yuan, Ziming Ji(+54 more)

Figure 1 for Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Figure 2 for Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Figure 3 for Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Figure 4 for Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark

Abstract:While large language models (LLMs) with reasoning capabilities are progressing rapidly on high-school math competitions and coding, can they reason effectively through complex, open-ended challenges found in frontier physics research? And crucially, what kinds of reasoning tasks do physicists want LLMs to assist with? To address these questions, we present the CritPt (Complex Research using Integrated Thinking - Physics Test, pronounced "critical point"), the first benchmark designed to test LLMs on unpublished, research-level reasoning tasks that broadly covers modern physics research areas, including condensed matter, quantum physics, atomic, molecular & optical physics, astrophysics, high energy physics, mathematical physics, statistical physics, nuclear physics, nonlinear dynamics, fluid dynamics and biophysics. CritPt consists of 71 composite research challenges designed to simulate full-scale research projects at the entry level, which are also decomposed to 190 simpler checkpoint tasks for more fine-grained insights. All problems are newly created by 50+ active physics researchers based on their own research. Every problem is hand-curated to admit a guess-resistant and machine-verifiable answer and is evaluated by an automated grading pipeline heavily customized for advanced physics-specific output formats. We find that while current state-of-the-art LLMs show early promise on isolated checkpoints, they remain far from being able to reliably solve full research-scale challenges: the best average accuracy among base models is only 4.0% , achieved by GPT-5 (high), moderately rising to around 10% when equipped with coding tools. Through the realistic yet standardized evaluation offered by CritPt, we highlight a large disconnect between current model capabilities and realistic physics research demands, offering a foundation to guide the development of scientifically grounded AI tools.

* 39 pages, 6 figures, 6 tables

Via

Access Paper or Ask Questions

SOPHY: Generating Simulation-Ready Objects with Physical Materials

Apr 17, 2025

Junyi Cao, Evangelos Kalogerakis

Abstract:We present SOPHY, a generative model for 3D physics-aware shape synthesis. Unlike existing 3D generative models that focus solely on static geometry or 4D models that produce physics-agnostic animations, our approach jointly synthesizes shape, texture, and material properties related to physics-grounded dynamics, making the generated objects ready for simulations and interactive, dynamic environments. To train our model, we introduce a dataset of 3D objects annotated with detailed physical material attributes, along with an annotation pipeline for efficient material annotation. Our method enables applications such as text-driven generation of interactive, physics-aware 3D objects and single-image reconstruction of physically plausible shapes. Furthermore, our experiments demonstrate that jointly modeling shape and material properties enhances the realism and fidelity of generated shapes, improving performance on generative geometry evaluation metrics.

* Project page: https://xjay18.github.io/SOPHY

Via

Access Paper or Ask Questions

Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

Oct 10, 2024

Junyi Cao, Shanyan Guan, Yanhao Ge, Wei Li, Xiaokang Yang, Chao Ma

Figure 1 for Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

Figure 2 for Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

Figure 3 for Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

Figure 4 for Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

Abstract:While humans effortlessly discern intrinsic dynamics and adapt to new scenarios, modern AI systems often struggle. Current methods for visual grounding of dynamics either use pure neural-network-based simulators (black box), which may violate physical laws, or traditional physical simulators (white box), which rely on expert-defined equations that may not fully capture actual dynamics. We propose the Neural Material Adaptor (NeuMA), which integrates existing physical laws with learned corrections, facilitating accurate learning of actual dynamics while maintaining the generalizability and interpretability of physical priors. Additionally, we propose Particle-GS, a particle-driven 3D Gaussian Splatting variant that bridges simulation and observed images, allowing back-propagate image gradients to optimize the simulator. Comprehensive experiments on various dynamics in terms of grounded particle accuracy, dynamic rendering quality, and generalization ability demonstrate that NeuMA can accurately capture intrinsic dynamics.

* NeurIPS 2024, the project page: https://xjay18.github.io/projects/neuma.html

Via

Access Paper or Ask Questions

Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Mar 09, 2024

Junyi Cao, Zhichao Li, Naiyan Wang, Chao Ma

Figure 1 for Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Figure 2 for Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Figure 3 for Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Figure 4 for Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Abstract:Recent studies have highlighted the promising application of NeRF in autonomous driving contexts. However, the complexity of outdoor environments, combined with the restricted viewpoints in driving scenarios, complicates the task of precisely reconstructing scene geometry. Such challenges often lead to diminished quality in reconstructions and extended durations for both training and rendering. To tackle these challenges, we present Lightning NeRF. It uses an efficient hybrid scene representation that effectively utilizes the geometry prior from LiDAR in autonomous driving scenarios. Lightning NeRF significantly improves the novel view synthesis performance of NeRF and reduces computational overheads. Through evaluations on real-world datasets, such as KITTI-360, Argoverse2, and our private dataset, we demonstrate that our approach not only exceeds the current state-of-the-art in novel view synthesis quality but also achieves a five-fold increase in training speed and a ten-fold improvement in rendering speed. Codes are available at https://github.com/VISION-SJTU/Lightning-NeRF .

* Accepted to ICRA 2024

Via

Access Paper or Ask Questions

Artificial Intelligence Security Competition (AISC)

Dec 07, 2022

Yinpeng Dong, Peng Chen, Senyou Deng, Lianji L, Yi Sun, Hanyu Zhao, Jiaxing Li, Yunteng Tan, Xinyu Liu, Yangyi Dong(+42 more)

Figure 1 for Artificial Intelligence Security Competition (AISC)

Figure 2 for Artificial Intelligence Security Competition (AISC)

Figure 3 for Artificial Intelligence Security Competition (AISC)

Figure 4 for Artificial Intelligence Security Competition (AISC)

Abstract:The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate the research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks, including Deepfake Security Competition, Autonomous Driving Security Competition, and Face Recognition Security Competition. This report will introduce the competition rules of these three tracks and the solutions of top-ranking teams in each track.

* Technical report of AISC

Via

Access Paper or Ask Questions