Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shujie Wang

Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence

May 07, 2026

Xinquan Chen, Zhenyun Yin, Shan He, Bin Huang, Shanzhe Lei, Pengcheng Shi, Kun Cai, Bei Chen, Bangwei Liu, Zeyu Kang(+29 more)

Abstract:As large models evolve from conversational assistants into autonomous agents, challenges increasingly arise from long-horizon decision making, tool use, and real environment interaction. Existing agenticinfrastructure remain fragmented across evaluation, data management, and agent evolution, making it difficult to discover risks systematically and improve models in a continuous closed loop. In this report, we present \textbf{Safactory}, a scalable agent factory for trustworthy autonomous intelligence. Safactory integrates three tightly coupled platforms: a \textbf{Parallel Simulation Platform} for trajectory generation, a \textbf{Trustworthy Data Platform} for trajectory storage and experience extraction, and an \textbf{Autonomous Evolution Platform} for asynchronous reinforcement learning and on-policy distillation. As far as we know, Safactory is the first framework to propose a unified evolutionary pipeline for next-generation trustworthy autonomous intelligence.

* 50 pages, 21 figures

Via

Access Paper or Ask Questions

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025

Shanghai AI Lab, :, Yicheng Bao, Guanxu Chen, Mingkang Chen, Yunhao Chen, Chiyu Chen, Lingjie Chen, Sirui Chen, Xinquan Chen(+108 more)

$Figure 1 for SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law$

$Figure 2 for SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law$

$Figure 3 for SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law$

$Figure 4 for SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law$

Abstract:We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety. It is developed by our proposed SafeLadder framework, which incorporates large-scale, progressive, safety-oriented reinforcement learning post-training, supported by a suite of multi-principled verifiers. Unlike previous alignment methods such as RLHF that simply learn human preferences, SafeLadder enables SafeWork-R1 to develop intrinsic safety reasoning and self-reflection abilities, giving rise to safety `aha' moments. Notably, SafeWork-R1 achieves an average improvement of $46.54\%$ over its base model Qwen2.5-VL-72B on safety-related benchmarks without compromising general capabilities, and delivers state-of-the-art safety performance compared to leading proprietary models such as GPT-4.1 and Claude Opus 4. To further bolster its reliability, we implement two distinct inference-time intervention methods and a deliberative search mechanism, enforcing step-level verification. Finally, we further develop SafeWork-R1-InternVL3-78B, SafeWork-R1-DeepSeek-70B, and SafeWork-R1-Qwen2.5VL-7B. All resulting models demonstrate that safety and capability can co-evolve synergistically, highlighting the generalizability of our framework in building robust, reliable, and trustworthy general-purpose AI.

* 47 pages, 18 figures, authors are listed in alphabetical order by their last names

Via

Access Paper or Ask Questions

Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints

Jul 23, 2025

Zhenyun Yin, Shujie Wang, Xuhong Wang, Xingjun Ma, Yinchun Wang

Abstract:Improving the reliability of large language models (LLMs) is critical for deploying them in real-world scenarios. In this paper, we propose \textbf{Deliberative Searcher}, the first framework to integrate certainty calibration with retrieval-based search for open-domain question answering. The agent performs multi-step reflection and verification over Wikipedia data and is trained with a reinforcement learning algorithm that optimizes for accuracy under a soft reliability constraint. Empirical results show that proposed method improves alignment between model confidence and correctness, leading to more trustworthy outputs. This paper will be continuously updated.

* The inconsistency of base models undermines the fairness of evaluation comparisons and affects the validity of the paper's conclusions

Via

Access Paper or Ask Questions

TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from images

Mar 26, 2024

Liping Yang, Joshua Driscol, Ming Gong, Shujie Wang, Catherine G. Potts

Figure 1 for TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from images

Figure 2 for TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from images

Figure 3 for TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from images

Figure 4 for TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from images

Abstract:Line detection is a classic and essential problem in image processing, computer vision and machine intelligence. Line detection has many important applications, including image vectorization (e.g., document recognition and art design), indoor mapping, and important societal challenges (e.g., sea ice fracture line extraction from satellite imagery). Many line detection algorithms and methods have been developed, but robust and intuitive methods are still lacking. In this paper, we proposed and implemented a topological graph-guided algorithm, named TGGLinesPlus, for line detection. Our experiments on images from a wide range of domains have demonstrated the flexibility of our TGGLinesPlus algorithm. We also benchmarked our algorithm with five classic and state-of-the-art line detection methods and the results demonstrate the robustness of TGGLinesPlus. We hope our open-source implementation of TGGLinesPlus will inspire and pave the way for many applications where spatial science matters.

* Our TGGLinesPlus Python implementation is open source. 27 pages, 8 figures and 4 tables

Via

Access Paper or Ask Questions

F-Eval: Asssessing Fundamental Abilities with Refined Evaluation Methods

Jan 26, 2024

Yu Sun, Keyu Chen, Shujie Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin

Abstract:Large language models (LLMs) garner significant attention for their unprecedented performance, leading to an increasing number of researches evaluating LLMs. However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage. Previous subjective evaluation methods mainly reply on scoring by API models. However, in the absence of references, large models have shown limited ability to discern subtle differences. To bridge the gap, we propose F-Eval, a bilingual evaluation benchmark to evaluate the fundamental abilities, including expression, commonsense and logic. The tasks in F-Eval include multi-choice objective tasks, open-ended objective tasks, reference-based subjective tasks and reference-free subjective tasks. For reference-free subjective tasks, we devise new evaluation methods, serving as alternatives to scoring by API models. We conduct evaluations on 13 advanced LLMs. Results show that our evaluation methods show higher correlation coefficients and larger distinction than other evaluators. Additionally, we discuss the influence of different model sizes, dimensions, and normalization methods. We anticipate that F-Eval will facilitate the study of LLMs' fundamental abilities.

* 21 pages, 7 figures

Via

Access Paper or Ask Questions

An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Jan 11, 2022

Shujie Wang, Gongshu Wang, Guangying Pei

Figure 1 for An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Figure 2 for An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Figure 3 for An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Figure 4 for An EEG-based approach for Parkinson's disease diagnosis using Capsule network

Abstract:As the second most common neurodegenerative disease, Parkinson's disease has caused serious problems worldwide. However, the cause and mechanism of PD are not clear, and no systematic early diagnosis and treatment of PD have been established. Many patients with PD have not been diagnosed or misdiagnosed. In this paper, we proposed an EEG-based approach to diagnosing Parkinson's disease. It mapped the frequency band energy of electroencephalogram(EEG) signals to 2-dimensional images using the interpolation method and identified classification using capsule network(CapsNet) and achieved 89.34% classification accuracy for short-term EEG sections. A comparison of separate classification accuracy across different EEG bands revealed the highest accuracy in the gamma bands, suggesting that we need to pay more attention to the changes in gamma band changes in the early stages of PD.

* 6 pages,2 image, 3 tables

Via

Access Paper or Ask Questions