Abstract:Spline adaptive filtering (SAF) algorithms based on information-theoretic learning have exhibited strong convergence performance in nonlinear system identification (NSI), establishing SAF as a promising framework for adaptive filtering. However, existing SAF-based methods degrade in generalized Gaussian noise (GGN) environments and exhibit significant steady-state misalignment under impulsive noise. Moreover, prior research on SAF algorithms has not effectively addressed the adverse effects of outliers. To overcome these challenges, the generalized modified Blake-Zisserman robust spline adaptive filtering (SAF-GMBZ) algorithm is proposed. Compared with conventional SAF algorithms, SAF-GMBZ exhibits superior learning performance in GGN. Furthermore, the mean convergence ranges of the step sizes and the steady-state mean-square error (MSE) are derived under commonly used assumptions. To achieve good convergence accuracy and noise cancellation capability in active noise control (ANC) applications, the filter-c GMBZ (FcGMBZ) algorithm is further developed based on SAF-GMBZ. Simulation results confirm the accuracy of the theoretical steady-state MSE and the superiority of the SAF-GMBZ algorithm under GGN in NSI, along with the effectiveness of the FcGMBZ algorithm in ANC under impulsive noise.
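
To make the GGN test condition concrete, the following is a minimal sketch of how zero-mean generalized Gaussian noise can be sampled for a simulation. It assumes the density is proportional to exp(-|x/alpha|^beta) and uses the standard identity that alpha * G^(1/beta) with a random sign is generalized Gaussian when G is Gamma(1/beta, 1). The function names and the parameter names `alpha` and `beta` are illustrative assumptions, not notation taken from the paper.

```python
import math
import random


def sample_ggn(n, alpha=1.0, beta=2.0, seed=0):
    """Draw n zero-mean generalized Gaussian samples.

    Density is proportional to exp(-|x / alpha| ** beta).
    beta = 2 gives a Gaussian, beta = 1 a Laplacian; smaller beta
    gives heavier tails. (alpha and beta are illustrative names.)
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # If G ~ Gamma(shape=1/beta, scale=1), then alpha * G**(1/beta)
        # has the magnitude distribution of a generalized Gaussian.
        g = rng.gammavariate(1.0 / beta, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        samples.append(sign * alpha * g ** (1.0 / beta))
    return samples


def ggn_variance(alpha, beta):
    """Theoretical variance: alpha^2 * Gamma(3/beta) / Gamma(1/beta)."""
    return alpha ** 2 * math.gamma(3.0 / beta) / math.gamma(1.0 / beta)
```

Comparing the sample variance of `sample_ggn(...)` with `ggn_variance(...)` is a quick sanity check before feeding the noise into a filter simulation.
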
Abstract:As large language models (LLMs) become more capable and widely used, ensuring the safety of their outputs is increasingly critical. Existing guardrail models, though useful in static evaluation settings, face two major limitations in real-world applications: (1) they typically output only binary "safe/unsafe" labels, which can be interpreted inconsistently across diverse safety policies, rendering them incapable of accommodating varying safety tolerances across domains; and (2) they require complete model outputs before performing safety checks, making them fundamentally incompatible with streaming LLM inference, thereby preventing timely intervention during generation and increasing exposure to harmful partial outputs. To address these challenges, we present Qwen3Guard, a series of multilingual safety guardrail models with two specialized variants: Generative Qwen3Guard, which casts safety classification as an instruction-following task to enable fine-grained tri-class judgments (safe, controversial, unsafe); and Stream Qwen3Guard, which introduces a token-level classification head for real-time safety monitoring during incremental text generation. Both variants are available in three sizes (0.6B, 4B, and 8B parameters) and support up to 119 languages and dialects, providing comprehensive, scalable, and low-latency safety moderation for global LLM deployments. Evaluated across English, Chinese, and multilingual benchmarks, Qwen3Guard achieves state-of-the-art performance in both prompt and response safety classification. All models are released under the Apache 2.0 license for public use.
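
The streaming idea described above can be sketched as a loop that checks each new token before it is released. This is only an illustration under stated assumptions: `classify_prefix` is a hypothetical stand-in for a token-level guard, `toy_classifier` is a keyword toy, and none of these names correspond to the actual Qwen3Guard interface.

```python
from typing import Callable, Iterable, List, Tuple

SAFE, CONTROVERSIAL, UNSAFE = "safe", "controversial", "unsafe"


def moderate_stream(
    tokens: Iterable[str],
    classify_prefix: Callable[[List[str]], str],
    block_on: Tuple[str, ...] = (UNSAFE,),
) -> Tuple[List[str], str]:
    """Emit tokens one at a time, stopping at the first blocked label.

    classify_prefix is a hypothetical token-level guard: it receives the
    tokens generated so far and returns one of the three labels.
    block_on encodes the deployment's tolerance; a stricter policy can
    add CONTROVERSIAL.
    """
    emitted: List[str] = []
    verdict = SAFE
    for token in tokens:
        candidate = emitted + [token]
        label = classify_prefix(candidate)
        if label in block_on:
            # Stop before the offending token reaches the user.
            return emitted, label
        if label == CONTROVERSIAL:
            verdict = CONTROVERSIAL
        emitted = candidate
    return emitted, verdict


def toy_classifier(prefix: List[str]) -> str:
    """Keyword toy used only to exercise the loop; not a real guard."""
    last = prefix[-1].lower()
    if last == "forbidden":
        return UNSAFE
    if last == "borderline":
        return CONTROVERSIAL
    return SAFE
```

Treating the middle label as a policy decision (`block_on`) rather than hard-coding it is what lets one model serve domains with different safety tolerances.
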
Abstract:Non-Gaussian noise and uncertainty in the noise distribution are common factors that reduce accuracy in dynamic state estimation of power systems (PS). In addition, choosing the optimal values of the free coefficients in unscented Kalman filters (UKF) based on information-theoretic criteria is also a pressing problem. In this paper, a robust adaptive UKF (AUKF) based on generalized minimum mixture error entropy with fiducial points (GMMEEF) and tuned by an improved Snow Geese algorithm (ISGA), denoted ISGA-GMMEEF-AUKF, is proposed to overcome these difficulties. The estimation process of the proposed algorithm is based on several key steps, including augmented regression error model (AREM) construction, adaptive state estimation, and free-coefficient optimization. Specifically, an AREM consisting of state prediction and measurement errors is first established. Then, GMMEEF-AUKF is developed by solving an optimization problem based on GMMEEF, which combines a generalized Gaussian kernel with mixture correntropy to further enhance flexibility and handle data with complex attributes, and which updates the noise covariance matrix within the AREM framework. Finally, the ISGA is designed to automatically compute the optimal values of coefficients such as the shape coefficients of the kernel in the GMMEEF criterion, the sigma-point selection coefficients in the unscented transform, and the update coefficient of the noise covariance matrices, fitted to the PS model. Simulation results on the IEEE 14-, 30-, and 57-bus test systems in complex scenarios confirm that the proposed algorithm outperforms the MEEF-UKF and UKF with average efficiency gains of 26% and 65%, respectively.
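
As background for the sigma-point coefficients mentioned above, here is a minimal scalar sketch of the standard scaled unscented transform. It shows where the free coefficients `alpha`, `beta`, and `kappa` enter the sigma points and weights. It is not the proposed ISGA-GMMEEF-AUKF; the function name and default values are illustrative assumptions.

```python
import math


def unscented_transform_1d(mean, var, f, alpha=1.0, beta=2.0, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f with the scaled UT.

    alpha, beta and kappa are the usual free coefficients; the defaults
    here are illustrative, not values from the paper.
    """
    n = 1  # state dimension (scalar case only)
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)

    # 2n + 1 = 3 sigma points: the mean and one point on each side.
    points = [mean, mean + spread, mean - spread]

    # Mean and covariance weights of the scaled unscented transform.
    w_side = 1.0 / (2.0 * (n + lam))
    w_mean = [lam / (n + lam), w_side, w_side]
    w_cov = [lam / (n + lam) + (1.0 - alpha ** 2 + beta), w_side, w_side]

    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(w_mean, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(w_cov, ys))
    return y_mean, y_var
```

For a linear map such as f(x) = 2x + 1 the transform reproduces the exact mean and variance. For nonlinear maps the result depends on the coefficient choice, which is why tuning them is a meaningful optimization target.
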



Abstract:Currently, adaptive filtering algorithms have been widely applied in frequency estimation for power systems. However, research on diffusion tasks remains insufficient. Existing diffusion adaptive frequency estimation algorithms exhibit certain limitations in handling input noise and lack robustness against impulsive noise. Moreover, traditional adaptive filtering algorithms designed based on the strictly linear (SL) model fail to effectively address frequency estimation challenges in unbalanced three-phase power systems. To address these issues, this letter proposes an improved diffusion augmented complex maximum total correntropy (DAMTCC) algorithm based on the widely linear (WL) model. The proposed algorithm not only significantly enhances the capability to handle input noise but also demonstrates superior robustness to impulsive noise. Furthermore, it successfully resolves the critical challenge of frequency estimation in unbalanced three-phase power systems, offering an efficient and reliable solution for diffusion power system frequency estimation. Finally, we analyze the stability of the algorithm, and computer simulations verify its excellent performance.
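
To illustrate why the WL model suits unbalanced systems, the sketch below fits v[k+1] = h*v[k] + g*conj(v[k]) to three noise-free samples of a Clarke-transformed voltage and recovers the frequency in closed form. It assumes the voltage is the sum of a positive-sequence and a negative-sequence phasor. This is a batch, noise-free illustration, not the adaptive DAMTCC algorithm, and all function names are hypothetical.

```python
import cmath
import math


def unbalanced_voltage(k, freq, dt, a=1.0, b=0.3 + 0.1j):
    """Clarke-transformed voltage with positive (a) and negative (b) sequences."""
    theta = 2 * math.pi * freq * dt * k
    return a * cmath.exp(1j * theta) + b * cmath.exp(-1j * theta)


def wl_coefficients(v0, v1, v2):
    """Solve v[k+1] = h*v[k] + g*conj(v[k]) from three noise-free samples."""
    c0, c1 = v0.conjugate(), v1.conjugate()
    det = v0 * c1 - c0 * v1
    if abs(det) < 1e-12:
        raise ValueError("samples do not determine h and g uniquely")
    # Cramer's rule on the 2x2 complex system.
    h = (v1 * c1 - c0 * v2) / det
    g = (v0 * v2 - v1 * v1) / det
    return h, g


def wl_frequency(h, g, dt):
    """Frequency in Hz from WL coefficients (assumes |omega * dt| <= pi / 2)."""
    if abs(g) < 1e-9:
        # Balanced system: the conjugate term vanishes.
        return math.asin(h.imag) / (2 * math.pi * dt)
    root = math.sqrt(max(h.imag ** 2 - abs(g) ** 2, 0.0))
    # a1 is the ratio conj(b) / a implied by the two-sequence model.
    a1 = (-1j * h.imag + 1j * root) / g
    return math.asin((h + a1 * g).imag) / (2 * math.pi * dt)
```

With `dt = 1 / 5000` and a 50 Hz signal, `wl_frequency(*wl_coefficients(v0, v1, v2), dt)` recovers the frequency for both balanced (`b = 0`) and unbalanced inputs. A strictly linear estimate based only on v[k+1] / v[k] is biased once `b` is nonzero.
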




Abstract:Reflection, the ability to adapt beliefs or behaviors in response to unexpected outcomes, is fundamental to intelligent systems' interaction with the world. From a cognitive science perspective, this serves as a core principle of intelligence applicable to both human and AI systems. To address the debate on the intelligence of large language models (LLMs), we propose Reflection-Bench, a comprehensive benchmark comprising 7 tasks spanning core cognitive functions crucial for reflection, including perception, memory, belief updating, decision-making, prediction, counterfactual thinking, and meta-reflection. We evaluate the performance of 13 prominent LLMs, including OpenAI o1, GPT-4, and Claude 3.5 Sonnet. The results indicate that current LLMs still lack satisfactory reflection ability. We discuss the underlying causes of these results and suggest potential avenues for future research. In conclusion, Reflection-Bench offers both evaluation tools and inspiration for developing AI capable of reliably interacting with the environment. Our data and code are available at https://github.com/YabYum/ReflectionBench.




Abstract:To address nonlinear and non-Gaussian filtering of graph signals, this paper proposes a robust square-root unscented Kalman filter based on graph signal processing. The algorithm uses the graph topology to generate measurements and an unscented transformation to obtain the a priori state estimates. In addition, to enhance the numerical stability of the unscented Kalman filter, the algorithm uses a double square-root decomposition to update the covariance matrix in the graph frequency domain. Furthermore, to handle non-Gaussian noise in the state estimation process, an error augmentation model is constructed in the graph frequency domain by unifying the measurement error and the state error; the model uses the graph Laplacian matrix to effectively reduce the cumulative error at each vertex. A general robust cost function is then adopted as the optimality criterion for the error; its additional parameter options allow it to effectively suppress random outliers and abnormal measurement values in the state estimation process. Finally, the convergence of the error of the proposed algorithm is verified theoretically, and its robustness is verified by simulation.




Abstract:Emotion Support Conversation (ESC) is a crucial application that aims to reduce human stress, offer emotional guidance, and ultimately enhance human mental and physical well-being. With the advancement of Large Language Models (LLMs), many researchers have employed LLMs as ESC models. However, how to evaluate these LLM-based ESC models remains unclear. Inspired by the rapid development of role-playing agents, we propose an ESC evaluation framework (ESC-Eval), which uses a role-playing agent to interact with ESC models, followed by a manual evaluation of the interactive dialogues. In detail, we first reorganize 2,801 role-playing cards from seven existing datasets to define the roles of the role-playing agent. Second, we train a dedicated role-playing model called ESC-Role, which behaves more like a confused person than GPT-4 does. Third, using ESC-Role and the organized role cards, we systematically conduct experiments with 14 LLMs as the ESC models, including general AI-assistant LLMs (e.g., ChatGPT) and ESC-oriented LLMs (e.g., ExTES-Llama). We conduct comprehensive human annotations on the interactive multi-turn dialogues of the different ESC models. The results show that ESC-oriented LLMs exhibit superior ESC abilities compared to general AI-assistant LLMs, but there is still a gap relative to human performance. Moreover, to automate the scoring process for future ESC models, we develop ESC-RANK, which is trained on the annotated data and achieves scoring performance that surpasses GPT-4 by 35 points. Our data and code are available at https://github.com/haidequanbu/ESC-Eval.




Abstract:Powered by remarkable advancements in Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities in manifold tasks. However, the practical application scenarios of MLLMs are intricate, exposing them to potential malicious instructions and thereby posing safety risks. While current benchmarks do incorporate certain safety considerations, they often lack comprehensive coverage and fail to exhibit the necessary rigor and robustness. For instance, the common practice of employing GPT-4V as both the evaluator and a model to be evaluated lacks credibility, as it tends to exhibit a bias toward its own responses. In this paper, we present MLLMGuard, a multidimensional safety evaluation suite for MLLMs, including a bilingual image-text evaluation dataset, inference utilities, and a lightweight evaluator. MLLMGuard's assessment comprehensively covers two languages (English and Chinese) and five important safety dimensions (Privacy, Bias, Toxicity, Truthfulness, and Legality), each with corresponding rich subtasks. Focusing on these dimensions, our evaluation dataset is primarily sourced from platforms such as social media, and it integrates text-based and image-based red teaming techniques with meticulous annotation by human experts. This can prevent inaccurate evaluation caused by data leakage when using open-source datasets and ensures the quality and challenging nature of our benchmark. Additionally, a fully automated lightweight evaluator termed GuardRank is developed, which achieves significantly higher evaluation accuracy than GPT-4. Our evaluation results across 13 advanced models indicate that MLLMs still have a substantial journey ahead before they can be considered safe and responsible.




Abstract:Although the existing maximum total generalized correntropy (MTGC) and generalized maximum Blake-Zisserman total correntropy (GMBZTC) algorithms maintain good performance under the errors-in-variables (EIV) model corrupted by generalized Gaussian noise, they require excessive manual adjustment of parameters, which greatly increases the difficulty of practical use. To solve this problem, the total arctangent based on logical distance metric (TACLDM) algorithm is proposed; it exploits the small number of parameters in logical distance metric (LDM) theory and uses the arctangent function to improve convergence behavior. Compared with competing algorithms, the TACLDM algorithm not only has fewer parameters but also is more robust to generalized Gaussian noise and significantly reduces the steady-state error. Furthermore, the performance of the algorithm in the generalized Gaussian noise environment is analyzed in detail. Finally, computer simulations demonstrate the outstanding performance of the TACLDM algorithm and support the theoretical derivation.




Abstract:In recent years, multi-modal entity linking (MEL) has garnered increasing attention in the research community due to its significance in numerous multi-modal applications. Video, as a popular means of information transmission, has become prevalent in people's daily lives. However, most existing MEL methods primarily focus on linking textual and visual mentions, or mentions in offline videos, to entities in multi-modal knowledge bases, with limited effort devoted to linking mentions within online video content. In this paper, we propose a task called Online Video Entity Linking (OVEL), which aims to establish connections between mentions in online videos and a knowledge base with high accuracy and timeliness. To facilitate research on OVEL, we concentrate on live delivery scenarios and construct a live delivery entity linking dataset called LIVE. In addition, we propose an evaluation metric that considers timeliness, robustness, and accuracy. Furthermore, to handle the OVEL task effectively, we leverage a memory block managed by a large language model (LLM) and retrieve entity candidates from the knowledge base to augment LLM performance on memory management. The experimental results demonstrate the effectiveness and efficiency of our method.