



Abstract:Finetuning pretrained large language models (LLMs) has become the standard paradigm for developing downstream applications. However, its security implications remain unclear, particularly regarding whether finetuned LLMs inherit jailbreak vulnerabilities from their pretrained sources. We investigate this question in a realistic pretrain-to-finetune threat model, where the attacker has white-box access to the pretrained LLM and only black-box access to its finetuned derivatives. Empirical analysis shows that adversarial prompts optimized on the pretrained model transfer most effectively to its finetuned variants, revealing inherited vulnerabilities from pretrained to finetuned LLMs. To further examine this inheritance, we conduct representation-level probing, which shows that transferable prompts are linearly separable within the pretrained hidden states, suggesting that universal transferability is encoded in pretrained representations. Building on this insight, we propose the Probe-Guided Projection (PGP) attack, which steers optimization toward transferability-relevant directions. Experiments across multiple LLM families and diverse finetuned tasks confirm PGP's strong transfer success, underscoring the security risks inherent in the pretrain-to-finetune paradigm.
Abstract:There has been a prevalence of applying AI software in both high-stakes public-sector and industrial contexts. However, the lack of transparency has raised concerns about whether these data-informed AI software decisions secure fairness against people of all racial, gender, or age groups. Despite extensive research on emerging fairness-aware AI software, up to now most efforts to solve this issue have been dedicated to binary classification tasks. Fairness in regression is relatively underexplored. In this work, we adopted a mutual information-based metric to assess separation violations. The metric is also extended so that it can be directly applied to both classification and regression problems with both binary and continuous sensitive attributes. Inspired by the Reweighing algorithm in fair classification, we proposed a FairReweighing pre-processing algorithm based on density estimation to ensure that the learned model satisfies the separation criterion. Theoretically, we show that the proposed FairReweighing algorithm can guarantee separation in the training data under a data independence assumption. Empirically, on both synthetic and real-world data, we show that FairReweighing outperforms existing state-of-the-art regression fairness solutions in terms of improving separation while maintaining high accuracy.
Abstract:This paper focuses on the legal compliance challenges of autonomous vehicles in a transnational context. We choose the perspective of designers and try to provide supporting legal reasoning in the design process. Based on argumentation theory, we introduce a logic to represent the basic properties of argument-based practical (normative) reasoning, combined with partial order sets of natural numbers to express priority. Finally, through case analysis of legal texts, we show how the reasoning system we provide can help designers to adapt their design solutions more flexibly in the cross-border application of autonomous vehicles and to more easily understand the legal implications of their decisions.
Abstract:In our previous research, we provided a reasoning system (called LeSAC) based on argumentation theory to provide legal support to designers during the design process. Building on this, this paper explores how to provide designers with effective explanations for their legally relevant design decisions. We extend the previous system for providing explanations by specifying norms and the key legal or ethical principles for justifying actions in normative contexts. Considering that first-order logic has strong expressive power, in the current paper we adopt a first-order deontic logic system with deontic operators and preferences. We illustrate the advantages and necessity of introducing deontic logic and designing explanations under LeSAC by modelling two cases in the context of autonomous driving. In particular, this paper also discusses the requirements of the updated LeSAC to guarantee rationality, and proves that a well-defined LeSAC can satisfy the rationality postulate for rule-based argumentation frameworks. This ensures the system's ability to provide coherent, legally valid explanations for complex design decisions.




Abstract:In the ride-hailing industry, subsidies are predominantly employed to incentivize consumers to place more orders, thereby fostering market growth. Causal inference techniques are employed to estimate the consumer elasticity with different subsidy levels. However, the presence of confounding effects poses challenges in achieving an unbiased estimate of the uplift effect. We introduce a consumer subsidizing system to capture relationships between subsidy propensity and the treatment effect, which proves effective while maintaining a lightweight online environment.
Abstract:MAUP (modifiable areal unit problem) is a fundamental problem for spatial data management and analysis. As an instantiation of MAUP in online transportation platforms, region generation (i.e., specifying the areal unit for service operations) is the first and vital step for supporting spatiotemporal transportation services such as ride-sharing and freight transport. Most existing region generation methods are manually specified (e.g., fixed-size grids), suffering from poor spatial semantic meaning and inflexibility to meet service operation requirements. In this paper, we propose RegionGen, a data-driven region generation framework that can specify regions with key characteristics (e.g., good spatial semantic meaning and predictability) by modeling region generation as a multi-objective optimization problem. First, to obtain good spatial semantic meaning, RegionGen segments the whole city into atomic spatial elements based on road networks and obstacles (e.g., rivers). Then, it clusters the atomic spatial elements into regions by maximizing various operation characteristics, which is formulated as a multi-objective optimization problem. For this optimization problem, we propose a multi-objective co-optimization algorithm. Extensive experiments verify that RegionGen can generate more suitable regions than traditional methods for spatiotemporal service management.


Abstract:Ontology is a popular method for knowledge representation in different domains, including the legal domain, and description logics (DL) is commonly used as its description language. To handle reasoning based on inconsistent DL-based legal ontologies, the current paper presents a structured argumentation framework particularly for reasoning in legal contexts on the basis of ASPIC+, and translates the legal ontology into formulas and rules of an argumentation theory. With a particular focus on the design of autonomous vehicles from the perspective of legal AI, we show that using this combined theory of formal argumentation and DL-based legal ontology, acceptable assertions can be obtained based on inconsistent ontologies, and the traditional reasoning tasks of DL ontologies can also be accomplished. In addition, a formal definition of explanations for the result of reasoning is presented.




Abstract:Fairness in decision-making has been a long-standing issue in our society. Despite the increasing number of research activities on unfairness mitigation in machine learning models, there is little research focusing on mitigating unfairness in human decisions. Fairness in human decisions is as important as, if not more important than, fairness in machine learning models since there are processes where humans make the final decisions and machine learning models can inherit bias from the human decisions they were trained on. As a result, this work aims to detect unfairness in human decisions, the very first step of solving the unfair human decision problem. This paper proposes to utilize the existing machine learning fairness detection mechanisms to detect unfairness in human decisions. The rationale behind this is, while it is difficult to directly test whether a human makes unfair decisions, with current research on machine learning fairness, it is now easy to test, on a large scale at a low cost, whether a machine learning model is unfair. By synthesizing unfair labels on four general machine learning fairness datasets and one image processing dataset, this paper shows that the proposed approach is able to detect (1) whether or not unfair labels exist in the training data and (2) the degree and direction of the unfairness. We believe that this work demonstrates the potential of utilizing machine learning fairness to detect human decision fairness. Following this work, research can be conducted on (1) preventing future unfair decisions, (2) fixing prior unfair decisions, and (3) training a fairer machine learning model.




Abstract:This paper aims to improve machine learning fairness on multiple protected at-tributes. Machine learning fairness has attracted increasing attention since machine learning models are increasingly used for high-stakes and high-risk decisions. Most existing solutions for machine learning fairness only target one protected attribute(e.g. sex) at a time. These solutions cannot generate a machine learning model which is fair against every protected attribute (e.g. both sex and race) at the same time. To solve this problem, we propose FairBalance in this paper to balance the distribution of training data across every protected attribute before training the machine learning models. Our results show that, under the assumption of unbiased ground truth labels, FairBalance can significantly reduce bias metrics (AOD, EOD, and SPD) on every known protected attribute without much, if not any damage to the prediction performance.




Abstract:Dehazing is in the image processing and computer vision communities, the task of enhancing the image taken in foggy conditions. To better understand this type of algorithm, we present in this document a dehazing method which is suitable for several local contrast adjustment algorithms. We base it on two filters. The first filter is built with a step of normalization with some other statistical tricks while the last represents the local contrast improvement algorithm. Thus, it can work on both CPU and GPU for real-time applications. We hope that our approach will open the door to new ideas in the community. Other advantages of our method are first that it does not need to be trained, then it does not need additional optimization processing. Furthermore, it can be used as a pre-treatment or post-processing step in many vision tasks. In addition, it does not need to convert the problem into a physical interpretation, and finally that it is very fast. This family of defogging algorithms is fairly simple, but it shows promising results compared to state-of-the-art algorithms based not only on a visual assessment but also on objective criteria.