Shen Li

Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News Detection

Sep 28, 2023
Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi

Despite considerable advances in automated fake news detection, the timely nature of news means it remains a critical open question how to effectively predict the veracity of news articles based on limited fact-checks. Existing approaches typically follow a "Train-from-Scratch" paradigm, which is fundamentally bounded by the availability of large-scale annotated data. While expressive pre-trained language models (PLMs) have been adapted in a "Pre-Train-and-Fine-Tune" manner, the inconsistency between pre-training and downstream objectives also requires costly task-specific supervision. In this paper, we propose "Prompt-and-Align" (P&A), a novel prompt-based paradigm for few-shot fake news detection that jointly leverages the pre-trained knowledge in PLMs and the social context topology. Our approach mitigates label scarcity by wrapping the news article in a task-related textual prompt, which is then processed by the PLM to directly elicit task-specific knowledge. To supplement the PLM with social context without incurring additional training overhead, and motivated by the empirical observation of user veracity consistency (i.e., social users tend to consume news of the same veracity type), we further construct a news proximity graph over news articles to capture the veracity-consistent signals in shared readerships, and align the prompting predictions along the graph edges in a confidence-informed manner. Extensive experiments on three real-world benchmarks demonstrate that P&A sets a new state of the art for few-shot fake news detection by significant margins.

* Accepted to CIKM 2023 (Full Paper) 
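
For illustration, here is a minimal sketch of the two stages the abstract describes: eliciting a veracity prediction from a masked-language-model prompt, then aligning predictions over a news proximity graph in a confidence-weighted way. The prompt template, label words ("real"/"fake"), and propagation rule are assumptions made for illustration, not the authors' exact formulation.

```python
# Hypothetical sketch: prompt-based veracity prediction with a masked LM,
# followed by confidence-informed alignment over a news proximity graph.
# Template, verbalizer, and propagation rule are illustrative assumptions.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")
verbalizer = ["real", "fake"]  # assumed label words

def prompt_predict(article: str) -> np.ndarray:
    """Wrap the article in a task-related prompt and read class scores
    from the masked-LM logits at the mask position."""
    # Shorten long articles so the appended mask token survives truncation.
    text = f"{article[:1000]} This news is {tokenizer.mask_token}."
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0, 0]
    word_ids = [tokenizer.convert_tokens_to_ids(w) for w in verbalizer]
    scores = logits[mask_pos][word_ids]
    return torch.softmax(scores, dim=-1).numpy()

def align(preds: np.ndarray, adj: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """One illustrative propagation step: mix each article's prediction with a
    neighbor average, weighting neighbors by their prediction confidence."""
    conf = preds.max(axis=1, keepdims=True)      # per-article confidence
    weighted = adj * conf.T                      # scale edges by neighbor confidence
    row_sum = weighted.sum(axis=1, keepdims=True) + 1e-8
    neighbor_avg = weighted @ preds / row_sum
    return (1 - alpha) * preds + alpha * neighbor_avg
```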

On the Performance of Multidimensional Constellation Shaping for Linear and Nonlinear Optical Fiber Channel

Aug 17, 2023
Bin Chen, Zhiwei Liang, Shen Li, Yi Lei, Gabriele Liga, Alex Alvarado

Multidimensional constellation shaping schemes with up to 32 dimensions and different spectral efficiencies are compared through AWGN and fiber-optic simulations. The results show that no constellation is universally optimal and that the balance between required and effective SNRs should be considered jointly for the specific optical transmission scenario.

* 4 pages, 3 figures 

Proximity-Informed Calibration for Deep Neural Networks

Jun 07, 2023
Miao Xiong, Ailin Deng, Pang Wei Koh, Jiaying Wu, Shen Li, Jianqing Xu, Bryan Hooi

Confidence calibration is central to providing accurate and interpretable uncertainty estimates, especially in safety-critical scenarios. However, we find that existing calibration algorithms often overlook the issue of proximity bias, a phenomenon where models tend to be more overconfident on low-proximity data (i.e., data lying in sparse regions of the data distribution) than on high-proximity samples, and thus suffer from inconsistent miscalibration across samples of different proximity. We examine the problem on pretrained ImageNet models and observe that: 1) proximity bias exists across a wide variety of model architectures and sizes; 2) transformer-based models are more susceptible to proximity bias than CNN-based models; 3) proximity bias persists even after applying popular calibration algorithms such as temperature scaling; 4) models tend to overfit more heavily on low-proximity samples than on high-proximity samples. Motivated by these empirical findings, we propose ProCal, a plug-and-play algorithm with a theoretical guarantee that adjusts sample confidence based on proximity. To further quantify the effectiveness of calibration algorithms in mitigating proximity bias, we introduce the proximity-informed expected calibration error (PIECE) with theoretical analysis. We show that ProCal is effective in addressing proximity bias and improving calibration in balanced, long-tail, and distribution-shift settings under four metrics across various model architectures.
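
As a rough illustration of the quantities discussed above, the sketch below computes a proximity score from the average distance to a sample's k nearest neighbors and a PIECE-like error over joint confidence/proximity bins. The proximity definition and binning scheme here are assumptions made for illustration; see the paper for the exact formulation of ProCal and PIECE.

```python
# Illustrative proximity score and a PIECE-like calibration error over
# joint (confidence, proximity) bins. Not the paper's exact definitions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def proximity(features: np.ndarray, k: int = 10) -> np.ndarray:
    """Negative average kNN distance: larger value = denser region."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    dists, _ = nn.kneighbors(features)
    return -dists[:, 1:].mean(axis=1)  # drop the self-distance column

def piece_like(conf, correct, prox, n_conf_bins=10, n_prox_bins=5):
    """Average |accuracy - confidence| over joint confidence/proximity bins."""
    conf_bin = np.minimum((conf * n_conf_bins).astype(int), n_conf_bins - 1)
    edges = np.quantile(prox, np.linspace(0, 1, n_prox_bins + 1)[1:-1])
    prox_bin = np.digitize(prox, edges)
    err, n = 0.0, len(conf)
    for cb in range(n_conf_bins):
        for pb in range(n_prox_bins):
            mask = (conf_bin == cb) & (prox_bin == pb)
            if mask.any():
                err += mask.sum() / n * abs(correct[mask].mean() - conf[mask].mean())
    return err
```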

Sim2real and Digital Twins in Autonomous Driving: A Survey

May 02, 2023
Xuemin Hu, Shen Li, Tingyu Huang, Bo Tang, Long Chen

Safety and cost are two important concerns in the development of autonomous driving technologies. From academic research to commercial applications of autonomous vehicles, sufficient simulation and real-world testing are required. In general, large-scale testing is first conducted in simulation environments and the learned driving knowledge is then transferred to the real world, so how to adapt driving knowledge learned in simulation to reality becomes a critical issue. However, the virtual simulation world differs from the real world in many aspects, such as lighting, textures, vehicle dynamics, and agents' behaviors, which makes it difficult to bridge the gap between the virtual and real worlds. This gap is commonly referred to as the reality gap (RG). In recent years, researchers have explored various approaches to address the reality gap, which can be broadly classified into two categories: transferring knowledge from simulation to reality (sim2real) and learning in digital twins (DTs). In this paper, we review solutions based on sim2real and DT technologies, along with important applications and innovations in the field of autonomous driving. Meanwhile, we present the state of the art from the perspectives of algorithms, models, and simulators, and elaborate on the development process from sim2real to DTs. The presentation also illustrates the far-reaching effects of sim2real and DTs on the development of autonomous driving.

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

Apr 21, 2023
Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-Chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Shen Li

It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit technical barrier for the wider community to access and leverage these technologies. In this paper, we introduce PyTorch Fully Sharded Data Parallel (FSDP) as an industry-grade solution for large model training. FSDP has been closely co-designed with several key PyTorch core components including Tensor implementation, dispatcher system, and CUDA memory caching allocator, to provide non-intrusive user experiences and high training efficiency. Additionally, FSDP natively incorporates a range of techniques and settings to optimize resource utilization across a variety of hardware configurations. The experimental results demonstrate that FSDP is capable of achieving comparable performance to Distributed Data Parallel while providing support for significantly larger models with near-linear scalability in terms of TFLOPS.
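
A minimal usage sketch of the public FSDP API (torch.distributed.fsdp.FullyShardedDataParallel) follows; the toy model, hyperparameters, and launch details are placeholders and do not reflect the experiments in the paper.

```python
# Minimal FSDP usage sketch (PyTorch >= 1.12 API). Launch with torchrun,
# one process per GPU; the model and training loop are placeholders.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")              # reads rank/world size from torchrun env
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    model = torch.nn.Sequential(                 # placeholder model
        torch.nn.Linear(1024, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 1024)
    ).cuda()
    model = FSDP(model)                          # parameters are sharded across ranks
    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                          # dummy training loop
        x = torch.randn(8, 1024, device="cuda")
        loss = model(x).pow(2).mean()
        loss.backward()                          # gradients are reduce-scattered across shards
        optim.step()
        optim.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```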

Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness

Feb 06, 2023
Ailin Deng, Shen Li, Miao Xiong, Zhirui Chen, Bryan Hooi

Trustworthy machine learning is of primary importance for the practical deployment of deep learning models. While state-of-the-art models achieve astonishingly good accuracy, recent literature reveals that their predictive confidence scores unfortunately cannot be trusted: e.g., they are often overconfident when making wrong predictions, even on obvious outliers. In this paper, we introduce a new approach, self-supervised probing, which enables us to check and mitigate the overconfidence issue for a trained model, thereby improving its trustworthiness. We provide a simple yet effective framework that can be flexibly applied to existing trustworthiness-related methods in a plug-and-play manner. Extensive experiments on three trustworthiness-related tasks (misclassification detection, calibration, and out-of-distribution detection) across various benchmarks verify the effectiveness of our proposed probing framework.

* European Conference on Computer Vision 2022 

Birds of a Feather Trust Together: Knowing When to Trust a Classifier via Adaptive Neighborhood Aggregation

Nov 29, 2022
Miao Xiong, Shen Li, Wenjie Feng, Ailin Deng, Jihai Zhang, Bryan Hooi

How do we know when the predictions made by a classifier can be trusted? This is a fundamental problem that also has immense practical applicability, especially in safety-critical areas such as medicine and autonomous driving. The de facto approach of using the classifier's softmax outputs as a proxy for trustworthiness suffers from over-confidence, while more recent works incur problems such as additional retraining cost and an accuracy-versus-trustworthiness trade-off. In this work, we argue that the trustworthiness of a classifier's prediction for a sample is highly associated with two factors: the sample's neighborhood information and the classifier's output. To combine the best of both worlds, we design NeighborAgg, a model-agnostic post-hoc approach that leverages these two sources of information via adaptive neighborhood aggregation. Theoretically, we show that NeighborAgg is a generalized version of a one-hop graph convolutional network, inheriting its powerful ability to model the varying similarity between samples within each class. We also extend our approach to the closely related task of mislabel detection and provide a theoretical coverage guarantee to bound the false negative rate. Empirically, extensive experiments on image and tabular benchmarks verify our theory and suggest that NeighborAgg outperforms other methods, achieving state-of-the-art trustworthiness performance.

* Published in Transactions on Machine Learning Research (TMLR), 08/2022
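
A rough, hypothetical sketch of the two information sources the abstract names follows: the classifier's softmax output and the sample's labeled neighborhood, combined by a simple learned post-hoc scorer. The actual NeighborAgg aggregation differs; this is only meant to illustrate the idea.

```python
# Illustrative only: trust scoring from softmax outputs plus the class
# distribution of each sample's k nearest labeled neighbors, combined with a
# simple learned post-hoc model (stand-in for adaptive aggregation).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def neighbor_class_hist(feats, ref_feats, ref_labels, n_classes, k=10):
    """Fraction of each sample's k nearest reference neighbors per class."""
    nn = NearestNeighbors(n_neighbors=k).fit(ref_feats)
    _, idx = nn.kneighbors(feats)
    hist = np.zeros((len(feats), n_classes))
    for c in range(n_classes):
        hist[:, c] = (ref_labels[idx] == c).mean(axis=1)
    return hist

def fit_trust_scorer(val_feats, val_softmax, val_labels, ref_feats, ref_labels):
    """Train a post-hoc scorer that predicts whether the classifier is correct."""
    n_classes = val_softmax.shape[1]
    x = np.hstack([val_softmax,
                   neighbor_class_hist(val_feats, ref_feats, ref_labels, n_classes)])
    y = (val_softmax.argmax(axis=1) == val_labels).astype(int)
    return LogisticRegression(max_iter=1000).fit(x, y)
```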

Coordinating CAV Swarms at Intersections with a Deep Learning Model

Nov 10, 2022
Jiawei Zhang, Shen Li, Li Li

Connected and automated vehicles (CAVs) can be viewed as a special kind of robot with the potential to significantly improve the safety and efficiency of traffic. In contrast to many swarm robotics studies demonstrated in labs with a small number of robots, CAV studies aim to achieve cooperative driving of unceasing robot swarm flows. However, finding the optimal passing order for such robot swarm flows, even at a signal-free intersection, is an NP-hard problem (an enumeration-based algorithm takes days to find the optimal solution for a 20-CAV scenario). Here, we introduce a novel cooperative driving algorithm (AlphaOrder) that combines offline deep learning and online tree searching to find a near-optimal passing order in real time. AlphaOrder builds a pointer network model from solved scenarios and generates near-optimal passing orders instantaneously for new scenarios. Furthermore, our approach offers a general solution to managing preemptive resource sharing among swarm robots (e.g., scheduling multiple automated guided vehicles (AGVs) and unmanned aerial vehicles (UAVs) at conflicting areas).
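
To make the underlying combinatorial problem concrete, the toy sketch below brute-forces a passing order that minimizes the total delay from pairwise conflicts. The conflict delays are made up, and the paper's pointer-network-guided tree search, which replaces this exhaustive enumeration, is not reproduced here.

```python
# Toy illustration of the passing-order problem: exhaustive search over orders,
# feasible only for a handful of vehicles. The learned scorer in AlphaOrder
# would replace this brute force; it is not shown here.
from itertools import permutations

def order_cost(order, conflict_delay):
    """Sum extra delay incurred when a vehicle passes after one it conflicts with."""
    cost = 0.0
    for i, later in enumerate(order):
        for earlier in order[:i]:
            cost += conflict_delay.get((earlier, later), 0.0)
    return cost

def best_order(vehicles, conflict_delay):
    return min(permutations(vehicles), key=lambda o: order_cost(o, conflict_delay))

# Example: 4 CAVs with a few conflicting movements (made-up delays in seconds).
delays = {(0, 1): 2.0, (1, 0): 2.0, (2, 3): 1.5, (3, 2): 1.5, (0, 3): 0.5}
print(best_order(range(4), delays))
```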

Neural PCA for Flow-Based Representation Learning

Aug 23, 2022
Shen Li, Bryan Hooi

Of particular interest is discovering useful representations solely from observations in an unsupervised generative manner. However, the question of whether existing normalizing flows provide effective representations for downstream tasks remains mostly unanswered, despite their strong ability for sample generation and density estimation. This paper investigates this problem for a family of generative models that admits exact invertibility. We propose Neural Principal Component Analysis (Neural-PCA), which operates in full dimensionality while capturing principal components in \emph{descending} order. Without exploiting any label information, the recovered principal components store the most informative elements in their \emph{leading} dimensions and leave the negligible ones in the \emph{trailing} dimensions, allowing for clear performance improvements of $5\%$-$10\%$ in downstream tasks. These improvements are empirically found to be consistent irrespective of the number of trailing latent dimensions dropped. Our work suggests that the necessary inductive bias should be introduced into generative modelling when representation quality is of interest.

* Accepted to IJCAI 2022 
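
As a small illustration of the descending-order property described above, the sketch below orders the latent dimensions of an (assumed already-trained) invertible flow by variance and keeps only the leading ones for downstream use. The stand-in latents and the variance criterion are illustrative assumptions, not the paper's method.

```python
# Illustrative only: rank latent dimensions by explained variance so the
# leading ones carry the most information and trailing ones can be dropped.
# In practice z would come from a trained flow, e.g. z = flow.encode(x).
import numpy as np

def rank_latent_dims(z: np.ndarray) -> np.ndarray:
    """Return latent dimension indices sorted by variance, descending."""
    return np.argsort(z.var(axis=0))[::-1]

def downstream_features(z: np.ndarray, keep: int) -> np.ndarray:
    """Keep only the `keep` most informative latent dimensions."""
    order = rank_latent_dims(z)
    return z[:, order[:keep]]

# Example with stand-in latents of fake descending scale:
z = np.random.randn(1000, 64) * np.linspace(3.0, 0.1, 64)
feats = downstream_features(z, keep=16)
```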

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations

Jun 09, 2022
Yanwei Wang, Nadia Figueroa, Shen Li, Ankit Shah, Julie Shah

Learning from demonstration (LfD) methods have shown promise for solving multi-step tasks; however, these approaches do not guarantee successful reproduction of the task under disturbances. In this work, we identify the root of this challenge as the failure of the learned continuous policy to satisfy the discrete plan implicit in the demonstration. By utilizing modes (rather than subgoals) as the discrete abstraction, and motion policies with both mode invariance and goal reachability properties, we prove that our learned continuous policy can simulate any discrete plan specified by a Linear Temporal Logic (LTL) formula. Consequently, the imitator is robust to both task- and motion-level disturbances and is guaranteed to achieve task success. Project page: https://sites.google.com/view/ltl-ds
