Fan Li

HearFit+: Personalized Fitness Monitoring via Audio Signals on Smart Speakers

Mar 30, 2025

Yadong Xie, Fan Li, Yue Wu, Yu Wang

Abstract:Fitness can help to strengthen muscles, increase resistance to diseases, and improve body shape. Nowadays, a great number of people choose to exercise at home/office rather than at the gym due to lack of time. However, it is difficult for them to get good fitness effects without professional guidance. Motivated by this, we propose the first personalized fitness monitoring system, HearFit+, using smart speakers at home/office. We explore the feasibility of using acoustic sensing to monitor fitness. We design a fitness detection method based on Doppler shift and adopt the short time energy to segment fitness actions. Based on deep learning, HearFit+ can perform fitness classification and user identification at the same time. Combined with incremental learning, users can easily add new actions. We design 4 evaluation metrics (i.e., duration, intensity, continuity, and smoothness) to help users to improve fitness effects. Through extensive experiments including over 9,000 actions of 10 types of fitness from 12 volunteers, HearFit+ can achieve an average accuracy of 96.13% on fitness classification and 91% accuracy for user identification. All volunteers confirm that HearFit+ can help improve the fitness effect in various environments.

* IEEE Transactions on Mobile Computing (Volume: 22, Issue: 5, 01 May 2023)
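
The abstract above mentions using short-time energy to segment fitness actions. The following is a minimal sketch of that general idea, not the authors' implementation: the frame length, hop size, threshold, and toy signal are all assumptions chosen for illustration, and the Doppler-shift detection step is not shown.

```python
def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each sliding frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def segment_by_energy(signal, frame_len=4, hop=4, threshold=0.5):
    """Return (start, end) sample ranges whose frame energy exceeds threshold.

    frame_len, hop, and threshold are hypothetical values, not taken
    from the paper. With hop == frame_len the ranges are exact; with
    overlapping frames the end index is only approximate.
    """
    energies = short_time_energy(signal, frame_len, hop)
    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i * hop  # first active frame opens a segment
        elif energy <= threshold and start is not None:
            segments.append((start, i * hop))  # first quiet frame closes it
            start = None
    if start is not None:
        # Signal ended while still active: close at the last frame's end.
        segments.append((start, (len(energies) - 1) * hop + frame_len))
    return segments


# Toy signal: silence, one burst of activity, silence.
quiet = [0.0] * 8
burst = [1.0, -1.0] * 4
toy_signal = quiet + burst + quiet
energies = short_time_energy(toy_signal, 4, 4)
segments = segment_by_energy(toy_signal)
print(segments)
```

On this toy input the single burst occupies samples 8 through 15, so one segment is reported. A real recording would need a threshold tuned to the noise floor.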

ACE: Anti-Editing Concept Erasure in Text-to-Image Models

Jan 03, 2025

Zihao Wang, Yuxiang Wei, Fan Li, Renjing Pei, Hang Xu, Wangmeng Zuo

Abstract:Recent advances in text-to-image diffusion models have significantly facilitated the generation of high-quality images, but have also raised concerns about the illegal creation of harmful content, such as copyrighted images. Existing concept erasure methods achieve superior results in preventing the production of erased concepts from prompts, but typically perform poorly in preventing undesired editing. To address this issue, we propose an Anti-Editing Concept Erasure (ACE) method, which not only erases the target concept during generation but also filters it out during editing. Specifically, we propose to inject the erasure guidance into both the conditional and the unconditional noise prediction, enabling the model to effectively prevent the creation of erased concepts during both editing and generation. Furthermore, a stochastic correction guidance is introduced during training to address the erosion of unrelated concepts. We conducted erasure editing experiments with representative editing methods (i.e., LEDITS++ and MasaCtrl) to erase IP characters, and the results indicate that our ACE effectively filters out target concepts in both types of edits. Additional experiments on erasing explicit concepts and artistic styles further demonstrate that our ACE performs favorably against state-of-the-art methods. Our code will be publicly available at https://github.com/120L020904/ACE.

* 25 pages, code available at https://github.com/120L020904/ACE

Spot Risks Before Speaking! Unraveling Safety Attention Heads in Large Vision-Language Models

Jan 03, 2025

Ziwei Zheng, Junyao Zhao, Le Yang, Lijun He, Fan Li

Abstract:With the integration of an additional modality, large vision-language models (LVLMs) exhibit greater vulnerability to safety risks (e.g., jailbreaking) compared to their language-only predecessors. Although recent studies have devoted considerable effort to the post-hoc alignment of LVLMs, the inner safety mechanisms remain largely unexplored. In this paper, we discover that internal activations of LVLMs during the first token generation can effectively identify malicious prompts across different attacks. This inherent safety perception is governed by sparse attention heads, which we term "safety heads." Further analysis reveals that these heads act as specialized shields against malicious prompts; ablating them leads to higher attack success rates, while the model's utility remains unaffected. By locating these safety heads and concatenating their activations, we construct a straightforward but powerful malicious prompt detector that integrates seamlessly into the generation process with minimal extra inference overhead. Despite its simple structure of a logistic regression model, the detector surprisingly exhibits strong zero-shot generalization capabilities. Experiments across various prompt-based attacks confirm the effectiveness of leveraging safety heads to protect LVLMs. Code is available at https://github.com/Ziwei-Zheng/SAHs.
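
The detector described in the abstract above concatenates the activations of selected attention heads and feeds them to a logistic regression model. A minimal sketch of that pipeline follows, under loud assumptions: the head indices, activation values, learning rate, and epoch count are all invented for illustration, the training loop is plain stochastic gradient descent, and the paper's procedure for locating safety heads is not shown.

```python
import math


def concat_heads(head_activations, selected):
    """Concatenate the activation vectors of the selected heads."""
    features = []
    for head in selected:
        features.extend(head_activations[head])
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with plain SGD (hypothetical hyperparameters)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            grad = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b


def is_malicious(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) > 0.5


# Toy first-token activations: {head index: activation vector}.
# Heads 0 and 2 play the role of "safety heads" here (an assumption).
SAFETY_HEADS = [0, 2]
prompts = [
    ({0: [0.1, 0.0], 1: [0.9, 0.3], 2: [0.2, 0.1]}, 0),  # benign
    ({0: [0.0, 0.2], 1: [0.1, 0.8], 2: [0.1, 0.0]}, 0),  # benign
    ({0: [0.9, 1.0], 1: [0.5, 0.5], 2: [0.8, 0.9]}, 1),  # malicious
    ({0: [1.0, 0.8], 1: [0.2, 0.4], 2: [0.9, 1.0]}, 1),  # malicious
]
X = [concat_heads(acts, SAFETY_HEADS) for acts, _ in prompts]
y = [label for _, label in prompts]
w, b = train_logreg(X, y)

new_benign = concat_heads({0: [0.05, 0.1], 1: [0.7, 0.7], 2: [0.1, 0.05]}, SAFETY_HEADS)
new_attack = concat_heads({0: [0.95, 0.9], 1: [0.3, 0.3], 2: [0.85, 0.95]}, SAFETY_HEADS)
print(is_malicious(w, b, new_benign), is_malicious(w, b, new_attack))
```

Because the classifier only reads activations that the model already computes for the first token, a detector of this shape adds little inference cost, which matches the overhead claim in the abstract.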

Efficient Dynamic Attributed Graph Generation

Dec 11, 2024

Fan Li, Xiaoyang Wang, Dawei Cheng, Cong Chen, Ying Zhang, Xuemin Lin

Abstract:Data generation is a fundamental research problem in data management due to its diverse use cases, ranging from testing database engines to data-specific applications. However, real-world entities often involve complex interactions that cannot be effectively modeled by traditional tabular data. Therefore, graph data generation has attracted increasing attention recently. Although various graph generators have been proposed in the literature, there are three limitations: i) They cannot capture the co-evolution pattern of graph structure and node attributes. ii) Few of them consider edge direction, leading to substantial information loss. iii) Current state-of-the-art dynamic graph generators are based on the temporal random walk, making the simulation process time-consuming. To fill the research gap, we introduce VRDAG, a novel variational recurrent framework for efficient dynamic attributed graph generation. Specifically, we design a bidirectional message-passing mechanism to encode both directed structural knowledge and attribute information of a snapshot. Then, the temporal dependency in the graph sequence is captured by a recurrence state updater, generating embeddings that can preserve the evolution pattern of early graphs. Based on the hidden node embeddings, a conditional variational Bayesian method is developed to sample latent random variables at the neighboring timestep for new snapshot generation. The proposed generation paradigm avoids the time-consuming path sampling and merging process in existing random walk-based methods, significantly reducing the synthesis time. Finally, comprehensive experiments on real-world datasets are conducted to demonstrate the effectiveness and efficiency of the proposed model.

* 14 pages, 10 figures. Accepted by IEEE ICDE 2025

MagicEraser: Erasing Any Objects via Semantics-Aware Control

Oct 14, 2024

Fan Li, Zixiao Zhang, Yi Huang, Jianzhuang Liu, Renjing Pei, Bin Shao, Songcen Xu

Abstract:The traditional image inpainting task aims to restore corrupted regions by referencing surrounding background and foreground. However, the object erasure task, which is in increasing demand, aims to erase objects and generate harmonious background. Previous GAN-based inpainting methods struggle with intricate texture generation. Emerging diffusion model-based algorithms, such as Stable Diffusion Inpainting, exhibit the capability to generate novel content, but they often produce incongruent results at the locations of the erased objects and require high-quality text prompt inputs. To address these challenges, we introduce MagicEraser, a diffusion model-based framework tailored for the object erasure task. It consists of two phases: content initialization and controllable generation. In the latter phase, we develop two plug-and-play modules called prompt tuning and semantics-aware attention refocus. Additionally, we propose a data construction strategy that generates training data specially suitable for this task. MagicEraser achieves fine and effective control of content generation while mitigating undesired artifacts. Experimental results highlight a valuable advancement of our approach in the object erasure task.

* Accepted by ECCV 2024

TCGU: Data-centric Graph Unlearning based on Transferable Condensation

Oct 09, 2024

Fan Li, Xiaoyang Wang, Dawei Cheng, Wenjie Zhang, Ying Zhang, Xuemin Lin

Abstract:With growing demands for data privacy and model robustness, graph unlearning (GU), which erases the influence of specific data on trained GNN models, has gained significant attention. However, existing exact unlearning methods suffer from either low efficiency or poor model performance. While being more utility-preserving and efficient, current approximate unlearning methods are not applicable in the zero-glance privacy setting, where the deleted samples cannot be accessed during unlearning due to immediate deletion requested by regulations. Besides, these approximate methods, which try to directly perturb model parameters, still involve high privacy concerns in practice. To fill the gap, we propose Transferable Condensation Graph Unlearning (TCGU), a data-centric solution to zero-glance graph unlearning. Specifically, we first design a two-level alignment strategy to pre-condense the original graph into a small yet utility-preserving dataset. Upon receiving an unlearning request, we fine-tune the pre-condensed data with a low-rank plugin to directly align its distribution with the remaining graph, thus efficiently revoking the information of deleted data without accessing them. A novel similarity distribution matching approach and a discrimination regularizer are proposed to effectively transfer condensed data and preserve its utility in GNN training, respectively. Finally, we retrain the GNN on the transferred condensed data. Extensive experiments on 6 benchmark datasets demonstrate that TCGU can achieve superior performance in terms of model utility, unlearning efficiency, and unlearning efficacy compared with existing GU methods.

* 14 pages, 18 figures

OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength

Sep 22, 2024

Le Yang, Ziwei Zheng, Yizeng Han, Shiji Song, Gao Huang, Fan Li

Abstract:Differentiable architecture search (DARTS) has emerged as a promising technique for effective neural architecture search, and it mainly contains two steps to find the high-performance architecture: First, the DARTS supernet that consists of mixed operations will be optimized via gradient descent. Second, the final architecture will be built by the selected operations that contribute the most to the supernet. Although DARTS improves the efficiency of NAS, it suffers from the well-known degeneration issue which can lead to deteriorating architectures. Existing works mainly attribute the degeneration issue to the failure of its supernet optimization, while little attention has been paid to the selection method. In this paper, we cease to apply the widely-used magnitude-based selection method and propose a novel criterion based on operation strength that estimates the importance of an operation by its effect on the final loss. We show that the degeneration issue can be effectively addressed by using the proposed criterion without any modification of supernet optimization, indicating that the magnitude-based selection method can be a critical reason for the instability of DARTS. The experiments on NAS-Bench-201 and DARTS search spaces show the effectiveness of our method.
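
The abstract above contrasts magnitude-based selection with a criterion that scores an operation by its effect on the final loss. The toy sketch below illustrates how the two can disagree. It is not the paper's operation-strength definition: the ablation score (loss increase when an operation is removed), the candidate operations, the architecture weights, and the data are all assumptions made for illustration.

```python
# Candidate operations on a single hypothetical edge of a supernet.
ops = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

# Hypothetical architecture weights after supernet training.
weights = {"identity": 0.5, "double": 0.3, "zero": 0.2}

# Toy (input, target) pairs.
data = [(1.0, 1.1), (2.0, 2.2), (3.0, 3.3)]


def loss(active):
    """Mean squared error of the mixed operation using only `active` ops."""
    total = 0.0
    for x, target in data:
        pred = sum(weights[name] * ops[name](x) for name in active)
        total += (pred - target) ** 2
    return total / len(data)


full_loss = loss(list(ops))

# Ablation score: how much the loss rises when one operation is removed.
ablation_effect = {
    name: loss([other for other in ops if other != name]) - full_loss
    for name in ops
}

by_magnitude = max(weights, key=weights.get)
by_loss_effect = max(ablation_effect, key=ablation_effect.get)
print(by_magnitude, by_loss_effect)
```

Here the operation with the largest architecture weight is not the one whose removal hurts the loss most, because an operation's contribution depends on its output scale as well as its weight.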

Hardware-Efficient and Reliable Coherent DSCM Systems Enabled by Single-Pilot-Tone-Based Polarization Demultiplexing

Jul 14, 2024

Wei Wang, Dongdong Zou, Weihao Ni, Fan Li

Abstract:Recently, coherent digital subcarrier multiplexing (DSCM) technology has become an attractive solution for next-generation ultra-high-speed datacenter interconnects (DCIs). To meet the requirements of low cost and low power consumption in DCI applications, a comprehensive simplification of the coherent DSCM system has been investigated. The pilot-tone-based polarization demultiplexing (PT-PDM) technique, known for its low power consumption and ultra-fast polarization tracking capabilities, has emerged as a compelling alternative to the power-hungry N-tap adaptive multi-input multiple-output (MIMO) equalizer. However, the effectiveness of this PT-PDM technique is extremely vulnerable to the receiver-side XY-skew (Rx-XY-skew), which is revealed in this paper for the first time. Then, a pilot-tone-enabled modified Godard phase detector (PT-MGPD) scheme is proposed to realize Rx-XY-skew estimation, serving as the prerequisite for the successful implementation of the PT-PDM and simplification of the adaptive equalizer. Both simulations and experiments are conducted to evaluate the accuracy of the proposed PT-MGPD scheme. The results show that it can achieve accurate estimation with an error of less than 0.3 ps. Besides, a low-complexity, high-spectral-efficiency, and ultra-fast polarization demultiplexing method based on a single pilot tone (SPT) is proposed for the DSCM system in this work. Based on the proposed PT-MGPD and SPT schemes, the conventional N-tap MIMO equalizer serving each subcarrier can be successfully pruned into two polarization-independent single-input single-output equalizers, and there is no performance penalty even if the polarization rotation speed reaches 10 Mrad/s. According to the results, the proposed schemes provide a hardware-efficient and reliable coherent DSCM solution for next-generation ultra-high-speed DCIs.

Fine-grained Dynamic Network for Generic Event Boundary Detection

Jul 05, 2024

Ziwei Zheng, Lijun He, Le Yang, Fan Li

Abstract:Generic event boundary detection (GEBD) aims at pinpointing event boundaries naturally perceived by humans, playing a crucial role in understanding long-form videos. Given the diverse nature of generic boundaries, spanning different video appearances, objects, and actions, this task remains challenging. Existing methods usually detect various boundaries by the same protocol, regardless of their distinctive characteristics and detection difficulties, resulting in suboptimal performance. Intuitively, a more intelligent and reasonable way is to adaptively detect boundaries by considering their special properties. In light of this, we propose a novel dynamic pipeline for generic event boundaries named DyBDet. By introducing a multi-exit network architecture, DyBDet automatically learns the subnet allocation to different video snippets, enabling fine-grained detection for various boundaries. Besides, a multi-order difference detector is also proposed to ensure generic boundaries can be effectively identified and adaptively processed. Extensive experiments on the challenging Kinetics-GEBD and TAPOS datasets demonstrate that adopting the dynamic strategy significantly benefits GEBD tasks, leading to obvious improvements in both performance and efficiency compared to the current state-of-the-art.

* ECCV 2024

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Jul 03, 2024

Le Yang, Ziwei Zheng, Yizeng Han, Hao Cheng, Shiji Song, Gao Huang, Fan Li

Abstract:Recently proposed neural network-based Temporal Action Detection (TAD) models are inherently limited in extracting discriminative representations and modeling action instances with various lengths from complex scenes by shared-weights detection heads. Inspired by the successes in dynamic neural networks, in this paper, we build a novel dynamic feature aggregation (DFA) module that can simultaneously adapt kernel weights and receptive fields at different timestamps. Based on DFA, the proposed dynamic encoder layer aggregates the temporal features within the action time ranges and guarantees the discriminability of the extracted representations. Moreover, using DFA helps to develop a Dynamic TAD head (DyHead), which adaptively aggregates the multi-scale features with adjusted parameters and learned receptive fields to better detect the action instances with diverse ranges from videos. With the proposed encoder layer and DyHead, a new dynamic TAD model, DyFADet, achieves promising performance on a series of challenging TAD benchmarks, including HACS-Segment, THUMOS14, ActivityNet-1.3, Epic-Kitchen 100, Ego4D-Moment QueriesV1.0, and FineAction. Code is released at https://github.com/yangle15/DyFADet-pytorch.

* ECCV 2024
