Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuning Zhang

OmniLoc: A Geometry-Aware Foundation Model for Anchor-Free UE Localization Across Diverse Indoor Environments

Jun 09, 2026

Lei Chu, Yuning Zhang, Omer Gokalp Serbetci, Anushka Katiyar, Bassel Abou Ali Modad, Andreas F. Molisch

Abstract:Indoor localization from wireless measurements remains challenging in large-scale deployments due to substantial variation in building geometry, the set of detectable access points (APs), and the heterogeneity of received signals. Existing learning-based methods often perform well only in limited settings and degrade under environmental shifts, making robust anchor-free localization across diverse indoor environments notoriously difficult. In this paper, we present OmniLoc, an environment-interactive foundation model for anchor-free user equipment localization across diverse indoor environments. To the best of our knowledge, OmniLoc is the first foundation-model-based approach built directly on wireless measurements for this task. OmniLoc is built on three key designs. First, a unified input tokenization module converts heterogeneous wireless measurements into a common representation that is more amenable to learning. Second, a geometry-aware Transformer performs AP-aware feature extraction by emphasizing dominant APs while aggregating complementary evidence from supporting APs. Third, a geometry-aware location estimation module conditions regression on geometric embeddings to produce geometrically consistent location predictions. We evaluate OmniLoc on both a large-scale in-house dataset and a public benchmark dataset. Results show that OmniLoc significantly outperforms existing methods, consistently improves existing backbones when its design components are integrated, and demonstrates strong generalization in cross-environment evaluations.

Via

Access Paper or Ask Questions

TextFake: Benchmarking AI-Generated Image Detection on Text-Rich Images

May 31, 2026

Yuning Zhang, Changtao Miao, Mingyu Liao, Tingyu Liu, Xinghao Wang, Tao Gong, Qi Chu, Nenghai Yu

Abstract:Recent AI-generated image (AIGI) detectors perform well on natural-image benchmarks, but their behavior on text-rich forgeries, such as fabricated screenshots, documents, and news pages prevalent in misinformation, remains untested. We introduce TextFake, a 20,000-image benchmark for text-rich AIGI detection spanning 28 languages, 4 topic categories, and 2 scene modalities. Fake images are synthesized via a four-stage pipeline that annotates real images along three controlled dimensions and generates counterparts through distribution-aligned structured prompting, ruling out covariate shortcuts. Zero-shot evaluation of 14 specialized detectors and 3 frontier VLM APIs reveals a large systematic gap: no method exceeds 80% accuracy, with some dropping over 60% from natural-image benchmarks. Diagnostic evaluations identify three failure modes: the Text Density Curse, where dense glyphs overwhelm low-level detectors; Cloaking via Rendering Fidelity, where stronger text rendering suppresses enerative artifacts; and Threshold Collapse, where routine perturbations drive detectors toward chance-level performance.

Via

Access Paper or Ask Questions

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

Apr 23, 2026

Chenghao Yang, Yuning Zhang, Zhoufutu Wen, Tao Gong, Jiaheng Liu, Qi Chu, Nenghai Yu

Abstract:Model distillation is a primary driver behind the rapid progress of LLM agents, yet it often leads to behavioral homogenization. Many emerging agents share nearly identical reasoning steps and failure modes, suggesting they may be distilled echoes of a few dominant teachers. Existing metrics, however, fail to distinguish mandatory behaviors required for task success from non-mandatory patterns that reflect a model's autonomous preferences. We propose two complementary metrics to isolate non-mandatory behavioral patterns: \textbf{Response Pattern Similarity (RPS)} for verbal alignment and \textbf{Action Graph Similarity (AGS)} for tool-use habits modeled as directed graphs. Evaluating 18 models from 8 providers on $τ$-Bench and $τ^2$-Bench against Claude Sonnet 4.5 (thinking), we find that within-family model pairs score 5.9 pp higher in AGS than cross-family pairs, and that Kimi-K2 (thinking) reaches 82.6\% $S_{\text{node}}$ and 94.7\% $S_{\text{dep}}$, exceeding Anthropic's own Opus 4.1. A controlled distillation experiment further confirms that AGS distinguishes teacher-specific convergence from general improvement. RPS and AGS capture distinct behavioral dimensions (Pearson $r$ = 0.491), providing complementary diagnostic signals for behavioral convergence in the agent ecosystem. Our code is available at https://github.com/Syuchin/AgentEcho.

* Accepted by ACL 2026 Main Conference

Via

Access Paper or Ask Questions

WiLoc: Massive Measured Dataset of Wi-Fi Channel State Information with Application to Machine-Learning Based Localization

Feb 09, 2026

Yuning Zhang, Lei Chu, Omer Gokalp Serbetci, Jorge Gomez-Ponce, Andreas F. Molisch

Abstract:Localization is a key component of the wireless ecosystem. Machine learning (ML)-based localization using channel state information (CSI) is one of the most popular methods for achieving high-accuracy localization with low cost. However, to be accurate and robust, ML-based algorithms need to be trained and tested with large amounts of data, covering not only many user equipment (UE)/target locations, but also many different access points (APs) locations to which the UEs connect, in a variety of different environment types. This paper presents a massive-sized CSI dataset, WiLoc (Wi-Fi Localization), and makes it publicly available. WiLoc is obtained by a series of precision measurement campaigns that span three months, and it is massive in all the above-mentioned three dimensions: > 12 million UE locations, > 3,000 APs, covering 16 buildings for indoor localization, and > 30 streets for outdoor use. The paper describes the dataset structure, measurement environments, measurement protocols, and the dataset validations. Comprehensive case studies validate the advantages of large datasets in ML-driven localization strategies for both "standard" and transfer learning. We envision this dataset, which is by far the largest of its kind, to become a standard resource for researchers in the field of ML-based localization.

* Accepted by Incofom 2026. 9 pages without reference

Via

Access Paper or Ask Questions

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Dec 09, 2025

Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen(+14 more)

Abstract:Neural rendering, particularly 3D Gaussian Splatting (3DGS), has evolved rapidly and become a key component for building world models. However, existing viewer solutions remain fragmented, heavy, or constrained by legacy pipelines, resulting in high deployment friction and limited support for dynamic content and generative models. In this work, we present Visionary, an open, web-native platform for real-time various Gaussian Splatting and meshes rendering. Built on an efficient WebGPU renderer with per-frame ONNX inference, Visionary enables dynamic neural processing while maintaining a lightweight, "click-to-run" browser experience. It introduces a standardized Gaussian Generator contract, which not only supports standard 3DGS rendering but also allows plug-and-play algorithms to generate or update Gaussians each frame. Such inference also enables us to apply feedforward generative post-processing. The platform further offers a plug in three.js library with a concise TypeScript API for seamless integration into existing web applications. Experiments show that, under identical 3DGS assets, Visionary achieves superior rendering efficiency compared to current Web viewers due to GPU-based primitive sorting. It already supports multiple variants, including MLP-based 3DGS, 4DGS, neural avatars, and style transformation or enhancement networks. By unifying inference and rendering directly in the browser, Visionary significantly lowers the barrier to reproduction, comparison, and deployment of 3DGS-family methods, serving as a unified World Model Carrier for both reconstructive and generative paradigms.

* Project page: https://visionary-laboratory.github.io/visionary

Via

Access Paper or Ask Questions

GuardFed: A Trustworthy Federated Learning Framework Against Dual-Facet Attacks

Nov 12, 2025

Yanli Li, Yanan Zhou, Zhongliang Guo, Nan Yang, Yuning Zhang, Huaming Chen, Dong Yuan, Weiping Ding, Witold Pedrycz

Abstract:Federated learning (FL) enables privacy-preserving collaborative model training but remains vulnerable to adversarial behaviors that compromise model utility or fairness across sensitive groups. While extensive studies have examined attacks targeting either objective, strategies that simultaneously degrade both utility and fairness remain largely unexplored. To bridge this gap, we introduce the Dual-Facet Attack (DFA), a novel threat model that concurrently undermines predictive accuracy and group fairness. Two variants, Synchronous DFA (S-DFA) and Split DFA (Sp-DFA), are further proposed to capture distinct real-world collusion scenarios. Experimental results show that existing robust FL defenses, including hybrid aggregation schemes, fail to resist DFAs effectively. To counter these threats, we propose GuardFed, a self-adaptive defense framework that maintains a fairness-aware reference model using a small amount of clean server data augmented with synthetic samples. In each training round, GuardFed computes a dual-perspective trust score for every client by jointly evaluating its utility deviation and fairness degradation, thereby enabling selective aggregation of trustworthy updates. Extensive experiments on real-world datasets demonstrate that GuardFed consistently preserves both accuracy and fairness under diverse non-IID and adversarial conditions, achieving state-of-the-art performance compared with existing robust FL methods.

Via

Access Paper or Ask Questions

EdgeLoRA: An Efficient Multi-Tenant LLM Serving System on Edge Devices

Jul 02, 2025

Zheyu Shen, Yexiao He, Ziyao Wang, Yuning Zhang, Guoheng Sun, Wanghao Ye, Ang Li

Abstract:Large Language Models (LLMs) have gained significant attention due to their versatility across a wide array of applications. Fine-tuning LLMs with parameter-efficient adapters, such as Low-Rank Adaptation (LoRA), enables these models to efficiently adapt to downstream tasks without extensive retraining. Deploying fine-tuned LLMs on multi-tenant edge devices offers substantial benefits, such as reduced latency, enhanced privacy, and personalized responses. However, serving LLMs efficiently on resource-constrained edge devices presents critical challenges, including the complexity of adapter selection for different tasks and memory overhead from frequent adapter swapping. Moreover, given the multiple requests in multi-tenant settings, processing requests sequentially results in underutilization of computational resources and increased latency. This paper introduces EdgeLoRA, an efficient system for serving LLMs on edge devices in multi-tenant environments. EdgeLoRA incorporates three key innovations: (1) an adaptive adapter selection mechanism to streamline the adapter configuration process; (2) heterogeneous memory management, leveraging intelligent adapter caching and pooling to mitigate memory operation overhead; and (3) batch LoRA inference, enabling efficient batch processing to significantly reduce computational latency. Comprehensive evaluations using the Llama3.1-8B model demonstrate that EdgeLoRA significantly outperforms the status quo (i.e., llama.cpp) in terms of both latency and throughput. The results demonstrate that EdgeLoRA can achieve up to a 4 times boost in throughput. Even more impressively, it can serve several orders of magnitude more adapters simultaneously. These results highlight EdgeLoRA's potential to transform edge deployment of LLMs in multi-tenant scenarios, offering a scalable and efficient solution for resource-constrained environments.

Via

Access Paper or Ask Questions

Controllable 3D Outdoor Scene Generation via Scene Graphs

Mar 10, 2025

Yuheng Liu, Xinke Li, Yuning Zhang, Lu Qi, Xin Li, Wenping Wang, Chongshou Li, Xueting Li, Ming-Hsuan Yang

Abstract:Three-dimensional scene generation is crucial in computer vision, with applications spanning autonomous driving, gaming and the metaverse. Current methods either lack user control or rely on imprecise, non-intuitive conditions. In this work, we propose a method that uses, scene graphs, an accessible, user friendly control format to generate outdoor 3D scenes. We develop an interactive system that transforms a sparse scene graph into a dense BEV (Bird's Eye View) Embedding Map, which guides a conditional diffusion model to generate 3D scenes that match the scene graph description. During inference, users can easily create or modify scene graphs to generate large-scale outdoor scenes. We create a large-scale dataset with paired scene graphs and 3D semantic scenes to train the BEV embedding and diffusion models. Experimental results show that our approach consistently produces high-quality 3D urban scenes closely aligned with the input scene graphs. To the best of our knowledge, this is the first approach to generate 3D outdoor scenes conditioned on scene graphs.

* Project Page: https://yuheng.ink/project-page/control-3d-scene/

Via

Access Paper or Ask Questions

Cell-free massive MIMO Channels in an Urban Environment -- Measurements and Channel Statistics

Jun 03, 2024

Yuning Zhang, Thomas Choi, Zihang Cheng, Jorge Gomez-Ponce, Issei Kanno, Masaaki Ito, Andreas F. Molisch

Figure 1 for Cell-free massive MIMO Channels in an Urban Environment -- Measurements and Channel Statistics

Figure 2 for Cell-free massive MIMO Channels in an Urban Environment -- Measurements and Channel Statistics

Figure 3 for Cell-free massive MIMO Channels in an Urban Environment -- Measurements and Channel Statistics

Figure 4 for Cell-free massive MIMO Channels in an Urban Environment -- Measurements and Channel Statistics

Abstract:Cell-free massive MIMO (CF-mMIMO), where each user equipment (UE) is connected to multiple access points (APs), is emerging as an important component for 5G and 6G cellular systems. Accurate channel models based on measurements are required to optimize their design and deployment. This paper presents an extensive measurement campaign for CF-mMIMO in an urban environment. A new "virtual AP" technique measures channels between 80 UE locations and more than 20,000 possible microcellular AP locations. Measurements are done at 3.5 GHz carrier frequency with 350 MHz bandwidth (BW). The paper describes the measurement setup and data processing, shows sample results and their physical interpretation, and provides statistics for key quantities such as pathloss, shadowing, delay spread (DS), and delay window. We find pathloss coefficients of 2.9 and 10.4 for line-of-sight (LOS) and non line-of-sight (NLOS), respectively, where the high LOS coefficient is mainly because larger distance leads to more grazing angle of incidence and thus lower antenna gain in our setup. Shadowing standard deviations are 5.1/16.6 dB, and root mean squared (RMS) DSs of -80.6/-72.6 dBs. The measurements can also be used for parameterizing a CUNEC-type model, which will be reported in future work.

* submitted to IEEE TWC

Via

Access Paper or Ask Questions

Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

May 31, 2024

Yuning Zhang, Thomas Choi, Zihang Cheng, Issei Kanno, Masaaki Ito, Jorge Gomez-Ponce, Hussein Hammoud, Bowei Wu, Ashwani Pradhan, Kelvin Arana(+7 more)

Figure 1 for Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

Figure 2 for Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

Figure 3 for Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

Figure 4 for Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz

Abstract:The design of cell-free massive MIMO (CF-mMIMO) systems requires accurate, measurement-based channel models. This paper provides the first results from the by far most extensive outdoor measurement campaign for CF-mMIMO channels in an urban environment. We measured impulse responses between over 20,000 potential access point (AP) locations and 80 user equipments (UEs) at 3.5 GHz with 350 MHz bandwidth (BW). Measurements use a "virtual array" approach at the AP and a hybrid switched/virtual approach at the UE. This paper describes the sounder design, measurement environment, data processing, and sample results, particularly the evolution of the power-delay profiles (PDPs) as a function of the AP locations, and its relation to the propagation environment.

* 6 pages, 6 figures, conference: VTC 2024-Fall

Via

Access Paper or Ask Questions