Yuan Shen

Validating the Effectiveness of a Large Language Model-based Approach for Identifying Children's Development across Various Free Play Settings in Kindergarten

May 06, 2025

Yuanyuan Yang, Yuan Shen, Tianchen Sun, Yangbin Xie

Abstract:Free play is a fundamental aspect of early childhood education, supporting children's cognitive, social, emotional, and motor development. However, assessing children's development during free play poses significant challenges due to the unstructured and spontaneous nature of the activity. Traditional assessment methods often rely on direct observations by teachers, parents, or researchers, which may fail to capture comprehensive insights from free play and provide timely feedback to educators. This study proposes an innovative approach combining Large Language Models (LLMs) with learning analytics to analyze children's self-narratives of their play experiences. The LLM identifies developmental abilities, while performance scores across different play settings are calculated using learning analytics techniques. We collected 2,224 play narratives from 29 children in a kindergarten, covering four distinct play areas over one semester. According to the evaluation results from eight professionals, the LLM-based approach achieved high accuracy in identifying cognitive, motor, and social abilities, with accuracy exceeding 90% in most domains. Moreover, significant differences in developmental outcomes were observed across play settings, highlighting each area's unique contributions to specific abilities. These findings confirm that the proposed approach is effective in identifying children's development across various free play settings. This study demonstrates the potential of integrating LLMs and learning analytics to provide child-centered insights into developmental trajectories, offering educators valuable data to support personalized learning and enhance early childhood education practices.

* 15 pages, 4 figures

Towards Latency-Aware 3D Streaming Perception for Autonomous Driving

Apr 27, 2025

Jiaqi Peng, Tai Wang, Jiangmiao Pang, Yuan Shen

Abstract:Although existing 3D perception algorithms have demonstrated significant improvements in performance, their deployment on edge devices continues to encounter critical challenges due to substantial runtime latency. We propose a new benchmark tailored for online evaluation by considering runtime latency. Based on the benchmark, we build a Latency-Aware 3D Streaming Perception (LASP) framework that addresses the latency issue through two primary components: 1) latency-aware history integration, which extends query propagation into a continuous process, ensuring the integration of historical features regardless of varying latency; 2) latency-aware predictive detection, a module that compensates the detection results with the predicted trajectory and the posterior accessed latency. By incorporating the latency-aware mechanism, our method shows generalization across various latency levels, achieving an online performance that closely aligns with 80% of its offline evaluation on the Jetson AGX Orin without any acceleration techniques.
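The idea of compensating a detection for runtime latency can be illustrated with a minimal sketch. This is not the LASP module from the paper: it assumes a simple constant-velocity motion model, and the names `Detection` and `compensate` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A detected object in the ground plane (hypothetical structure)."""
    x: float   # position in metres
    y: float
    vx: float  # estimated velocity in metres per second
    vy: float


def compensate(det: Detection, latency_s: float) -> Detection:
    """Shift a detection forward in time by the measured latency.

    Assumes constant velocity over the latency interval, which is an
    illustrative simplification rather than the paper's predictor.
    """
    return Detection(
        x=det.x + det.vx * latency_s,
        y=det.y + det.vy * latency_s,
        vx=det.vx,
        vy=det.vy,
    )


if __name__ == "__main__":
    stale = Detection(x=10.0, y=2.0, vx=4.0, vy=-1.0)
    print(compensate(stale, latency_s=0.5))
```

The latency is passed in after it has been measured, so the same function applies regardless of how long inference took on a given frame.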

Bias-Eliminated PnP for Stereo Visual Odometry: Provably Consistent and Large-Scale Localization

Apr 24, 2025

Guangyang Zeng, Yuan Shen, Ziyang Hong, Yuze Hong, Viorela Ila, Guodong Shi, Junfeng Wu

Abstract:In this paper, we first present a bias-eliminated weighted (Bias-Eli-W) perspective-n-point (PnP) estimator for stereo visual odometry (VO) with provable consistency. Specifically, leveraging statistical theory, we develop an asymptotically unbiased and $\sqrt {n}$-consistent PnP estimator that accounts for varying 3D triangulation uncertainties, ensuring that the relative pose estimate converges to the ground truth as the number of features increases. Next, on the stereo VO pipeline side, we propose a framework that continuously triangulates contemporary features for tracking new frames, effectively decoupling temporal dependencies between pose and 3D point errors. We integrate the Bias-Eli-W PnP estimator into the proposed stereo VO pipeline, creating a synergistic effect that enhances the suppression of pose estimation errors. We validate the performance of our method on the KITTI and Oxford RobotCar datasets. Experimental results demonstrate that our method: 1) achieves significant improvements in both relative pose error and absolute trajectory error in large-scale environments; 2) provides reliable localization under erratic and unpredictable robot motions. The successful implementation of the Bias-Eli-W PnP in stereo VO indicates the importance of information screening in robotic estimation tasks with high-uncertainty measurements, shedding light on diverse applications where PnP is a key ingredient.
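For readers unfamiliar with the terminology, the two statistical properties named in the abstract have the following standard definitions. The symbols $\hat{\theta}_n$ (the pose estimate from $n$ features) and $\theta^{\star}$ (the true pose) are notation assumed here, not taken from the paper.

```latex
% Asymptotic unbiasedness: the bias vanishes as the number of features grows.
\lim_{n \to \infty} \mathbb{E}\big[\hat{\theta}_n\big] = \theta^{\star}

% \sqrt{n}-consistency: the estimation error shrinks at rate 1/\sqrt{n}
% in probability.
\sqrt{n}\,\big(\hat{\theta}_n - \theta^{\star}\big) = O_p(1)
```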

* 10 pages, 7 figures

Localization and Tracking for Cooperative Users in Multi-RIS-assisted Systems: Theoretical Analysis and Principles of Interpretations

Apr 07, 2025

Peng Gao, Lixiang Lian, Yuan Shen

Abstract:Localization and tracking (LocTrack) are fundamental enablers for a wide range of emerging applications. Reconfigurable intelligent surfaces (RISs) have emerged as key components for enhancing the LocTrack performance. This paper investigates a multi-RIS-assisted multi-user (MRMU) LocTrack system, where multiple RISs collaboratively reflect the position-bearing signals for information fusion at the base station, leveraging spatial-temporal correlations in user positions. While studies have shown these correlations improve localization accuracy, their trade-offs with system complexity remain unclear. To address this gap, we characterize the effectiveness of spatial-temporal correlation priors (STPs) utilization in MRMU LocTrack systems using a metric, termed efficiency of correlation (EoC). To further elucidate correlation propagation and RIS interactions, we provide a "correlation information routing" interpretation of EoC through random walk theory. EoC provides a principled performance evaluation metric that enables system designers to balance localization accuracy enhancement against the increased complexity. Additionally, we investigate the error propagation phenomenon, analyzing its convergence and asymptotic behavior in MRMU LocTrack systems. Finally, we validate the theoretical results through extensive numerical simulations.

Beware of Metacognitive Laziness: Effects of Generative Artificial Intelligence on Learning Motivation, Processes, and Performance

Dec 12, 2024

Yizhou Fan, Luzhen Tang, Huixiao Le, Kejie Shen, Shufang Tan, Yueying Zhao, Yuan Shen, Xinyu Li, Dragan Gašević

Abstract:With the continuous development of technological and educational innovation, learners nowadays can obtain a variety of support from agents such as teachers, peers, education technologies, and recently, generative artificial intelligence such as ChatGPT. The concept of hybrid intelligence is still at a nascent stage, and how learners can benefit from a symbiotic relationship with various agents such as AI, human experts and intelligent learning systems is still unknown. The emerging concept of hybrid intelligence also lacks deep insights and understanding of the mechanisms and consequences of hybrid human-AI learning based on strong empirical research. In order to address this gap, we conducted a randomised experimental study and compared learners' motivations, self-regulated learning processes and learning performances on a writing task among different groups who had support from different agents (ChatGPT, human expert, writing analytics tools, and no extra tool). A total of 117 university students were recruited, and their multi-channel learning, performance and motivation data were collected and analysed. The results revealed that: learners who received different learning support showed no difference in post-task intrinsic motivation; there were significant differences in the frequency and sequences of the self-regulated learning processes among groups; the ChatGPT group outperformed the other groups in essay score improvement, but their knowledge gain and transfer were not significantly different. Our research found that in the absence of differences in motivation, learners with different supports still exhibited different self-regulated learning processes, ultimately leading to differentiated performance. What is particularly noteworthy is that AI technologies such as ChatGPT may promote learners' dependence on technology and potentially trigger metacognitive laziness.

Fundamental Limits of Pulse Based UWB ISAC Systems: A Parameter Estimation Perspective

Oct 17, 2024

Fan Liu, Tingting Zhang, Zenan Zhang, Bin Cao, Yuan Shen, Qinyu Zhang

Abstract:Impulse radio ultra-wideband (IR-UWB) signals stand out for their high temporal resolution, low cost, and large bandwidth, making them a highly promising option for integrated sensing and communication (ISAC) systems. In this paper, we design an ISAC system for a bi-static passive sensing scenario that accommodates multiple targets. Specifically, we introduce two typical modulation schemes, PPM and BPSK, for data transmission. The essential coupling between sensing and communication is examined through the Fisher information matrix (FIM). Accordingly, we introduce a pilot-based decoupling approach that relies on known time-delays, as well as a differential decoupling strategy that uses a known starting symbol position. Finally, we assess the sensing and communication performance under various modulation and demodulation schemes under the constraints of current UWB standards. This assessment utilizes the Cramer-Rao Lower Bound (CRLB) for sensing and the Shannon capacity limit for communication, offering theoretical insights into choosing suitable data signal processing methods in real-world applications.
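As background, the two bounds named in the abstract take the following textbook forms. The notation is assumed here for illustration ($\boldsymbol{\theta}$ for the unknown parameters, $\mathbf{y}$ for the observations, $B$ for bandwidth), and the capacity expression is the standard additive white Gaussian noise case, which may differ from the channel model used in the paper.

```latex
% Fisher information matrix (FIM) for parameters \theta given observations y.
\mathbf{J}(\boldsymbol{\theta}) =
  \mathbb{E}\!\left[
    \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{y};\boldsymbol{\theta})\,
    \nabla_{\boldsymbol{\theta}} \ln p(\mathbf{y};\boldsymbol{\theta})^{\mathsf{T}}
  \right]

% Cramer-Rao lower bound: any unbiased estimator has covariance at least J^{-1}.
\operatorname{Cov}\big(\hat{\boldsymbol{\theta}}\big) \succeq
  \mathbf{J}(\boldsymbol{\theta})^{-1}

% Shannon capacity of an AWGN channel with bandwidth B and signal-to-noise
% ratio SNR, in bits per second.
C = B \log_2\!\left(1 + \mathrm{SNR}\right)
```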

EarthGen: Generating the World from Top-Down Views

Sep 02, 2024

Ansh Sharma, Albert Xiao, Praneet Rathi, Rohit Kundu, Albert Zhai, Yuan Shen, Shenlong Wang

Abstract:In this work, we present a novel method for extensive multi-scale generative terrain modeling. At the core of our model is a cascade of superresolution diffusion models that can be combined to produce consistent images across multiple resolutions. Pairing this concept with a tiled generation method yields a scalable system that can generate thousands of square kilometers of realistic Earth surfaces at high resolution. We evaluate our method on a dataset collected from Bing Maps and show that it outperforms super-resolution baselines on the extreme super-resolution task of 1024x zoom. We also demonstrate its ability to create diverse and coherent scenes via an interactive gigapixel-scale generated map. Finally, we demonstrate how our system can be extended to enable novel content creation applications including controllable world generation and 3D scene generation.

Proximity Matters: Local Proximity Preserved Balancing for Treatment Effect Estimation

Jul 01, 2024

Hao Wang, Zhichao Chen, Yuan Shen, Jiajun Fan, Zhaoran Liu, Degui Yang, Xinggao Liu, Haoxuan Li

Abstract:Heterogeneous treatment effect (HTE) estimation from observational data poses significant challenges due to treatment selection bias. Existing methods address this bias by minimizing distribution discrepancies between treatment groups in latent space, focusing on global alignment. However, the fruitful aspect of local proximity, where similar units exhibit similar outcomes, is often overlooked. In this study, we propose Proximity-aware Counterfactual Regression (PCR) to exploit proximity for representation balancing within the HTE estimation context. Specifically, we introduce a local proximity preservation regularizer based on optimal transport to depict the local proximity in discrepancy calculation. Furthermore, to overcome the curse of dimensionality that renders the estimation of discrepancy ineffective, exacerbated by limited data availability for HTE estimation, we develop an informative subspace projector, which trades off minimal distance precision for improved sample complexity. Extensive experiments demonstrate that PCR accurately matches units across different treatment groups, effectively mitigates treatment selection bias, and significantly outperforms competitors. Code is available at https://anonymous.4open.science/status/ncr-B697.

SuperGaussian: Repurposing Video Models for 3D Super Resolution

Jun 04, 2024

Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Frühstück

Abstract:We present a simple, modular, and generic method that upsamples coarse 3D models by adding geometric and appearance details. While generative 3D models now exist, they do not yet match the quality of their counterparts in image and video domains. We demonstrate that it is possible to directly repurpose existing (pretrained) video models for 3D super-resolution and thus sidestep the problem of the shortage of large repositories of high-quality 3D training models. We describe how to repurpose video upsampling models, which are not 3D consistent, and combine them with 3D consolidation to produce 3D-consistent results. As output, we produce high quality Gaussian Splat models, which are object centric and effective. Our method is category agnostic and can be easily incorporated into existing 3D workflows. We evaluate our proposed SuperGaussian on a variety of 3D inputs, which are diverse both in terms of complexity and representation (e.g., Gaussian Splats or NeRFs), and demonstrate that our simple method significantly improves the fidelity of the final 3D models. Check our project website for details: supergaussian.github.io

* Check our project website for details: https://supergaussian.github.io

The Integrated Sensing and Communication Revolution for 6G: Vision, Techniques, and Applications

May 03, 2024

Nuria González-Prelcic, Musa Furkan Keskin, Ossi Kaltiokallio, Mikko Valkama, Davide Dardari, Xiao Shen, Yuan Shen, Murat Bayraktar, Henk Wymeersch

Abstract:Future wireless networks will integrate sensing, learning and communication to provide new services beyond communication and to become more resilient. Sensors at the network infrastructure, sensors on the user equipment, and the sensing capability of the communication signal itself provide a new source of data that connects the physical and radio frequency environments. A wireless network that harnesses all these sensing data can not only enable additional sensing services, but also become more resilient to channel-dependent effects like blockage and better support adaptation in dynamic environments as networks reconfigure. In this paper, we provide a vision for integrated sensing and communication (ISAC) networks and an overview of how signal processing, optimization and machine learning techniques can be leveraged to make them a reality in the context of 6G. We also include some examples of the performance of several of these strategies when evaluated using a simulation framework based on a combination of ray tracing measurements and mathematical models that mix the digital and physical worlds.
