Abstract:Accurate classification of respiratory sounds requires deep learning models that effectively capture fine-grained acoustic features and long-range temporal dependencies. Convolutional Neural Networks (CNNs) are well-suited for extracting local time-frequency patterns but are limited in modeling global context. In contrast, transformer-based models can capture long-range dependencies, albeit with higher computational demands. To address these limitations, we propose a compact CNN-Temporal Self-Attention (CNN-TSA) network that integrates lightweight self-attention into an efficient CNN backbone. Central to our approach is a Frequency Band Selection (FBS) module that suppresses noisy and non-informative frequency regions, substantially improving accuracy and reducing FLOPs by up to 50%. We also introduce age-specific models to enhance robustness across diverse patient groups. Evaluated on the SPRSound-2022/2023 and ICBHI-2017 lung sound datasets, CNN-TSA with FBS sets new benchmarks on SPRSound and achieves state-of-the-art performance on ICBHI, all with a significantly smaller computational footprint. Furthermore, integrating FBS into an existing transformer baseline yields a new record on ICBHI, confirming FBS as an effective drop-in enhancement. These results demonstrate that our framework enables reliable, real-time respiratory sound analysis suitable for deployment in resource-constrained settings.
Abstract:State estimation for nonlinear dynamical systems is a critical challenge in control and engineering applications, particularly when only partial and noisy measurements are available. This paper introduces a novel Adaptive Physics-Informed Neural Network-based Observer (PINN-Obs) for accurate state estimation in nonlinear systems. Unlike traditional model-based observers, which require explicit system transformations or linearization, the proposed framework directly integrates system dynamics and sensor data into a physics-informed learning process. The observer adaptively learns an optimal gain matrix, ensuring convergence of the estimated states to the true system states. A rigorous theoretical analysis establishes formal convergence guarantees, demonstrating that the proposed approach achieves uniform error minimization under mild observability conditions. The effectiveness of PINN-Obs is validated through extensive numerical simulations on diverse nonlinear systems, including an induction motor model, a satellite motion system, and benchmark academic examples. Comparative experimental studies against existing observer designs highlight its superior accuracy, robustness, and adaptability.
Abstract:Accurate knowledge of the state variables in a dynamical system is critical for effective control, diagnosis, and supervision, especially when direct measurements of all states are infeasible. This paper presents a novel approach to designing software sensors for nonlinear dynamical systems expressed in their most general form. Unlike traditional model-based observers that rely on explicit transformations or linearization, the proposed framework integrates neural networks with adaptive Sliding Mode Control (SMC) to design a robust state observer under a less restrictive set of conditions. The learning process is driven by available sensor measurements, which are used to correct the observer's state estimate. The training methodology leverages the system's governing equations as a physics-based constraint, enabling observer synthesis without access to ground-truth state trajectories. By employing a time-varying gain matrix dynamically adjusted by the neural network, the observer adapts in real-time to system changes, ensuring robustness against noise, external disturbances, and variations in system dynamics. Furthermore, we provide sufficient conditions to guarantee estimation error convergence, establishing a theoretical foundation for the observer's reliability. The methodology's effectiveness is validated through simulations on challenging examples, including systems with non-differentiable dynamics and varying observability conditions. These examples, which are often problematic for conventional techniques, serve to demonstrate the robustness and broad applicability of our approach. The results show rapid convergence and high accuracy, underscoring the method's potential for addressing complex state estimation challenges in real-world applications.
Abstract:A new class of Multi-Rotor Aerial Vehicles (MRAVs), known as omnidirectional MRAVs (o-MRAVs), has gained attention for their ability to independently control 3D position and orientation. This capability enhances robust planning and control in aerial communication networks, enabling more adaptive trajectory planning and precise antenna alignment without additional mechanical components. These features are particularly valuable in uncertain environments, where disturbances such as wind and interference affect communication stability. This paper examines o-MRAVs in the context of robust aerial network planning, comparing them with the more common under-actuated MRAVs (u-MRAVs). Key applications, including physical layer security, optical communications, and network densification, are highlighted, demonstrating the potential of o-MRAVs to improve reliability and efficiency in dynamic communication scenarios.
Abstract:Recently, overconfidence in large language models (LLMs) has garnered considerable attention due to its fundamental importance in quantifying the trustworthiness of LLM generation. However, existing approaches prompt \textit{black box LLMs} to produce their own confidence (\textit{verbalized confidence}), which can be subject to many biases and hallucinations. Inspired by a different aspect of overconfidence studied in cognitive science, called \textit{overprecision}, we design a framework for its study in black box LLMs. The framework contains three main phases: 1) generation, 2) refinement, and 3) evaluation. In the generation phase, we prompt the LLM to answer numerical questions in the form of intervals at a given confidence level. Unlike in previous approaches, this confidence level is imposed in the prompt rather than generated by the LLM. We use various prompting techniques and repeat the same prompt multiple times to gauge the effects of randomness in the generation process. In the refinement phase, answers from the previous phase are refined to generate better answers. In the evaluation phase, the LLM answers are evaluated and studied to understand the model's internal workings. This study yields several insights into LLM overprecision: 1) LLMs are highly uncalibrated for numerical tasks; 2) there is no correlation between the length of the interval and the imposed confidence level, which can be symptomatic of a) a lack of understanding of the concept of confidence or b) an inability to adjust self-confidence by following instructions; 3) LLM numerical precision differs depending on the task, the scale of the answer, and the prompting technique; and 4) refinement of answers does not improve precision in most cases. We believe this study offers new perspectives on LLM overconfidence and serves as a strong baseline for overprecision in LLMs.
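The calibration idea in the abstract above can be illustrated with a minimal sketch: if an LLM is asked for intervals at an imposed confidence level, the fraction of intervals that contain the true value should roughly match that level. The function names, the sample intervals, and the 0.9 confidence level below are all hypothetical and chosen for illustration; the abstract does not specify the metrics the authors actually use.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def coverage(intervals: Sequence[Interval], truths: Sequence[float]) -> float:
    """Fraction of intervals [low, high] that contain the true value."""
    if not intervals or len(intervals) != len(truths):
        raise ValueError("intervals and truths must be non-empty and of equal length")
    hits = sum(1 for (low, high), t in zip(intervals, truths) if low <= t <= high)
    return hits / len(truths)


def calibration_gap(intervals: Sequence[Interval],
                    truths: Sequence[float],
                    confidence: float) -> float:
    """Imposed confidence minus observed coverage.

    A positive gap means the intervals are too narrow (overprecision).
    """
    return confidence - coverage(intervals, truths)


def mean_width(intervals: Sequence[Interval]) -> float:
    """Average interval length, useful for comparing confidence levels."""
    if not intervals:
        raise ValueError("intervals must be non-empty")
    return sum(high - low for low, high in intervals) / len(intervals)


# Hypothetical answers to four numerical questions at an imposed 90% level.
example_intervals = [(10.0, 20.0), (100.0, 150.0), (0.5, 0.7), (1900.0, 1950.0)]
example_truths = [15.0, 200.0, 0.6, 1969.0]

example_coverage = coverage(example_intervals, example_truths)
example_gap = calibration_gap(example_intervals, example_truths, 0.9)
print(f"coverage={example_coverage:.2f}, gap={example_gap:.2f}")
```

Comparing `mean_width` across several imposed confidence levels is one simple way to probe whether interval length responds to the requested confidence at all.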
Abstract:Data breaches have begun to take on new dimensions, and predicting them is becoming increasingly important to organizations. Prior work has addressed this issue mainly from a technical perspective and has neglected other contributing aspects such as the social media dimension. To fill this gap, we propose STRisk, a predictive system that expands the scope of the prediction task by bringing the social media dimension into play. We study over 3800 US organizations, including both victim and non-victim organizations. For each organization, we design a profile composed of a variety of externally measured technical indicators and social factors. In addition, to account for unreported incidents, we consider the non-victim sample to be noisy and propose a noise correction approach to correct mislabeled organizations. We then build several machine learning models to predict whether an organization is at risk of experiencing a hacking breach. By exploiting both technical and social features, we achieve an Area Under the Curve (AUC) score exceeding 98%, which is 12% higher than the AUC achieved using only technical features. Furthermore, our feature importance analysis reveals that open ports and expired certificates are the best technical predictors, while spreadability and agreeability are the best social predictors.
Abstract:A new class of Multi-Rotor Aerial Vehicles (MRAVs), known as omnidirectional MRAVs (o-MRAVs), has attracted significant interest in the robotics community. These MRAVs have the unique capability of independently controlling their 3D position and 3D orientation. In the context of aerial communication networks, this translates into the ability to control the position and orientation of the antenna mounted on the MRAV without any additional devices dedicated to antenna orientation. These additional Degrees of Freedom (DoF) add a new dimension to aerial communication systems, creating various research opportunities in communications-aware trajectory planning and positioning. This paper presents this new class of MRAVs and discusses use cases in areas such as physical layer security and optical communications. Furthermore, the benefits of these MRAVs are illustrated with realistic simulation scenarios. Finally, new research problems and opportunities introduced by this advanced robotics technology are discussed.
Abstract:Asthma rates have risen globally, driven by environmental and lifestyle factors. Access to immediate medical care is limited, particularly in developing countries, necessitating automated support systems. Large Language Models like ChatGPT (Chat Generative Pre-trained Transformer) and Gemini have advanced natural language processing in general and question answering in particular; however, they are prone to producing factually incorrect responses (i.e., hallucinations). Retrieval-augmented generation systems, which integrate curated documents, can improve large language models' performance and reduce the incidence of hallucination. We introduce AsthmaBot, a multi-lingual, multi-modal retrieval-augmented generation system for asthma support. Evaluation on a dataset of asthma-related frequently asked questions shows AsthmaBot's efficacy. AsthmaBot also provides an interactive and intuitive interface that integrates different data modalities (text, images, videos) to make it accessible to the wider public. AsthmaBot is available online via \url{asthmabot.datanets.org}.
Abstract:Recent advances in neural Text-to-Speech (TTS) have made it possible to produce high-quality synthesised speech. However, such TTS models depend on large amounts of data that are costly to produce and hardly scalable to all existing languages, especially since little attention is given to low-resource languages. Techniques such as knowledge transfer can alleviate the burden of creating datasets. In this paper, we therefore investigate two questions: first, whether data from social media can be used to construct a small TTS dataset, and second, whether cross-lingual transfer learning (TL) for a low-resource language can work with this type of data. In this respect, we specifically assess to what extent multilingual modeling can be leveraged as an alternative to training on monolingual corpora. To do so, we explore how data from foreign languages may be selected and pooled to train a TTS model for a target low-resource language. Our findings show that multilingual pre-training is better than monolingual pre-training at increasing the intelligibility and naturalness of the generated speech.
Abstract:Federated learning (FL) involves several clients that share with a fusion center (FC) the model each client has trained on its own data. Conventional FL, which can be interpreted as an estimation or distortion-based approach, ignores the final use of model information (MI) by the FC and the other clients. In this paper, we introduce a novel FL framework in which the FC uses an aggregate version of the MI to make decisions that affect the clients' utility functions. Clients cannot choose the decisions and can only use the MI reported to the FC to maximize their utility. Depending on the alignment between the client and FC utilities, a client may have an individual interest in adding strategic noise to the model. This general framework is stated and then specialized to the case of clustering, in which noisy cluster representative information is reported. It is applied to the problem of power consumption scheduling. In this context, utility non-alignment occurs, for instance, when the client wants to consume when the price of electricity is low, whereas the FC wants the consumption to occur when the total power is the lowest. This is illustrated with aggregated real data from Ausgrid \cite{ausgrid}. Our numerical analysis shows that the client can increase its utility by adding noise to the model reported to the FC. Corresponding results and source code can be downloaded from \cite{source-code}.