Abstract: Despite the effectiveness of large language models (LLMs) for code generation, they often output incorrect code. One reason is that model output probabilities are often not well correlated with correctness and reflect only the final output of the generation process. Inspired by findings that LLMs internally encode concepts like truthfulness, this paper explores whether LLMs similarly represent code correctness. Specifically, we identify a correctness representation inside LLMs by contrasting the hidden states between pairs of correct and incorrect code for the same programming tasks. Through experiments on four LLMs, we show that exploiting this extracted correctness representation outperforms standard log-likelihood ranking as well as verbalized model confidence. Furthermore, we explore how this internal correctness signal can be used to select higher-quality code samples without requiring test execution. Ultimately, this work demonstrates how leveraging internal representations can enhance code generation systems and make LLMs more reliable, thus improving confidence in automatically generated code.
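
For intuition, the sketch below shows one simple way such a contrastive probe could be built: a difference-of-means direction computed from paired hidden states of correct and incorrect code, then used to rank candidate completions. The array shapes, layer/token choice, and all names are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch (under assumptions): extract a "correctness direction" by
# contrasting hidden states of correct vs. incorrect code, then score new
# samples by projecting onto that direction. Hidden states are stand-in
# random arrays here; in practice they would come from a chosen LLM layer.
import numpy as np

def correctness_direction(h_correct: np.ndarray, h_incorrect: np.ndarray) -> np.ndarray:
    """Difference-of-means direction from paired hidden states.

    h_correct, h_incorrect: shape (n_pairs, hidden_dim), one hidden state
    per code sample at a fixed layer/token position.
    """
    direction = h_correct.mean(axis=0) - h_incorrect.mean(axis=0)
    return direction / np.linalg.norm(direction)

def correctness_score(h_new: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project new samples onto the direction; higher = judged more correct."""
    return h_new @ direction

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4096                                   # hypothetical hidden size
    h_pos = rng.normal(0.1, 1.0, (200, d))     # stand-ins for real activations
    h_neg = rng.normal(-0.1, 1.0, (200, d))
    v = correctness_direction(h_pos, h_neg)
    candidates = rng.normal(0.0, 1.0, (5, d))  # hidden states of k candidates
    best = int(np.argmax(correctness_score(candidates, v)))
    print("selected candidate:", best)         # ranked by probe, not log-likelihood
```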




Abstract: Human Activity Recognition (HAR) is the identification and classification of static and dynamic human activities, with applications in domains such as healthcare, entertainment, security, and cyber-physical systems. Traditional HAR approaches rely on wearable sensors, vision-based systems, or ambient sensing, each with inherent limitations such as privacy concerns or restricted sensing conditions. Recently, Radio Frequency (RF)-based HAR has emerged, relying on the interaction of RF signals with people to infer activities. Reconfigurable Intelligent Surfaces (RISs) offer significant potential in this domain by enabling dynamic control over the wireless environment, thus enhancing the information that can be extracted from RF signals. We present a Hand Gesture Recognition (HGR) approach that employs our own 6.5 GHz RIS design to manipulate the RF medium in an area of interest. We validate the capability of our RIS to control the medium by characterizing its steering response, and we further gather and publish a dataset of three different hand gestures for HGR classification. By employing two Convolutional Neural Network (CNN) models trained on data gathered under random and optimized RIS configuration sequences, we achieve classification accuracies exceeding 90%.
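
As an illustration, the sketch below shows a minimal CNN gesture classifier of the kind such an HGR pipeline might apply to RF traces recorded while sweeping RIS configurations. The input shape, channel count, and architecture are assumptions for illustration and do not reproduce the paper's two CNN models or dataset layout.

```python
# Minimal sketch (under assumptions): a small 1D CNN that classifies
# fixed-length RF traces into three gesture classes. Trace length and
# channel count are hypothetical placeholders.
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    def __init__(self, n_channels: int = 1, n_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis to one feature vector
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_channels, n_samples), e.g. received-power traces
        # recorded under a sequence of RIS configurations.
        z = self.features(x).squeeze(-1)
        return self.classifier(z)

if __name__ == "__main__":
    model = GestureCNN(n_channels=1, n_classes=3)
    dummy = torch.randn(8, 1, 256)     # 8 traces of 256 samples each (assumed length)
    logits = model(dummy)
    print(logits.shape)                # torch.Size([8, 3])
```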