Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Heigl

Following the Clues: Experiments on Person Re-ID using Cross-Modal Intelligence

Jul 02, 2025

Robert Aufschläger, Youssef Shoeb, Azarm Nowzad, Michael Heigl, Fabian Bally, Martin Schramm

Abstract:The collection and release of street-level recordings as Open Data play a vital role in advancing autonomous driving systems and AI research. However, these datasets pose significant privacy risks, particularly for pedestrians, due to the presence of Personally Identifiable Information (PII) that extends beyond biometric traits such as faces. In this paper, we present cRID, a novel cross-modal framework combining Large Vision-Language Models, Graph Attention Networks, and representation learning to detect textual describable clues of PII and enhance person re-identification (Re-ID). Our approach focuses on identifying and leveraging interpretable features, enabling the detection of semantically meaningful PII beyond low-level appearance cues. We conduct a systematic evaluation of PII presence in person image datasets. Our experiments show improved performance in practical cross-dataset Re-ID scenarios, notably from Market-1501 to CUHK03-np (detected), highlighting the framework's practical utility. Code is available at https://github.com/RAufschlaeger/cRID.

* accepted for publication at the 2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC 2025), taking place during November 18-21, 2025 in Gold Coast, Australia

Via

Access Paper or Ask Questions

ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization

Dec 17, 2024

Robert Aufschläger, Sebastian Wilhelm, Michael Heigl, Martin Schramm

Figure 1 for ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization

Figure 2 for ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization

Figure 3 for ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization

Figure 4 for ClustEm4Ano: Clustering Text Embeddings of Nominal Textual Attributes for Microdata Anonymization

Abstract:This work introduces ClustEm4Ano, an anonymization pipeline that can be used for generalization and suppression-based anonymization of nominal textual tabular data. It automatically generates value generalization hierarchies (VGHs) that, in turn, can be used to generalize attributes in quasi-identifiers. The pipeline leverages embeddings to generate semantically close value generalizations through iterative clustering. We applied KMeans and Hierarchical Agglomerative Clustering on $13$ different predefined text embeddings (both open and closed-source (via APIs)). Our approach is experimentally tested on a well-known benchmark dataset for anonymization: The UCI Machine Learning Repository's Adult dataset. ClustEm4Ano supports anonymization procedures by offering more possibilities compared to using arbitrarily chosen VGHs. Experiments demonstrate that these VGHs can outperform manually constructed ones in terms of downstream efficacy (especially for small $k$-anonymity ($2 \leq k \leq 30$)) and therefore can foster the quality of anonymized datasets. Our implementation is made public.

* 16 pages, 5 figures, accepted for presentation at IDEAS: 2024 28th International Symposium on Database Engineered Applications, Bayonne, France, August 26-29, 2024

Via

Access Paper or Ask Questions