Oguzhan Baser

TensorCommitments: A Lightweight Verifiable Inference for Language Models

Feb 13, 2026

Oguzhan Baser, Elahe Sadeghi, Eric Wang, David Ribeiro Alves, Sam Kazemian, Hong Kang, Sandeep P. Chinchali, Sriram Vishwanath

Abstract: Most large language models (LLMs) run on external clouds: users send a prompt, pay for inference, and must trust that the remote GPU executes the LLM without any adversarial tampering. We critically ask how to achieve verifiable LLM inference, where a prover (the service) must convince a verifier (the client) that an inference was run correctly without rerunning the LLM. Existing cryptographic works are too slow at the LLM scale, while non-cryptographic ones require a strong verifier GPU. We propose TensorCommitments (TCs), a tensor-native proof-of-inference scheme. TC binds the LLM inference to a commitment, an irreversible tag that breaks under tampering, organized in our multivariate Terkle Trees. For LLaMA2, TC adds only 0.97% prover and 0.12% verifier time over inference while improving robustness to tailored LLM attacks by up to 48% over the best prior work requiring a verifier GPU.

* 23 pages, 8 figures, under review

SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures

Jul 01, 2024

Oguzhan Baser, Kaan Kale, Sandeep P. Chinchali

Abstract: Advancements in DeepFake (DF) audio models pose a significant threat to voice authentication systems, leading to unauthorized access and the spread of misinformation. We introduce a defense mechanism, SecureSpectra, addressing DF threats by embedding orthogonal, irreversible signatures within audio. SecureSpectra leverages the inability of DF models to replicate high-frequency content, which we empirically identify across diverse datasets and DF models. Integrating differential privacy into the pipeline protects signatures from reverse engineering and strikes a delicate balance between enhanced security and minimal performance compromises. Our evaluations on Mozilla Common Voice, LibriSpeech, and VoxCeleb datasets showcase SecureSpectra's superior performance, outperforming recent works by up to 71% in detection accuracy. We open-source SecureSpectra to benefit the research community.

* 5 pages, 4 figures, Proc. INTERSPEECH 2024

TexShape: Information Theoretic Sentence Embedding for Language Models

Feb 05, 2024

H. Kaan Kale, Homa Esfahanizadeh, Noel Elias, Oguzhan Baser, Muriel Medard, Sriram Vishwanath

Abstract: With the exponential growth in data volume and the emergence of data-intensive applications, particularly in the field of machine learning, concerns related to resource utilization, privacy, and fairness have become paramount. This paper focuses on the textual domain of data and addresses challenges regarding encoding sentences to their optimized representations through the lens of information theory. In particular, we use empirical estimates of mutual information, using the Donsker-Varadhan definition of Kullback-Leibler divergence. Our approach leverages this estimation to train an information-theoretic sentence embedding, called TexShape, for (task-based) data compression or for filtering out sensitive information, enhancing privacy and fairness. In this study, we employ a benchmark language model for initial text representation, complemented by neural networks for information-theoretic compression and mutual information estimations. Our experiments demonstrate significant advancements in preserving maximal targeted information and minimal sensitive information over adverse compression ratios, in terms of predictive accuracy of downstream models that are trained using the compressed data.
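
For readers unfamiliar with the Donsker-Varadhan form mentioned in the abstract, the standard textbook statement is sketched below. The symbols X (an embedding), Z (a target or sensitive attribute), the critic T_theta, and the shuffled samples are illustrative assumptions, not notation taken from the paper; the sample-based estimate in the last line is the usual neural (MINE-style) approximation and may differ from the authors' exact estimator.

```latex
% Donsker-Varadhan variational representation of the KL divergence,
% with the supremum taken over functions T for which both expectations are finite
D_{\mathrm{KL}}(P \,\|\, Q) = \sup_{T} \; \mathbb{E}_{P}[T] - \log \mathbb{E}_{Q}\!\left[e^{T}\right]

% Mutual information as a KL divergence between the joint and the product of marginals
I(X; Z) = D_{\mathrm{KL}}\!\left(P_{XZ} \,\|\, P_{X} \otimes P_{Z}\right)

% Empirical estimate with a parametric critic T_\theta (assumed to be a neural network);
% \tilde{z}_i are samples of Z shuffled across the batch to approximate P_X \otimes P_Z
\hat{I}_n(X; Z) = \sup_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} T_\theta(x_i, z_i)
  - \log \frac{1}{n} \sum_{i=1}^{n} e^{T_\theta(x_i, \tilde{z}_i)}
```

Because any fixed critic gives a lower bound on the divergence, maximizing over theta tightens the bound, which is what makes the estimate usable as a training signal.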

* Submitted to the 2024 IEEE International Symposium on Information Theory