Privacy auditing provides an important safeguard by estimating the actual information leaked by a model, thus ensuring that theoretical privacy guarantees hold in practice. We study empirical privacy auditing for differentially private (DP) machine learning, focusing on efficient one-run methods for mechanisms such as DP-SGD. Prior one-run approaches threshold training examples or "canaries" into binary membership guesses, which discards useful information. We show that, in the white-box DP-SGD setting, canary-aligned signals naturally form a sequence of random variables whose normalized sum is asymptotically Gaussian. Leveraging this distributional perspective, we develop a DP-auditing framework that leads to tighter privacy lower bounds from a single training run.
Pre-Print
Balanced Additive Randomized Encodings for Shuffle Differential Privacy
Yu Wei, Jaspal Singh, Adya Agrawal, and Vassilis Zikas
2026
Presented at TPDP 2026. Submitted to ACM CCS 2026.
As large language models (LLMs) become more powerful, the computation required to run these models is increasingly outsourced to a third-party cloud. While this saves clients’ computation, it risks leaking the clients’ LLM queries to the cloud provider. Fully homomorphic encryption (FHE) presents a natural solution to this problem: simply encrypt the query and evaluate the LLM homomorphically on the cloud machine. The result remains encrypted and can only be learned by the client who holds the secret key. In this work, we present a GPU-accelerated implementation of FHE and use this implementation to benchmark an encrypted GPT-2 forward pass, with runtimes over 200x faster than the CPU baseline. We also present novel and extensive experimental analysis of approximations of LLM activation functions to maintain accuracy while achieving this performance.
@inproceedings{castro2025encryptedllm,
  title={Encrypted{LLM}: Privacy-Preserving Large Language Model Inference via {GPU}-Accelerated Fully Homomorphic Encryption},
  author={de Castro, Leo and Escudero, Daniel and Agrawal, Adya and Polychroniadou, Antigoni and Veloso, Manuela},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=PGNff6H1TV}
}
Public exchanges like the New York Stock Exchange and NASDAQ act as auctioneers in a public double auction system, where buyers submit their highest bids and sellers offer their lowest asking prices, along with the number of shares (volume) they wish to trade. The auctioneer matches compatible orders and executes the trades when a match is found. However, auctioneers involved in high-volume exchanges, such as dark pools, may not always be reliable. They could exploit their position by engaging in practices like front-running or face significant conflicts of interest—ethical breaches that have frequently resulted in hefty fines and regulatory scrutiny within the financial industry. Previous solutions, based on the use of fully homomorphic encryption (Asharov et al., AAMAS 2020), encrypt orders ensuring that information is revealed only when a match occurs. However, this approach introduces significant computational overhead, making it impractical for high-frequency trading environments such as dark pools. In this work, we propose a new system based on differential privacy combined with lightweight encryption, offering an efficient and practical solution that mitigates the risks of an untrustworthy auctioneer. Specifically, we introduce a new concept called Indifferential Privacy, which can be of independent interest, where a user is indifferent to whether certain information is revealed after some special event, unlike standard differential privacy. For example, in an auction, it’s reasonable to disclose the true volume of a trade once all of it has been matched. Moreover, our new concept of Indifferential Privacy allows for maximum matching, which is impossible with conventional differential privacy.
@inproceedings{anonymous2024indifferential,
  title={Indifferential Privacy: A New Paradigm and Its Applications to Optimal Matching in Dark Pool Auctions},
  author={Polychroniadou, Antigoni and Chan, T-H. Hubert and Agrawal, Adya},
  booktitle={The 24th International Conference on Autonomous Agents and Multi-Agent Systems},
  year={2025},
  url={https://openreview.net/forum?id=YGKL8fKjbd}
}
As the dominant operating system for mobile devices, Android is the prime target of malicious attackers. Installed Android applications provide an opportunity for attackers to bypass the system’s security. Therefore, it is vital to study and evaluate Android applications to effectively identify harmful ones. Conventional methods analyze Android applications using signature-hash-based algorithms or static-feature-based machine learning approaches. This research proposes optimized ensemble classification models for Android applications. Ensemble models have been trained for both static and dynamic analysis using seven and eight distinct classifiers, respectively. These models have been optimized by tuning their hyper-parameters and evaluated using K-fold cross-validation. We obtained an F1 score of 99.27% and an accuracy of 99.47% for static analysis, and our dynamic analysis model yielded an F1 score of 96.96% and an accuracy of 96.66%. Our proposed approach improves on conventional solutions by taking into account both static and dynamic analysis and attaining high accuracy with the help of ensemble models.
@inproceedings{10.1007/978-3-031-31164-2_14,
  author={Jain, Samyak and Agrawal, Adya and Nayak, Swapna Sambhav and Kakelli, Anil Kumar},
  editor={Sharma, Harish and Saha, Apu Kumar and Prasad, Mukesh},
  title={Optimized Static and Dynamic Android Malware Analysis Using Ensemble Learning},
  booktitle={Proceedings of International Conference on Intelligent Vision and Computing (ICIVC 2022)},
  year={2022},
  publisher={Springer Nature Switzerland},
  address={Cham},
  pages={165--179},
  isbn={978-3-031-31164-2}
}