Minimal Knowledge PT-AE Attacks on Black-Box Speaker Recognition Models

This study investigates minimal-knowledge PT-AE attacks on black-box speaker recognition and reports higher attack success rates (ASRs) and better perceptual quality than existing methods.


(1) Rui Duan, University of South Florida, Tampa, USA;

(2) Zhe Qu, Central South University, Changsha, China;

(3) Leah Ding, American University, Washington, DC, USA;

(4) Yao Liu, University of South Florida, Tampa, USA.

Abstract and Intro

Background and Motivation

Parrot Training: Feasibility and Evaluation

PT-AE Generation: A Joint Transferability and Perception Perspective

Optimized Black-Box PT-AE Attacks

Experimental Evaluations

Related Work

Conclusion and References



In this work, we investigated how to attack a black-box target speaker recognition model using only minimal knowledge of the target speaker’s speech. We extensively evaluated two things: the feasibility of using state-of-the-art voice conversion (VC) methods to generate parrot speech samples for building a parrot-trained (PT) surrogate model, and the methods for generating PT adversarial examples (PT-AEs) from that model. Our results show that PT-AEs transfer effectively to a black-box target model, and that the proposed PT-AE attack achieves higher attack success rates (ASRs) and better perceptual quality than existing methods against both digital-line speaker recognition models and commercial smart devices in over-the-air scenarios.


This paper is available on arXiv under the CC0 1.0 DEED license.