What Are Large Language Models Capable Of: The Vulnerability of LLMs to Adversarial Attacks
Too Long; Didn't Read
Recent research has shown that deep learning models, including large language models, are vulnerable to "adversarial attacks": carefully crafted manipulations of input data that mislead a model into producing unintended outputs. So, I decided to test out a framework that automatically generates universal adversarial prompts.