Table of Links

Abstract and 1. Introduction

Related Work
2.1 Vision-LLMs
2.2 Transferable Adversarial Attacks

Preliminaries
3.1 Revisiting Auto-Regressive Vision-LLMs
3.2 Typographic Attacks in Vision-LLMs-based AD Systems

Methodology
4.1 Auto-Generation of Typographic Attack
4.2 Augmentations of Typographic Attack
4.3 Realizations of Typographic Attacks

Experiments

Conclusion and References

4 Methodology

Figure 1 shows an overview of our typographic attack pipeline, which goes from prompt engineering to attack annotation through three steps: Attack Auto-Generation, Attack Augmentation, and Attack Realization. We describe each step in the following subsections.

4.1 Auto-Generation of Typographic Attack

To generate useful misdirection, the adversarial patterns must align with an existing question while guiding the LLM toward an incorrect answer. We achieve this through a concept called a directive, which refers to configuring the goal for an LLM, e.g., ChatGPT, to impose specific constraints while encouraging diverse behaviors. In our context, we direct the LLM to generate $\hat{a}$ as an opposite of the given answer $a$, under the constraint of the given question $q$. We therefore initialize directives to the LLM using the prompts in Fig. 2.

When generating attacks, we impose additional constraints depending on the question type. We focus on the tasks of ❶ scene reasoning (e.g., counting), ❷ scene object reasoning (e.g., recognition), and ❸ action reasoning (e.g., action recommendation), as shown in Fig. 3.

The directives encourage the LLM to generate attacks that influence a Vision-LLM's reasoning step through text-to-text alignment and to automatically produce typographic patterns as benchmark attacks. Clearly, the aforementioned typographic attack only works for single-task scenarios, i.e., a single pair of question and answer.
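
As a rough illustration of the directive idea above, the following Python sketch assembles a directive prompt from a question, its answer, and a task type. The prompt wording, the `TASK_CONSTRAINTS` table, and the names `QAPair` and `build_directive` are assumptions made for illustration only; the actual prompts are those in Fig. 2 and Fig. 3 and are not reproduced here. The sketch only builds the prompt text and does not call any LLM.

```python
from dataclasses import dataclass

# Hypothetical per-task constraints, one per question type named in the text.
# The real constraint wording is given in the paper's Fig. 3.
TASK_CONSTRAINTS = {
    "scene": "Keep the same unit or category as the answer (e.g. a different count).",
    "object": "Name a different but plausible object class or attribute.",
    "action": "Recommend an action that contradicts the given one.",
}


@dataclass(frozen=True)
class QAPair:
    question: str  # q in the text
    answer: str    # a in the text
    task: str      # key into TASK_CONSTRAINTS


def build_directive(pair: QAPair) -> str:
    """Compose a directive asking an LLM for an opposite answer (a-hat)."""
    if pair.task not in TASK_CONSTRAINTS:
        raise ValueError(f"unknown task type: {pair.task!r}")
    return "\n".join([
        "You are given a question and its correct answer.",
        f"Question: {pair.question}",
        f"Answer: {pair.answer}",
        "Write a short answer that opposes the given answer "
        "but still fits the question.",
        f"Constraint: {TASK_CONSTRAINTS[pair.task]}",
    ])


if __name__ == "__main__":
    example = QAPair("How many pedestrians are crossing?", "2", "scene")
    print(build_directive(example))
```

Keeping the task-specific constraint in a lookup table separates the shared directive (oppose $a$ while staying consistent with $q$) from the per-task restriction, which mirrors the two-stage description in the text.
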
To investigate multi-task vulnerabilities with respect to multiple pairs, we can also generalize the formulation to $K$ pairs of questions and answers, denoted as $(q_i, a_i)$, to obtain the adversarial text $\hat{a}_i$ for $i \in [1, K]$.

Authors:
(1) Nhat Chung, CFAR and IHPC, A*STAR, Singapore and VNU-HCM, Vietnam;
(2) Sensen Gao, CFAR and IHPC, A*STAR, Singapore and Nankai University, China;
(3) Tuan-Anh Vu, CFAR and IHPC, A*STAR, Singapore and HKUST, HKSAR;
(4) Jie Zhang, Nanyang Technological University, Singapore;
(5) Aishan Liu, Beihang University, China;
(6) Yun Lin, Shanghai Jiao Tong University, China;
(7) Jin Song Dong, National University of Singapore, Singapore;
(8) Qing Guo, CFAR and IHPC, A*STAR, Singapore and National University of Singapore, Singapore.

This paper is available on arxiv under CC BY 4.0 DEED license.

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Methodology for Adversarial Attack Generation: Using Directives to Mislead Vision-LLMs
