The Dual-Edged Sword of Vision-LLMs in AD: Reasoning Capabilities vs. Attack Vulnerabilities

Table of Links

Abstract and 1. Introduction

Related Work
  2.1 Vision-LLMs
  2.2 Transferable Adversarial Attacks

Preliminaries
  3.1 Revisiting Auto-Regressive Vision-LLMs
  3.2 Typographic Attacks in Vision-LLMs-based AD Systems

Methodology
  4.1 Auto-Generation of Typographic Attack
  4.2 Augmentations of Typographic Attack
  4.3 Realizations of Typographic Attacks

Experiments

Conclusion and References

3.2 Typographic Attacks in Vision-LLMs-based AD Systems

Integrating Vision-LLMs into end-to-end AD systems has shown promising results so far [9]: Vision-LLMs can enhance user trust by reasoning explicitly about the scene. On the one hand, language reasoning can elevate the capabilities of AD systems by drawing on the learned commonsense of LLMs while communicating proficiently with users.
On the other hand, exposing Vision-LLMs to public traffic scenarios not only makes them more vulnerable to typographic attacks that misdirect the reasoning process, but can also prove harmful if their outputs feed into decision-making, judgment, and control processes.

Authors:
(1) Nhat Chung, CFAR and IHPC, A*STAR, Singapore and VNU-HCM, Vietnam;
(2) Sensen Gao, CFAR and IHPC, A*STAR, Singapore and Nankai University, China;
(3) Tuan-Anh Vu, CFAR and IHPC, A*STAR, Singapore and HKUST, HKSAR;
(4) Jie Zhang, Nanyang Technological University, Singapore;
(5) Aishan Liu, Beihang University, China;
(6) Yun Lin, Shanghai Jiao Tong University, China;
(7) Jin Song Dong, National University of Singapore, Singapore;
(8) Qing Guo, CFAR and IHPC, A*STAR, Singapore and National University of Singapore, Singapore.

This paper is available on arXiv under the CC BY 4.0 DEED license.
