Table of Links
Methodology and 3.1 Preliminary
A. Character Generation Detail
C. Effect of Text Moderator on Text-based Jailbreak Attack
B. Ethics and Broader Impact
Although our research introduces a jailbreaking method aimed at MLLMs, we emphasize that our methodology must be used responsibly and that our findings are academic in nature. Our intention is to highlight potential vulnerabilities in these models and to encourage collaborative efforts to develop robust defenses, thereby enhancing the safety of MLLMs. To facilitate a transparent and constructive discussion surrounding FigStep, we are committed to releasing our datasets and, upon request, sharing the harmful responses we generated with academic institutions. Additionally, because MLLMs are still in the early stages of development, we believe that more text-image jailbreaking attacks likely remain to be explored. Ultimately, our findings should raise significant security concerns.
Authors:
(1) Siyuan Ma, University of Wisconsin–Madison;
(2) Weidi Luo, The Ohio State University;
(3) Yu Wang, Peking University;
(4) Xiaogeng Liu, University of Wisconsin–Madison.
This paper is