This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Zhe Liu, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(2) Chunyang Chen, Monash University, Melbourne, Australia;
(3) Junjie Wang, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China (corresponding author);
(4) Mengzhuo Chen, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(5) Boyu Wu, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(6) Zhilin Tian, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(7) Yuekai Huang, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(8) Jun Hu, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;
(9) Qing Wang, State Key Laboratory of Intelligent Game, Beijing, China; Institute of Software Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China (corresponding author).
Testing Related to Text Inputs. Many automated GUI testing techniques have been proposed for mobile apps [7, 9, 12, 22, 23, 26, 39, 48, 49, 57, 61, 67], yet they mainly focus on planning exploration paths to fully cover app activities and states. Other studies [27, 43, 44] aim to generate valid inputs that pass GUI pages and are used to enrich automated testing tools for higher coverage. None of them can test the text input widgets themselves.
For Web apps, SWAT [5] and AWET [62] generate unusual inputs based on pre-defined templates. ACTEve [6] and S3 [63] first use symbolic execution to extract input constraints from the source code and then employ a solver to generate the inputs. These approaches need to analyze the web code and cannot be directly applied to Android apps, which have quite different rendering mechanisms. In addition, some constraints are dynamically generated (as shown in Section 2.1.2) and cannot be extracted from the source code.
There are also string analysis methods that generate strings violating given constraints (e.g., string length) [14, 15, 18, 28, 33, 34, 37, 42, 64]. Although they are effective for string constraints, the inputs of mobile apps are more diverse, so these methods do not work well for our task.
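To make the notion of a constraint-violating string concrete, the following is a minimal, hypothetical sketch for the string-length case only. It is not taken from any of the cited tools; the function names and the fill character are assumptions chosen for illustration.

```python
def violating_lengths(min_len: int, max_len: int) -> list[int]:
    """Return lengths just outside the allowed range [min_len, max_len]."""
    candidates = [min_len - 1, max_len + 1]
    # A negative length is impossible, so drop it.
    return [n for n in candidates if n >= 0]


def violating_strings(min_len: int, max_len: int, fill: str = "a") -> list[str]:
    """Build strings whose length falls just outside [min_len, max_len]."""
    return [fill * n for n in violating_lengths(min_len, max_len)]


if __name__ == "__main__":
    # For a field that accepts 1 to 5 characters, this yields an empty
    # string and a 6-character string.
    print(violating_strings(1, 5))
```

A sketch like this covers only one explicit, statically known constraint, which hints at why such methods may struggle with the more varied inputs of mobile apps.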
LLM for Software Engineering. With the breakthrough of LLMs, studies have explored how LLMs can assist developers in a variety of tasks, such as code generation [54, 69], program repair [29, 31, 52], and code summarization [4, 69]. There is also a growing trend of applying LLMs to software testing, e.g., fuzzing deep learning libraries [20], unit test generation [36], bug reproduction [32], and valid input generation [44], with significant performance improvements. This work explores a different task, i.e., unusual text input generation for mobile apps, which provides new insights into how LLMs can enhance software testing practice.