Authors:
(1) LI LI, Beihang University, China;
(2) XIANG GAO, Beihang University, China;
(3) HAILONG SUN, Beihang University, China;
(4) CHUNMING HU, Beihang University, China;
(5) XIAOYU SUN, The Australian National University, Australia;
(6) HAOYU WANG, Huazhong University of Science and Technology, China;
(7) HAIPENG CAI, Washington State University, Pullman, USA;
(8) TING SU, East China Normal University, China;
(9) XIAPU LUO, The Hong Kong Polytechnic University, China;
(10) TEGAWENDÉ F. BISSYANDÉ, University of Luxembourg, Luxembourg;
(11) JACQUES KLEIN, University of Luxembourg, Luxembourg;
(12) JOHN GRUNDY, Monash University, Australia;
(13) TAO XIE, Peking University, China;
(14) HAIBO CHEN, Shanghai Jiao Tong University, China;
(15) HUAIMIN WANG, National University of Defense Technology, China.
The State Of OpenHarmony Ecosystem
Overview Of Mobile Software Engineering
In this work, we are interested in building a research roadmap for conducting software engineering research for OpenHarmony. Unfortunately, since OpenHarmony is still in its early stages, little work has been proposed for it so far. Nevertheless, we believe that the research efforts that have improved the Android and iOS ecosystems could also be conducted for OpenHarmony. Therefore, in this section, we first resort to a systematic literature review to understand the status quo of mobile software engineering research. We then leverage the empirical observations to form our research roadmap dedicated to OpenHarmony.
In this work, we conduct the systematic literature review following the methodologies outlined by Brereton et al. [16] and Li et al. [60]. Fig. 6 highlights the working process. In Step 1, we plan to investigate the latest research advancements in the area of mobile software engineering by answering the following research question.
RQ: What problems are targeted by our fellow researchers in the MSE community, and how are they resolved?
Then, in Step 2, we identify the search keywords that could be used to find all the relevant publications in order to answer the pre-defined research question. When we started to do that, we immediately realized that the number of primary publications is too large to read manually. Indeed, taking Android-related research alone, there are already over 7,500 papers recorded on DBLP. Even if one could read a paper in 2 hours, that would amount to about 15,000 hours of reading, which would still require years to complete. To mitigate this problem, we resort to only considering the existing survey and literature review papers, in which our fellow researchers have already systematically reviewed the different aspects of mobile software engineering. We believe these survey papers are representative of the status quo of mobile software engineering research. To this end, we identify the search keywords based on these concerns, and the list of included keywords is summarized in Table 5. In total, we have identified two groups of keywords, represented as G1 and G2, respectively. We then form the query based on a rule[9] that requires it to contain at least one keyword from each group.
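The query rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in the study: the keyword lists are hypothetical placeholders (the actual G1 and G2 keywords are those listed in Table 5), and `build_query` and `matches` are names introduced here for the example.

```python
import re

def build_query(g1, g2):
    """Render (g1_1 OR ... OR g1_x) AND (g2_1 OR ... OR g2_y) as a string."""
    left = " OR ".join(f'"{k}"' for k in g1)
    right = " OR ".join(f'"{k}"' for k in g2)
    return f"({left}) AND ({right})"

def matches(title, g1, g2):
    """Return True if the title contains at least one keyword from each group."""
    lowered = title.lower()

    def has_any(keywords):
        # Whole-word, case-insensitive match for each keyword.
        return any(
            re.search(r"\b" + re.escape(k.lower()) + r"\b", lowered)
            for k in keywords
        )

    return has_any(g1) and has_any(g2)

# Hypothetical placeholder keywords; the real lists are given in Table 5.
G1 = ["android", "ios", "mobile"]
G2 = ["survey", "literature review"]

print(build_query(G1, G2))
print(matches("A Survey of Android Malware Detection", G1, G2))
print(matches("Static Analysis of Android Apps", G1, G2))
```

The first title satisfies both groups, whereas the second contains a G1 keyword only and is therefore rejected by the conjunction.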
After the query is formed, in Step 3, we apply it directly to search for relevant studies in the Computer Science Bibliography (DBLP) database. This step gives us 65 papers that are potentially relevant to our study. After that, in Step 4, we refine the gathered list by manually checking each paper's relevance to mobile software engineering (i.e., whether it could indeed help answer the aforementioned research question). Specifically, we filter out the less relevant papers based on the following set of exclusion criteria.
(1) Since we only consider survey or literature review papers, all the non-survey papers are simply excluded from our study.
(2) Although some papers meet our selection criteria (i.e., their titles contain the group keywords in Table 5), their topics may not strictly fall into the software engineering category. We eliminate such papers by manually reviewing the abstracts and retaining only those that have the potential to provide guidance for OpenHarmony.
Once the irrelevant papers are filtered out, we then conduct backward snowballing (i.e., Step 5) by scanning all the referenced papers and checking whether they should also be considered for our study. We remind the readers that we have cross-checked the results (i.e., Step 6) of both previous steps (i.e., exclusion criteria filtering and backward snowballing) to ensure the reliability of the results.
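The filtering and snowballing steps can be pictured with the following sketch. It is an illustration under stated assumptions, not the authors' procedure: in the study both checks are manual judgments, whereas here they are stood in for by two hypothetical predicates (`is_survey`, `is_se_relevant`), the paper identifiers and reference map are made up, and only a single snowballing pass is performed.

```python
def select_papers(candidates, references, is_survey, is_se_relevant):
    """Apply the two exclusion criteria, then one pass of backward snowballing."""

    def keep(paper):
        # Criterion (1): survey papers only; criterion (2): SE-relevant only.
        return is_survey(paper) and is_se_relevant(paper)

    selected = [p for p in candidates if keep(p)]
    seen = set(selected)

    # Backward snowballing: scan the reference lists of the selected papers
    # and add any cited paper that also passes the criteria.
    for paper in list(selected):
        for cited in references.get(paper, []):
            if cited not in seen and keep(cited):
                selected.append(cited)
                seen.add(cited)
    return selected

# Hypothetical toy data standing in for the manually judged papers.
candidates = ["S1", "P1", "S2"]
references = {"S1": ["S3", "P2"], "S2": ["S1", "S4"]}
surveys = {"S1", "S2", "S3", "S4"}
relevant = {"S1", "S2", "S3"}

result = select_papers(
    candidates,
    references,
    is_survey=lambda p: p in surveys,
    is_se_relevant=lambda p: p in relevant,
)
print(result)
```

In this toy run, `P1` is dropped by the first criterion, `S3` is recovered through the references of `S1`, and `S4` is a survey that fails the relevance check.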
We were able to eventually collect 39 papers to answer our research question defined at the beginning of this study. Table 6 enumerates the list of selected papers, including their publication year and venue. Once the relevant papers are collected, we carefully read all of them and attempt to extract the relevant data (i.e., Step 7) from each paper to answer the research question. Specifically, we aim to extract the following two types of information: (1) Targeted Problems, which involve understanding the issues within the Android/iOS ecosystem that have been identified by our MSE researchers as problems needing resolution to create a more user-friendly mobile ecosystem, and (2) Fundamental Techniques, aimed at discovering the techniques required to address the various challenges in the mobile community. Considering that OpenHarmony may encounter similar issues to those faced by Android and iOS, we argue that insights gained from exploring these two aspects could prove valuable in shaping the roadmap for conducting software engineering research for OpenHarmony. Furthermore, similar to our approach in identifying relevant papers, we have conducted cross-checks of our observations, involving at least two authors, to ensure the reliability of these observations, thereby enhancing the trustworthiness of the research roadmap.
Before summarizing the top problems targeted by our fellow researchers in MSE, we first present the major participants (or artifacts) involved in MSE research. These participants are closely associated with the top problems identified and handled in MSE. As illustrated in Fig. 7, developers play a core role in MSE: they contribute to the ecosystem by implementing mobile apps based on the Android framework (also known as the SDK) provided by Google, along with various third-party libraries that are pre-developed to facilitate app development. These libraries include the ones used to provide advertisements, which play a crucial role in Android as they are the major source of profit for app developers.[10] When problems are encountered while developing an app, developers frequently resort to question-and-answer websites (such as Stack Overflow) to search for solutions. The app's source code is often managed on code hosting websites such as GitHub, which is also one of the most important resources leveraged by mining-software-repository researchers to learn how to improve Android apps. Once the apps are developed, they are uploaded to app stores such as the official Google Play store, on which various metadata associated with the app (such as the app's description, name, authors, etc.) is also provided. The app stores are the main portal for users to find and install apps. Besides searching for and installing apps, users can also leave feedback on the apps' dedicated pages (i.e., user comments, which could be complaints about defects or suggestions regarding new app features).
We now highlight the top problems targeted by our fellow researchers (cf. Table 7). These top problems could be applied to any of the aforementioned participants highlighted in Fig. 7. The problems are mainly grouped into nine categories, including app development, app deployment, user experience, security and privacy, quality, reliability, performance, energy, and socio-technical issues. To help readers better understand each of the categories (i.e., the actual problems handled by our fellow researchers), we also provide various problem examples in the second column of the table.
To solve the above software engineering problems, researchers have proposed various kinds of techniques. Note that, while other techniques have also been designed to solve the above problems, e.g., trusted execution environments (TEE) for increasing mobile application security, we do not include them and only consider software engineering techniques in this work. Also, resolving software engineering tasks often involves manual effort, such as confirming the warnings yielded by static analyzers or labeling datasets for training machine learning models. In this work, we do not take those manual approaches into account. For the remaining techniques, after discussing them among co-authors, we preliminarily categorize them as static, dynamic, and learning-based approaches. Fig. 8 highlights the representative ones.
Static Approaches. Static approaches analyze programs without executing them. The widely used static approaches are listed in Fig. 8. These approaches have been applied to the SE problems of mobile applications, Android frameworks, and mobile operating systems. Specifically, static approaches (e.g., taint analysis, symbolic execution, code instrumentation, model checking) are widely used to detect application bugs, including functional errors, code smells, security weaknesses/vulnerabilities, energy and performance bugs, permission escalations, etc. Beyond bug detection, static approaches (e.g., application hardening, code signing) are also used to increase the security and reliability of mobile applications. Moreover, with the rapid development of machine/deep learning, we have observed a trend of using static approaches to extract program features, which are then provided to learning approaches.
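To make the idea of analyzing a program without executing it concrete, the following minimal sketch walks a syntax tree and reports calls to a small deny-list of functions. It is a simple pattern check, far simpler than the taint analysis or symbolic execution mentioned above, and it targets Python source purely for illustration (mobile apps are typically written in other languages). The deny-list, the sample program, and the function name `find_sensitive_calls` are assumptions made for this example.

```python
import ast

# Illustrative deny-list; a real analyzer would use a curated API list.
SENSITIVE_CALLS = {"eval", "exec"}

def find_sensitive_calls(source):
    """Return sorted (line, name) pairs for direct calls to deny-listed functions.

    The source is only parsed into a syntax tree; it is never executed.
    """
    tree = ast.parse(source)
    findings = []
    for node in ast.walk(tree):
        # Only direct calls by name are matched, e.g. eval(...), not obj.eval(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SENSITIVE_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)

SAMPLE = '''
def handle(user_input):
    value = eval(user_input)
    return value
'''

print(find_sensitive_calls(SAMPLE))
```

Because the check is purely syntactic, it can flag code on paths that never run and miss calls made through aliases, which is the usual precision trade-off of lightweight static checks.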
Dynamic Approaches. In contrast with static approaches, dynamic approaches analyze programs during their execution. Like static approaches, they are applied to program testing. Widely used dynamic testing techniques include search-based testing, black-box/random testing, grey-box fuzzing, concolic execution, event-driven test generation, mutation testing, etc. Dynamic program analysis is also applied to security analysis (e.g., dynamic taint analysis and runtime monitoring) and automated program repair.
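As a small illustration of the simplest technique in this family, black-box random testing, the sketch below runs a function on randomly generated inputs and compares each result against a reference oracle. The function under test, its deliberately planted defect, and the helper name `random_test` are all hypothetical; real mobile testing tools generate UI events or structured inputs rather than integers.

```python
import random

def buggy_abs(x):
    """Toy function under test with a deliberately planted defect at x == -7."""
    if x == -7:
        return x  # defect: should return 7
    return x if x >= 0 else -x

def random_test(func, oracle, trials=2000, seed=0, low=-50, high=50):
    """Execute func on random integers and collect inputs where it disagrees
    with the oracle. A fixed seed keeps the run reproducible."""
    rng = random.Random(seed)
    failures = set()
    for _ in range(trials):
        x = rng.randint(low, high)
        if func(x) != oracle(x):
            failures.add(x)
    return sorted(failures)

print(random_test(buggy_abs, abs))
```

Unlike a static check, every reported failure here comes from an actual execution, but inputs that are never generated remain untested; this is why the surveyed work also explores guided strategies such as search-based testing and grey-box fuzzing.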
Learning-based Approaches. Beyond the traditional static and dynamic approaches, we have seen an increasing trend of applying machine/deep learning techniques to solve mobile software engineering problems. Learning techniques train models on features extracted from large collections of program artifacts and have achieved significant success in the field of code analysis. Learning-based techniques have been applied to many mobile software engineering tasks, including vulnerability detection, privacy issue detection, program testing, code smell checking, etc. Moreover, employing deep learning techniques to thwart Android malware attacks has recently garnered considerable research attention.
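The general recipe of turning program artifacts into features and training a model on them can be sketched with a tiny perceptron. Everything in this example is an assumption made for illustration: the four permission names used as features, the five hand-made training samples, and their labels are toy data, not drawn from any real dataset, and the surveyed work relies on far richer features and models.

```python
# Hypothetical feature set: one binary feature per requested permission.
FEATURES = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "CAMERA"]

def vectorize(permissions):
    """Encode a set of permissions as a 0/1 feature vector."""
    return [1 if name in permissions else 0 for name in FEATURES]

def predict(weights, bias, x):
    """Linear threshold classifier: 1 = flagged, 0 = not flagged."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def train_perceptron(samples, labels, epochs=20):
    """Classic perceptron update rule over a fixed number of epochs."""
    weights, bias = [0] * len(FEATURES), 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            if error:
                weights = [w + error * v for w, v in zip(weights, x)]
                bias += error
    return weights, bias

# Toy, hand-made training data (illustrative only).
train_sets = [
    {"SEND_SMS", "INTERNET"},
    {"SEND_SMS", "READ_CONTACTS"},
    {"INTERNET", "CAMERA"},
    {"INTERNET"},
    {"READ_CONTACTS", "INTERNET"},
]
train_labels = [1, 1, 0, 0, 0]

samples = [vectorize(s) for s in train_sets]
weights, bias = train_perceptron(samples, train_labels)
print(weights, bias)
print(predict(weights, bias, vectorize({"SEND_SMS"})))
```

The feature vectors here play the role of the statically extracted program features mentioned earlier; the learned weights simply reflect which permissions separate the two toy classes.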
[9] $(g_{1,1} \text{ OR } \dots \text{ OR } g_{1,x}) \text{ AND } (g_{2,1} \text{ OR } \dots \text{ OR } g_{2,y})$, where $g_{1,i} \in G_1$, $g_{2,j} \in G_2$, $1 \le i \le x$, and $1 \le j \le y$; $x$ and $y$ are the numbers of keywords in G1 and G2, respectively.
[10] Indeed, app developers often cannot make profits directly from the apps per se as they are often made available to users as free apps.
This paper is available on arxiv under CC 4.0 license.