
ChatGPT in Test Design: How to Streamline QA Processes

by Artem Tregub, August 26th, 2024

Too Long; Didn't Read

QA engineers can use the AI-driven agent to draft test designs, convert test cases into specific formats such as Jira TestRail CSV, create test cases from code, potentially generate code from test cases, and prepare test data. The intelligent agent highlights the critical information missing from the initial requirement and suggests improvements.

Designing software tests is often seen as a routine task for quality assurance (QA) engineers. This process requires considerable time, from requirements testing to preparing test data, before the actual testing begins. Many QA specialists find this part of their job tedious and are eager to simplify it. Fortunately, several artificial intelligence assistants promise to streamline this process. This article focuses on one of them: ChatGPT from OpenAI.


This article explores how the virtual chatbot can be utilized to meet the needs of quality assurance. Specifically, it demonstrates how QA engineers can use the AI-driven agent to draft test designs, convert test cases into specific formats such as Jira TestRail CSV, create test cases from code, potentially generate code from test cases, and prepare test data.


Before proceeding, it's important to note that, for this article, I used ChatGPT-4o to demonstrate the AI assistant’s capabilities and provide the most effective solutions for the cases discussed. Now, we are prepared to move forward.

Requirements Testing

Every system or project begins with requirements, making requirements testing a critical part of the testing cycle. Each requirement and set of requirements must meet quality criteria such as completeness, unambiguity, correctness, etc. While I won't delve into these criteria here, you can read more about requirements quality criteria in my previous article.


In this section, I will focus on a prompt used to request ChatGPT to test any requirement and the AI-powered assistant's response to showcase its current capability. Although you can use a requirements testing checklist to ensure all necessary criteria are met, the AI chatbot can be applied for pre-checking requirements or generating ideas.


Let’s get down to business. The first example includes a simple, non-specific requirement to test a text field: “Text field shall accept only digits as an input from a user.” Often, we encounter such vague requirements, and clearly, this one is incomplete. Using the digital assistant, we can improve it. I specify my prompt to obtain improvements and a regular expression for future automation.


Sometimes, the digital assistant may ask additional questions to refine the result or require several prompt modifications to include all necessary information. The prompt and response for this particular request are shown below.


Here is a requirement: 'Text field shall accept only digits as an input from a user.' Please provide recommendations for improvement if necessary, or confirm if the requirement is already good. Also, provide a regular expression to cover this requirement.



The intelligent agent highlights the critical information missing from the initial requirement and suggests improvements. Additionally, it provides recommendations on other aspects that can be tested and how the system itself can be enhanced at an early development stage when the requirement first appears in a lifecycle management system like Polarion or Jira. The response also includes the requested regex with an explanation of its use.
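The response itself is not reproduced here, so the following is only a minimal sketch of how a digits-only check might be automated in Python. It assumes that the field must be non-empty and that only ASCII digits count; the name `accepts` is hypothetical, and both assumptions are exactly the kind of detail the original requirement leaves open.

```python
import re

# Assumption: one or more ASCII digits, nothing else.
# [0-9] is used instead of \d because \d also matches non-ASCII digits in Python 3.
DIGITS_ONLY = re.compile(r"[0-9]+")


def accepts(value: str) -> bool:
    """Return True if the whole input consists of ASCII digits only."""
    # fullmatch requires the entire string to match, so leading or
    # trailing spaces and newlines are rejected as well.
    return DIGITS_ONLY.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ["12345", "", "12a", " 12", "12\n"]:
        print(repr(sample), accepts(sample))
```

Whether an empty value, a leading zero, or a maximum length should be allowed is a question for the requirement itself, not for the regex.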


Next, let’s consider a more specific requirement to see how the virtual assistant handles it. Since I work in the automotive industry, specifically with HV (high voltage) systems for electric vehicles, this requirement pertains to that field. I used the same prompt with a different requirement and did not request a regular expression this time, as it was unnecessary. The prompt and response are shown below.


Here is a requirement: 'The HV Battery System shall stop pre-charge and disconnect the battery from the HV bus IF "State" != Connect within a calibratable timeout, "CalibrationTimeout".' Please provide recommendations for improvement if necessary, or confirm if the requirement is already good.



As we can see, the AI-powered assistant suggests clarifying the “State” condition. In this case, that is unnecessary. Nonetheless, the remark could motivate you to check whether this and other states are documented. The same applies to the definition of “State != Connect”; it’s essential to ensure this condition is defined in a specification or other document.


Additionally, the AI-driven agent emphasizes logging, which is often overlooked but should be included and tested during development. This reminder from the virtual chatbot helps prevent future issues and enhances project traceability.


What does this second example demonstrate? It shows that the current version of ChatGPT-4o can handle not only simple requirements, such as those for API construction or web development, but also specific ones related to specialized fields like HV systems. Although the language model is, at its core, generating well-constructed sentences, its output can serve as an initial stage of requirement testing.


The generative AI can provide a direction for thinking about how to approach requirements, assess their completeness, and offer ideas and advice.


It’s important to remember that the AI agent lacks complete context for every requirement. Providing that context raises concerns about data security, NDA-protected information, and organizational data privacy. When using AI at work, I rename every signal and algorithm that might be vital to my company and avoid entering any sensitive information into the AI-driven tool. OpenAI states that it uses the data in a de-identified, anonymous manner.


However, the future use of this data is uncertain, so I recommend against entering specific details and data, particularly anything from your work. If working with the API, change keys at a minimum. Also, to achieve a good outcome, provide as much information as possible within the limits of your organization's privacy and data security rules.
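Renaming signals by hand is easy to get wrong, so the idea can be sketched as a small helper. This is an illustration under assumptions, not a security tool: `mask_names`, `unmask_names`, the `SignalN` aliases, and the example signal names are all hypothetical, and the sketch assumes sensitive names are plain identifiers.

```python
import re


def mask_names(text, sensitive_names):
    """Replace each sensitive name with a neutral alias; return text and mapping."""
    # Longest names first, so a name that contains another one is handled first.
    ordered = sorted(sensitive_names, key=len, reverse=True)
    mapping = {name: f"Signal{i}" for i, name in enumerate(ordered, start=1)}
    if not mapping:
        return text, {}
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b"
    )
    masked = pattern.sub(lambda match: mapping[match.group(0)], text)
    return masked, mapping


def unmask_names(text, mapping):
    """Restore the original names in a response that uses the aliases."""
    for name, alias in mapping.items():
        # A function replacement avoids backslash handling in the name.
        text = re.sub(rf"\b{re.escape(alias)}\b", lambda _m, n=name: n, text)
    return text


if __name__ == "__main__":
    requirement = "BmsPrechargeState != Connect within BmsTimeoutCal"
    masked, mapping = mask_names(requirement, ["BmsTimeoutCal", "BmsPrechargeState"])
    print(masked)
    print(unmask_names(masked, mapping))
```

The mapping stays on your machine, so the chatbot only ever sees the aliases, and its answer can be translated back afterwards.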

Test Design

This section examines how the digital assistant prepares test designs for one of the requirements used in the previous section and for a new one aimed at edge cases. First, a few words about the format: the AI agent was asked to generate a CSV file for each test case. This format was chosen because many QA engineers use a tracking system called Jira.


This and other systems have various plugins for managing tests, their descriptions, and artifacts of test design. Supported file formats may vary, so users can specify their preferred scheme to the intelligent chatbot.


The following prompt was used to obtain a CSV file for an automotive requirement:


“Let's create test cases for the requirement "The HV Battery System shall stop pre-charge and disconnect the battery from the HV bus IF "State" != Connect within a calibratable timeout, "CalibrationTimeout"." Please generate separate CSV files for each test case. Format each CSV with the following columns: Action, Data, Expected Result. Use a comma (,) as the separator. Use (;) to separate data inside csv fields.”


The last part of the prompt about using a semicolon helps the AI assistant avoid producing an incorrect result. As a language model, an AI-powered chatbot tends to separate complex parts of sentences with a comma, which is unsuitable here since a comma is used as the CSV file separator. Therefore, another separator for data should be specified if necessary.
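The effect of a stray comma is easy to demonstrate with Python's standard `csv` module. The step texts below are made up for illustration; the only point is the column count.

```python
import csv


def column_count(line: str) -> int:
    """Parse one CSV line with a comma delimiter and count its fields."""
    return len(next(csv.reader([line])))


# Hypothetical test step: the Data field lists two values.
with_semicolon = "Set initial conditions,CalibrationTimeout = 2 s; State = Init,Conditions applied"
with_comma = "Set initial conditions,CalibrationTimeout = 2 s, State = Init,Conditions applied"

if __name__ == "__main__":
    print(column_count(with_semicolon))  # three columns, as intended
    print(column_count(with_comma))      # the extra comma creates a fourth column
```

Quoting a field that contains commas would also work, but asking for a semicolon inside fields is simpler to state in a prompt and simpler to verify in the output.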


Below is the response where CSV files were generated by the intelligent agent as a piece of code that can be represented in a table format by Jira.



I have displayed the resulting tests below in tabular form to make them easier to analyze visually.


File 1: test_stop_precharge_and_disconnect_within_timeout.csv

The test verifies that the HV Battery System correctly halts pre-charge and disconnects the battery from the HV bus if the "State" is not "Connect" within the calibratable timeout period.


File 2: test_continue_precharge_and_remain_connected_within_timeout.csv

This case confirms that the HV Battery System continues pre-charging and stays connected to the HV bus when the "State" is "Connect" within the calibratable timeout period.


File 3: test_stop_precharge_and_disconnect_after_timeout.csv

The scenario checks that the HV Battery System halts pre-charge and disconnects the battery from the HV bus if the "State" remains not "Connect" after the calibratable timeout period expires.


File 4: test_ignore_state_change_to_connect_after_timeout.csv

This test script ensures that the HV Battery System does not reconnect the battery to the HV bus if the "State" changes to "Connect" only after the calibratable timeout period has passed.


File 5: test_log_error_on_timeout.csv

This test case verifies that the HV Battery System logs an error when the "State" is not "Connect," and the battery is disconnected from the HV bus due to the expiration of the calibratable timeout.


The AI-powered chatbot responded to the query with various tests that take into account the requirement parameter (the calibration timeout) and different states. In this example, the initial requirement was used instead of the improved version previously discussed. Because this initial requirement lacks specific context and logging guidance, the resulting tests prompt you to verify how the system logs and manages these events.


This could motivate you to ask the development team and the project manager multiple questions, necessitating further investigation and clarification. But despite this fact, the generative AI was able to apply basic test design techniques to provide a test design for this particular requirement.
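To make the timing logic behind these test cases easier to reason about, here is a minimal model of the requirement. It is a sketch under stated assumptions, not the real system: `PrechargeMonitor`, its field names, and the state name "Init" are hypothetical, the outcome is assumed to latch once decided, and the boundary (elapsed time exactly equal to the timeout) is assumed to trigger a disconnect, which the requirement does not actually specify.

```python
from dataclasses import dataclass


@dataclass
class PrechargeMonitor:
    """Toy model: disconnect if "Connect" is not reached before the timeout."""

    calibration_timeout_s: float
    outcome: str = "precharging"  # later "connected" or "disconnected", then latched

    def update(self, state: str, elapsed_s: float) -> str:
        if self.outcome != "precharging":
            # Latched: a late change to "Connect" is ignored (compare File 4).
            return self.outcome
        if elapsed_s >= self.calibration_timeout_s:
            # Assumption: reaching the timeout exactly counts as expired.
            self.outcome = "disconnected"
        elif state == "Connect":
            self.outcome = "connected"
        return self.outcome


if __name__ == "__main__":
    monitor = PrechargeMonitor(calibration_timeout_s=2.0)
    print(monitor.update("Init", 1.0))
    print(monitor.update("Init", 2.0))
    print(monitor.update("Connect", 2.5))
```

Writing even a toy model like this exposes the open questions in the requirement, such as the behavior at exactly the timeout, which are worth raising with the development team.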


When testing straightforward requirements, it seems easy to generate test cases using AI. However, QA specialists often deal with edge cases that require diverse test cases for effective validation. With OpenAI's model, there is no guarantee that the AI will automatically create varied test cases for each specific situation.


It is therefore essential to analyze the AI-generated results and make the necessary edits. With this in mind, edge test cases were generated separately for the requirement listed below.


Let's create edge test cases for the requirement: 'The web service should allow uploading files of limited size to the web server. The minimum size of an uploaded file is 1 MB, and the maximum size is 30 MB. The web service allows users to upload only the following formats: PDF, DOC, DOCX. If a user attempts to upload a file that is outside the size limits or is in an unsupported format, an error message should be displayed, and the upload should be rejected.' Please generate separate CSV files for each edge test case. Format each CSV with the following columns: Action, Data, Expected Result. Use a comma (,) as the separator. Use (;) to separate data inside csv fields.



File 1: test_minimum_size.csv

This test case verifies the web service's handling of file uploads at the minimum size limit, ensuring files at the minimum size are accepted and those just below are rejected.


File 2: test_maximum_size.csv

This test scenario checks the web service's ability to handle file uploads with a maximum size, ensuring files of the maximum size are accepted and files slightly larger than this limit are rejected.


File 3: test_supported_formats.csv

The validation case assesses the web service's response to file uploads based on format, confirming that supported formats are accepted and unsupported formats are rejected.


File 4: test_combined_size_and_format.csv

This test script evaluates the web service's handling of file uploads at boundary conditions, verifying compliance with size (minimum and maximum) and format requirements, and rejecting non-compliant files.


As shown in the response, the AI agent successfully generated edge scenarios to validate the requirement for uploading a file of limited size. This demonstrates that generative AI can be used to develop boundary tests, for example as part of test automation, to cover as many test cases as possible without spending extensive time manually creating a test design for each requirement.


Nevertheless, to increase the likelihood of achieving the desired result, it may be necessary to provide detailed context about the system and the specific application of the requirement. Including related requirements can also help clarify specific terms, states, and other details.
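The boundary values behind the upload test cases in this section can also be derived mechanically, which gives a quick way to cross-check what the AI produced. The sketch below assumes that 1 MB means 1024 * 1024 bytes, that both limits are inclusive, and that the format is determined by a case-insensitive file extension; none of this is stated in the requirement, and the names `is_upload_allowed` and `boundary_sizes` are hypothetical.

```python
import os

MB = 1024 * 1024          # assumption: binary megabytes
MIN_SIZE = 1 * MB         # assumption: lower limit is inclusive
MAX_SIZE = 30 * MB        # assumption: upper limit is inclusive
ALLOWED_FORMATS = {"pdf", "doc", "docx"}


def is_upload_allowed(filename: str, size_bytes: int) -> bool:
    """Reference check for the upload requirement under the assumptions above."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension in ALLOWED_FORMATS and MIN_SIZE <= size_bytes <= MAX_SIZE


def boundary_sizes():
    """Classic boundary values: just below, at, and just above each limit."""
    return [MIN_SIZE - 1, MIN_SIZE, MIN_SIZE + 1,
            MAX_SIZE - 1, MAX_SIZE, MAX_SIZE + 1]


if __name__ == "__main__":
    for size in boundary_sizes():
        print(size, is_upload_allowed("report.pdf", size))
```

If the generated test cases use different boundary sizes than this list, that is a signal to check either the AI output or the assumptions about the requirement.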

Converting Test Cases to Specific Formats

To evaluate the AI-powered assistant's capability to convert a test design description into another format, an example test plan containing information about authorization by token was used. The AI was asked to convert this into a CSV file with specified columns. As mentioned earlier, many systems and plugins can be used for test design creation, and various formats such as XML, XLS or XLSX, JSON, etc., may be supported.


A simple text description of a test served as the basis for a prompt that defined input data and an expected result. The prompt and the AI-generated result are shown below.


I have a test design draft. Please generate separate CSV files for each test case. Format each CSV with the following columns: Action, Data, Expected Result. Use a comma (,) as the separator. Use (;) to separate data inside csv fields.


Here is the first test:

Set System time = 23:59:00 -->> System time = 23:59:00

Request token with lifetime = 2 min -->> Request include token

Try to use token to authorise in system -->> Auth successful

Wait 3 min

Try to use token to authorise in system -->> Auth denied

Set System time = 23:59:00 -->> System time = 23:59:00

Try to use token to authorise in system -->> Auth denied

Check token in database -->> token marked as expired




This result can be useful for various needs, and this approach can transfer regular textual representations of test cases into an appropriate format. Additionally, if any changes are needed in the response, such as altering columns, rows, or even the file format, you don't have to start from scratch. Simply make clarifications, and the AI-powered agent will regenerate its answer.


This is one of the intelligent agent’s greatest strengths for professionals: it provides a solid draft to work with, and you can save the prompt for future use, allowing you to repeat the same tasks much faster since you already have a reusable version.


When working with the AI-driven agent, I find it helpful to treat it like a database rather than a magical AI assistant. Similar to querying a database with SQL to retrieve specific information — selecting fields and performing actions — my approach with the AI chatbot follows an identical logic.


Asking specific questions or giving clear instructions on what needs to be done allows ChatGPT to generate drafts, provide ideas for future work, and suggest approaches to problem-solving. This method of interaction allows me to leverage its capabilities effectively, improving productivity and fostering creativity.
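For comparison, the target layout used in this section (Action, Data, Expected Result) can be sketched as a small deterministic script. This is not how ChatGPT produced its answer; it only illustrates the mapping, assuming that `-->>` separates a step from its expected result and that the first `=` in a step separates the action from its data. The function name `draft_to_csv` is hypothetical.

```python
import csv
import io

# A few steps taken from the draft used in the prompt.
DRAFT = """\
Set System time = 23:59:00 -->> System time = 23:59:00
Request token with lifetime = 2 min -->> Request include token
Wait 3 min
Try to use token to authorise in system -->> Auth denied
"""


def draft_to_csv(draft: str) -> str:
    """Convert 'step -->> expected' lines into a three-column CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Action", "Data", "Expected Result"])
    for line in draft.splitlines():
        line = line.strip()
        if not line:
            continue
        step, _, expected = line.partition("-->>")
        action, _, data = step.partition("=")
        writer.writerow([action.strip(), data.strip(), expected.strip()])
    return buffer.getvalue()


if __name__ == "__main__":
    print(draft_to_csv(DRAFT))
```

A script like this breaks as soon as the draft deviates from the pattern, which is where the chatbot's tolerance for loosely formatted input becomes useful.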

Creating Test Cases From Code

As a QA automation engineer, I prefer to create test cases directly from code rather than maintaining separate text files, CSV files, or Excel sheets. Automation simplifies scenario management, debugging, and code refactoring, and it improves understanding of the system. Every system, whether software-based or embedded in hardware, presents its own challenges. Developing test cases from code is a critical task for maximizing automation, which is a fundamental part of our role in any organization.


For this example, I chose Python code that works with a gummy bear API, taken from an old test task from a company.



An iteratively generated prompt included instructions for generating CSV files based on Python code provided after the request, and instructions for the expected result:

Here's some Python code with multiple test functions. Please generate separate CSV files for each test case from this code. Format each CSV with the following columns: Action, Data, Expected Result. Use a comma (,) as the separator. Use (;) to separate data inside csv fields. Ensure the test steps are described generically, without direct references to specific function calls, and include verification of system states as part of the expected outcomes.


How did the last sentence of this request come about? This final section was developed from initial experiments with the AI-driven tool in test design. Initially, the generative AI relied on function calls, which is not ideal for test development as it is too specific, rather than using the more natural language typically used in test design.


To address this, I told the intelligent chatbot which steps were needed, including verification of system states in the expected results. This guidance directed the AI to treat validation as an integral part of the expected outcomes.


File 1: test_read_by_id.csv


File 2: test_read_all.csv


Examining the result reveals that the initial file contained two distinct test functions, resulting in two separate files: test_read_by_id.csv and test_read_all.csv. The descriptions are written in general language, using phrases like "create a test", "read bear info with specific ID", and "delete the created bear" instead of specific method names like create, read, and delete.


Additionally, the format includes three columns as requested, facilitating easy integration into Jira. This format can be directly imported into Jira or adjusted according to specific preferences.
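One simple sanity check on such a result is to confirm that every test function in the source file got its own CSV file. A minimal sketch with Python's standard `ast` module follows; the function bodies in `SOURCE` are placeholders rather than the original test task, and `expected_csv_names` is a hypothetical helper.

```python
import ast

# Placeholder source: only the two function names come from the article.
SOURCE = '''
def test_read_by_id():
    """Create a bear, read it by ID, delete it."""


def test_read_all():
    """Create a bear, read all bears, delete it."""


def helper():
    pass
'''


def expected_csv_names(source: str) -> list:
    """List one CSV file name per top-level test function in the source."""
    tree = ast.parse(source)
    return [
        f"{node.name}.csv"
        for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_")
    ]


if __name__ == "__main__":
    print(expected_csv_names(SOURCE))
```

Comparing this list with the files named in the chatbot's answer catches silently skipped or merged test cases before anything is imported into Jira.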

Creating Code From Test Cases

In short, the current state of affairs here is not yet satisfactory. Many project and product managers hope for a solution that can automate tasks directly from descriptions or requirements. Realistically, however, the virtual chatbot has not yet reached that level of capability.


Recently, I encountered an intriguing challenge involving ChatGPT and its interaction with code. The task involved migrating a Java-based test framework and its test cases to Python. In our company, we have a team whose main development language is Java, and this team implemented test automation using Java.


However, this created challenges for other team members, including test engineers and automation specialists, who primarily use Python. Unfortunately, my attempt to use the AI agent for this task wasn't successful. The model requires detailed information about the framework itself, which involves a significant amount of code.


Currently, the AI-driven tool lacks support for analyzing large volumes of code and cannot segment or process extensive files into manageable parts. This limitation led to failure. The virtual assistant simply isn't capable of this yet.


Recent articles on code generation with the intelligent agent reach similar conclusions. While the AI-powered chatbot can assist in drafting code, it cannot provide a fully functional service. In practice, manual coding remains more efficient in most cases.

Test Data Preparation With OpenAI's Model

Test data preparation typically requires a significant amount of time from QA engineers. While there are various ways to prepare data sets, the intelligent chatbot can be utilized for this purpose. To test this, the AI was asked to create a dictionary for numerous parameterized tests in Python. The prompt used was:


I have a task to test a text field where only capital letters with a length of no more than 5 should be accepted. I have created a parameterized test with Pytest for this purpose. Could you prepare a dictionary with test data for me? Please include tests with non-letter symbols. Let's format the expected result as a boolean value True or False.


This final prompt was refined to achieve a practical and useful result, as the initial response was insufficient. The addition of tests with non-letter symbols improved the outcome.


The result from ChatGPT:

test_data = [
    {"input": "ABCDE", "expected_result": True},
    {"input": "A", "expected_result": True},
    {"input": "AB", "expected_result": True},
    {"input": "ABCDEF", "expected_result": False}, # More than 5 characters
    {"input": "abc", "expected_result": False},    # Lowercase letters
    {"input": "123", "expected_result": False},    # Numbers
    {"input": "AB1C", "expected_result": False},   # Mixed letters and numbers
    {"input": "!@#", "expected_result": False},    # Special characters
    {"input": "AB C", "expected_result": False},   # Space included
    {"input": "AbCdE", "expected_result": False},  # Mixed case letters
    {"input": "", "expected_result": False},       # Empty input
]

These input values can be further extended by the user or the generative AI by adding specific details and providing more context to generate an appropriate data set for the test cases. It's a quick operation, and the test data is ready for use.
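Before relying on generated data, it is worth running it against a reference check. The sketch below is only an illustration: `is_valid` is a hypothetical stand-in for the field's real validation, it assumes that "capital letters" means ASCII A to Z and that empty input is rejected, and the data set is a shortened sample in the same shape as the generated one.

```python
import re


def is_valid(value: str) -> bool:
    """Assumed rule: one to five ASCII capital letters, nothing else."""
    return re.fullmatch(r"[A-Z]{1,5}", value) is not None


# Shortened sample in the same shape as the generated data.
test_data = [
    {"input": "ABCDE", "expected_result": True},
    {"input": "A", "expected_result": True},
    {"input": "ABCDEF", "expected_result": False},  # More than 5 characters
    {"input": "abc", "expected_result": False},     # Lowercase letters
    {"input": "AB1C", "expected_result": False},    # Mixed letters and numbers
    {"input": "AbCdE", "expected_result": False},   # Mixed case letters
    {"input": "", "expected_result": False},        # Empty input
]


def failing_cases(cases):
    """Return the cases whose expectation disagrees with the reference check."""
    return [c for c in cases if is_valid(c["input"]) != c["expected_result"]]


if __name__ == "__main__":
    print(failing_cases(test_data))
```

The same list can be passed to `pytest.mark.parametrize` as the argument values of a parameterized test. A check like `failing_cases` also flags contradictory entries, for example the same input listed once as valid and once as invalid.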

Summary

In conclusion, ChatGPT demonstrates proficiency in performing basic requirement analysis and testing, as well as fundamental test design techniques. It effectively generates test designs from requirements or other descriptions and converts general test cases into various specific formats.


The CSV format demonstrated in this article is specific to my case, illustrating how test design descriptions can be stored in Jira and uploaded with formatted steps.


Furthermore, OpenAI's model excels at generating test cases from code according to specific formatting requirements, significantly saving time for automation engineers. It also proves valuable in preparing test data for daily use, providing thorough case analysis, including edge cases and exceptions, even outside of automation targets.


Although it is not yet capable of fully automating tasks directly from descriptions or requirements, it acts as a powerful tool to enhance productivity and streamline various aspects of the QA process.