Unit test generation using LLMs†

An exploratory view

Unit test case generation using LLMs diagram

LLMs have largely been used to generate unit test cases. The general process is as follows.

The inputs to unit test case generation are listed below; they can be combined in different ways:

Program under test (PUT)
Textual descriptions of functionality: such as bug reports, documentation, commit messages, bug patches, and requests in natural language.
Pre-existing test cases
Context: such as examples of desired output, test method signatures, and error feedback.

These inputs are placed into a crafted prompt, which is then used to query the LLM to generate the unit test cases.
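
As a rough illustration of how the inputs might be combined into one prompt, here is a minimal Python sketch. The `GenerationInputs` class, the `build_prompt` function, and the section headings are hypothetical names chosen for this example; the reference does not prescribe a particular prompt layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GenerationInputs:
    """Hypothetical container for the four kinds of input."""
    program_under_test: str                                  # source code of the PUT
    descriptions: List[str] = field(default_factory=list)    # bug reports, docs, commit messages
    existing_tests: List[str] = field(default_factory=list)  # pre-existing test cases
    context: List[str] = field(default_factory=list)         # desired output, signatures, error feedback


def build_prompt(inputs: GenerationInputs) -> str:
    """Combine whichever inputs were supplied into a single prompt string."""
    sections = [
        ("Program under test", [inputs.program_under_test]),
        ("Functionality descriptions", inputs.descriptions),
        ("Existing test cases", inputs.existing_tests),
        ("Context", inputs.context),
    ]
    parts = ["Write unit tests for the program below."]
    for title, items in sections:
        if items:  # skip inputs that were not provided
            parts.append(f"## {title}\n" + "\n\n".join(items))
    return "\n\n".join(parts)
```

Because empty inputs are skipped, the same function covers the different combinations, for example the PUT alone or the PUT plus a bug report.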

C, D

The LLM can be a general-purpose model (e.g. ChatGPT, Gemini) or a fine-tuned version of one. Fine-tuning is done with examples of test cases.
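
To make "examples of test cases" more concrete, the sketch below turns pairs of code and matching tests into JSON Lines records. It rests on assumptions: that each example pairs a piece of code with its test, and that the record uses `prompt` and `completion` fields. Real fine-tuning services define their own schemas, and `to_finetuning_records` is a hypothetical helper.

```python
import json
from typing import Iterable, Tuple


def to_finetuning_records(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (code, test) pairs as JSON Lines, one training example per line.

    The "prompt"/"completion" field names are an assumption for illustration.
    """
    lines = []
    for code, test in pairs:
        record = {
            "prompt": f"Write a unit test for:\n{code}",
            "completion": test,
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)
```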

E, F, G, H

The LLM produces one or more test cases, which then go through a validation phase. A test case that fails validation is either 1) discarded or 2) iteratively improved by feeding the context produced during validation (such as error feedback) back to the LLM.

The final output is 1 or more validated unit test cases.
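
The generate, validate, and refine loop can be sketched as follows. This is a minimal illustration under stated assumptions: `llm` is any callable that maps a prompt to test code (no real model API is used), and `run_in_process` is a toy validator that simply executes the test and reports exceptions. A real setup would more likely run candidates in a sandbox with a test runner.

```python
import traceback
from typing import Callable, Optional, Tuple

LLM = Callable[[str], str]                     # prompt -> test code (hypothetical interface)
Validator = Callable[[str], Tuple[bool, str]]  # test code -> (passed, feedback)


def run_in_process(test_code: str) -> Tuple[bool, str]:
    """Toy validator: execute the test code and capture any exception.

    Assumes the test is self-contained and raises on failure.
    Executing untrusted generated code in-process is unsafe outside a demo.
    """
    try:
        exec(compile(test_code, "<candidate_test>", "exec"), {})
    except Exception:
        return False, traceback.format_exc()
    return True, ""


def generate_validated_test(
    prompt: str,
    llm: LLM,
    validate: Validator = run_in_process,
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a test that passes validation, or None if it is discarded."""
    current_prompt = prompt
    for _ in range(max_rounds):
        candidate = llm(current_prompt)
        passed, feedback = validate(candidate)
        if passed:
            return candidate
        # Iterative improvement: feed the failed attempt and its feedback back.
        current_prompt = (
            f"{prompt}\n\nPrevious attempt:\n{candidate}\n\n"
            f"Validation feedback:\n{feedback}\nPlease fix the test."
        )
    return None  # Discard: no candidate passed within the round budget.
```

The `max_rounds` budget is a design choice for this sketch: it bounds how long a failing test is repaired before it is discarded.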

To end

Unit test generation using LLMs would have to be integrated with existing tools: for instance, by combining it with LLM-based data generation tools, or by enhancing it with ML approaches that detect hallucinations and improve LLM output.

Sudhir Shetty, Jun 08 2026.

† Reference

[2025] Large Language Models for Software Testing: A Research Roadmap