An End-to-End System for Generating Frontends from Sketches with LLMs

Written by microfrontend | Published 2026/01/21
Tech Story Tags: generative-ai | intent-based-interfaces | human-computer-interaction | ai-user-interfaces | generative-ui | claude-sonnet-frontend | ai-workflow-automation | ai-task-abstraction

TL;DR: Frontend Diffusion is an end-to-end AI system that transforms UI sketches and themes into high-quality frontend code using structured PRDs and iterative LLM refinement.

ABSTRACT

1 INTRODUCTION

2 SYSTEM DESIGN

3 RESULTS, DISCUSSION AND REFERENCES

SYSTEM DESIGN

We introduce Frontend Diffusion, an end-to-end, LLM-powered tool for generating high-quality frontend code, spanning from a sketching canvas to website previews. As outlined in the introduction, the frontend generation task progresses through three stages: sketching, writing, and coding. Our system uses the Claude 3.5 Sonnet language model (Sonnet) for all text and code generation.

While Claude represents one of the most advanced language models as of July 2024, we anticipate rapid developments in Generative AI. Therefore, the task transition techniques described herein are designed to be model-agnostic, ensuring their applicability to future, more advanced Generative AI models.
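One way to realize this model-agnostic design is to have every pipeline stage depend on a minimal text-generation interface rather than a vendor SDK. The sketch below illustrates the idea with a Python `Protocol`; the `ClaudeSonnetGenerator` adapter and its method names are hypothetical stand-ins, not code from the paper (a real adapter would call the Anthropic SDK).

```python
from typing import Optional, Protocol


class TextGenerator(Protocol):
    """Minimal interface any backing language model must satisfy."""

    def generate(self, prompt: str, image_jpg: Optional[bytes] = None) -> str:
        ...


class ClaudeSonnetGenerator:
    """Hypothetical adapter for Claude 3.5 Sonnet (stubbed for illustration).

    In production this would call the Anthropic API; here it just echoes
    the prompt so the wiring can be demonstrated.
    """

    def generate(self, prompt: str, image_jpg: Optional[bytes] = None) -> str:
        return f"[model output for prompt: {prompt[:30]}...]"


def run_stage(model: TextGenerator, prompt: str) -> str:
    # Stages depend only on the Protocol, so a newer, more capable model
    # can be swapped in later without touching any stage logic.
    return model.generate(prompt)
```

Because each stage accepts any `TextGenerator`, replacing Sonnet with a future model is a one-line change at the call site.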

2.1 Sketching: Visual Layout Design and Theme Input

The system’s initial phase presents a graphical user interface with two key components: a canvas panel for sketching the envisioned website layout, and a prompt panel for a textual description of the website theme. Once the sketch and theme input are complete, the user starts code generation via the "Generate" button.

The system then converts the sketch into SVG format, followed by a subsequent transformation into JPG format. This two-step conversion was adopted after our tests showed that language models process images in JPG format more reliably than images in SVG format.
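The first half of that conversion can be sketched as serializing the canvas strokes into an SVG document; the helper name and stroke representation below are assumptions for illustration, not the paper's implementation. The second half (rasterizing SVG to JPG) would use an external renderer, noted in the comment.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def strokes_to_svg(strokes: Iterable[Iterable[Point]],
                   width: int = 800, height: int = 600) -> str:
    """Serialize canvas strokes as an SVG document (first conversion step)."""
    polylines = []
    for stroke in strokes:
        pts = " ".join(f"{x},{y}" for x, y in stroke)
        polylines.append(
            f'<polyline points="{pts}" fill="none" '
            f'stroke="black" stroke-width="2"/>'
        )
    body = "\n  ".join(polylines)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">\n  {body}\n</svg>'
    )

# The second step -- rasterizing the SVG to JPG before sending it to the
# language model -- would rely on an external renderer (an assumption here),
# e.g. cairosvg.svg2png(...) followed by Pillow's Image.save(..., "JPEG").
```

Keeping the vector form around also makes re-rendering at different resolutions cheap if a model later prefers a different input size.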

2.2 Writing: Product Requirements Document Generation

This phase transforms the user’s visual and textual inputs into a structured document, referred to as the Product Requirements Document (PRD), which serves as a blueprint for the website’s development process. The PRD generation process leverages Sonnet. To enhance the visual appearance of the generated websites, the system integrates the Pexels API for image retrieval.

The language model is specifically prompted to include image terms and size descriptions (e.g., [school(large)]). These descriptors are subsequently utilized to query the Pexels API, which returns relevant image URLs for incorporation into the PRD.
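A minimal sketch of this descriptor-to-query step, under the assumption that sizes are drawn from a small vocabulary like small/medium/large (only `large` appears in the paper's example). The function names and exact query parameters are illustrative; a real request against the Pexels search endpoint would also carry an API key header and parse photo URLs from the JSON response.

```python
import re

# Matches descriptors of the form [term(size)], e.g. [school(large)].
DESCRIPTOR_RE = re.compile(r"\[([^\[\]()]+)\((small|medium|large)\)\]")


def extract_image_descriptors(prd_text: str) -> list:
    """Pull (search term, size) pairs like [school(large)] out of PRD text."""
    return DESCRIPTOR_RE.findall(prd_text)


def pexels_query_url(term: str, size: str) -> str:
    # Sketch of the lookup URL only; the actual request would include an
    # Authorization header with the Pexels API key.
    return f"https://api.pexels.com/v1/search?query={term}&size={size}&per_page=1"
```

The returned image URLs can then be substituted back into the PRD in place of the bracketed descriptors before the coding phase begins.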

2.3 Coding: Website Generation and Iterative Refinement

The coding phase of the system consists of two primary components: (1) Initial code generation: the system utilizes the generated PRD and the original user prompt as inputs for code generation, employing Sonnet to produce the initial website code; (2) Iterative refinement: the system implements an iterative refinement process to automatically enhance the generated website with richer functionality and reduced flaws.

This process involves analyzing the initial code to generate optimization suggestions, merging these suggestions with the original theme, and utilizing the enhanced theme along with the previously generated PRD to regenerate the code. The system executes this iterative refinement process multiple times (by default, n=4). Users can navigate between iterations by selecting preview thumbnails displayed at the interface’s bottom, and can access or copy the generated code for each version.

This paper is available on arxiv under CC BY 4.0 DEED license.
