Author manuscript; available in PMC: 2025 Sep 12.
Published before final editing as: Environ Plan B Urban Anal City Sci. 2025 Jan 27:10.1177/23998083251316064. doi: 10.1177/23998083251316064

Generating conceptual landscape design via text-to-image generative AI model

Xinyue Ye a,*, Tianchen Huang a, Yang Song a, Xin Li a, Galen Newman a, Dayong Jason Wu b, Yijun Zeng c
PMCID: PMC12424439  NIHMSID: NIHMS2104924  PMID: 40949112

Abstract

This study explores the integration of text-to-image generative AI, particularly Stable Diffusion, in conjunction with ControlNet and LoRA models in conceptual landscape design. Traditional methods in landscape design are often time-consuming, limited by the designer’s individual creativity, and inefficient in exploring diverse design solutions. By leveraging AI tools, we demonstrate a workflow that efficiently generates detailed and visually coherent landscape designs, including natural parks, city plazas, and courtyard gardens. Through both qualitative and quantitative evaluations, our results indicate that fine-tuned models produce superior designs compared to non-fine-tuned models, maintaining spatial consistency, control over scale, and relevant landscape elements. This research advances the efficiency of conceptual design processes and underscores the potential of AI in enhancing creativity and innovation in landscape architecture.

Keywords: Text-to-image, generative AI, conceptual design, generative design, stable diffusion, landscape design

Introduction

The importance of conceptual design

The landscape design process includes steps like client consultation, site analysis, conceptual design, design development, construction documentation, bidding, construction, and maintenance planning (Norton and Brouwer, 2009). Each step ensures that the design meets esthetic, functional, and ecological goals (Reid, 2007). The conceptual design phase is crucial as it translates client needs and site characteristics into a visual plan, guiding future decisions (Booth, 2012). This phase fosters creativity and addresses challenges like sustainability (Thompson and Sorvig, 2000; Yelavich and Adams, 2014). It also influences material and plant selection, ensuring alignment with design goals (Andreasen et al., 2015; Suthersan et al., 2016), and facilitates a smooth transition to construction (Pressman, 2012).

Traditional conceptual design and limitations

Conceptual designs are often presented as sketches or renderings, crucial for conveying the proposed layout, features, and overall esthetic of the landscape (Thompson and Sorvig, 2000). Sketches allow for quick idea exploration, while renderings provide a more detailed and realistic depiction (Yee, 2012). In this phase, designers explore layout options, aesthetics, and materials to balance visual appeal with functionality (Bell, 2019; Reid, 2007). The design is then presented to the client, facilitating a discussion on the vision and allowing for collaborative refinement (Souter-Brown, 2014). The conceptual design process typically involves multiple iterations, with designers engaging in ongoing dialogue with clients. This iterative approach helps designers better understand the project and progressively align their designs with the client’s needs. Current methods of conceptual design have several limitations. Traditional approaches can be time-consuming and resource-intensive, with extensive research, manual sketching, and multiple revisions prolonging the design process and increasing costs (Buxton, 2010; Reid, 2007; Yelavich and Adams, 2014). The quality and innovation of designs are often constrained by the individual designer’s expertise, potentially limiting exploration of alternative solutions (Salter and Gann, 2003). Additionally, traditional methods may struggle to effectively visualize complex ideas, leading to communication challenges with clients or stakeholders and possible misunderstandings (Pressman, 2012; Ye et al., 2024).

Related work

Integration of generative artificial intelligence (AI) in landscape research

The field of landscape architecture has been revitalized by integrating artificial intelligence (AI) and machine learning (ML) technologies (Fernberg and Chamberlain, 2023; Stupariu et al., 2022). These technologies automate repetitive tasks, explore a broader range of design solutions, and provide novel insights into complex spatial problems, often surpassing traditional methods (Kingma and Welling, 2022). Generative AI, increasingly popular in the design industry and research, uses deep learning models to create new content similar yet distinct from its training data (Fui-Hoon Nah et al., 2023; Mandapuram et al., 2018; Rane, 2023). Models like Generative Adversarial Networks (GANs) leverage neural networks with multiple layers to learn complex data patterns (Creswell et al., 2018; LeCun et al., 2015; Ye et al., 2022). GANs have revolutionized landscape design by generating innovative proposals, such as park layouts that match human creativity, and enhancing data augmentation (Chen et al., 2023). Additionally, GANs accelerate environmental performance-driven design in urban settings, predicting factors like wind, solar radiation, and thermal comfort in real-time, significantly improving efficiency and effectiveness over traditional methods (Huang et al., 2022).

Convolutional neural networks (CNNs) are deep neural networks primarily used for analyzing visual imagery and are recognized for their pattern recognition capabilities (Gonzalez, 2018). By establishing a database of urban space cases and utilizing CNNs, researchers generated urban design elements such as traffic road networks and neighborhood layouts, demonstrating feasibility in an application case in the northern extension of the city’s central green axis (Wan and Shi, 2021; Ye et al., 2021). In another study, the CAIN-GAN framework was introduced to enhance automated site planning by integrating domain knowledge for context-aware design solutions. Applied in New York City, CAIN-GAN effectively generated sustainable and tailored urban plans, facilitating more performative urban design solutions (Jiang et al., 2024).

Variational autoencoders (VAEs), generative models used in landscape design and research, have proven effective in generating new data similar to input data, making them useful for image generation and reconstruction (Goodfellow et al., 2014). For example, Xu et al. (2021) introduced BlockPlanner, a generative model that creates city blocks using a vectorized graph representation to capture global and local structures, ensuring valid block generation and enabling new applications like topology refinement. Danhaive and Mueller (2021) presented a method for structural design space exploration using performance-conditioned generative modeling with conditional VAEs, enhancing designers’ ability to explore diverse, high-performing structural concepts through an intuitive low-dimensional latent space.

Diffusion models, a class of generative models, create high-quality images by reversing a diffusion process involving the gradual mixing of substances due to random motion (Yang et al., 2023). Latent diffusion models (LDMs), operating in the latent space of pre-trained autoencoders, reduce computational requirements while maintaining high visual fidelity for tasks like landscape image inpainting and class-conditional image synthesis (Rombach et al., 2022). Li and Li (2024) introduced a daylight-driven AI-aided architectural design method using diffusion models, generating architectural massing models with random parameters and determining window layouts based on daylight, supporting architects’ creative processes.

However, these deep learning models have several limitations. First, they often lack user-friendly interfaces, requiring users to write code, which is a barrier for designers without programming skills. Second, the results tend to have limited generalization; generating rare objects or adopting different design styles often requires extensive retraining, consuming significant time and resources. Lastly, these models are typically tailored to specific problems within particular design domains. Applying them across various fields or conditions requires substantial modifications, limiting their flexibility and adaptability.

Recent technological advancements have significantly popularized text-to-image generation AI tools, which create images from text by combining natural language processing with computer vision for coherent, contextually relevant visuals (Oppenlaender, 2022). These tools, based on deep learning models such as GANs, VAEs, Diffusion Models, and Transformer architectures, offer distinct advantages. They are highly accessible, featuring user-friendly interfaces that require no programming knowledge, ideal for designers with limited coding skills, and democratizing the design process (Kotturi et al., 2024). They also speed up the design process by automating visual content creation, enabling rapid prototyping and iteration (Lamac, 2023). Moreover, by leveraging large pre-trained models, these tools generate diverse, high-quality images without the need for extensive training, saving time and computational resources (Zhang et al., 2023).

Stable Diffusion, MidJourney, and DALL-E 3, three mainstream text-to-image generative AI tools, have found widespread applications in architecture, urban, and landscape design (Aničin and Stojmenović, 2023). Phillips et al. (2024) evaluated the performance of DALL-E 2, Stable Diffusion, and MidJourney in generating urban design imagery from scene descriptions, revealing significant differences in their ability to depict common and unique urban design elements. These findings underscore their potential in early design stages for rapid ideation and visual brainstorming. Kim et al. (2024) proposed a method for synthesizing artistic landscape sketches using Stable Diffusion and ControlNet, which processes three-channel perspective maps to enhance sketch quality, supporting both text-to-sketch and image-to-sketch generation. Additionally, the application of DALL-E 2, MidJourney, and Stable Diffusion in architectural design has been explored, highlighting their potential to boost creativity and innovation. Hanafy (2023) analyzed 40 million MidJourney inquiries, identifying patterns and architectural keywords, demonstrating how these tools can enhance the conceptual design process with their user-friendly interfaces and ability to generate diverse images. These tools are crucial for designers in rapidly prototyping ideas and creating new images, offering a digital canvas for exploring imaginative concepts (Hoşer and Köymen, 2023; Kulkarni et al., 2023). They provide several advantages over other generative AI tools. First, their user-friendly interfaces make them accessible to a wider audience, including those without deep technical expertise in AI. Second, their ability to generate images quickly enhances productivity by enabling faster turnaround times in creative projects (Derevyanko and Zalevska, 2023). 
Finally, their capability to interpret and generate images from textual descriptions opens new avenues for creative expression, allowing users to guide the image generation process intuitively and descriptively (Hakimshafaei, 2023; Paananen et al., 2023).

Literature gaps and study objectives

With the notable advancements in state-of-the-art text-to-image generative AI tools (such as Stable Diffusion, Midjourney, and DALL-E 3), existing literature has applied them to certain physical design procedures. However, significant constraints persist when these tools are used for conceptual landscape design. While these tools excel at producing realistic images, they can encounter difficulties in generating highly accurate and detailed images, especially when confronted with complex or ambiguous verbal prompts (Cao et al., 2023). These issues are exacerbated by ambiguity in prompt descriptions, which, in turn, reduces the diversity of output interpretations (Wu et al., 2023). Further, achieving precise control over the generated images, which is essential for changing specific design aspects, presents additional challenges within the intrinsic parameters of the model (Du et al., 2023). These technologies may occasionally fail to capture the intricate details of stylistic and esthetic expression, which could potentially undermine the integrity of the design (Bendel, 2023).

Some researchers have proposed solutions for applying text-to-image generative AI tools in design. Li et al. (2024) presented a method that leverages generative AI to quickly create conceptual floorplans and 3D models from sketches, emphasizing controllable image structure generation by sketch. Another study by Lee et al. (2024) discussed a novel approach to architectural visualization using generative AI, particularly text-to-image technology, to enhance efficiency and personalization in design visualization. This approach highlights the integration of various architectural styles and the potential of AI in user-centered design using custom-trained datasets.

While current studies have made progress in applying generative AI to specific aspects of architecture, urban, and landscape image generation, they often lack a holistic approach that fully leverages text-to-image generative AI for the entire conceptual landscape design workflow. Additionally, these methods typically require extensive data collection for model training, often involving hundreds or thousands of images, which is time-consuming and resource-intensive (Boudier et al., 2024; Han and Chen, 2024; Zhang et al., 2024; Zhang et al., 2024). This not only increases the workload but also demands significant computational power, making it challenging to efficiently achieve high-quality conceptual landscape designs, even with advanced generative AI techniques. As such, this research explores two key questions:

· How can a workflow be developed to effectively use text-to-image generative AI in conceptual landscape design?

· How can we speed up data collection and model training in text-to-image generative AI for conceptual landscape design?

Our conceptual landscape design workflow begins by selecting the type of landscape design to study, using natural parks, city plazas, and courtyard gardens as case studies. We then create simple spatial layout sketches and gather a minimal set of reference images for the training dataset. Using Stable Diffusion and related tools, the model is trained on this small dataset, with parameters adjusted to control the image generation process. The generated images are then evaluated both quantitatively, using PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity Index Measure), and LPIPS (Learned Perceptual Image Patch Similarity), and qualitatively, based on layout, function, and graphics. This research contributes to the literature by offering a comprehensive workflow for conceptual landscape design using text-to-image generative AI, covering the entire process from concept and sketching to model training, image generation, and evaluation. By using a very small training set (fewer than 10 images per landscape type) and a short training time (about 20 minutes on a personal computer), we demonstrate that effective results can be achieved with significantly reduced time and effort. This efficiency is particularly advantageous in the early stages of conceptual design, where quick iterations are crucial, underscoring the significant benefits of AI in the design process.

Methods

Model implementation

This study utilized Stable Diffusion, along with its ControlNet plugin and self-trained LoRA models, for conceptual landscape designs. Based on a Latent Diffusion Model (LDM), Stable Diffusion removes Gaussian noise using denoising autoencoders (Rombach et al., 2022). While primarily used for generating detailed graphics from text descriptions, it also supports inpainting, outpainting, and image-to-image conversions (Rombach et al., 2022). Unlike DALL-E 3 and Midjourney, Stable Diffusion is open-source, allowing extensive customization, enhanced functionality, and a more user-friendly interface, making it easier to learn and apply.

The ControlNet framework applied in this research is a neural network architecture that controls the behavior of diffusion models through the incorporation of additional constraints. In our approach, we use two copies of the neural network blocks: one is designated as “locked,” retaining the initial model weights, while the other is referred to as “trainable,” tasked with acquiring the desired conditions (Zhang and Agrawala, 2023). Users can provide a picture as the frame for the generated image via the ControlNet plugin in Stable Diffusion. Figure 2S illustrates the difference between images generated with and without ControlNet.
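The locked/trainable design can be sketched in a few lines of PyTorch. This is a deliberately simplified illustration of the idea described above, not the actual ControlNet implementation; the class name and block structure are our own:

```python
import copy

import torch
import torch.nn as nn

class ControlledBlock(nn.Module):
    """Simplified sketch of ControlNet's two-copy design (illustrative only).

    The 'locked' copy keeps the pre-trained weights frozen; the 'trainable'
    copy learns the conditioning signal. Its output is injected through a
    zero-initialized 1x1 convolution ('zero convolution'), so at the start
    of training the combined block behaves exactly like the original model.
    """

    def __init__(self, base_block: nn.Module, channels: int):
        super().__init__()
        self.trainable = copy.deepcopy(base_block)   # learns the condition
        self.locked = base_block                     # pre-trained weights
        for p in self.locked.parameters():
            p.requires_grad = False                  # frozen
        self.zero_proj = nn.Conv2d(channels, channels, kernel_size=1)
        nn.init.zeros_(self.zero_proj.weight)        # "zero convolution"
        nn.init.zeros_(self.zero_proj.bias)

    def forward(self, x: torch.Tensor, condition: torch.Tensor) -> torch.Tensor:
        # Zero-initialized projection => no effect until training updates it.
        return self.locked(x) + self.zero_proj(self.trainable(x + condition))
```

Because the projection starts at zero, an untrained `ControlledBlock` reproduces the base model's output exactly, which is what makes this training scheme stable.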

Although all generated images include the elements specified in the prompt—“park corner with trees, shrubs, grasses, pavements, and walls”—the image generated with ControlNet closely follows the spatial structure of the initial input image, demonstrating some control over the design generation. In this research, we also fine-tuned the pre-trained Stable Diffusion model. Fine-tuning involves further training a pre-trained model on a more specific dataset to adapt it for a particular task, allowing the model to produce more specialized and accurate images by leveraging its existing knowledge and structure (Sun et al., 2019). This approach enhances model performance in specific domains without the need for full retraining (Kwon et al., 2023). LoRA (Low-Rank Adaptation) is one of the most effective fine-tuning methods for Stable Diffusion. Given the large size of these models, full retraining is computationally intensive, but LoRA introduces low-rank matrices that modify the pre-trained model’s weights. Instead of updating all the weights in a neural network layer, only these matrices are updated, significantly reducing the number of parameters to be trained and making the fine-tuning process more efficient and resource-friendly (Smith et al., 2023). LoRA also allows for excellent results with a very small training dataset, often fewer than 10 images, minimizing the time and effort required for data collection and model training, further enhancing the efficiency of the fine-tuning process.
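The low-rank update behind LoRA can be illustrated with a minimal numerical sketch. The matrix dimensions and rank below are illustrative assumptions, not the values used in this study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-trained weight matrix of a single layer (dimensions are illustrative).
d = 768
W = rng.standard_normal((d, d))

# LoRA: instead of updating all d*d weights, learn two low-rank factors
# B (d x r) and A (r x d) with rank r << d; the adapted weight is W + B @ A.
r = 8                                  # illustrative rank
B = np.zeros((d, r))                   # zero-initialized -> no change at start
A = rng.standard_normal((r, d)) * 0.01

W_adapted = W + B @ A                  # identical to W before any training step

full_params = d * d                    # 589,824 weights updated by full fine-tuning
lora_params = d * r + r * d            # 12,288 weights updated by LoRA (~2%)
```

Only `B` and `A` receive gradient updates during fine-tuning, which is why LoRA trains quickly even on small datasets and modest hardware.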

Data collection and processing

Landscape design includes categories like residential gardens, commercial landscapes, public parks, and environmental restoration areas (Stiles, 1994). In this research, we focus on three examples: natural park (environmental restoration), city plaza (public parks), and courtyard garden (residential gardens). We prepared a total of 18 images: 5 for the natural park, 7 for the city plaza, and 6 for the courtyard garden. The small size of the training datasets is notable for a deep learning project. All images had similar dimensions and were sourced from Pinterest, a visual search engine that allows users to quickly find high-quality master plans with similarities, aiding the training process.

Method of conceptual design generation

For this study, our approach aims to implement conceptual landscape designs of different types, minimize human intervention, and retain the desired spatial structure. Recognizing this, we implemented Stable Diffusion in conjunction with the ControlNet plug-in and LoRA models. This approach involved training LoRA models on lightweight datasets and overlaying them on a pre-trained Stable Diffusion model to generate the desired landscape designs. The process was organized in the following manner:

  1. An image (Figure 3S) containing basic structural lines is input into ControlNet, serving as the foundational spatial structure.

  2. Training is conducted on three datasets: the natural park, city plaza, and courtyard garden. Initial pre-processing automatically tags these images with labels indicating the elements they contain (Table 1). The subsequent training is carried out via LoRA, producing a fine-tuned model for each kind of landscape design.

  3. For Stable Diffusion prompts, we use simple prompts for each of the three kinds of landscape design:
    · Bird’s eye view, a natural park
    · Bird’s eye view, a city plaza
    · Bird’s eye view, a courtyard garden
  4. Both Stable Diffusion and ControlNet have a variety of adjustable parameters, and different combinations can generate completely different results. An essential parameter within ControlNet is “weight,” ranging from 0 to 2, which determines the level of variation in the design. Figure 4S exemplifies the results of image generation across different weight gradations. A smaller weight allows more randomness, increasing the degree of variation, whereas a larger weight restricts variation. The appropriate level of variation depends on the designer’s personal preference and practical experience.

  5. For Stable Diffusion and LoRA, we generated two sets of images for each landscape design category. The first set was produced using only the pre-trained Stable Diffusion model, without applying any fine-tuning. The second set was generated using the pre-trained Stable Diffusion model combined with the LoRA fine-tuned model. This approach allowed us to compare the differences in output quality between the fine-tuned and non-fine-tuned models, evaluating how LoRA fine-tuning enhanced the generated results.

  6. After experimenting with different combinations of parameters multiple times, we arrived at the desired conceptual designs. To provide a more comprehensive depiction of the findings, we generated a larger number of images and then selected those that most effectively represented the outcomes. Consequently, a total of 20 images were created for each design category, including both the fine-tuned and non-fine-tuned variations, and the 4 most representative images were then chosen from each collection. To compare the designs in terms of scale, the prompt was the only parameter that differed among the three landscape categories. Consistent parameters across these groups included the following: Steps (20), Sampler (DDIM), CFG Scale (10), Size (768 × 512), Model hash (d289dfa4ed), Model (miaoshouai.com), ControlNet (Enabled: True, Module: hed, Model: control_hed-fp16 [13fee50b], Weight: 0.5, Guidance Start: 0, Guidance End: 1).

  7. The last stage of the study was a quantitative and qualitative comparison between designs created using the pre-trained model combined with custom-trained LoRA models and designs generated using the pre-trained model alone.
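The generation settings listed in step 6 can be sketched with the open-source `diffusers` library. The base checkpoint name and the LoRA file path below are placeholders for illustration, not the authors' actual models:

```python
# Hypothetical reconstruction of the generation step using `diffusers`.
# Checkpoint names and the LoRA path are placeholders, not the study's models.
def build_pipeline():
    from diffusers import (ControlNetModel, DDIMScheduler,
                           StableDiffusionControlNetPipeline)

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-hed")            # ControlNet Module: hed
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",          # placeholder base checkpoint
        controlnet=controlnet,
    )
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)  # Sampler: DDIM
    pipe.load_lora_weights("lora/natural_park.safetensors")  # placeholder LoRA file
    return pipe

# Parameters held constant across the three landscape categories (step 6).
GENERATION_PARAMS = dict(
    num_inference_steps=20,              # Steps: 20
    guidance_scale=10.0,                 # CFG Scale: 10
    width=768,
    height=512,                          # Size: 768 x 512
    controlnet_conditioning_scale=0.5,   # ControlNet Weight: 0.5
)

# Usage (requires the model weights to be downloaded; `structure_sketch`
# stands for the Figure 3S input image):
# image = build_pipeline()("Bird's eye view, a natural park",
#                          image=structure_sketch, **GENERATION_PARAMS).images[0]
```

Changing only the prompt string while keeping `GENERATION_PARAMS` fixed reproduces the controlled comparison across the three landscape categories.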

Table 1.

Example generated labels for each dataset after pre-processing.

Dataset An example of generated labels after pre-processing

Dataset1 - natural park architecture, bridge, building, bush, car, castle, chain-link_fence, city, city_lights cityscape, day, fence, garden, gate, grass, ground_vehicle, house, lamppost, motor_vehicle, no_humans, outdoors, real_world_location, river, road, rooftop, scenery, sky, skyscraper, street, sunlight, tower, town, traditional_media, tree
Dataset2 - city plaza architecture, bridge, building, car, city, cityscape, day, house, no_humans, outdoors, overgrown, pavement, post-apocalypse, railing, real_world_location, road, scenery, skyscraper, street, tower, town, tree
Dataset3 - courtyard garden basket, bowtie, daisy, door, faux_figurine, fence, flower, garden, grass, hydrangea, no_humans, outdoors, painting_\(medium\), plant, potted_plant, purple_flower, sunflower, traditional_media, tree_stump, watercolor_\(medium\), watering_can, window

Criteria for evaluating generative results

The qualitative method for assessing the generative designs used a customized rubric (Table 2), derived by integrating criteria outlined in the publications “Basic Elements of Landscape Architectural Design” and “Form and Fabric in Landscape Architecture: A Visual Introduction” (Dee, 2001; Holden, 1984).

Table 2.

Rubric for evaluating landscape design

Aspect | Criteria | 1 - Low | 2 - Moderate | 3 - High

Layout | a. Circulation and accessibility | Poor circulation and accessibility | Adequate circulation and accessibility | Excellent circulation and accessibility
Layout | b. Spatial organization | Incoherent spatial organization and hierarchy | Adequate spatial organization and hierarchy | Exceptional spatial organization and hierarchy
Function | a. User needs and requirements | Does not meet user needs and requirements | Adequately meets user needs and requirements | Excellently meets user needs and requirements
Function | b. Flexibility and adaptability | Inflexible design, unable to adapt to changing needs and usage patterns | Moderately flexible design, adaptable to changing needs and usage patterns | Exceptionally flexible design, seamlessly adaptable to changing needs and usage patterns
Graphics | a. Visual quality and aesthetics | Unappealing visual composition, poor use of form, texture, color, and scale | Visually pleasing, balanced use of form, texture, color, and scale | Exceptionally appealing visual composition, masterful use of form, texture, color, and scale
Graphics | b. Clarity and legibility | Confusing, illegible graphics | Adequate clarity and legibility in graphics | Exceptionally clear, highly legible graphics

Also, we employed three quantitative methods, PSNR, SSIM, and LPIPS, to evaluate the quality of the generated images. Because the generated images were divided into two groups, fine-tuned and non-fine-tuned, we compared both sets against the training dataset, which served as the reference. This method allowed us to determine how closely the outputs from each model aligned with the target characteristics captured in the training dataset.

PSNR (Peak Signal-to-Noise Ratio) is a metric that quantifies the difference between two images by measuring the peak error. Higher PSNR values indicate that the images are more similar, implying less distortion in the generated image (Huynh-Thu and Ghanbari, 2008). This metric is particularly useful for comparing the technical accuracy of the images produced by the two different models.

SSIM (Structural Similarity Index Measure) focuses on evaluating the structural similarity between two images. Unlike PSNR, which assesses pixel-level differences, SSIM considers changes in luminance, contrast, and texture (Wang et al., 2004). This makes SSIM valuable for understanding how well the fine-tuned model preserves the overall structural integrity and perceptual quality of the images.

LPIPS (Learned Perceptual Image Patch Similarity) measures the perceptual similarity between images based on deep neural network features. Unlike PSNR and SSIM, LPIPS aligns more closely with human visual perception, making it ideal for evaluating how visually appealing and realistic the generated images are (Zhang et al., 2018). This metric is particularly relevant in assessing the effectiveness of fine-tuning in generating high-quality, perceptually accurate landscape designs.
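As an illustration, PSNR can be computed directly from its definition; SSIM and LPIPS are typically taken from existing libraries. The package choices named in the comments are assumptions, not necessarily the authors' implementation:

```python
import numpy as np

def psnr(reference: np.ndarray, generated: np.ndarray, max_val: float = 255.0) -> float:
    """PSNR = 10 * log10(MAX^2 / MSE); higher values mean the generated
    image is closer to the reference."""
    mse = np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# SSIM and LPIPS are best taken from existing implementations, e.g.
#   from skimage.metrics import structural_similarity
#   ssim = structural_similarity(reference, generated, channel_axis=-1)
#   import lpips; dist = lpips.LPIPS(net="alex")(ref_tensor, gen_tensor)
# (these package choices are assumptions, not necessarily the authors' setup).

# Example: a lightly degraded copy of a reference "image" scores a high PSNR.
rng = np.random.default_rng(1)
reference = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
generated = np.clip(reference + rng.normal(0.0, 5.0, reference.shape), 0, 255)
score = psnr(reference, generated)   # around 34 dB for Gaussian noise of std 5
```

In this study's setup, each generated image would be scored against the training-set reference images, and the fine-tuned and non-fine-tuned groups compared by their metric averages.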

Results

Training outcomes

Figures 5–7 elucidate the designs generated by Stable Diffusion for natural parks, city plazas, and courtyard gardens. Using a standard personal computer equipped with an NVIDIA GeForce RTX 2070 GPU, a 6-core Intel® Core i7-9750H CPU, and 16 GB RAM, each LoRA fine-tuned model required approximately 20 minutes of training. Remarkably, this 20-minute training period on a personal computer is very short for a deep learning model, especially compared to the hours or even days required on multiple more advanced GPUs in some other deep learning projects. This brevity is particularly beneficial for conceptual landscape design, as it allows designers to quickly achieve usable results, drastically improving efficiency. Moreover, image generation took only about 5–10 seconds per image. The ability to swiftly move from training to generating high-quality images underscores the potential of this automated technique to significantly accelerate what is traditionally a labor-intensive process, providing a valuable tool for designers to expedite their creative workflows.

Generative design description and qualitative evaluation

The AI-generated landscape designs with fine-tuning appear to have more complexity and detail compared to those without fine-tuning. The conceptual designs with fine-tuning also have more defined features, such as distinct paths and water features. The fine-tuned natural park designs include a variety of water features (Figure 5a and 5b) and pathways (Figure 5b–5d), which together create a dynamic and immersive experience for visitors, fostering a feeling of movement and discovery. The city plaza conceptual designs integrate water features, trees, plants (Figure 6a–6d), sitting places (Figure 6a and 6b), and attractive elements such as fountains (Figure 6a and 6b), offering an abundance of landscape elements and improved esthetic appeal. The courtyard garden conceptual designs include elements such as water features, trees, bushes, flowers, pavement, and sitting places (Figure 7a–7d); these designs attempt to provide a feeling of enclosure and seclusion. On the contrary, most of the conceptual landscape designs created without fine-tuning exhibit a deficiency in complexity and detail compared to those that underwent fine-tuning. The natural park without fine-tuning has fewer water features (see Figure 5f–5h), and the vegetation is sparse and of limited diversity (see Figure 5e–5g). The city plaza lacks compelling characteristics and gives off a sense of monotony (see Figure 6e–6h). The courtyard garden without fine-tuning lacks both diversity and esthetic features (see Figure 7e–7h).

The conceptual landscape designs of natural parks, city plazas, and courtyard gardens are assessed according to the criteria outlined in Table 2. This qualitative evaluation is conducted for both scenarios, with and without fine-tuning. In terms of layout, designs with fine-tuning usually have better circulation and accessibility, with paved paths accessible to people of all ages and a clear circulation pattern that can guide visitors through the space. They also generally have better spatial organization, with features located as focal points as well as radiating paths and spaces that create a sense of movement and visual interest. In terms of function, designs with fine-tuning usually include facilities and space for activities and events. Moreover, they show better flexibility and adaptability, including spaces that can be used for a variety of purposes. When considering graphics, designs with fine-tuning exhibit much better visual quality and aesthetics through the integration of various textures and materials. Additionally, the use of natural elements like wood, stone, and flora appears to establish a sense of connection to the natural world. Further, a higher level of clarity and legibility is achieved through well-defined circulation patterns and distinct zoning for various tasks in the fine-tuned outputs. In general, designs with fine-tuning perform much better than those that do not undergo fine-tuning.

Spatial consistency, scale, and generation content control

Each generative design in the three categories adheres to the spatial structure of the initial input image (Figure 3S). A circular feature sits at the center of all schemes, realized variously as ponds (Figure 5a–e), fountains (Figure 6a–d), pavements (Figure 6f and h), shrubs (Figure 7a–d and g), trees (Figures 5g, 6c, and 6g), and grasslands (Figures 5h and 7e). Whatever their specific content, these features uniformly maintain a rounded shape and integrate seamlessly within their surrounding environments. Moreover, a clearly visible line stretching from the northeast to the southwest remains consistent across all designs; it mostly appears as a pathway, enhancing accessibility within the site. These consistent features indicate the importance of ControlNet, which both guides the overall spatial organization and facilitates the seamless integration of landscape elements within their surroundings.

All the fine-tuned generative designs in the three categories exhibit characteristic features and relevant landscape elements, and the dimensions of the generated schemes resemble those of the corresponding training datasets. While the terms “natural park,” “city plaza,” and “courtyard garden” may partially convey the concept of size, it is noteworthy that all generated schemes within the same category exhibit similar scales. Comparing the images created with and without fine-tuning across the three categories reveals notable disparities, particularly in size and park classification. The natural parks without fine-tuning deviate from the expected characteristics of a typical natural park; some of the outputs instead resemble an aerial view of a mountain, as seen in Figure 5f–h. The non-fine-tuned outputs are also comparatively smaller in scale than the fine-tuned ones, as indicated by the size of the trees in Figure 5e, f, and h. For the generative city plaza, one image without fine-tuning depicts a rooftop garden instead of a traditional city plaza (Figure 6f). For the courtyard garden, all images without fine-tuning present a larger size than those with fine-tuning, along with a lower plant density, deviating from the characteristics of the LoRA training dataset (Figure 7e–h). Hence, relying on prompts alone can deliver schemes that deviate from the intended assumptions and are generated at an arbitrary scale. The personalized models we trained with LoRA played a significant role in regulating the scale of the generated outputs, indicating that LoRA fine-tuning can enhance the precision of design creation.
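The generation step underlying these results (a pretrained Stable Diffusion base, a ControlNet conditioned on the input sketch, and a category-specific LoRA) can be sketched with the Hugging Face `diffusers` library. This is an illustrative reconstruction rather than the authors' exact configuration: the model identifiers, the LoRA file name, the prompt template, and the conditioning scale are all assumptions.

```python
# Illustrative sketch (not the authors' exact setup) of combining a pretrained
# Stable Diffusion base, a ControlNet conditioned on a designer's sketch, and a
# category-specific LoRA, using the Hugging Face `diffusers` library.
# Model IDs, the LoRA file name, and the prompt template are assumptions.

def build_prompt(category: str, elements: list[str]) -> str:
    """Compose a text prompt from a design category and desired landscape elements."""
    return f"bird's eye view, conceptual design of a {category}, " + ", ".join(elements)

def generate_design(sketch_path: str, category: str, elements: list[str]):
    # Heavy imports are deferred so build_prompt() works without torch/diffusers.
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")
    # Hypothetical LoRA file produced by fine-tuning on a handful of reference images.
    pipe.load_lora_weights(".", weight_name="natural_park_lora.safetensors")

    result = pipe(
        prompt=build_prompt(category, elements),
        image=Image.open(sketch_path).convert("RGB"),
        controlnet_conditioning_scale=1.0,  # strength of adherence to the sketch
        num_inference_steps=30,
    )
    return result.images[0]
```

Varying `controlnet_conditioning_scale` corresponds to the ControlNet weight experiment illustrated in Figure 4S: at 0 the sketch is ignored, while large values enforce its geometry rigidly.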

Quantitative evaluation of generated designs

In this study, we used the training datasets as the reference for comparison in our quantitative evaluations; they provide a baseline for assessing how well the generated images align with the desired characteristics of the conceptual landscape designs. Table 3 shows the results. A higher PSNR value indicates better image quality, as it reflects less distortion, and a higher SSIM value signifies that the generated image preserves essential structural features. Conversely, a lower LPIPS value is desirable because it indicates that the generated image is perceptually closer to the reference, as judged by deep features calibrated to human visual similarity. Across all three types of landscape designs (natural park, city plaza, and courtyard garden), the fine-tuned model consistently outperformed the non-fine-tuned model on these metrics. The fine-tuned images had higher PSNR values than the non-fine-tuned images in all cases, indicating less distortion and better fidelity. They also exhibited higher SSIM values across all design types, indicating that the fine-tuned model better preserves structural integrity, maintaining key visual features such as luminance, contrast, and texture. Finally, the fine-tuned model consistently achieved lower LPIPS values in all three types, indicating that its images are perceptually closer to the reference designs.

Table 3.

Results of quantitative evaluation

Type             | Fine-tuned PSNR (dB) | Non-fine-tuned PSNR (dB) | Fine-tuned SSIM | Non-fine-tuned SSIM | Fine-tuned LPIPS | Non-fine-tuned LPIPS

Natural park     | 31.1557 | 25.6414 | 0.5876 | 0.5487 | 0.5093 | 0.5544
City plaza       | 32.9425 | 27.9353 | 0.5474 | 0.4720 | 0.5257 | 0.5657
Courtyard garden | 34.4287 | 28.2568 | 0.5873 | 0.5139 | 0.5172 | 0.5605

The consistent improvement in PSNR, SSIM, and LPIPS values across all three landscape design types supports the conclusion that fine-tuning substantially enhances the quality of the generated images. This demonstrates that fine-tuning is an effective technique for improving the output of AI-generated conceptual landscape designs, making the images both more faithful to the reference characteristics and perceptually closer to them.
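For reference, PSNR and a simplified (single-window) variant of SSIM can be computed from first principles as below. This NumPy sketch is illustrative only; results such as Table 3 would normally use standard implementations (e.g., scikit-image for windowed SSIM and the `lpips` package for LPIPS).

```python
import numpy as np

def psnr(ref: np.ndarray, gen: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    mse = np.mean((ref.astype(np.float64) - gen.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
    """SSIM computed over the whole image in one window. The standard metric
    averages this statistic over local sliding windows instead."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )
```

LPIPS, by contrast, compares deep network activations rather than pixel statistics, which is why it tracks perceptual similarity more closely than PSNR or SSIM.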

Discussion

Our method for conceptual landscape design shows promising potential for assisting landscape designers with practical projects. In the traditional conceptual design workflow, designers start with site analysis and initial sketches based on client input; as the design develops, they refine concepts using manual sketches or CAD tools, adding details and exploring different layouts (Reid, 2007). Collaboration involves presenting designs to clients and making revisions based on their feedback, which can be time-consuming (Pressman, 2012). Our method addresses this by posing and answering two research questions. (1) How can a workflow be developed to effectively use text-to-image generative AI in conceptual landscape design? First, designers can quickly generate multiple conceptual designs by providing only sketches and minimal reference images. This allows fast iteration, enabling designers to explore different ideas and refine concepts rapidly without being bogged down by lengthy manual processes. It also serves as a tool for inspiration and ideation, helping designers visualize a variety of design possibilities and sparking new creative directions. Moreover, designers retain control over the design process by adjusting parameters and input sketches, guiding the AI toward desired outcomes; this interaction between the designer's intent and the AI's capabilities can lead to more personalized and context-specific designs. Finally, by ensuring that the generated images meet quality standards through both quantitative and qualitative evaluations, the workflow helps maintain a consistent level of design quality throughout the process. (2) How can we speed up data collection and model training in text-to-image generative AI for conceptual landscape design?
The workflow requires only a small training set (usually fewer than 10 images) and a short training time (around 20 minutes on a personal computer in this research). This efficiency is achieved by leveraging a pretrained Stable Diffusion model combined with our fine-tuned LoRA model, and is notable compared to other text-to-image generative AI studies in design (Boudier et al., 2024; Han and Chen, 2024; Zhang et al., 2024a, 2024b). It enables designers to present and evaluate multiple options in client meetings or team discussions, speeding up the decision-making process.
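The reported scale of the fine-tuning step can be made concrete with a configuration of the kind a LoRA training script consumes. The specific hyperparameter values below are our illustrative assumptions, chosen only to be consistent with the scale described here (fewer than 10 reference images, roughly 20 minutes on a personal computer); they are not the study's published settings.

```python
# Hypothetical LoRA fine-tuning configuration, consistent in scale with what
# is reported here (fewer than 10 reference images, ~20 minutes on a personal
# computer). These values are illustrative assumptions, not the study's
# published hyperparameters.
lora_config = {
    "base_model": "runwayml/stable-diffusion-v1-5",  # assumed SD 1.5 base
    "rank": 8,               # low-rank dimension; small ranks suit tiny datasets
    "learning_rate": 1e-4,
    "resolution": 512,
    "train_images": 8,       # reference images for one landscape category
    "max_train_steps": 800,  # roughly 100 steps per image
}
# A LoRA of rank r adapts a d_out x d_in weight matrix with only
# r * (d_out + d_in) extra parameters, which is why training fits on a
# single consumer GPU in minutes rather than hours.
```

Because only the low-rank adapter is trained while the base model stays frozen, each landscape category (natural park, city plaza, courtyard garden) gets its own small LoRA file rather than a full model copy.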

While technological advancements like generative AI have expanded the horizons of landscape design, it is important to emphasize that these tools are meant to enhance, not replace, the foundational theories and human expertise that define the field (Shaw et al., 2024). Traditional design skills and domain knowledge remain crucial; designers should approach the technology as a complementary asset rather than a replacement for their expertise. To leverage AI effectively, designers should start by familiarizing themselves with the capabilities and limitations of these tools, understanding how they can enhance, rather than override, the creative process. Designers should also actively engage in selecting and curating training datasets, using their domain knowledge to ensure that the inputs align with the specific design goals. Additionally, they should continue to apply their expertise in creating guiding sketches and refining AI-generated outputs, ensuring that the final designs meet the esthetic and functional standards required for the project. Moreover, designers should view AI as a tool for rapid iteration and exploration, enabling them to quickly visualize and test multiple concepts before settling on the most viable option. In this way, designers can integrate AI into their workflow in a manner that preserves the integrity of their work while benefiting from the efficiencies and creative possibilities that AI offers. Ultimately, designers should embrace a mindset of collaboration between human expertise and AI technology, using the strengths of both to push the boundaries of conceptual landscape design while upholding the core principles that have always guided their practice.

Conclusions

Our research employed a text-to-image generative AI workflow combining Stable Diffusion with ControlNet and LoRA to generate three typical types of landscape design: a natural park, a city plaza, and a courtyard garden. Through both qualitative and quantitative evaluations, we have shown that fine-tuned models produce more detailed, visually appealing, and structurally coherent designs than non-fine-tuned models. The ability to maintain spatial consistency, control scale, and generate relevant landscape elements underscores the potential of AI as a powerful tool in the landscape architecture field. The first major contribution of this work is the development of a workflow specifically tailored to conceptual landscape design, integrating the entire process so that designers can transition smoothly from concept to evaluation. The second key contribution is the efficiency of our approach: whereas extensive datasets and long training periods can be time-consuming, our method leverages a small set of reference images and brief training times to enable rapid iteration, making it particularly valuable for early-stage design. This efficiency offers a significant advantage in generating and refining conceptual designs without the overhead associated with larger datasets.

This study has certain limitations that suggest avenues for future research. First, while we focused on three types of landscapes (natural parks, city plazas, and courtyard gardens), we did not present a detailed case study involving iterative optimization of specific design elements; for example, we did not show the process of generating a single design and repeatedly refining it to address specific aspects. Additionally, while our study employed quantitative metrics to evaluate the image quality of the generated designs, these metrics assess visual and perceptual quality but do not capture design-specific qualities of landscapes, which remain challenging to evaluate quantitatively. Moreover, our research exclusively utilized AI-generated conceptual designs without comparing them to human-designed alternatives or applying them to real-world projects.

Future research could involve selecting a particular case, thoroughly documenting the process of generating, assessing, revising, and reassessing a design, and showcasing multiple iterations. This would provide a more comprehensive understanding of how AI can assist in the iterative design process. Expanding the evaluation framework to include metrics that capture design considerations is another important direction, aiming to establish more comprehensive methods for assessing landscape designs. Additionally, we plan to conduct a series of practical case studies in which we will apply generative AI tools alongside traditional human-centered design methods. These case studies will focus on conceptual landscape design projects in real-world contexts, allowing us to directly compare the outcomes produced by AI with those generated through conventional design practices. We will evaluate the results based on various criteria, such as design quality, time efficiency, and resource utilization, while also gathering feedback from design professionals. This research will help establish best practices for integrating AI with human creativity in the field of landscape design.

Supplementary Material

Figure 1S. Stable Diffusion interface
Figure 2S. Images generated in Stable Diffusion with and without ControlNet (prompt: park corner with trees, shrubs, grasses, pavements and walls)
Figure 3S. Initial input image for ControlNet
Figure 4S. An example of generative park design when the weight of ControlNet is 0, 0.25, 0.5, 2 (prompt: bird's eye view, a park)
Figure 5S. Natural park; a–d with fine-tuning, e–h without fine-tuning
Figure 6S. City plaza; a–d with fine-tuning, e–h without fine-tuning
Figure 7S. Courtyard garden; a–d with fine-tuning, e–h without fine-tuning

Supplemental material for this article is available online.

Acknowledgments

We greatly appreciate the helpful comments and suggestions from the editor and anonymous reviewers.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was supported by National Science Foundation (NSF) under grant ITE- 2235678, CMMI-2430700, and CNS-2401860, NASA under 80NSSC22KM0052, and Texas A&M University Harold Adams Interdisciplinary Professorship Research Fund. The funders had no role in the study design, data collection, analysis, or preparation of this article.

Biographies

Xinyue Ye is the Harold Adams Endowed Professor in the Department of Landscape Architecture and Urban Planning and Director of Center for Geospatial Sciences, Applications and Technology at Texas A&M University. His research expertise is on human dynamics and urban informatics.

Tianchen Huang is a doctoral student in the Department of Landscape Architecture and Urban Planning and Center for Geospatial Sciences, Applications and Technology at Texas A&M University. His research expertise is on AI-based urban design.

Yang Song is an Assistant Professor in the Department of Landscape Architecture & Urban Planning at Texas A&M University. He has a long-standing interest in applying digital technology and data science in landscape research and design.

Xin Li is Professor and Chair of the Section of Visual Computing and Interactive Media at Texas A&M University. His research interests include visual computing, AI-assisted visual data modeling, processing and analysis, and their applications.

Galen Newman is the Nicole and Kevin Youngblood Professor and Head in the Department of Landscape Architecture and Urban Planning at Texas A&M University. His research interests include urban regeneration, land use science, spatial analytics, community resilience, and community/urban scaled design.

Dayong Jason Wu is an Associate Research Scientist at Texas A&M Transportation Institute. His research expertise is on Intelligent Transportation Systems, Big Data Analytics, Machine Learning, and GIS-based transportation applications.

Yijun Zeng is a lecturer in the Department of Landscape Architecture at Iowa State University. Her research focuses on the interaction between humans and urban landscapes.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The datasets and code will be available upon request.

References

  1. Andreasen MM, Hansen CT and Cash P (2015) Conceptual Design: Interpretations, Mindset and Models. Berlin: Springer International Publishing. DOI: 10.1007/978-3-319-19839-2. [DOI] [Google Scholar]
  2. Aničin L and Stojmenović M (2023) Bias analysis in stable diffusion and MidJourney models. In: Nandan Mohanty S, Garcia Diaz V and Satish Kumar GAE (eds) Intelligent Systems and Machine Learning. Cham: Springer Nature Switzerland, 378–388. DOI: 10.1007/978-3-031-35081-8_32. [DOI] [Google Scholar]
  3. Bell S (2019) Elements of Visual Design in the Landscape. London: Routledge. [Google Scholar]
  4. Bendel O (2023) Image synthesis from an ethical perspective. AI & Society. DOI: 10.1007/s00146-023-01780-4. [DOI] [Google Scholar]
  5. Booth KN (2012) Residential Landscape Architecture. Hoboken, NJ: Prentice Hall. https://thuvienso.dau.edu.vn:88/handle/123456789/9001. [Google Scholar]
  6. Boudier M, Beltrán N and Michelangelo A (2024) How to generate new versions of an original character? an application of LoRA and DreamBooth fine-tuning of diffusion models. https://hdl.handle.net/10230/68765. [Google Scholar]
  7. Buxton B (2010) Sketching User Experiences: Getting the Design Right and the Right Design. Burlington, MA: Morgan Kaufmann. [Google Scholar]
  8. Cao Y, Li S, Liu Y, et al. (2023) A comprehensive survey of AI-generated content (AIGC): a History of generative AI from GAN to ChatGPT (arXiv:2303.04226). DOI: 10.48550/arXiv.2303.04226. [DOI] [Google Scholar]
  9. Chen R, Zhao J, Yao X, et al. (2023) Generative design of outdoor green spaces based on generative adversarial networks. Buildings 13(4): 1083. DOI: 10.3390/buildings13041083. [DOI] [Google Scholar]
  10. Creswell A, White T, Dumoulin V, et al. (2018) Generative adversarial networks: an overview. IEEE Signal Processing Magazine 35(1): 53–65. DOI: 10.1109/MSP.2017.2765202. [DOI] [Google Scholar]
  11. Danhaive R and Mueller CT (2021) Design subspace learning: structural design space exploration using performance-conditioned generative modeling. Automation in Construction 127: 103664. DOI: 10.1016/j.autcon.2021.103664. [DOI] [Google Scholar]
  12. Dee C (2001) Form and Fabric in Landscape Architecture: A Visual Introduction. London: Taylor & Francis. [Google Scholar]
  13. Derevyanko N and Zalevska O (2023) Comparative analysis of neural networks Midjourney, stable diffusion, and DALL-E and ways of their implementation in the educational process of students of design specialities. Scientific Bulletin of Mukachevo State University. Series “Pedagogy and Psychology” 9: 36. DOI: 10.52534/msu-pp3.2023.36. [DOI] [Google Scholar]
  14. Du H, Li Z, Niyato D, et al. (2023, March 23). Diffusion-based reinforcement learning for edge-enabled AI-generated content services. https://arxiv.org/abs/2303.13052v3. [Google Scholar]
  15. Fernberg P and Chamberlain B (2023) Artificial intelligence in landscape architecture: a literature review. Landscape Journal 42(1): 13–35. DOI: 10.3368/lj.42.1.13. [DOI] [Google Scholar]
  16. Fui-Hoon Nah F, Zheng R, Cai J, et al. (2023) Generative AI and ChatGPT: applications, challenges, and AI-human collaboration. Journal of Information Technology Case and Application Research 25(3): 277–304. DOI: 10.1080/15228053.2023.2233814. [DOI] [Google Scholar]
  17. Gonzalez RC (2018) Deep convolutional neural networks [lecture notes]. IEEE Signal Processing Magazine 35(6): 79–87. DOI: 10.1109/MSP.2018.2842646. [DOI] [Google Scholar]
  18. Goodfellow I, Pouget-Abadie J, Mirza M, et al. (2014) Generative adversarial nets. Advances in Neural Information Processing Systems 27. Available at: https://proceedings.neurips.cc/paper_files/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html. [Google Scholar]
  19. Hakimshafaei M (2023) Survey of Generative AI in Architecture and Design. MS. Santa Cruz: University of California. https://www.proquest.com/docview/2812330639/abstract/4B3973AF789249C9PQ/1. [Google Scholar]
  20. Han Z and Chen Y (2024) Automatic generation of standard nursing unit floor plan in general hospital based on stable diffusion. Buildings 14(9): 2601. DOI: 10.3390/buildings14092601. [DOI] [Google Scholar]
  21. Hanafy NO (2023) Artificial intelligence’s effects on design process creativity: “A study on used A.I. text-to-image in architecture”. Journal of Building Engineering 80: 107999. DOI: 10.1016/j.jobe.2023.107999. [DOI] [Google Scholar]
  22. Holden R (1984) Basic elements of landscape architectural design. Landscape and Planning 11(3): 260–261. [Google Scholar]
  23. Hoşer M and Köymen E (2023) Analysis of text-to-image artificial intelligence Systems in terms of contribution to interior coloring. Bilişim Teknolojileri Dergisi 16(4): 275–283. DOI: 10.17671/gazibtd.1252993. [DOI] [Google Scholar]
  24. Huang C, Zhang G, Yao J, et al. (2022) Accelerated environmental performance-driven urban design with generative adversarial network. Building and Environment 224: 109575. DOI: 10.1016/j.buildenv.2022.109575. [DOI] [Google Scholar]
  25. Huynh-Thu Q and Ghanbari M (2008) Scope of validity of PSNR in image/video quality assessment. Electronics Letters 44(13): 800–801. [Google Scholar]
  26. Jiang F, Ma J, Webster CJ, et al. (2024) Automated site planning using CAIN-GAN model. Automation in Construction 159: 105286. DOI: 10.1016/j.autcon.2024.105286. [DOI] [Google Scholar]
  27. Kim J, Yang H and Min K (2024) DALS: diffusion-based artistic landscape sketch. Mathematics 12(2): 238. DOI: 10.3390/math12020238. [DOI] [Google Scholar]
  28. Kingma DP and Welling M (2022) Auto-encoding variational Bayes (arXiv:1312.6114). DOI: 10.48550/arXiv.1312.6114. [DOI] [Google Scholar]
  29. Kotturi Y, Anderson A, Ford G, et al. (2024) Deconstructing the veneer of simplicity: Co-designing introductory generative AI workshops with local entrepreneurs. In Proceedings of the CHI conference on human factors in computing systems, Honolulu, HI, May 11–16, 2024: 1–16. DOI: 10.1145/3613904.3642191. [DOI] [Google Scholar]
  30. Kulkarni A, Shivananda A, Kulkarni A, et al. (2023) Diffusion model and generative AI for images. In: Kulkarni A, Shivananda A, Kulkarni A and Gudivada D (eds) Applied Generative AI for Beginners: Practical Knowledge on Diffusion Models, ChatGPT, and Other LLMs. New York, NY: Apress, 155–177. DOI: 10.1007/978-1-4842-9994-4_8. [DOI] [Google Scholar]
  31. Kwon Y, Wu E, Wu K, et al. (2023, October 2) DataInf: efficiently estimating data influence in LoRA-tuned LLMs and diffusion models. https://arxiv.org/abs/2310.00902v1. [Google Scholar]
  32. Lamac R (2023) Text-to-image AI as a tool for the designer’s ideation process. https://aaltodoc.aalto.fi/handle/123456789/125920. [Google Scholar]
  33. LeCun Y, Bengio Y and Hinton G (2015) Deep learning. Nature 521(7553): 436–444. DOI: 10.1038/nature14539. [DOI] [PubMed] [Google Scholar]
  34. Lee J-K, Yoo Y and Cha SH (2024) Generative early architectural visualizations: incorporating architect’s style-trained models. Journal of Computational Design and Engineering 11: 40–59. DOI: 10.1093/jcde/qwae065. [DOI] [Google Scholar]
  35. Li P and Li B (2024) Generating daylight-driven architectural design via diffusion models (arXiv:2404.13353). DOI: 10.48550/arXiv.2404.13353. [DOI] [Google Scholar]
  36. Li P, Li B and Li Z (2024) Sketch-to-Architecture: generative AI-aided architectural design (arXiv: 2403.20186). https://arxiv.org/abs/2403.20186. [Google Scholar]
  37. Mandapuram M, Gutlapalli SS, Bodepudi A, et al. (2018) Investigating the prospects of generative artificial intelligence. Asian Journal of Humanity, Art and Literature 5(2): 2. DOI: 10.18034/ajhal.v5i2.659. [DOI] [Google Scholar]
  38. Norton JN and Brouwer AB (2009) Chapter 4—the planning, design and construction process. In: Hessler JR and Lehner NDM (eds) Planning and Designing Research Animal Facilities. Cambridge, MA: Academic Press: 17–44. DOI: 10.1016/B978-0-12-369517-8.00004-9. [DOI] [Google Scholar]
  39. Oppenlaender J (2022) The creativity of text-to-image generation. In Proceedings of the 25th International Academic Mindtrek Conference: 192–202. DOI: 10.1145/3569219.3569352. [DOI] [Google Scholar]
  40. Paananen V, Oppenlaender J and Visuri A. (2023). Using text-to-image generation for architectural design ideation (arXiv:2304.10182). arXiv. DOI: 10.48550/arXiv.2304.10182. [DOI] [Google Scholar]
  41. Phillips C, Jiao J and Clubb E (2024) Testing the capability of AI art tools for urban design. IEEE Computer Graphics and Applications 44(2): 37–45. DOI: 10.1109/MCG.2024.3356169. [DOI] [PubMed] [Google Scholar]
  42. Pressman A (2012) Designing Architecture: The Elements of Process. London: Routledge. [Google Scholar]
  43. Rane N (2023) ChatGPT and similar generative artificial intelligence (AI) for smart industry: role, challenges and opportunities for industry 4.0, industry 5.0 and society 5.0. (SSRN Scholarly Paper 4603234) DOI: 10.2139/ssrn.4603234. [DOI] [Google Scholar]
  44. Reid GW (2007) From Concept to Form in Landscape Design. Hoboken, NJ: John Wiley & Sons. [Google Scholar]
  45. Rombach R, Blattmann A, Lorenz D, et al. (2022) High-Resolution Image Synthesis with Latent Diffusion Models: 10684–10695. https://openaccess.thecvf.com/content/CVPR2022/html/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.html. [Google Scholar]
  46. Salter A and Gann D (2003) Sources of ideas for innovation in engineering design. Research Policy 32(8): 1309–1324. DOI: 10.1016/S0048-7333(02)00119-1. [DOI] [Google Scholar]
  47. Shaw SL, Ye X, Goodchild M, et al. (2024) Human dynamics research in GIScience: challenges and opportunities. Computational Urban Science 4(1): 31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Smith JS, Hsu Y-C, Zhang L, et al. (2023, April 12). Continual diffusion: continual customization of text-to-image diffusion with C-LoRA. https://arxiv.org/abs/2304.06027v1. [Google Scholar]
  49. Souter-Brown G (2014) Landscape and Urban Design for Health and Well-Being: Using Healing, Sensory and Therapeutic Gardens. London: Routledge. [Google Scholar]
  50. Stiles R (1994) Landscape theory: a missing link between landscape planning and landscape design? Landscape and Urban Planning 30(3): 139–149. DOI: 10.1016/0169-2046(94)90053-1. [DOI] [Google Scholar]
  51. Stupariu M-S, Cushman SA, Pleşoianu A-I, et al. (2022) Machine learning in landscape ecological analysis: a review of recent approaches. Landscape Ecology 37(5): 1227–1250. DOI: 10.1007/s10980-021-01366-9. [DOI] [Google Scholar]
  52. Sun C, Qiu X, Xu Y, et al. (2019) How to fine-tune BERT for text classification? In: Sun M, Huang X, Ji H, et al. (eds) Chinese Computational Linguistics. Berlin: Springer International Publishing: 194–206. DOI: 10.1007/978-3-030-32381-3_16. [DOI] [Google Scholar]
  53. Suthersan SS, Horst J, Schnobrich M, et al. (2016) Remediation Engineering: Design Concepts. 2nd edition. Boca Raton, FL: CRC Press. [Google Scholar]
  54. Thompson JW and Sorvig K (2000) Sustainable landscape construction: a guide to green building outdoors. https://trid.trb.org/view/680366. [Google Scholar]
  55. Wan J and Shi H (2021) Research on urban renewal public space design based on convolutional neural network model. Security and Communication Networks 2021: 1–9. DOI: 10.1155/2021/9504188. [DOI] [Google Scholar]
  56. Wang Z, Bovik AC, Sheikh HR, et al. (2004) Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13(4): 600–612. DOI: 10.1109/TIP.2003.819861. [DOI] [PubMed] [Google Scholar]
  57. Wu J, Gan W, Chen Z, et al. (2023) AI-generated content (AIGC): a survey (arXiv:2304.06632). DOI: 10.48550/arXiv.2304.06632. [DOI] [Google Scholar]
  58. Xu L, Xiangli Y, Rao A, et al. (2021) BlockPlanner: city block generation with vectorized graph representation. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV): 5057–5066. DOI: 10.1109/ICCV48922.2021.00503. [DOI] [Google Scholar]
  59. Yang L, Zhang Z, Song Y, et al. (2023) Diffusion models: a comprehensive survey of methods and applications. ACM Computing Surveys 56(4): 105:1–105:39. DOI: 10.1145/3626235. [DOI] [Google Scholar]
  60. Ye X, Duan L and Peng Q (2021) Spatiotemporal prediction of theft risk with deep inception-residual networks. Smart Cities 4(1): 204–216. [Google Scholar]
  61. Ye X, Du J and Ye Y (2022) MasterplanGAN: facilitating the smart rendering of urban master plans via generative adversarial networks. Environment and Planning B: Urban Analytics and City Science 49(3): 794–814. [Google Scholar]
  62. Ye X, Jamonnak S, Van Zandt S, et al. (2024) Developing campus digital twin using interactive visual analytics approach. Frontiers of Urban and Rural Planning 2(1): 9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Yee R (2012) Architectural Drawing: A Visual Compendium of Types and Methods. Hoboken, NJ: John Wiley & Sons. [Google Scholar]
  64. Yelavich S and Adams B (2014) Design as Future-Making. London: Bloomsbury Publishing. [Google Scholar]
  65. Zhang L and Agrawala M (2023). Adding conditional control to text-to-image diffusion models (arXiv: 2302.05543). DOI: 10.48550/arXiv.2302.05543. [DOI] [Google Scholar]
  66. Zhang R, Isola P, Efros AA, et al. (2018) The unreasonable effectiveness of deep features as a perceptual metric. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition: 586–595. DOI: 10.1109/CVPR.2018.00068. [DOI] [Google Scholar]
  67. Zhang C, Zhang C, Zhang M, et al. (2023) Text-to-image diffusion models in generative AI: a survey (arXiv: 2303.07909). DOI: 10.48550/arXiv.2303.07909. [DOI] [Google Scholar]
  68. Zhang L, Tian X, Zhang C, et al. (2024a) Aided design of bridge aesthetics based on stable diffusion fine-tuning. DOI: 10.48550/ARXIV.2409.15812. [DOI] [Google Scholar]
  69. Zhang C, Wu Q, Gambardella CC, et al. (2024b) Taming stable diffusion for text to 360° panorama image generation. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 6347–6357. DOI: 10.1109/CVPR52733.2024.00607. [DOI] [Google Scholar]
