Abstract
The innovation atmosphere of industrial parks, a crucial indicator of urban spatial vitality and regional economic dynamism, is difficult to assess using traditional, experience-driven methods. To overcome these limitations, this study proposes a novel, data-driven framework for urban spatial perception using Multimodal Large Language Models (MLLMs). Focusing on typical industrial parks in Wuhan, China, we harnessed MLLMs to interpret multi-source urban data, validating their diagnostic accuracy against expert evaluations. Subsequently, we simulated the diverse cognitive perspectives of four key stakeholder groups to diagnose the innovation atmosphere, quantifying the subjective spatial perceptions of different user groups and reflecting a nuanced understanding of human-environment interactions. The principal findings are: (1) The diagnostic assessments from the Gemini-2.5-pro model demonstrated a significant correlation (r = 0.890, p < 0.001) with the expert judgment baseline, affirming the high feasibility of this data-driven approach. (2) The MLLM framework effectively quantified perceptual heterogeneity among simulated stakeholders, offering deep insights into the varied dimensions of the parks’ city image and perceived quality. (3) Spatial analysis revealed a consistent overall assessment of the innovation atmosphere across different perspectives, with parks in the southeastern and northwestern regions exhibiting higher spatial vitality. This research contributes an objective and automated tool for diagnosing the innovation atmosphere, a key facet of urban spatial perception. Crucially, the proposed framework provides robust empirical support for big data-driven strategies in urban planning, enabling the refined management of innovation spaces to be more productive, collaborative, and sustainable.
Keywords: Urban spatial perception, Innovation atmosphere, Multimodal large language model (MLLM), Industrial parks
Subject terms: Environmental social sciences, Geography
Introduction
In an era of rapid urbanization and technological transformation, industrial parks have evolved from mere economic engines into complex ecosystems pivotal for innovation. Their success is now measured not only by economic output but also by their contribution to urban vitality, resilience, and the creation of high-quality, human-centric environments. Among these, parks integrating R&D with production are particularly vital. Characterized by frequent spatial interactions and dynamic human activity patterns, they are critical nexuses for fostering synergy between industry, academia, and research1.
The core competitiveness of such parks hinges on their innovation atmosphere—an intangible quality that stimulates creativity and belonging. This atmosphere is, in essence, a complex form of urban spatial perception, shaped by the daily interactions between knowledge workers and their physical environment. For instance, the openness of shared spaces can catalyze the cross-disciplinary collaboration essential for innovation2, while the restorative quality of green landscapes can alleviate cognitive fatigue3, directly supporting well-being and productivity. However, current planning and research remain largely reliant on experience-driven paradigms and macro-level economics, lacking effective data-driven tools to assess and optimize this crucial spatial atmosphere. This methodological gap often leads to a disconnect between design intent and user experience, resulting in spaces that are underutilized and inefficient.
A review of methods for quantifying urban spatial perception reveals persistent limitations. Early research relied on manual investigations like surveys4 and on-site inspections5, methods constrained by high costs, small sample sizes, and subjective biases, failing to capture granular spatial nuances. The advent of 3S technologies enhanced spatial data acquisition but could not effectively quantify subjective qualities like Degree of Enclosure6. Subsequently, deep learning models enabled large-scale analysis of street-view imagery; however, these are typically single-task models that struggle to comprehend complex semantic associations relevant to public sentiment or a holistic city image. While biosensors like electrodermal activity (EDA)7 and electroencephalography (EEG)8,9 offer direct physiological measures, their intrusive nature and high cost render them impractical for large-scale, naturalistic application.
The emergence of Multimodal Large Language Models (MLLMs) offers a transformative approach to shift urban spatial perception from an experience-driven to a data-driven paradigm10,11. MLLMs are large language models capable of simultaneously receiving, processing, and integrating two or more types of heterogeneous data, such as text, images, and audio. The core advantage lies in powerful cross-modal understanding and reasoning capabilities, which enable a precise analysis of the complex semantic relationships between physical environments and human perception. MLLMs can perform open-ended evaluations that capture complex sentiments and perceptions without extensive retraining, allowing for flexible adaptation to diverse urban scenarios12,13. To date, however, the application of MLLMs in urban science has primarily focused on macro-scale streetscapes, leaving a critical gap in establishing an evaluation system tailored to the high-density functional complexity of industrial parks.
Therefore, this study constructs and validates an innovative framework using MLLMs to enable automated, in-depth diagnostics of the innovation atmosphere in industrial parks. The research pursues three objectives: first, to establish a multi-level perceptual evaluation index grounded in environmental psychology; second, to validate the efficacy of leading MLLMs against human expert judgments; and finally, to use a persona-based simulation to investigate perceptual heterogeneity among key stakeholders, akin to a large-scale sentiment analysis in urban spaces. The findings provide not only a robust diagnostic tool but also an evidence-based framework to support big data-driven strategies in urban planning.
Literature review
Perception of the innovation atmosphere in industrial parks
As core spatial carriers for concentrating innovation factors, facilitating industrial chain collaboration, and enabling knowledge spillovers, industrial parks are essentially complex innovation ecosystems constructed through spatial planning and functional configuration. These systems leverage the advantages of spatial proximity to foster deep interactions among enterprises, research institutions, talent, and service organizations through policy incentives, shared facilities, and optimized services14. Compared to traditional parks with a singular production focus, modern industrial parks have diversified into various forms such as Science and Technology Parks15, Eco-Industrial Parks16, and Innovation Districts17. Their core competitiveness has shifted from low-cost factors to multi-functional integration and independent innovation capabilities. Against this backdrop, the perceived quality of a park’s innovation environment atmosphere has become a key variable influencing the frequency, quality, and sustainability of innovation activities, and it significantly enhances talent attraction18. Therefore, constructing a scientific assessment framework is of great significance for optimizing the innovation ecosystem of these parks.
Existing research has conducted in-depth explorations into the components of the innovation environment. Florida’s 3T theory (Technology, Talent, Tolerance)19 and Glaeser’s 3S theory (Skills, Sun, Sprawl)20 have jointly revealed the decisive influence of sociocultural and physical spatial environments. Specific to industrial parks, the innovation atmosphere is a comprehensive perceptual experience of multi-dimensional elements by its users, which can be summarized into three levels: (1) the physical environment, including transportation accessibility, public service facilities, and research infrastructure21; (2) the institutional environment, such as fiscal and tax policy support and incentive mechanisms22; (3) the social environment, encompassing collaborative networks and an inclusive culture23. In recent years, to quantify the impact of these elements on the innovation atmosphere, researchers have begun to introduce relevant urban analysis indicators. For example, to measure the potential of the physical environment to stimulate interaction and knowledge sharing, the “3D” elements of the built environment, namely Density (e.g., floor area ratio), Diversity (e.g., land use mix), and Design (e.g., street connectivity), are often used for quantification24. Additionally, as a crucial manifestation of the innovation atmosphere, vibrancy is increasingly measured through proxy variables such as the density and richness of Points of Interest (POI), public transit accessibility, and human flow heat map data25.
However, when analyzing these elements, current research tends to favor “soft” factors like institutional and social aspects, while lacking in-depth quantitative analysis of the mechanisms through which the physical spatial environment is intuitively perceived and translated into an innovation atmosphere.
Current assessments of the innovation atmosphere primarily rely on two approaches: objective quantification and subjective perception. However, when applied to small and medium-sized industrial park scenarios with close interaction among innovation actors, both approaches exhibit significant limitations. On the one hand, the objective quantification path relies on macroeconomic and technological data such as patent counts and output value26. For instance, although the studies by Bigliardi et al.27 and Lu et al.28 yielded intuitive results and measured innovation outputs, they could not explain the role played by the quality of the spatial environment in the innovation process, nor could they capture process-oriented factors that are difficult to quantify, such as the innovation atmosphere. On the other hand, the subjective perception path, which employs questionnaires and interviews29, can access individual perceptions but suffers from drawbacks such as high costs, sample bias, and strong subjectivity. This makes it difficult to conduct large-scale, high-frequency, and reproducible spatial diagnoses, let alone meet the dynamic assessment needs of different parks at various stages of development.
Application of multimodal large language models in urban spatial perception
To overcome these challenges, the field of urban studies has progressively established an analytical framework for urban perception based on visual data, forming multiple systematic methods with clear processes and general applicability30. Early research relied on data sources like street view and satellite imagery, using feature engineering to extract core visual elements such as the green view index, sky view factor, and building view factor31. By combining these with models like multiple linear regression, researchers established quantitative correlations between the physical environment and subjective perceptions, enabling the large-scale measurement of multidimensional perception indicators like safety, vitality, and beauty.
With the advancement of deep learning, researchers have further developed standardized analytical frameworks. For instance, a building perception evaluation system based on CNNs and street view imagery allows for comparative analysis of building facade perception features across different urban scales32,33. Together, these systematic methods have solidified the central role of visual data in urban perception research, breaking through the scale limitations and subjective biases of traditional survey methods. However, existing approaches still have significant shortcomings: a single visual model struggles to interpret the deep semantic meaning of subjective perception terms, and the reliance on large-scale annotated data makes it difficult to adapt to specific functional spaces where data is scarce.
It is against this backdrop that the recent rise of MLLMs offers a new paradigm for urban spatial perception. By integrating multimodal information such as text and images, MLLMs transcend the limitations of traditional single-modal models. Unlike the textual-symbolic constraints of Large Language Models (LLMs)34,35 or the semantic reasoning deficiencies of Large Vision Models (LVMs)36,37, MLLMs achieve deep semantic understanding and reasoning through cross-modal alignment38. Their advantages in tasks such as image captioning, visual question answering, and spatial reasoning make them well suited to analyzing complex spatial perceptions. This is precisely the core capability required for assessing the sustainable innovation atmosphere.
At present, research has begun to validate the potential of MLLMs in urban spatial analysis. For instance, Li et al.39 evaluated street-view localizability using the CLIP model, Zhang et al.40 perceived urban visual safety through an MLLM, and the team led by Liang10 developed Street-Quality-GPT for the dynamic assessment of spatial visual quality. These explorations demonstrate that MLLMs can effectively handle the ambiguity and comprehensive nature of spatial perception, which aligns with the needs of assessing the innovation atmosphere in industrial parks. Their core advantages are manifested in three aspects: (1) the capability to interpret subjective perception terms such as Comfortable and Colorful, bridging the gap between visual experience and linguistic expression; (2) the capacity to support multi-task generalization through a unified framework, meeting the multidimensional evaluation needs of the innovation atmosphere; (3) the capability for zero-shot or few-shot learning, minimizing reliance on labeled data and making them suitable for data-scarce park scenarios.
Existing research on MLLMs in urban spatial perception has largely focused on general-purpose spaces like streets and communities41,42, while neglecting the fact that industrial parks, as specific functional carriers, have more specialized and complex requirements for spatial atmosphere assessment. Therefore, this study attempts to construct an MLLM-based evaluation framework for the sustainable innovation atmosphere of industrial parks. The aim is to provide an efficient, replicable, and low-cost diagnostic tool, thereby offering scientific and precise technical support for optimizing the quality of innovation spaces and enhancing the core competitiveness of the parks.
Materials and methods
Study area
This study selects Wuhan City (113°41′–115°05′E, 29°58′–31°22′N) as the research area (Fig. 1), encompassing a total of eight administrative districts. As the capital of China’s Hubei Province, Wuhan is a major economic and technological center in the Central China region. Its well-established industrial foundation, dense innovation resources, and robust public service system provide an excellent development ecosystem and supportive conditions for various industrial innovation activities. From the perspective of its industrial spatial pattern, industrial parks in Wuhan exhibit a tendency to cluster in the suburban areas beyond the 3rd ring road, with key industrial functional zones such as Jiangxia District and Dongxihu District forming the main agglomeration areas. Based on the current data on industrial park development in Wuhan, this study screened research samples using a three-fold set of criteria: first, the park’s footprint must be less than 10 ha; second, it must possess integrated “R&D-production” functional attributes; and third, it must not have layouts with large-scale production facilities. A final selection of 180 parks that met these criteria was made. The spatial quality and innovation atmosphere of such small and medium-sized parks have a critical impact on attracting high-quality talent and enhancing innovation efficiency, thus making them ideal subjects for this research.
Fig. 1.
Distribution map of research area samples.
Dataset construction and preprocessing
This study constructed a multimodal dataset comprising visual images that reflect the real spatial environments of the industrial parks. To ensure the comprehensiveness, authenticity, and consistency of the data, we established a rigorous and systematic data collection and screening methodology. The core of this approach was to build a high-quality visual database through a triple mechanism of “multi-source complementation, standardized quantification, and cross-validation” to minimize selection bias. In the online collection phase, we employed a multi-source data fusion strategy: in addition to official park websites, promotional materials, and public government construction archives, we incorporated map street views and non-promotional, real-world shared images to balance the beautification bias of single official sources.
To guarantee the scientific validity and interpretability of the analysis results, we created a detailed image screening guide that clearly defined scene authenticity: (1) Priority was given to real-world photographs reflecting the daily environmental atmosphere of the parks, such as key spaces including building facades, internal streetscapes, and public landscapes. (2) Three categories of atypical images were explicitly excluded: architectural design renderings or conceptual diagrams, promotional photos of temporary scenes like festival activities, and images that had been excessively altered with post-production filters. (3) Human-Centric Perspective: The viewpoint was aligned with a pedestrian’s eye level at a height of 1.2–1.6 m to replicate a true sense of space. (4) High Image Quality: The resolution was no less than 1024 × 768 pixels, with no significant blurring or exposure issues.
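The mechanical parts of the screening guide can be expressed as a simple filter. The following is a minimal sketch, not the authors' tooling: the `ImageRecord` fields and the `kind` labels are hypothetical, the resolution rule assumes landscape orientation, and blur or exposure checks (which need image content) are left out.

```python
from dataclasses import dataclass

MIN_WIDTH, MIN_HEIGHT = 1024, 768   # criterion (4): minimum resolution in pixels
EYE_LEVEL_M = (1.2, 1.6)            # criterion (3): pedestrian eye-level range in metres
# Criterion (2): hypothetical labels for the three excluded image categories.
EXCLUDED_KINDS = {"rendering", "event_promo", "heavily_filtered"}


@dataclass
class ImageRecord:
    """Hypothetical metadata describing one candidate image."""
    width: int
    height: int
    camera_height_m: float
    kind: str  # e.g. "photo", "rendering", "event_promo", "heavily_filtered"


def passes_screening(img: ImageRecord) -> bool:
    """Return True if the image satisfies criteria (2) to (4)."""
    big_enough = img.width >= MIN_WIDTH and img.height >= MIN_HEIGHT
    at_eye_level = EYE_LEVEL_M[0] <= img.camera_height_m <= EYE_LEVEL_M[1]
    typical_scene = img.kind not in EXCLUDED_KINDS
    return big_enough and at_eye_level and typical_scene
```

In practice, criterion (1) and the judgment of what counts as excessive filtering remain human decisions, which is why the study pairs the guide with a cross-evaluation among researchers.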
To ensure consistency in applying these screening criteria, we organized a cross-evaluation among three researchers. Before the formal screening, two pilot rating rounds were conducted using a test set of 80 diverse park images. After the first round, a calibration training session was held to address discrepancies in judgment. The second pilot round achieved a 90% agreement rate, after which the formal screening commenced. The evaluators then independently screened all initially collected images. To quantify inter-rater reliability, Fleiss’ Kappa43 was calculated, yielding a result of k = 0.85, which indicates a high degree of agreement among the evaluators. For parks with insufficient information from online sources, we conducted targeted on-site photography according to a unified standard, ensuring that every park in the study had corresponding visual representation data.
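For readers unfamiliar with the statistic, Fleiss' Kappa compares the observed agreement among a fixed number of raters with the agreement expected by chance. The sketch below is a plain-Python illustration under assumed inputs: the keep/drop labels and the toy ratings are invented for demonstration and are not the study's screening data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per rater).

    Every item must be rated by the same number of raters. The statistic is
    undefined (division by zero) if all raters always use a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


# Toy example (invented): three raters decide whether to keep or drop four images.
toy = [
    ["keep", "keep", "keep"],
    ["keep", "keep", "drop"],
    ["drop", "drop", "drop"],
    ["drop", "drop", "keep"],
]
kappa = fleiss_kappa(toy)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the reported 0.85 sits near the top of the scale.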
Through the process described above, we ultimately constructed a dataset containing a total of 180 images, covering all 180 study parks. Before being input into the model, all images underwent a unified preprocessing workflow, which included removing irrelevant or low-quality images, performing image denoising, and correcting orientation. Concurrently, each image was labeled with a number as its filename, and key park information, such as park names, was redacted. The construction of this dataset laid a solid data foundation for the subsequent objective and interpretable diagnosis of the innovation environment’s spatial perception in the parks using Multimodal Large Language Models.
A Multi-level framework for data-driven perceptual assessment
This study constructs a multi-level perceptual assessment framework to systematically analyze the transmission mechanism between the physical spatial environment and innovation efficacy, and to enable an operationalized diagnosis of the innovation atmosphere. Based on Ulrich’s Stress Reduction Theory (SRT)44 and Kaplan’s Attention Restoration Theory (ART)45,46, this study distills three core psychological mechanisms through which the spatial environment influences innovation efficacy: enhancing concentration, regulating negative emotions, and improving cognitive vitality. These three mechanisms constitute the top-level objectives of the assessment framework, designated as the S1 level. However, these psychological mechanisms cannot be measured directly. To operationalize them, we further break down the assessment dimensions into the user’s intuitive perceptions of the environment and the objective elemental characteristics that constitute the environment, which form the S2 level47.
In order to empirically screen and validate the specific assessment indicators (S3 level) most closely associated with the three mechanisms of the S1 level, this study employed a structured questionnaire survey method. The survey targeted four types of core innovation stakeholders within the sample parks: Enterprise Founder (n = 60), Researcher (n = 80), Investor (n = 40), and Park Manager (n = 32). A total of 212 valid questionnaires were collected, representing a valid response rate of 84.8%. Through statistical analysis of the data, we identified the key combinations of indicators that have a significant impact on stimulating the three psychological mechanisms. Ultimately, the assessment framework for the innovation atmosphere in industrial parks was constructed (Table 1), laying the methodological foundation for the subsequent MLLM-based perceptual diagnosis.
Table 1.
Multi-perception hierarchical evaluation framework for innovation atmosphere in industrial parks.
| Overall Objective | S1 | S2 | S3 |
|---|---|---|---|
| The Innovation Atmosphere of Industrial Parks | Enhancing Concentration | Intuitive Perceptions | Serene48 |
| | | | Comfortable44 |
| | | | Clean49 |
| | | Elemental Characteristics | Noise Level48 |
| | | | Facility Quality50 |
| | | | Green Coverage Rate44,51 |
| | | | Degree of Enclosure52 |
| | Regulating Negative Emotions | Intuitive Perceptions | Serene |
| | | | Safe53 |
| | | | Comfortable |
| | | | Beautiful54 |
| | | | Relaxing45 |
| | | Elemental Characteristics | Noise Level |
| | | | Green Coverage Rate |
| | | | Color Coordination54 |
| | | | Environmental Sense of Belonging55 |
| | Improving Cognitive Vitality | Intuitive Perceptions | Novel45 |
| | | | Colorful55 |
| | | | Relaxing |
| | | Elemental Characteristics | Green Coverage Rate |
| | | | Scene Diversity56 |
| | | | Color Coordination |
| | | | Field of View Openness57 |
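The hierarchy in Table 1 maps naturally onto a nested data structure. The sketch below transcribes the table into a Python dictionary and adds two small helper functions, which are illustrative names rather than part of the study. It shows that several S3 indicators serve more than one S1 mechanism, and that the distinct indicators per S2 category number eight each, matching the prompt lists in Tables 2 and 3.

```python
# Table 1 transcribed as S1 mechanism -> S2 category -> S3 indicators.
FRAMEWORK = {
    "Enhancing Concentration": {
        "Intuitive Perceptions": ["Serene", "Comfortable", "Clean"],
        "Elemental Characteristics": [
            "Noise Level", "Facility Quality",
            "Green Coverage Rate", "Degree of Enclosure",
        ],
    },
    "Regulating Negative Emotions": {
        "Intuitive Perceptions": [
            "Serene", "Safe", "Comfortable", "Beautiful", "Relaxing",
        ],
        "Elemental Characteristics": [
            "Noise Level", "Green Coverage Rate",
            "Color Coordination", "Environmental Sense of Belonging",
        ],
    },
    "Improving Cognitive Vitality": {
        "Intuitive Perceptions": ["Novel", "Colorful", "Relaxing"],
        "Elemental Characteristics": [
            "Green Coverage Rate", "Scene Diversity",
            "Color Coordination", "Field of View Openness",
        ],
    },
}


def unique_indicators(framework, s2_name):
    """Distinct S3 indicators of one S2 category, in first-seen order."""
    seen = []
    for groups in framework.values():
        for indicator in groups[s2_name]:
            if indicator not in seen:
                seen.append(indicator)
    return seen


def mechanisms_for(framework, indicator):
    """S1 mechanisms to which a given S3 indicator contributes."""
    return [
        s1 for s1, groups in framework.items()
        if any(indicator in indicators for indicators in groups.values())
    ]
```

For example, `mechanisms_for(FRAMEWORK, "Green Coverage Rate")` returns all three mechanisms, whereas `"Novel"` appears only under Improving Cognitive Vitality.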
Framework implementation: MLLM-based diagnosis of innovation atmosphere
In this study, we constructed a diagnostic workflow for sustainable innovation atmosphere based on MLLM (Fig. 2). The core advantage of this method lies in dual-modal input, cross-modal reasoning, and uni-modal output. The input layer requires simultaneously feeding the MLLM both real-world park images and structured prompts, which serve to reflect the park’s physical spatial features and define the connotations of the evaluation indicators, respectively. The reasoning layer leverages its pre-trained cross-modal mapping capabilities to associate visual features from the images with the perceptual indicators from the text. The output layer then provides a quantitative score from 1 to 10, completing the full mapping from physical spatial features to subjective perceptual indicators, and finally to an innovation atmosphere score.
Fig. 2.
Automated diagnostic framework for park innovation atmosphere based on MLLM.
Before importing the park images into the MLLM, the established assessment framework needs to be operationalized. This involves converting the indicators at each level into structured prompts that the MLLM can understand and execute. Specifically, for the two main categories of indicators—Intuitive Perceptions and Elemental Characteristics—this study designed detailed descriptive prompts for each one (Tables 2 and 3). These prompts not only define the connotation of each indicator but also specify its visual representation within the images.
Table 2.
Prompt of intuitive perception indicators in the MLLM.
| Indicator | Prompt Description |
|---|---|
| Serene | The space is free from obvious sources of distracting noise. External noise interference is actively shielded through effective acoustic design (e.g., green sound-insulating belts, noise-reducing pavement) and visual guidance (e.g., open green spaces, minimalist landscapes). |
| Comfortable | The scale of the buildings is in good proportion with the sidewalks and greenery, avoiding a sense of oppression. Visual elements are clean and free of clutter. |
| Clean | The street sanitation and environment are well-maintained. Building facades are clean and tidy. |
| Safe | The park space offers an open field of vision with no safety hazards. There is a clear separation between pedestrian and vehicle traffic on the roads. |
| Beautiful | The overall environment is tidy and orderly. The architectural design possesses uniqueness or aesthetic appeal. The green landscape is meticulously planned and maintained. |
| Relaxing | The spatial rhythm of the park is slow-paced, and the floor area ratio is low. It is equipped with recreational facilities such as green spaces and leisure walkways. |
| Novel | The park layout or its internal buildings adopt unique and contemporary spatial forms and design languages. |
| Colorful | The park space features strong contrasts or harmonious combinations of multiple colors, creating vibrant visual focal points. |
Table 3.
Prompt of elemental characteristics indicators in the MLLM.
| Indicator | Prompt Description |
|---|---|
| Noise Level | Are there wide and dense green belts separating sidewalks and office areas from main roads (with green belts serving as physical sound barriers within the park)? |
| Facility Quality | The physical condition and quality of the park’s hardware facilities, reflecting the level of maintenance and the park’s investment costs. |
| Green Coverage Rate | The visual proportion of green area within the park. |
| Degree of Enclosure | The sense of spatial enclosure formed by internal buildings or natural barriers within the park, such as continuous building interfaces or semi-enclosed corridors. |
| Color Coordination | The color harmony between building facades, public facilities, and the natural landscape. |
| Environmental Sense of Belonging | Is the spatial scale of the park appropriate? Are there public spaces for people to linger and interact, enhancing psychological identity through spatial enclosure and interactive design? |
| Scene Diversity | Are there functional zones that meet multi-dimensional needs such as production, living, and leisure, thereby forming a composite park space? |
| Field of View Openness | Does the park have a low-density layout? Is the spacing between buildings generous enough to form visual corridors, thereby reducing the sense of visual oppression? |
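To illustrate how the indicator descriptions in Tables 2 and 3 might be turned into a structured text prompt, the sketch below assembles a prompt from three of the descriptions and validates a JSON reply. The instruction wording, the JSON reply format, and the function names are assumptions made for illustration; the exact prompts and output parsing used in the study are not reproduced here, and no model API is called.

```python
import json

# Three indicator descriptions copied from Tables 2 and 3.
INDICATOR_PROMPTS = {
    "Clean": "The street sanitation and environment are well-maintained. "
             "Building facades are clean and tidy.",
    "Green Coverage Rate": "The visual proportion of green area within the park.",
    "Color Coordination": "The color harmony between building facades, "
                          "public facilities, and the natural landscape.",
}


def build_prompt(indicator_prompts):
    """Assemble the text half of the dual-modal input (wording is assumed)."""
    lines = [
        "Rate the attached industrial park image on each indicator below.",
        "Give each indicator a score from 1 to 10, where higher is better.",
        "Reply with a JSON object mapping indicator names to scores.",
        "",
    ]
    for name, description in indicator_prompts.items():
        lines.append(f"- {name}: {description}")
    return "\n".join(lines)


def parse_scores(reply_text, expected):
    """Parse a JSON reply and check that every expected score is within 1 to 10."""
    scores = json.loads(reply_text)
    result = {}
    for name in expected:
        value = scores[name]
        if not 1 <= value <= 10:
            raise ValueError(f"score for {name!r} is outside the 1-10 scale")
        result[name] = value
    return result


prompt = build_prompt(INDICATOR_PROMPTS)
```

Requesting a machine-readable reply is one design choice among several; it keeps the 1-to-10 scores easy to validate before they are aggregated.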
Different groups within an industrial park have significantly different perceptions and needs regarding the innovation atmosphere. Therefore, this methodological framework introduces role-based simulation by assigning unique personas to different stakeholders, simulating their perception of the spatial innovation atmosphere to create a diverse set of simulated subjects (Table 4). During the assessment, the MLLM is instructed to adopt a specific role’s perspective. For instance, as an enterprise founder, it would focus on the environment’s potential to attract talent and inspire team efficiency. As an investor, it would scrutinize the space as a reflection of industrial vitality and development prospects. This approach enables the MLLM to evaluate the innovation environment atmosphere of industrial parks from various stakeholder viewpoints, thereby providing more in-depth decision support for the fine-grained renewal and management of the parks.
Table 4.
Detailed information on the simulation of innovation atmosphere evaluation for different stakeholder groups.
| No. | Stakeholder Type | Representative Group | Characteristics |
|---|---|---|---|
| 1 | Innovation Entities | Enterprise Founder | Focuses on the team’s work efficiency; concerned with whether the environment can attract and retain talent. |
| 2 | Innovation Entities | Researcher | Focuses on the innovativeness and advanced nature of research conditions to ensure research efficiency and innovative breakthroughs. |
| 3 | Support and Service Entities | Investor | Focuses on the suitability for enterprise development to judge the park’s investment value; concerned with the industrial vitality behind the environment. |
| 4 | Support and Service Entities | Park Manager | Pays comprehensive attention to all dimensions to enhance the park’s overall image and competitiveness. |
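The role-based simulation can be pictured as prepending a persona description to an otherwise unchanged evaluation prompt. The sketch below uses the characteristics listed in Table 4; the instruction wording and the `with_persona` helper are hypothetical, and passing no role corresponds to a baseline prompt without a persona.

```python
# Persona characteristics copied from Table 4.
PERSONAS = {
    "Enterprise Founder": "Focuses on the team's work efficiency; concerned with "
                          "whether the environment can attract and retain talent.",
    "Researcher": "Focuses on the innovativeness and advanced nature of research "
                  "conditions to ensure research efficiency and innovative breakthroughs.",
    "Investor": "Focuses on the suitability for enterprise development to judge the "
                "park's investment value; concerned with the industrial vitality "
                "behind the environment.",
    "Park Manager": "Pays comprehensive attention to all dimensions to enhance the "
                    "park's overall image and competitiveness.",
}


def with_persona(base_prompt, role=None):
    """Prefix the evaluation prompt with a stakeholder persona.

    With role=None the prompt is returned unchanged (no-persona baseline).
    The persona wording is an assumption, not the study's exact prompt.
    """
    if role is None:
        return base_prompt
    focus = PERSONAS[role]  # raises KeyError for an unknown role
    return (
        f"Adopt the perspective of this stakeholder: {role}.\n"
        f"{focus}\n\n{base_prompt}"
    )


base = "Rate the innovation atmosphere of the attached park image from 1 to 10."
prompts = {role: with_persona(base, role) for role in PERSONAS}
```

Keeping the base prompt identical across roles means that any difference in scores can be attributed to the persona text rather than to changes in the indicator definitions.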
Results
Validation of the data-driven perception framework: accuracy and consistency
To validate the accuracy, stability, and credibility of the MLLM’s evaluation results, this study designed a comparative experiment with two core objectives. The first was to compare the alignment between the evaluations of different leading-edge models and a benchmark of human expert judgments. The second was to establish the credibility level and application boundaries of each model through a two-stage assessment involving pre-modeling prediction and post-analysis verification. To begin, 50 experts from relevant fields such as urban planning, architecture, industrial development, and strategic planning were invited to rate the innovation atmosphere of the sample parks. Concurrently, we selected six widely used, state-of-the-art models (Gemini-2.5-pro57, Gemini-2.5-flash58, GPT-4o59, Doubao-Seed-1.6-thinking60, QvQ-Max61, and Claude 3.5 Sonnet62) to evaluate the same batch of park images. Based on each model’s modal capabilities and performance on public benchmarks, a pre-modeling credibility prediction was conducted to ensure the selected models covered a gradient of different credibility levels, providing a foundation for the subsequent credibility comparison (Table 5).
Table 5.
Test results and pre-modeling credibility ratings of selected MLLMs.
| MLLM | GPQA Diamond test results | MMMU test results | Pre-modeling credibility rating |
|---|---|---|---|
| Gemini-2.5-pro | 84.8% | 81.7% | High |
| Gemini-2.5-flash | 78.3% | 79.7% | High |
| GPT-4o | 72.0% | 69.1% | Medium |
| Doubao-Seed-1.6-thinking | 81.5% | 74.8% | Medium |
| QvQ-Max | 76.1% | 70.3% | Limited |
| Claude 3.5 Sonnet | 65.0% | 70.4% | Limited |
* GPQA Diamond is used to evaluate the reasoning ability of artificial intelligence models when dealing with complex problems; MMMU is used to comprehensively evaluate the capabilities of multimodal AI models.
To ensure the consistency and comparability of the assessments, all expert scores and MLLM ratings used a 1-to-10 point scale, where a higher score indicates a better innovation atmosphere. During the model evaluation phase, we employed a baseline prompt that did not specify a particular role, and the temperature parameter for all models was uniformly set to 0.2 to reduce sampling randomness and keep the outputs stable.
The comparative analysis results (Table 6) show a significant positive correlation between the evaluation performance of the six models and their credibility ratings. Among them, Gemini-2.5-pro demonstrated superior performance across the three core dimensions of consistency, stability, and credibility, significantly outperforming the other models. In terms of consistency, its Pearson correlation coefficient (r = 0.8900) and coefficient of determination (R² = 0.8060) with the expert scores were the highest, and the results were highly statistically significant (p < 0.001), indicating it can explain 80.6% of the variance in the expert ratings. In terms of stability, its Mean Absolute Error (MAE = 0.3756) and maximum error (Max = 1.2200) were the lowest among all models, showing not only the smallest average deviation but also the strongest robustness against extreme outliers. Regarding the credibility classification, Gemini-2.5-flash (r = 0.875, MAE = 0.3986) also met the standard for high credibility, while GPT-4o (r = 0.858, MAE = 0.4056) followed closely and was rated Medium (Table 6).
Table 6.
Differences between expert ratings and MLLM ratings.
| MLLM | MAE↓ | r↑ | Max↓ | Min↓ | R²↑ | Post-analysis credibility rating |
|---|---|---|---|---|---|---|
| Gemini-2.5-pro | 0.3756 | 0.8900 | 1.2200 | 0.0800 | 0.8060 | High |
| Gemini-2.5-flash | 0.3986 | 0.8750 | 1.5400 | 0.0800 | 0.7880 | High |
| GPT-4o | 0.4056 | 0.8580 | 1.8700 | 0.1200 | 0.7640 | Medium |
| Doubao-Seed-1.6-thinking | 0.4244 | 0.8010 | 2.0500 | 0.1300 | 0.7350 | Medium |
| QvQ-Max | 0.4765 | 0.7840 | 2.4100 | 0.1700 | 0.6870 | Limited |
| Claude 3.5 Sonnet | 0.5850 | 0.7560 | 2.7200 | 0.2500 | 0.6650 | Limited |
* The arrows indicate the direction of better performance for each metric. MAE (Mean Absolute Error), Max (Maximum Error), and Min (Minimum Error) are better when lower (↓). The Pearson correlation coefficient (r) and the coefficient of determination (R²) are better when higher (↑). All of the above models can simultaneously process both the real-world park images (the visual modality) and the evaluation indicator prompts (the text modality). Therefore, all six models are MLLMs.
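The agreement metrics reported in Table 6 can be illustrated with a minimal sketch. The paper does not state the exact formulas it used, so this version makes assumptions: MAE, Max, and Min are taken over per-park absolute errors, r is the sample Pearson correlation, and R² is computed as 1 − SS_res/SS_tot with the MLLM score treated as a direct prediction of the expert score. The function name `agreement_metrics` and the example scores are hypothetical and are not data from the study.

```python
import math


def agreement_metrics(expert, model):
    """Compare MLLM scores with expert scores given on the same 1-to-10 scale."""
    if len(expert) != len(model) or len(expert) < 2:
        raise ValueError("need two equal-length score lists with at least 2 items")
    n = len(expert)

    # Per-park absolute deviation between the MLLM score and the expert score.
    abs_err = [abs(e - m) for e, m in zip(expert, model)]

    mean_e = sum(expert) / n
    mean_m = sum(model) / n
    cov = sum((e - mean_e) * (m - mean_m) for e, m in zip(expert, model))
    var_e = sum((e - mean_e) ** 2 for e in expert)
    var_m = sum((m - mean_m) ** 2 for m in model)
    if var_e == 0 or var_m == 0:
        raise ValueError("scores must not be constant")

    # Residual sum of squares, treating the MLLM score as the prediction.
    ss_res = sum((e - m) ** 2 for e, m in zip(expert, model))

    return {
        "mae": sum(abs_err) / n,
        "max": max(abs_err),
        "min": min(abs_err),
        "r": cov / math.sqrt(var_e * var_m),
        "r2": 1 - ss_res / var_e,  # assumed definition; see lead-in
    }


# Hypothetical scores for five parks (illustrative only).
expert = [8.0, 6.5, 4.0, 7.0, 5.5]
model = [8.2, 6.0, 4.5, 7.1, 5.0]
metrics = agreement_metrics(expert, model)
```

Under this assumed definition, R² can never exceed the square of r, so other definitions (for example, a regression fitted with additional terms) would be needed to reproduce values where it does.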
Figure 3 further shows the high degree of alignment between Gemini-2.5-pro and the expert scores. Its evaluation results closely track the experts’ subjective judgments of the innovation atmosphere while also being quantifiable and reproducible. This supports the feasibility of this study’s methodology and lays a foundation for using the model in more in-depth and complex persona-based simulation assessments. Therefore, all subsequent analyses in this study use Gemini-2.5-pro as the core assessment tool.
Fig. 3.
Consistency check of the innovation atmosphere assessment based on Gemini-2.5-pro against expert scores.
Quantifying perceptual heterogeneity: a multi-stakeholder diagnosis
After validating the model’s performance, this study utilized the top-performing Gemini-2.5-pro to conduct a multi-stakeholder simulated assessment of the innovation atmosphere in the industrial parks. We constructed unique evaluation perspectives for four key stakeholder groups—enterprise founders, researchers, investors, and park managers—and quantified their perceptual weights for various innovation atmosphere indicators. The analysis results (Fig. 4) show that the perceptual weights of the four groups exhibit both significant heterogeneity and clear commonalities.
Fig. 4.
Distribution of evaluation indicator weights for industrial park innovation atmosphere by different stakeholders.
To validate the effectiveness of the persona-based simulation method, this study cross-compared the perceptual weights simulated by the MLLM with preference data obtained from real stakeholders through a structured survey (Table 7). The survey covered 212 key stakeholders, and their feedback showed a high degree of consistency with the simulated weights (overall r = 0.832, p < 0.001). Specifically, 78.3% of actual enterprise founders explicitly mentioned the need to “attract core talent” and “stimulate team collaboration” in the survey. This aligns closely with the high weights assigned to “Novel” (weight 0.11) and “Scene Diversity” (weight 0.10) in the simulation results. Similarly, the demand from actual researchers for a “disturbance-free work environment” and “stress-relieving spaces” accounted for 69.6% of their responses, which is consistent with the priority of the “Serene” (weight 0.08) and “Comfortable” (weight 0.06) indicators in the simulation.
Table 7.
Consistency comparison between simulated perceptual weights and real stakeholder preferences.
| Stakeholder Type | Core Focus Indicators | Real Preference Feedback (Survey %) | Item-specific Correlation Coefficient (r) |
|---|---|---|---|
| Enterprise Founder | Novel, Scene Diversity | Attracting Talent (78.3%), Collaborative Space (65.2%) | 0.857 |
| Researcher | Serene, Comfortable | Disturbance-free Environment (69.6%), Stress-reducing Space (61.4%) | 0.819 |
| Investor | Safe, Environmental Sense of Belonging | Asset Stability (72.5%), Development Potential (68.3%) | 0.803 |
| Park Manager | Noise Level, Facility Quality | Basic Environment Maintenance (83.1%), Overall Image (76.4%) | 0.861 |
Regarding the differences in weights, the preference differentiation among real stakeholders was consistent with the simulation results. In the survey, real investors placed greater emphasis on asset stability and development potential, which is logically consistent with the high weights assigned in the simulation to “Safe” (weight 0.10) and “Environmental Sense of Belonging” (weight 0.09). Meanwhile, the emphasis real park managers placed on “Basic Environment Maintenance” (83.1% in the survey) was higher than that of other stakeholders, matching the priority of “Noise Level” (weight 0.08) and “Facility Quality” (weight 0.08) in the simulation. Notably, purely aesthetic indicators (such as Beautiful and Colorful) ranked low in both real preferences (demand ≤ 15.7%) and simulated weights (weight ≤ 0.04). This further suggests that industrial park stakeholders, being function-oriented, prioritize the psychological utility of the spatial environment.
In terms of shared characteristics, safety was listed across all stakeholder groups in the real survey as an “indispensable basic condition” (92.4%), which aligns with its highest weight (0.09–0.12) across all simulated personas. This is consistent with the applicability of Maslow’s hierarchy of needs to spatial perception. Scene Diversity, which was linked to the need to promote cross-disciplinary communication by 73.6% of respondents, corresponded with its medium-to-high weight (0.08–0.11) across the simulated personas, indicating a consensus on its core value in stimulating innovation.
By applying these differently-weighted stakeholder perspectives to all sample parks, we obtained spatial distribution maps of the innovation atmosphere scores (Fig. 5). The results show that although the specific ratings from each stakeholder vary, the overall evaluation trends are highly consistent, revealing significant spatial differentiation. A clear spatial gradient formed within the study area, characterized as “low in the southwest, high in the northwest; low in the northeast, high in the southeast.” Specifically, the low-value areas for innovation atmosphere are concentrated in the western part of Qiaokou District and in Caidian District, whereas the high-value areas are clearly clustered in the southeastern region, centered around the East Lake High-tech Development Zone. This spatial pattern profoundly reflects the real-world gap in the quality of the innovation environment across different regions of Wuhan.
Fig. 5.
Distribution of innovation atmosphere scores for sample parks by different stakeholders.
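How persona-specific weights could turn indicator ratings into one park score per stakeholder can be sketched as follows. The study does not spell out its aggregation rule, so a normalized weighted mean is assumed here. The function `weighted_score`, the three-indicator subset, and all numbers are hypothetical illustrations and are not the weights reported in Fig. 4.

```python
def weighted_score(indicator_scores, weights):
    """Weighted mean of indicator scores on the 1-to-10 scale.

    Weights are renormalized over the indicators present in both inputs,
    so they do not have to sum to 1.
    """
    shared = [name for name in weights if name in indicator_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight <= 0:
        raise ValueError("weights over shared indicators must sum to a positive value")
    weighted_sum = sum(indicator_scores[name] * weights[name] for name in shared)
    return weighted_sum / total_weight


# Hypothetical indicator ratings for one park (illustrative only).
park = {"Safe": 8.0, "Novel": 6.0, "Serene": 7.0}

# Hypothetical persona weights (illustrative only).
personas = {
    "Enterprise Founder": {"Safe": 0.4, "Novel": 0.4, "Serene": 0.2},
    "Researcher": {"Safe": 0.4, "Novel": 0.2, "Serene": 0.4},
}

# One score per stakeholder for the same park.
scores = {name: weighted_score(park, w) for name, w in personas.items()}
```

With these inputs the researcher persona rates the park slightly higher than the founder persona, because the park scores better on "Serene" than on "Novel"; the same mechanism would produce the stakeholder-specific maps when applied to every sample park.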
Decoding the perceptual drivers of innovation atmosphere: a typological comparison
To further reveal the formation mechanism behind the spatial differentiation pattern of the innovation atmosphere in Wuhan’s industrial parks, we selected three representative park cases for in-depth analysis (Fig. 6). These three cases respectively represent three typical scoring patterns: High-Scoring Consensus, Perceptual Divergence and Low-Scoring Consensus. By analyzing the scoring differences among the four stakeholder groups for these three cases, we found that the core reason for the varied ratings and perceptual divergence is the quality of the built environment. The key factor is its ability to provide added value, such as inspiring creativity and promoting communication, beyond simply meeting basic functional needs.
Fig. 6.
Typical industrial park innovation atmosphere evaluation results based on different stakeholders.
Industrial Park A is defined as the High-Scoring Consensus, receiving high scores from all stakeholders (Enterprise Founder: 8.62; Researcher: 8.37; Investor: 8.75; Park Manager: 8.23). The key to this park’s success lies in its built environment, which not only fully covers basic functions but also creates multi-dimensional value tailored to the different needs of various stakeholders, thus fostering a high degree of consensus on the park’s value. Its novel architectural aesthetics and functionally composite scenes provide inspiration and a sense of belonging for founders’ teams, while the large central green space creates a serene, stress-relieving, and comfortable environment for researchers, ensuring the state required for deep work. At the same time, the high-quality built environment conveys strong asset value and future potential to investors and has also earned high recognition from park managers.
Industrial Park B exhibits significant Perceptual Divergence (Enterprise Founder: 7.26; Researcher: 6.92; Investor: 8.11; Park Manager: 7.95). The root of this divergence lies in the misalignment between the functional provisions from the manager’s perspective and the actual needs from the user’s perspective. For investors and managers, the park’s tidy and orderly roads and external environment convey a stable management level and controllable asset risk, thus receiving higher ratings. However, for the actual users—enterprise founders and researchers—this functional design, which lacks novelty, can hardly inspire creativity or attract talent. Furthermore, its location near a main road and its low greenery rate fail to provide the serene environment needed to alleviate stress and facilitate deep work.
Industrial Park C is a typical representative of the Low-Scoring Consensus, with all stakeholders giving it low ratings (Enterprise Founder: 4.38; Researcher: 4.52; Investor: 3.85; Park Manager: 4.12). Its core problem is a severe lack of “soft” environmental development capable of stimulating innovation vitality, presenting an overall character that is functionally monotonous, dull, and lacking in human-centric care. From an investor’s perspective, a park that lacks aesthetic appeal, comfort, and interactive spaces is unattractive in the capital market. For park managers, the poor facility quality directly leads to significant operational difficulties in attracting high-quality enterprises. For enterprise founders, this kind of park environment, with its lack of scene diversity, fails to retain the core talent needed for innovation and harms the company’s brand image. Finally, for researchers, the monotonous and dull park environment severely suppresses creative thinking and weakens their sense of belonging to the work environment.
In summary, the analysis of these three cases clearly indicates that a successful innovation environment requires not only the provision of complete infrastructure but also high-quality spatial design that responds to and meets the core value demands of different innovation stakeholders. The formation of perceptual divergence or negative consensus often stems from a misalignment or rupture between spatial supply and diverse demands, and this is precisely the micro-level mechanism that leads to the significant spatial differences in the innovation atmosphere of urban industrial parks.
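The three scoring patterns can be made concrete with a small rule-based sketch applied to the stakeholder scores reported above for Parks A, B, and C. The thresholds (`high`, `low`, `spread`) and the function `scoring_pattern` are hypothetical choices made for illustration; the study does not report numeric cut-offs for the typology.

```python
def scoring_pattern(scores, high=8.0, low=5.0, spread=1.0):
    """Label a park by the level and spread of its stakeholder scores.

    All three thresholds are illustrative assumptions, not values from the study.
    """
    values = list(scores.values())
    score_range = max(values) - min(values)
    mean = sum(values) / len(values)
    if score_range >= spread:
        return "Perceptual Divergence"
    if mean >= high:
        return "High-Scoring Consensus"
    if mean <= low:
        return "Low-Scoring Consensus"
    return "Unclassified"


# Stakeholder scores as reported in the text for the three case parks.
parks = {
    "A": {"Enterprise Founder": 8.62, "Researcher": 8.37,
          "Investor": 8.75, "Park Manager": 8.23},
    "B": {"Enterprise Founder": 7.26, "Researcher": 6.92,
          "Investor": 8.11, "Park Manager": 7.95},
    "C": {"Enterprise Founder": 4.38, "Researcher": 4.52,
          "Investor": 3.85, "Park Manager": 4.12},
}

patterns = {name: scoring_pattern(s) for name, s in parks.items()}
```

Under these assumed thresholds, Park B is flagged because the gap between its highest score (Investor) and its lowest score (Researcher) is 1.19 points, whereas Parks A and C have gaps below one point and are separated by their mean scores.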
Discussion
Comparative analysis of multi-agent simulation evaluation results based on MLLM
By comparing expert ratings with MLLM-based evaluations of real images of the same industrial parks, this study systematically validates that advanced MLLMs can efficiently simulate the comprehensive judgments of different stakeholders on the complex and subjective concept of innovation atmosphere. Compared to traditional methods, the MLLM-based approach marks a paradigm shift: the evaluation process is no longer limited by small-sample subjective questionnaires, nor is it reliant on macroeconomic indicators that cannot explain the innovation process. Instead, it enables quantifiable, reproducible, and efficient automated diagnostics.
Further analysis reveals a significant perceptual divergence among stakeholders, which delineates two primary clusters: park users (enterprise founders, researchers) and park service providers (investors, park managers). This observation lends new empirical support to the Person-Environment (P-E) Fit Theory63. Concurrently, it substantiates the well-documented perceptual gap between expert and user perspectives in built environment evaluations. As demonstrated by Xu et al.64, even within conventional urban public spaces, design professionals exhibit a predilection for features that ensure the functionality and order of necessary activities, whereas lay users are more responsive to environmental qualities that afford spontaneous social interaction. Our research identifies a comparable structural disparity in the more functionally oriented context of innovation districts. The user group prioritizes micro-spatial qualities conducive to creativity, deep work, and informal communication, such as environmental novelty, serenity, and scene diversity. Notably, within this group, researchers exhibit a strong preference for serene environments that facilitate deep work and comfortable spaces that mitigate cognitive fatigue, a finding highly consistent with the principles of Attention Restoration Theory (ART)65,66. In contrast, the service provider group tends to evaluate the environment from a macro-level perspective, focusing on indicators like cleanliness and a sense of environmental belonging, which they associate with asset value, management quality, and security.
Despite the aforementioned perceptual divergences, all stakeholders demonstrated a strong consensus on several core indicators. Notably, Safety emerged as a universally indispensable prerequisite for all innovation activities. This finding corroborates the applicability of Maslow’s hierarchy of needs to spatial perception67,68, positing that the physical environment must first fulfill fundamental safety requirements to support higher-order social and creative functions. Furthermore, Scene Diversity was consistently valued across all groups, a finding that resonates with Granovetter’s Strength of Weak Ties Theory69 and Oldenburg’s Third Place Theory70. This suggests a collective recognition that spaces promoting informal social encounters are critical vectors for stimulating innovation and collective creativity. It is noteworthy that traditional, purely aesthetic indicators (such as Beautiful and Colorful) have relatively low weights across all stakeholder groups. This suggests that in the small and medium-sized “R&D-production” integrated industrial parks examined in this study, the environmental psychology utility of a space (such as supporting concentration and relieving stress) is more important than its purely visual appeal.
However, this conclusion is not absolute. The relationship between visual appeal and functional utility can vary significantly depending on the type and context of the innovation space. Innovation spaces with different positionings have fundamentally different demands for aesthetic indicators. Relevant studies have provided corroborating evidence by comparing perceptual differences across various types of spaces. Rui et al.30 discovered that the physical environmental characteristics of different types of spaces directly shape the users’ visual perception and evaluative focus. Based on this, our study further proposes that the weights of function and aesthetics may depend on the type of innovation activities. For example, in creative parks centered on industries like design, art, and cultural innovation, visual uniqueness and richness may be core elements that directly inspire creativity, and the weight of their aesthetic indicators would likely be significantly higher than in our sample. Conversely, parks focused on hard-tech R&D and precision manufacturing are more inclined to prioritize functional practicality and environmental stability. Therefore, future research and practice should not treat functionality and aesthetics as a simple dichotomy. Instead, they should dialectically explore the optimal path for synergistic enhancement between the two, based on specific contextual factors such as the park’s positioning, target audience, and industrial culture, in order to more precisely shape attractive and competitive innovation spaces.
Implications for the development policies and practices of industrial parks
By integrating MLLM with a sustainable innovation atmosphere framework, this study introduces a pivotal methodological innovation for the sustainable planning, adaptive management, and human-centric design of industrial parks. This method provides park planners and managers with an empirically validated and cost-effective decision-support tool, ensuring the innovation ecosystem is highly aligned with long-term sustainable development goals.
For the planning of newly built parks, this framework can conduct pre-evaluation of multiple design schemes. When reviewing design schemes, planning departments no longer need to rely solely on renderings and abstract statements. Instead, they can require all bidding schemes to go through the MLLM evaluation framework proposed in this study to generate a clear quantitative scoring table for the innovation atmosphere. This scoring table can be used for horizontal comparison of different schemes, reducing reliance on subjective judgments and supporting scientific, data-driven choices. In the pre-planning stage, this evaluation mechanism can be incorporated as a core reference for scheme selection, actively avoiding the high-cost renovation risks caused by insufficient design of the innovation space atmosphere and effectively reducing the uncertainty of pre-investment. In essence, this mechanism allows for a simulation rehearsal of the park’s atmosphere before construction begins, helping the final design achieve a configuration oriented towards human-centric innovation.
For the iterative upgrading of existing parks, this framework provides a powerful digital Post-Occupancy Evaluation (POE) tool, promoting the transformation of park management from passive response to active diagnosis. MLLM can identify the spatial shortcomings that restrict the improvement of sustainability and vitality. For example, by uncovering the perceptual differences among various groups, it can reveal the deep-seated reasons for a lack of park vitality: it could be a shortage of the inspiring interactive spaces needed by young innovators, or a deficit in the serene, green environments required by senior researchers for deep thought. This insight supports precise policymaking, allowing for targeted and efficient small-scale upgrades that meet the specific needs of different user groups. This approach can substitute for costly, large-scale redevelopment and maximize resource utilization efficiency. At the same time, managers can incorporate the framework into annual evaluation work. Through quantifiable data on the improvement of user satisfaction and indicators of park vitality growth, they can demonstrate the return on investment to stakeholders.
Furthermore, this method provides a scalable solution for broader urban governance. Decision-makers can extend it to diverse innovation scenarios, including technology incubators, university science parks, and innovation districts. By mapping the dynamic innovation atmosphere, decision-makers can conduct objective comparative analyses of all innovation carriers in the city. This provides a clear decision-making basis for strategic public resource allocation, enabling the accurate identification of benchmark areas and key areas that need policy support. More importantly, through annual dynamic evaluations, a feedback loop of “policy-effect-adjustment” can be established, enabling urban managers to continuously track the effectiveness of policy interventions and dynamically adjust their urban innovation development strategies, thus achieving refined control of the urban innovation landscape.
Limitations and future work
Although this study successfully validated the feasibility of using MLLM to assess the innovation atmosphere of parks, it is necessary to acknowledge its limitations, which also point to directions for future research. First, the assessment in this study was based on a curated dataset of images, which has inherent limitations in terms of Spatial Representativeness. While we systematically included key spatial images reflecting the core functions of the parks, this is essentially a sampling method and cannot provide a complete panoramic view of the park. Insufficient coverage of peripheral areas, non-core functional zones, or visually less ideal locations may lead to assessment results that do not fully reflect the overall environmental quality of the park, potentially introducing a degree of selection bias. Second, this study was unable to fully control for temporal and environmental variables during the data collection phase. Transient factors such as lighting conditions and the state of vegetation can affect the visual presentation of a space, which in turn could interfere with the MLLM’s perceptual judgment. Future research could enhance the robustness and comprehensiveness of the assessment results by integrating panoramic street-view imagery that covers the main functional spaces of the parks, or by combining it with drone-based 3D reality modeling to achieve a full-domain assessment of the park’s innovation atmosphere. At the same time, as MLLM technology continues to iterate, the cross-modal understanding and reasoning capabilities of these models will be further enhanced, holding the promise of achieving a more refined and profound interpretation of spatial atmosphere.
Conclusions
This study addresses the long-standing challenge of quantifying complex urban spatial qualities by proposing and validating a novel, data-driven framework for urban spatial perception. Focusing on the innovation atmosphere of industrial parks, a key indicator of urban vitality, our research makes several key contributions.
First, we empirically demonstrate that advanced Multimodal Large Language Models (MLLMs), specifically Gemini-2.5-pro, can closely approximate the nuanced judgments of human experts in assessing complex spatial environments. The high correlation with the expert benchmark (r = 0.890, p < 0.001) validates the feasibility of shifting the assessment of urban spatial quality from a subjective, experience-driven approach to a scalable and objective data-driven paradigm.
Second, by implementing a persona-based simulation, our framework effectively quantifies the perceptual heterogeneity among different stakeholders. This method, analogous to a large-scale sentiment analysis in urban spaces, reveals that enterprise founders and researchers prioritize micro-spatial qualities that foster creativity, whereas investors and park managers focus on macro-level order and asset image. This finding provides granular insights into the multifaceted nature of the city image as perceived by different user groups.
Methodologically, this research pioneers an efficient AI-driven approach for a holistic, human-centric diagnosis of specialized urban environments, moving beyond the limitations of traditional surveys and single-task computational models. Practically, it offers a powerful decision-support tool for data-driven urban planning and governance. By translating the abstract concept of urban spatial perception into actionable insights, our framework enables planners and managers to optimize innovation spaces with precision. This fosters a virtuous cycle: enhancing spatial quality to attract and retain talent, which in turn stimulates innovation, ultimately driving the creation of innovation ecosystems that are more productive, collaborative, and sustainable.
Author contributions
Xinyuan Chen: Conceptualization, Methodology, Validation, Writing – original draft; Zhongying Song: Methodology, Visualization, Writing – original draft; Li Xu: Investigation, Resources; Junhua Zhu: Software, Visualization; Qiang Niu: Conceptualization, Writing – review & editing, Supervision, Funding acquisition; Guo Cheng: Conceptualization, Methodology, Writing – review & editing. All authors reviewed the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (52278075) and the Key Project of the Wuhan Municipal Knowledge Innovation Program (2023010188060007).
Data availability
The data that support the findings of this study are available from the corresponding author (C.G., 2025102090012@whu.edu.cn) upon reasonable request, owing to privacy concerns related to the industrial parks.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
These authors contributed equally: Xinyuan Chen and Zhongying Song.
Contributor Information
Qiang Niu, Email: niuqiang@whu.edu.cn.
Guo Cheng, Email: 2025102090012@whu.edu.cn.
References
- 1. Wu, Y. & Gao, X. Can the establishment of eco-industrial parks promote urban green innovation? Evidence from China. J. Clean. Prod. 341, 130855 (2022).
- 2. Ungureanu, P., Cochis, C., Bertolotti, F., Mattarelli, E. & Scapolan, A. C. Multiplex boundary work in innovation projects: the role of collaborative spaces for cross-functional and open innovation. Eur. J. Innov. Manage. 24, 984–1010 (2020).
- 3. Xu, J., Qiu, B., Zhang, F. & Zhang, J. Restorative effects of pocket parks on mental fatigue among young adults: A comparative experimental study of three park types. Forests 15, 286 (2024).
- 4. Sallis, J. F., Johnson, M. F., Calfas, K. J., Caparosa, S. & Nichols, J. F. Assessing perceived physical environmental variables that may influence physical activity. Res. Q. Exerc. Sport. 10.1080/02701367.1997.10608015 (1997). https://www.tandfonline.com/doi/abs/
- 5. Ewing, R. H. et al. Measuring Urban Design: Metrics for Livable Places Vol. 200 (Island, 2013).
- 6. B., M. A. The uses of big data in cities. Big Data. 10.1089/big.2013.0042 (2014).
- 7. Caruelle, D., Gustafsson, A., Shams, P. & Lervik-Olsen, L. The use of electrodermal activity (EDA) measurement to understand consumer emotions–a literature review and a call for action. J. Bus. Res. 104, 146–160 (2019).
- 8. Kim, M., Cheon, S. & Kang, Y. Use of electroencephalography (EEG) for the analysis of emotional perception and fear to nightscapes. Sustainability 11, 233 (2019).
- 9. Reece, R., Bornioli, A., Bray, I. & Alford, C. Exposure to green and historic urban environments and mental well-being: results from EEG and psychometric outcome measures. Int. J. Environ. Res. Public Health. 19, 13052 (2022).
- 10. Liang, H., Zhang, J., Li, Y., Zhu, Z. & Wang, B. Automatic estimation for visual quality changes of street space via street-view images and multimodal large language models. (2023). https://www.preprints.org/frontend/manuscript/1c3c24d0ed8f219c5cfacb49b1c49c12/download_pub
- 11. Malekzadeh, M., Willberg, E., Torkko, J. & Toivonen, T. Urban attractiveness according to ChatGPT: contrasting AI and human insights. Comput. Environ. Urban Syst. 117, 102243 (2025).
- 12. Blečić, I., Saiu, V. & Trunfio, A. Enhancing urban walkability assessment with multimodal large language models. In Computational Science and its Applications – ICCSA 2024 Workshops (eds Gervasi, O. et al.) 394–411 (Springer Nature Switzerland, 2024). 10.1007/978-3-031-65282-0_26.
- 13. Ki, D., Lee, H., Park, K., Ha, J. & Lee, S. Measuring nuanced walkability: leveraging ChatGPT’s vision reasoning with multisource spatial data. Comput. Environ. Urban Syst. 121, 102319 (2025).
- 14. Melnychenko, A., Shevchuk, N., Babiy, I., Blyznyuk, T. & Akimova, O. Transformation of industrial parks in the direction of providing of the purposes achievement of sustainable development. Int. J. Comput. Sci. Netw. Secur. 22, 7–14 (2022).
- 15. Phan, P. H., Siegel, D. S. & Wright, M. Science parks and incubators: Observations, synthesis and future research. J. Bus. Ventur. 20, 165–182 (2005).
- 16. Côté, R. P. & Cohen-Rosenthal, E. Designing eco-industrial parks: A synthesis of some experiences. J. Clean. Prod. 6, 181–188 (1998).
- 17. Katz, B. & Wagner, J. The rise of urban innovation districts. Harv Bus. Rev. 12. https://hbr.org/2014/11/the-rise-of-urban-innovation-districts (2014).
- 18. Amabile, T. M., Barsade, S. G., Mueller, J. S. & Staw, B. M. Affect and creativity at work. Adm. Sci. Q. 50, 367–403 (2005).
- 19. Florida, R. Cities and the Creative Class (Routledge, 2005).
- 20. Glaeser, E. L. & Resseger, M. G. The complementarity between cities and skills. J. Reg. Sci. 50, 221–244 (2010).
- 21. Wang, J., Tong, C. & Hu, X. Policy zoning method for innovation districts to sustainably develop the knowledge-economy: A case study in Hangzhou, China. Sustainability 13, 3503 (2021).
- 22. Bloom, N., Van Reenen, J. & Williams, H. A toolkit of policies to promote innovation. J. Economic Perspect. 33, 163–184 (2019).
- 23. Maennig, W. & Ölschläger, M. Innovative milieux and regional competitiveness: the role of associations and chambers of commerce and industry in Germany. Reg. Stud. 45, 441–452 (2011).
- 24. Kim, Y. A. & Hipp, J. R. Density, diversity, and design: three measures of the built environment and the spatial patterns of crime in street segments. J. Criminal Justice. 77, 101864 (2021).
- 25.Wang, X., Zhang, Y., Yu, D., Qi, J. & Li, S. Investigating the Spatiotemporal pattern of urban vibrancy and its determinants: Spatial big data analyses in beijing, China. Land. Use Policy. 119, 106162 (2022). [Google Scholar]
- 26.Dabrowska, J. Measuring the success of science parks: Performance monitoring and evaluation. (2011). https://repositorio.minciencias.gov.co/bitstream/handle/20.500.14143/265/1622-DABROWSKA_2011_MEASURING_TH.PDF?sequence=1
- 27.Bigliardi, B., Dormio, A. I., Nosella, A. & Petroni, G. Assessing science parks’ performances: directions from selected Italian case studies. Technovation26, 489–505 (2006). [Google Scholar]
- 28.Lu, J. et al. IOP Publishing,. Evaluation on synergetic innovation ability of environmental protection industrial park. in IOP Conference Series: Earth and Environmental Science vol. 598 012079 (2020).
- 29.Anderson, N. R. & West, M. A. Measuring climate for work group innovation: development and validation of the team climate inventory. J. Organiz Behav.19, 235–258 (1998). [Google Scholar]
- 30.Rui, J., Xu, Y., Cai, C. & Li, X. Leveraging large Language models for tourism research based on 5D framework: A collaborative analysis of tourist sentiments and Spatial features. Tour. Manag.108, 105115 (2025). [Google Scholar]
- 31.Liang, J. et al. GSV2SVF-an interactive GIS tool for sky, tree and Building view factor Estimation from street view photographs. Build. Environ.168, 106475 (2020). [Google Scholar]
- 32.Zhang, L. et al. Quantifying the urban visual perception of Chinese traditional-style Building with street view images. Appl. Sci.10, 5963 (2020). [Google Scholar]
- 33.He, N. & Li, G. Urban neighbourhood environment assessment based on street view image processing: A review of research trends. Environ. Challenges. 4, 100090 (2021). [Google Scholar]
- 34.Kostikova, A. et al. LLLMs: A data-driven survey of evolving research on limitations of large Language models. Preprint at.10.48550/arXiv.2505.19240 (2025). [Google Scholar]
- 35.Hadi, M. U. et al. A survey on large language models: Applications, challenges, limitations, and practical usage. Authorea Preprints (2023). https://www.authorea.com/doi/full/10.36227/techrxiv.23589741.v3?commit=257b583a651fe9d363a4bce30dd48b38eb5a2bea
- 36.Liu, Y. et al. Sora: A review on background, technology, limitations, and opportunities of large vision models. Preprint at https://doi.org/10.48550/arXiv.2402.17177 (2024).
- 37.Wu, J. et al. Reinforcing spatial reasoning in vision-language models with interwoven thinking and visual drawing. Preprint at https://doi.org/10.48550/arXiv.2506.09965 (2025).
- 38.Belaroussi, R. Subjective assessment of a built environment by ChatGPT, Gemini and Grok: comparison with architecture, engineering and construction expert perception. Big Data Cogn. Comput. 9, 100 (2025).
- 39.Li, L., Ye, Y., Jiang, B. & Zeng, W. GeoReasoner: Geo-localization with reasoning in street views using a large vision-language model. In Forty-first International Conference on Machine Learning (2024).
- 40.Zhang, J., Li, Y., Fukuda, T. & Wang, B. Urban safety perception assessments via integrating multimodal large language models with street view images. Cities 165, 106122 (2025).
- 41.Shang, Y. et al. UrbanWorld: an urban world model for 3D city generation. Preprint at https://doi.org/10.48550/arXiv.2407.11965 (2024).
- 42.Zhang, D., Xiong, Z. & Zhu, X. Evaluation of thermal comfort in urban commercial space with vision–language-model-based agent model. Land 14, 786 (2025).
- 43.Falotico, R. & Quatto, P. Fleiss’ kappa statistic without paradoxes. Qual. Quant. 49, 463–470 (2015).
- 44.Ulrich, R. S. Stress reduction theory. D. Marchand, E. Pol, & K. Weiss (Eds.) 100, 143–146 (2023).
- 45.Basu, A., Duvall, J. & Kaplan, R. Attention restoration theory: exploring the role of soft fascination and mental bandwidth. Environ. Behav. 51, 1055–1081 (2019).
- 46.Pham, T. P. & Sanocki, T. Human attention restoration, flow, and creativity: A conceptual integration. J. Imaging 10, 83 (2024).
- 47.Kothencz, G. & Blaschke, T. Urban parks: visitors’ perceptions versus spatial indicators. Land Use Policy 64, 233–244 (2017).
- 48.Dean, J. T. Noise, cognitive function, and worker productivity. Am. Econ. J. Appl. Econ. 16, 322–360 (2024).
- 49.Moultrie, J. et al. Innovation spaces: towards a framework for understanding the role of the physical environment in innovation. Creativity Innov. Manage. 16, 53–65 (2007).
- 50.Wu, K., Wang, Y., Zhang, H., Liu, Y. & Ye, Y. Impact of the built environment on the spatial heterogeneity of regional innovation productivity: evidence from the Pearl River Delta, China. Chin. Geogr. Sci. 31, 413–428 (2021).
- 51.Stokols, D., Clitheroe, C. & Zmuidzinas, M. Qualities of work environments that promote perceived support for creativity. Creativity Res. J. 14, 137–147 (2002).
- 52.Roe, D. Naturally artificial: the Pre-Raphaelite garden enclosed. Vic. Poetry 57, 131–153 (2019).
- 53.Daniel, G. R. Safe spaces for enabling the creative process in classrooms. Australian J. Teacher Educ. (Online) 45, 41–57 (2020).
- 54.Caivano, J. L. Research on color in architecture and environmental design: brief history, current developments, and possible future. Color Res. Appl. 31, 350–363 (2006).
- 55.Azudin, N., Ismail, M. N. & Taherali, Z. Knowledge sharing among workers: A study on their contribution through informal communication in Cyberjaya, Malaysia. Knowl. Manage. E-Learning 1, 139 (2009).
- 56.Yun, J. J., Zhao, X., Yigitcanlar, T., Lee, D. & Ahn, H. Architectural design and open innovation symbiosis: insights from research campuses, manufacturing systems, and innovation districts. Sustainability 10, 4495 (2018).
- 57.Moritz, E. The tapestry metaphor: Weaving meaning from threads. Experimenting with Gemini Pro 2.5. (2025). https://www.researchgate.net/profile/Elan-Moritz/publication/390527921_The_Tapestry_Metaphor_Weaving_Meaning_from_Threads_Experimenting_with_Gemini_Pro25/links/67f1e276e8041142a16a2991/The-Tapestry-Metaphor-Weaving-Meaning-from-Threads-Experimenting-with-Gemini-Pro25.pdf
- 58.Comanici, G. et al. Gemini 2.5: pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. Preprint at https://doi.org/10.48550/arXiv.2507.06261 (2025).
- 59.OpenAI et al. GPT-4o system card. Preprint at https://doi.org/10.48550/arXiv.2410.21276 (2024).
- 60.Wang, Y. et al. AICrypto: A comprehensive benchmark for evaluating cryptography capabilities of large language models. Preprint at https://doi.org/10.48550/arXiv.2507.09580 (2025).
- 61.Qiu, Y. et al. Human-aligned bench: Fine-grained assessment of reasoning ability in MLLMs vs. humans. Preprint at https://doi.org/10.48550/arXiv.2505.11141 (2025).
- 62.Suzuki, K. Claude 3.5 Sonnet indicated improved TNM classification on radiology report of pancreatic cancer. Jpn. J. Radiol. 43, 56–57 (2025).
- 63.Caplan, R. D. & Van Harrison, R. Person-environment fit theory: some history, recent developments, and future directions. J. Soc. Issues 49, 253–275 (1993).
- 64.Xu, L., Zhang, Y., Li, F. & Yin, J. Perceptual difference of urban public spaces between design professionals and ‘laypersons’: Evidence, health implications and ready-made urban design templates. Indoor Built Environ. https://doi.org/10.1177/1420326X221116318 (2022).
- 65.Neilson, B. N., Craig, C. M., Travis, A. T. & Klein, M. I. A review of the limitations of attention restoration theory and the importance of its future research for the improvement of well-being in urban living. Visions Sustain. https://doi.org/10.13135/2384-8677/3323 (2019).
- 66.Liu, Y., Zhang, J., Liu, C. & Yang, Y. A review of attention restoration theory: implications for designing restorative environments. Sustainability 16, 3639 (2024).
- 67.Maslow, A. & Lewis, K. J. Maslow’s hierarchy of needs. Salenger Incorporated. 14, 987–990 (1987).
- 68.Shafique, A. Hierarchy of user’s need for spatial organisation in public open spaces. Eur. J. Archit. Urban Plann. 3, 1–8 (2024).
- 69.Friedkin, N. A test of structural features of Granovetter’s strength of weak ties theory. Social Networks 2, 411–422 (1980).
- 70.Markoç, İ. Twitter in the context of Oldenburg’s third place theory. IBAD 79–89. https://doi.org/10.21733/ibad.610335 (2019).
Data Availability Statement
The data that support the findings of this study are available from the corresponding author (C.G., 2025102090012@whu.edu.cn) upon reasonable request; they are not publicly shared because of privacy concerns related to the industrial parks.






