Scientific Reports. 2025 Feb 25;15:6768. doi: 10.1038/s41598-025-90783-w

Social implications of coexistence of CAVs and human drivers in the context of route choice

Grzegorz Jamróz 1, Ahmet Onur Akman 2, Anastasia Psarou 2, Zoltán György Varga 2, Rafał Kucharski 1
PMCID: PMC11862007  PMID: 40000751

Abstract

Suppose in a stable urban traffic system populated only by human driven vehicles (HDVs), a given proportion (e.g. Inline graphic) is replaced by a fleet of Connected and Autonomous Vehicles (CAVs), which share information and pursue a collective goal. Suppose these vehicles are centrally coordinated and differ from HDVs only by their collective capacities allowing them to make more efficient routing decisions before the travel on a given day begins. Suppose there is a choice between two routes and every day each driver makes a decision which route to take. Human drivers maximize their utility. CAVs might optimize different goals, such as the total travel time of the fleet. We show that in this plausible futuristic setting, the strategy CAVs are allowed to adopt may result in human drivers either benefitting or being systematically disadvantaged and urban networks becoming more or less optimal. Consequently, some regulatory measures might become indispensable.

Subject terms: Computational science, Civil engineering

Introduction

Which route should I take? Millions of people commuting to work by car face this dilemma every day44. In urban settings the choice is not straightforward as there are usually multiple viable alternatives. In fact, the reasons we select a given route might be very complex3,5 ranging from habitual choice or everyday exploration in order to identify the best alternative to anticipating decisions of others. Moreover, people are often very different and might prefer different options in the same situation or behave seemingly irrationally28. Suppose now that in a future urban traffic system with stable drivers’ choice strategies a proportion of human drivers (HDVs) is replaced by intelligent vehicles (CAVs) which share information and make collective route choices based on one of the pre-defined collective fleet strategies:

  • Selfish (minimization of CAVs’ collective travel time),

  • Altruistic (minimization of HDVs’ mean travel time),

  • Social (minimization of the mean travel time of all vehicles in the system),

  • Malicious (aiming to maximize HDVs’ mean travel time),

  • Disruptive (maximization of HDVs’ travel time at a bounded own cost).

Once the system has stabilized again after such a disruption, will the route preferences of CAVs and HDVs differ? Will CAVs be better off than the HDVs they replaced? And, crucially, could the human drivers be significantly disadvantaged or the system-wide travel times deteriorate?

In this paper we set out to study these fundamental questions using mathematical models and simulations, see Fig. 2. Focusing on the two-route bottleneck settings, Fig. 1, which are often present in real systems20,34, we discover that:

  • The choices of CAVs that replace a given share of HDVs differ significantly from the choices of the remaining HDVs.

  • In different scenarios the average travel time of both HDVs and CAVs may increase or decrease, Fig. 2.

  • If the fleet of CAVs applies the selfish strategy, it may improve its collective travel time at a cost to human drivers when the share of CAVs is small.

  • For a large share of CAVs, the selfish or social strategies of CAVs may result in improvement of travel times for all the drivers. This, however, comes at a price of reduced equity.

  • Human driver populations with low perception bias may be less prone to exploitation by intelligent fleets of CAVs than more diverse and less optimal populations.

  • Heavily congested systems, where the choices of HDVs and CAVs tend to be similar, may be less susceptible to exploitation by CAVs. Contrariwise, uncongested networks could be easily exploited by machines.

  • More elaborate, e.g. malicious, CAV strategies may result in oscillations and significant deterioration of driving conditions for all the drivers.

  • Allowing agents to switch dynamically between HDV and CAV may affect travel times in various ways, advantageous or disadvantageous to both HDVs and CAVs.

  • If all agents are free to choose between HDV and CAV, malicious and disruptive fleet strategies may result in higher eventual market share of CAVs and may become preferred choices of fleet operators.

These conclusions seem to have been missing in the literature dealing with CAV - HDV interaction and constitute our original contribution to the subject. We obtain them from simulations by comparing the properties of the two-route bottleneck system before and after the introduction of CAVs.

Fig. 2.

The learning and decision processes applied by human drivers (HDVs) and machines (CAVs). HDVs’ reasoning is subjective and based on limited access to information. Contrariwise, CAVs have access to complete information on travel times and make optimal collective routing decisions. The interaction between human agents and CAVs may result in any combination of human drivers and CAVs being better off or worse off subject to the strategy applied by CAVs. In particular, the system-wide welfare may improve or deteriorate in the wake of introduction of CAVs.

Fig. 1.

A two-route bottleneck in a city. To reach the other side of the river the drivers have to choose between the alternatives A and B. The everyday choice to minimize travel time can be understood as a repeated game between multiple participants striving to find the option which maximizes a driver’s utility.

The standard econometric framework used to quantify choice is the expected utility theory41, which posits that people choose the alternative with the highest expected utility. In the route choice setting with no access to external sources of information, the main component of utility is the predicted travel time8:

$$U_r = -\hat{t}_r \tag{1}$$

where $U_r$ is the utility of route r and $\hat{t}_r$ is the expected (by a given agent) travel time on route r. If other factors are negligible, the rational HDV choice is to select the route with the highest utility, which corresponds to the shortest expected travel time. In the case of bottlenecks with two alternatives A and B, Fig. 1, this amounts to choosing

$$r^{*} = \arg\max_{r \in \{A, B\}} U_r = \arg\min_{r \in \{A, B\}} \hat{t}_r \tag{2}$$

Transport systems analysts typically assume that the system is in or close to equilibrium43. This means that the numbers of drivers traveling along alternative routes within a given time interval, e.g. the morning peak hour, are stable across consecutive days. This also implies stability of travel times (which may be assumed to depend monotonically, via the BPR11 function, on the number of drivers, see Methods) on different routes.
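The monotone dependence of travel time on the number of drivers can be illustrated with the standard BPR volume-delay form, t = t0 (1 + a (q/c)^b). The following is a minimal sketch only: the function name, the free-flow times, the capacities and the coefficients a = 0.15, b = 4 (common textbook defaults) are assumptions made for illustration and are not the values specified in Methods.

```python
def bpr_travel_time(flow, free_flow_time, capacity, a=0.15, b=4.0):
    """Travel time on a route carrying `flow` vehicles (BPR volume-delay form).

    The coefficients a and b are common textbook defaults, assumed here.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


# Hypothetical routes A and B; all numbers are made up for illustration.
ROUTE_A = {"free_flow_time": 5.0, "capacity": 500.0}
ROUTE_B = {"free_flow_time": 15.0, "capacity": 800.0}

if __name__ == "__main__":
    for q in (0, 250, 500, 1000):
        t_a = bpr_travel_time(q, **ROUTE_A)
        t_b = bpr_travel_time(1000 - q, **ROUTE_B)
        print(f"flow on A = {q:4d}: time on A = {t_a:6.2f}, time on B = {t_b:6.2f}")
```

Because the travel time grows with flow on each route, stable flows across days directly imply stable travel times.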

The most classical and widely-accepted traffic equilibrium, postulated by Wardrop52, occurs when no single driver, who is assumed to have infinitesimal influence on the system as a whole, has an incentive to swap routes provided other drivers do not modify their choices the following day. Quantitatively, the drivers are assumed to make choices according to formula (2), where Inline graphic are equilibrium travel times52. This so-called User Equilibrium (UE) is reminiscent of Nash equilibrium40 in game theory and in simple settings can be explicitly computed. When the number of agents is finite, however, the setting becomes an atomic congestion game which is inherently unstable1,30, see also Appendix, and often admits multiple Nash equilibria55.
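For two routes, the User Equilibrium with infinitesimal drivers can be computed explicitly, for instance by bisection on the travel-time difference. The sketch below assumes BPR-type travel times with hypothetical parameters (free-flow time and capacity per route); the helper names `bpr` and `user_equilibrium` are illustrative and do not come from the simulation software used in this study.

```python
def bpr(flow, t0, cap, a=0.15, b=4.0):
    """BPR-type travel time; a and b are assumed textbook defaults."""
    return t0 * (1.0 + a * (flow / cap) ** b)


def user_equilibrium(n_drivers, route_a, route_b, tol=1e-9):
    """Flow on A at which no (infinitesimal) driver gains by swapping routes.

    route_a and route_b are (free_flow_time, capacity) tuples.
    """
    def gap(q_a):
        # Travel-time difference A minus B; increasing in q_a.
        return bpr(q_a, *route_a) - bpr(n_drivers - q_a, *route_b)

    if gap(n_drivers) <= 0:  # A is not slower even when everyone uses it
        return float(n_drivers)
    if gap(0.0) >= 0:        # A is not faster even when it is empty
        return 0.0
    lo, hi = 0.0, float(n_drivers)  # invariant: gap(lo) < 0 <= gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    q_a = user_equilibrium(1000, (5.0, 500.0), (15.0, 800.0))
    print(f"UE flow on A: {q_a:.1f}, on B: {1000 - q_a:.1f}")
```

At an interior solution the two travel times coincide, which is exactly the condition that no driver has an incentive to swap routes.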

A more realistic setting, adopted in our study, assumes that there exist other components of utility in Eq. (1), such as tastes or fluctuations in driving conditions, which are incorporated via formula

$$U_r = -\hat{t}_r + \varepsilon_r \tag{3}$$

where $U_r$ is the utility of route r, $\hat{t}_r$ its expected travel time and $\varepsilon_r$ are random variables. This setting, the subject of random utility theory39,49 implies that, for $\varepsilon_A, \varepsilon_B$ independent identically distributed Gumbel variables commonly used in the field of transportation (see Methods), the expected proportion of drivers choosing alternative A is given by the logit formula:

$$P_A = \frac{\exp(-\hat{t}_A/\beta)}{\exp(-\hat{t}_A/\beta) + \exp(-\hat{t}_B/\beta)} \tag{4}$$

which is pervasive in transport modelling14. In (4), $\beta$ is the spread of subjective HDV tastes (perception bias). Low spread corresponds to HDVs preferring routes close to optimal in terms of travel time. High spread makes the choices more random. Assuming that the number of vehicles is very large, Daganzo and Sheffi17 postulated the so-called stochastic user equilibrium (SUE), in which no agent believes they can improve their travel time by unilaterally changing routes.
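The effect of the spread on the logit choice can be checked numerically. The following minimal sketch assumes that the spread acts as the scale of the Gumbel terms, so that the share choosing A is a logistic function of the travel-time difference divided by the spread; the function name and the example travel times are hypothetical, while 5.0 is the default spread listed in Table 3.

```python
import math


def logit_share_a(t_a, t_b, spread):
    """Expected share of drivers choosing A under the logit formula.

    Assumes utilities -t_r plus i.i.d. Gumbel noise of scale `spread`.
    """
    if spread <= 0:
        raise ValueError("spread must be positive")
    x = (t_b - t_a) / spread
    # Numerically stable logistic function of x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Hypothetical expected travel times: A is faster by 4 time units.
    for spread in (0.01, 5.0, 1000.0):
        share = logit_share_a(8.0, 12.0, spread)
        print(f"spread = {spread:7.2f}: share on A = {share:.3f}")
```

With a very small spread almost all drivers pick the faster route, whereas a very large spread drives the split towards one half regardless of travel times.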

Fast-forward to 2024, the logit choice, based on Gumbel-distributed random terms in (3), and its variants6 is still the most popular family of human route choice models, see also10,23,36,54 for other approaches. Accordingly, we adopt a plausible logit-type model, called Inline graphic-Gumbel, in this paper, see Methods, noting that for normally distributed error terms the results are similar, see Appendix. Importantly, the logit choice formula can be derived not only based on the error in perceived utility as per Daganzo and Sheffi17 but also within the more recent framework of rational inattention22,38. The equilibrium notions of SUE as well as UE, see also BRUE19,37, however, seem to be poorly suited to more realistic state-of-the-art models of multi-agent simulations, initiated 40 years ago by Horowitz31 and employed in this paper. Therefore, instead of assuming that the system is strictly in equilibrium, such as SUE, we study experimentally systems which stabilize, see Appendix, bearing in mind that the stable states can be very complex15,53 or nonunique47.

As the system is not exactly in equilibrium12, the drivers do not know precisely the travel times they will experience selecting different alternatives. Therefore, the expected travel times in formula (3) can only be approximate and we assume that every driver adjusts (in their minds) these estimates every day. There are various mechanisms by which human agents may adjust their day-to-day route choices1. In this paper we only consider the most popular mechanism called, depending on the source, exponential filter or Gawron/Horowitz/Erev-Roth learning13,21,26,31, omitting explicit modeling of habitual choice or bounded rationality13,37,54 or direct anticipation of decisions of others based on game theory1,45. Namely, we assume that every driver implicitly (subconsciously) maintains estimates of travel times on alternative routes and that these estimates are updated daily by combining previous knowledge with the most recent travel times. There exist two basic mechanisms, experience only and full information, as well as a whole spectrum of models in which only partial information is available24,38. In this paper, we focus on the experience only mechanism, in which human drivers’ knowledge is updated based on the experienced travel times only and there is no access to past or real-time travel times on alternative routes.
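The experience only update can be sketched as an exponential filter applied to the estimate of the chosen route alone. The convex-combination form below is the usual textbook version of such a filter and is an assumption about the exact update used in Methods; the function name and the example values are hypothetical, while 0.2 is the default learning rate from Table 3.

```python
def update_estimates(estimates, chosen_route, experienced_time, learning_rate=0.2):
    """Return new per-route travel-time estimates after one day.

    Experience-only learning: only the estimate of the route actually
    driven is blended with the experienced travel time.
    """
    if not 0.0 <= learning_rate <= 1.0:
        raise ValueError("learning_rate must lie in [0, 1]")
    updated = dict(estimates)
    previous = updated[chosen_route]
    updated[chosen_route] = (
        (1.0 - learning_rate) * previous + learning_rate * experienced_time
    )
    return updated


if __name__ == "__main__":
    beliefs = {"A": 10.0, "B": 15.0}  # hypothetical initial estimates
    beliefs = update_estimates(beliefs, "A", experienced_time=20.0)
    print(beliefs)  # the estimate for B is left untouched
```

Under the full information mechanism, by contrast, the estimates of both routes would be updated every day.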

In our simulations, the human-only system stabilizes as a result of human learning and adjustment. Once this has happened, a given share of HDVs is replaced by a fleet of CAVs which is centrally controlled and pursues a pre-defined collective goal. The main line of experiments considers a fixed CAV share while one experiment adopts a simplified model of drivers switching between HDVs and CAVs. Although the switching behaviour is poorly understood, and more research is necessary, see however7, this experiment highlights possible dynamic outcomes of coexistence of fleets and HDVs.

The considered behaviours of the fleet can be broadly classified as optimizing, indifferent or degrading along two dimensions: attitude towards the fleet itself and attitude towards human drivers, see Table 1. Out of nine possibilities four seem to be irrelevant as they either imply self-harming behaviour of the fleet, which does not seem to be justifiable in any realistic scenario, or indifferent in both dimensions which is equivalent to completely random and is not a behaviour of a coordinated fleet. The remaining five could be conceived as follows. Social – optimizes travel times of all the drivers thus striving to minimize costs system-wide, which is desirable from the city point of view. Altruistic – might occur when the travel times of fleet vehicles are irrelevant, e.g. when the vehicles transport non-perishable goods, however reducing the travel time of vehicles driven by people plays a pivotal role. Selfish – the default case when the fleet aims to reduce its own costs. Malicious – could occur when the long-term goal of the fleet is to drive other vehicles out of a given part of the road network in order to later control it. Disruptive – a more realistic variant of malicious when the fleet’s budget is limited and the fleet balances between harm done to HDVs and own cost.

Table 1.

Classification of fleet behaviours.

| Attitude towards HDVs | Attitude towards CAVs: Optimizing | Attitude towards CAVs: Indifferent | Attitude towards CAVs: Degrading |
| --- | --- | --- | --- |
| Optimizing | Social | Altruistic | – |
| Indifferent | Selfish | – | – |
| Degrading | Disruptive | Malicious | – |

Once the goal is set, we assume that, every day, the fleet operator decides how many CAVs will be routed via each alternative. Once this decision has been made the CAVs set off onto the prescribed routes and, during the process of driving, behave similarly to HDVs. In particular, we assume that CAVs do not utilize more efficient driving techniques such as platooning35,51. The only aspect differentiating CAVs from HDVs that we consider in this paper is collective route choice based on superior access to information about the system and prediction of human drivers’ behaviour. Once the modified system has stabilized (in most cases) again, we compare the statistics of the system before and after the introduction of CAVs and reach our conclusions.

Let us note that similar frameworks under the name of guidance systems, Advanced Travel Information Systems (ATIS) or Stackelberg congestion games29,45,50,56,57 have been considered in the literature. However, in contrast to them our goal is to demonstrate a range of outcomes with emphasis on the ordinary human driver as well as system-wide welfare when confronted with a centrally-guided fleet of CAVs rather than to show how the traffic system could be made more efficient or brought closer to system optimum, compare33,58. Furthermore, we explicitly consider the decision process and gradual adaptation of human drivers as opposed to a typical Stackelberg game setting57 of a Cournot-Nash company with market power and individual rational price-takers. We also treat human drivers as separate entities with different tastes who take time to adapt without aggregating them into a single User Equilibrium player which can instantaneously arrive at an optimal equilibrium assignment50. Moreover, in contrast to the repeated game setting typical in reinforcement learning48, we assume that human agents only take myopic decisions to minimize the current perceived travel time without optimizing their long-term pay-offs. Finally, our point of view is distinct regarding the CAVs. Namely, the fleet of CAVs, even if it represents a robo-taxi company carrying people who switched from their own cars or caters for people who have subscribed to a collective route guidance system (like online routing services), is a separate entity with its own target which might diverge from the goal of city authorities or even be a reflection of hidden hostile motives. In this vein, we do not assume that the system necessarily stabilizes after introduction of CAVs. 
Indeed, for some fleet strategies, as we demonstrate, a stable state is an undesirable feature, and keeping the system away from it allows the fleet of CAVs to maximize its specific collective target, compare2 for more general multi-agent targets. Finally, the fleet has full information regarding the system travel times and can predict how many HDVs will choose every alternative before making their own routing decision, see also2 for reinforcement learning-based city-scale scenarios.

Results

In the main experiment we study the long-term consequences of different proportions of HDVs becoming a centrally-coordinated fleet of CAVs in our two-route scenario. We compare the choices and travel times of HDVs and CAVs and summarize the results in Figs. 3, 4, 5, 6. In the second experiment, Fig. 7, we examine the dependence of the results on perception bias of human agents. In the third experiment, Fig. 8, we study how the results depend on congestion. In the last experiment, Figs. 9, 10 we study the dynamics of decisions and their long-term consequences by allowing drivers to switch between CAVs and HDVs.

Fig. 3.

Kernel density estimations of mean travel times of HDVs and CAVs for different CAV shares and strategies, based on the final 100 days of the simulation. Every subpanel shows two probability distributions: for HDVs (teal) and CAVs (red), each of which integrates to 1, even after convolving with a kernel. The scaling of the vertical and horizontal axes varies across subpanels. In the selfish strategy the CAVs experience shorter while HDVs longer travel times for small CAV shares compared to the situation before CAV introduction. For larger CAV shares both groups’ travel times improve, with HDVs gaining more. In the altruistic and social strategies the travel times of CAVs increase and those of HDVs decrease, compared to the travel times before the introduction of CAVs except for the social case with very large shares of CAVs when both groups’ travel times decrease. The malicious and disruptive strategies are similar to selfish for small CAV shares however they may cause oscillations and lead to increased travel times of all the vehicles for large CAV shares.

Fig. 4.

Comparison of average fractions of CAVs and HDVs on route A for different CAV shares. In the selfish scenario, all CAVs are routed via A for small CAV shares and this fraction decreases for increasing CAV share. The fraction of HDVs on A remains relatively stable. In the social scenario the tendency is exactly opposite, i.e., all CAVs are routed via B (corresponding to fraction 0 on A) for small CAV shares. Altruistic CAVs are all routed via A while HDVs strongly prefer B. Malicious CAVs behave similarly to selfish CAVs for small shares, however, for large shares their strategy entails routing more vehicles, on average, via A. Finally, the disruptive strategy is, in terms of fractions, on average similar to the malicious strategy.

Fig. 5.

Outcomes of replacing a fraction of HDVs by CAVs resulting in different CAV shares for baseline HDV perception bias. The outcomes are quantified as the following ratios. Panel CAV advantage: the ratio of mean HDV travel time averaged over days 301-400 to mean CAV travel time averaged over days 301-400. If this ratio exceeds 1, it is better to be a CAV than an HDV after M-day. Panel Effect of changing to CAV: the ratio of mean HDV travel time averaged over days 101-200 to mean CAV travel time averaged over days 301-400. If this ratio exceeds 1, a vehicle which switched from HDV to CAV experiences on average shorter travel times. Panel Effect of remaining HDV: the ratio of mean HDV travel time averaged over days 101-200 to mean HDV travel time averaged over days 301-400. If this ratio exceeds 1, a vehicle which remained an HDV after M-day experiences on average shorter travel times. Panel Perceived effect of remaining HDV: the ratio of mean perceived HDV travel time averaged over days 101-200 to mean perceived HDV travel time averaged over days 301-400. If this ratio exceeds 1, the vehicles which remained HDVs after M-day experience on average better perceived travel times.

Fig. 6.

Optimality gap (distance from the system optimum, in which the mean travel time of all the drivers is lowest) and equity gap (standard deviation of travel times in the system) for different CAV shares and strategies. In the social strategy the optimality gap is 0 for sufficiently large CAV shares, when the system reaches the optimum; this, however, entails a considerable equity gap. The selfish strategy is similar to the social one. The altruistic strategy results in large optimality and equity gaps for high CAV shares. Malicious and disruptive strategies exhibit varying optimality and equity gaps.

Fig. 7.

Positive and negative consequences of introduction of CAVs for the selfish CAV strategy and different fleet shares and spread of human preferences (perception bias), compare Fig. 5. CAV advantage (left) is particularly high for high human bias and low fleet shares. The situation is opposite for high fleet share and low spread. Effect of changing to CAV (middle-left) is virtually always beneficial. Effect of remaining HDV (middle-right) is beneficial primarily for low spread and high fleet shares. Otherwise it could be slightly negative (middle-right and right).

Fig. 8.

Outcomes of replacing a fraction of HDVs by CAVs for the selfish strategy and different congestion levels C (where Inline graphic is the total number of drivers in the system) based on CAV advantage, Effect of changing to CAV, Effect of remaining HDV and Perceived effect of remaining HDV, see Fig. 5. For small congestion levels the effect of remaining HDV is negative and this negativity deepens with increasing CAV share. Contrariwise, the effect of changing to CAV is positive. Very high congestion levels result in the agents being indifferent to whether they change into a CAV or remain HDV. For intermediate congestion, CAV advantage is negative and Effect of remaining HDV positive, the more so the larger the CAV share.

Fig. 9.

Final CAV share (CAV share averaged over days 301-400) as a function of initial CAV share (on day 201) for various fleet discount factors when switching HDV ↔ CAV is allowed. Malicious and disruptive strategies seem to be the most successful for fleet operators across a range of parameters as they result in highest average final CAV shares.

Fig. 10.

Mean travel times of HDVs averaged over days 101-200 (HDV before) or 301-400 (HDV after) and of CAVs averaged over days 301-400 (CAV after) for selfish fleet and all agents allowed to switch between HDV and CAV.

Experimental setting

In the experiments, run in custom simulation software, we let the system composed of only HDVs stabilize and, after 200 days (on M-day), we replace a given share of HDVs by CAVs. We study the system purely experimentally in the stable regime of parameters summarized in Table 3, see Appendix for experiments motivating this choice. For human drivers we assume the Inline graphic-Gumbel model, see Methods. For CAVs, we consider five possible strategies (see Tables 1, 2 and Methods). After M-day, we let the system restabilize for another 100 days, see Fig. 2, and then record HDV and CAV travel times and flows (vehicle counts) on both routes over the following 100 days and compare them to the respective values before M-day. We distinguish five phases:

  • Days 1-100: Stabilization of HDV-only system composed of, by default, 1000 drivers.

  • Days 101-200: Stable state in which we capture various statistics for HDVs.

  • Day 200 (M-day): a given share of HDVs is replaced by a centrally-coordinated fleet of CAVs.

  • Days 201-300: Stabilization of the system in the new reality.

  • Days 301-400: Stabilized (for most cases) state in which we compute the same statistics, this time for both HDVs and CAVs. We compare them to each other as well as to the statistics from days 101-200.
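The schedule above can be summarized as a small lookup. This is only an illustrative sketch: the function names and phase labels are hypothetical, and the assumption that CAVs operate from day 201 onwards is inferred from the fact that the initial CAV share is reported for day 201.

```python
M_DAY = 200  # day on which a given share of HDVs is replaced by CAVs


def phase(day):
    """Label of the experimental phase that simulation day `day` belongs to."""
    if not 1 <= day <= 400:
        raise ValueError("day must be in 1..400")
    if day <= 100:
        return "HDV-only stabilization"
    if day <= M_DAY:
        return "HDV-only baseline statistics"
    if day <= 300:
        return "restabilization with CAVs"
    return "final statistics (HDVs and CAVs)"


def cavs_present(day):
    """Assumed convention: CAVs drive from the day after M-day onwards."""
    return day > M_DAY


if __name__ == "__main__":
    for d in (1, 100, 101, 200, 201, 300, 301, 400):
        print(d, phase(d), cavs_present(d))
```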

Are route choices of HDVs and CAVs similar?

Table 3.

Parameters used in the experiments on coexistence of HDVs and CAVs.

| Parameter | Default Value | Alternative Values |
| --- | --- | --- |
| HDV Model Type | Inline graphic-Gumbel | - |
| Human perception spread (Inline graphic) | 5.0 | 0.01 – 1000.0 |
| Initial HDV Knowledge | Free Flow | - |
| Initial HDV Choice | Random | - |
| HDV learning rate (Inline graphic) | 0.2 | - |
| HDV exploration rate (Inline graphic) | 0.1 | - |
| HDV Learning From Experience Only | True | - |
| CAV optimization strategy | Selfish | Malicious, Disruptive, Altruistic, Social |
| CAV share | 0.0 | Inline graphic |
| Fleet discount factor | 1.0 | 0.9, 0.8, 0.5, 0.3 |
| Congestion | 1.0 | 0.25 – 2.6 |

Table 2.

CAV fleet optimization targets used in our experiments.

| Weight on CAV travel time | Weight on HDV travel time | Interpretation | Optimization target |
| --- | --- | --- | --- |
| 1 | 0 | Selfish | minimize only CAV travel time |
| 0 | 1 | Altruistic | minimize only HDV travel time |
| 0 | negative | Malicious | maximize HDV travel time |
| 1 | negative | Disruptive | maximize travel time for HDV and minimize for CAV |
| 1 | 1 | Social | minimize total travel time (system optimum) |
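The optimization targets of Table 2 can be read as a weighted sum of the total CAV and total HDV travel times, which the fleet minimizes over the number of CAVs routed via A. The sketch below illustrates this reading under several assumptions: BPR-type travel times with hypothetical parameters, a known prediction of HDV flows, exhaustive search over the split, and a weight of -1 standing in for the negative HDV weight of the malicious strategy. None of the names come from the simulation software used in this study.

```python
def bpr(flow, t0, cap, a=0.15, b=4.0):
    """BPR-type travel time; a and b are assumed textbook defaults."""
    return t0 * (1.0 + a * (flow / cap) ** b)


def best_fleet_split(n_cav, hdv_on_a, hdv_on_b, w_cav, w_hdv, route_a, route_b):
    """Number of CAVs to route via A minimizing the weighted objective.

    Objective: w_cav * (total CAV travel time) + w_hdv * (total HDV travel time),
    given the predicted HDV flows hdv_on_a and hdv_on_b.
    """
    best_k, best_cost = 0, float("inf")
    for k in range(n_cav + 1):
        t_a = bpr(hdv_on_a + k, *route_a)
        t_b = bpr(hdv_on_b + n_cav - k, *route_b)
        total_cav = k * t_a + (n_cav - k) * t_b
        total_hdv = hdv_on_a * t_a + hdv_on_b * t_b
        cost = w_cav * total_cav + w_hdv * total_hdv
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


if __name__ == "__main__":
    scenario = dict(n_cav=200, hdv_on_a=600, hdv_on_b=200,
                    route_a=(5.0, 500.0), route_b=(15.0, 800.0))
    weights = {"selfish": (1, 0), "altruistic": (0, 1),
               "malicious": (0, -1), "social": (1, 1)}
    for name, (w_cav, w_hdv) in weights.items():
        k = best_fleet_split(w_cav=w_cav, w_hdv=w_hdv, **scenario)
        print(f"{name:10s}: {k} of 200 CAVs via A")
```

With weights (1, 1) the objective is the total travel time of all vehicles, so the social strategy steers the system towards the system optimum as far as the fleet size allows.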

Figure 4 shows that they are quite different. In the selfish scenario, for instance, the fraction of CAVs choosing A is 1 for low CAV shares and it decreases to around 0.6 for share 1.0, which corresponds to system optimum. The fraction of HDVs choosing A, on the other hand, seems to increase for increasing CAV share. In the social case, we observe an exactly opposite tendency, with all CAVs selecting B for low CAV shares. In the altruistic case, virtually all CAVs select the route with longer travel time. In the malicious and disruptive cases, the differences between HDV and CAV choice patterns are also considerable.

Overall, we conclude that the routing choices made by fleets of CAVs differ substantially from those made by HDVs across various CAV strategies. This changes the system and affects HDVs’ travel times.

Are HDVs better off after the introduction of CAVs? Are CAVs better off than HDVs?

We consider the following statistics (see Methods):

  • mean travel time of HDVs averaged over days 101-200, i.e. before introduction of CAVs ($t^{\mathrm{pre}}_{\mathrm{HDV}}$),

  • mean travel time of HDVs averaged over days 301-400 ($t^{\mathrm{post}}_{\mathrm{HDV}}$),

  • mean perceived travel time of HDVs averaged over days 101-200 ($\tilde{t}^{\mathrm{pre}}_{\mathrm{HDV}}$),

  • mean perceived travel time of HDVs averaged over days 301-400 ($\tilde{t}^{\mathrm{post}}_{\mathrm{HDV}}$),

  • mean travel time of CAVs averaged over days 301-400 ($t^{\mathrm{post}}_{\mathrm{CAV}}$).

Defining CAV advantage as $t^{\mathrm{post}}_{\mathrm{HDV}} / t^{\mathrm{post}}_{\mathrm{CAV}}$, Effect of changing to CAV as $t^{\mathrm{pre}}_{\mathrm{HDV}} / t^{\mathrm{post}}_{\mathrm{CAV}}$, Effect of remaining HDV as $t^{\mathrm{pre}}_{\mathrm{HDV}} / t^{\mathrm{post}}_{\mathrm{HDV}}$ and Perceived effect of remaining HDV as $\tilde{t}^{\mathrm{pre}}_{\mathrm{HDV}} / \tilde{t}^{\mathrm{post}}_{\mathrm{HDV}}$, we discover that (Fig. 5):

  • For modest CAV shares, CAVs are better off compared to HDVs before M-day (Effect of changing to CAV > 1) in the selfish, malicious and disruptive scenarios, while HDVs are worse off (Effect of remaining HDV < 1). Consequently, there seem to exist scenarios in which CAVs improve their total travel time at a cost to HDVs, see also Fig. 8. The effect is opposite in the social and altruistic strategies, where CAVs bear the cost of improving the driving conditions for HDVs or for the entire system, compare Fig. 6.

  • Larger shares of CAVs render selfish and especially malicious and disruptive strategies costly to CAVs (CAV advantage as well as Effect of changing to CAV drop below 1). The high cost of altruistic strategy skyrockets while the social strategy becomes more and more cost-effective as CAV share increases. Larger CAV shares result also in oscillations in the system for malicious and disruptive strategies, confirmed by the bimodal distribution in Fig. 3, see also Suppl. Fig. 9 for more details.

  • The influence of M-day on mean perceived travel times is in general very similar to the influence on actual travel times (the panels Effect of remaining HDV and Perceived Effect of remaining HDV are similar), compare however Appendix Section 1.5 and Suppl. Fig. 12 for cases where there is a difference between the two.

To summarize, CAVs and HDVs might be both better off and worse off compared to HDVs before introduction of CAVs and the outcome depends on the share and strategy of CAVs. In particular, for certain feasible combinations of parameters, the most disturbing scenario when CAVs gain and HDVs are disadvantaged may occur.
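The four outcome ratios used above can be computed directly from the daily mean travel times recorded in the two 100-day windows. The following is a minimal sketch; the function and key names are hypothetical and the numbers in the usage example are made up solely to show how the ratios are read.

```python
from statistics import mean


def outcome_ratios(hdv_pre, hdv_post, cav_post, hdv_perceived_pre, hdv_perceived_post):
    """Ratios comparing travel times before (days 101-200) and after (days 301-400).

    Each argument is a sequence of daily mean travel times for one window.
    A ratio above 1 means the group in the denominator is faster.
    """
    return {
        "cav_advantage": mean(hdv_post) / mean(cav_post),
        "effect_of_changing_to_cav": mean(hdv_pre) / mean(cav_post),
        "effect_of_remaining_hdv": mean(hdv_pre) / mean(hdv_post),
        "perceived_effect_of_remaining_hdv":
            mean(hdv_perceived_pre) / mean(hdv_perceived_post),
    }


if __name__ == "__main__":
    # Made-up example: CAVs gain while remaining HDVs lose.
    ratios = outcome_ratios(
        hdv_pre=[10.0, 10.0], hdv_post=[12.0, 12.0], cav_post=[8.0, 8.0],
        hdv_perceived_pre=[11.0, 11.0], hdv_perceived_post=[11.0, 11.0],
    )
    for name, value in ratios.items():
        print(f"{name}: {value:.3f}")
```

In this made-up example the Effect of changing to CAV exceeds 1 while the Effect of remaining HDV falls below 1, which is the pattern of CAVs gaining at a cost to HDVs.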

Is the system closer to optimum?

Fig. 6 demonstrates that the social strategy reduces the optimality gap at the price of an increased equity gap. The selfish strategy makes the system less efficient for low CAV shares and more efficient for large CAV shares. The altruistic strategy is very inefficient and inequitable and the same, to a lesser degree, applies to malicious and disruptive strategies.
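The trade-off between efficiency and equity can be reproduced in a toy two-route system. The sketch below assumes BPR-type travel times with hypothetical parameters, measures the optimality gap as the absolute difference between the mean travel time and its minimum over all splits, and measures the equity gap as the population standard deviation of individual travel times; the normalization used in the actual figures may differ.

```python
import math


def bpr(flow, t0, cap, a=0.15, b=4.0):
    """BPR-type travel time; a and b are assumed textbook defaults."""
    return t0 * (1.0 + a * (flow / cap) ** b)


def mean_time(q_a, n, route_a, route_b):
    """Mean travel time of all n vehicles when q_a of them use route A."""
    q_b = n - q_a
    return (q_a * bpr(q_a, *route_a) + q_b * bpr(q_b, *route_b)) / n


def optimality_gap(q_a, n, route_a, route_b):
    """Distance of the mean travel time from the system optimum."""
    best = min(mean_time(k, n, route_a, route_b) for k in range(n + 1))
    return mean_time(q_a, n, route_a, route_b) - best


def equity_gap(q_a, n, route_a, route_b):
    """Standard deviation of travel times across all n vehicles."""
    p = q_a / n  # share of vehicles on A
    spread = abs(bpr(q_a, *route_a) - bpr(n - q_a, *route_b))
    return math.sqrt(p * (1.0 - p)) * spread


if __name__ == "__main__":
    a, b, n = (5.0, 500.0), (15.0, 800.0), 1000
    k_so = min(range(n + 1), key=lambda k: mean_time(k, n, a, b))
    print(f"system optimum: {k_so} vehicles on A, "
          f"equity gap {equity_gap(k_so, n, a, b):.2f}")
```

At the system optimum the travel times on the two routes generally differ, so a zero optimality gap coexists with a positive equity gap, whereas a split equalizing the two travel times has zero equity gap but is usually not optimal.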

Does perception bias of HDVs influence the reaction of the system to introduction of CAVs?

In this experiment we vary the spread of human preferences. A small spread corresponds to unbiased human behaviour (choosing the predicted faster route) while a large spread makes HDV choices more random because of the large spread of subjective preferences. Considering only the selfish CAV strategy (see Appendix Suppl. Fig. 12 for the remaining strategies) we conclude that (Fig. 7):

  • More biased (large spread) HDV choices allow CAVs to decrease their travel time after M-day, especially for small shares of CAVs. The impact on HDVs’ travel times tends to be slightly negative.

  • Less biased (small spread) HDV choices in combination with the selfish CAV strategy result in improvement of driving conditions for both types of agents, with HDVs gaining the most. This is particularly visible for larger shares of CAVs.

Does congestion in the system influence the consequences of introduction of CAVs?

In this experiment, we vary congestion levels, keeping human drivers’ perception bias at the baseline level and letting CAVs apply the selfish strategy. We conclude that, Fig. 8:

  • Modest congestion lets CAVs gain considerably in the selfish strategy at a cost to human drivers. The negative impact on HDV travel times increases as CAV share increases. Heavy traffic, on the other hand, makes the system more rigid, precluding any substantial gain in CAV travel time.

  • In the intermediate congestion regime, HDVs are better off in terms of travel times than CAVs, with the effect more pronounced for higher CAV shares.

What are the results of introduction of CAVs when, after the initial shock, all agents are allowed to freely switch between HDV and CAV?

In this experiment, after the initial shock of a given share of HDVs becoming CAVs, all the drivers are free to switch between HDVs and CAVs based on a simplified model described in detail in Methods. The probability of switching on a given day depends on the perceived travel time a given driver expects to achieve by using an HDV or CAV. To account for a certain inconvenience/cost related to switching there is a threshold of potential gains below which the drivers do not consider switching. Furthermore, to account for additional benefits (e.g. no need to drive) of using a CAV, not explicitly modeled here, we consider a range of fleet discount factors by which the perceived travel time is multiplied when an agent is (or considers becoming) a CAV. The value 1.0 corresponds to no extra benefits, while values 0.9, 0.8, 0.5, 0.3 range from modest to considerable extra benefits. Fig. 9 shows the dependence of the average final CAV share (mean over days 301-400) on the initial CAV share (on day 201) for different discount factors. We conclude that:

  • Final CAV share is only weakly dependent on initial CAV share, however it inversely correlates with fleet discount factor.

  • Malicious and disruptive strategies seem to be the most successful from the evolutionary point of view. Consequently, a fleet operator could be inclined to use those strategies instead of the standard selfish one, to maximize its market share.

  • Social and selfish strategies are similar, with selfish faring slightly better, while the altruistic strategy is, unsurprisingly, the least successful for fleet operators.

Fig. 10 shows the comparison of mean travel times of HDVs and CAVs in the selfish scenario (note that the perceived travel times are not considered because they have no straightforward interpretation when agents may switch multiple times). One can notice that for different fleet discount factors and initial CAV shares:

  • HDVs may be worse off and CAVs better off in terms of travel times (fleet discount 1.0 and large initial CAV share),

  • HDVs may be better off and CAVs worse off (fleet discount 0.8), see also altruistic case in Suppl. Fig. 13.

  • Both HDVs and CAVs may be better off (discount 0.3),

  • Both HDVs and CAVs may be worse off (e.g. malicious or disruptive strategy, discount 0.8 or 0.9, see Appendix, Suppl. Fig. 13).

Notably, the cases with HDVs better off and CAVs worse off in terms of travel times for selfish strategy and discount 0.3 or 0.5 result in large final CAV shares (Fig. 9). The large fixed CAV shares in Figs. 3, 5 exhibit similar outcomes. Similarly, discount factor 1.0 for selfish strategy results typically in final CAV share around 0.2 (Fig. 9), for which HDVs are worse off and CAVs better off (Fig. 10). The same happens if fleet share is fixed at 0.2, Fig. 3. This suggests that if fleet size stabilizes, studying the fixed fleet size cases may provide hints as to the long-term behaviour of variable fleet size cases, although the connection is not straightforward and requires further research.

Discussion

In this research we provide evidence for the existence of certain phenomena emerging from HDV-CAV interaction in the context of route choice which are of paramount significance for the performance of future urban systems. Our abstract models deliberately reduce the complexity of the problem in order to highlight these typical phenomena, which are likely to be even more pronounced in real urban mobility systems. To achieve this, we abstract reality at three main levels: network topology, traffic flow, and the human route-choice decision process.

  • Network. In the complex networks of real megacities the number of available routes is huge and, consequently, the everyday action space for CAVs is much larger. It might include route alternatives not considered by humans in their choice process25. We argue that if CAVs, like here, manage to identify effective strategies in two-route abstract networks, they are likely to identify them within more complex topologies. On the other hand, even very dense and complex urban road network topologies tend to have bottlenecks, where demand exceeds capacity. Consequently, the competitive strategy of route choice is to exploit the capacity at isolated bottlenecks27,32,34, which might resemble the two-route scenario considered in the paper.

  • Traffic flow. The traffic flow, in reality non-deterministic, highly variable, controlled by static and adaptive traffic lights and with diverse microscopic phenomena such as platooning, accidents, slow vehicles and driver errors4,46, is only coarsely approximated with static BPR functions. We argue, however, that if machines manage to identify better routes in static analytical models with strictly increasing travel times (BPR), they will find even more opportunities in the fluctuating, outlier-sensitive and non-continuous granular spatiotemporal patterns of real traffic flow9,42.

  • Human decision process. The long postulated Nash Equilibrium in traffic networks seems to be over-optimistic and hardly observed empirically. Here, we applied machine actions in a system much more equilibrated than observed in the real world59, compare Appendix. We argue that, instead, real systems are ensembles of various agents with different motives, utilities, capabilities and taste heterogeneities as in the seminal El Farol bar paper3. Consequently, the agents are much more diverse than the rigid utility maximisers considered here, which is likely to facilitate the task of CAVs.

In all these aspects our models seem to be more restrictive for CAVs than the real world and yet we were able to clearly reveal the disturbing phenomena. Hence, the results might be even sharper when, as it is in real cities, the network topology and traffic flow are complex, humans are even less optimal or homogeneous and advanced machine learning is used to optimize CAV strategies. On the other hand, in the real world, human drivers might have better access to information facilitated by new technologies. Moreover, we modeled travel times by simple analytical BPR functions which are easily optimized by machines. In reality, CAVs will not have such precise information about the system and their optimization is likely to be based on reinforcement learning2 and high performance computing.

The advantages of CAVs visible in our experiments can be summarized as follows.

  • Advantage by collective decision taking, e.g. strategies that improve the average travel time of the fleet or of the system, which are hardly possible if every agent, like humans, is independent.

  • Advantage by better access to information and information sharing, e.g. perfect understanding of the characteristics of the traffic system.

  • Advantage by advanced processing and optimization capabilities, e.g. human behaviour modeling, human choice prediction.

  • Advantage by lack of perception error, i.e. decisions based on actual as opposed to perceived travel times.

  • Advantage by instantaneous adaptation, which allows the machines to keep the system out of equilibrium and exploit slower human drivers’ adaptation, as is the case for malicious and disruptive strategies, see also Appendix.

These sources of advantage enable more efficient CAV routing decisions. In the default selfish case the CAVs outperform human drivers by selecting on average faster routes for small CAV market shares, Figs. 3, 5. For large market shares, the impact is more complex. Namely, the CAVs obtain better travel times than travel times of HDVs they replaced; however, the driving conditions for HDVs improve even more, Fig. 2. This is due to the fact that they bring the system closer to optimum which involves different travel times on routes. The tipping point is around Inline graphic, Fig. 5, for the default moderate congestion levels and spread of human preferences. System-wise, collective strategies of CAVs, even if they are selfish, may reduce the mean travel time (see optimality gap, Fig. 6), reducing e.g. Inline graphic emissions and noise16. Other strategies, notably malicious and altruistic for large CAV shares, may increase the optimality gap, reducing the liveability of cities and sustainability of urban driving.

To summarize, CAV fleets will transform urban traffic systems. One of the aspects in which this will manifest itself will be route choice. The impact on the human drivers and urban welfare will depend on the strategies CAVs are allowed to adopt and the potential switching behaviour of drivers, who will be able to join the fleet or switch back to HDV. For instance, for the outright malicious CAV fixed-size fleet strategy, the driving conditions will deteriorate for everyone. At the other end of the spectrum, the altruistic strategy might bring huge benefits to the HDVs which remain in the system.

Non-standard strategies aside, however, our results indicate that even in the most straightforward scenarios with modest shares of CAVs minimizing their collective travel time the remaining human drivers might become disadvantaged as a side-effect. Do we want this?

Methods

We run agent-based simulations, where human drivers are modelled as independent heterogeneous rational utility maximizers who learn from experience to maximize expected utility. They share the network, where congestion is modelled with a static BPR function, with CAVs, whose behaviour in the traffic is the same as HDVs’, except for routing. The focus is on representing collective route choices of CAVs while HDVs’ choice is based on adaptation of standard models well-established in the literature, see Introduction.

Traffic Networks

We abstract the traffic network to two independent non-overlapping routes, A and B, connecting a single origin-destination pair. The travel times are functions of flow, given by the static BPR-type function11,

graphic file with name M61.gif 5

where Inline graphic, for Inline graphic, is the free-flow travel time, which a traveller would experience travelling on an empty road. Inline graphic is the capacity of the road section and Inline graphic. Finally, Inline graphic is the number of vehicles choosing route r within a given interval of time such as 1 hour.

In our setting, we assume Inline graphic, the default total number of drivers, Inline graphic and the alternative routes’ capacities and free-flow travel times are given by Inline graphic, Inline graphic, Inline graphic, Inline graphic, which corresponds to a shorter route (A) with low capacity and a longer route (B) with higher capacity. Note that when the number of cars on a given alternative exceeds its capacity the travel time rises steeply as in reality27.
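
The BPR-type dependence of travel time on flow can be sketched in a few lines of Python. This is only an illustration, not the BottleCOEX implementation: the function name `bpr_travel_time`, the coefficients `a = 0.15` and `b = 4` (the classic BPR defaults) and the two example routes are assumptions made for this sketch, because the paper's own parameter values are not reproduced in this text.

```python
def bpr_travel_time(flow, free_flow_time, capacity, a=0.15, b=4.0):
    """Travel time on one route as a function of the number of vehicles on it.

    a and b are the classic BPR defaults, assumed here for illustration only.
    """
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


# Hypothetical routes: A is shorter with low capacity, B is longer with higher capacity.
ROUTE_A = {"free_flow_time": 10.0, "capacity": 400.0}
ROUTE_B = {"free_flow_time": 15.0, "capacity": 800.0}

# Travel time stays near free flow for light traffic and rises steeply past capacity.
for flow in (0, 200, 400, 600, 800):
    print(flow, round(bpr_travel_time(flow, **ROUTE_A), 2))
```

With an empty road the function returns the free-flow travel time, and the power term makes the increase steep once the flow exceeds the capacity.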

Human Learning

Let Inline graphic denote the predictions (estimates) by human agent i of travel times on routes A and B, respectively. Suppose that, on a given day, agent i travelled along route r(i) and experienced travel time Inline graphic. Then the predicted travel times are adjusted by

graphic file with name M75.gif 6

Above, Inline graphic is the learning rate which specifies the relative weight of the most recent experience. For a typical value of Inline graphic, the new estimate of travel time is made up of Inline graphic most recent experience and Inline graphic previous estimate. Crucially, the estimate of travel time along the unused alternative remains unaltered.
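
The learning rule described above behaves like exponential smoothing of experienced travel times. A minimal sketch, assuming the learning rate 0.2 reported in Parameters; the function name `update_estimates` and the dictionary layout are hypothetical choices for this illustration:

```python
def update_estimates(estimates, chosen_route, experienced_time, learning_rate=0.2):
    """Return new travel-time estimates after one day of experience.

    Only the route actually travelled is updated; the estimate for the
    unused alternative is carried over unchanged.
    """
    updated = dict(estimates)
    updated[chosen_route] = (
        (1.0 - learning_rate) * estimates[chosen_route]
        + learning_rate * experienced_time
    )
    return updated


# Hypothetical agent who expected 10 minutes on A but experienced 15.
estimates = {"A": 10.0, "B": 20.0}
estimates = update_estimates(estimates, "A", 15.0)
print(estimates)
```

Because the unused route is never updated, an outdated estimate can persist until the agent travels that route again, which is one reason the choice model includes a small exploration rate.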

Human Choice

The human agents make choices according to the Inline graphic-Gumbel model, which we introduce for our setting. In this model, following Eq. (3), we assume that

graphic file with name M81.gif 7

is the perceived utility of alternative r to agent i. Predicted travel times Inline graphic are updated daily by formula (6) while Inline graphic is a fixed real number sampled once, independently for every agent i and alternative r, from distribution Inline graphic with scale Inline graphic and location Inline graphic. Here, Inline graphic is the Euler-Mascheroni constant and the cumulative distribution function of Inline graphic is given by Inline graphic. The mean of Inline graphic is 0 and variance equals Inline graphic. The Gumbel distribution resembles the normal distribution, but has a heavier positive tail (positive extreme events are more probable) and a lighter negative tail. Its application in transportation modelling is based on convenience rather than necessity, compare Appendix Section 1.3 and Suppl. Fig. 10.

In the Inline graphic-Gumbel model every agent has fixed, pre-specified tastes related to the alternatives, which are independent of the tastes of other agents. The larger Inline graphic, which we call spread or bias, the more subjective, on average, the decisions of human agents. These decisions are based on maximizing utility (7) up to small exploration Inline graphic via formula:

graphic file with name M95.gif 8

For instance, if Inline graphic then with probability Inline graphic the route with better utility is chosen and with probability 0.1 a uniformly random decision is made. In the case of two options A and B considered in this paper this amounts to choosing the option with higher utility with probability Inline graphic and the alternative with probability Inline graphic.
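
The choice rule can be sketched as follows. The sketch assumes that utility is the negative predicted travel time plus a fixed zero-mean Gumbel taste term (the exact form of Eq. (7) is not reproduced in this text), and it uses the default spread 5.0 and exploration rate 0.1 from Parameters. The names `sample_taste` and `choose_route` are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_taste(spread, rng):
    """Draw one Gumbel taste term with scale `spread`, shifted to have mean zero."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    location = -spread * EULER_GAMMA
    # Inverse-CDF sampling for F(x) = exp(-exp(-(x - location) / spread)).
    return location - spread * math.log(-math.log(u))


def choose_route(utilities, exploration, rng):
    """Pick the best route, except for a uniformly random pick with prob. `exploration`."""
    if rng.random() < exploration:
        return rng.choice(sorted(utilities))
    return max(utilities, key=utilities.get)


rng = random.Random(0)
predicted = {"A": 12.0, "B": 14.0}                       # hypothetical estimates
tastes = {r: sample_taste(5.0, rng) for r in predicted}  # sampled once per agent
utilities = {r: -predicted[r] + tastes[r] for r in predicted}  # assumed utility form
print(choose_route(utilities, 0.1, rng))
```

With two routes and exploration rate 0.1, the random pick lands on the better route half of the time, so the route with higher utility is chosen with probability 0.9 + 0.1/2 = 0.95.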

Once on a given day every agent, including both HDVs and CAVs (if there are any) has made its choice, we determine Inline graphic and Inline graphic as the total number of agents choosing A and B, respectively, and use formula (5) to compute travel times Inline graphic and Inline graphic. We feed these values into Eq. (6) to update HDV estimates on the following day and close the HDV experience-learning-choice loop by incrementing the day number, see Fig. 2.

CAV Optimization

We assume that CAVs optimize their pre-defined target taking advantage of their superior knowledge about the system and human agents’ decisions. Namely, on a given day, before deciding how many CAVs to route via A and B, the fleet operator predicts perfectly the numbers of HDVs, Inline graphic, Inline graphic which intend to travel on A and B, respectively. Then, it selects the number Inline graphic of agents it routes via A such that Inline graphic, where Inline graphic is the total number of centrally controlled machine agents, and Inline graphic minimizes the target function

graphic file with name M115.gif

where

graphic file with name M116.gif

and Inline graphic, Inline graphic are given by (5). Coefficients Inline graphic, Inline graphic depend on the strategy adopted by CAVs, see Table 2 and Fig. 2. Finally, the CAVs are assigned randomly to routes so that the totals on A and B equal Inline graphic, respectively.
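
Because the fleet only decides how many of its vehicles use route A, the optimization can be done by exhaustive search over that single integer. The sketch below is an assumption-laden illustration: the target is taken to be a weighted sum of the total HDV and total CAV travel times, the weights `w_hdv` and `w_cav` stand in for the strategy coefficients of Table 2 (whose values are not reproduced in this text), and the BPR coefficients and route parameters are hypothetical.

```python
def bpr(flow, free_flow_time, capacity, a=0.15, b=4.0):
    """BPR-type travel time; a and b are assumed classic defaults."""
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


# Hypothetical (free-flow time, capacity) pairs for the two routes.
ROUTES = {"A": (10.0, 400.0), "B": (15.0, 800.0)}


def best_cav_split(hdv_a, hdv_b, n_cav, w_hdv, w_cav):
    """Number of CAVs to route via A that minimizes the assumed weighted target."""
    best_n, best_value = 0, float("inf")
    for cav_a in range(n_cav + 1):
        cav_b = n_cav - cav_a
        t_a = bpr(hdv_a + cav_a, *ROUTES["A"])
        t_b = bpr(hdv_b + cav_b, *ROUTES["B"])
        hdv_total = hdv_a * t_a + hdv_b * t_b  # total travel time of HDVs
        cav_total = cav_a * t_a + cav_b * t_b  # total travel time of CAVs
        value = w_hdv * hdv_total + w_cav * cav_total
        if value < best_value:
            best_n, best_value = cav_a, value
    return best_n


# A fleet that only cares about its own total travel time (assumed weights 0 and 1).
print(best_cav_split(hdv_a=400, hdv_b=400, n_cav=200, w_hdv=0.0, w_cav=1.0))
```

Other weight pairs express other collective goals under the same assumption; for example, equal positive weights would minimize the total travel time of all vehicles.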

Dynamics of switching between HDV and CAV

In the experiment with allowed switching, we assume that every agent i every day evaluates the disutility of using an HDV or CAV according to formulas:

graphic file with name M122.gif

where we used (7) and Inline graphic is the expected disutility (i.e. expected perceived travel time) of agent i belonging to the fleet, scaled by a discount factor Inline graphic which expresses the gain in perceived travel time due to factors such as no need to drive etc. As the switching Inline graphic or vice versa is not straightforward and incurs some cost, we assume that an HDV (CAV) considers switching on a given day only if Inline graphic (Inline graphic), for some threshold Inline graphic, here assumed to be equal to Inline graphic. The probability of switching is assumed to be equal to

graphic file with name M130.gif

E.g. if Inline graphic an HDV would have Inline graphic and a CAV would not consider switching. While in the fleet, each agent keeps learning and adjusting Inline graphic as per eq. (6).

Parameters

Table 3 presents the parameters used to study the coexistence of HDVs and CAVs. The default human choice model is the Inline graphic-Gumbel model described above, see Appendix for alternatives. Parameter Inline graphic, by default equal to 5.0, accounts for reasonable spread between the alternatives. Nevertheless, we vary it in the range 0.01 - 1000.0, Fig. 7, which allows us to test the systems for very unbiased HDVs, whose utility is very close to predicted travel times, as well as systems where human tastes and, consequently, decisions look random to an external observer. Initial HDV Knowledge and Initial HDV Choice account for the initial conditions in the simulation on day 1. We assume that for every human agent i, Inline graphic for Inline graphic on day 1 (Free Flow Initial Knowledge) and the first choice r(i) (Initial HDV Choice) is Random. We note that the first choice does not significantly influence the outcomes of the simulations, see Appendix. HDV learning and exploration rates equal 0.2 and 0.1, respectively, and we assume that HDVs learn from experience only. Fleet discount factors range from 1.0 (no additional benefit of riding a CAV) to 0.3 (considerable benefits, corresponding to the perceived cost of riding a CAV equal to only Inline graphic of perceived cost of driving an HDV on the same route). Finally, the total number of vehicles is equal to Inline graphic, where the default congestion, C, is set to 1.0 resulting in 1000 agents. This level amounts to Inline graphic of the total capacity of the system, equal to Inline graphic and is moderate. However, we consider the traffic congestion from very light (Inline graphic) with 250 vehicles up to gridlocked (Inline graphic) with 2600 vehicles, Fig. 8.

Statistics used in experiments

Here we assume that before M-day there are 1000 HDVs in the system (for congestion C different from baseline the formulas are adjusted accordingly). As the characteristics of the HDVs are assigned randomly, we assume that the HDVs that are replaced by CAVs are the HDVs with the last Inline graphic indices. E.g. if Inline graphic then after M-day vehicles Inline graphic remain HDVs and vehicles Inline graphic become CAVs. Therefore, after M-day Inline graphic. Let us also denote Inline graphic.

The statistics we report in our experiments are the following. After a given day of simulation we compute:

  • Mean travel time of HDVs:
    graphic file with name M150.gif
    Importantly, the number of HDVs, Inline graphic is not constant and is equal to 1000 until M-day and Inline graphic after M-day.
  • Mean perceived travel time of HDVs:
    graphic file with name M153.gif
    where r(i) is the route chosen by agent i on a given day. Note that in contrast to mean travel time of HDVs, the mean perceived travel time before M-day is computed only for the vehicles that remain HDVs after M-day in the fixed CAV share scenarios.
  • Mean travel time of CAVs:
    graphic file with name M154.gif
  • Fractions of HDVs and CAVs on A: Inline graphic and Inline graphic, respectively.

These one-day statistics are then averaged over days 101-200 or 301-400 to give Inline graphic and average fractions, see Results. The system-wide statistics are:

  • Optimality gap: Inline graphic averaged over days Inline graphic, where
    graphic file with name M160.gif
    is the mean travel time of all vehicles on a given day and Inline graphic is the least possible mean travel time (experienced in System Optimum).
  • Equity gap: Inline graphic averaged over days 301 - 400, where
    graphic file with name M163.gif
    is the standard deviation of travel times of all the drivers on a given fixed day.
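
The system-wide statistics for a single day can be sketched as below. Several parts are assumptions, since the corresponding formulas are not reproduced in this text: the optimality gap is taken as the difference between the day's mean travel time and the System Optimum mean, the equity statistic uses the population standard deviation, and the BPR coefficients and route parameters are hypothetical.

```python
import math


def bpr(flow, free_flow_time, capacity, a=0.15, b=4.0):
    """BPR-type travel time; a and b are assumed classic defaults."""
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


# Hypothetical (free-flow time, capacity) pairs for the two routes.
ROUTES = {"A": (10.0, 400.0), "B": (15.0, 800.0)}


def mean_and_spread(n_a, n_b):
    """Mean and population standard deviation of travel times over all drivers."""
    t_a = bpr(n_a, *ROUTES["A"])
    t_b = bpr(n_b, *ROUTES["B"])
    n = n_a + n_b
    mean = (n_a * t_a + n_b * t_b) / n
    variance = (n_a * (t_a - mean) ** 2 + n_b * (t_b - mean) ** 2) / n
    return mean, math.sqrt(variance)


def system_optimum_mean(n_total):
    """Least possible mean travel time, found by trying every split between A and B."""
    return min(mean_and_spread(k, n_total - k)[0] for k in range(n_total + 1))


mean_time, equity_spread = mean_and_spread(550, 450)  # hypothetical day
optimality_gap = mean_time - system_optimum_mean(1000)  # assumed form of the gap
print(round(optimality_gap, 3), round(equity_spread, 3))
```

Since every vehicle on a route experiences the same travel time in this model, the spread is zero exactly when both used routes are equally fast or when all vehicles share one route.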

Reproducibility

To verify reproducibility of the core findings, we reran the main experiment 10 times for selected CAV shares (those used in Fig. 3). We obtained statistical significance of the results presented in Fig. 2 (with Inline graphic) using the paired two-tailed t-test with nine degrees of freedom, see Appendix.
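
The paired t-test compares two statistics run by run, so ten paired runs give nine degrees of freedom. A minimal standard-library sketch of the test statistic follows; the function name `paired_t_statistic` and the ten pairs of values are made up for illustration and are not results from the paper.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired two-sample comparison."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Hypothetical mean travel times from 10 independent runs (NOT data from the paper).
hdv_times = [16.1, 16.4, 15.9, 16.3, 16.2, 16.0, 16.5, 16.1, 16.2, 16.3]
cav_times = [15.2, 15.6, 15.1, 15.3, 15.5, 15.0, 15.7, 15.2, 15.4, 15.3]

t_value, dof = paired_t_statistic(hdv_times, cav_times)
print(dof, abs(t_value) > 2.262)
```

The constant 2.262 is the two-tailed 5% critical value of the t distribution with nine degrees of freedom; the 5% level is an assumption of this sketch, as the paper's significance threshold is not reproduced in this text.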

Acknowledgements

This work was financed by the European Union within the Horizon Europe Framework Programme (ERC Starting Grant COeXISTENCE no. 101075838). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Author contributions

GJ - conceptualization, methodology, software, validation, data analysis, writing - original draft, writing - review and editing; AOA - visualization, software, data analysis, writing - review and editing; AP - visualization, software, data analysis, writing - review and editing; ZGV - visualization, data analysis, writing - review and editing; RK - conceptualization, methodology, visualization, data analysis, writing - review and editing, project administration, funding acquisition.

Code and data availability

The experiments were performed using custom light-weight simulation software, BottleCOEX, available online as a github repository at https://github.com/COeXISTENCE-PROJECT/BottleCOEX along with all the data used for the experiments described in this paper.

Declarations

Competing interests

The authors declare no competing interest.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-90783-w.

References

  • 1. Ahmad, F. & Al-Fagih, L. Travel behaviour and game theory: A review of route choice modeling behaviour. Journal of Choice Modelling 50, 100472. 10.1016/j.jocm.2024.100472 (2024).
  • 2. Akman, O., Psarou, A., Varga, Z., Jamróz, G. & Kucharski, R. Impact of Collective Behaviors of Autonomous Vehicles on Urban Traffic Dynamics: A Multi-Agent Reinforcement Learning Approach. EWRL 17 (2024).
  • 3. Arthur, W. B. Inductive Reasoning and Bounded Rationality. The American Economic Review 84(2), 406–411 (1994).
  • 4. Avila, A. M. & Mezić, I. Data-driven analysis and forecasting of highway traffic dynamics. Nature communications 11(1), 2090 (2020).
  • 5. Bacharach, M. The epistemic structure of a theory of a game. Theor Decis 37, 7–48. 10.1007/BF01079204 (1994).
  • 6. Ben-Akiva, M. & Bierlaire, M. Discrete Choice Methods and their Applications to Short Term Travel Decisions. In: Hall, R.W. (eds) Handbook of Transportation Science. International Series in Operations Research and Management Science, vol 23. Springer, Boston, MA. 10.1007/978-1-4615-5203-1 2 (1999).
  • 7. Bitar, I., Watling, D. & Romano, R. How can autonomous road vehicles coexist with human-driven vehicles? An evolutionary-game-theoretic perspective. In Proceedings of the 8th international conference on vehicle technology and intelligent transport systems. SciTePress. (2022, April).
  • 8. Bogers, E. A. I., Bierlaire, M. & Hoogendoorn, S. P. Modeling Learning in Route Choice. Transportation Research Record 2014(1), 1–8. 10.3141/2014-01 (2007).
  • 9. Çolak, S., Lima, A. & González, M. C. Understanding congested travel in urban areas. Nature communications 7(1), 10793 (2016).
  • 10. Daganzo, C. F. Multinomial probit: the theory and its application to demand forecasting (Academic Press, New York, 1979).
  • 11. Bureau of Public Roads. Traffic assignment manual (U.S. Dept. of Commerce, Urban Planning Division, 1964).
  • 12. Cantarella, G. E. & Cascetta, E. Dynamic Processes and Equilibrium in Transportation Networks: Towards a Unifying Theory. Transportation Science 29(4), 305–329 (1995).
  • 13. Cantarella, G., Watling, D., De Luca, S. & Di Pace, R. Dynamics and Stochasticity in Transportation Systems: Tools for Transportation Network Modelling. Elsevier. (2019).
  • 14. Cascetta, E. Transportation Systems Analysis 2nd edn. (Springer, Springer Optimization and Its Applications, 2009).
  • 15. Cascetta, E. A stochastic process approach to the analysis of temporal dynamics in transportation networks. Transportation Research Part B: Methodological 23(1), 1–17. 10.1016/0191-2615(89)90019-2 (1989).
  • 16. Choudhary, A. & Gokhale, S. Urban real-world driving traffic emissions during interruption and congestion. Transportation Research Part D: Transport and Environment 43, 59–70 (2016).
  • 17. Daganzo, C. F. & Sheffi, Y. On Stochastic Models of Traffic Assignment. Transportation Science 11(3), 253–274 (1977).
  • 18. Davis, G. A. & Nihan, N. L. Large Population Approximations of a General Stochastic Traffic Assignment Model. Operations Research 41(1), 169–178 (1993).
  • 19. Di, X., Liu, H., Pang, J. & Ban, X. Boundedly Rational User Equilibria (BRUE): Mathematical Formulation and Solution Sets. Procedia - Social and Behavioral Sciences 80, 231–248 (2013).
  • 20. Duan, J. et al. Spatiotemporal dynamics of traffic bottlenecks yields an early signal of heavy congestions. Nature communications 14(1), 8002 (2023).
  • 21. Erev, I. & Roth, A. E. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American economic review, 848–881 (1998).
  • 22. Fosgerau, M., Emerson, M., de Palma, A. & Shum, M. Discrete choice and rational inattention: a general equivalence result. International Economic Review 61(4) (2020).
  • 23. Fosgerau, M., Paulsen, M. & Rasmussen, T. K. A perturbed utility route choice model. Transportation Research Part C: Emerging Technologies 136, 103514 (2022).
  • 24. Friesz, T. L., Bernstein, D., Mehta, N. J., Tobin, R. L. & Ganjalizadeh, S. Day-To-Day Dynamic Network Disequilibria and Idealized Traveler Information Systems. Operations Research 42(6), 1120–1136 (1994).
  • 25. Gao, S., Frejinger, E. & Ben-Akiva, M. Adaptive route choices in risky traffic networks: A prospect theory approach. Transportation Research Part C: Emerging Technologies 18(5), 727–740. 10.1016/j.trc.2009.08.001 (2010).
  • 26. Gawron, C. Simulation-based traffic assignment - computing user equilibria in large street networks. Ph.D. Dissertation, University of Köln, Germany (1998).
  • 27. Geroliminis, N. & Daganzo, C. F. Existence of urban-scale macroscopic fundamental diagrams: Some experimental findings. Transportation Research Part B: Methodological 42(9), 759–770 (2008).
  • 28. Glimcher, P. W. Efficiently irrational: deciphering the riddle of human choice. Trends in cognitive sciences 26(8), 669–687 (2022).
  • 29. Harker, P. T. Multiple equilibrium behaviors on networks. Transportation science 22(1), 39–46 (1988).
  • 30. Harsanyi, J. C. Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points. International journal of game theory 2(1), 1–23 (1973).
  • 31. Horowitz, J. L. The stability of stochastic equilibrium in a two-link transportation network. Transportation Research Part B: Methodological 18(1), 13–28 (1984).
  • 32. Ji, Y. & Geroliminis, N. On the spatial partitioning of urban transportation networks. Transportation Research Part B: Methodological 46(10), 1639–1656 (2012).
  • 33. Kashmiri, F. A. & Lo, H. K. Routing of autonomous vehicles for system optimal flows and average travel time equilibrium over time. Transportation Research Part C: Emerging Technologies 143, 103818 (2022).
  • 34. Leclercq, L. & Geroliminis, N. Estimating MFDs in Simple Networks with Route Choice. Procedia - Social and Behavioral Sciences 80, 99–118. 10.1016/j.sbspro.2013.05.008 (2013).
  • 35. Li, Q., Chen, Z. & Li, X. A review of connected and automated vehicle platoon merging and splitting operations. IEEE Transactions on Intelligent Transportation Systems 23(12), 22790–22806 (2022).
  • 36. Li, J., Wang, Z. & Nie, Y. Wardrop equilibrium can be boundedly rational: A new behavioral theory of route choice. Transportation Science. 10.1287/trsc.2023.0132 (2024).
  • 37. Mahmassani, H. S. & Chang, G. On Boundedly Rational User Equilibrium in Transportation Systems. Transportation Science 21(2), 89–99 (1987).
  • 38. Matejka, F. & McKay, A. Rational Inattention to Discrete Choices: A new foundation for the Multinomial Logit Model. American Economic Review 105, 272–98 (2015).
  • 39. McFadden, D. Conditional logit analysis of qualitative choice behavior. Front. Econ., 105–142 (1974).
  • 40. Nash, J. Non-Cooperative Games. The Annals of Mathematics 54(2), 286–295 (1951).
  • 41. Von Neumann, J. & Morgenstern, O. Theory of games and economic behavior (Princeton University Press, 1944).
  • 42. Olmos, L. E., Çolak, S., Shafiei, S., Saberi, M. & González, M. C. Macroscopic dynamics and the collapse of urban traffic. Proceedings of the National Academy of Sciences 115(50), 12654–12661 (2018).
  • 43. Patriksson, M. The traffic assignment problem: models and methods. Courier Dover Publications. (2015).
  • 44. Prieto-Curiel, R. & Ospina, J. P. The ABC of mobility. Environment International 185, 108541 (2024).
  • 45. Rosenthal, R. W. A class of games possessing pure-strategy Nash equilibria. Int J Game Theory 2, 65–67. 10.1007/BF01737559 (1973).
  • 46. Saberi, M. et al. A simple contagion process describes spreading of traffic jams in urban networks. Nature communications 11(1), 1616 (2020).
  • 47. Smith, M., Hazelton, M. L., Lo, H. K., Cantarella, G. E. & Watling, D. P. The long term behaviour of day-to-day traffic assignment models. Transportmetrica A: Transport Science 10(7), 647–660. 10.1080/18128602.2012.751683 (2013).
  • 48. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. 2nd edition. The MIT Press (2018).
  • 49. Thurstone, L. L. A law of comparative judgment. Psychological review 101(2), 266 (1994).
  • 50. Van Vuren, T. & Watling, D. A multiple user class assignment model for route guidance. Transportation research record, 22-22 (1991).
  • 51. Wang, Z. et al. A survey on cooperative longitudinal motion control of multiple connected and automated vehicles. IEEE Intelligent Transportation Systems Magazine 12(1), 4–24 (2019).
  • 52. Wardrop, J. G. Some Theoretical Aspects of Road Traffic Research. Proceedings of the Institution of Civil Engineers 1(3), 325–362 (1952).
  • 53. Watling, D. & Cantarella, G. Model Representation & Decision-Making in an Ever-Changing World: The Role of Stochastic Process Models of Transportation Systems. Networks and Spatial Economics 15(3), 843–882 (2015).
  • 54. Watling, D., Rasmussen, T., Prato, C. & Nielsen, O. Stochastic user equilibrium with a bounded choice model. Transportation Research Part B: Methodological 114, 254–280. 10.1016/j.trb.2018.05.004 (2018).
  • 55. Whitehead, D. The El Farol bar problem revisited: Reinforcement learning in a potential game. (2008).
  • 56. Yang, H. Multiple equilibrium behaviors and advanced traveler information systems with endogenous market penetration. Transportation Research Part B: Methodological 32(3), 205–218 (1998).
  • 57. Yang, H., Zhang, X. & Meng, Q. Stackelberg games and multiple equilibrium behaviors on networks. Transportation Research Part B: Methodological 41(8), 841–861. 10.1016/j.trb.2007.03.002 (2007).
  • 58. Zhang, K. & Nie, Y. M. Mitigating the impact of selfish routing: An optimal-ratio control scheme (ORCS) inspired by autonomous driving. Transportation Research Part C: Emerging Technologies 87, 75–90 (2018).
  • 59. Zhu, S. & Levinson, D. Do people use the shortest path? An empirical test of Wardrop’s first principle. PloS one 10(8), e0134322 (2015).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
