An empirical data analysis of “price runs” in daily financial indices: Dynamically assessing market geometric distributional behavior

Héctor Raúl Olivares-Sánchez; Carlos Manuel Rodríguez-Martínez; Héctor Francisco Coronel-Brizio; Enrico Scalas; Thomas Henry Seligman; Alejandro Raúl Hernández-Montoya

2022 Jul 7;17(7):e0270492. doi: 10.1371/journal.pone.0270492


Editor: Aurelio F Bariviera

PMCID: PMC9262240 PMID: 35797336

Abstract

In financial time series there are time periods in which market index values or asset prices increase or decrease monotonically. We call those events “price runs”, “elementary uninterrupted trends” or just “uninterrupted trends”. In this paper we study the distribution of the duration of uninterrupted trends for the daily indices DJIA, NASDAQ, IPC and Nikkei 225 during the period from 10/30/1978 to 08/07/2020, and we compare the simple geometric statistical model with $p = \frac{1}{2}$, consistent with the EMH, to the empirical data. By a fitting procedure, it is found that the geometric distribution with parameter $p = \frac{1}{2}$ provides a good model for uninterrupted trends of short and medium duration for the more mature markets; however, the longest-duration events still need to be statistically characterized. Estimated values of the parameter p were also obtained and confirmed by calculating the mean value of p fluctuations from empirical data. Additionally, the observed trend duration distributions for the different studied markets are compared over time, by means of the Anderson-Darling (AD) test, to the expected geometric distribution with parameter $p = \frac{1}{2}$ and to a geometric distribution with a free parameter p. This makes it possible to assess and compare the geometric behavior of different markets at different dates, as well as to measure the fraction of time during which run durations from the studied markets are consistent with the geometric distribution, both with $p = \frac{1}{2}$ and with a free parameter.

Introduction

Financial-market analysis studies the movements of asset prices and financial indices. Extracting a profit from these movements is an important activity in the financial industry; a large variety of methods that intend to predict market behavior have been developed over the years, ranging from complex mathematical models to even pseudo-scientific techniques [1]. An important approach is the statistical analysis of large sets of data, now partially available to small investors as well, owing to the increasing availability of computer power and high-quality data sets. This analysis has benefited from the contributions not only of economists, but also of many physicists and mathematicians who have applied methods and ideas of probability theory and statistical physics to finance. As an academic result of these efforts, a set of universal, nontrivial statistical properties of financial historical data, persistent over time, has been observed and called “stylized facts” [2, 3].

When looking at the price values of an asset on a financial time series chart, it is common to observe “price trends” in which most of the values are larger (or smaller) than the previous ones. These trends can be seen as composed of uninterrupted elementary trends, that is, periods in which the value increases or decreases monotonically. Trends are a popular subject within so-called technical analysis. According to the followers of technical analysis, the chartists, patterns in the trend direction of financial data are indicators of changes in market direction and of the future behavior of prices. The effectiveness of this approach to financial markets is disputed and called into question by what is known as the Efficient Market Hypothesis (EMH), which states that current prices reflect available information. Elementary uninterrupted trends are the main subject of the present work, in which we empirically study a basic random process, consistent with the EMH, that allows us to quantify trend directions in financial time series. From now on, we call these elementary uninterrupted trends simply “trends” or, more specifically, “uptrends” or “downtrends”, depending on their direction. Empirical studies of financial and economic data are becoming increasingly relevant for the following reasons:

1) Currently dozens of stylized facts have been observed and more are still being discovered. 2) The study and prediction of stylized facts by means of multi-agent market models is an important area of research in Finance and Econophysics. 3) Stylized facts are an important tool to validate proposed numerical and multi-agent market models. 4) At present, we still lack a general, microscopic theory or model to explain the origin of stylized facts; we think simulation methodologies using agents could be useful in the construction of such a general theory. Some interesting references on these issues are the following: [3–10].

Before going further, it is necessary to present some preliminary and basic definitions. In subsections Definitions and The Efficient Market Hypothesis these definitions and other useful information will be presented. In section An ‘Efficient Market’ toy model for the distribution of run durations, a model for the distribution of trends duration will be developed consistently with the EMH. Section Data sample and methodology will explain how the data were analyzed and section Data analysis will provide an interpretation of the analysis.

Definitions

Given a financial time series of asset prices or index values, S(1), S(2), …, S(n), let X(t) = log S(t) be the logarithm of its terms, where t = 1, 2, …, n. A common quantity used to study price variations in financial time series is the log-return defined at time t as

\begin{matrix} r (t, Δ t) = X (t + Δ t) - X (t) \end{matrix}

(1)

for a given time sampling scale Δt. If the price variation is small, the log-return is a good approximation of the return

\begin{matrix} R (t, Δ t) = \frac{S (t + Δ t) - S (t)}{S (t)} . \end{matrix}

(2)

In this paper, we consider Δt equal to 1 day and we use the values of the indices corresponding to the close value in the investigated markets. More details on the data set will be given in section Data sample and methodology.
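As a minimal sketch of Eqs (1) and (2), the following Python snippet computes log-returns and returns for a short list of closing values. The function names and the sample values are hypothetical and are not taken from the data set; they only illustrate that the two quantities nearly coincide when daily variations are small.

```python
import math

def log_returns(prices, dt=1):
    """Log-returns r(t, dt) = log S(t + dt) - log S(t), as in Eq (1)."""
    return [math.log(prices[t + dt]) - math.log(prices[t])
            for t in range(len(prices) - dt)]

def simple_returns(prices, dt=1):
    """Returns R(t, dt) = (S(t + dt) - S(t)) / S(t), as in Eq (2)."""
    return [(prices[t + dt] - prices[t]) / prices[t]
            for t in range(len(prices) - dt)]

# Hypothetical closing values (illustration only, not from the S1 Dataset).
closes = [100.0, 101.0, 100.5, 102.0]
r = log_returns(closes)
R = simple_returns(closes)

for r_t, R_t in zip(r, R):
    # For small variations the two numbers agree to several decimal places.
    print(f"log-return = {r_t:+.5f}   return = {R_t:+.5f}")
```

Both lists always share the same signs, which is why the sign of the price change can be studied through either quantity.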

An elementary trend of duration k is defined as a subseries of k + 1 values within the given time series S(t) in which every value is greater than (for an uptrend) or smaller than or equal to (for a downtrend) the preceding one. An example is shown in Fig 1 for the prices of the DJIA in the period between October 1978 and January 1979. The duration of an elementary downward/upward trend in daily data is the number of days before the price changes direction; that is, if the price change does not change sign from one day to the next, the corresponding trend continues. In this figure, focusing our attention on the red points, we see first an uninterrupted downtrend one day long, followed by a three-day uptrend, then a two-day downtrend, a three-day uptrend, etc. By construction, uptrends and downtrends appear alternately in the original time series S(t).

Fig 1 — The dashed line segments join the starting and ending points of each elementary trend.

Here, we present a detailed statistical study of these short elementary trends using market closing price values from four different indices over a time sampling scale of Δt = 1 day for the period between October 30, 1978 and August 07, 2020.
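The definition of elementary trends can be sketched in a few lines of Python. This is only an illustration under the literal reading of the definition above (a step counts as “up” when the value strictly increases and as “down” otherwise); the function name and the sample prices are hypothetical.

```python
from itertools import groupby

def trend_durations(prices):
    """Return (direction, duration) pairs for consecutive elementary trends.

    direction is +1 for an uptrend (each value greater than the previous one)
    and -1 for a downtrend (each value smaller than or equal to the previous one).
    """
    # One step per pair of consecutive values.
    steps = [1 if b > a else -1 for a, b in zip(prices, prices[1:])]
    # Consecutive equal steps form one elementary trend; its length is the duration.
    return [(sign, sum(1 for _ in run)) for sign, run in groupby(steps)]

# Hypothetical closing values (illustration only).
closes = [10.0, 9.0, 9.5, 9.7, 10.1, 9.9, 9.8, 10.0]
trends = trend_durations(closes)
print(trends)
```

For these sample values the sequence starts with a one-day downtrend, followed by a three-day uptrend and a two-day downtrend, mirroring the pattern described for Fig 1. The durations always add up to the number of daily steps.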

The Efficient Market Hypothesis

The Efficient Market Hypothesis (EMH), first stated by Eugene F. Fama in 1970 [11], claims that the market quickly finds the rational price for a traded asset [12], as the current value incorporates all possible information about the price in the future. The most important consequence of this hypothesis, shown by P. Samuelson [13], is that the best forecast for the future price of an asset is its present price:

\begin{matrix} E (S (t + Δ t) | F_{t}) = S (t), \end{matrix}

(3)

where $E (\cdot | F_{t})$ is the conditional expectation with respect to the filtration $F_{t}$ , namely with respect to the known history up to time t. Indeed, it is easy to derive the above form of EMH starting from a simple statistical no-arbitrage argument. Suppose we have two assets, a risky one, with price S(t) and a risk-free one giving a constant interest rate r_F. To avoid arbitrage, one has to require that the expected return of the risky asset is equal to the risk-free interest rate, that is

\begin{matrix} E (R (t, Δ t) | F_{t}) = r_{F}; \end{matrix}

(4)

where R(t, Δt) was defined in Eq (2), assuming for simplicity that no dividends are paid in the time interval Δt. The latter equation immediately yields, for non vanishing S(t),

\begin{matrix} E (S (t + Δ t) | F_{t}) = (1 + r_{F}) S (t), \end{matrix}

(5)

which reduces to Eq (3) for r_F = 0. Eqs (3) and (5), jointly with the integrability of the price process S(t), that is $E [| S (t) |] < \infty$ , are known as the martingale and sub-martingale conditions, respectively (remember that, under normal conditions, r_F ≥ 0, even if interest rates can be negative). Eq (5) together with integrability also means that the discounted price is a martingale when r_F > 0. Note that Eqs (3) and (5) do not uniquely specify a random process for S(t), but one can prove that, if they hold, then returns must be uncorrelated. In financial data, squared returns or absolute returns turn out to be correlated (with long-range correlations), but this stylized fact does not falsify the EMH, even if it is the main reason for the popularity of ARCH/GARCH models in financial econometrics [14, 15].

The EMH invalidates the pretence of technical analysis to predict future prices or trends; in fact, in Samuelson’s words, “there is no way of making an expected profit by extrapolating past changes in the futures price, by chart or any esoteric devices of magic or mathematics” [13].

An ‘Efficient Market’ toy model for the distribution of run durations

Among all the possible statistical models that can describe price fluctuations, the geometric random walk is the simplest one. A geometric random walk is just a product of independent and identically distributed positive random variables. If the expected value of these variables is 1, then the geometric random walk is a martingale; otherwise, if the expected value is larger than 1, the geometric random walk is a submartingale. However, the geometric random walk hypothesis is neither necessary nor sufficient for an efficient market, as shown by many authors, among whom Leroy [16], Lucas [17] and Lo and Mackinlay [1]. Again, to see this point, it is enough to consider that Eq (5) allows for any (sub)-martingale model.

To study our trends, note that at each step of a time series of price or index values there are three possible outcomes: increase, no change and decrease; the second one does not change the direction of prices. We therefore consider only two possible outcomes: either the time series increases or it does not. In an efficient market, the expected future price only depends on information about the current price, not on its previous history. Therefore, it should be impossible to predict the expected direction of a future price change given the history of the price process. In formula, from Eq (3) (after discounting for the risk-free rate), we have

\begin{matrix} E (S (t + Δ t) - S (t) | F_{t}) = 0; \end{matrix}

(6)

therefore, if we consider the sign of the price change Y(t, Δt) = sign(S(t + Δt) − S(t)), which coincides with the sign of returns, we accordingly have

\begin{matrix} E (Y (t, Δ t)) = 0 . \end{matrix}

(7)

If the price follows a geometric random walk, then the series of price-change signs can be modeled as a Bernoulli process. This process could be biased to take the presence of a risk-free interest rate into account. To be more specific, let us consider a log-normal geometric random walk and let us use the assumption Δt = 1. Let S₀ be the initial price. The price at time t will be given by

\begin{matrix} S (t) = S_{0} \prod_{i = 1}^{t} Q_{i} \end{matrix}

(8)

where Q_i are independent and identically distributed random variables following a log-normal distribution with parameters μ and σ. These two parameters come from the corresponding normal distribution for log-returns. As a direct consequence of the EMH in the form Eq (5), we have

\begin{matrix} E (Q) = 1 + r_{F}, \end{matrix}

(9)

whilst for a log-normal distributed random variable the expected value is

\begin{matrix} E (Q) = e^{μ} e^{σ^{2} / 2}; \end{matrix}

(10)

by combining these two equations the following dependence between the parameters is found:

\begin{matrix} σ = \sqrt{2 (log (1 + r_{F}) - μ)} \end{matrix}

(11)

which allows us to compare the parameters estimated from the distributions of the index price returns, as the values of the two parameters μ and σ come from the corresponding normal distribution for log-returns of the price or index data. Further, from the cumulative distribution function of a log-normal random variable

\begin{matrix} F_{Q} (u) = P (Q \leq u) = \frac{1}{2} + \frac{1}{2} erf (\frac{log (u) - μ}{\sqrt{2 σ^{2}}}), \end{matrix}

(12)

we find that the probability of a negative sign of the return is given by

\begin{matrix} q = F_{Q} (1) = P (Q \leq 1) = \frac{1}{2} + \frac{1}{2} erf (\frac{σ}{2 \sqrt{2}} - \frac{log (1 + r_{F})}{\sqrt{2} σ}) . \end{matrix}

(13)

For typical markets, from 1978 to 2020, the daily risk-free rate of return oscillated in the range 0 < r_F ≤ 2.5 × 10⁻⁴. Eq (11) can be tested using the values of r_F and the estimates of μ and σ, as can be seen in Fig 2. Using these values for r_F, the probability of negative returns is found to be q = 0.5±0.02; thus the Bernoulli process seems a reasonable first approximation for the probability of a change in sign for the return data.

Fig 2 — The risk-free rate of return r_F oscillates between the interval [0, 0.061). For these values the probability of a negative return is q = 0.5±0.02. Subfig 2(a) is the probability of negative return given the risk-free interest rate r_f and mean σ. The σ parameter can be estimated from the return time series Q_i, and r_f is estimated from a reference asset that depends on the market being studied. Intersection with q = 0.5 is also shown. Subfig 2(b) is the daily risk-free rate of returns r_F for the US market, developed markets and emerging markets. For the period ranging from 1978 to 2020 the daily risk-free rate oscillates in the interval [0, 0.061). Data was downloaded from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
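Eq (13) can be evaluated numerically with the standard error function. The sketch below is only illustrative: the daily volatility σ = 0.01 is an assumed value, not an estimate from the data, and the function names are hypothetical. It also shows how μ follows from Eq (11) once σ and r_F are fixed.

```python
import math

def mu_from_sigma(sigma, r_f):
    """Eq (11) rearranged: mu = log(1 + r_F) - sigma^2 / 2."""
    return math.log(1.0 + r_f) - sigma ** 2 / 2.0

def prob_negative_return(sigma, r_f):
    """Eq (13): q = P(Q <= 1) for a log-normal factor Q with E(Q) = 1 + r_F."""
    arg = (sigma / (2.0 * math.sqrt(2.0))
           - math.log(1.0 + r_f) / (math.sqrt(2.0) * sigma))
    return 0.5 + 0.5 * math.erf(arg)

sigma = 0.01  # assumed daily volatility, for illustration only
for r_f in (0.0, 1.0e-4, 2.5e-4):
    q = prob_negative_return(sigma, r_f)
    print(f"r_F = {r_f:.1e}  mu = {mu_from_sigma(sigma, r_f):+.2e}  q = {q:.4f}")
```

With this assumed σ, the resulting q stays within the band q = 0.5±0.02 quoted above for the whole range of r_F considered.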

Under this framework, it becomes natural to use the biased Bernoulli process as the null hypothesis for the time series of sign changes of the log-returns [18]. It is known that the distribution of the number x of failures needed to get one success for a Bernoulli process with success probability p = 1 − q is the geometric distribution $G (p)$ . The probability of observing x failures before the first success is then given by

\begin{matrix} P (x) = P (N = x) = p {(1 - p)}^{x} = p q^{x} . \end{matrix}

(14)

The duration of an elementary downward trend in daily data is the number of days before the price increases, so the distribution of such trend durations should follow a geometric distribution. An identical argument applies to the duration of an upward trend. Note that such sequences of identical outcomes are also known as runs or clumps in the mathematical literature. Some historical references on this subject are [19–21]. Chapter X of the first reference was for many years the classical textbook reference on the theory of runs; the second reference presents an interesting statistical test, based on the properties of runs, to decide whether two sets of independent observations corresponding to two independent random variables have the same distribution; finally, the third reference gives an extensive treatment of the theory of runs that is still of current interest.
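The link between a Bernoulli process and geometrically distributed run durations can be checked with a small simulation. This is a sketch under stated assumptions: durations are counted from 1, so a duration k corresponds to x = k − 1 in Eq (14); the function names, the seed and the number of steps are arbitrary choices made for illustration.

```python
import random
from collections import Counter
from itertools import groupby

def simulate_run_lengths(p_up, n_steps, seed=0):
    """Simulate independent up/down steps and return (uptrend, downtrend) durations."""
    rng = random.Random(seed)
    steps = [rng.random() < p_up for _ in range(n_steps)]
    runs = [(is_up, sum(1 for _ in g)) for is_up, g in groupby(steps)]
    runs = runs[:-1]  # the last run is cut by the end of the series, so drop it
    up = [k for is_up, k in runs if is_up]
    down = [k for is_up, k in runs if not is_up]
    return up, down

def geometric_pmf(k, stop_prob):
    """P(duration = k), k = 1, 2, ...: k - 1 continuations followed by a stop."""
    return stop_prob * (1.0 - stop_prob) ** (k - 1)

up, down = simulate_run_lengths(p_up=0.5, n_steps=200_000)
freq = Counter(down)
for k in range(1, 6):
    # A downtrend stops when the price increases, i.e. with probability p_up.
    print(k, round(freq[k] / len(down), 4), round(geometric_pmf(k, 0.5), 4))
```

For p = 1/2 the simulated frequencies follow the sequence 1/2, 1/4, 1/8, … up to sampling noise, and the mean duration is close to two days.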

In the next section we describe the data used for testing the model just presented, and we discuss the goodness-of-fit test applied to compare the observed distribution of the duration of upward/downward price trends, whose direction coincides with the sign of the log-returns, with the expected one, namely the geometric distribution.

Data sample and methodology

In this work, daily closing values of four financial indices were analyzed, namely the Dow Jones Industrial Average (DJIA), the NASDAQ Composite, the Mexican Índice de Precios y Cotizaciones (IPC) and the Nikkei 225, for the period between October 30, 1978 and August 07, 2020. The full data sample for this time span is available as supplementary material; see S1 Dataset at the end of this paper. The number of analyzed records and the numbers of uninterrupted uptrends and downtrends found, as defined in subsection Definitions, are displayed in Table 1.

Table 1. Numbers of total observed records and respective uninterrupted trends for all data samples of financial indices studied.

The data have been filtered, e.g. by removing null records.

Market	Records	Trends	Uptrends	Downtrends
DJIA	10571	5362	2681	2681
Nasdaq	10534	4779	2389	2390
IPC	10432	4410	2205	2205
Nikkei	10300	5167	2583	2584


Remember that, by construction, for each data sample the numbers of uninterrupted uptrends and downtrends are the same if the analyzed financial time series has an even total number of trends, and they differ by one if the total number of trends is odd.

The composition of trends for each data sample is described in Tables 2–5. Additional and brief comments on the different duration of constructed uninterrupted trend data samples may be found in section Conclusions.

Table 2. Composition of uninterrupted trends observed in the DJIA data sample.

Duration (days)	2681 Uptrends	2681 Downtrends	5362 Total
1	1262	1429	2691
2	674	673	1347
3	354	316	670
4	210	147	357
5	90	70	160
6	45	28	73
7	23	10	33
8	11	7	18
9	5	0	5
10	3	0	3
11	2	0	2
12	1	1	2
13	1	0	1


Table 5. Composition of uninterrupted trends observed in Nikkei index.

Duration (days)	2583 Uptrends	2584 Downtrends	5167 Total
1	1253	1337	2590
2	652	644	1296
3	328	321	649
4	165	164	329
5	81	64	145
6	41	27	68
7	31	17	48
8	15	4	19
9	9	5	14
10	3	0	3
11	2	0	2
12	1	1	2
13	0	0	0
14	0	0	0
15	1	0	1
16	1	0	1


Table 3. Composition of uninterrupted trends observed in the Nasdaq data sample.

Duration (days)	2389 Uptrends	2390 Downtrends	4779 Total
1	977	1240	2217
2	532	586	1118
3	384	300	684
4	209	128	337
5	120	71	191
6	58	36	94
7	45	13	58
8	29	10	39
9	10	4	14
10	9	1	10
11	7	0	7
12	5	0	5
13	1	0	1
14	1	0	1
15	0	0	0
16	0	1	1
17	0	0	0
18	1	0	1
19	1	0	1


Table 4. Composition of uninterrupted trends found in the IPC data sample.

Duration (days)	2205 Uptrends	2205 Downtrends	4410 Total
1	865	981	1846
2	537	559	1096
3	320	323	643
4	192	151	343
5	111	97	208
6	74	42	116
7	45	25	70
8	22	8	30
9	16	8	24
10	10	4	14
11	5	3	8
12	3	1	4
13	0	1	1
14	1	0	1
15	3	0	3
16	0	0	0
17	0	0	0
18	0	0	0
19	0	1	1
20	1	0	1
21	0	0	0
22	0	0	0
23	0	1	1


Finally, in Table 6, we show the descriptive statistics of the data presented in the current section: the mean, RMS, skewness and kurtosis are displayed. It can be seen that the mean value of the observed trend duration is close to two for all studied markets, that it is bigger for the less mature markets, and that the mean uptrend duration is slightly bigger than the mean downtrend duration for all markets.

Table 6. Descriptive statistics of data presented in Tables 2–5.

Market	Mean	RMS	Skewness	Kurtosis
DJIA overall	1.9713 ± 0.0185	1.3511 ± 0.0130	2.0215 ± 0.0334	8.9454 ± 0.9998
DJIA uptrends	2.0595 ± 0.0275	1.4298 ± 0.0194	1.9844 ± 0.0471	8.7538 ± 0.9996
DJIA downtrends	1.8375 ± 0.0229	1.1893 ± 0.0162	1.8010 ± 0.0472	6.7113 ± 0.9996
Nasdaq overall	2.2040 ± 0.0243	1.6825 ± 0.0172	2.3962 ± 0.0354	12.5405 ± 0.9998
Nasdaq uptrends	2.4519 ± 0.0385	1.8884 ± 0.0273	2.1989 ± 0.0500	10.9085 ± 0.9996
Nasdaq downtrends	1.9233 ± 0.0273	1.3378 ± 0.0193	2.2818 ± 0.0499	12.0090 ± 0.9996
IPC overall	2.3653 ± 0.0276	1.8307 ± 0.0195	2.5248 ± 0.0369	14.8844 ± 0.9998
IPC uptrends	2.4899 ± 0.0407	1.9202 ± 0.0288	2.0464 ± 0.0519	8.8614 ± 0.9996
IPC downtrends	2.1800 ± 0.0340	1.5995 ± 0.0240	2.4244 ± 0.0520	13.7515 ± 0.9996
Nikkei overall	1.9932 ± 0.0198	1.4214 ± 0.0140	2.2929 ± 0.0341	11.3466 ± 0.9998
Nikkei uptrends	2.0735 ± 0.0304	1.5434 ± 0.0215	2.4070 ± 0.0481	12.0543 ± 0.9996
Nikkei downtrends	1.9094 ± 0.0251	1.2774 ± 0.0178	1.9697 ± 0.0482	8.4111 ± 0.9996
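As a worked example, the “DJIA overall” row of Table 6 can be approximately recomputed from the Total column of Table 2. The sketch below assumes that “RMS” denotes the standard deviation of the durations and that the quoted error on the mean is the standard error; both readings are assumptions, and the function name is hypothetical.

```python
import math

# Total column of Table 2 (DJIA): duration in days -> number of trends.
djia_total = {1: 2691, 2: 1347, 3: 670, 4: 357, 5: 160, 6: 73, 7: 33,
              8: 18, 9: 5, 10: 3, 11: 2, 12: 2, 13: 1}

def describe(counts):
    """Return (n, mean, standard deviation, standard error of the mean)."""
    n = sum(counts.values())
    mean = sum(k * c for k, c in counts.items()) / n
    var = sum(c * (k - mean) ** 2 for k, c in counts.items()) / n
    sd = math.sqrt(var)
    return n, mean, sd, sd / math.sqrt(n)

n, mean, sd, sem = describe(djia_total)
print(f"n = {n}, mean = {mean:.4f} +/- {sem:.4f}, sd = {sd:.4f}")
```

Under these assumptions the mean, its error and the spread agree with the tabulated DJIA overall values to about three decimal places.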


The Anderson-Darling goodness of fit test

In order to compare the observed and expected distributions of trend durations, the Anderson-Darling (AD) test described in references [22, 23] was used. The AD test belongs to a family of goodness-of-fit tests called the Cramér-von Mises tests, which includes the Anderson-Darling test, Watson’s test and the Cramér-von Mises test itself. The family was originally developed to test continuous distributions, but a generalization for discrete distributions appeared for the first time in an article by Choulakian et al. [23]. The Anderson-Darling test was found to be the most suitable for this purpose because it places more weight on the tails of a distribution than other goodness-of-fit tests.

The principle behind this kind of test is to define a statistic that measures the distance between a theoretical distribution function F₀(k) and the empirical (cumulative) distribution function for n events, F_n(k). Every value of the statistic is associated with a p-value, which can be interpreted as the probability of obtaining a value of the statistic at least as large as the one obtained, given that the null hypothesis

\begin{matrix} H_{0} : F_{n} (k) = F_{0} (k) \end{matrix}

(15)

is true. If the p-value is smaller than a previously defined threshold value α, the null hypothesis is rejected.

Two separate tests were applied on our data:

1) A test of whether the observed data come from a geometric distribution with p = q = 0.5. Based on the model outlined in section An ‘Efficient Market’ toy model for the distribution of run durations and on empirical evidence from the data, we interpret a rejection of this null hypothesis as evidence that the market is moving up or down in the investigated period.
2) A test of whether the observed data are drawn from a distribution belonging to the parametric family $G (p)$ . This tells us whether the up and down ticks can be modeled as a Bernoulli process.

These two tests will allow us to assess the validity of the Bernoulli hypothesis.

For a more complete discussion on the Anderson-Darling test for discrete data, including some comments about how it was applied to the geometric distribution case, see S1 Appendix.
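To make the idea of a distance between F_n(k) and F₀(k) concrete, the sketch below computes an Anderson-Darling-type statistic for grouped data. It assumes the form A² = (1/N) Σ_j Z_j² p_j / (H_j (1 − H_j)), with Z_j the difference between cumulative observed and expected counts and H_j the cumulative null probability, which is the form usually associated with the discrete test of [23]; the exact procedure used in this paper is the one in S1 Appendix, which is not reproduced here. Pooling the tail into a last cell and the function names are illustrative choices, and the computation of p-values is not shown.

```python
def geometric_cells(p, n_cells):
    """Null probabilities for durations 1..n_cells-1 plus a pooled tail (>= n_cells)."""
    q = 1.0 - p
    probs = [p * q ** (k - 1) for k in range(1, n_cells)]
    probs.append(q ** (n_cells - 1))  # remaining probability mass of the tail
    return probs

def discrete_anderson_darling(observed, probs):
    """AD-type statistic for grouped counts (assumed form, see lead-in)."""
    n = sum(observed)
    cum_obs = 0.0
    cum_prob = 0.0
    stat = 0.0
    # The last cell is skipped: there Z = 0 and H = 1, so it contributes nothing.
    for o_j, p_j in zip(observed[:-1], probs[:-1]):
        cum_obs += o_j
        cum_prob += p_j
        z_j = cum_obs - n * cum_prob
        stat += z_j ** 2 * p_j / (cum_prob * (1.0 - cum_prob))
    return stat / n

probs = geometric_cells(p=0.5, n_cells=4)
print(discrete_anderson_darling([8, 4, 2, 2], probs))   # counts equal to expectation
print(discrete_anderson_darling([16, 0, 0, 0], probs))  # all mass in the first cell
```

The statistic vanishes when the observed counts coincide with the expected ones and grows as the cumulative counts drift away from the null distribution; the weights 1/(H_j (1 − H_j)) are what emphasize the tails.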

Data analysis

In order to motivate our analysis, in Fig 3 we show, for the four markets studied in this paper, the ratio of upward to total price changes in daily data plotted against time for the years 1978–2020. This ratio is calculated over an overlapping time window of 504 trading days shifted every 5 days. It can be seen that the ratio stays closer to the value of $\frac{1}{2}$ for DJIA and Nikkei, whereas for Nasdaq and IPC its variations are greater than those expected for the same time windows in a Bernoulli process with parameter p = 1/2.
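The windowed ratio described above can be sketched as follows. The snippet assumes that “total price changes” means all daily steps in the window, including days with no change; that choice, the function name and the synthetic test series are assumptions made for illustration.

```python
def rolling_up_ratio(prices, window=504, shift=5):
    """Fraction of strictly upward daily changes in each overlapping window."""
    ups = [1 if b > a else 0 for a, b in zip(prices, prices[1:])]
    ratios = []
    # Slide a window of `window` daily changes forward by `shift` days at a time.
    for start in range(0, len(ups) - window + 1, shift):
        ratios.append(sum(ups[start:start + window]) / window)
    return ratios

# Synthetic series that alternates up and down every day (illustration only).
prices = [100.0 + (i % 2) for i in range(1000)]
ratios = rolling_up_ratio(prices)
print(len(ratios), min(ratios), max(ratios))
```

For a perfectly alternating series every window contains as many upward as downward steps, so the ratio is exactly one half in all windows; real index data fluctuate around that value.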

In what follows, we present different detailed studies of the four markets to see whether consecutive price increments/decrements with the same sign follow a geometric distribution. First, after estimating the distributions of the duration of uninterrupted uptrends and downtrends for all data, we separately and independently fit a geometric distribution to each of these distributions, where the observed sum of the same duration uptrends and downtrends is the only constraint. For this reason, although we denote the estimated parameters by p and q for uptrend and downtrend durations respectively, this notation is only nominal: we fit those parameters separately and independently, without constraining the geometric fits to satisfy p + q = 1, i.e. we consider and analyze the sequences of uninterrupted uptrend and downtrend durations separately. For the same reason, we sometimes refer only to the parameter p in our discussion. More on this point can be found at the end of the current section.

Analyzed empirical data can be consulted in Tables 2–5, and their corresponding geometric fits are displayed in Fig 4 for all probability distributions of uptrend and downtrend durations corresponding to the four different indices studied here. A Maximum Likelihood Fit (MLF) was applied. The results of these fits can be consulted in Table 7. In Fig 4, black solid small circles represent observations, the geometric fit corresponds to the red solid line and, as a visual guide, blue dashed lines indicate a geometric distribution with parameter p = q = 0.5.

Table 7. Fitted p and q parameters of the geometric model.

The empirical trend duration distributions of DJIA and Nikkei are well fitted by the geometric model; those of Nasdaq and IPC are not. NDF means “number of degrees of freedom”. Fits were performed on the data listed in Tables 2–5.

Market	Fitted region uptrends (days)	p	χ²/NDF uptrends	Fitted region downtrends (days)	q	χ²/NDF downtrends
DJIA	≤ 13	0.4870±0.0068	8.5436/12	≤ 8	0.5409±0.0071	2.6636/7
Nasdaq	≤14	0.4160±0.0064	15.9685/13	≤10	0.5180±0.0073	3.8665/9
IPC	≤12	0.40006±0.0066	2.0455/11	≤13	0.4583±0.0071	8.5935/12
Nikkei	≤12	0.4886±0.0068	6.6170/11	≤9	0.5248±0.0071	7.0781/8


In order to obtain a good fit, with reliable p and χ² values, the fitting procedure was applied only to the region of each plot where no empty bins were observed in the trend duration distribution; that is, the fit excludes everything from the first duration with zero observed events onward. The applied cut-offs are shown in the corresponding plots of Fig 4 as dotted vertical lines. The only distribution without any empty duration value is the one corresponding to the DJIA uptrends, and it was therefore fitted over the whole range of observed values.
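A simplified version of such a fit can be written in closed form. The sketch below is not the fitting code used for Table 7: it assumes durations start at 1, uses the textbook maximum likelihood estimate (number of trends divided by total number of days) with its asymptotic standard error, and ignores the effect of truncating the support at the cut-off. The input is the DJIA downtrend column of Table 2, and the cut-off of 8 days is the one listed in Table 7; the function name is hypothetical.

```python
import math

# Downtrend column of Table 2 (DJIA): duration in days -> number of downtrends.
djia_down = {1: 1429, 2: 673, 3: 316, 4: 147, 5: 70, 6: 28, 7: 10, 8: 7, 12: 1}

def fit_geometric(counts, max_duration=None):
    """Closed-form ML estimate of the geometric parameter and its standard error."""
    kept = {k: c for k, c in counts.items()
            if c > 0 and (max_duration is None or k <= max_duration)}
    n = sum(kept.values())                       # number of trends in the fitted region
    total_days = sum(k * c for k, c in kept.items())
    p_hat = n / total_days                       # inverse of the mean duration
    std_err = p_hat * math.sqrt((1.0 - p_hat) / n)
    return p_hat, std_err

q_hat, q_err = fit_geometric(djia_down, max_duration=8)
print(f"q = {q_hat:.4f} +/- {q_err:.4f}")
```

With the 8-day cut-off this simple estimate comes out close to the DJIA value of q reported in Table 7. Other columns need not reproduce the table as closely, since the details of the published fit may differ from this sketch.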

From Fig 4 and Table 7, it can be seen that, although all discussed markets display some extreme trend durations that deviate to different degrees from the geometric model, for our data sample as a whole the distributions of uptrend and downtrend durations for DJIA and Nikkei can be fitted reasonably well by a geometric distribution with $p = \frac{1}{2}$ , while the corresponding empirical run distributions of Nasdaq and IPC are also reasonably fitted by a geometric distribution, but with parameters not necessarily equal to $\frac{1}{2}$ . More on these facts will be discussed below and in the next sections. From the above fits, we can rank the markets in order of increasing distance from the p = 0.5 model, with Nikkei being the closest, followed by the DJIA, then the Nasdaq, and the IPC being the most distant during the analyzed time period.

Finally, although the geometric model with $p = \frac{1}{2}$ is a good approximation for trends of small and medium size, especially for the more mature markets DJIA and Nikkei, it is not possible to conclude that the conditions $p = q = \frac{1}{2}$ and p + q = 1 are fulfilled in general for all markets and every run duration. Classical analyses [24] reported periods in which, for runs of the monthly DJIA index, p = 0.57 and q = 0.43 (1897–1929). In contrast, the S&P monthly composite index is consistent neither with the equal probability condition $p = q = \frac{1}{2}$ nor with the probability conservation condition p + q = 1: for the period January 1871 to December 1917, p = 0.67 and q = 0.50 for this index, and for the time span January 1918 to March 1956, p = 0.6 and q = 0.60. These results suggest that, for these cases and the financial indices examined in [24], we are not dealing with a random process with $p = q = \frac{1}{2}$ . In the classical reference [24], the p and q values are not estimated by a fitting procedure like the one performed here; instead, the relative frequencies of up and down events of the indices are calculated at different time scales.

Further empirically observed deviations from the hypothesis $p = q = \frac{1}{2}$ are reported in [25–27]. For an interesting and more modern analysis of runs in high-frequency financial data, see [28].

Time variation of p and q and other estimates of these parameters

In order to gain better insight into how our empirical trend duration distributions dynamically differ from the theoretical geometric distribution, the time evolution of the p and q values is plotted in the upper and lower left panels of Fig 5, respectively. These parameters are calculated independently over a time window of 252 trading days shifted every ten days, separately over the sequences of observed uninterrupted uptrend and downtrend durations. We see that their values tend to oscillate around $\frac{1}{2}$ and that the markets with p and q values closest to $\frac{1}{2}$ are DJIA and Nikkei. Empirical distributions of the calculated values of p and q for all markets are displayed in the upper and lower right panels of the same Fig 5, respectively. The corresponding means and standard deviations of the p and q values are displayed in Table 8, where rolling time windows of other sizes were also used in the calculation. It can be verified that the listed values are all consistent with those estimated by the geometric fit procedure shown in Fig 4, with the estimated fit parameters given in Table 7.

Table 8. Mean and standard deviation values of p and q distributions shown in Fig 5, and generated with a rolling, overlapping time window of 252 days.

Same values generated for two additional overlapping time frames of 200 and 300 days are also displayed.

Market	Time window (overlapping)	⟨p⟩	σ_p	⟨q⟩	σ_q
DJIA	200	0.4877±0.0017	0.0266±0.0012	0.5437±0.0017	0.0276±0.0012
	252	0.4881±0.0015	0.0236±0.0011	0.5440±0.0016	0.0257±0.0012
	300	0.4881±0.0014	0.0220±0.0010	0.5445±0.0016	0.0240±0.0011
Nasdaq	200	0.4210±0.0038	0.0561±0.0027	0.5198±0.0030	0.0448±0.0021
	252	0.4208±0.0037	0.0543±0.0026	0.5187±0.0029	0.0427±0.0021
	300	0.4207±0.0037	0.0534±0.0026	0.5177±0.0028	0.0406±0.0020
IPC	200	0.4130±0.0044	0.0629±0.0031	0.4732±0.0036	0.0506±0.0025
	252	0.4129±0.0044	0.0616±0.0031	0.4741±0.0034	0.0472±0.0024
	300	0.4128±0.0043	0.0600±0.0030	0.4749±0.0032	0.0450±0.0023
Nikkei	200	0.4892±0.0031	0.0486±0.0022	0.5235±0.0019	0.0295±0.0014
	252	0.4895±0.0031	0.0473±0.0022	0.5233±0.0017	0.0264±0.0012
	300	0.4898±0.0031	0.0463±0.0022	0.5234±0.0016	0.0243±0.0011


Here it is important to mention that the estimates of p and q shown in Table 8 serve, at this point, only to corroborate the values obtained by the geometric fitting procedure. We mention this because, notwithstanding the independence of both measurements and the agreement between the values displayed in Tables 7 and 8, the entries of the distributions shown in the two right histograms in Fig 5 are not all statistically independent, since they were calculated using a rolling time window of 252 days as described above. In addition, the p and q values show a certain degree of non-stationarity and, finally, these results can depend on the choice of the time-window size. Even considering these three facts, the agreement between the estimates obtained by these two different procedures over different time frames is remarkable. Taking these facts into consideration, and in order to confirm the quality of our estimation, we show in Table 9 the results obtained using non-overlapping time windows of, again, 200, 252 and 300 days.
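The difference between overlapping and non-overlapping windows can be illustrated with a short sketch. It is not the procedure used to produce Tables 8 and 9: here p and q are estimated in each window as the inverse of the mean uptrend and downtrend duration, trends are cut at the window edges, and the price series is a simulated random walk with an assumed daily volatility of 0.01. All function names are hypothetical.

```python
import math
import random
from itertools import groupby

def window_estimates(prices, window=252, shift=10):
    """Return a list of (p, q) estimates, one pair per time window."""
    steps = [b > a for a, b in zip(prices, prices[1:])]
    estimates = []
    for start in range(0, len(steps) - window + 1, shift):
        chunk = steps[start:start + window]
        runs = [(is_up, sum(1 for _ in g)) for is_up, g in groupby(chunk)]
        up = [k for is_up, k in runs if is_up]
        down = [k for is_up, k in runs if not is_up]
        if up and down:
            # Inverse mean duration; runs truncated by the window edges are kept.
            estimates.append((len(up) / sum(up), len(down) / sum(down)))
    return estimates

def mean_and_std(values):
    mean = sum(values) / len(values)
    return mean, math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Simulated geometric random walk (illustration only, not index data).
rng = random.Random(1)
prices = [100.0]
for _ in range(3000):
    prices.append(prices[-1] * math.exp(rng.gauss(0.0, 0.01)))

overlapping = window_estimates(prices, window=252, shift=10)
disjoint = window_estimates(prices, window=252, shift=252)  # non-overlapping
print(len(overlapping), mean_and_std([p for p, _ in overlapping]))
print(len(disjoint), mean_and_std([p for p, _ in disjoint]))
```

Setting the shift equal to the window size yields far fewer, but non-overlapping, estimates, which is why the uncertainties obtained from non-overlapping windows are larger than those obtained from rolling windows.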

Table 9. Mean values and standard deviation of p and q distributions, generated this time by using non-overlapping time windows of 200, 252 and 300 days.

Obtained values are consistent with those shown in previous Table 8.

Market	Time window (non-overlapping, days)	<p>	σ_p	<q>	σ_q
DJIA	200	0.4874±0.0082	0.0297±0.0061	0.5436±0.0082	0.0295±0.0060
	252	0.4906±0.0076	0.0240±0.0057	0.5438±0.0074	0.0234±0.0055
	300	0.4867±0.0067	0.0231±0.0058	0.5427±0.0091	0.0257±0.0069
Nasdaq	200	0.4161±0.0172	0.0596±0.0127	0.5158±0.0122	0.0404±0.0090
	252	0.4148±0.0205	0.0616±0.0154	0.5192±0.0152	0.0457±0.0114
	300	0.4155±0.0207	0.0584±0.0156	0.5101±0.0133	0.0353±0.0102
IPC	200	0.4124±0.0200	0.0663±0.0148	0.4661±0.0179	0.0594±0.0133
	252	0.4047±0.0228	0.0646±0.0173	0.4665±0.0204	0.0577±0.0154
	300	0.4080±0.0252	0.0667±0.0193	0.4691±0.0237	0.0628±0.0181
Nikkei	200	0.4875±0.0145	0.0502±0.0107	0.5240±0.0091	0.0315±0.0091
	252	0.4870±0.0160	0.0507±0.0120	0.5243±0.0101	0.0320±0.0075
	300	0.4874±0.0178	0.0504±0.0135	0.5234±0.0095	0.0269±0.0072


To end this subsection, we observe that the distance from the p = 0.5 model for the different studied markets, established at the end of section Data analysis by the geometric fitting procedure, is again confirmed by the values of p and q shown in Table 9.

Mean value and variance of p + q distribution

Even though we calculate p and q independently and use this notation in a nominal way, in this subsection, for completeness, we carefully study the probability conservation that a geometric stochastic process must satisfy, i.e. p + q = 1. To this end, we show in Fig 6(a) the behavior of p + q as a function of time, calculated as explained before by using a rolling time frame of 252 trading days shifted every 10 days. It can be seen that p + q for DJIA and Nikkei oscillates around 1, whereas for IPC and Nasdaq it approaches this value over time and, after year 2000, follows the same behavior as for DJIA and Nikkei. The upper right panel of the same figure shows the empirical distribution of p + q for all studied markets. Non-stationarity effects are observed in all of them to different degrees; however, the mean value of p + q is close to 1 in all those distributions. Cutting off all data previous to year 2000 and repeating this analysis, we observe that p + q indeed fluctuates closer to, and around, the value of 1 for all markets, and that even the runs of the IPC and Nasdaq markets become closer to geometric as time passes. Distributions of these fluctuations are plotted in Fig 6(d). The corresponding mean and standard deviation of the p + q distributions, for all analyzed data and for data restricted to after year 1999, can be seen in Table 10, where we have also calculated these values for rolling time windows of 200 and 300 trading days.
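The rolling-window calculation described above can be sketched as follows. This is a minimal illustration in Python, not the authors' Mathematica code. The function names (`run_lengths`, `geometric_mle`, `rolling_p_plus_q`) are hypothetical, and three choices are assumptions made only for illustration: days with an unchanged close are dropped, runs cut by a window edge are kept as they are, and p is attached to uptrends and q to downtrends. Each parameter is estimated as the reciprocal of the mean run duration in days, which is the maximum likelihood estimate for a geometric law on durations 1, 2, …

```python
import numpy as np


def run_lengths(signs):
    """Split a sequence of +1/-1 signs into up-run and down-run lengths."""
    signs = np.asarray(signs)
    if signs.size == 0:
        return np.array([], dtype=int), np.array([], dtype=int)
    # Indices where the sign changes mark the start of a new run.
    change = np.flatnonzero(np.diff(signs) != 0) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [signs.size]))
    lengths = ends - starts
    first_sign = signs[starts]
    return lengths[first_sign > 0], lengths[first_sign < 0]


def geometric_mle(durations):
    """Estimate the geometric parameter from run durations of 1, 2, ... days."""
    return 1.0 / float(np.mean(durations))


def rolling_p_plus_q(prices, window=252, step=10):
    """Return rows (window start, p_hat, q_hat, p_hat + q_hat)."""
    signs = np.sign(np.diff(np.asarray(prices, dtype=float)))
    rows = []
    for start in range(0, signs.size - window + 1, step):
        s = signs[start:start + window]
        s = s[s != 0]                     # assumption: ignore unchanged closes
        up, down = run_lengths(s)
        if up.size == 0 or down.size == 0:
            continue                      # cannot estimate both parameters
        p_hat, q_hat = geometric_mle(up), geometric_mle(down)
        rows.append((start, p_hat, q_hat, p_hat + q_hat))
    return np.array(rows)
```

Calling `rolling_p_plus_q(prices, window=252, step=252)` would give non-overlapping windows instead. For an ideal Bernoulli process with up-probability u, the two termination probabilities are 1 − u and u, so their sum is 1, which is the conservation condition examined in this subsection.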

Table 10. Mean and standard deviation of p + q distributions, where rolling, overlapping time frames of 200, 252 and 300 trading days were set up and shifted each 10 days.

Observed p + q mean values are very close to 1 for Nikkei and DJIA and, despite non-stationarity, are also close to 1 for the Nasdaq and IPC markets. Restricting our analysis to dates after year 1999, p + q values are clearly even closer to 1 for all markets.

Market	Time window (overlapping, days)	<p + q>	σ_{(p + q)}	<p + q> after 2000	σ_{(p + q)} after 2000
DJIA	200	1.0394±0.0020	0.0660±0.0014	1.0600±0.0038	0.0705±0.0022
	252	1.0321±0.0018	0.0435±0.0010	1.0513±0.0020	0.0451±0.0014
	300	1.0357±0.0017	0.0055±0.0013	1.0559±0.0030	0.0584±0.0018
Nasdaq	200	0.9425±0.0034	0.1080±0.0024	1.0267±0.0027	0.0612±0.0019
	252	0.9340±0.0030	0.1046±0.0023	1.0180±0.0019	0.0424±0.0013
	300	0.9378±0.0032	0.1023±0.0023	1.0223±0.0023	0.0521±0.0016
IPC	200	0.8830±0.0035	0.1133±0.0025	0.9621±0.0028	0.0633±0.0020
	252	0.8750±0.0033	0.1092±0.0024	0.9680±0.0019	0.0425±0.0013
	300	0.8792±0.0033	0.1065±0.0024	0.9592±0.0023	0.0532±0.0023
Nikkei	200	1.0199±0.0025	0.0790±0.0018	1.0390±0.0024	0.0556±0.0017
	252	1.0124±0.0020	0.0620±0.0014	1.0306±0.0016	0.0350±0.0011
	300	1.0163±0.0022	0.0711±0.0016	1.0359±0.0020	0.0456±0.0014


As in the previous subsection Time variation of p and q and other estimates of these parameters, we calculate the mean values and RMS of p + q for non-overlapping time frames of 200, 252 and 300 days. The values obtained are shown in Table 11 below, calculated both for the whole time period of the recorded data sample and for the span of time after year 2000.

Table 11. Again, mean and standard deviation of p + q distributions, this time calculated by using non-overlapping time frames of 200, 252 and 300 days.

Displayed values are consistent with those of Table 10.

Market	Time window (non-overlapping, days)	<p + q>	σ_{(p + q)}	<p + q> after 2000	σ_{(p + q)} after 2000
DJIA	200	1.0394±0.0094	0.0678±0.0067	1.0597±0.0147	0.0750±0.0106
	252	1.0355±0.0094	0.0603±0.0067	1.0574±0.0140	0.0451±0.0014
	300	1.0369±0.0097	0.0573±0.0069	1.0571±0.0140	0.0596±0.0102
Nasdaq	200	0.9386±0.0150	0.1081±0.0150	1.0249±0.0107	0.0546±0.0077
	252	0.9394±0.0170	0.1087±0.0122	1.0238±0.0139	0.0424±0.0013
	300	0.9366±0.0170	0.1004±0.0121	1.0216±0.0113	0.0479±0.0082
IPC	200	0.8806±0.0163	0.1175±0.0116	0.9599±0.0143	0.0727±0.0103
	252	0.8773±0.0184	0.1179±0.0131	0.9604±0.0129	0.0580±0.0129
	300	0.8743±0.0188	0.1095±0.0134	0.9573±0.0120	0.0496±0.0088
Nikkei	200	1.0164±0.0108	0.0771±0.0077	1.0352±0.0120	0.0601±0.0087
	252	1.0182±0.0121	0.0769±0.0087	1.0404±0.0108	0.0483±0.0078
	300	1.0149±0.0129	0.0751±0.0092	1.0351±0.0129	0.0534±0.0094


For the full period of all analyzed data and from the measurements shown in Tables 10 and 11, we can rank the studied markets in the following order of closeness of p + q to 1, and hence to the geometric distribution: 1) Nikkei, 2) DJIA, 3) Nasdaq and lastly 4) IPC.

Estimation of <p> and σ_p

We have estimated p by applying the usual one-parameter fitting procedure illustrated in section Data analysis. We have seen that the estimated values of p for the different data samples are compatible with the corresponding values of <p> obtained by averaging over the overlapping and non-overlapping moving time windows with the sizes given in Tables 8 and 9. The above and the following are, of course, also valid for the case of q.

The process of finding the maximum likelihood estimate of the parameter p in a geometric distribution as given by Eq (14):

\begin{matrix} P (x) = {(1 - p)}^{x} p, x = 0, 1, \dots \end{matrix}

(16)

is a well-known methodology, which consists of finding the value $\hat{p}$ of p that maximizes the likelihood function. For the case of the geometric distribution, given a random sample x₁, …, x_n, we obtain:

\begin{matrix} \hat{p} = \frac{1}{1 + \bar{x}} \end{matrix}

(17)

where $\bar{x}$ denotes the sample mean.
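For readers who want the intermediate step, the estimate in Eq (17) follows from the standard log-likelihood argument, sketched here for an independent sample x₁, …, x_n drawn from the distribution of Eq (16). The symbols $L$ and $\ell$ are introduced only for this sketch.

```latex
\begin{aligned}
L(p) &= \prod_{i=1}^{n} (1 - p)^{x_i}\, p \;=\; p^{\,n} (1 - p)^{n \bar{x}}, \\
\ell(p) &= \log L(p) \;=\; n \log p + n \bar{x} \log (1 - p), \\
\frac{d\ell}{dp} &= \frac{n}{p} - \frac{n \bar{x}}{1 - p} = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{1}{1 + \bar{x}} .
\end{aligned}
```

Since the second derivative of $\ell$ is negative for every p in (0, 1), this stationary point is a maximum.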

From the asymptotic properties of maximum likelihood estimators, see [29, 30], $\hat{p}$ has approximately a normal distribution with mean <p> and variance n⁻¹ p²(1 − p). Formulas for calculating the error of the mean and variance are well known, see [31], although their estimation is usually made automatically in the background by the scientific software used to perform the data analysis, in our case Mathematica.
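As an illustration of Eq (17) and of the asymptotic variance quoted above, the following Python sketch computes the estimate and a plug-in standard error. It is not the authors' Mathematica code: the function name `geometric_mle_with_error` is hypothetical, and replacing p by its estimate inside the variance formula is an assumption of this sketch.

```python
import math


def geometric_mle_with_error(sample):
    """Return (p_hat, standard error) for P(x) = (1 - p)^x p, x = 0, 1, ...

    The sample holds the values x (for run durations in days, x would be
    duration - 1 under the convention of Eq (16)).
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    mean = sum(sample) / n
    p_hat = 1.0 / (1.0 + mean)                      # Eq (17)
    # Asymptotic variance p^2 (1 - p) / n, evaluated at p_hat (plug-in).
    std_err = math.sqrt(p_hat ** 2 * (1.0 - p_hat) / n)
    return p_hat, std_err
```

For example, the sample `[0, 1, 0, 2, 1, 0, 3, 1]` has mean 1, which gives an estimate of 0.5 with a standard error of 0.125.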

To conclude this subsection, we must point out that the measurements showing that p + q is slightly greater than 1 for DJIA and Nikkei are not in contradiction with empirical experience, since, by studying runs at different time scales, a slight excess of uptrends over downtrends has been observed in financial data at least since the 1930s [24–26]. Recall also that our two measurements of p and q were performed independently and that the early financial literature also shows that, at least for some time spans, the evolution of runs is not well represented by a random walk with equal probabilities of going up and down. We believe these empirical facts are well known in financial econometrics, but may not be well known to physicists.

In this paper, we not only confirm these experimental facts and show the time evolution of p, q and p + q, but, in the next section, we also estimate the fraction of time during which market runs follow a geometric behavior with $p = \frac{1}{2}$ and with any p.

Anderson-Darling test in the case p = 0.5

To study dynamically how the theoretical statistical model differs from the empirical data, we calculate the Anderson-Darling statistic between the observed empirical distribution of trend durations and the theoretical geometric distribution with parameter p = 0.5. Fig 7 displays the obtained p-values of the Anderson-Darling statistic $A_{n}^{2}$ for different time periods (not to be confused with the p parameter of the geometric distribution). Recall that, in the case we are interested in, a p-value is the probability of obtaining a value of $A_{n}^{2}$ at least as large as the one actually observed, given that the probability distribution is actually geometric.
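The definition of the p-value given above can be illustrated with a small Monte Carlo sketch in Python. It is only a sketch under stated assumptions and not the procedure of S1 Appendix, which may differ. The statistic is a grouped-data form of $A_{n}^{2}$ along the lines of the discrete statistics discussed in [23]. The tail of the geometric distribution is pooled into one last cell, the number of cells `k` is arbitrary, and the p-value is approximated by simulation. All function names are hypothetical.

```python
import numpy as np


def discrete_ad_statistic(counts, probs):
    """A^2 = (1/n) * sum_j Z_j^2 p_j / (H_j (1 - H_j)) over grouped cells."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n = counts.sum()
    h = np.cumsum(probs)[:-1]              # H_j; the last cell (H = 1) is dropped
    z = np.cumsum(counts)[:-1] - n * h     # Z_j: observed minus expected cumulative counts
    weight = h * (1.0 - h)
    terms = np.divide(z ** 2 * probs[:-1], weight,
                      out=np.zeros_like(weight), where=weight > 0)
    return float(terms.sum() / n)


def geometric_cells(p, k):
    """Cell probabilities for x = 0, ..., k-2 and a pooled tail x >= k-1."""
    head = p * (1.0 - p) ** np.arange(k - 1)
    return np.append(head, 1.0 - head.sum())


def ad_pvalue(sample, p=0.5, k=12, n_sim=2000, seed=0):
    """Monte Carlo p-value of A^2 under the simple null hypothesis G(p)."""
    rng = np.random.default_rng(seed)
    probs = geometric_cells(p, k)

    def statistic(x):
        counts = np.bincount(np.minimum(x, k - 1), minlength=k)
        return discrete_ad_statistic(counts, probs)

    sample = np.asarray(sample, dtype=int)  # values x = duration - 1 (assumed convention)
    observed = statistic(sample)
    # numpy's geometric sampler starts at 1, so subtract 1 to match x = 0, 1, ...
    simulated = [statistic(rng.geometric(p, size=sample.size) - 1)
                 for _ in range(n_sim)]
    return float(np.mean(np.asarray(simulated) >= observed))
```

Applying `ad_pvalue` to the trend durations inside each rolling time window would give a time series of p-values of the kind shown in Fig 7.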

The analysis presented in Fig 7(a) shows that, for the DJIA, the greatest deviations from the geometric distribution with parameter $p = \frac{1}{2}$ occurred between the years 2002 and 2011. In Fig 7(b), for Nasdaq, it is observed that, as time goes by, the p-values of the Anderson-Darling test show that the empirical data tend to agree better with the geometric distribution $G (0.5)$, especially after year 2000. Fig 7(c) shows that, as for NASDAQ, the agreement between the IPC index data and the geometric distribution increases with time. Finally, Fig 7(d) shows that, for the Nikkei, the p-values of the Anderson-Darling test indicate a good agreement between the geometric model with $p = \frac{1}{2}$ and the observed trend duration distribution.

As an auxiliary analysis, Fig 8 shows the dates when the events from Fig 7 have a p-value below the α = 0.05 significance level, in other words, the dates for which the null hypothesis can be rejected at the significance level α = 0.05, and the complementary dates for which the geometric hypothesis with $p = q = \frac{1}{2}$ cannot be rejected.

The above observations are compatible with the plot shown in Fig 5, presented in subsection Time variation of p and q and other estimates of these parameters, where it is shown that the greatest, although diminishing, deviations from the geometric distribution with p = 0.5 occurred between the years 1980 and 2000, especially for Nasdaq and IPC and to a lesser extent for DJIA and Nikkei. In the next subsection, Anderson-Darling parametric test for the geometric distribution, we shall show that, in all cases, the geometric model still holds when the parameter p is allowed to vary freely.

Anderson-Darling parametric test for the geometric distribution

Let us explore the possibility that trend durations follow a geometric distribution with any parameter p ∈ (0, 1). The results of this parametric test are displayed in Fig 9. It can be observed that, for all studied markets, the assumption of a Bernoulli process for price directions holds reasonably well most of the time, except for sporadic deviations that are usually related to extreme market movements, such as in the case of a financial crisis. Again, Fig 10 is an auxiliary figure that shows the dates when events from Fig 9 have a p-value below the α = 0.05 significance level, i.e. the dates when the studied markets do not follow the geometric model at the mentioned significance level. As may be seen, for an important fraction of time, markets do seem to follow the geometric model with some parameter p.
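When p is left free, the null hypothesis is composite, and one common way to obtain a p-value is a parametric bootstrap in which p is re-estimated on every simulated sample. The Python sketch below illustrates that idea only. It is an assumption of this sketch rather than the authors' procedure (see S1 Appendix for theirs). It reuses a grouped form of $A_{n}^{2}$ with a pooled tail cell, and the names `ad_statistic` and `ad_pvalue_free_p` are hypothetical.

```python
import numpy as np


def ad_statistic(sample, p, k=12):
    """Grouped A^2 of `sample` (values 0, 1, ...) against G(p), tail pooled at k-1."""
    counts = np.bincount(np.minimum(sample, k - 1), minlength=k).astype(float)
    n = counts.sum()
    probs = p * (1.0 - p) ** np.arange(k - 1)   # cells 0 .. k-2
    h = np.cumsum(probs)                        # H_j for those cells
    z = np.cumsum(counts)[:-1] - n * h          # Z_j
    weight = h * (1.0 - h)
    terms = np.divide(z ** 2 * probs, weight,
                      out=np.zeros_like(weight), where=weight > 0)
    return float(terms.sum() / n)


def ad_pvalue_free_p(sample, k=12, n_sim=1000, seed=0):
    """Return (p_hat, bootstrap p-value) for the null 'geometric with some p'."""
    sample = np.asarray(sample, dtype=int)
    rng = np.random.default_rng(seed)
    p_hat = 1.0 / (1.0 + sample.mean())         # maximum likelihood estimate, Eq (17)
    observed = ad_statistic(sample, p_hat, k)
    exceed = 0
    for _ in range(n_sim):
        sim = rng.geometric(p_hat, size=sample.size) - 1
        p_sim = 1.0 / (1.0 + sim.mean())        # re-estimate p on each simulated sample
        exceed += ad_statistic(sim, p_sim, k) >= observed
    return float(p_hat), exceed / n_sim
```

Re-estimating p inside the loop matters: comparing against simulations that use the fitted value as if it were known in advance would tend to make the test too lenient.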

Fig 9. Panel (a) is DJIA, (b) is Nasdaq, (c) is IPC and (d) is Nikkei.

Fig 10. Again, for easy reading, colored points are enlarged.

An application of the previous results is discussed in the next section, A simple application: Assessing the fraction of time markets runs follow a geometric distribution.

A simple application: Assessing the fraction of time markets runs follow a geometric distribution

Continuing the discussion at the end of section Data analysis, we can also use the results presented there to assess the percentage of time that the market follows the geometric model with $p = \frac{1}{2}$ and with a free parameter p. To do this, we propose the following methodology:

Calculate a time series of p-values from the sample of trends durations using the geometric process with p = 0.5 as the null hypothesis.
Count the number of points above the significance level value α = 0.05.
Divide it by the length of the time series to obtain the percentage of time the market has behaved as a market following the geometric process.

Repeat it for the non-parametric case, i.e. the same null hypothesis but now with any p.
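The counting steps above amount to a one-line computation once the time series of p-values is available. A minimal Python sketch follows. The function name `fraction_not_rejected` is hypothetical, and the p-values are assumed to have been computed beforehand, one per time window.

```python
def fraction_not_rejected(p_values, alpha=0.05):
    """Fraction of time windows whose p-value lies above the level alpha."""
    p_values = list(p_values)
    if not p_values:
        raise ValueError("p_values must not be empty")
    above = sum(1 for value in p_values if value > alpha)
    return above / len(p_values)
```

For instance, the four p-values `[0.5, 0.01, 0.2, 0.04]` would give a fraction of 0.5, since two of them lie above 0.05. The same function applies unchanged to the p-values of the free-parameter test.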

Following the above criterion, we rank the studied markets as follows: the closest to the geometric model with $p = \frac{1}{2}$ for the largest fraction of time was the DJIA, followed by the Nikkei 225, then the NASDAQ Composite and, at the end, the IPC. Under this criterion, the runs of more mature markets again follow the geometric model with $p = \frac{1}{2}$ more closely and for a longer time, but this time the Nikkei 225 and the DJIA exchange rank positions. The results obtained by means of this methodology, for the geometric case with $p = \frac{1}{2}$ as well as for the free-parameter case, may be consulted in Table 12.

Table 12. Fraction of time during which the trend durations of the studied data follow a geometric distribution with parameter p = 0.5 and with any p, in both cases at a significance level of 5%.

Market/Case	Time fraction p = 0.5	Time fraction Non Parametric
DJIA	0.84	0.9673
Nasdaq	0.47	0.9626
IPC	0.37	0.9683
Nikkei	0.81	0.9419


Conclusions

The study of runs used to be an important research area in financial econometrics [24–27]. Some interesting, more modern empirical studies have been performed on daily and high-frequency data [28, 32, 33]; runs have been applied, for example, to assess market randomness [34] and even flash crashes in high-frequency data [32]. In this paper, the probability distributions of the duration of elementary trends or price runs were studied for the market indices Dow Jones Industrial Average (DJIA), NASDAQ Composite, the Mexican Índice de Precios y Cotizaciones (IPC) and the Japanese Nikkei 225. According to the discussion of section An ‘Efficient Market’ toy model for the distribution of run durations, these distributions are expected to be geometric, with parameter p = 0.5, and memoryless. Indeed, the geometric distribution with $p = \frac{1}{2}$ provides a good model for trends of small and medium size, let us say up to 10 or 12 days, for DJIA and Nikkei, and with p not necessarily $\frac{1}{2}$ for less mature markets. On the other hand, we show that trend duration distributions in all markets display outliers; however, more statistics are needed to study these extreme events, which do not seem to follow the geometric model.

Additionally, by selecting overlapping and non-overlapping time frames of 200, 252 and 300 trading days, we display the behavior of p and q over time and the distribution of these parameters, allowing us to estimate their mean and RMS values and to compare the former with the corresponding values obtained by a fitting procedure. The agreement obtained is remarkably good. We have shown that, for all markets, p and q values are evolving towards the value of $\frac{1}{2}$. The evolution of p + q over time is also displayed and, by using the same methodology, we observe that <p + q> approaches the value of one over time for all markets. Finally, markets with uninterrupted trend durations closer to a geometric behavior with $p = \frac{1}{2}$ may be ranked in the following order: Nikkei, DJIA, Nasdaq and IPC, meaning that more mature markets are closer to the geometric behavior with $p = \frac{1}{2}$.

The Anderson-Darling test has been used to quantify the likelihood that a series of trend durations was generated by a process compatible with the geometric model with $p = \frac{1}{2}$; we also employed it to assess for how long trend durations follow the geometric distribution with $p = \frac{1}{2}$, as well as with any other value of the parameter p. The corresponding dates during which market runs do not follow the geometric behavior for $p = \frac{1}{2}$, and with a free parameter p, are displayed in auxiliary Figs 8 and 10, respectively. The numerical time fractions displayed in Table 12 correspond to the fraction of the time during which markets follow the geometric distribution with $p = \frac{1}{2}$ and with a free parameter p. The first column of this table shows that, at the significance level of 5%, the price runs distribution of the DJIA follows a geometric distribution with $p = \frac{1}{2}$ 84% of the time, Nikkei 81%, Nasdaq 47% and IPC 37% of the time. The ranking obtained by this criterion, although it exchanges the positions of DJIA and Nikkei 225, once again places more mature markets at the top of the list. The second column of Table 12 shows that the trend duration distributions of all studied markets are close to a geometric distribution with a parameter p not necessarily equal to $\frac{1}{2}$ for a high fraction of the time. This fact is supported by the quality of the fits and the respective fit parameters different from $\frac{1}{2}$ displayed in Fig 4 and Table 7, which show that, with the exception of a few extreme values, the geometric model fits Nasdaq and IPC run durations well, with parameter p (and q) not necessarily close to $\frac{1}{2}$.

The results obtained also show that the run distributions of more mature markets are closer, for longer time spans, to the geometric distribution with $p = \frac{1}{2}$, and that the runs of less mature markets seem to evolve over time towards this same distribution. This empirical result recalls the fact that worldwide markets increase their efficiency with time [35–38].

In section An ‘Efficient Market’ toy model for the distribution of run durations, we state that the geometric model with $p = \frac{1}{2}$ applied to price runs may be consistent with the EMH. However, if the empirical analysis falsifies the process, this does not mean that the EMH is falsified. In our opinion, further and deeper study is necessary to clarify these facts, given that market efficiency refers to returns and not to price runs.

Finally, besides the above-mentioned problem of making explicit the relation between market efficiency and the geometric behaviour of price runs, we have some additional remarks possibly leading to future work. First, in this paper we analyzed regularly sampled data, i.e. daily close price data, and although the geometric model seems well suited to model short and medium price trend durations, this observable is really a continuous random variable and, conceivably, the geometric model might not be suitable to describe non-regularly sampled data [33], for example tick-by-tick data. The second remark has to do with the extreme values observed in the different trend duration distributions, which occur with a higher probability than expected from the geometric model, as can be observed for values of trend durations above the cut-off values marked in the different panels of Fig 4 and recorded in the second and fifth columns of Table 7. In the present analysis, we observe at most two of these extreme events in the different panels of Fig 4, which is insufficient to say anything meaningful about the distribution of these outliers. It will also be interesting to study the relation of these extreme run events with extreme return events, particularly financial crashes, for example by using smaller time windows in our analyses. Third, although by construction the numbers of downtrends and uptrends in any data sample must be the same or differ by at most one, from the composition of trends shown in Tables 2–5 it can be seen that, for very short duration trends, downtrends predominate, whereas for medium and long duration trends there are more uptrends than downtrends; this asymmetry and its relation with the corresponding returns deserve a more detailed study.

In our opinion, even if data analyses such as the one presented here have a long history, we have managed to find new results of possible interest to the econophysics and financial communities. Specifically, apart from independently and consistently estimating the parameters p and q by two different methods, we show their time evolution as well as the time evolution of their sum p + q. Moreover, we not only show that the runs distribution of the markets studied is compatible with the geometric distribution with the estimated parameters, but we also estimate when, and for what fraction of the time, the markets follow this behavior, parametrically for p = q = 0.5 and non-parametrically. The detection of when and for how long the durations of market price runs have a geometric distribution is, in our opinion, the most important result of those presented here, from both the academic and practical points of view; it is really not obvious that the durations of ascending and descending runs independently follow geometric distributions with, for example, different parameter values.

From an academic point of view, it might be interesting to see what happens to the efficiency of markets when p and q differ significantly: would they adapt their price variations to compensate for the difference in the probabilities of seeing upward or downward uninterrupted trends? Answering this question would be material for another article. On the other hand, these results could be incorporated into various trading systems. In addition, another simple application of our methodology was ranking the different analyzed markets according to the fraction of time during which they follow the geometric behavior in the parametric case. This ranking, in turn, seems to coincide with the level of efficiency of the markets studied, an issue that, since the EMH is stated in terms of market price variations and not in terms of runs, also deserves further study.

Although this was discussed in the last paragraph of section Introduction, we conclude this paper by remarking that empirical results such as those reported herein are also important and of interest because any adequate market model, agent-based or of any other kind, must reproduce them; see references [4, 9].

Supporting information

S1 Dataset. File S1_DataSet.zip contains the full analyzed data set.

(ZIP)


S1 Appendix. The discrete version of the Anderson-Darling goodness-of-fit test.

(PDF)


Acknowledgments

We thank Ms. S. Jiménez for her LaTeX writing and correcting.

Data Availability

Data were available from Yahoo at https://es-us.finanzas.yahoo.com. The Mathematica software and all data are also available at https://github.com/CarlosManuelRodr/TrendDurationAnalysis and https://github.com/CarlosManuelRodr/TrendDurationAnalysis/tree/main/Research/OriginalDataset, respectively.

Funding Statement

ARHM and CMRM received support from grants 425854 and 5150 from the Consejo Nacional de Ciencia y Tecnología (Conacyt), https://conacyt.mx/. THS received support from grant number 425854 from the Consejo Nacional de Ciencia y Tecnología (Conacyt), https://conacyt.mx/. ES is partially supported by the Dr Perry James (Jim) Browne Research Centre at the Department of Mathematics, University of Sussex, http://www.sussex.ac.uk/broadcast/read/55282. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Lo AW, MacKinlay AC. A Non-random Walk Down Wall Street. Princeton paperbacks. Princeton University Press; 1999. Available from: https://books.google.com.mx/books?id=fGmfQgAACAAJ.
2. Cont R. Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance. 2001;1:223–236. doi: 10.1080/713665670
3. Pagan A. The econometrics of financial markets. Journal of Empirical Finance. 1996;3(1):15–102. doi: 10.1016/0927-5398(95)00020-8
4. Ponta L, Trinh M, Raberto M, Scalas E, Cincotti S. Modeling non-stationarities in high-frequency financial time series. Physica A. 2019;521(1):173–196. doi: 10.1016/j.physa.2019.01.069
5. Maldarella D, Pareschi L. Kinetic models for socio-economic dynamics of speculative markets. Physica A. 2012;391(1):715–730. doi: 10.1016/j.physa.2011.08.013
6. Ehrentreich N. Agent-based modeling: The Santa Fe Institute artificial stock market model revisited. Volume 602. ISBN: 978-3-540-73879-4, Springer Science & Business Media (2007).
7. Lux T. Stochastic behavioral asset pricing models and the stylized facts. Technical report. 2008, Economics working paper, Department of Economics. Christian-Albrechts-Universitat Kiel.
8. Farmer JD. The economy needs agent-based modelling. Nature. 2009;460(7256):685–686. doi: 10.1038/460685a
9. Meyer M. How to Use and Derive Stylized Facts for Validating Simulation Models. In Claus Beisbart and Nicole J. Saam (eds.), Computer Simulation Validation: Fundamental Concepts, Methodological Frameworks, and Philosophical Perspectives. Series: Simulation Foundations, Methods and Applications. ISBN: 978-3319707655, Springer. pp. 383-403 (2019).
10. Takayasu H. (Ed), Empirical Science of Financial Fluctuations: The advent of econophysics, ISBN: 978-4431703167, Springer (2000).
11. Fama EF. Efficient Capital Markets: A Review of Theory and Empirical Work. The Journal of Finance. 1970;25(2):383–417. doi: 10.1111/j.1540-6261.1970.tb00518.x
12. Mantegna RN, Stanley HE. An Introduction to Econophysics: Correlations and Complexity in Finance. USA: Cambridge University Press; 1999.
13. Samuelson PA. Proof that Properly Anticipated Prices Fluctuate Randomly. Industrial Management Rev. 1965;6(2):41–45.
14. Engle RF. Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of U.K. Inflation. Econometrica. 1982;50(4):987–1002. doi: 10.2307/1912773
15. Bollerslev T. Generalized Autoregressive Conditional Heteroskedasticity. J Econometrics. 1986;31(3):307–327. doi: 10.1016/0304-4076(86)90063-1
16. LeRoy S. Risk Aversion and the Martingale Property of Stock Prices. International Economic Review. 1973;14(2):436–46. doi: 10.2307/2525932
17. Lucas RE. Asset Prices in an Exchange Economy. Econometrica. 1978;46(6):1429–1445. doi: 10.2307/1913837
18. Scalas E. Scaling in the market of futures. Physica A. 1998;253(1):394–402. doi: 10.1016/S0378-4371(97)00652-3
19. Wilks SS. Mathematical Statistics, ISBN: 978-4431703167, Princeton University Press (1943). Available from: https://books.google.com.mx/books?id=k38pAQAAMAAJ.
20. Wald A, Wolfowitz J. On a test whether two samples are from the same population. Ann. Math. Stat. 1940;11:147–162. Available from: http://dml.mathdoc.fr/item/1177731909/
21. Mood AM. The Distribution Theory of Runs. Ann. Math. Stat. 1940;11:367–392. Available from: 10.1214/aoms/1177731825.full
22. Bracquemond C, Crétois E, Gaudoin O. A comparative study of goodness-of-fit tests for the geometric distribution and application to discrete time reliability. Laboratoire Jean Kuntzmann; 2002.
23. Choulakian V, Lockhart RA, Stephens MA. Cramér-von Mises Statistics for Discrete Distributions. The Canadian Journal of Statistics / La Revue Canadienne de Statistique. 1994;22(1):125–137. doi: 10.2307/3315828
24. Alexander SS. Price Movements in Speculative Markets: Trends or Random Walks. Industrial Management Review. 1964;(2):7–26.
25. Cowles A, Jones HE. Some a Posteriori Probabilities in stock Market Action. Econometrica. 1937;5(280):280–294. doi: 10.2307/1905515
26. Cowles A. A Revision of a Previous Conclusions Regarding Stock Price Behavior. Econometrica. 1960;28(4):909–915. doi: 10.2307/1907573
27. Fama EF. The Behavior of Stock-Market Prices. The Journal of Business. 1965;38(1):34–105. doi: 10.1086/294743
28. Sieczka P, Hołyst JA. Statistical properties of short term price trends in high frequency stock market data. Physica A: Statistical Mechanics and its Applications. 2008;387(5):1218–1224. doi: 10.1016/j.physa.2007.10.048
29. Silvey SD. Statistical Inference. Chapman & Hall. London. Chapman & Hall Monographs on Statistics and Applied Probability; 1975. ISBN: 978-0412138201 London.
30. Bickel PJ, Doksum KA. Mathematical Statistics. ISBN: 978-0816207848. Prentice Hall, Englewood Cliffs, New Jersey; 1977.
31. Harding B, Tremblay C, Cousineau D. Standard errors: A review and evaluation of standard error estimators using Monte Carlo simulations TQMP. 2007; 10(2):107–123. doi: 10.20982/tqmp.10.2.p107
32. Aldridge I. High-Frequency Runs and Flash-Crash Predictability. Journal of Portfolio Management. 2014;40(3):113–123. doi: 10.3905/jpm.2014.40.3.113
33. Li H, Gao Y. Statistical distribution and time correlation of stock returns runs. Physica A. 2013;377(1):193–198. doi: 10.1016/j.physa.2006.11.016
34. Bradley JV. Distribution-Free-Statistical Tests. USA: Prentice-Hall; 1968.
35. Toth B, Kertész J. Increasing market efficiency: Evolution of cross-correlations of stock returns. Physica A. 2006;360(2):505–515. doi: 10.1016/j.physa.2005.06.058
36. Yaoqi G et al. China’s copper futures market efficiency analysis: Based on nonlinear Granger causality and multifractal methods. Resources policy. 2020;68:101716. doi: 10.1016/j.resourpol.2020.101716
37. Adam Z et al. Where have the profits gone? Market efficiency and the disappearing equity anomalies in country and industry returns. Journal of Banking & Finance. 2020;121:105966.
38. Coronel-Brizio HF, Hernández-Montoya AR, Huerta-Quintanilla R, Rodríguez-Achach ME. Evidence of increment of efficiency of the Mexican Stock Market through the analysis of its variations. Physica A. 2007;380(2):391–398. doi: 10.1016/j.physa.2007.02.109

PLoS One. doi: 10.1371/journal.pone.0270492.r001

Decision Letter 0

Aurelio F Bariviera

2 Dec 2021

PONE-D-21-27632

An empirical data analysis of “price runs” in daily financial indices: dynamically assessing market geometric behavior

PLOS ONE

Dear Dr. Hernandez Montoya,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jan 16 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Aurelio F. Bariviera, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following in the Acknowledgments Section of your manuscript:

"We thank Ms. Selene Jiménez for her LaTeX writing and correcting. This work has been endorsed by Conacyt-Mexico under project grant numbers 425854 and 5150 supported by FOINS. ES is partially supported by the Dr Perry James (Jim) Browne Research Centre at the Department of Mathematics, University of Sussex."

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows:

"ARHM and CMRM received support from grants 425854 and 5150 from the Consejo Nacional de Ciencia y Tecnología. Conacyt. https://conacyt.mx/

THS received support from grant number 425854 from the Consejo Nacional de Ciencia y Tecnología. Conacyt. https://conacyt.mx/

ES is partially supported by the Dr Perry James (Jim) Browne Research Centre at the Department of Mathematics, University of Sussex. http://www.sussex.ac.uk/broadcast/read/55282

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

Please include your amended statements within your cover letter; we will change the online submission form on your behalf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper investigates the distribution of the duration of uninterrupted trends for the daily indices DJIA, NASDAQ, IPC, and Nikkei 225 from 10/30/1978 to 08/07/2020 and compares the simple geometric statistical model with p=1/2 consistent with the EMH to the empirical data. Results show that the geometric distribution with parameter p = 1/2 provides a good model for uninterrupted trends of short and medium duration for the more mature markets, however, the longest duration events still need to be statistically characterized.

As a general comment, I think the paper makes an interesting contribution to the literature by analyzing a new data analysis. At the same time, I think the paper needs some improvements before being published in this journal.

Specific comments

1) I suggest the authors number the sections

2) In section “Data sample and methodology” I would suggest the authors enrich the description by adding a Table reporting the main statistical properties of the data.

3) In the section conclusion, the authors mention that the empirical results found are also important for agent-based models to validate the simulations. I would suggest the authors add in the introduction a paragraph where the importance of stylized facts for markets simulators is described. I suggest also add the following citations:

a. Ponta, L., Trinh, M., Raberto, M., Scalas, E., & Cincotti, S. (2019). Modeling non-stationarities in high-frequency financial time series. Physica A: statistical mechanics and its applications, 521, 173-196.

b. Meyer, M. (2019). How to use and derive stylized facts for validating simulation models. In Computer simulation validation (pp. 383-403). Springer, Cham.

4) In the section conclusion, the authors should add the practical implication of the study.

5) Please check the quality of the figures. Some are very large others very small.

Reviewer #2: The authors do not show how they compute standard deviation of p and q, technically. The reviewer thinks that how to compute standard deviation of p and q is not obvious.

The title of this paper is not suitable to the current content proven in this manuscript. Specifically, the authors analyze the ratio of the direction of price changes by using p and q. Statistics of p and q are price direction for the short term (one business day), namely, Pr(x) with x = 1 or -1. The authors assume that the price direction should be independent. However, the price run should be measured by correlation among the directions of price change for some time horizon. Pr(x1, x2), Pr(x1, x2, x3), and more.

Such correlations are captured by test for independence by the chi-squared test for high dimensional contingency table.

Moreover, some researchers have been conducted this type of analysis in the literature of econophysics about fifteen years before. Please read the following book: Hideki Takayasu (ed.), Empirical science of financial fluctuations: the advent of econophysics, Springer (2002)

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Jul 7;17(7):e0270492. doi: 10.1371/journal.pone.0270492.r002

Author response to Decision Letter 0

11 May 2022

>Reviewer #1: The paper investigates the distribution of the duration of uninterrupted trends for the daily indices DJIA, NASDAQ, IPC, and Nikkei 225 from 10/30/1978 to 08/07/2020 and compares the simple geometric statistical model with p=1/2 consistent with the EMH to the empirical data. Results show that the geometric distribution with parameter p = 1/2 provides a good model for uninterrupted trends of short and medium duration for the more mature markets, however, the longest duration events still need to be statistically characterized.

>As a general comment, I think the paper makes an interesting contribution to the literature by analyzing a new data analysis. At the same time, I think the paper needs some improvements before being published in this journal.

Thank you very much for your opinion on our work.

>Specific comments

>1) I suggest the authors number the sections

To the best of our knowledge, PLOS ONE does not number sections, subsections, etc. in its published articles. This is easy to check by downloading any article from the PLOS ONE website. We are simply following the PLOS ONE LaTeX template to write and submit our paper.

>2) In section “Data sample and methodology” I would suggest the authors enrich the description by adding a Table reporting the main statistical properties of the data.

We agree with this suggestion. We have added a new table (Table 6) displaying the descriptive statistics of the analyzed data, and we have edited the text accordingly.

>3) In the section conclusion, the authors mention that the empirical results found are also important for agent-based models to validate the simulations. I would suggest the authors add in the introduction a paragraph where the importance of stylized facts for markets simulators is described. I suggest also add the following citations:

We agree with this suggestion. In the “Introduction” section, we have added the following text (lines 27 to 37):

“Empirical studies of financial and economic data are becoming increasingly relevant for the following reasons:

1) Currently dozens of stylized facts have been observed and more are still being discovered.

2) The study and prediction of stylized facts by means of methodologies of multi-agents market models is an important area of research in Finance and Econophysics.

3) Stylized facts are an important tool to validate proposed numerical and multi-agent market models; and

4) At present, we still lack a general, microscopic theory or model to explain the origin of stylized facts, we think simulation methodologies using agents could be useful in the construction of such general theory. Some interesting references on these issues are the following: [3–8].”

Where the corresponding cited bibliography is:

3. Pagan A. The econometrics of financial markets. Journal of Empirical Finance. 1996;3(1):15–102. doi: 10.1016/0927-5398(95)00020-8.

4. Ponta L, Trinh M, Raberto M, Scalas E, Cincotti S. Modeling non-stationarities in high-frequency financial time series. Physica A, 2019;521(1):173–196. doi: 10.1016/j.physa.2019.01.069.

5. Maldarella D, Pareschi L. Kinetic models for socio-economic dynamics of speculative markets. Physica A, 2012;391(1):715–730. doi: 10.1016/j.physa.2011.08.013.

6. Ehrentreich N. Agent-based modeling: The Santa Fe Institute artificial stock market model revisited. Volume 602. ISBN: 978-3-540-73879-4, Springer Science & Business Media (2007).

7. Lux T. Stochastic behavioral asset pricing models and the stylized facts. Technical report. 2008, Economics working paper, Department of Economics. Christian Albrechts-Universitat Kiel.

8. Farmer JD. The economy needs agent-based modeling. Nature. 2009;460(7256):685–686. doi: 10.1038/460685a.

9. Meyer M. How to Use and Derive Stylized Facts for Validating Simulation Models. In Claus Beisbart and Nicole J. Saam (eds.), Computer Simulation Validation: Fundamental Concepts, Methodological Frameworks, and Philosophical Perspectives. Series: Simulation Foundations, Methods and Applications. ISBN: 978-3319707655, Springer. pp. 383-403 (2019).

10. Takayasu H. (Ed), Empirical Science of Financial Fluctuations: The advent of econophysics, ISBN: 978-4431703167, Springer (2000).

>4) In the section conclusion, the authors should add the practical implication of the study.

We thought we had explained the main applications of our results in the "Results" section; however, this question gives us the opportunity to think about this issue and make explicit what we consider to be the most important implication of our results. So, as requested, we have added the following text in the "Conclusions" section, lines 497 to 518:

“we show their time evolution as well as the time evolution of their addition $p+q$. Moreover, we not only show that the runs distribution of the markets studied is compatible with the geometric distribution with the estimated parameters, but we also estimate when and the fraction of the time during which the markets follow this behavior, parametrically for $p=q=0.5$ and non-parametrically. The detection of when and for how long the distribution of the durations of market price runs have a geometric distribution is in our opinion our most important result and achievement of those presented here, from both, the academic and practical points of view; it is really not obvious that the duration of ascending and descending runs independently follow geometric distributions with for example different parameter values respectively.

From an academic point of view, it might be interesting to see what happens to the efficiency of markets when $p$ and $q$ differ significantly: would they adapt their price variations to compensate for the difference in probabilities of seeing upward or downward uninterrupted trends? Answering this question would be material for another article. On the other hand, obviously these results could easily be incorporated into various trading systems. In addition, another simple application of our methodology, was to classify the different analyzed markets according to the largest fraction of time they follow the geometric behavior for the parametric case. Rank that, on the other hand, seems to coincide with the level of efficiency of the markets studied, issue that, since EMH is given in terms of market prices variations and not in terms of runs, also deserves further study.”

>5) Please check the quality of the figures. Some are very large others very small.

As with point 1), we are afraid we have no control over this issue: the PLOS ONE LaTeX template for papers under review produces this output. With the final-version PLOS ONE LaTeX template, all figures look good.

>Reviewer #2: The authors do not show how they compute standard deviation of p and q, technically. The reviewer thinks that how to compute standard deviation of p and q is not obvious.

We agree with your comment. On pages 19-20, lines 328 to 356, we have added a new subsection titled “Estimation of $\langle\hat{p}\rangle$ and $\sigma_p$”, where we explain the methodology for the calculation of the parameter $p$. We have added the following references to the text:

37. Silvey SD. Statistical Inference. Chapman & Hall Monographs on Statistics and Applied Probability. London: Chapman & Hall; 1975. ISBN: 978-0412138201.

38. Bickel PJ, Doksum KA. Mathematical Statistics. ISBN: 978-0816207848. Prentice Hall, Englewood Cliffs, New Jersey; 1977.

39. Harding B, Tremblay C, Cousineau D. Standard errors: A review and evaluation of standard error estimators using Monte Carlo simulations. TQMP. 2007;10(2):107–123. doi:10.20982/tqmp.10.2.p107.
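To make the kind of calculation discussed in this reply concrete, the following minimal sketch estimates the parameter of a geometric distribution from a list of run durations by maximum likelihood and attaches the asymptotic standard error obtained from the Fisher information. It is not the authors' code and may not match the procedure in their new subsection; it assumes the parametrization $P(D=k) = p(1-p)^{k-1}$ for $k = 1, 2, \dots$, which may differ from the one used in the paper, and the function name `geometric_mle` and the sample durations are hypothetical.

```python
import math


def geometric_mle(durations):
    """MLE of p and its asymptotic standard error.

    Assumes P(D = k) = p * (1 - p)**(k - 1) for k = 1, 2, ...
    (an illustrative parametrization, not necessarily the paper's).
    """
    n = len(durations)
    if n == 0 or any(d < 1 for d in durations):
        raise ValueError("durations must be a non-empty list of integers >= 1")
    mean_duration = sum(durations) / n
    # The likelihood is maximized at the reciprocal of the sample mean.
    p_hat = 1.0 / mean_duration
    # Fisher information per observation is 1 / (p^2 (1 - p)),
    # so Var(p_hat) is approximately p^2 (1 - p) / n for large n.
    std_err = p_hat * math.sqrt((1.0 - p_hat) / n)
    return p_hat, std_err


# Made-up durations (in days), for illustration only.
sample = [1, 2, 1, 3, 1, 1, 2, 4, 1, 2]
p_hat, std_err = geometric_mle(sample)
print(f"p_hat = {p_hat:.4f}, asymptotic standard error = {std_err:.4f}")
```

The asymptotic formula is only reliable for large samples; a resampling estimate of the standard error would be a natural alternative for the short windows that arise when the parameter is tracked over time.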

>The title of this paper is not suitable to the current content proven in this manuscript.

We agree on this point: the title could be misleading. For this reason, we have added the word “distributional” to it, giving the following final title:

“An empirical data analysis of “price runs” in daily financial indices: dynamically assessing market geometric distributional behavior”.

It is a subtle change, but with it we gain clarity about what we are really analyzing.

>Specifically, the authors analyze the ratio of the direction of price changes by using p and q. Statistics of p and q are price direction for the short term (one business day), namely, Pr(x) with x = 1 or -1. The authors assume that the price direction should be independent.

In reality, we do not explicitly analyze the ratio of the direction of price changes; to motivate the ideas presented, we show only a single plot of the ratio of upward price movements to total price movements. In our paper we count and directly analyze the durations of upward and downward runs separately, treating them as independent. We do not assume the condition $p+q = 1$; on the contrary, we verify it statistically.
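The separate counting of upward and downward run durations described in this reply can be sketched as follows. This is an illustrative implementation rather than the authors' code: the function name `run_durations` and the sample prices are hypothetical, days with an unchanged price are assumed to end the current run without starting a new one, and the final, possibly truncated, run is counted like any other.

```python
def run_durations(prices):
    """Return (up_runs, down_runs): lengths of maximal streaks of
    consecutive daily rises and of consecutive daily falls."""
    up_runs, down_runs = [], []
    current_sign, length = 0, 0
    for prev, curr in zip(prices, prices[1:]):
        sign = (curr > prev) - (curr < prev)  # +1 rise, -1 fall, 0 unchanged
        if sign == current_sign and sign != 0:
            length += 1  # the current run continues
            continue
        # Direction changed (or the price did not move): close the open run.
        if current_sign > 0:
            up_runs.append(length)
        elif current_sign < 0:
            down_runs.append(length)
        current_sign, length = sign, (1 if sign != 0 else 0)
    # Close the run still open at the end of the sample.
    if current_sign > 0:
        up_runs.append(length)
    elif current_sign < 0:
        down_runs.append(length)
    return up_runs, down_runs


# Made-up closing values, for illustration only.
prices = [100, 101, 103, 102, 100, 99, 104]
up, down = run_durations(prices)
print(up, down)
```

Fitting a geometric distribution to each of the two lists independently yields separate estimates for upward and downward runs, whose sum can then be compared with 1 instead of imposing $p+q = 1$ from the start.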

>However, the price run should be measured by correlation among the directions of price change for some time horizon. Pr(x1, x2), Pr(x1, x2, x3), and more. Such correlations are captured by test for independence by the chi-squared test for high dimensional contingency table.

We are not sure what is meant by “price runs measurement”; in our paper we study the distribution of the length, or duration, of price runs. In any case, if the suggestion is to study the finite-dimensional distributions of up and down price movements, this would be a different analysis, as we focus on runs.

On the other hand, we have studied autocorrelations of returns computed from runs. This is more natural because returns calculated from runs are signed, while run durations are not. However, this would also be a different analysis from the one presented in our paper.

We consider that, as it stands, our paper is already too long and complete to accommodate additional analyses. The idea of using contingency tables to analyze the data seems really interesting to us and is surely worth exploring, but again this would be a different analysis from the one presented in our article.
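For concreteness, the simplest two-day case, Pr(x1, x2), of the contingency-table approach suggested by the reviewer can be sketched as below. This analysis is not part of the paper; the sketch only shows Pearson's chi-squared statistic for a 2×2 table of consecutive price directions, and the function name `chi2_consecutive_signs` and the input sequence are hypothetical. Under independence the statistic is approximately chi-squared distributed with one degree of freedom, for which the 5% critical value is about 3.84.

```python
def chi2_consecutive_signs(signs):
    """Pearson chi-squared statistic for independence between the
    direction on day t and the direction on day t + 1.

    `signs` is a sequence of +1 (rise) and -1 (fall); unchanged days
    are assumed to have been removed beforehand.
    """
    table = {(a, b): 0 for a in (1, -1) for b in (1, -1)}
    for a, b in zip(signs, signs[1:]):
        table[(a, b)] += 1
    n = sum(table.values())
    row = {a: table[(a, 1)] + table[(a, -1)] for a in (1, -1)}
    col = {b: table[(1, b)] + table[(-1, b)] for b in (1, -1)}
    if 0 in row.values() or 0 in col.values():
        raise ValueError("every row and column of the table must be non-empty")
    stat = 0.0
    for (a, b), observed in table.items():
        expected = row[a] * col[b] / n  # expected count under independence
        stat += (observed - expected) ** 2 / expected
    return stat, table


# A strictly alternating, made-up sequence: far from independent.
stat, table = chi2_consecutive_signs([1, -1, 1, -1, 1, -1, 1, -1, 1])
print(stat, table)
```

Longer horizons such as Pr(x1, x2, x3) would require higher-dimensional tables with correspondingly more degrees of freedom.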

>Moreover, some researchers have been conducted this type of analysis in the literature of econophysics about fifteen years before. Please read the following book: Hideki Takayasu (ed.), Empirical science of financial fluctuations: the advent of econophysics, Springer (2002)

In fact, studies of this kind and related ones, such as those of return distributions or, more generally, of the empirical properties of markets, have a long tradition in the econometric literature. They are currently being actively investigated by a great number of researchers worldwide and date back much more than 15 years (our oldest reference dates back to 1937), as we show in our bibliography.

Universal empirical properties of financial data, called stylized facts, are an active research area for the economics and physics communities; dozens of these properties have been reported, and new ones continue to be found.

Returning to our work, we should stress that we report empirical results which, to our knowledge, are novel and original. For example, as we answered to referee number one in point 4) above, no methodology for measuring the fraction of time in which the distribution of market price runs follows a geometric distribution with parameter compatible with p=0.5, or the dates when this happens, has been published before. The same can be said of the result that the geometric distribution is a good model of the duration of the runs with the parameter p changing, or of how p and q evolve with time, etc. In our opinion, these results are of sufficient interest to the academic community.

Thank you for the reference; we have included the outstanding book edited by Professor Takayasu in our bibliography. It includes papers describing methodologies that would become of great importance for studying financial markets, such as the application of agents to model financial markets or the application of random matrix theory to study the correlations of market sectors, etc.

It has been included as reference number 10, lines 557 and 558:

10. Takayasu H. (Ed), Empirical Science of Financial Fluctuations: The advent of econophysics, ISBN: 978-4431703167, Springer (2000).

Finally, however, we note that this reference does not include any work with an analysis similar to the one presented in our paper.

On code and data availability:

Mathematica and Python versions of the code used for the data analyses presented in our paper may be downloaded at:

https://github.com/CarlosManuelRodr/TrendDurationAnalysis

Analyzed data is available at:

https://github.com/CarlosManuelRodr/TrendDurationAnalysis/tree/main/Research/OriginalDataset

This is now mentioned in the text, lines 167 to 169:

“Data sample files were downloaded from Yahoo Finance. All data sample for the mentioned time span is available at the following link: https://github.com/CarlosManuelRodr/TrendDurationAnalysis/tree/main/Research/OriginalDataset.”

Finally, a number of minor corrections were made to the manuscript:

i) Table 3, pg 10: For Nasdaq uptrends with a duration of 16 days, the entry must be 0 instead of 1, and for Nasdaq downtrends with a duration of 16 days the entry must be 1 and not 0. For downtrends with durations of 18 and 19 days, both entries must be 0 and not 1.

From figures 4c) and 4d), it is clear that the above errors were indeed typos. It can be seen there that the distribution of Nasdaq uptrend durations has no entries at 16 days, that there is one entry in the distribution of Nasdaq downtrend durations at 16 days, and that there are no entries for downtrends with durations of 18 and 19 days, because in this case the maximum downtrend duration is 16 days.

For the referees' convenience in verifying these typos in table 3, we have inserted subfigures 4c) and 4d) of our paper below:

(Here we insert subfigures 4c) and 4d) in the pdf file of "Response to Reviewers letter".)

ii) These simple corrections imply that, in the same table 3, we also had to correct the total number of uptrends (first row, second column) from 2390 to 2389, the total number of downtrends (first row, third column) from 2391 to 2390, and the overall number of trends (fourth column) from 4781 to 4779.

iii) For the same reason, for Nasdaq, the entries in the third row of table 1, columns 3, 4 and 5, were corrected from 4781, 2390 and 2391 to 4779, 2389 and 2390 trends, respectively. Also for Nasdaq, the number of records has been corrected from 10481 to 10534 entries. The correctness of the last change may be verified by checking the number of records in the file:

https://github.com/CarlosManuelRodr/TrendDurationAnalysis/blob/main/Research/OriginalDataset/nasdaq.json

which has 10549 records in total; 13 initial lines contain information on the index itself and the time period covered, and two final lines contain no data, leaving a total of 10534 Nasdaq data records to analyze.

Finally, to indicate that some data preprocessing was applied, the text “The data have been filtered, e.g. by removing null records.” has been added to the captions of figure 1 (for example, the Nikkei data have eleven null records).

iv) In tables 2 to 5, first rows, first columns: the units of the variable duration have been included --> (days).

v) In line 24:

It said:

“...where we describe a basic random process consistent…”

It says now:

“...where we study empirically a basic random process consistent …”

vi) Because of its historical and theoretical interest, the following text was inserted in lines 151 to 157:

“Some historical references on this subject, are [19], [20] and [21]. Where chapter X of the first reference was during many years the classical textbook reference to Theory of Runs; second reference shows an interesting statistical test based on runs properties to demonstrate that two sets of independent observations corresponding to two independent random variables have the same distribution and finally, the third reference presents an intensive treatment of the theory of runs still of current interest.”

The corresponding added references are:

19. Wilks SS. Mathematical Statistics, ISBN: 978-4431703167, Princeton University Press (1943). Available from: https://books.google.com.mx/books?id=k38pAQAAMAAJ.

20. Wald A, Wolfowitz J. On a test whether two samples are from the same population. Ann. Math. Stat. 1940;11:147–162. Available from: http://dml.mathdoc.fr/item/1177731909/.

21. Mood AM. The Distribution Theory of Runs. Ann. Math. Stat. 1940;11:367–392. Available from: https://projecteuclid.org/journals/annals-of-mathematical-statistics/volume-11/issue-4/The-Distribution-Theory-of-Runs/10.1214/aoms/1177731825.full

vii) In line 487:

“…financial crahes…” was corrected to “…financial crashes...”

viii) In reference 1. line 532, “Lo AWC...” was corrected to “Lo AW...”

ix) Reference 22 in lines 583 to 585 is no longer available online, so the inactive URL has been deleted, also in the S1 Appendix Supporting information.

x) In reference 32, lines 605 and 606, Kertez was corrected to Kertész. Also, year of reference was corrected from 2005 to 2006.

xi) In references 33) and 34), in lines 607 and 610, respectively, “et all” was corrected to “et al”.

Again, we thank you and our anonymous referees for your time and effort in reviewing our paper.

Attachment

Submitted filename: ResponseReviewers.pdf


PLoS One. doi: 10.1371/journal.pone.0270492.r003

Decision Letter 1

Aurelio F Bariviera

13 Jun 2022

An empirical data analysis of “price runs” in daily financial indices: dynamically assessing market geometric distributional behavior

PONE-D-21-27632R1

Dear Dr. Hernandez Montoya,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Aurelio F. Bariviera, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

**********

6. Review Comments to the Author

Reviewer #1: The paper has been improved following the referees’ suggestions, and now, according to me, it is ready to be published in this journal.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

PLoS One. doi: 10.1371/journal.pone.0270492.r004

Acceptance letter

Aurelio F Bariviera

23 Jun 2022

PONE-D-21-27632R1

An empirical data analysis of “price runs” in daily financial indices: dynamically assessing market geometric distributional behavior

Dear Dr. Hernández-Montoya:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Aurelio F. Bariviera

Academic Editor

PLOS ONE

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

S1 Dataset. File S1_DataSet.zip contains all analyzed data set.

(ZIP)


S1 Appendix. The discrete version of the Anderson-Darling goodness-of-fit test.

(PDF)



Data Availability Statement

References

1. Lo AW, MacKinlay AC. A Non-random Walk Down Wall Street. Princeton paperbacks. Princeton University Press; 1999. Available from: https://books.google.com.mx/books?id=fGmfQgAACAAJ.

2. Cont R. Empirical properties of asset returns: stylized facts and statistical issues. Quantitative Finance. 2001;1:223–236. doi: 10.1080/713665670

3. Pagan A. The econometrics of financial markets. Journal of Empirical Finance. 1996;3(1):15–102. doi: 10.1016/0927-5398(95)00020-8

4. Ponta L, Trinh M, Raberto M, Scalas E, Cincotti S. Modeling non-stationarities in high-frequency financial time series. Physica A. 2019;521(1):173–196. doi: 10.1016/j.physa.2019.01.069

5. Maldarella D, Pareschi L. Kinetic models for socio-economic dynamics of speculative markets. Physica A. 2012;391(1):715–730. doi: 10.1016/j.physa.2011.08.013

6. Ehrentreich N. Agent-based modeling: The Santa Fe Institute artificial stock market model revisited. Volume 602. ISBN: 978-3-540-73879-4, Springer Science & Business Media (2007).

7. Lux T. Stochastic behavioral asset pricing models and the stylized facts. Technical report. 2008, Economics working paper, Department of Economics. Christian-Albrechts-Universität Kiel.

8. Farmer JD. The economy needs agent-based modelling. Nature. 2009;460(7256):685–686. doi: 10.1038/460685a

9. Meyer M. How to Use and Derive Stylized Facts for Validating Simulation Models. In Claus Beisbart and Nicole J. Saam (eds.), Computer Simulation Validation: Fundamental Concepts, Methodological Frameworks, and Philosophical Perspectives. Series: Simulation Foundations, Methods and Applications. ISBN: 978-3319707655, Springer. pp. 383-403 (2019).

10. Takayasu H. (Ed), Empirical Science of Financial Fluctuations: The advent of econophysics, ISBN: 978-4431703167, Springer (2000).

11. Fama EF. Efficient Capital Markets: A Review of Theory and Empirical Work. The Journal of Finance. 1970;25(2):383–417. doi: 10.1111/j.1540-6261.1970.tb00518.x

12. Mantegna RN, Stanley HE. An Introduction to Econophysics: Correlations and Complexity in Finance. USA: Cambridge University Press; 1999.

13. Samuelson PA. Proof that Properly Anticipated Prices Fluctuate Randomly. Industrial Management Rev. 1965;6(2):41–45.

14. Engle RF. Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of U.K. Inflation. Econometrica. 1982;50(4):987–1002. doi: 10.2307/1912773

15. Bollerslev T. Generalized Autoregressive Conditional Heteroskedasticity. J Econometrics. 1986;31(3):307–327. doi: 10.1016/0304-4076(86)90063-1

16. LeRoy S. Risk Aversion and the Martingale Property of Stock Prices. International Economic Review. 1973;14(2):436–46. doi: 10.2307/2525932

17. Lucas RE. Asset Prices in an Exchange Economy. Econometrica. 1978;46(6):1429–1445. doi: 10.2307/1913837

18. Scalas E. Scaling in the market of futures. Physica A. 1998;253(1):394–402. doi: 10.1016/S0378-4371(97)00652-3

19. Wilks SS. Mathematical Statistics, ISBN: 978-4431703167, Princeton University Press (1943). Available from: https://books.google.com.mx/books?id=k38pAQAAMAAJ.

20. Wald A, Wolfowitz J. On a test whether two samples are from the same population. Ann. Math. Stat. 1940;11:147–162. Available from: http://dml.mathdoc.fr/item/1177731909/

21. Mood AM. The Distribution Theory of Runs. Ann. Math. Stat. 1940;11:367–392. Available from: 10.1214/aoms/1177731825.full

22. Bracquemond C, Crétois E, Gaudoin O. A comparative study of goodness-of-fit tests for the geometric distribution and application to discrete time reliability. Laboratoire Jean Kuntzmann; 2002.

23. Choulakian V, Lockhart RA, Stephens MA. Cramér-von Mises Statistics for Discrete Distributions. The Canadian Journal of Statistics / La Revue Canadienne de Statistique. 1994;22(1):125–137. doi: 10.2307/3315828

24. Alexander SS. Price Movements in Speculative Markets: Trends or Random Walks. Industrial Management Review. 1964;(2):7–26.

25. Cowles A, Jones HE. Some a Posteriori Probabilities in Stock Market Action. Econometrica. 1937;5(280):280–294. doi: 10.2307/1905515

26. Cowles A. A Revision of Previous Conclusions Regarding Stock Price Behavior. Econometrica. 1960;28(4):909–915. doi: 10.2307/1907573

27. Fama EF. The Behavior of Stock-Market Prices. The Journal of Business. 1965;38(1):34–105. doi: 10.1086/294743

28. Sieczka P, Hołyst JA. Statistical properties of short term price trends in high frequency stock market data. Physica A: Statistical Mechanics and its Applications. 2008;387(5):1218–1224. doi: 10.1016/j.physa.2007.10.048

29. Silvey SD. Statistical Inference. Chapman & Hall. London. Chapman & Hall Monographs on Statistics and Applied Probability; 1975. ISBN: 978-0412138201 London.

30. Bickel PJ, Doksum KA. Mathematical Statistics. ISBN: 978-0816207848. Prentice Hall, Englewood Cliffs, New Jersey; 1977.

31. Harding B, Tremblay C, Cousineau D. Standard errors: A review and evaluation of standard error estimators using Monte Carlo simulations. TQMP. 2007;10(2):107–123. doi: 10.20982/tqmp.10.2.p107

32. Aldridge I. High-Frequency Runs and Flash-Crash Predictability. Journal of Portfolio Management. 2014;40(3):113–123. doi: 10.3905/jpm.2014.40.3.113

33. Li H, Gao Y. Statistical distribution and time correlation of stock returns runs. Physica A. 2013;377(1):193–198. doi: 10.1016/j.physa.2006.11.016

34. Bradley JV. Distribution-Free-Statistical Tests. USA: Prentice-Hall; 1968.

35. Toth B, Kertész J. Increasing market efficiency: Evolution of cross-correlations of stock returns. Physica A. 2006;360(2):505–515. doi: 10.1016/j.physa.2005.06.058

36. Yaoqi G et al. China’s copper futures market efficiency analysis: Based on nonlinear Granger causality and multifractal methods. Resources policy. 2020;68:101716. doi: 10.1016/j.resourpol.2020.101716

37. Adam Z et al. Where have the profits gone? Market efficiency and the disappearing equity anomalies in country and industry returns. Journal of Banking & Finance. 2020;121:105966.

38. Coronel-Brizio HF, Hernández-Montoya AR, Huerta-Quintanilla R, Rodríguez-Achach ME. Evidence of increment of efficiency of the Mexican Stock Market through the analysis of its variations. Physica A. 2007;380(2):391–398. doi: 10.1016/j.physa.2007.02.109


