Farmer et al. 10.1073/pnas.0409157102.

Supporting Information

Files in this Data Supplement:

Supporting Text
Supporting Table 1
Supporting Table 2
Supporting Figure 5
Supporting Figure 6
Supporting Figure 7
Supporting Figure 8
Supporting Figure 9
Supporting Figure 10
Supporting Figure 11




Table 1. Summary statistics for stocks in the data set

Stock   No. events  Average    Limit     Market    Deletions  Eff. limit  Eff. market  No. of
ticker  (1,000s)    (per day)  (1,000s)  (1,000s)  (1,000s)   (shares)    (shares)     days

AZN     608         1,405      292       128       188        4,967       4,921        429
BARC    571         1,318      271       128       172        7,370       6,406        433
CW.     511         1,184      244       134       134        12,671      11,151       432
GLXO    814         1,885      390       200       225        8,927       6,573        434
LLOY    644         1,485      302       184       159        13,846      11,376       434
ORA     314         884        153       57        104        12,097      11,690       432
PRU     422         978        201       94        127        9,502       8,597        354
RTR     408         951        195       100       112        16,433      9,965        431
SB.     665         1,526      319       176       170        13,589      12,157       426
SHEL    592         1,367      277       159       156        44,165      30,133       429
VOD     940         2,161      437       296       207        89,550      71,121       434

Fields from left to right: stock ticker symbol, total number of events (effective market orders + effective limit orders + order cancellations) in thousands, average number of events in a trading day, number of effective limit orders in thousands, number of effective market orders in thousands, number of order deletions in thousands, average limit order size in shares, average market order size in shares, and number of trading days in the sample.





Table 2. A summary of the bootstrap error analysis described in the text

Regression           Estimated  Standard  Bootstrap  Low   High

Spread intercept     0.06       0.21      0.29       0.25  0.33
Spread slope         0.99       0.08      0.10       0.09  0.11
Diffusion intercept  2.43       1.22      1.76       1.57  1.97
Diffusion slope      1.33       0.19      0.25       0.23  0.29

The columns (left to right) are the estimated value of the parameter, the standard error from the cross-sectional regression in Fig. 10, the one standard deviation error bar estimated by the bootstrapping method, and the one standard deviation low and high values for the extrapolation, as shown in Figs. 3 e and f and 4 e and f.





Supporting Figure 5

Fig. 5. Illustration of the procedure for measuring the price diffusion rate for Vodafone (VOD) on August 4, 1998. On the x axis, we plot the time t in units of ticks, and on the y axis, the variance of midprice diffusion V(t). According to the hypothesis that midprice diffusion is an uncorrelated Gaussian random walk, the plot should obey V(t) = Dt. Because points with larger values of t have fewer independent intervals and are less statistically significant, we use a weighted regression to compute the slope D.
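
The measurement described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the input is assumed to be a midprice (or log midprice) series sampled once per tick, the fit is forced through the origin to match V(t) = Dt, and the weights are assumed to be the number of non-overlapping intervals available at each lag. The caption says only that a weighted regression is used, so the weighting scheme here is an assumption.

```python
import numpy as np

def variance_by_lag(series, max_lag):
    """Return lags t = 1..max_lag and the variance V(t) of changes over t ticks."""
    x = np.asarray(series, dtype=float)
    lags = np.arange(1, max_lag + 1)
    var = np.array([np.var(x[t:] - x[:-t]) for t in lags])
    return lags, var

def weighted_slope_through_origin(t, v, w):
    """Weighted least-squares slope D for the model v = D * t (no intercept)."""
    t, v, w = (np.asarray(a, dtype=float) for a in (t, v, w))
    return float(np.sum(w * t * v) / np.sum(w * t * t))

def diffusion_rate(series, max_lag=50):
    """Estimate D from a tick-sampled series."""
    lags, var = variance_by_lag(series, max_lag)
    # Assumed weights: number of independent (non-overlapping) intervals per lag.
    weights = (len(series) - 1) // lags
    return weighted_slope_through_origin(lags, var, weights)
```

For an uncorrelated random walk whose steps have variance s^2, the estimate should come out close to s^2 per tick.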





Supporting Figure 6

Fig. 6. Time series (Upper) and autocorrelation function (Lower) for the daily price diffusion rate D_t for Vodafone. Because of long-memory effects and the short length of the series, the long-lag coefficients are poorly determined; the figure is intended simply to demonstrate that the correlations are quite large.
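
For reference, a sample autocorrelation function of the kind plotted in the lower panel can be computed as follows. This is a generic sketch using the standard biased estimator; the caption does not say which estimator was used, and the function name is hypothetical.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    # Each lag-k coefficient is the lagged cross product over the lag-0 sum of squares.
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])
```

With a series of only a few hundred daily values, the coefficients at long lags rest on few effective observations, which is the caveat the caption raises.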





Supporting Figure 7

Fig. 7. Subsample analysis of regression of predicted vs. actual spread. To get a better feeling for the true errors in this estimation (as opposed to standard errors, which are certainly too small), we divide the data into subsamples (using the same temporal period for each stock) and apply the regression to each subsample. (a) The results for the intercept; (b) the results for the slope. In both cases, we see that progressing from right to left, as the subsamples increase in size, the estimates become tighter. (c and d) The mean and standard deviation for the intercept and slope. We observe a systematic tendency for the mean to increase as the number of bins decreases. (e and f) The logarithm of the standard deviations of the estimates against log n, the number of points in each subsample. The line is a regression based on binnings ranging from m = N to m = 10 (lower values of m tend to produce unreliable standard deviations). The estimated error bar is obtained by extrapolating to n = N. To test the accuracy of the error bar, the dashed lines are one standard deviation variation on the regression, whose intercepts with the n = N vertical line produce high and low estimates.
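
The subsample statistics in panels a to d can be sketched as below. This is a simplified illustration under stated assumptions: the paired observations (x, y) are split into m contiguous blocks and an ordinary least-squares line is fitted to each block, whereas in the analysis described above each subsample is a common time period over which the per-stock quantities are computed. The function names and the returned dictionary layout are hypothetical.

```python
import numpy as np

def subsample_regressions(x, y, m):
    """Fit y = intercept + slope * x separately on m contiguous subsamples."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    fits = [np.polyfit(xs, ys, 1)[::-1]  # polyfit returns (slope, intercept)
            for xs, ys in zip(np.array_split(x, m), np.array_split(y, m))]
    return np.array(fits)  # shape (m, 2): columns are intercept, slope

def summary_by_bins(x, y, bin_counts):
    """Mean and standard deviation of the subsample estimates for each m."""
    out = {}
    for m in bin_counts:
        fits = subsample_regressions(x, y, m)
        out[m] = {
            "n": len(x) // m,                 # approximate points per subsample
            "mean": fits.mean(axis=0),        # (intercept, slope)
            "std": fits.std(axis=0, ddof=1),  # (intercept, slope)
        }
    return out
```

Plotting log std against log n for a range of m gives the kind of scaling plot described for panels e and f.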





Supporting Figure 8

Fig. 8. Subsample analysis of regression of predicted vs. actual price diffusion (see Fig. 10), similar to Fig. 7. The scaling of the errors is much less regular than for the spread, so the error bars are less accurate.





Supporting Figure 9

Fig. 9. Market impact collapse under four kinds of axis rescaling. In each case, we plot a normalized version of the order size on the horizontal axis vs. a (possibly normalized) average market impact log(p_{t+1}) - log(p_t) on the vertical axis. (a) Collapse using nondimensional units based on the model. (b) Order size is normalized by its mean value for the sample. (c) Order size is normalized by the average daily volume. (d) Order size is multiplied by the current best midpoint price, making the horizontal axis the monetary value of the trade.
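
Rescalings (b) to (d) and the impact variable are simple to state in code. The sketch below assumes NumPy arrays of per-order sizes and midpoint prices plus a series of daily volumes; the function names and dictionary keys are hypothetical. Rescaling (a) is omitted because the model-based nondimensional units are not defined in this caption.

```python
import numpy as np

def log_price_change(prices):
    """Market impact variable: log(p_{t+1}) - log(p_t) for consecutive prices."""
    return np.diff(np.log(np.asarray(prices, dtype=float)))

def rescale_order_size(size, midprice, daily_volume):
    """Return the order size under rescalings (b), (c), and (d)."""
    size = np.asarray(size, dtype=float)
    return {
        "by_mean": size / size.mean(),                    # (b) sample mean size
        "by_daily_volume": size / np.mean(daily_volume),  # (c) average daily volume
        "monetary_value": size * np.asarray(midprice, dtype=float),  # (d) value
    }
```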





Supporting Figure 10

Fig. 10. The variance plot procedure used to determine error bars for mean market impact conditional on order size. The horizontal axis n denotes the number of points in each of the m different samples, and the vertical axis is the standard deviation of the m sample means. We estimate the error of the full sample mean by extrapolating n to the full sample length.
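
A variance plot and its extrapolation can be sketched as follows. The assumptions are labeled: the samples are taken to be contiguous blocks of the data, the extrapolation is an unweighted straight-line fit of log(standard deviation) against log n, and the function names are hypothetical. The caption does not specify these details.

```python
import numpy as np

def variance_plot(values, bin_counts):
    """For each m, split into m blocks and return (n, std of the m block means)."""
    v = np.asarray(values, dtype=float)
    n = np.array([len(v) // m for m in bin_counts])
    std = np.array([
        np.std([block.mean() for block in np.array_split(v, m)], ddof=1)
        for m in bin_counts
    ])
    return n, std

def full_sample_error(values, bin_counts):
    """Extrapolate the log-log trend of std vs. n to n = full sample length."""
    n, std = variance_plot(values, bin_counts)
    slope, intercept = np.polyfit(np.log(n), np.log(std), 1)
    return float(np.exp(intercept + slope * np.log(len(values))))
```

For independent data the fitted slope should be near -1/2, so the extrapolated value approaches the usual standard error of the mean; correlated data give a shallower slope and hence a larger error bar.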





Supporting Figure 11

Fig. 11. The average market impact vs. order size plotted on a log-log scale. (Upper Left and Upper Right) Buy and sell orders in nondimensional coordinates; the fitted line has slope b = 0.26 ± 0.02 for buy orders and b = 0.23 ± 0.02 for sell orders. In contrast, Lower Left and Lower Right show the same data in dimensional units, using British pounds to measure order size. Although the exponents are similar, the scatter among different stocks is much greater.
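
A slope on a log-log plot corresponds to the exponent b of a power law, impact ≈ c * size^b. A minimal sketch of such a fit, assuming an unweighted least-squares line in log-log coordinates and strictly positive inputs; the caption does not state the fitting method, and the function name is hypothetical.

```python
import numpy as np

def impact_exponent(order_size, impact):
    """Fit log(impact) = log(c) + b * log(size); return (b, c)."""
    log_size = np.log(np.asarray(order_size, dtype=float))
    log_impact = np.log(np.asarray(impact, dtype=float))
    b, log_c = np.polyfit(log_size, log_impact, 1)
    return float(b), float(np.exp(log_c))
```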