2021 Oct 8;9(2):nwab183. doi: 10.1093/nsr/nwab183

Figure 2.

Clustering 1.1 billion taxi locations in New York City. This dataset contains 1,133,769,628 two-dimensional GPS locations (see Methods). (a) Visualization of WFC and k-means results. The numbers of clusters for k-means were set to match those identified by WFC. (b) Running time and usability of clustering algorithms at different dataset sizes using centralized computing. The different dataset sizes were obtained by slicing the dataset with varying time windows (see Methods). WFC (Total) and WFC (Ave.) denote the total and the average per-scale running times of WFC, respectively. As the dataset size increases, a growing number of methods fail computationally; the failed runs are not plotted. (c) Running times of WFC and k-means using distributed computing. The results were computed from 10 runs of each algorithm, and error bars indicate the standard error of the mean.
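As a reading aid for the error bars in panel (c), the standard error of the mean over repeated runs can be sketched as below. This is a minimal illustration, not the authors' code: the function name `mean_and_sem` and the example timings are hypothetical, and it assumes the sample standard deviation (n - 1 denominator), which the caption does not specify.

```python
import math
import statistics


def mean_and_sem(run_times):
    """Return (mean, standard error of the mean) for a list of timings."""
    n = len(run_times)
    if n < 2:
        raise ValueError("need at least two runs to estimate the SEM")
    mean = statistics.fmean(run_times)
    # Assumption: sample standard deviation (n - 1 denominator) over sqrt(n).
    sem = statistics.stdev(run_times) / math.sqrt(n)
    return mean, sem


# Hypothetical timings in seconds for 10 runs; these are NOT values from the paper.
example_runs = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
mean, sem = mean_and_sem(example_runs)
print(f"{mean:.2f} s +/- {sem:.2f} s")
```

With 10 runs per algorithm, the error bar is the spread of the individual timings divided by the square root of 10, so it reflects uncertainty in the mean running time rather than run-to-run variability.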