2002 Apr 23;99(9):5825–5829. doi: 10.1073/pnas.032093399

Table 1.

Data collected in 5 experimental Web crawls

| Anchor | URL encountered | URL read in | Clusters | Total links | Co-links |
|---|---|---|---|---|---|
| Shakespeare | 277,114 | 69,982 | 1,560 | 3,730 | 321 |
| Needlework | 341,398 | 102,895 | 1,498 | 4,440 | 727 |
| Revisionist | 66,771 | 22,933 | 947 | 1,307 | 67 |
| Piazzolla | 47,978 | 20,404 | 2,125 | 2,577 | 70 |
| Mol. Biol. | 318,705 | 110,286 | 1,518 | 6,351 | 868 |

Our protocol ensures that nodes within distance r = 1 of the anchor are fully explored to a given depth d = 3. The crawls started at sites of literary studies of Shakespeare, instructions for needlework, a notorious Swiss revisionist (anti-Semitic) site, tango (specifically the music of Piazzolla), and a site for molecular biology (Mol. Biol.). A URL has been “read in” if our robot has fetched and analyzed the corresponding page. We deem a URL “encountered” if it appears in any page we have read in. The link counts are for the clustered graphs.
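The distinction between “read in” and “encountered” URLs can be illustrated with a minimal sketch. It is not the crawler used for Table 1: it assumes a plain depth-limited breadth-first traversal and does not model the radius r or the clustering step. The function name `crawl`, the parameter `max_depth`, and the toy link map are hypothetical, with a dictionary standing in for fetching and parsing real pages.

```python
from collections import deque


def crawl(links, anchor, max_depth):
    """Depth-limited breadth-first crawl over an in-memory link map.

    links maps a URL to the list of URLs that its page links to
    (a hypothetical stand-in for fetching and analyzing a page).
    Returns the pair (read_in, encountered).
    """
    read_in = set()      # pages that were fetched and analyzed
    encountered = set()  # URLs that appeared in any page read in
    queued = {anchor}    # URLs already scheduled, to avoid re-reading
    queue = deque([(anchor, 0)])

    while queue:
        url, depth = queue.popleft()
        read_in.add(url)
        for target in links.get(url, []):
            encountered.add(target)
            # Follow a link only while the depth limit allows it;
            # deeper targets stay "encountered" but are never read in.
            if depth < max_depth and target not in queued:
                queued.add(target)
                queue.append((target, depth + 1))

    return read_in, encountered


# Toy link map (assumed data, not taken from the crawls in Table 1).
toy_links = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["a", "d"],
    "d": ["e"],
    "e": ["f"],
}
read_in, encountered = crawl(toy_links, "a", max_depth=2)
```

In this toy run, page "e" is linked from a page at the depth limit, so it is counted as encountered but never read in. The same effect is consistent with the encountered counts in Table 1 exceeding the read-in counts.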