Table 1.
Anchor | URLs encountered | URLs read in | Clusters | Total links | Co-links |
---|---|---|---|---|---|
Shakespeare | 277,114 | 69,982 | 1,560 | 3,730 | 321 |
Needlework | 341,398 | 102,895 | 1,498 | 4,440 | 727 |
Revisionist | 66,771 | 22,933 | 947 | 1,307 | 67 |
Piazzolla | 47,978 | 20,404 | 2,125 | 2,577 | 70 |
Mol. Biol. | 318,705 | 110,286 | 1,518 | 6,351 | 868 |
Our protocol ensures that all nodes within distance r = 1 of the anchor are fully explored to a given depth d = 3. The five crawls started, respectively, at sites devoted to literary studies of Shakespeare, needlework instructions, Swiss revisionism (a notorious anti-Semitic site), tango (specifically the music of Piazzolla), and molecular biology (Mol. Biol.). A URL has been “read in” if our robot has fetched and analyzed the corresponding page. We deem a URL “encountered” if it appeared in any page we have read in. The link counts are for the clustered graphs.
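To make the distinction between “read in” and “encountered” URLs concrete, the following is a minimal sketch of one possible reading of the protocol, not the robot actually used. It assumes that every node within r hops of the anchor serves as a seed and that each seed is explored breadth-first to depth d, with pages at depth d still being fetched. The function `crawl`, the callback `fetch_links`, and the graph `toy_graph` are hypothetical names introduced only for illustration.

```python
from collections import deque


def crawl(fetch_links, anchor, r=1, d=3):
    """Return (read_in, encountered) for one anchor.

    fetch_links(url) is a hypothetical callback returning the URLs linked
    from a page; it stands in for the robot's fetch-and-analyze step.
    """
    cache = {}           # url -> outgoing links, filled on first fetch
    encountered = set()  # URLs that appeared in any page that was read in

    def read(url):
        # Fetch each page at most once and record every link it contains.
        if url not in cache:
            cache[url] = list(fetch_links(url))
            encountered.update(cache[url])
        return cache[url]

    def explore(start, max_depth):
        """Breadth-first search that reads every page within max_depth hops."""
        depth = {start: 0}
        queue = deque([start])
        while queue:
            url = queue.popleft()
            links = read(url)
            if depth[url] == max_depth:
                continue  # page is read, but its links are not followed
            for target in links:
                if target not in depth:
                    depth[target] = depth[url] + 1
                    queue.append(target)
        return depth

    # Assumed reading of the protocol: every node within r hops of the
    # anchor becomes a seed, and each seed is explored to depth d.
    seeds = explore(anchor, r)
    for seed in seeds:
        explore(seed, d)
    return set(cache), encountered


# Toy link graph (hypothetical): a chain a -> b -> ... -> g plus a -> x.
toy_graph = {
    "a": ["b", "x"],
    "b": ["c"],
    "c": ["d"],
    "d": ["e"],
    "e": ["f"],
    "f": ["g"],
}
read_in, encountered = crawl(lambda url: toy_graph.get(url, []), "a")
```

In this toy run the page `f` is encountered, because it is linked from a page that was read in, but it is never read in itself, and `g` is not encountered at all. The same effect, at scale, is why the “encountered” column in Table 1 is several times larger than the “read in” column.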