Table 2.
Description of the email spam dataset and 20 newsgroups datasets [12].
| Dataset | Source domains ($N_S$) | Target domains ($N_S$) |
|---|---|---|
| Email spam | User 1 (2500) | Public set (4000) |
| | User 2 (2500) | |
| | User 3 (2500) | |
| rec versus sci | rec.autos and sci.crypt (1976) | rec.sport.hockey and sci.space (1982) |
| | rec.motorcycles and sci.electronics (1977) | |
| | rec.sport.baseball and sci.med (1978) | |
| comp versus rec | comp.graphics and rec.autos (1957) | comp.sys.mac.hardware and rec.sport.hockey (1955) |
| | comp.os.ms-windows.misc and rec.motorcycles (1956) | |
| | comp.sys.ibm.pc.hardware and rec.sport.baseball (1970) | |
| sci versus comp | sci.crypt and comp.graphics (1959) | sci.space and comp.sys.mac.hardware (1943) |
| | sci.electronics and comp.os.ms-windows.misc (1947) | |
| | sci.med and comp.sys.ibm.pc.hardware (1966) | |