Jing Fu, Richard T. Goodwin, et al.
ICCC 2019
Many of the first successful applications of statistical learning to anti-spam filtering were personalized classifiers that were trained on an individual user's spam and ham e-mail. Proponents of personalized filters argue that statistical text learning is effective because it can identify the unique aspects of each individual's e-mail. On the other hand, a single classifier learned for a large population of users can leverage the data provided by each individual user across hundreds or even thousands of users. This paper investigates the trade-off between globally- and personally-trained anti-spam classifiers. We find that globally-trained text classification easily outperforms personally-trained classification under realistic settings. This result does not imply that personalization is not valuable. We show that the two techniques can be combined to produce a modest improvement in overall performance.