I know this sounds lame, but I have never really needed it until now. I've recently been really busy with a new job taking up my working hours, and some renovation on our home taking up my time off, so I haven't been keeping on top of spam mail.
I run Postfix as my MTA and have it run incoming mail through spamc and Anomy Sanitizer before piping it into Cyrus imapd. I use IMAP because I could be checking my mail from any of a number of different machines here. This makes training the spam filter a little difficult, so I have a simple training-by-filing setup:
I have INBOX.junk, INBOX.junk.ham and INBOX.junk.spam. The first of these collects spam recognized by spamc during delivery. The user then copies mail from junk into junk.ham if it was a false positive, or copies mail from INBOX into junk.spam if it wasn't recognized as spam.
Periodically, I run a script to train the filter on the contents of the ham and spam folders. The problem was that I'd left far too much spam sitting in the spam folder (the script doesn't do any pruning), so the script was failing: the list of messages it passed on the command line had grown beyond what the system accepts as arguments to a single command.
So, the moral of the story is: always use xargs for maintenance scripts, even if you don't think you'll need it.
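For illustration, here is a minimal sketch of what an xargs-based training helper might look like. It is not the actual script: the function name `train_folder`, the folder path, and the choice of `sa-learn` as the training command are assumptions made for the example. It also assumes each message is stored as its own file, and that `find -print0` and `xargs -0` are available (they are in the GNU and BSD versions).

```shell
#!/bin/sh
# Hypothetical helper: run a command over every file under a directory.
# Usage: train_folder DIR COMMAND [ARGS...]
train_folder() {
    dir=$1
    shift
    # find writes NUL-terminated paths, so odd filenames survive intact.
    # xargs -0 appends them to COMMAND and splits the work into as many
    # invocations as needed to stay under the argument-length limit.
    # With GNU xargs, adding -r skips the run when the folder is empty.
    find "$dir" -type f -print0 | xargs -0 "$@"
}

# Example calls (path and training command are assumptions):
#   train_folder /path/to/junk/spam sa-learn --spam
#   train_folder /path/to/junk/ham  sa-learn --ham
```

The point of the design is that the script never builds one giant argument list itself, so it keeps working no matter how much mail piles up in the folder.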