How spam may feed the thinking machine
Published: 24 Aug 2004 16:50 BST
It is hard to find a good word to say for spam. Incoherent, unpleasant and unwanted, it slimes through cyberspace on the backs of zombies and oozes into our inboxes with the stench of month-old haddock. Yet far from fatally clogging up our information arteries, spam may provide the impetus for a true revolution in information technology -- one we've been expecting for more than fifty years.
All the problems caused by the stuff can be solved if we can answer one simple question: what is spam? You and I know within a second of opening a piece of email whether it's spam or not -- but computers are terribly bad at replicating the task. All spam filters suffer from two problems: the false negative and the false positive. We can -- we do -- put up with the false negatives, the spam written cleverly enough to bypass whichever tests are flavour of the month.
False positives, when a real email is junked before we read it, are potentially ruinous. Unless filters are absolutely sure, they err on the side of slackness. They are never absolutely sure, so some spam always gets through. And, because spam works on the law of averages, as long as some gets through, the spammers will ramp up the rate to make sure that enough of it lands to make the sums work. The pressure on our systems is immense.
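To see why filters err on the side of slackness, it may help to picture a filter that gives each message a score and junks anything at or above a cut-off. The sketch below is purely illustrative: the scores, the `MESSAGES` list and the `count_errors` helper are all invented for this example and do not describe any real filter.

```python
# Hypothetical data: (spam score from 0 to 1, whether the message really is spam).
MESSAGES = [
    (0.99, True),
    (0.95, True),
    (0.80, True),
    (0.60, True),
    (0.70, False),  # a real email that happens to look spammy
    (0.20, False),
    (0.05, False),
]


def count_errors(messages, threshold):
    """Return (false_positives, false_negatives) when every message
    scoring at or above `threshold` is junked."""
    false_positives = sum(
        1 for score, is_spam in messages if score >= threshold and not is_spam
    )
    false_negatives = sum(
        1 for score, is_spam in messages if score < threshold and is_spam
    )
    return false_positives, false_negatives


if __name__ == "__main__":
    for threshold in (0.5, 0.9):
        fp, fn = count_errors(MESSAGES, threshold)
        print(f"threshold={threshold}: {fp} real emails junked, {fn} spam delivered")
```

With these made-up numbers, the strict cut-off of 0.5 junks one real email and lets no spam through, while the slack cut-off of 0.9 junks no real email but delivers two pieces of spam. A filter that must never lose real mail is pushed towards the second setting.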
So what's so hard about spotting spam? By common consent, the first serious spammers were Laurence Canter and Martha Siegel, who started sending out mass postings in 1994 advertising immigration services. At once, the battle was joined: people started writing filters and ditching missives from Canter and Siegel's ISP -- as the only spammers on the planet, they were easy to find. They changed ISP (not entirely voluntarily) and the arms race between spammers and filters had begun.
Since then, spam-filter software has learned -- for example -- that copies of the same spam look very much alike, so the spammers learned to include different random text in each message. Then the filters found that some fairly simple tests for basic English construction spotted the randomness, so the spammers learned to construct fake English sentences or include snippets of surreally inappropriate text. Key words were a giveaway, so the spammers learned to misspell and punctuate violently.
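The key-word round of that arms race can be sketched in a few lines. This is a toy, not a description of any real filter: the `KEYWORDS` list, the `SUBSTITUTIONS` table and both function names are assumptions made up for illustration.

```python
import re

# Hypothetical blocklist of giveaway words.
KEYWORDS = {"viagra", "mortgage"}

# Hypothetical table of look-alike characters spammers might swap in.
SUBSTITUTIONS = str.maketrans(
    {"1": "i", "!": "i", "0": "o", "@": "a", "$": "s", "3": "e"}
)


def naive_match(text):
    """Flag the message if any whole word is on the blocklist."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in KEYWORDS for word in words)


def normalised_match(text):
    """Undo look-alike characters and strip punctuation and spaces
    before searching for blocklisted words."""
    lowered = text.lower().translate(SUBSTITUTIONS)
    collapsed = re.sub(r"[^a-z]", "", lowered)
    return any(keyword in collapsed for keyword in KEYWORDS)


if __name__ == "__main__":
    for message in ("Cheap viagra now", "Cheap v1@gr@ now", "V.I.A.G.R.A", "See you at lunch"):
        print(message, naive_match(message), normalised_match(message))
```

The naive test catches only the plain spelling, while the normalised one also catches the misspelt and punctuated variants. The price is that squashing a message into one long string can find a blocklisted word straddling two innocent ones, which is exactly the kind of false positive that makes filter writers cautious.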