[plug] A plan for spam spiders.

Craig Ringer craig at postnewspapers.com.au
Sat May 7 18:24:00 WST 2005


On Sat, 2005-05-07 at 18:13 +0800, Daniel J. Axtens wrote:
> > So when the spider finds a link to "DO NOT CLICK ME AS THIS PAGE WILL
> > CRASH YOUR COMPUTER" which is also enticingly placed in robots.txt as
> > forbidden fruit, it excitedly clicks through, receives a gzipped HTML
> > file, which it unpacks to view the hidden goodies, and BLAM! 1 gigabit of
> > crud explodes in its head, depleting the spam server's memory and vmem, and
> > causing the smoke to leak out of its vile little brain.
> > 
> > The question is: WOULD IT WORK?
> 
> Would a spam spider ungzip a gzipped file?

If it uses a quality HTTP library, then the decompression could happen
transparently (HTTP supports compressed responses, e.g. via
"Content-Encoding: gzip"). Of course, if the bot and its HTTP library
process the response as a stream instead of loading the whole thing into
RAM at once, they wouldn't much mind (apart from the CPU overhead of
decompressing it).
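
To illustrate the streaming point, here is a minimal Python sketch of how a
client might inflate a gzipped body in fixed-size pieces, so memory use stays
bounded however large the decompressed result is. The function name
count_decompressed_bytes, the 64 KiB chunk size and the optional limit
parameter are assumptions made for this example; nothing here is taken from
any actual spider.

```python
import gzip
import io
import zlib


def count_decompressed_bytes(stream, chunk_size=64 * 1024, limit=None):
    """Inflate a gzip stream piece by piece and return its decompressed size.

    Hypothetical helper: at most chunk_size bytes of output are held at a time.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = chunk
        while data:
            # Cap the output of each call; input that was not consumed
            # is handed back in decomp.unconsumed_tail.
            out = decomp.decompress(data, chunk_size)
            total += len(out)
            if limit is not None and total > limit:
                raise ValueError("decompressed body exceeds limit")
            data = decomp.unconsumed_tail
    total += len(decomp.flush())
    return total


if __name__ == "__main__":
    # A small stand-in for the "bomb": 5 MB of zero bytes, gzipped.
    bomb = gzip.compress(b"\0" * 5_000_000)
    print(len(bomb), "compressed bytes")
    print(count_decompressed_bytes(io.BytesIO(bomb)), "decompressed bytes")
```

A client written this way pays the CPU cost of inflating the data but never
holds more than one chunk of it, and passing a limit lets it give up early
instead of grinding through the whole payload.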

-- 
Craig Ringer



