[gopher] Spidering the gopherspace

Kim Holviala kim at holviala.com
Wed Apr 14 18:29:41 UTC 2010


On 2010-04-14 17:41, John Goerzen wrote:

> You may be interested in
> http://git.complete.org/z-old/gopherbot/
>
> though my goal was somewhat different; I aimed to archive all of
> gopherspace, not search it.

I'm actually doing both, searching and caching (not really archiving).
After giving it some thought, I figured that indexing offline content is
easier: I can reindex as many times as I want without redownloading.
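
To illustrate the cache-then-index idea, here is a minimal sketch in Python.
It is not the code of the actual spider: the names (cache_path, get_cached,
build_index), the hashed one-file-per-selector cache layout and the naive
word splitting are all assumptions made for the example. The only protocol
detail it relies on is that a Gopher request is the selector followed by
CRLF, with the reply read until the server closes the connection.

```python
import hashlib
import os
import re
import socket
from collections import defaultdict


def cache_path(root, host, port, selector):
    # Assumed layout: one file per (host, port, selector), named by a hash.
    key = f"{host}:{port}/{selector}".encode("utf-8")
    return os.path.join(root, hashlib.sha1(key).hexdigest())


def fetch(host, port, selector, timeout=10):
    # Gopher request: send the selector plus CRLF, read until the server closes.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def get_cached(root, host, port, selector, fetcher=fetch):
    # Return the cached copy if there is one; otherwise download and store it.
    path = cache_path(root, host, port, selector)
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    data = fetcher(host, port, selector)
    os.makedirs(root, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data


def build_index(root):
    # Build an inverted index (word -> set of cache file names) from disk only.
    # No network access happens here, so it can be rerun as often as needed.
    index = defaultdict(set)
    for name in os.listdir(root):
        with open(os.path.join(root, name), "rb") as f:
            text = f.read().decode("utf-8", errors="replace")
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index
```

Because build_index only reads the cache directory, changing the tokenizer
or the index format means rerunning it over the files already on disk,
without contacting any server again.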

I'd take a look at your code, but it's in Haskell and uses a database...

Actually, the reason I'm doing the search engine is that I've ALWAYS 
wanted to do my own search engine. And spidering the gopherspace is 
actually fun, unlike spidering the web...

> I believe it was actually successful; I
> burned some DVDs and shared them with a few people on this mailing list
> at the time.  IIRC it was about 5 DVDs.

Thanks for that info - I was wondering about how much disk space I 
should allocate for the cache...



- Kim



