[gopher] Spidering the gopherspace
Kim Holviala
kim at holviala.com
Wed Apr 14 18:29:41 UTC 2010
On 2010-04-14 17:41, John Goerzen wrote:
> You may be interested in
> http://git.complete.org/z-old/gopherbot/
>
> though my goal was somewhat different; I aimed to archive all of
> gopherspace, not search it.
I'm actually doing both, searching and caching (not really archiving).
After giving it some thought, I figured out that indexing offline content
is easier - I can reindex as many times as I want without redownloading.
I'd take a look at your code, but it's in Haskell and uses a database...
Actually, the reason I'm building the search engine is that I've ALWAYS
wanted to write my own. And spidering the gopherspace is actually fun,
unlike spidering the web...
> I believe it was actually successful; I
> burned some DVDs and shared them with a few people on this mailing list
> at the time. IIRC it was about 5 DVDs.
Thanks for that info - I was wondering about how much disk space I
should allocate for the cache...
- Kim