[gopher] A little Gopher crawler

James Mills prologic at shortcircuit.net.au
Thu Dec 25 10:18:55 UTC 2014


Hi again,

I've since improved that crawler (it's still single-threaded),
and here are the new numbers:

prologic at daisy
Thu Dec 25 20:06:53
~/tmp
$ time ./gspider.py &> index

real 0m6.244s
user 0m0.120s
sys 0m0.047s

prologic at daisy
Thu Dec 25 20:07:16
~/tmp
$ wc -l index
3039 index

This was run over localhost, against my cgod Python Gopher server
on the same machine (see other thread).

The performance is *MUCH* better :) Roughly 6 seconds for 3039
index lines, versus about 5.5 minutes for 355 lines last time.
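
For anyone following along, a single-threaded crawl of this kind can be
sketched roughly as below. This is not the actual gspider.py from the gist;
the names fetch, parse_menu and crawl are made up for illustration, and it
assumes plain RFC 1436 menus and only follows type "1" (menu) items that
stay on the starting host and port.

```python
import socket
from collections import deque


def fetch(host, port, selector, timeout=10.0):
    """Send one Gopher request and return the raw response as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # A Gopher request is just the selector followed by CRLF.
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_menu(text):
    """Yield (item_type, display, selector, host, port) per menu line."""
    for line in text.splitlines():
        if line == ".":  # end-of-menu marker
            break
        parts = line[1:].split("\t")
        if len(parts) < 4:  # blank or malformed line, skip it
            continue
        try:
            port = int(parts[3])
        except ValueError:
            continue
        yield line[0], parts[0], parts[1], parts[2], port


def crawl(host, port=70, start="", fetcher=fetch):
    """Breadth-first, single-threaded walk of the menus on one server."""
    seen = {start}
    queue = deque([start])
    while queue:
        selector = queue.popleft()
        try:
            text = fetcher(host, port, selector)
        except OSError:
            continue  # unreachable or timed out, move on
        for item in parse_menu(text):
            item_type, _display, sel, item_host, item_port = item
            if item_type == "i":  # informational line, not a real item
                continue
            yield item
            same_server = (item_host, item_port) == (host, port)
            if item_type == "1" and same_server and sel not in seen:
                seen.add(sel)
                queue.append(sel)


if __name__ == "__main__":
    # Assumes a Gopher server is listening on localhost:70.
    for item_type, display, sel, item_host, item_port in crawl("localhost"):
        print(f"{item_type}\t{display}\t{sel}\t{item_host}\t{item_port}")
```

The fetcher argument is only there so the crawl logic can be exercised
without a live server. Everything happens one request at a time, so each
round trip is paid in full before the next one starts.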

cheers
James


James Mills / prologic

E: prologic at shortcircuit.net.au
W: prologic.shortcircuit.net.au

On Mon, Dec 15, 2014 at 3:23 PM, James Mills <prologic at shortcircuit.net.au>
wrote:
>
> Hi All and Kim (author of Gophernicus!),
>
> I wrote this little crawler today:
> https://gist.github.com/b781e02b0299fef1f3f6
>
> I'm a bit disappointed in the performance of crawling my local
> Gopherspace, though (basically via localhost):
>
> prologic at daisy
> Mon Dec 15 15:13:08
> ~/tmp
> $ time ./gspider.py &> index
>
> real 5m27.825s
> user 0m6.126s
> sys 0m5.825s
>
> prologic at daisy
> Mon Dec 15 15:18:51
> ~/tmp
> $ wc -l index
> 355 index
>
> Any comments? :)
>
> cheers
> James
>
> James Mills / prologic
>
> E: prologic at shortcircuit.net.au
> W: prologic.shortcircuit.net.au
>