[gopher] Improving Gopher Searches

Cameron Kaiser spectre at floodgap.com
Thu Feb 17 17:57:05 UTC 2011


> Improving the "freshness" of the Veronica-2 / VISHNU database
> is probably the best way to ensure that recently added material
> is findable.  That got me wondering if perhaps what's needed
> is a local indexer that could be run at least daily, with the
> resulting data ball either left in the root directory for
> retrieval by, or sent in to, the Veronica-2 / VISHNU server(s).
> The idea is to reduce the amount of gopherspace that needs to
> be actively crawled by the Veronica-2 / VISHNU servers which,
> from Cameron's posts, appear to take a while and aren't very
> frequent.
> 
> I'm not very knowledgeable on database management; is the above
> scheme feasible? If so, what should the data ball look like?
> A flat file seems like it'd be adequate and leaves the "how" up
> to individual server operators.
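
For illustration only, the flat file in the quoted proposal could be as
simple as one gopher-menu-style line per item (type and display string,
selector, host, port, separated by tabs). This is a minimal sketch under
assumptions: the layout is not taken from Veronica-2, index_lines is a
made-up name, and every file is treated as item type 0 (plain text).

```python
import os


def index_lines(root, host, port=70):
    """Yield one tab-separated, gopher-menu-style line per file under root.

    Assumptions (hypothetical, for illustration): every file is reported
    as item type '0' (text), and the selector is the path relative to root.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            selector = "/" + rel
            yield "0%s\t%s\t%s\t%d" % (name, selector, host, port)
```

An operator could write these lines to a file from a daily cron job;
how the result gets to the search server is left open, as in the
proposal.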

Jeff and I talked a bit about this offline, so I'll just post this
here. The POWER6 will significantly mitigate this problem simply
by throwing more hardware at it. helsinki is a 500MHz G3 with 1GB of
RAM and a 40GB PATA drive. By modern standards that isn't much power,
memory or storage bandwidth. To get adequate search speed, searches are
cached against a particular database revision, and the database is split
into live and search partitions so that the robot isn't continually
invalidating any wins we get by caching search results.
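
As a rough illustration of why the split matters, here is a sketch of
caching keyed on database revision. It is not the actual Veronica-2
code; RevisionCache, search_fn and publish_revision are made-up names,
and the assumption is that the revision only changes when the robot's
live partition is promoted to the search partition.

```python
class RevisionCache:
    """Cache search results per (database revision, query)."""

    def __init__(self, search_fn):
        # search_fn is hypothetical: it runs a query against the
        # search partition and returns the results.
        self._search_fn = search_fn
        self._revision = 0
        self._results = {}

    def publish_revision(self):
        """Call when a new revision of the search partition goes live."""
        self._revision += 1
        # Results cached for older revisions are stale; drop them.
        self._results = {
            key: value
            for key, value in self._results.items()
            if key[0] == self._revision
        }

    def search(self, query):
        key = (self._revision, query)
        if key not in self._results:
            self._results[key] = self._search_fn(query)
        return self._results[key]
```

If the robot wrote straight into the partition being searched, every
write would amount to a new revision and the cache would rarely hit.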

uppsala, on the other hand, is a dual-core 4.2GHz POWER6 with 8GB of
RAM and three 15Krpm 146GB SAS drives in RAID 5. This is fast enough
to run a single database: the robot crawls continuously and searches
run against the live data. The tables mostly fit in RAM, and hitting
the disk is a minimal penalty in wall-clock terms.

I finished uppsala's OS updates this past weekend and I hope to start
installing packages on it this weekend. If all goes well, it should be
crawling for its new "fresh" unified database by March.

-- 
------------------------------------ personal: http://www.cameronkaiser.com/ --
  Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckaiser at floodgap.com
-- Success can eliminate as many options as failure. -- Tom Robbins -----------


