[gopher] The Gopher Archive
Kim Holviala
kim at holviala.com
Fri Apr 23 11:21:23 UTC 2010
On 22.4.2010 23:07, Brian Koontz wrote:
>> I'm archiving Teh Gopher. All of it - well all textual searchable
>> information, not binaries nor images...
>
> The next logical step would be to set up a mechanism to mirror the
> archive, because we all know what happens when one large repository
> suddenly goes down for what will likely be forever (hal3000.cx,
> anyone)?
Replace $ROOT with whatever directory you want to keep the files in.
$ rsync rsync.gophernicus.org::archive/
drwxr-xr-x 4096 2010/04/19 15:51:09 .
drwxr-xr-x 4096 2010/04/23 01:06:42 sites
$ rsync -avz --progress rsync.gophernicus.org::archive/ $ROOT/
receiving incremental file list
created directory $ROOT
./
sites/
sites/last
29 100% 28.32kB/s 0:00:00 (xfer#1, to-check=1066/1069)
sites/1/
sites/1/155.198.1.33:70/
sites/3/
sites/3/gopher.386server.info:70/
[...]
Archive directory structure is pretty simple: all of the sites are
under, uh, sites/ (more directories are coming under there) and they are
grouped by the first letter of the primary domain name.
So for example gopher.floodgap.com's port 70 can be found at
$ROOT/sites/f/gopher.floodgap.com:70/
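A rough sketch of that grouping rule as a shell function, in case it helps when scripting against a mirror. The name site_dir is made up for illustration. Reading "primary domain name" as the label just before the TLD (or the whole address for a bare IP) is an assumption based on the listing above; hosts under two-part TLDs such as .co.uk may well be filed differently.

```sh
# Hypothetical helper: guess the sites/ subdirectory for a host and port.
# ASSUMPTION: the grouping letter is the first character of the label
# just before the TLD, or of the address itself for a numeric host.
site_dir() {
    host=$1
    port=${2:-70}
    case $host in
        *[!0-9.]*)
            # Contains something other than digits and dots: a host name.
            primary=$(printf '%s\n' "$host" |
                awk -F. '{ if (NF >= 2) print $(NF-1); else print $1 }')
            ;;
        *)
            # Only digits and dots: treat it as an IP address.
            primary=$host
            ;;
    esac
    letter=$(printf '%s' "$primary" | cut -c1)
    printf 'sites/%s/%s:%s\n' "$letter" "$host" "$port"
}
```

With the hosts seen in the rsync listing, site_dir gopher.floodgap.com 70 prints sites/f/gopher.floodgap.com:70 and site_dir 155.198.1.33 prints sites/1/155.198.1.33:70.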
Under the site directory there are one or more subdirectories; the
archived files are under the cache/ directory. Under there you have
one-letter directories named after the first character of the md5 sum of
the original selector. The actual downloaded files are saved with the
md5 sum of the selector as the filename, and contain some mime headers, a
double CRLF and a bit-perfect unmodified copy of the original file.
Uh, complicated.
Let's take this file from floodgap:
/archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT
$ printf "/archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT" | md5sum
e9c26adf54530a785378971bbac7cd23 -
$ ls -la $ROOT/sites/f/gopher.floodgap.com\:70/cache/e/e9c26adf54530a785378971bbac7cd23
-rw-r--r-- 1 kimmy users 1334 2010-04-23 14:12 $ROOT/sites/f/gopher.floodgap.com:70/cache/e/e9c26adf54530a785378971bbac7cd23
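The lookup above could be wrapped in a small function along these lines. This is only a sketch: cache_path is a made-up name, it assumes GNU md5sum is installed (other systems may ship a differently named tool), and it assumes the selector is hashed exactly as sent, with no trailing newline.

```sh
# Hypothetical helper: path of the cached copy of a selector.
#   $1 = site directory, e.g. "$ROOT/sites/f/gopher.floodgap.com:70"
#   $2 = selector, byte-for-byte as requested from the server
cache_path() {
    # printf '%s' avoids a trailing newline and leaves any % in the
    # selector alone.
    sum=$(printf '%s' "$2" | md5sum | cut -d' ' -f1)
    # First character of the checksum names the one-letter directory.
    first=$(printf '%s' "$sum" | cut -c1)
    printf '%s/cache/%s/%s\n' "$1" "$first" "$sum"
}
```

Something like ls -la "$(cache_path "$ROOT/sites/f/gopher.floodgap.com:70" "/archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT")" should then point at the same file as the listing above.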
$ head -20 $ROOT/sites/f/gopher.floodgap.com:70/cache/e/e9c26adf54530a785378971bbac7cd23
Location: gopher://gopher.floodgap.com:70/0/archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT
Host: gopher.floodgap.com:70
Filetype: 0
Selector: /archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT
Referer: gopher://gopher.floodgap.com:70/1/archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80
Name: 00-INDEX.TXT
Title: /archive/walnut-creek-cd-simtel/BEEHIVE/MYZ80/00-INDEX.TXT
Date: 2010-Apr-23 11:12
Timestamp: 1272021150
Size: 892
MYZ80111.ZIP 105339 05-22-93 V1.11 Of Simeon Cran's CP/M emulator for the
| PC. This is Simeon Crans' complete CP/M
| package for the PC. It needs a 286 (or
| better) to run and is packed with goodies,
| such as the ability to run CP/M 2.2 or 3.0,
| 32-bit processor aware, multitasker aware,
| ADM3A/Televideo emulation, complete key
| re-mapping, etc etc. You've tried the rest,
| now try the BEST!! I haven't seen a better
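To pull the headers and the original file apart again, something like the following might do. It is a sketch only: the function names are invented, and it assumes the header block ends at the first empty line (the double CRLF) and that tail accepts the POSIX -c +N form.

```sh
# Hypothetical helper: print the header block, stopping before the
# first empty line. Trailing CRs are stripped from the header lines.
cache_headers() {
    awk '{ sub(/\r$/, "") } $0 == "" { exit } { print }' "$1"
}

# Hypothetical helper: print everything after the first empty line,
# byte for byte. awk only counts the header bytes; tail copies the rest.
cache_body() {
    offset=$(LC_ALL=C awk '{ n += length($0) + 1 }
        /^\r?$/ { print n + 1; exit }' "$1")
    # No empty line found: print nothing and return non-zero.
    [ -n "$offset" ] && tail -c +"$offset" "$1"
}
```

For the file above, cache_headers FILE | grep '^Size: ' should show the Size line, and cache_body FILE | wc -c would be expected to agree with it if the stored copy really is unmodified.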
- Kim