Debian machines are blacklisted on http://bugs.pilot-link.org?

Sandro Tosi morph at debian.org
Sat Jun 6 17:30:12 UTC 2009


Hi David,
thanks for your reply.

On Sat, Jun 6, 2009 at 19:01, David A. Desrosiers
<david.a.desrosiers at gmail.com> wrote:
> On Sat, 2009-06-06 at 18:49 +0200, Ludovic Rousseau wrote:
>> We do a page access for each forward link, every time the tool is
>> executed.
>
> Each time "btslink" hits the server, it pulls 85 pages in under 3
> minutes. For a data-driven site that takes 2-3 seconds to execute the
> query, that unnecessarily hammers the database server, so the system
> automatically throttled it back.

Ok, I understand your point. Let's consider the ideal situation:

- only one Debian machine runs bts-link
- bts-link is usually executed twice a week.

Given this, is the workload you described acceptable on your side, or
is it a no-no in every case?

Let me briefly introduce how bts-link works:

1. gather all the links to remote issue trackers from all Debian bugs
2. query each of those links to see if the status has been updated
3. send a notification to each Debian bug whose remote bug was updated

To do point 2 we run 4 threads in parallel through all the remote issue
trackers (so it's unlikely that all 4 threads are querying your host).
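
To make the three steps above concrete, here is a rough Python sketch of
one run. It is not the actual bts-link code: the function name
find_updates, the shape of the "links" and "last_known" mappings and the
fetch_status callable are all assumptions made for illustration. Only
the figure of 4 parallel workers comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor


def find_updates(links, fetch_status, last_known, workers=4):
    """Return (bug, url, status) for every remote bug whose status changed.

    links        -- hypothetical mapping: Debian bug number -> remote URL (step 1)
    fetch_status -- hypothetical callable: remote URL -> status string (step 2)
    last_known   -- hypothetical mapping: remote URL -> status seen last run
    workers      -- number of parallel threads (4, as described above)
    """
    items = sorted(links.items())
    urls = [url for _, url in items]

    # Step 2: query every remote link, at most `workers` at a time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(fetch_status, urls))

    # Step 3: keep only the bugs whose remote status differs from last run;
    # these are the ones that would receive a notification.
    changed = []
    for (bug, url), status in zip(items, statuses):
        if last_known.get(url) != status:
            changed.append((bug, url, status))
    return changed
```

In this sketch a bug is reported only when the status differs from the
one recorded on the previous run, so an unchanged remote bug produces no
notification.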

>> Anyhow, we run bts-link only twice a week (at times even less
>> frequently). Is it possible to ask for definitive whitelisting to the
>> admin? At least for the merkel ip.
>
> Then someone is spoofing your UserAgent from several places, because we
> get one spider hitting us every single day with the UA of "btslink",
> reaching for the same content (and not even using HTTP/1.1, which would
> be significantly more efficient, or checking Last-Modified or Expires
> headers).

Are the other hosts on the same network as inf-6632.int-evry.fr? If so,
it's one of our collaborators, who probably has a local copy running.
I have already sent him an email asking him to stop that execution.
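
On the Last-Modified point, a conditional request could look roughly
like the sketch below. This only illustrates the idea and is not what
bts-link does today; the function name build_conditional_request and the
last_checked parameter are assumptions, and the "btslink" User-Agent is
simply the one you quoted.

```python
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, last_checked=None, user_agent="btslink"):
    """Build a GET request that asks the server to skip unchanged pages.

    last_checked is a hypothetical Unix timestamp of the previous run.
    When it is given, the request carries an If-Modified-Since header,
    so a server that supports it can answer 304 Not Modified instead of
    regenerating the page.
    """
    headers = {"User-Agent": user_agent}
    if last_checked is not None:
        # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        headers["If-Modified-Since"] = formatdate(last_checked, usegmt=True)
    return urllib.request.Request(url, headers=headers)
```

Passing the result to urllib.request.urlopen() would raise an HTTPError
with code 304 when the page has not changed, which the caller can treat
as "status unchanged".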

> But further digging shows that the IPs that reverse to those hosts are
> not blocked in any way from our end. Can you give me a set of IPs which
> you see as being "blacklisted" so I can check those instead? Or can you
> reach the bugtracker pages now?

Now I'm able to see the pages (FTR, I tried
http://bugs.pilot-link.org/1053 around 19:15 GMT+2), so as of now I
don't have an IP that is "supposed to be blacklisted".

I'd like to know whether the proposed workload is acceptable and
whether, in case some "reaction" (like blacklisting or throttling) is
triggered on your side, you can somehow whitelist the IP of our official
machine, where bts-link currently runs (merkel.debian.org - 192.25.206.16).

Thanks a lot for your collaboration.

Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi



More information about the Bts-link-devel mailing list