[pkg-ntp-maintainers] Bug#683061: Bug#683061: ntp: diff for NMU version 1:4.2.6.p5+dfsg-2.1

Helmut Grohne helmut at subdivi.de
Tue Aug 28 08:06:16 UTC 2012


On Tue, Aug 28, 2012 at 09:22:41AM +0200, Kurt Roeckx wrote:
> There are existing bugs to relevant software already about how
> they misbehave where I know about it.

Please add affects indications to those bugs, to make it easier to catch
duplicates.

> Not responding is fine, responding with a temporary error is also
> fine.  Responding with an error indicating the hostname does not
> exist is not fine.  And I've seen way too many cases where ntpd
> gets as answer that the hostname does not exist.
> 
> > If ntp then fails to resolve debian.pool.ntp.org and
> > fails to repeat that after the DNS server is started thus failing to
> > synchronize time at all, is that a bug in ntp?
> 
> Does ntpd report anything in the log files?

Aug 28 08:19:22 localhost ntpd[28507]: ntpd 4.2.6p5 at 1.2349-o Sat May 12 09:54:55 UTC 2012 (1)
Aug 28 08:19:22 localhost ntpd[28508]: proto: precision = 0.698 usec
Aug 28 08:19:22 localhost ntpd[28508]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Aug 28 08:19:22 localhost ntpd[28508]: Listen and drop on 1 v6wildcard :: UDP 123
Aug 28 08:19:22 localhost ntpd[28508]: Listen normally on 2 lo 127.0.0.1 UDP 123
...
Aug 28 08:19:22 localhost ntpd[28508]: Listen normally on 11 lo ::1 UDP 123
Aug 28 08:19:22 localhost ntpd[28508]: peers refreshed
Aug 28 08:19:22 localhost ntpd[28508]: Listening on routing socket on fd #28 for interface updates
Aug 28 08:19:22 localhost ntpd[28508]: Deferring DNS for 0.debian.pool.ntp.org 1
Aug 28 08:19:22 localhost ntpd[28508]: Deferring DNS for 1.debian.pool.ntp.org 1
Aug 28 08:19:22 localhost ntpd[28508]: Deferring DNS for 2.debian.pool.ntp.org 1
Aug 28 08:19:22 localhost ntpd[28508]: Deferring DNS for 3.debian.pool.ntp.org 1
Aug 28 08:19:22 localhost ntpd[28517]: signal_no_reset: signal 17 had flags 4000000
Aug 28 08:19:24 localhost ntpd_intres[28517]: host name not found: 0.debian.pool.ntp.org
Aug 28 08:19:24 localhost ntpd_intres[28517]: host name not found: 1.debian.pool.ntp.org
Aug 28 08:19:24 localhost ntpd_intres[28517]: host name not found: 2.debian.pool.ntp.org
Aug 28 08:19:24 localhost ntpd_intres[28517]: host name not found: 3.debian.pool.ntp.org

After this, ntpq -p returns "No association ID's returned".

> If ntpd gets back that the hostname does not exist, I think it shouldn't
> keep trying for ever to try and resolve it.  I would argue that it
> had a bug in that case.

Agreed.
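
To make the distinction above concrete, here is a minimal Python sketch
of how a client could tell a temporary resolver failure apart from an
authoritative "host name not found". This is not ntpd's code. The names
classify() and resolve() and the retry policy they imply are assumptions
for illustration; only socket.getaddrinfo and the EAI_* constants come
from the standard library.

```python
import socket

# Assumed policy: EAI_AGAIN means "try again later",
# EAI_NONAME means "the resolver says this name does not exist".
TEMPORARY = {socket.EAI_AGAIN}
PERMANENT = {socket.EAI_NONAME}


def classify(err):
    """Map a socket.gaierror to 'temporary', 'permanent' or 'other'."""
    if err.errno in TEMPORARY:
        return "temporary"
    if err.errno in PERMANENT:
        return "permanent"
    return "other"


def resolve(host, port=123, lookup=socket.getaddrinfo):
    """Return (addresses, status); status is 'ok' or a classify() result.

    The lookup parameter exists only so the function can be exercised
    without a working name server.
    """
    try:
        infos = lookup(host, port, type=socket.SOCK_DGRAM)
    except socket.gaierror as err:
        return [], classify(err)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos}), "ok"


if __name__ == "__main__":
    for name in ("0.debian.pool.ntp.org", "1.debian.pool.ntp.org"):
        print(name, resolve(name))
```

A caller that retries on "temporary" but gives up on "permanent" would
show the behaviour seen in the log above if the resolver wrongly answers
"not found" while the real name server is not yet reachable.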

> I will make the change to put it in Required-Start, but that's
> not what this bug is really about.

Maybe it will, but for other reasons. See below.

> What does your /etc/nsswitch.conf say for hosts?

hosts:          files dns

> Can you reproduce this problem with the server normally started
> but unbound stopped?

Note that I am running resolvconf. Stopping unbound will change
/etc/resolv.conf. I currently cannot tell what resolv.conf looks like
during boot. Maybe it is initially empty? So to track down the issue I
did something else. I added iptables -A OUTPUT -p udp -d 127.0.0.1 -j
DROP to simulate a stopped name server, started ntp and then after some
time removed the iptables rule. As you claimed before, ntp works as
expected in this case and retries resolving until it succeeds. So more
likely the cause is related to /etc/resolv.conf being changed late. I
can only speculate here. Another possibility is that it contains some
value received from dhcp. It would then point to a possibly broken name
server I have no control over.

So arguably this issue stems from different assumptions about
/etc/resolv.conf (by resolvconf and ntp). You could say that resolvconf
is broken by design. I am not sure how to proceed here. Given that the
number of preconditions required to reproduce it is this high, I agree
with your downgrading of the severity. As far as I can see you need:

1) A name server that is started after ntp.
2) resolvconf
3) Maybe also a broken upstream name server.

I would say that adding Should-Start or Required-Start really is the fix
for this issue and I could not reproduce the issue after adding
Required-Start to my local init script. But I only understood why it
solves the issue after you made me look at it very hard. Thanks.
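
For anyone wanting to check a local init script for this ordering, here
is a small Python sketch that reads the LSB header and reports whether a
given facility is listed in Required-Start or Should-Start. The sample
header and the use of $named as the facility are assumptions for
illustration; they are not copied from the ntp package or from my local
script.

```python
import re


def parse_lsb_header(text):
    """Return a dict mapping LSB keywords to their list of values."""
    fields = {}
    in_block = False
    for line in text.splitlines():
        if line.startswith("### BEGIN INIT INFO"):
            in_block = True
            continue
        if line.startswith("### END INIT INFO"):
            break
        if not in_block:
            continue
        match = re.match(r"#\s*([A-Za-z-]+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).split()
    return fields


def starts_after(fields, facility):
    """True if facility is named in Required-Start or Should-Start."""
    wanted = fields.get("Required-Start", []) + fields.get("Should-Start", [])
    return facility in wanted


# Hypothetical header, for illustration only.
SAMPLE = """\
#!/bin/sh
### BEGIN INIT INFO
# Provides:          ntp
# Required-Start:    $network $remote_fs $syslog $named
# Required-Stop:     $network $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:
# Short-Description: Start NTP daemon
### END INIT INFO
"""

if __name__ == "__main__":
    print(starts_after(parse_lsb_header(SAMPLE), "$named"))
```

To check a real script, pass the contents of the file (for example
/etc/init.d/ntp) to parse_lsb_header() instead of SAMPLE.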

Please decide for yourself whether you want this change in wheezy. It
is fine with me to only have it in sid.

Helmut


