[Pkg-iscsi-maintainers] Bug#687619: Bug#687619: iscsitarget restart fail if more than 32 session try reconnect
Laszlo Fekete
blackluck at ktk.bme.hu
Fri Sep 14 13:31:50 UTC 2012
Hello,
On 2012. September 14. 17:40:48 you wrote:
> On Friday 14 September 2012 03:36 PM, Laszlo Fekete wrote:
>
> I have a fresh Debian squeeze install with the standard 2.6.32-5-amd64
> kernel and the iscsitarget + iscsitarget-dkms packages, plus several other
> squeeze machines running open-iscsi initiators.
> The problem, found after long testing, is that if there are more than 32
> active sessions when I run /etc/init.d/iscsitarget restart, it sometimes
> fails: no more than 32 initiators can reconnect.
> What is the error you see on the initiator? Does it report that the
> connection already exists?
>
The logs report that an iSCSI connection error was detected and that the
initiator is trying to recover.
>
> For more details:
> I have 16 targets with 40 active sessions in total. When I restart the iSCSI
> target, it sometimes fails (not every time; in 20-60% of my attempts). The
> first 32 clients reconnect without problems, but the others get no answer
> from the target and stop working. There is no error on the target or
> initiator side; the initiators just keep trying to reconnect.
> There are two types of initiators:
> - using one session to access the target over a single IP, with the default
>   open-iscsi limits and timeouts
> - using 4 sessions to access the two targets (some of the targets are
>   duplicated across 2 servers with DRBD), with 2 sessions/IPs per target.
>   These initiators use multipath and minimal (1 sec) timeouts.
> Among the sessions that failed to reconnect, both types include ones that
> were merely connected, with the target not mounted on that client.
>
> When the iSCSI target restart fails, which initiators get stuck is random; I
> think it only depends on which ones are fast enough to be among the first 32
> sessions.
> I am not sure here. The open-iscsi default replacement timeout is 120 secs.
> Even then, when the target is back, it will poll it.
>
You're right. By the 5 sec default settings I meant these:
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
and the same settings at 1 sec on the connections that use multipath.
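
For anyone trying to reproduce the multipath case, here is a minimal Python sketch that builds the iscsiadm commands for changing these two NOP-Out settings on a node record. It only assembles the argument lists and does not run anything. The target name and portal are hypothetical placeholders, and the helper name iscsiadm_update_commands is made up for illustration.

```python
from typing import List

# The two open-iscsi settings quoted in the message above.
NOOP_SETTING_NAMES = (
    "node.conn[0].timeo.noop_out_interval",
    "node.conn[0].timeo.noop_out_timeout",
)


def iscsiadm_update_commands(target: str, portal: str, seconds: int) -> List[List[str]]:
    """Build one 'iscsiadm -m node ... -o update' argv per NOP-Out setting."""
    commands = []
    for name in NOOP_SETTING_NAMES:
        commands.append([
            "iscsiadm", "-m", "node",
            "-T", target, "-p", portal,
            "-o", "update", "-n", name, "-v", str(seconds),
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical target and portal, for illustration only.
    for argv in iscsiadm_update_commands(
        "iqn.2012-09.example:disk1", "192.0.2.10:3260", 1
    ):
        print(" ".join(argv))
```

The commands are returned as lists so they could be passed to subprocess.run without shell quoting problems; an updated value normally applies from the next login of that session.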
>
> I checked whether this might be a network problem, but when the restart
> fails and I telnet to the target, it sometimes does not answer either
> (tcpdump shows that the target server receives the request but sends no
> reply).
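
The telnet check described above can be scripted. This is a rough Python sketch, assuming the target listens on the standard iSCSI port 3260; the function name portal_answers and the example address are hypothetical. Like telnet, it only tests whether the TCP connection is accepted, not whether an iSCSI login would succeed.

```python
import socket


def portal_answers(host: str, port: int = 3260, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical target address, for illustration only.
    print(portal_answers("192.0.2.10"))
```

A False result after the timeout, while tcpdump shows the request arriving, would match the behaviour reported here.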
>
> I checked whether the iSCSI target stop was getting stuck, but it seems
> fine: the sessions are closed, the modules are unloaded, and there is no
> error. I tried raising the sleep time before start from 1 sec to 10 sec, but
> got the same error.
> What 1 sec setting are you referring to here?
>
The sleep time in the init script's restart action, between the stop and the
start.
> So I think there is a limit somewhere that allows no more than 32 initiators
> to connect within a short time, but I could not find one.
> I checked the changelogs of newer iscsitarget versions and saw no report
> describing this problem, so I have not tried upgrading to wheezy.
>
> Debian GNU/Linux 6.0, kernel 2.6.32-5-amd64, libc 2.11.3-3
>
> There are settings in ietd.conf where you can set the Max number of
> connections/sessions. Have you explored them?
>
I tried setting these to a high number, and to 0 to disable the limit (which
is the default for max sessions, according to the manual), but nothing changed.
I also tried changing the ImmediateData and InitialR2T settings and adding
NOPInterval 1
NOPTimeout 1
to all targets (for testing, I tried each setting on some targets and on all
targets). These settings don't solve the problem.
But if I reduce the number of active sessions to 32 or fewer (by stopping some
of the initiators), the restart works every time. If I then start the stopped
initiators one by one, there is no problem. It only fails when I restart the
iSCSI target with more than 32 active sessions.
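
To check how many sessions are active before and after a restart, something like the following Python sketch could be used on the target. It assumes that IET lists sessions in /proc/net/iet/session with one "tid:... name:..." line per target followed by one indented "sid:..." line per session; that layout is an assumption about the installed version, and the function name count_sessions is made up for illustration.

```python
import os
from typing import Dict, Optional

SESSION_FILE = "/proc/net/iet/session"  # assumed location on an IET target


def count_sessions(session_text: str) -> Dict[str, int]:
    """Count 'sid:' lines under each 'tid:' line, keyed by target name."""
    counts: Dict[str, int] = {}
    current: Optional[str] = None
    for raw in session_text.splitlines():
        line = raw.strip()
        if line.startswith("tid:"):
            # Split "key:value" fields; the name itself may contain colons.
            fields = dict(f.split(":", 1) for f in line.split() if ":" in f)
            current = fields.get("name", fields["tid"])
            counts[current] = 0
        elif line.startswith("sid:") and current is not None:
            counts[current] += 1
    return counts


if __name__ == "__main__":
    if os.path.exists(SESSION_FILE):
        with open(SESSION_FILE) as handle:
            per_target = count_sessions(handle.read())
        for name, number in sorted(per_target.items()):
            print(name, number)
        print("total sessions:", sum(per_target.values()))
```

Comparing the total before the restart with the total a minute or two afterwards would show directly whether only 32 sessions came back.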
Regards, blackluck
--
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System