[Pkg-iscsi-maintainers] Bug#687619: Bug#687619: iscsitarget restart fail if more than 32 session try reconnect

Ritesh Raj Sarraf rrs at debian.org
Tue Sep 18 12:57:58 UTC 2012


Hello Laszlo,

Glad that this solved the problem. I have CCed the upstream devs and the
list. If it gets committed upstream, I can consider pulling it.


On Tuesday 18 September 2012 03:18 PM, Laszlo Fekete wrote:
> Hello!
> 
> The INCOMING_MAX parameter change (in source code) solved the problem.
> 
> http://blog.wpkg.org/2007/09/09/solving-reliability-and-scalability-problems-with-iscsi/
> 
> I think it would be very helpful if this parameter could be set through a
> global variable at iscsitarget daemon start, or if an error log entry were
> created when the limit is reached. Even a simple warning in the init script
> that a restart may fail when more than 32 sessions are active would help.
> 
> Regards, blackluck
> 
> On 2012. September 14. 23:48:27 Laszlo Fekete wrote:
>> On 2012. September 15. 01:06:27 Ritesh Raj Sarraf wrote:
>>> On Friday 14 September 2012 09:07 PM, Laszlo Fekete wrote:
>>>>> Is there an error message/code ?
>>>>
>>>> This is in the initiator logs:
>>>> Sep 13 14:40:09 mail01 iscsid: Kernel reported iSCSI connection 4:0 error (1020) state (3)
>>>> Sep 13 14:40:20 mail01 iscsid: connection4:0 is operational after recovery (2 attempts)
>>>
>>> So the connection did recover.
>>
>> Yes, it recovers, but only because another 1-5 iSCSI target restarts follow
>> the first failed restart; the initiator doesn't see any change if the
>> target restart failed.
>> The connection recovers only after a successful restart, but not every
>> restart is successful if more than 32 sessions try to recover in a short
>> time.
>>
>>>>> Why do you change it to 1 ? That's a very low value and will just flood
>>>>> the target.
>>>>
>>>> As I said, I'm using multipath, so I want a fast response to a
>>>> connection/session error in order to switch to the other path. That's
>>>> why I'm using
>>>
>>> The multipath path checker loop triggers every 5 seconds.
>>>
>>>> these values:
>>>> node.session.timeo.replacement_timeout = 5
>>>> node.session.err_timeo.abort_timeout = 5
>>>> node.session.err_timeo.lu_reset_timeout = 5
>>>> node.session.err_timeo.host_reset_timeout = 60
>>>> node.session.iscsi.FastAbort = Yes
>>>> node.session.iscsi.InitialR2T = No
>>>> node.session.iscsi.ImmediateData = Yes
>>>> node.session.iscsi.FirstBurstLength = 262144
>>>> node.session.iscsi.MaxBurstLength = 16776192
>>>> node.conn[0].timeo.logout_timeout = 5
>>>> node.conn[0].timeo.login_timeout = 5
>>>> node.conn[0].timeo.auth_timeout = 45
>>>> node.conn[0].timeo.noop_out_interval = 1
>>>> node.conn[0].timeo.noop_out_timeout = 1
>>>>
>>>> But as I said, this also affected initiators that don't use multipath
>>>> and have the default open-iscsi values.
>>>>
>>>>
>>>> There is an INCOMING_MAX limit of 32 in the source; I wrote about that a
>>>> few minutes before your last mail, hope you got it. I think that will be
>>>> the problem and I will check it next week.
>>>
>>> Okay!! Let me know what your findings are. From what you have shared up
>>> till now, I don't see much of a problem with IET or open-iscsi.
>>
>> The problem is that if there are more than 32 active connections when the
>> iSCSI target is restarted, the restart may fail without any error in the
>> logs; the initiators just keep trying to reconnect.
>>
>> You can tell me to raise the timeouts, but that's still a lottery. If I have
>> 80 sessions when restarting the target and 35 of them try to reconnect at
>> the same time, it will also fail, and there is no error message.
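
The arithmetic in the paragraph above can be sketched as a toy model. This
is not the ietd code: it only assumes that logins arriving in one burst
compete for a fixed number of slots (INCOMING_MAX, 32 as quoted in this
thread) and that anything beyond that is turned away without a log entry.
The function name admit_logins is hypothetical.

```python
# Toy model of a fixed-size table of pending logins; NOT the ietd implementation.
INCOMING_MAX = 32  # limit quoted in this thread


def admit_logins(simultaneous_logins, incoming_max=INCOMING_MAX):
    """Return (accepted, rejected) for a burst of logins arriving together."""
    accepted = min(simultaneous_logins, incoming_max)
    rejected = simultaneous_logins - accepted
    return accepted, rejected


if __name__ == "__main__":
    # 35 of 80 sessions reconnecting at once is the case described above.
    for burst in (20, 32, 35, 80):
        accepted, rejected = admit_logins(burst)
        print(f"burst={burst:3d} accepted={accepted:3d} rejected={rejected:3d}")
```

Under this assumption a burst of 35 leaves 3 logins without a slot, no
matter how the initiator timeouts are tuned; only a larger limit or a
smaller burst changes the outcome.
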
>>
>>
>> I hope increasing the default INCOMING_MAX 32 setting in the source code
>> will solve the problem. (Next week I'm going to test this.)
>>
>> If you say this isn't a bug, that's fine, because this is a limit in the
>> source code (if it really is the problem) and can't be configured
>> dynamically. But this wasn't clear to me, and I spent 4 days debugging
>> only to suspect that maybe there is a limit of 32 somewhere.
>>
>> So maybe a warning message in the init script would be helpful if there
>> are more than 32 active sessions, or an error log entry when the
>> incoming_max limit is reached.
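
A minimal sketch of the pre-restart warning suggested above, written in
Python rather than init-script shell. It assumes the active sessions are
listed in /proc/net/iet/session with one line per session beginning with
"sid:"; both the path and the line format are assumptions here, and the
helper names count_sessions and restart_warning are hypothetical.

```python
import sys

INCOMING_MAX = 32  # limit quoted in this thread
SESSION_FILE = "/proc/net/iet/session"  # assumed location of the session list


def count_sessions(session_text):
    """Count lines that start with 'sid:' (assumed to mark one session each)."""
    return sum(1 for line in session_text.splitlines()
               if line.strip().startswith("sid:"))


def restart_warning(session_text, limit=INCOMING_MAX):
    """Return a warning string when the session count exceeds limit, else None."""
    n = count_sessions(session_text)
    if n > limit:
        return f"warning: {n} active sessions exceed {limit}; restart may fail"
    return None


if __name__ == "__main__":
    try:
        with open(SESSION_FILE) as fh:
            message = restart_warning(fh.read())
    except OSError as exc:
        message = f"could not read {SESSION_FILE}: {exc}"
    if message:
        print(message, file=sys.stderr)
```

Such a check only warns; it does not make the restart any safer by itself.
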


-- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System
