[Pkg-iscsi-maintainers] Bug#687619: Bug#687619: iscsitarget restart fail if more than 32 session try reconnect

Laszlo Fekete blackluck at ktk.bme.hu
Fri Sep 14 15:37:53 UTC 2012


On 2012. September 14. 20:51:39 Ritesh Raj Sarraf wrote:
> On Friday 14 September 2012 07:01 PM, Laszlo Fekete wrote:
> > In the logs reports iscsi connection error detected and try to recover.
> 
> Is there an error message/code ?
This is in the initiator logs:
Sep 13 14:40:09 mail01 iscsid: Kernel reported iSCSI connection 4:0 error (1020) state (3)
Sep 13 14:40:20 mail01 iscsid: connection4:0 is operational after recovery (2 attempts)
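
A minimal Python sketch for pulling the connection id, error code, state, and recovery attempt count out of the two iscsid messages above when scanning many initiator logs. The function name and the regular expressions are illustrative assumptions based only on these two messages; they are not part of open-iscsi, and other iscsid versions may word the messages differently.

```python
import re

# Patterns written against the two messages quoted in this mail (assumed format).
ERROR_RE = re.compile(
    r"iscsid: Kernel reported iSCSI connection (?P<conn>\d+:\d+) "
    r"error \((?P<error>\d+)\) state \((?P<state>\d+)\)"
)
RECOVERY_RE = re.compile(
    r"iscsid: connection(?P<conn>\d+:\d+) is operational after recovery "
    r"\((?P<attempts>\d+) attempts\)"
)


def parse_iscsid_line(line):
    """Return a dict describing the event, or None for unrelated lines."""
    m = ERROR_RE.search(line)
    if m:
        return {
            "event": "error",
            "conn": m.group("conn"),
            "error": int(m.group("error")),
            "state": int(m.group("state")),
        }
    m = RECOVERY_RE.search(line)
    if m:
        return {
            "event": "recovered",
            "conn": m.group("conn"),
            "attempts": int(m.group("attempts")),
        }
    return None


log = [
    "Sep 13 14:40:09 mail01 iscsid: Kernel reported iSCSI connection 4:0 "
    "error (1020) state (3)",
    "Sep 13 14:40:20 mail01 iscsid: connection4:0 is operational after "
    "recovery (2 attempts)",
]
events = [parse_iscsid_line(entry) for entry in log]
for event in events:
    print(event)
```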

> 
> >> If the iscsi target restart fail it random which initiator stuck, I think
> >> it
> > only depend on who is the faster to be in the first 32 session.
> > 
> >> I am not sure here. The open-iscsi default replacement timeout is 120
> >> secs.
> > 
> > Even then, when the target is back, it will poll it.
> > 
> > You're right, I meant for 5sec default settings this:
> > node.conn[0].timeo.noop_out_interval = 5
> > node.conn[0].timeo.noop_out_timeout = 5
> > and with 1 sec also this settings on those connections where using
> > multipath.
> Why do you change it to 1 ? That's a very low value and will just flood
> the target.
As I said, I am using multipath, so I want a connection/session error to be
detected quickly so that I/O can switch to the other path. That is why I am
using these values:
node.session.timeo.replacement_timeout = 5
node.session.err_timeo.abort_timeout = 5
node.session.err_timeo.lu_reset_timeout = 5
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.FastAbort = Yes
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].timeo.logout_timeout = 5
node.conn[0].timeo.login_timeout = 5
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 1
node.conn[0].timeo.noop_out_timeout = 1
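
To make the difference between the two setups concrete, here is a rough Python sketch that adds up the three timers involved. The additive model is an assumption: a NOP-Out ping is sent every noop_out_interval seconds, the connection is failed after noop_out_timeout seconds without a reply, and I/O is failed up to multipath only after replacement_timeout more seconds. Treat the result as an approximate upper bound, not a guarantee; the helper names are hypothetical.

```python
def parse_settings(text):
    """Parse 'key = value' lines in the style of iscsiadm node settings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def failover_budget(settings):
    """Approximate seconds until I/O errors reach the layer above iSCSI.

    Assumed model: ping interval + ping timeout + session replacement timeout.
    """
    interval = int(settings["node.conn[0].timeo.noop_out_interval"])
    timeout = int(settings["node.conn[0].timeo.noop_out_timeout"])
    replacement = int(settings["node.session.timeo.replacement_timeout"])
    return interval + timeout + replacement


# Values used on the multipath connections (from this mail).
MULTIPATH = """
node.session.timeo.replacement_timeout = 5
node.conn[0].timeo.noop_out_interval = 1
node.conn[0].timeo.noop_out_timeout = 1
"""

# open-iscsi defaults as mentioned earlier in this thread.
DEFAULTS = """
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
"""

multipath_budget = failover_budget(parse_settings(MULTIPATH))
default_budget = failover_budget(parse_settings(DEFAULTS))
print("multipath:", multipath_budget, "s")  # 1 + 1 + 5 = 7
print("defaults: ", default_budget, "s")    # 5 + 5 + 120 = 130
```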

But as I said, this also affected the initiators that don't use multipath
and have the default open-iscsi values.


There is an INCOMING_MAX limit of 32 in the source. I wrote about it a few
minutes before your last mail, and I hope you got that message. I think this
is the problem, and I will check it next week.
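
A toy Python model of the suspicion above, assuming the limit simply caps how many reconnecting sessions are served at once and that arrival order is effectively random. This is not the iscsitarget code, and how INCOMING_MAX is really enforced is exactly what still has to be checked; only the value 32 comes from the source.

```python
import random

INCOMING_MAX = 32  # value seen in the source; its exact semantics are assumed here


def simulate_reconnect(initiators, limit=INCOMING_MAX, seed=None):
    """Split initiators into those inside the limit and those left waiting.

    Arrival order is shuffled to mimic "whoever is faster gets in".
    """
    rng = random.Random(seed)
    order = list(initiators)
    rng.shuffle(order)
    return order[:limit], order[limit:]


# Hypothetical setup: 40 initiators reconnect after a target restart.
initiators = ["initiator%02d" % i for i in range(40)]
accepted, stuck = simulate_reconnect(initiators, seed=1)
print(len(accepted), "accepted,", len(stuck), "left waiting")
```

Running it with different seeds leaves a different set of eight initiators waiting each time, which would match the random pattern of stuck initiators if the assumption holds.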

> 
> >> Tried to check, maybe this is a network connection, but if the restart
> >> fail
> > 
> > and try to telnet to them sometimes it also don't answer (tcpdump show,
> > that the target server got the request, but don't send any answer).
> > 
> >> Checked that maybe on the iscsi target stop stucked, but it seems to be
> > 
> > okay, the session closed, the modules unloaded, there isn't any error,
> > tried to raise the sleep time before start to 10 sec from 1, but got the
> > same error.
> >> What 1 sec setting are you referring here?
> > 
> > Sleep time in init script which is in restart after the stop and before
> > the
> > start.
> 
> That sleep won't help.


