Bug#556610:

Sergey Kirpichev skirpichev at gmail.com
Thu Jun 14 07:27:23 UTC 2012


On Mon, Apr 16, 2012 at 4:16 PM,  <bug556610 at arcor.de> wrote:
> According to the proposed patch I had a thought and 4 proposals to also cover
> the case of systems that are not powered on for 24 hours 7 days a week.
> (i.e. power saving policy in place)

Looks good, let's see.

> 0) Because checking an array is such a long operation, it may be desirable for
> checkarray to distinguish --reset and --interrupt options instead of an
> ambiguous --cancel, and be able to --resume any previously started check.

Added to the new patchset (coming soon).  The --resume option looks
redundant for now: the default action (check) will resume a previously
started check or start a new one.

> Overall, the kernel should correctly stop an attempt for an automatic check while a
> manual check is running. If a manual check is interrupted manually or by shutting
> down, it may be continued manually or by the next automatic check, if
> enabled. However, interrupting an automatic check should not interrupt manual
> checks. The easiest would be to just skip automatic checks if checks
> are already running. Let's see.

Right now, this (start/stop/interrupt) can be done by the sysadmin via
the kernel interfaces in /sys/block/mdX/md/...  Thus, there is no
easy way to distinguish "manual" and "automatic" checks.  That's a
good reason to keep the implementation as simple as possible.
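
To illustrate the point, here is a minimal sketch of how a check is
started and interrupted through sysfs.  It assumes the standard md
attribute sync_action under /sys/block/mdX/md/; the function names and
the sysfs_root parameter (there only so the code can be tried against a
scratch directory) are made up for this example and are not part of
checkarray.

```python
from pathlib import Path


def md_dir(array: str, sysfs_root: str = "/sys") -> Path:
    """Directory holding the md attributes of one array, e.g. md0."""
    return Path(sysfs_root) / "block" / array / "md"


def current_action(array: str, sysfs_root: str = "/sys") -> str:
    """Return the content of sync_action, e.g. 'idle' or 'check'."""
    return (md_dir(array, sysfs_root) / "sync_action").read_text().strip()


def start_check_if_idle(array: str, sysfs_root: str = "/sys") -> bool:
    """Request a check only if nothing is running; True if requested.

    The kernel does not record who wrote to sync_action, so a running
    check looks the same whether cron or the admin started it.
    """
    if current_action(array, sysfs_root) != "idle":
        return False
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")
    return True


def interrupt(array: str, sysfs_root: str = "/sys") -> None:
    """Stop whatever sync action is running by writing 'idle'."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("idle\n")
```

Skipping the automatic check when sync_action is not "idle" is about as
much "manual vs. automatic" logic as these interfaces allow.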

> 1) If writing the sync_completed value to sync_min upon --interrupt does
> not survive a reboot, checkarray would definitely have to save the value
> separately. Otherwise, all it would ever check is always the same small part
> at the beginning of the array.

...  Just as it does now.

And /etc/init.d/mdadm can (optionally!) do this save/restore.  Why checkarray?

> (Thus, yes, Sergey I'd say it may be a good idea to let --interrupt "dump" e.g.
> "mdX-next_checkarray_block" state files under /var/lib/mdadm/ for all checks
> that are still running, and delete any other files to ensure they start fresh next time.
> The --reset option may just delete all state files.)

For now, we can save this dump as an option (default: off).  But this
ugly "database" (and the "autocheck-running" file too) shouldn't be
introduced "at all costs" in checkarray.  It looks like a job for
/etc/init.d/mdadm, see above.
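
For reference, a rough sketch of what such an optional save/restore
could look like, wherever it ends up living.  It assumes that
sync_completed reads either "none" or "<done> / <total>" in sectors and
that the saved position is written back to sync_min.  The state file
path and all function names are hypothetical, and chunk-size alignment
of sync_min (which some kernels may require) is not handled.

```python
from pathlib import Path
from typing import Optional


def parse_sync_completed(text: str) -> Optional[int]:
    """Turn '1234 / 5678' into 1234; 'none' means nothing is running."""
    text = text.strip()
    if text == "none":
        return None
    done, _, _total = text.partition("/")
    return int(done)


def save_position(md: Path, state_file: Path) -> bool:
    """Dump the current check position; True if something was saved.

    md is the array's attribute directory (e.g. /sys/block/md0/md).
    A stale state file is removed when no check is running, so the
    next check starts from the beginning.
    """
    pos = parse_sync_completed((md / "sync_completed").read_text())
    if pos is None:
        if state_file.exists():
            state_file.unlink()
        return False
    state_file.write_text(f"{pos}\n")
    return True


def restore_position(md: Path, state_file: Path) -> bool:
    """Write a saved position to sync_min; True if one was restored."""
    if not state_file.exists():
        return False
    (md / "sync_min").write_text(state_file.read_text())
    return True
```

An init script would call something like save_position() on stop and
restore_position() on start, and only when the admin has enabled it.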

> 2) Cronjobs defined in /etc/cron.d are never run if the machine is off during
> the scheduled times (i.e. currently checkarray is never run). However, this
> should be an easy thing to fix, because anacron is installed by default:

Why can't the local admin just change the cron job schedule?

> Simply, drop /etc/cron.d/mdadm from the package, and modify the
> /etc/cron.daily/mdadm script to run checkarray --cron --idle --all --quiet.

Just starting the array check is NOT enough (with incremental checks).
You also have to stop it.

> If the machine is running 24h, by default cron will run anacron at 07:30
> in the morning (a time the admin may ensure is before office hours). Anacron then
> takes care of cron.daily. If the machine starts up earlier, cron.daily runs earlier.
> Just letting the machine boot early enough ensures cron.daily tasks are done in time.
> If anacron is uninstalled cron will run cron.daily at 06:25.
>
> Limiting the time for performing array checks, in addition to using
> the --idle switch may be realized by adapting the following:
>
> 3) Instead of defining a separate cron job that calls checkarray --cancel later
> (which may never happen if the machine is shut down and is not possible using
> cron.daily), let checkarray --cron touch say /var/lib/mdadm/autocheck-running file
> and start the checks, if there are currently no (manual) checks running.
> Then sleep say for an AUTOCHECK_TIMESLICE duration defined in
> /etc/default/mdadm. After the sleep, if the autocheck-running file is still there,
> remove the file and proceed with the --interrupt action, otherwise exit.

1) The array check will be stopped on shutdown anyway.  We just need to
   figure out whether we want to save/restore sync_min/sync_max on
   shutdown/reboot.
2) It's not a good idea to reinvent cron, IMHO.  But we can instead use
   sync_max to restrict the interval of the daily checks (my new patch).
   So /etc/cron.daily/mdadm will just cancel the previous check and then
   run the new one.
3) Anyway, a real patch from you can tell us much more than a long
   specification, right? ;)
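
To make point 2 concrete, here is one way the daily window could be
computed.  This is only an illustration and not the code from the
patch: the function name, the slice size, and the wrap-around rule are
all assumptions.

```python
from typing import Tuple


def next_window(start: int, slice_sectors: int,
                array_sectors: int) -> Tuple[int, int, int]:
    """Return (sync_min, sync_max, next_start) for one daily run.

    start         -- sector where the previous run stopped
    slice_sectors -- how many sectors to check per run (assumed knob)
    array_sectors -- total size of the array in sectors
    """
    if start >= array_sectors:
        start = 0  # stale or out-of-range position: begin again
    end = min(start + slice_sectors, array_sectors)
    # Once the end of the array is reached, the next run wraps to 0.
    next_start = 0 if end >= array_sectors else end
    return start, end, next_start


# Three runs over a 250-sector array, 100 sectors per run:
print(next_window(0, 100, 250))    # (0, 100, 100)
print(next_window(100, 100, 250))  # (100, 200, 200)
print(next_window(200, 100, 250))  # (200, 250, 0)
```

The daily job would write the first two values to sync_min and sync_max
before requesting the check, so the check stops by itself at the end of
the window and no second cron job is needed to cancel it.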

> 4) /etc/init.d/mdadm should call checkarray --interrupt to trigger proper
> saving of all states in case the machine is shut down before an
> AUTOCHECK_TIMESLICE has ended.

Thus, /etc/init.d/mdadm can handle almost all of the "black magic" of
saving and restoring sync_min/sync_max (default: don't do so!).




