Bug#556610:
Sergey Kirpichev
skirpichev at gmail.com
Thu Jun 14 07:27:23 UTC 2012
On Mon, Apr 16, 2012 at 4:16 PM, <bug556610 at arcor.de> wrote:
> According to the proposed patch I had a thought and 4 proposals to also cover
> the case of systems that are not powered on for 24 hours 7 days a week.
> (i.e. power saving policy in place)
Looks good, let's see.
> 0) Because checking an array is such a long operation, it may be desirable for
> checkarray to distinguish --reset and --interrupt options instead of an
> ambiguous --cancel, and be able to --resume any previously started check.
Added to the new patchset (coming soon). The --resume option looks
redundant for now: the default action (check) will resume a previously
started check or start a new one.
> Overall, the kernel should correctly stop an attempt for an automatic check while a
> manual check is running. If a manual check is interrupted manually or by shutting
> down, it may be continued manually or by the next automatic check, if
> enabled. However, interrupting an automatic check should not interrupt manual
> checks. The easiest would be to just skip automatic checks if checks
> are already running. Let's see.
Right now, all of this (start/stop/interrupt) can be done by the sysadmin
via the kernel interfaces in /sys/block/mdX/md/. Thus, there is no easy
way to distinguish "manual" and "automatic" checks. That's a good reason
to implement things as simply as possible.
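
To illustrate the interface mentioned above, here is a minimal sketch (not
the checkarray code) of starting and interrupting a check through the
sync_action attribute. The helper names and the sysfs_root parameter are
hypothetical and exist only so the sketch can be tried against a fake
directory tree; writing to the real /sys requires root.

```python
from pathlib import Path


def md_dir(array, sysfs_root="/sys"):
    """Directory holding the md attributes, e.g. /sys/block/md0/md."""
    return Path(sysfs_root) / "block" / array / "md"


def read_attr(array, name, sysfs_root="/sys"):
    """Read one md attribute and strip the trailing newline."""
    return (md_dir(array, sysfs_root) / name).read_text().strip()


def write_attr(array, name, value, sysfs_root="/sys"):
    """Write one md attribute."""
    (md_dir(array, sysfs_root) / name).write_text(f"{value}\n")


def start_check(array, sysfs_root="/sys"):
    """Request a check, but skip it if any sync action is already running.

    The kernel cannot tell us who started a running check, so "manual"
    and "automatic" checks are treated the same here.
    """
    if read_attr(array, "sync_action", sysfs_root) != "idle":
        return False
    write_attr(array, "sync_action", "check", sysfs_root)
    return True


def interrupt_check(array, sysfs_root="/sys"):
    """Stop whatever sync action is running by writing "idle"."""
    write_attr(array, "sync_action", "idle", sysfs_root)
```

Since anyone with root access can write to these files directly, a wrapper
like this can only skip a check that is already running; it cannot know
whether that check was "manual" or "automatic".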
> 1) If writing the sync_completed value to sync_min upon --interrupt does
> not survive a reboot, checkarray would definitely have to save the value
> separately. Otherwise, all it would ever check is always the same small part
> at the beginning of the array.
... Just as it does now.
And this save/restore can (optionally!) be done by /etc/init.d/mdadm. Why
checkarray?
> (Thus, yes, Sergey I'd say it may be a good idea to let --interrupt "dump" e.g.
> "mdX-next_checkarray_block" state files under /var/lib/mdadm/ for all checks
> that are still running, and delete any other files to ensure they start fresh next time.
> The --reset option may just delete all state files.)
For now, we can save this dump as an option (default: off). But this
ugly "database" (and the "autocheck-running" file too) shouldn't be
introduced into checkarray "at all costs". It looks like a job for
/etc/init.d/mdadm, see above.
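
As a rough sketch of what such an optional save/restore step could look
like (wherever it ends up living), assuming sync_completed reads either
"none" or "done / total" in sectors. The function names and the
sysfs_root/state_dir parameters are hypothetical; the state file name
follows the proposal quoted above.

```python
from pathlib import Path


def parse_sync_completed(text):
    """Return the completed sector count from "done / total", else None."""
    first = text.strip().split("/")[0].strip()
    return int(first) if first.isdigit() else None


def state_file(array, state_dir="/var/lib/mdadm"):
    """State file named as in the proposal: mdX-next_checkarray_block."""
    return Path(state_dir) / f"{array}-next_checkarray_block"


def save_position(array, sysfs_root="/sys", state_dir="/var/lib/mdadm"):
    """On shutdown: remember how far the running check has got."""
    md = Path(sysfs_root) / "block" / array / "md"
    done = parse_sync_completed((md / "sync_completed").read_text())
    state = state_file(array, state_dir)
    if done is None:
        # No check is running: drop stale state so the next one starts fresh.
        if state.exists():
            state.unlink()
        return None
    state.write_text(f"{done}\n")
    return done


def restore_position(array, sysfs_root="/sys", state_dir="/var/lib/mdadm"):
    """On boot: put the saved position back into sync_min, if there is one."""
    state = state_file(array, state_dir)
    if not state.exists():
        return None
    position = int(state.read_text().strip())
    md = Path(sysfs_root) / "block" / array / "md"
    (md / "sync_min").write_text(f"{position}\n")
    return position
```

Keeping both halves in one place means the state file is written only at
shutdown and read only at boot, so checkarray itself would not need to know
about it.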
> 2) Cronjobs defined in /etc/cron.d are never run if the machine is off during
> the scheduled times (i.e. currently checkarray is never run). However, this
> should be an easy thing to fix, because anacron is installed by default:
Why can't the local admin just change the cronjob schedule?
> Simply, drop /etc/cron.d/mdadm from the package, and modify the
> /etc/cron.daily/mdadm script to run checkarray --cron --idle --all --quiet.
Just starting the array check is NOT enough (with incremental checks). You
should also stop it.
> If the machine is running 24h, by default cron will run anacron at 07:30
> in the morning (a time the admin may ensure is before office hours). Anacron then
> takes care of cron.daily. If the machine starts up earlier, cron.daily runs earlier.
> Just letting the machine boot early enough ensures cron.daily tasks are done in time.
> If anacron is uninstalled cron will run cron.daily at 06:25.
>
> Limiting the time for performing array checks, in addition to using
> the --idle switch may be realized by adapting the following:
>
> 3) Instead of defining a separate cron job that calls checkarray --cancel later
> (which may never happen if machine is shutdown and is not possible using
> cron.daily), let checkarray --cron touch say /var/lib/mdadm/autocheck-running file
> and start the checks, if there are currently no (manual) checks running.
> Then sleep say for an AUTOCHECK_TIMESLICE duration defined in
> /etc/default/mdadm. After the sleep, if the autocheck-running file is still there,
> remove the file and proceed with the --interrupt action, otherwise exit.
1) The array check will be stopped on shutdown anyway. We should just
figure out whether we want to save/restore sync_min/sync_max on
shutdown/reboot.
2) It's not a good idea to reinvent cron, IMHO. But we can instead
use sync_max to restrict the range covered by each daily check (my new
patch). So, /etc/cron.daily/mdadm will just cancel the previous check and
then run the new one.
3) Anyway, a real patch from you can tell us much more than a long
specification, right? ;)
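
The idea in point 2) can be sketched roughly as follows. This is not the
patch itself, only an illustration of the arithmetic: next_window, its
parameters, and the sector-based slice size are all hypothetical. It
assumes sync_min/sync_max take sector counts, that sync_max also accepts
the literal "max", and that both values may need to be aligned to some
multiple (passed in as align).

```python
def next_window(start, total, slice_sectors, align=1):
    """Return (sync_min, sync_max) for the next daily slice of a check.

    start         -- sector where the previous slice stopped
    total         -- total number of sectors to check
    slice_sectors -- how many sectors one daily run should cover
    align         -- both bounds are rounded down to a multiple of this
    """
    if slice_sectors < align:
        raise ValueError("slice_sectors must be at least align")
    if start >= total:
        # The previous pass reached the end: begin a new pass.
        start = 0
    start -= start % align
    end = start + slice_sectors
    end -= end % align
    if end >= total:
        # Last slice of this pass: let the kernel run to the end.
        return start, "max"
    return start, end
```

A daily job would then write "idle" to sync_action to cancel the previous
check, write the two values to sync_min and sync_max, and finally write
"check" to sync_action. The kernel pauses the check when it reaches
sync_max, so no separate "stop" job or sleeping process is needed.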
> 4) /etc/init.d/mdadm should call checkarray --interrupt to trigger proper
> saving of all states in case the machine is shut down before an
> AUTOCHECK_TIMESLICE has ended.
Thus, /etc/init.d/mdadm can handle almost all of the "black magic" for
saving and restoring sync_min/sync_max (default: don't do so!).