Bug#556610:

bug556610 at arcor.de bug556610 at arcor.de
Mon Apr 16 12:16:33 UTC 2012


Regarding the proposed patch, I have one thought (0) and four proposals (1-4)
to also cover systems that are not powered on 24 hours a day, 7 days a week
(e.g. because a power-saving policy is in place).

0) Because checking an array is such a long operation, it may be desirable for
checkarray to distinguish --reset and --interrupt options instead of an
ambiguous --cancel, and to be able to --resume any previously started check.
Overall, the kernel should correctly stop an attempt at an automatic check
while a manual check is running. If a manual check is interrupted manually or
by shutting down, it may be continued manually or by the next automatic check,
if enabled. However, interrupting an automatic check should not interrupt
manual checks. The easiest approach would be to just skip automatic checks if
checks are already running. Let's see.
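
The "skip if something is already running" test could look roughly like the
sketch below. It assumes the md sysfs attribute sync_action, which reads
"idle" when nothing is in progress; the SYSFS_ROOT variable and the function
name are made up for illustration and are not part of checkarray.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT is a hypothetical override so the logic can be
# tried against a fake directory tree instead of the real /sys/block.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"

# Succeeds (exit status 0) if any md array reports a sync_action other
# than "idle", i.e. a check, repair, resync or recovery is in progress.
any_array_busy() {
    for f in "$SYSFS_ROOT"/md*/md/sync_action; do
        [ -r "$f" ] || continue        # glob did not match, or array vanished
        action=$(cat "$f")
        [ "$action" = "idle" ] || return 0
    done
    return 1
}

# Hypothetical use at the top of an automatic (cron) run:
# any_array_busy && exit 0
```

Note that this cannot tell a manual check from an automatic one; that
distinction would need something like the flag file from proposal 3.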

1) If writing the sync_completed value to sync_min upon --interrupt does
not survive a reboot, checkarray would definitely have to save the value
separately. Otherwise, all it would ever check is the same small part
at the beginning of the array.
(Thus, yes, Sergey, I'd say it may be a good idea to let --interrupt "dump" e.g.
"mdX-next_checkarray_block" state files under /var/lib/mdadm/ for all checks
that are still running, and delete any other state files to ensure those
arrays start fresh next time.
The --reset option may just delete all state files.)
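
As a rough sketch of how the state files might be written and read back: the
code below assumes the md sysfs attributes sync_completed (which reads
"<done> / <total>" in sectors, or "none"), sync_min and sync_action. The
function names and the SYSFS_ROOT/STATE_DIR overrides are invented for
illustration. It does not round the position to a chunk boundary, which the
kernel may require for sync_min, and it does not remove the state file once a
check has completed.

```shell
#!/bin/sh
# Sketch only. Both roots are hypothetical overrides so the logic can be
# exercised on a fake tree; the state file name follows the proposal above.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"
STATE_DIR="${STATE_DIR:-/var/lib/mdadm}"

# save_position mdX: record how far the running check got, then stop it.
save_position() {
    md="$1"
    # First field of "<done> / <total>"; "none" if no check is running.
    pos=$(cut -d' ' -f1 "$SYSFS_ROOT/$md/md/sync_completed")
    case "$pos" in
        ''|*[!0-9]*)
            # Not a number: nothing to resume, so start fresh next time.
            rm -f "$STATE_DIR/$md-next_checkarray_block" ;;
        *)
            echo "$pos" > "$STATE_DIR/$md-next_checkarray_block" ;;
    esac
    echo idle > "$SYSFS_ROOT/$md/md/sync_action"
}

# resume_check mdX: start a check at the saved position (0 if none saved).
resume_check() {
    md="$1"
    state="$STATE_DIR/$md-next_checkarray_block"
    start=0
    if [ -r "$state" ]; then
        start=$(cat "$state")
    fi
    echo "$start" > "$SYSFS_ROOT/$md/md/sync_min"
    echo check > "$SYSFS_ROOT/$md/md/sync_action"
}
```

The position is read before "idle" is written, because sync_completed is
expected to fall back to "none" once the check has been stopped.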


2) Cron jobs defined in /etc/cron.d are never run if the machine is off during
the scheduled times (i.e. currently checkarray is never run). However, this
should be an easy thing to fix, because anacron is installed by default:

Simply drop /etc/cron.d/mdadm from the package, and modify the
/etc/cron.daily/mdadm script to run checkarray --cron --idle --all --quiet.
(With incremental checks it may be possible to stop restricting the automatic
checkarray to run only on Sundays; nevertheless, the script could evaluate
"date +%w".)
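
A cron.daily script along these lines might look like the sketch below. The
AUTOCHECK_WEEKDAY variable and the function names are made up for
illustration; the checkarray path is where I believe Debian installs it, but
treat it as an assumption too.

```shell
#!/bin/sh
# Sketch of a possible /etc/cron.daily/mdadm. CHECKARRAY is an assumed
# install path; AUTOCHECK_WEEKDAY is a hypothetical setting.
CHECKARRAY="${CHECKARRAY:-/usr/share/mdadm/checkarray}"

# should_run_today WEEKDAY TODAY: true if WEEKDAY is "any" or equals TODAY.
# Both use the "date +%w" numbering: 0 = Sunday ... 6 = Saturday.
should_run_today() {
    [ "$1" = "any" ] || [ "$1" = "$2" ]
}

run_daily_check() {
    [ -x "$CHECKARRAY" ] || return 0
    should_run_today "${AUTOCHECK_WEEKDAY:-0}" "$(date +%w)" || return 0
    "$CHECKARRAY" --cron --idle --all --quiet
}

# In the real script the last line would simply be:
# run_daily_check
```

With AUTOCHECK_WEEKDAY unset this keeps the Sunday-only behaviour; setting it
to "any" would let incremental checks run every day.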

If the machine is running 24h, by default cron will run anacron at 07:30
in the morning (a time the admin may ensure is before office hours). Anacron
then takes care of cron.daily. If the machine starts up earlier, cron.daily
runs earlier. Just letting the machine boot early enough ensures cron.daily
tasks are done in time.
If anacron is uninstalled, cron will run cron.daily at 06:25.

Limiting the time spent on array checks, in addition to using
the --idle switch, may be realized by adopting the following:

3) Instead of defining a separate cron job that calls checkarray --cancel later
(which may never happen if the machine is shut down, and is not possible using
cron.daily), let checkarray --cron touch, say, a
/var/lib/mdadm/autocheck-running file and start the checks, if there are
currently no (manual) checks running.
Then sleep for, say, an AUTOCHECK_TIMESLICE duration defined in
/etc/default/mdadm. After the sleep, if the autocheck-running file is still
there, remove the file and proceed with the --interrupt action; otherwise exit.
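
The control flow of this proposal could be sketched as follows. Only the
flag file path and AUTOCHECK_TIMESLICE come from the text above; FLAG,
start_checks and interrupt_checks are placeholder names, the timeslice is
assumed to be in seconds, and the "no manual check running" test is left out
for brevity.

```shell
#!/bin/sh
# Sketch only. In a real script AUTOCHECK_TIMESLICE would be read from
# /etc/default/mdadm; here it is a plain variable, assumed to be in seconds.
FLAG="${FLAG:-/var/lib/mdadm/autocheck-running}"
AUTOCHECK_TIMESLICE="${AUTOCHECK_TIMESLICE:-3600}"

# Placeholders for the real work (starting checks, saving state and idling).
start_checks()     { :; }
interrupt_checks() { :; }

run_timesliced_check() {
    touch "$FLAG" || return 1
    start_checks
    sleep "$AUTOCHECK_TIMESLICE"
    # If the flag is gone, someone else (e.g. a shutdown hook) has already
    # handled the interruption, so there is nothing left to do.
    if [ -e "$FLAG" ]; then
        rm -f "$FLAG"
        interrupt_checks
    fi
}
```

The flag file is what lets a shutdown hook and the sleeping cron run agree on
who performs the interrupt, so that it happens once.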

4) /etc/init.d/mdadm should call checkarray --interrupt to trigger proper
saving of all states in case the machine is shut down before an 
AUTOCHECK_TIMESLICE has ended.
If an autocheck-running file is present, checkarray --shutdown then removes it
and does the --cron --cancel --all operation.

Regards,
Christian






