martin f. krafft: update md.txt from linus' HEAD
Martin F. Krafft
madduck at alioth.debian.org
Sat Sep 10 10:58:06 UTC 2011
Module: mdadm
Branch: contrib/docs/md.txt
Commit: 93842bf7b4f1beb5aac95bbf683944a96ced6ce2
URL: http://git.debian.org/?p=pkg-mdadm/mdadm.git;a=commit;h=93842bf7b4f1beb5aac95bbf683944a96ced6ce2
Author: martin f. krafft <madduck at debian.org>
Date: Mon Aug 1 11:39:14 2011 +0200
update md.txt from linus' HEAD
Signed-off-by: martin f. krafft <madduck at debian.org>
---
docs/md.txt | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++------
1 files changed, 101 insertions(+), 12 deletions(-)
diff --git a/docs/md.txt b/docs/md.txt
index 727238d..fc94770 100644
--- a/docs/md.txt
+++ b/docs/md.txt
@@ -1,7 +1,5 @@
-# From: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob_plain;f=Documentation/md.txt;hb=v2.6.31
-
Tools that manage md devices can be found at
- http://www.<country>.kernel.org/pub/linux/utils/raid/....
+ http://www.kernel.org/pub/linux/utils/raid/
Boot time assembly of RAID arrays
@@ -138,7 +136,7 @@ raid_disks != 0.
Then uninitialized devices can be added with ADD_NEW_DISK. The
structure passed to ADD_NEW_DISK must specify the state of the device
-and it's role in the array.
+and its role in the array.
Once started with RUN_ARRAY, uninitialized spares can be added with
HOT_ADD_DISK.
@@ -235,9 +233,9 @@ All md devices contain:
resync_start
The point at which resync should start. If no resync is needed,
- this will be a very large number. At array creation it will
- default to 0, though starting the array as 'clean' will
- set it much larger.
+ this will be a very large number (or 'none' since 2.6.30-rc1). At
+ array creation it will default to 0, though starting the array as
+ 'clean' will set it much larger.
new_dev
This file can be written but not read. The value written should
@@ -298,6 +296,51 @@ All md devices contain:
active-idle
like active, but no writes have been seen for a while (safe_mode_delay).
+ bitmap/location
+ This indicates where the write-intent bitmap for the array is
+ stored.
+ It can be one of "none", "file" or "[+-]N".
+ "file" may later be extended to "file:/file/name"
+ "[+-]N" means that many sectors from the start of the metadata.
+ This is replicated on all devices. For arrays with externally
+ managed metadata, the offset is from the beginning of the
+ device.
+ bitmap/chunksize
+ The size, in bytes, of the chunk which will be represented by a
+ single bit. For RAID456, it is a portion of an individual
+ device. For RAID10, it is a portion of the array. For RAID1, it
+ is both (they come to the same thing).
+ bitmap/time_base
+ The time, in seconds, between looking for bits in the bitmap to
+ be cleared. In the current implementation, a bit will be cleared
+ between 2 and 3 times "time_base" after all the covered blocks
+ are known to be in-sync.
+ bitmap/backlog
+ When write-mostly devices are active in a RAID1, write requests
+ to those devices proceed in the background - the filesystem (or
+ other user of the device) does not have to wait for them.
+ 'backlog' sets a limit on the number of concurrent background
+ writes. If there are more than this, new writes will by
+ synchronous.
+ bitmap/metadata
+ This can be either 'internal' or 'external'.
+ 'internal' is the default and means the metadata for the bitmap
+ is stored in the first 256 bytes of the allocated space and is
+ managed by the md module.
+ 'external' means that bitmap metadata is managed externally to
+ the kernel (i.e. by some userspace program)
+ bitmap/can_clear
+ This is either 'true' or 'false'. If 'true', then bits in the
+ bitmap will be cleared when the corresponding blocks are thought
+ to be in-sync. If 'false', bits will never be cleared.
+ This is automatically set to 'false' if a write happens on a
+ degraded array, or if the array becomes degraded during a write.
+ When metadata is managed externally, it should be set to true
+ once the array becomes non-degraded, and this fact has been
+ recorded in the metadata.
+
+
+
As component devices are added to an md array, they appear in the 'md'
directory as new directories named
@@ -317,18 +360,20 @@ Each directory contains:
A file recording the current state of the device in the array
which can be a comma separated list of
faulty - device has been kicked from active use due to
- a detected fault
+ a detected fault or it has unacknowledged bad
+ blocks
in_sync - device is a fully in-sync member of the array
writemostly - device will only be subject to read
requests if there are no other options.
This applies only to raid1 arrays.
- blocked - device has failed, metadata is "external",
- and the failure hasn't been acknowledged yet.
+ blocked - device has failed, and the failure hasn't been
+ acknowledged yet by the metadata handler.
Writes that would write to this device if
it were not faulty are blocked.
spare - device is working, but not a full member.
This includes spares that are in the process
of being recovered to
+ write_error - device has ever seen a write error.
This list may grow in future.
This can be written to.
Writing "faulty" simulates a failure on the device.
@@ -336,8 +381,11 @@ Each directory contains:
Writing "writemostly" sets the writemostly flag.
Writing "-writemostly" clears the writemostly flag.
Writing "blocked" sets the "blocked" flag.
- Writing "-blocked" clear the "blocked" flag and allows writes
- to complete.
+ Writing "-blocked" clears the "blocked" flags and allows writes
+ to complete and possibly simulates an error.
+ Writing "in_sync" sets the in_sync flag.
+ Writing "write_error" sets writeerrorseen flag.
+ Writing "-write_error" clears writeerrorseen flag.
This file responds to select/poll. Any change to 'faulty'
or 'blocked' causes an event.
@@ -374,6 +422,37 @@ Each directory contains:
array. If a value less than the current component_size is
written, it will be rejected.
+ recovery_start
+ When the device is not 'in_sync', this records the number of
+ sectors from the start of the device which are known to be
+ correct. This is normally zero, but during a recovery
+ operation is will steadily increase, and if the recovery is
+ interrupted, restoring this value can cause recovery to
+ avoid repeating the earlier blocks. With v1.x metadata, this
+ value is saved and restored automatically.
+
+ This can be set whenever the device is not an active member of
+ the array, either before the array is activated, or before
+ the 'slot' is set.
+
+ Setting this to 'none' is equivalent to setting 'in_sync'.
+ Setting to any other value also clears the 'in_sync' flag.
+
+ bad_blocks
+ This gives the list of all known bad blocks in the form of
+ start address and length (in sectors respectively). If output
+ is too big to fit in a page, it will be truncated. Writing
+ "sector length" to this file adds new acknowledged (i.e.
+ recorded to disk safely) bad blocks.
+
+ unacknowledged_bad_blocks
+ This gives the list of known-but-not-yet-saved-to-disk bad
+ blocks in the same form of 'bad_blocks'. If output is too big
+ to fit in a page, it will be truncated. Writing to this file
+ adds bad blocks without acknowledging them. This is largely
+ for testing.
+
+
An active md device will also contain and entry for each active device
in the array. These are named
@@ -490,6 +569,16 @@ also have
within the array where IO will be blocked. This is currently
only supported for raid4/5/6.
+ sync_min
+ sync_max
+ The two values, given as numbers of sectors, indicate a range
+ within the array where 'check'/'repair' will operate. Must be
+ a multiple of chunk_size. When it reaches "sync_max" it will
+ pause, rather than complete.
+ You can use 'select' or 'poll' on "sync_completed" to wait for
+ that number to reach sync_max. Then you can either increase
+ "sync_max", or can write 'idle' to "sync_action".
+
Each active md device may also have attributes specific to the
personality module that manages it.
More information about the pkg-mdadm-commits
mailing list