A bit of history: 2 years ago I was really happy to find out that mdadm is so powerful that it can even reshape arrays, so you can start with a smaller array and then grow it as you need. I bought 3x1 TB drives and built a RAID-5. It was fine for a year.
Then I bought 2 more and tried to reshape to a RAID-6 of 5 drives, and due to some confusion with superblock versions I lost all the content. I had to rebuild the array from scratch, and 2 TB of data was gone.
Yesterday I bought 2 more drives, and this time I had everything: a properly built array and a UPS. I removed the write-intent bitmap, added the 2 new drives as spares and ran the command to grow the array to 7 disks.
It started working, but the speed was ridiculously slow, ~100 kB/sec. After it had processed the first 37 MB at that amazing speed, one of the old hard drives failed. I shut the PC down properly and disconnected the failed drive. After bootup it appeared that the write-intent bitmap had been recreated, because it was still in the mdadm config, so I removed it from the config and rebooted again.
Now all I see is that all the mdadm processes are deadlocked and nothing is happening.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1937 root 20 0 12992 608 444 D 0 0.1 0:00.00 mdadm
2283 root 20 0 12992 852 704 D 0 0.1 0:00.01 mdadm
2287 root 20 0 0 0 0 D 0 0.0 0:00.01 md0_reshape
2288 root 18 -2 12992 820 676 D 0 0.1 0:00.01 mdadm
And this is all I see in mdstat:
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdb1[1] sdg1[4] sdf1[7] sde1[6] sdd1[0] sdc1[5]
2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [7/6] [UU_UUUU]
[>....................] reshape = 0.0% (37888/976561152) finish=567604147.2min speed=0K/sec
I have already tried mdadm 2.6.7, 3.1.4 and 3.2, and nothing helps. Have I lost my data again? Any suggestions on how I can make this work?
The OS is Ubuntu Server 10.04.2.
PS. Needless to say, the data is inaccessible: I cannot mount /dev/md0 to save the most valuable data.
You can see my disappointment: the very thing I was so excited about has failed twice, taking 5 TB of my data with it.
Update: it seems there is some useful information in kern.log:
21:38:48 ...: [ 166.522055] raid5: reshape will continue
21:38:48 ...: [ 166.522085] raid5: device sdb1 operational as raid disk 1
21:38:48 ...: [ 166.522091] raid5: device sdg1 operational as raid disk 4
21:38:48 ...: [ 166.522097] raid5: device sdf1 operational as raid disk 5
21:38:48 ...: [ 166.522102] raid5: device sde1 operational as raid disk 6
21:38:48 ...: [ 166.522107] raid5: device sdd1 operational as raid disk 0
21:38:48 ...: [ 166.522111] raid5: device sdc1 operational as raid disk 3
21:38:48 ...: [ 166.523942] raid5: allocated 7438kB for md0
21:38:48 ...: [ 166.524041] 1: w=1 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524050] 4: w=2 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524056] 5: w=3 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524062] 6: w=4 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524068] 0: w=5 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524073] 3: w=6 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524079] raid5: raid level 6 set md0 active with 6 out of 7 devices, algorithm 2
21:38:48 ...: [ 166.524519] RAID5 conf printout:
21:38:48 ...: [ 166.524523] --- rd:7 wd:6
21:38:48 ...: [ 166.524528] disk 0, o:1, dev:sdd1
21:38:48 ...: [ 166.524532] disk 1, o:1, dev:sdb1
21:38:48 ...: [ 166.524537] disk 3, o:1, dev:sdc1
21:38:48 ...: [ 166.524541] disk 4, o:1, dev:sdg1
21:38:48 ...: [ 166.524545] disk 5, o:1, dev:sdf1
21:38:48 ...: [ 166.524550] disk 6, o:1, dev:sde1
21:38:48 ...: [ 166.524553] ...ok start reshape thread
21:38:48 ...: [ 166.524727] md0: detected capacity change from 0 to 2999995858944
21:38:48 ...: [ 166.524735] md: reshape of RAID array md0
21:38:48 ...: [ 166.524740] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
21:38:48 ...: [ 166.524745] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
21:38:48 ...: [ 166.524756] md: using 128k window, over a total of 976561152 blocks.
21:39:05 ...: [ 166.525013] md0:
21:42:04 ...: [ 362.520063] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520073] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:42:04 ...: [ 362.520083] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520092] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:42:04 ...: [ 362.520100] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:42:04 ...: [ 362.520107] Call Trace:
21:42:04 ...: [ 362.520133] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:42:04 ...: [ 362.520148] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:42:04 ...: [ 362.520159] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:42:04 ...: [ 362.520169] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:42:04 ...: [ 362.520179] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520188] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:42:04 ...: [ 362.520194] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:42:04 ...: [ 362.520205] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:42:04 ...: [ 362.520214] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:42:04 ...: [ 362.520222] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:42:04 ...: [ 362.520230] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:42:04 ...: [ 362.520236] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:42:04 ...: [ 362.520244] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:42:04 ...: [ 362.520251] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:42:04 ...: [ 362.520258] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:42:04 ...: [ 362.520265] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:42:04 ...: [ 362.520272] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:42:04 ...: [ 362.520279] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:42:04 ...: [ 362.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:42:04 ...: [ 362.520290] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:42:04 ...: [ 362.520297] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:42:04 ...: [ 362.520304] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:42:04 ...: [ 362.520310] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:42:04 ...: [ 362.520317] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:42:04 ...: [ 362.520324] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:42:04 ...: [ 362.520331] [<ffffffff811a7938>] check_partition+0x138/0x190
21:42:04 ...: [ 362.520338] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:42:04 ...: [ 362.520344] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:42:04 ...: [ 362.520350] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:42:04 ...: [ 362.520356] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:42:04 ...: [ 362.520362] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:42:04 ...: [ 362.520369] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:42:04 ...: [ 362.520377] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:42:04 ...: [ 362.520385] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:42:04 ...: [ 362.520391] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:42:04 ...: [ 362.520398] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:42:04 ...: [ 362.520406] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:42:04 ...: [ 362.520414] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:42:04 ...: [ 362.520421] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:42:04 ...: [ 362.520428] [<ffffffff811418b0>] sys_open+0x20/0x30
21:42:04 ...: [ 362.520437] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:42:04 ...: [ 362.520446] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520450] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520454] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:42:04 ...: [ 362.520462] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520470] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:42:04 ...: [ 362.520478] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:42:04 ...: [ 362.520485] Call Trace:
21:42:04 ...: [ 362.520495] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:42:04 ...: [ 362.520502] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:42:04 ...: [ 362.520508] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:42:04 ...: [ 362.520514] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:42:04 ...: [ 362.520520] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:42:04 ...: [ 362.520527] [<ffffffff81145375>] __fput+0xf5/0x210
21:42:04 ...: [ 362.520534] [<ffffffff811454b5>] fput+0x25/0x30
21:42:04 ...: [ 362.520540] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:42:04 ...: [ 362.520546] [<ffffffff81141697>] sys_close+0xb7/0x120
21:42:04 ...: [ 362.520553] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:42:04 ...: [ 362.520559] INFO: task md0_reshape:2287 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520567] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000
21:42:04 ...: [ 362.520575] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520582] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0
21:42:04 ...: [ 362.520590] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8
21:42:04 ...: [ 362.520597] Call Trace:
21:42:04 ...: [ 362.520608] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:42:04 ...: [ 362.520616] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:42:04 ...: [ 362.520626] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456]
21:42:04 ...: [ 362.520634] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520644] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456]
21:42:04 ...: [ 362.520651] [<ffffffff81052713>] ? __wake_up+0x53/0x70
21:42:04 ...: [ 362.520658] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0
21:42:04 ...: [ 362.520668] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10
21:42:04 ...: [ 362.520675] [<ffffffff8141640c>] md_thread+0x5c/0x130
21:42:04 ...: [ 362.520681] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520688] [<ffffffff814163b0>] ? md_thread+0x0/0x130
21:42:04 ...: [ 362.520694] [<ffffffff81084416>] kthread+0x96/0xa0
21:42:04 ...: [ 362.520701] [<ffffffff810131ea>] child_rip+0xa/0x20
21:42:04 ...: [ 362.520707] [<ffffffff81084380>] ? kthread+0x0/0xa0
21:42:04 ...: [ 362.520713] [<ffffffff810131e0>] ? child_rip+0x0/0x20
21:42:04 ...: [ 362.520718] INFO: task mdadm:2288 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520725] mdadm D 0000000000000000 0 2288 1 0x00000000
21:42:04 ...: [ 362.520733] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520741] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000
21:42:04 ...: [ 362.520748] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8
21:42:04 ...: [ 362.520755] Call Trace:
21:42:04 ...: [ 362.520763] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:42:04 ...: [ 362.520771] [<ffffffff812a6d50>] ? exact_match+0x0/0x10
21:42:04 ...: [ 362.520777] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:42:04 ...: [ 362.520783] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0
21:42:04 ...: [ 362.520790] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:42:04 ...: [ 362.520795] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:42:04 ...: [ 362.520801] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:42:04 ...: [ 362.520808] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:42:04 ...: [ 362.520815] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:42:04 ...: [ 362.520821] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:42:04 ...: [ 362.520828] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:42:04 ...: [ 362.520834] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:42:04 ...: [ 362.520841] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40
21:42:04 ...: [ 362.520848] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330
21:42:04 ...: [ 362.520855] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0
21:42:04 ...: [ 362.520862] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:42:04 ...: [ 362.520868] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:42:04 ...: [ 362.520874] [<ffffffff811418b0>] sys_open+0x20/0x30
21:42:04 ...: [ 362.520882] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520065] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520077] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:44:04 ...: [ 482.520087] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520096] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:44:04 ...: [ 482.520104] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:44:04 ...: [ 482.520112] Call Trace:
21:44:04 ...: [ 482.520139] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:44:04 ...: [ 482.520154] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:44:04 ...: [ 482.520165] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:44:04 ...: [ 482.520175] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:44:04 ...: [ 482.520185] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520194] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:44:04 ...: [ 482.520201] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:44:04 ...: [ 482.520212] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:44:04 ...: [ 482.520221] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:44:04 ...: [ 482.520229] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:44:04 ...: [ 482.520237] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:44:04 ...: [ 482.520244] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:44:04 ...: [ 482.520252] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:44:04 ...: [ 482.520258] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:44:04 ...: [ 482.520266] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:44:04 ...: [ 482.520273] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:44:04 ...: [ 482.520280] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:44:04 ...: [ 482.520286] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:44:04 ...: [ 482.520293] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:44:04 ...: [ 482.520299] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:44:04 ...: [ 482.520306] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:44:04 ...: [ 482.520313] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:44:04 ...: [ 482.520319] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:44:04 ...: [ 482.520327] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:44:04 ...: [ 482.520334] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:44:04 ...: [ 482.520341] [<ffffffff811a7938>] check_partition+0x138/0x190
21:44:04 ...: [ 482.520348] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:44:04 ...: [ 482.520355] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:44:04 ...: [ 482.520361] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:44:04 ...: [ 482.520367] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:44:04 ...: [ 482.520373] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:44:04 ...: [ 482.520380] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:44:04 ...: [ 482.520388] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:44:04 ...: [ 482.520396] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:44:04 ...: [ 482.520403] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:44:04 ...: [ 482.520410] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:44:04 ...: [ 482.520417] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:44:04 ...: [ 482.520426] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:44:04 ...: [ 482.520432] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:44:04 ...: [ 482.520438] [<ffffffff811418b0>] sys_open+0x20/0x30
21:44:04 ...: [ 482.520447] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520458] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520462] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520467] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:44:04 ...: [ 482.520475] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520483] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:44:04 ...: [ 482.520490] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:44:04 ...: [ 482.520498] Call Trace:
21:44:04 ...: [ 482.520508] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:44:04 ...: [ 482.520515] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:44:04 ...: [ 482.520521] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:44:04 ...: [ 482.520527] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:44:04 ...: [ 482.520533] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:44:04 ...: [ 482.520541] [<ffffffff81145375>] __fput+0xf5/0x210
21:44:04 ...: [ 482.520547] [<ffffffff811454b5>] fput+0x25/0x30
21:44:04 ...: [ 482.520554] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:44:04 ...: [ 482.520560] [<ffffffff81141697>] sys_close+0xb7/0x120
21:44:04 ...: [ 482.520568] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520574] INFO: task md0_reshape:2287 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520578] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520582] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000
21:44:04 ...: [ 482.520590] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520597] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0
21:44:04 ...: [ 482.520605] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8
21:44:04 ...: [ 482.520612] Call Trace:
21:44:04 ...: [ 482.520623] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:44:04 ...: [ 482.520633] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:44:04 ...: [ 482.520643] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456]
21:44:04 ...: [ 482.520651] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520661] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456]
21:44:04 ...: [ 482.520668] [<ffffffff81052713>] ? __wake_up+0x53/0x70
21:44:04 ...: [ 482.520675] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0
21:44:04 ...: [ 482.520685] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10
21:44:04 ...: [ 482.520692] [<ffffffff8141640c>] md_thread+0x5c/0x130
21:44:04 ...: [ 482.520699] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520705] [<ffffffff814163b0>] ? md_thread+0x0/0x130
21:44:04 ...: [ 482.520711] [<ffffffff81084416>] kthread+0x96/0xa0
21:44:04 ...: [ 482.520718] [<ffffffff810131ea>] child_rip+0xa/0x20
21:44:04 ...: [ 482.520725] [<ffffffff81084380>] ? kthread+0x0/0xa0
21:44:04 ...: [ 482.520730] [<ffffffff810131e0>] ? child_rip+0x0/0x20
21:44:04 ...: [ 482.520735] INFO: task mdadm:2288 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520743] mdadm D 0000000000000000 0 2288 1 0x00000000
21:44:04 ...: [ 482.520751] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520759] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000
21:44:04 ...: [ 482.520767] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8
21:44:04 ...: [ 482.520774] Call Trace:
21:44:04 ...: [ 482.520782] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:44:04 ...: [ 482.520790] [<ffffffff812a6d50>] ? exact_match+0x0/0x10
21:44:04 ...: [ 482.520797] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:44:04 ...: [ 482.520804] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0
21:44:04 ...: [ 482.520810] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:44:04 ...: [ 482.520816] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:44:04 ...: [ 482.520822] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:44:04 ...: [ 482.520829] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:44:04 ...: [ 482.520837] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:44:04 ...: [ 482.520843] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:44:04 ...: [ 482.520850] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:44:04 ...: [ 482.520857] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:44:04 ...: [ 482.520864] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40
21:44:04 ...: [ 482.520871] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330
21:44:04 ...: [ 482.520878] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0
21:44:04 ...: [ 482.520885] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:44:04 ...: [ 482.520891] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:44:04 ...: [ 482.520897] [<ffffffff811418b0>] sys_open+0x20/0x30
21:44:04 ...: [ 482.520905] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:46:04 ...: [ 602.520053] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:46:04 ...: [ 602.520059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:46:04 ...: [ 602.520065] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:46:04 ...: [ 602.520075] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:46:04 ...: [ 602.520084] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:46:04 ...: [ 602.520091] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:46:04 ...: [ 602.520099] Call Trace:
21:46:04 ...: [ 602.520127] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:46:04 ...: [ 602.520142] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:46:04 ...: [ 602.520153] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:46:04 ...: [ 602.520162] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:46:04 ...: [ 602.520171] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:46:04 ...: [ 602.520180] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:46:04 ...: [ 602.520187] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:46:04 ...: [ 602.520197] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:46:04 ...: [ 602.520206] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:46:04 ...: [ 602.520215] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:46:04 ...: [ 602.520222] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:46:04 ...: [ 602.520229] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:46:04 ...: [ 602.520237] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:46:04 ...: [ 602.520244] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:46:04 ...: [ 602.520252] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:46:04 ...: [ 602.520259] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:46:04 ...: [ 602.520266] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:46:04 ...: [ 602.520273] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:46:04 ...: [ 602.520279] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:46:04 ...: [ 602.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:46:04 ...: [ 602.520292] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:46:04 ...: [ 602.520300] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:46:04 ...: [ 602.520306] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:46:04 ...: [ 602.520314] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:46:04 ...: [ 602.520321] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:46:04 ...: [ 602.520328] [<ffffffff811a7938>] check_partition+0x138/0x190
21:46:04 ...: [ 602.520335] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:46:04 ...: [ 602.520342] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:46:04 ...: [ 602.520348] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:46:04 ...: [ 602.520354] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:46:04 ...: [ 602.520359] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:46:04 ...: [ 602.520367] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:46:04 ...: [ 602.520375] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:46:04 ...: [ 602.520383] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:46:04 ...: [ 602.520390] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:46:04 ...: [ 602.520397] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:46:04 ...: [ 602.520404] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:46:04 ...: [ 602.520413] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:46:04 ...: [ 602.520419] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:46:04 ...: [ 602.520425] [<ffffffff811418b0>] sys_open+0x20/0x30
21:46:04 ...: [ 602.520434] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:46:04 ...: [ 602.520443] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:46:04 ...: [ 602.520447] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:46:04 ...: [ 602.520451] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:46:04 ...: [ 602.520460] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:46:04 ...: [ 602.520468] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:46:04 ...: [ 602.520475] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:46:04 ...: [ 602.520483] Call Trace:
21:46:04 ...: [ 602.520492] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:46:04 ...: [ 602.520500] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:46:04 ...: [ 602.520506] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:46:04 ...: [ 602.520512] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:46:04 ...: [ 602.520518] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:46:04 ...: [ 602.520526] [<ffffffff81145375>] __fput+0xf5/0x210
21:46:04 ...: [ 602.520533] [<ffffffff811454b5>] fput+0x25/0x30
21:46:04 ...: [ 602.520539] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:46:04 ...: [ 602.520545] [<ffffffff81141697>] sys_close+0xb7/0x120
21:46:04 ...: [ 602.520552] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
I managed to contact Neil Brown (the developer), and he immediately suggested increasing stripe_cache_size to at least 2048. This reminds me of my previous question, where I failed to make this parameter persistent.
After I set it to 8192 the reshape continued, so the problem is solved. God bless Neil Brown :-)
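For reference, a minimal sketch of how that value can be applied, assuming the array is md0 with 7 member devices as in the question. The sysfs path is the standard md one; the helper name stripe_cache_mem_mib and the 4 KiB page size are assumptions made for illustration. The setting does not survive a reboot, so it has to be reapplied after each boot.

```shell
#!/bin/sh
# Sketch only: MD_DEV, NEW_SIZE and NR_DISKS are assumptions taken from the question.
MD_DEV=md0
NEW_SIZE=8192
NR_DISKS=7
SYSFS_FILE="/sys/block/${MD_DEV}/md/stripe_cache_size"

# Hypothetical helper: estimate the RAM used by the stripe cache in MiB.
# Uses memory = page_size * nr_disks * stripe_cache_size, with a 4096-byte page assumed.
# $1 = stripe_cache_size, $2 = number of member devices
stripe_cache_mem_mib() {
    echo $(( $1 * $2 * 4096 / 1024 / 1024 ))
}

echo "Estimated stripe cache memory: $(stripe_cache_mem_mib "$NEW_SIZE" "$NR_DISKS") MiB"

# Only touch sysfs if the array exists and we are allowed to write (needs root).
if [ -w "$SYSFS_FILE" ]; then
    echo "Current stripe_cache_size: $(cat "$SYSFS_FILE")"
    echo "$NEW_SIZE" > "$SYSFS_FILE"
fi
```

With these numbers the estimate comes to 224 MiB, so it is worth checking that the machine has that much memory to spare before raising the value further.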
Sometimes a reshape will sit at speed=0K/sec because the backup file could not be created or was lost during processing.
The solution in this case was provided by Neil Brown in reply to an email on linux-raid@vger.kernel.org:
You should be able to simply stop the array and re-assemble it with a different backup file and the magic flag "--invalid-backup" (requires mdadm 3.2 or later).
The backup file is only really needed in the event of a crash. As you will stop the array cleanly, there will be no need to restore anything when you re-assemble, so --invalid-backup (which says "there is nothing in the backup file, but that is OK") is perfectly safe.
NeilBrown
For a RAID5 array with 7 disks, as device /dev/md0, mounted at /mnt/data, the procedure based on his reply is as follows.
All of the following commands must be run as root or equivalent.
Find any open connections to the drive:
lsof /mnt/data
Close them, or stop the services that may be interacting with it.
Usually:
systemctl stop <SERVICE_NAME>
or
service <SERVICE_NAME> stop
Unmount, stop, and then re-assemble:
umount /mnt/data
mdadm --stop /dev/md0
mdadm --assemble --invalid-backup --backup-file=/root/mdadm0.bak /dev/md0 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1
Depending on your existing configuration, the device may be remounted automatically after the assemble command. If not, mount it with:
mount /dev/md0 /mnt/data
It is then safe to restart any services or connections that run from there.
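To confirm that the reshape is actually moving again after re-assembly, the speed field in /proc/mdstat can be checked. A small sketch follows; the function name reshape_speed_kb is made up here, and the parsing assumes the mdstat line format shown in the question.

```shell
#!/bin/sh
# Hypothetical helper: read mdstat-style text on stdin and print the
# reshape speed (the number before "K/sec"), or nothing if no reshape line exists.
reshape_speed_kb() {
    sed -n 's/.*reshape.*speed=\([0-9][0-9]*\)K\/sec.*/\1/p'
}

# Report the state of the running system if /proc/mdstat is available.
if [ -r /proc/mdstat ]; then
    speed=$(reshape_speed_kb < /proc/mdstat)
    case "$speed" in
        "") echo "no reshape in progress" ;;
        0)  echo "reshape is stalled (0K/sec)" ;;
        *)  echo "reshape running at ${speed}K/sec" ;;
    esac
fi
```

Running it a few times a minute apart and seeing a non-zero speed is a reasonable sign that the array is no longer stuck.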