Author Topic: codemonger.org  (Read 1544 times)

daniel.santos

  • Guest
codemonger.org
« on: 20 November 2009, 20:03:49 »
I just wanted to drop another status update on this.  The original problem turned out to be that my motherboard didn't support the proper voltage for my memory, and that was what was causing the kernel Oops(es).  Corsair tech support gave me a workaround that resulted in my motherboard properly choosing new timings, and I ran memtest86 for 10 days with no errors.

As for the hard drives, it appears that on the last kernel oops, one or two of the raid5 devices were marked faulty.  Actually, it's really weird: two of them say all three devices are fine, but when I query the 3rd, it claims that only one device is active (it lists device 0 as removed and device 1 as faulty removed), so I'm not sure what happened.  Either way, my main priority is not losing the data, so I want to go about this in a way that protects it to the greatest degree possible.

For those who are interested in more details (or maybe even have some ideas :) ), here they are:

Code:
livecd ~ # mdadm --version
mdadm - v1.12.0 - 14 June 2005
livecd ~ # fdisk -l /dev/sd{a,b,c}

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         131     1052226   83  Linux
/dev/sda2             132         829     5606685   82  Linux swap / Solaris
/dev/sda3             830       60801   481725090   fd  Linux raid autodetect

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         131     1052226   83  Linux
/dev/sdb2             132         829     5606685   82  Linux swap / Solaris
/dev/sdb3             830       60801   481725090   fd  Linux raid autodetect

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1         131     1052226   83  Linux
/dev/sdc2             132         829     5606685   82  Linux swap / Solaris
/dev/sdc3             830       60801   481725090   fd  Linux raid autodetect
livecd ~ # mdadm --misc -E /dev/sd{a,b,c}3
/dev/sda3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : feb2ff9a:d8245ff2:880a5a50:9054dc10
  Creation Time : Sun Dec 30 07:16:04 2007
     Raid Level : raid5
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Oct 11 08:57:10 2009
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 476254b2 - correct
         Events : 0.2

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
/dev/sdb3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : feb2ff9a:d8245ff2:880a5a50:9054dc10
  Creation Time : Sun Dec 30 07:16:04 2007
     Raid Level : raid5
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Oct 11 08:57:10 2009
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 476254c4 - correct
         Events : 0.2

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       19        1      active sync   /dev/sdb3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : feb2ff9a:d8245ff2:880a5a50:9054dc10
  Creation Time : Sun Dec 30 07:16:04 2007
     Raid Level : raid5
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0

    Update Time : Sun Oct 11 09:01:26 2009
          State : active
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 476255e8 - correct
         Events : 0.5

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       35        2      active sync   /dev/sdc3

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       35        2      active sync   /dev/sdc3
livecd ~ # mknod /dev/md0 b 9 0
livecd ~ # mdadm --assemble /dev/md0 /dev/sd{a,b,c}3
mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
livecd ~ # cat /proc/mdstat
Personalities :
md0 : inactive sdc3[2] sdb3[1] sda3[0]
      1445174976 blocks super non-persistent

unused devices: <none>
livecd ~ # mdadm -D /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
livecd ~ # mdadm -Q /dev/sda3
/dev/sda3: is not an md array
/dev/sda3: device 0 in 3 device undetected raid5 md0.  Use mdadm --examine for more detail.
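
The three superblocks above disagree mainly in their State and Events fields. As a rough sketch (the function name `summarize_examine` is made up for illustration, and the parsing assumes the 0.90-style `mdadm --examine` layout shown in this post), those fields can be pulled into one line per member so the odd one out is easier to spot:

```shell
# Hypothetical helper: reads `mdadm --examine` text on stdin and prints
# "device events state" for each member it finds.
summarize_examine() {
  awk '
    # A line such as "/dev/sda3:" starts a new member section.
    /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
    # "State : clean" is the array state recorded in that superblock.
    $1 == "State" && $2 == ":" { state[dev] = $3 }
    # "Events : 0.2" is the event counter; print once it has been seen.
    $1 == "Events" && $2 == ":" { print dev, $3, state[dev] }
  '
}
```

On the livecd (bash) this could be fed with something like `mdadm --examine /dev/sd{a,b,c}3 | summarize_examine`. It only reads the superblocks and does not write to the disks.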

ZaRo

  • Guest
Re: codemonger.org
« Reply #1 on: 23 November 2009, 08:58:40 »
Hello there :)

1) Mind setting the time to now? (Hint: use NTP.)
2) what does mdadm --examine say?

/ZaRo

daniel.santos

  • Guest
Re: codemonger.org
« Reply #2 on: 26 November 2009, 08:54:48 »
That's not the time, silly; that's the version of mdadm I'm using (on the livecd anyway)! :P

Also, mdadm -E is the same as mdadm --examine.  After futzing with it some the other day, I'm worried about the SATA cables, so I've ordered some new ones with the cute metal retainers (http://www.newegg.com/Product/Product.aspx?Item=N82E16812123290).  I'm going to reboot with those installed before I do anything that can change the disks, because I want to make sure everything is OK.

ZaRo

  • Guest
Re: codemonger.org
« Reply #3 on: 1 December 2009, 08:51:11 »
Right. I need to read what the screen says :)

RAID 5 should be fine with just 2 of the 3 drives working, so you can get the server up and running by taking one HDD out.

Then you can copy the stuff to some other place to have a backup :)

/ZaRo
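
A minimal sketch of what starting the array from two members might look like, without physically pulling a drive. It assumes the device names from the first post and that `/dev/sda3` and `/dev/sdb3` are the pair to keep; `run` and `start_degraded` are hypothetical helper names. By default it only prints the commands, since nothing here should touch the disks until the cabling has been checked:

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0 is set explicitly.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stop the inactive array, then assemble it from the listed members only.
# --run asks mdadm to start the array even though a member is missing.
start_degraded() {
  md=$1
  shift
  run mdadm --stop "$md"
  run mdadm --assemble --run "$md" "$@"
  run cat /proc/mdstat
}

# Device names are the ones from this thread; adjust for other systems.
start_degraded /dev/md0 /dev/sda3 /dev/sdb3
```

Setting `DRY_RUN=0` would execute the commands for real, which needs root. Whether mdadm accepts the two members still depends on what their superblocks say.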

daniel.santos

  • Guest
Re: codemonger.org
« Reply #4 on: 13 December 2009, 13:07:27 »
Quote from ZaRo: "Right. I need to read what the screen says :)"
I posted what the screen says originally (look for the line "mdadm --misc -E /dev/sd{a,b,c}3").  Ahh! Now that I look at it again, I think I see the problem!  It didn't appear that it would let me run the array, because it was claiming (sometimes) that all disks were fine and then other times that they were faulty.  It looks like device 2 (/dev/sdc3) just has its metadata screwed up.  Maybe I can physically disconnect device 2 and see if it will let me start the array with 2 devices that way (because it didn't let me do it before).
« Last Edit: 13 December 2009, 13:09:54 by daniel.santos »
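
The reasoning in this post (two members agree with each other, one does not) can be sketched as a small filter. This is only a heuristic under stated assumptions: the input is "device events" pairs, one per line, `consistent_members` is a hypothetical name, and picking the largest group with the same event count is not how mdadm itself decides, since it normally trusts the highest count.

```shell
# Hypothetical helper: reads "device events" pairs on stdin and prints the
# devices that share the most common event count, sorted by name.
# If two groups are the same size, which one wins is unspecified.
consistent_members() {
  awk '
    { ev[$1] = $2; count[$2]++ }
    END {
      best = ""
      # Find the event count reported by the most members.
      for (e in count)
        if (best == "" || count[e] > count[best]) best = e
      # List every member whose superblock carries that count.
      for (d in ev)
        if (ev[d] == best) print d
    }
  ' | sort
}
```

For a 3-device RAID 5, at least two members have to come out of this filter for a degraded start to be possible at all.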

 
