
kernel/3069: RAIDframe operations cause panics

From: Alex Cichowski <e12(at)tfz.net>
Date: Sat Jan 18 2003 - 03:54:45 EST


>Number: 3069
>Category: kernel
>Synopsis: RAIDframe operations cause panics
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jan 18 08:29:18 MST 2003
>Closed-Date:
>Last-Modified:
>Originator: Alex Cichowski
>Release: OpenBSD 3.2-current (Jan 13 snapshot)
>Organization:
net
>Environment:

	System      : OpenBSD 3.2-current (Jan 13 snapshot)
	Architecture: OpenBSD.i386
	Machine     : i386

>Description:

raidctl -u always causes a panic, and raidctl -R always causes a panic for a non-autoconfigured RAID array. For an autoconfigured RAID array, raidctl seems to cause a panic when run for the second time on the same component. Using raidctl -F to reconstruct onto a spare, however, seems to work. See below for detailed panic output.

Problems like this have been experienced by at least two other people:

	http://marc.theaimsgroup.com/?l=openbsd-misc&m=104251852204413&w=2
	http://marc.theaimsgroup.com/?l=openbsd-misc&m=103833892014496&w=2

The problem occurs on both the 3.2 release and the January 13 snapshot, with kernels built with the GENERIC configuration plus the following line:

	pseudo-device raid 4
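
One way such a configuration could be produced is sketched here. The report does not say how the file was created; only the name GENERIC+RAID and the compile directory appear in the boot banner in the transcript. The helper name make_raid_config is hypothetical, and the build commands in the comments are the usual config/make sequence, assumed rather than taken from the report.

```shell
#!/bin/sh
# Sketch: derive a GENERIC+RAID kernel config from GENERIC by appending the
# RAIDframe pseudo-device line quoted in the report.
# Usage: make_raid_config <path-to-GENERIC> <output-path>
make_raid_config() {
    src=$1
    dst=$2
    # Start from an unmodified copy of the stock configuration.
    cp "$src" "$dst" || return 1
    # Append the single extra line named in the report.
    printf 'pseudo-device\traid\t4\n' >> "$dst"
}

# Possible use on an i386 source tree (assumed procedure):
#   cd /usr/src/sys/arch/i386/conf
#   make_raid_config GENERIC GENERIC+RAID
#   config GENERIC+RAID
#   cd ../compile/GENERIC+RAID && make depend && make
```

The function only touches the two paths it is given, so it can be tried on a scratch copy of GENERIC before being used in a real source tree.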

>How-To-Repeat:

The raidctl -u panic:



OpenBSD 3.2-current (GENERIC+RAID) #0: Fri Jan 17 20:46:54 PST 2003
    root@test.local:/usr/src/sys/arch/i386/compile/GENERIC+RAID

cpu0: disabling processor serial number
cpu0: Intel Pentium III ("GenuineIntel" 686-class, 512KB L2 cache) 499 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SYS,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SIMD
real mem = 133541888 (130412K)
avail mem = 117727232 (114968K)
using 1655 buffers containing 6778880 bytes (6620K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 09/11/00, BIOS32 rev. 0 @ 0xfda74
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
pcibios0 at bios0: rev. 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev. 1.0 @ 0xf2e70/208 (11 entries)
pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82371FB PCI-ISA" rev 0x00)
pcibios0: PCI bus #2 is the last bus
bios0: ROM list: 0xc0000/0xc000
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82820 MCH" rev 0x03: rng active, 8Kb/sec
ppb0 at pci0 dev 1 function 0 "Intel 82820 AGP" rev 0x03
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "ATI Radeon VE QY" rev 0x00
wsdisplay0 at vga1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ppb1 at pci0 dev 30 function 0 "Intel 82801AA Hub-to-PCI" rev 0x02
pci2 at ppb1 bus 2
eap0 at pci2 dev 7 function 0 "Ensoniq AudioPCI97" rev 0x09: irq 10
ac97: codec id 0x43525914 (Cirrus Logic CS4297A rev 4)
ac97: codec features headphone, 20 bit DAC, 18 bit ADC, Crystal Semi 3D
audio0 at eap0
sis0 at pci2 dev 10 function 0 "NS DP83815 10/100" rev 0x00: irq 11 address 00:09:5b:04:58:5a
nsphyter0 at sis0 phy 0: DP83815 10/100 integrated, rev. 1
pcib0 at pci0 dev 31 function 0 "Intel 82801AA LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 82801AA IDE" rev 0x02: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <WDC WD600BB-00CAA1>
wd0: 16-sector PIO, LBA, 57241MB, 16383 cyl, 16 head, 63 sec, 117231408 sectors
wd1 at pciide0 channel 0 drive 1: <WDC WD600BB-00CAA1>
wd1: 16-sector PIO, LBA, 57241MB, 16383 cyl, 16 head, 63 sec, 117231408 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 4
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 4
atapiscsi0 at pciide0 channel 1 drive 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <SONY, CD-RW CRX195E1, ZYS5> SCSI0 5/cdrom removable
cd0(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
uhci0 at pci0 dev 31 function 2 "Intel 82801AA USB" rev 0x02: irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: vendor 0x0000 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
"Intel 82801AA SMBus" rev 0x02 at pci0 dev 31 function 3 not configured
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
sysbeep0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask c440 netmask cc40 ttymask dcc2
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
Kernelized RAIDframe activated
dkcsum: wd0 matched BIOS disk 80
dkcsum: wd1 matched BIOS disk 81
root device? wd0e
root on wd0e
rootdev=0x4 rrootdev=0x304 rawdev=0x302
Automatic boot in progress: starting file system checks.
/dev/rwd0e: file system is clean; not checking
setting tty flags
starting network
add net default: gateway 192.168.2.1
starting system logger
starting rpc daemons:.
savecore: no core dump
checking quotas: done.
building ps databases: kvm dev.
clearing /tmp
starting pre-securelevel daemons:.
setting kernel security level: kern.securelevel: 0 -> 1
preserving editor files
creating runtime link editor directory cache.
starting network daemons: sendmail inetd sshd.
starting local daemons:.
standard daemons: cron.
Sat Jan 18 00:02:03 PST 2003

OpenBSD/i386 (test.local) (tty00)

login: root
Password:
Last login: Fri Jan 17 23:53:22 on tty00
Jan 18 00:02:10 test login: ROOT LOGIN (root) ON tty00
Jan 18 00:02:10 test login: ROOT LOGIN (root) ON tty00
OpenBSD 3.2-current (GENERIC+RAID) #0: Fri Jan 17 20:46:54 PST 2003


Welcome to OpenBSD: The proactively secure Unix-like operating system.

Please use the sendbug(1) utility to report bugs in the system. Before reporting a bug, please try to reproduce it with the latest version of the code. With bug reports, please try to ensure that enough information to reproduce the problem is enclosed, and if a known fix for it exists, include that as well.

You have mail.
Terminal type? [vt100]
Don't login as root, use su
test# sh
# kill `cat /var/run/syslog.pid`

Jan 18 00:02:18 test syslogd: exiting on signal 15
Jan 18 00:02:18 test syslogd: exiting on signal 15
Jan 18 00:02:18 test syslogd: exiting on signal 15
# mount
/dev/wd0e on / type ffs (local)
# disklabel wd0
# using MBR partition 3: type A6 off 63 (0x3f) size 117226242 (0x6fcbb02)
# /dev/rwd0c:

type: ESDI
disk: ESDI/IDE disk
label: WDC WD600BB-00CA
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 117231408
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0

16 partitions:

#        size   offset    fstype   [fsize bsize   cpg]
  a:    41265       63    4.2BSD     1024  8192    16   # (Cyl.    0*- 40)
  b:   523776 100663296      swap                       # (Cyl. 99864*- 100383)
  c: 117231408        0    unused        0     0        # (Cyl.    0 - 116300)
  d: 96258960    41328    4.2BSD     1024  8192    16   # (Cyl.   41 - 95535)
  e:  2097648 96300288    4.2BSD     1024  8192    16   # (Cyl. 95536 - 97616)
  f:  2097648 98565264      RAID                        # (Cyl. 97783 - 99863)
  g:  2097648 101291904    4.2BSD     1024  8192    16  # (Cyl. 100488 - 102568)
# disklabel wd1
# using MBR partition 3: type A6 off 63 (0x3f) size 117226242 (0x6fcbb02)
# /dev/rwd1c:
type: ESDI
disk: ESDI/IDE disk
label: WDC WD600BB-00CA
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 117231408
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0

16 partitions:

#        size   offset    fstype   [fsize bsize   cpg]
  a:    41265       63    4.2BSD     1024  8192    16   # (Cyl.    0*- 40)
  c: 117231408        0    unused        0     0        # (Cyl.    0 - 116300)
  d: 96258960    41328    4.2BSD     1024  8192    16   # (Cyl.   41 - 95535)
  f:  2097648 98565264      RAID                        # (Cyl. 97783 - 99863)
  g:  8388576 106786512    4.2BSD     1024  8192    16  # (Cyl. 105939 - 114260)
# cat >raid0.conf
START array
1 2 0
START disks
/dev/wd0f
/dev/wd1f
START layout
128 1 1 1
START queue
fifo 100
# dd if=/dev/zero bs=1m count=1 of=/dev/wd0f
1+0 records in
1+0 records out
1048576 bytes transferred in 0.028 secs (36509035 bytes/sec)
# dd if=/dev/zero bs=1m count=1 of=/dev/wd1f
1+0 records in
1+0 records out
1048576 bytes transferred in 0.027 secs (37908102 bytes/sec)
# raidctl -C raid0.conf raid0
raid0: Component /dev/wd0f being configured at row: 0 col: 0
         Row: 0 Column: 0 Num Rows: 0 Num Columns: 0
         Version: 0 Serial Number: 0 Mod Counter: 0
         Clean: No Status: 0

Number of rows do not match for: /dev/wd0f.
Number of columns do not match for: /dev/wd0f.
/dev/wd0f is not clean !
raid0: Component /dev/wd1f being configured at row: 0 col: 1
         Row: 0 Column: 0 Num Rows: 0 Num Columns: 0
         Version: 0 Serial Number: 0 Mod Counter: 0
         Clean: No Status: 0

Column out of alignment for: /dev/wd1f.
Number of rows do not match for: /dev/wd1f.
Number of columns do not match for: /dev/wd1f.
/dev/wd1f is not clean !
raid0: There were fatal errors
raid0: Fatal errors being ignored.
raid0 (root)#
# raidctl -I 12345 raid0
raid0: no disk label
# raidctl -i raid0
raid0: no disk label
# raidctl -s raid0
raid0: no disk label
raid0 Components:
           /dev/wd0f: optimal
           /dev/wd1f: optimal

No spares.
Parity status: DIRTY
Reconstruction is 100% complete.
Parity Re-write is 11% complete.
Copyback is 100% complete.
# raidctl -s raid0
raid0: no disk label
raid0 Components:
           /dev/wd0f: optimal
           /dev/wd1f: optimal

No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
# raidctl -u raid0
raid0: no disk label
panic: lockmgr: LK_RELEASE of unlocked lock
Stopped at _Debugger+0x4: leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb> trace
_Debugger(dadf9608,0,0,d024c0b0,d3992c7c) at _Debugger+0x4
_panic(d0221a60,601ed050,2,dad9cc34,dadf83d0) at _panic+0x81
_lockmgr(dadf9608,6,dadf8450,dad9f004,2) at _lockmgr+0x469
_ufs_unlock(dad9c9e8,0,dad9cd18,d035e309,dad9f004) at _ufs_unlock+0x2b
_VOP_UNLOCK(dadf83d0,0,dad9f004,dad9cd14,0) at _VOP_UNLOCK+0x35
_rf_close_component(d0976000,dadf83d0,0,d03222fa,d0974000) at _rf_close_component+0x4c
_rf_UnconfigureVnodes(d0976000,0,dad9caa8,d01de9ea,d0976000) at _rf_UnconfigureVnodes+0x5d
_rf_Shutdown(d0976000,dada147c,dad9cd18,d03222fa,dad9cdbc) at _rf_Shutdown+0xdb
_raidioctl(1302,20007202,dad9ceb8,3,dad5fd94) at _raidioctl+0x6ef
_spec_ioctl(dad9cdbc,dad5fd94,dad9cdb8,d025780d,dad9cda8) at _spec_ioctl+0x96
_spec_vnoperate(dad9cdbc,dad5fd94,dad9ce38,d02568a1,dad7e090) at _spec_vnoperate+0x1b
_VOP_IOCTL(dae08704,20007202,dad9ceb8,3,dad590f0) at _VOP_IOCTL+0x49
_vn_ioctl(dad7e090,20007202,dad9ceb8,dad5fd94,dad5fd94) at _vn_ioctl+0xeb
_sys_ioctl(dad5fd94,dad9cf88,dad9cf7c,d03619e3,0) at _sys_ioctl+0x32d
_syscall() at _syscall+0x26d

--- syscall (number 54) ---
0x12bcb:
ddb> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT       COMMAND
*32550  18445  32550      0  2      0x4006             raidctl
 19205      0      0      0  3    0x100204  rfwcond    raid0
 18445   7649  18445      0  3      0x4086  pause      sh
 18338      1  18338      0  3     0x40184  select     sendmail
  7649      1   7649      0  3      0x4086  pause      csh
 32581      1  32581      0  3      0x4086  ttyin      getty
 13008      1  13008      0  3      0x4086  ttyin      getty
 15729      1  15729      0  3      0x4086  ttyin      getty
 15122      1  15122      0  3      0x4086  ttyin      getty
   693      1    693      0  3      0x4086  ttyin      getty
 26844      1  26844      0  3        0x84  select     cron
  3830      1   3830      0  3        0x84  select     sshd
 28832      1  28832      0  3       0x184  select     inetd
     9      0      0      0  3    0x100204  usbevt     usb0
     8      0      0      0  3    0x100204  apmev      apm0
     7      0      0      0  3    0x100204  crypto_wa  crypto
     6      0      0      0  3    0x100204  aiodoned   aiodoned
     5      0      0      0  3    0x100204  syncer     update
     4      0      0      0  3    0x100204  cleaner    cleaner
     3      0      0      0  3    0x100204  reaper     reaper
     2      0      0      0  3    0x100204  pgdaemon   pagedaemon
     1      0      1      0  3      0x4084  wait       init
     0     -1      0      0  3     0x80204  scheduler  swapper
ddb> show registers
es          0xdad90010  _end+0xa755db8
ds                0x10
edi         0xd0221a60  _lockstatus+0x1f4
esi         0xdad9c974  _end+0xa76271c
ebp         0xdad9c948  _end+0xa7626f0
ebx              0x100
edx         0xd02326a7  _tablefull+0x23
ecx              0x3f8
eax                0x1
eip         0xd03535c0  _Debugger+0x4
cs                 0x8
eflags           0x202
esp         0xdad9c948  _end+0xa7626f0
ss          0xdad90010  _end+0xa755db8

_Debugger+0x4: leave

The raidctl -R panic:



# raidctl -s raid0
raid0: no disk label
raid0 Components:
           /dev/wd0f: optimal
           /dev/wd1f: optimal

No spares.
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.
# raidctl -R /dev/wd1f raid0
raid0: no disk label
Closing the open device: /dev/wd1f
panic: lockmgr: LK_RELEASE of unlocked lock
Stopped at _Debugger+0x4: leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb> trace
_Debugger(dadfdca0,0,0,d28cb000,fffffff6) at _Debugger+0x4
_panic(d0221a60,0,dae54ce4,d0232e37,dadfe3d8) at _panic+0x81
_lockmgr(dadfdca0,6,dadfe458,dada0004,0) at _lockmgr+0x469
_ufs_unlock(dae54cd4,ffffffff,0,23,dada0004) at _ufs_unlock+0x2b
_VOP_UNLOCK(dadfe3d8,0,dada0004,0,60) at _VOP_UNLOCK+0x35
_rf_close_component(d0976000,dadfe3d8,0,d0207f63,d0207cde) at _rf_close_component+0x4c
_rf_ReconstructInPlace(d0976000,0) at _rf_ReconstructInPlace+0x1a5
_rf_ReconstructInPlaceThread(d09c8d60) at _rf_ReconstructInPlaceThread+0x3c
Bad frame pointer: 0xd0695f08
ddb> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT       COMMAND
*27426      0      0      0  2    0x100204             raid_reconip
 14474  29388  14474      0  3      0x4006  biowait    raidctl
  6183      0      0      0  3    0x100204  rfwcond    raid0
 29388  18857  29388      0  3      0x4086  pause      sh
  5329      1   5329      0  3     0x40184  select     sendmail
 18857      1  18857      0  3      0x4086  pause      csh
 25462      1  25462      0  3      0x4086  ttyin      getty
 31613      1  31613      0  3      0x4086  ttyin      getty
  2785      1   2785      0  3      0x4086  ttyin      getty
  3987      1   3987      0  3      0x4086  ttyin      getty
   365      1    365      0  3      0x4086  ttyin      getty
 10960      1  10960      0  3        0x84  select     cron
 27002      1  27002      0  3        0x84  select     sshd
 11637      1  11637      0  3       0x184  select     inetd
     9      0      0      0  3    0x100204  usbevt     usb0
     8      0      0      0  3    0x100204  apmev      apm0
     7      0      0      0  3    0x100204  crypto_wa  crypto
     6      0      0      0  3    0x100204  aiodoned   aiodoned
     5      0      0      0  3    0x100204  syncer     update
     4      0      0      0  3    0x100204  cleaner    cleaner
     3      0      0      0  3    0x100204  reaper     reaper
     2      0      0      0  3    0x100204  pgdaemon   pagedaemon
     1      0      1      0  3      0x4084  wait       init
     0     -1      0      0  3     0x80204  scheduler  swapper
ddb> show registers
es          0xdae50010  _end+0xa815db8
ds             0xd0010
edi         0xd0221a60  _lockstatus+0x1f4
esi         0xdae54c60  _end+0xa81aa08
ebp         0xdae54c34  _end+0xa81a9dc
ebx              0x100
edx         0xd02326a7  _tablefull+0x23
ecx              0x3f8
eax                0x1
eip         0xd03535c0  _Debugger+0x4
cs                 0x8
eflags           0x202
esp         0xdae54c34  _end+0xa81a9dc
ss          0xdae50010  _end+0xa815db8

_Debugger+0x4: leave
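
For convenience, the two transcripts above reduce to a short command sequence, sketched here as a dry-run script. It only prints the commands unless RUN=1 is set. The function names (raid_conf, step, setup_array, repro) are made up for this sketch. The -R transcript does not show how the array was set up again after the first panic, so running the same setup before -R is an assumption.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps in this report.
# By default the commands are only printed. Setting RUN=1 executes them,
# which overwrites the start of both components and is expected to panic.

RAID=raid0
CONF=raid0.conf

# Same configuration as in the transcript: 1 row, 2 columns, 0 spares;
# 128 sectors per stripe unit, RAID level 1; FIFO queue of depth 100.
raid_conf() {
    cat <<'EOF'
START array
1 2 0
START disks
/dev/wd0f
/dev/wd1f
START layout
128 1 1 1
START queue
fifo 100
EOF
}

# Print a command; run it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    fi
}

# Build a fresh array from zeroed components, as in the transcript.
setup_array() {
    if [ "${RUN:-0}" = 1 ]; then
        raid_conf > "$CONF"
    fi
    step dd if=/dev/zero bs=1m count=1 of=/dev/wd0f
    step dd if=/dev/zero bs=1m count=1 of=/dev/wd1f
    step raidctl -C "$CONF" "$RAID"
    step raidctl -I 12345 "$RAID"
    step raidctl -i "$RAID"
    # The reporter checked status again until parity was reported clean.
    step raidctl -s "$RAID"
}

# Mode "u": unconfigure. Mode "R": reconstruct in place on /dev/wd1f.
repro() {
    setup_array
    case "$1" in
        u) step raidctl -u "$RAID" ;;
        R) step raidctl -R /dev/wd1f "$RAID" ;;
        *) echo "usage: repro u|R" >&2; return 1 ;;
    esac
}
```

Calling "repro u" or "repro R" after sourcing the script lists the steps for the corresponding panic. With RUN=1 it should only be used on a scratch machine.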

>Fix:

This message from Greg Oster describes a possible fix:

	http://marc.theaimsgroup.com/?l=openbsd-misc&m=103903524029246&w=2

>Release-Note:
>Audit-Trail:
>Unformatted:
Received on Sat Jan 18 10:32:11 2003


This archive was generated by hypermail 2.1.8 : Wed Aug 23 2006 - 13:29:48 EDT

