Opened 17 years ago
Closed 17 years ago
#46 closed defect (fixed)
Fix the RAID
Reported by: | price | Owned by: | andersk |
---|---|---|---|
Priority: | blocker | Milestone: | Alpha |
Component: | other | Version: | |
Keywords: | | Cc: | |
Description
Our RAID is still mysteriously, horrendously slow in certain circumstances. It starts spewing thousands of the same puzzling errors, and some I/O hangs for minutes.
The vendor blames the errors on the cable connecting the RAID to black-mesa. Tonight we swapped out the cable and one of the two connectors on its ends; nothing improved.
Tonight we also tried booting a 2.6.24 kernel. The error messages recurred, but we could not reproduce the delay. If we fix #41, we can upgrade the kernel; alternatively, we could consider backporting the new driver to our present etch (2.6.18-something) kernel.
We should also go back to the vendor, tell them that replacing the cable didn't help, and ask them to actually help debug the problem.
Change History (11)
comment:1 Changed 17 years ago by quentin
comment:2 Changed 17 years ago by tabbott
So, what were the results of swapping out components?
comment:3 Changed 17 years ago by tabbott
- Owner changed from sipb-xen to quentin
comment:4 Changed 17 years ago by quentin
- Status changed from new to assigned
The results were that nothing fixed it, and a few swaps made it (seem) even worse. We are awaiting a reply to our support ticket (they told us to call them if it was an emergency).
--Quentin
comment:5 Changed 17 years ago by price
The current status from Quentin is that we need to try a new driver they've supplied. This apparently involves recompiling the kernel.
comment:6 Changed 17 years ago by anonymous
- Owner changed from quentin to andersk
- Status changed from accepted to assigned
comment:7 Changed 17 years ago by broder
- Milestone set to Alpha
comment:8 Changed 17 years ago by broder
- Milestone set to Alpha
comment:9 Changed 17 years ago by price
- Priority changed from critical to blocker
comment:10 Changed 17 years ago by price
The new driver was in the kernel booted Saturday night, and all seems to be well so far.
I'll wait a few days before closing this, but if someone else is confident the upgrade fixed it, they should feel free to close.
comment:11 Changed 17 years ago by price
- Resolution set to fixed
- Status changed from assigned to closed
Closing, as it seems to be fixed.
To be clear, it seems that 2.6.24 just recovers from the errors better, rather than spewing them continuously. It is the lack of spewing errors that results in better I/O performance.
We can't actually upgrade the Xen kernel to 2.6.24; #41 refers to guest kernels, not host kernels.
I'm contacting the vendor now.
--Quentin