Ticket #2078 (closed defect: fixed)

Opened 10 years ago

Last modified 10 years ago

glamo-mci.0: ****** insanity timeout

Reported by: Sprite_tm Owned by: openmoko-kernel
Priority: normal Milestone:
Component: kernel Version:
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Estimated Completion (week):
HasPatchForReview: no PatchReviewResult:
Reproducible:

Description

I use Debian on an 8GB SDHC card and compile my kernels from the andy-tracking branch straight from Git. For a while now (since I started using the andy-tracking kernel, probably a few weeks ago), I have had strange hangs with my Freerunner, which I eventually traced back to SD problems using the debug board.

The problem is reproducible by doing a simple 'dd if=/dev/mmcblk0 of=/dev/zero bs=1024k'. On my Freerunner, this exits with an I/O error, sometimes after 22M, sometimes after 220M, but it never reads the complete SDHC card without a fault.
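A scripted version of the same sequential read can make it easier to record exactly how far the read got before failing. The following is only a sketch and not part of the original report: the helper name `read_until_error` and the 1 MiB chunk size (chosen to mirror bs=1024k) are assumptions, and it works on any readable path, so it can be tried on a regular file first.

```python
import os

CHUNK = 1024 * 1024  # 1 MiB, mirroring bs=1024k in the dd command


def read_until_error(path, chunk=CHUNK):
    """Read `path` sequentially and return (bytes_read, error_or_None).

    Hypothetical helper: it reports how many bytes were read before the
    kernel returned an error (for example EIO after a failed SD request).
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                data = os.read(fd, chunk)
            except OSError as exc:
                # Stop at the first failed read and report the progress so far.
                return total, exc
            if not data:  # end of device or file reached without error
                return total, None
            total += len(data)
    finally:
        os.close(fd)
```

Run as root, `read_until_error("/dev/mmcblk0")` would return the byte count reached and the exception raised, or `None` if the whole card was read.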

The errors I get in my dmesg are the following:
[21474792.505000] glamo-mci glamo-mci.0: insanity timeout
[21474792.505000] glamo-mci glamo-mci.0: Error after cmd: 0xc300
[21474792.510000] mmcblk0: error -110 sending read/write command
[21474792.515000] end_request: I/O error, dev mmcblk0, sector 14882256
(after a non-fatal error)

[ 552.545000] glamo-mci glamo-mci.0: insanity timeout
[ 552.545000] glamo-mci glamo-mci.0: Error after cmd: 0x4300
[ 552.550000] mmcblk0: error -110 sending read/write command
(and after this, my userspace hung on trying to do anything with the sd-card)
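For reference, error -110 is -ETIMEDOUT on Linux, and the sector numbers in the end_request lines are in 512-byte units, so they can be converted to byte offsets on the card. A minimal sketch of that arithmetic, assuming log lines in the format shown above (the function name `failing_offsets` is made up for illustration):

```python
import re

SECTOR_SIZE = 512  # the block layer reports sectors in 512-byte units

SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def failing_offsets(log_text):
    """Return (device, sector, byte_offset) for each I/O error line found."""
    results = []
    for match in SECTOR_RE.finditer(log_text):
        sector = int(match.group(2))
        results.append((match.group(1), sector, sector * SECTOR_SIZE))
    return results


sample = "[21474792.515000] end_request: I/O error, dev mmcblk0, sector 14882256"
print(failing_offsets(sample))  # [('mmcblk0', 14882256, 7619715072)]
```

The failure in the first log therefore happened roughly 7.1 GiB into the 8GB card.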

My card is an 8G Sandisk Micro-SDHC card (non-Ultra), class 4, but I had the same hangs with an 8G Sandisk Mobile Ultra SDHC card too, which hints that the problem is independent of the SDHC hardware.

Changing sd_max_clk (16MHz->5MHz) and sd_drive (0->3->6) didn't seem to affect the error.

I just tried reverting to the 2.6.24 kernel which came with the Debian distribution (iirc, that's the same one FSO uses), and the dd succeeds there 100% OK, even at 16MHz sd_max_clk (although it seems to move data a lot slower than with the newer kernel, but that could be just me).

Attachments

mmcblk0-error-110.dmesg (20.6 KB) - added by lindi 10 years ago.
glamo-sd-dont-fail-on-insanity-timeout.diff (539 bytes) - added by Sprite_tm 10 years ago.

Change History

comment:1 Changed 10 years ago by lindi

I have a similar experience. With stable-tracking at 80f4b57fef5dcffb, access to the SD card stops with errors like

[21474629.595000] glamo-mci glamo-mci.0: ****** insanity timeout
[21474629.600000] glamo-mci glamo-mci.0: Error after cmd: 0x4300
[21474629.605000] mmcblk0: error -110 sending read/write command
[21474629.610000] end_request: I/O error, dev mmcblk0, sector 2141144
[21474629.610000] Buffer I/O error on device mmcblk0p2, logical block 265683
[21474662.305000] S3C24XX RTC, (c) 2004,2006 Simtec Electronics

and

[  128.140000] glamo-mci glamo-mci.0: ****** insanity timeout
[  128.140000] glamo-mci glamo-mci.0: Error after cmd: 0xc300
[  128.145000] mmcblk0: error -110 sending read/write command

Here, too, 2.6.24-20081103.git7172ec57 from Debian seems to work just fine wrt. SD use. I am attaching the full dmesg output from a boot where the SD card becomes unusable quite soon (I think I tried to edit a small text file).

Changed 10 years ago by lindi

comment:2 Changed 10 years ago by lindi

Forgot to mention: I have a 2G SD card so this problem is not limited to larger cards.

comment:3 Changed 10 years ago by Sprite_tm

I just compiled a kernel with a small hack: instead of erroring out on the insanity timeout, I let the code-flow continue as if nothing happened. In glamo-mci.c:

 if (insanity_timeout < 0) {
-	cmd->error = -ETIMEDOUT;
+	// cmd->error = -ETIMEDOUT;
 	dev_err(&host->pdev->dev, "****** insanity timeout\n");

This seems to have a good effect. I tried dd'ing my 64MB boot partition to md5sum twice, and both times the md5sum was the same, even though multiple insanity timeouts showed up in the dmesg during each run. Assuming the insanity timeouts aren't tied to a particular sector (which I highly doubt; they seem to be random), this does solve the problem, albeit in an extremely hackish and dirty way.
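The read-twice-and-compare check described here can also be scripted. This is a sketch under assumptions, not the procedure actually used in this comment: the function names `md5_of` and `reads_are_consistent` and the 1 MiB chunk size are invented for illustration.

```python
import hashlib


def md5_of(path, chunk=1024 * 1024):
    """Return the MD5 hex digest of everything readable from `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so a whole partition never sits in memory.
        for block in iter(lambda: fh.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def reads_are_consistent(path, passes=2):
    """Hash `path` several times; True if every pass produced the same digest."""
    digests = {md5_of(path) for _ in range(passes)}
    return len(digests) == 1
```

For example, `reads_are_consistent("/dev/mmcblk0p1")` would repeat the comparison on a partition (the partition name here is only an example). Note that a second pass over a small partition may be served from the page cache rather than from the card.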

comment:4 Changed 10 years ago by lindi

Thanks a lot for your efforts! I tested your hack with stable-tracking 80f4b57fef5dcffb and

sudo pv /dev/mmcblk0 >  /dev/null

completed in 27 minutes (I have a 2G card) and triggered 11 insanity timeouts:

Dec  1 23:32:22 ginger authpriv.notice sudo:    lindi : TTY=pts/1 ; PWD=/home/lindi ; USER=root ; COMMAND=/usr/bin/pv /dev/mmcblk0
Dec  1 23:34:17 ginger user.err kernel: [   37.915000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:36:05 ginger user.err kernel: [  146.065000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:39:53 ginger user.err kernel: [  374.095000] fbcon_event_notify action=9, data=c7027e08
Dec  1 23:40:25 ginger user.err kernel: [  405.600000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:40:40 ginger user.err kernel: [  420.240000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:42:59 ginger user.err kernel: [  559.235000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:43:07 ginger user.err kernel: [  567.750000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:43:21 ginger user.err kernel: [  581.490000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:48:20 ginger user.err kernel: [  881.040000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:49:54 ginger user.err kernel: [  974.250000] fbcon_event_notify action=9, data=c7027e08
Dec  1 23:52:50 ginger user.err kernel: [ 1151.170000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:57:24 ginger user.err kernel: [ 1424.705000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  1 23:59:54 ginger user.err kernel: [ 1574.405000] fbcon_event_notify action=9, data=c7027e08
Dec  1 23:59:57 ginger user.err kernel: [ 1577.255000] glamo-mci glamo-mci.0: ****** insanity timeout
Dec  2 00:00:12 ginger authpriv.notice sudo:    lindi : TTY=pts/1 ; PWD=/home/lindi ; USER=root ; COMMAND=/bin/true
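Counting the timeouts in a log like the one above is easy to automate. The sketch below assumes the syslog/dmesg line format shown in this ticket (with or without the leading asterisks); the name `timeout_timestamps` is hypothetical.

```python
import re

# Matches "[  37.915000] glamo-mci glamo-mci.0: ****** insanity timeout";
# the asterisks are optional because some excerpts omit them.
TIMEOUT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+glamo-mci\s+\S+\s+\**\s*insanity timeout"
)


def timeout_timestamps(log_text):
    """Return the kernel timestamps (seconds) of every insanity timeout line."""
    return [float(m.group(1)) for m in TIMEOUT_RE.finditer(log_text)]


sample = (
    "Dec  1 23:34:17 ginger user.err kernel: [   37.915000] "
    "glamo-mci glamo-mci.0: ****** insanity timeout\n"
    "Dec  1 23:36:05 ginger user.err kernel: [  146.065000] "
    "glamo-mci glamo-mci.0: ****** insanity timeout\n"
    "Dec  1 23:39:53 ginger user.err kernel: [  374.095000] "
    "fbcon_event_notify action=9, data=c7027e08\n"
)
print(len(timeout_timestamps(sample)))  # 2
```

Applied to the full excerpt above, `len(timeout_timestamps(...))` should give the 11 timeouts mentioned, and the gaps between consecutive timestamps show how irregularly they occur.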

comment:5 Changed 10 years ago by andy

Thanks for reporting that downgrading this to a warning hides the problem. Passing the timeout up the stack is going to be fatal, and it seems we can stumble on even with no indication from Glamo that the command completed, so this works around it.

If you have a minute please patchify the change and send it on the kernel list; if you'd rather it get done for you let me know and I'll sort it out tomorrow.

Changed 10 years ago by Sprite_tm

comment:6 Changed 10 years ago by Sprite_tm

Something like this?

comment:7 Changed 10 years ago by andy

Yes... thanks for reminding me... I modified it a little for style and sent it on stable-tracking

http://git.openmoko.org/?p=kernel.git;a=commitdiff;h=7a55cd6f948a33c4452dd99da39e15efe832f2e2

Because andy-tracking is based on stable-tracking, it inherits the change too.

Thanks for finding the workaround and the patch.

comment:8 Changed 10 years ago by andy

  • Status changed from new to closed
  • Resolution set to fixed