Ticket #2274 (new defect)

Opened 4 years ago

Last modified 3 years ago

Kernel regression: white screen of death reappeared with 2.6.29

Reported by: CyrusDreams Owned by: openmoko-kernel
Priority: highest Milestone: stable-kernel-2009.1
Component: kernel Version: GTA02v5
Severity: critical Keywords: WSoD white screen suspend
Cc: Blocked By:
Blocking: Estimated Completion (week):
HasPatchForReview: no PatchReviewResult:
Reproducible:

Description

Steps to reproduce:

  • use uImage-2.6.29-oe10+gitr119792+81c61a7d1abb03aecd13f5395aba355e996a1641-r3.3-om-gta02.bin from SHR-project dated 2009-04-17 (uname -a: Linux om-gta02 2.6.29-rc3 #1 PREEMPT Fri Apr 17 18:34:52 CEST 2009 armv4tl unknown)
  • use together with rootfs from SHR testing from 2009-04-16
  • let the device get into suspend, let it cool down a little bit
  • press power button and the GTA02 will go to the "white screen of death"

Known good version:

  • uImage-2.6.28-oe1+gitr34240a1c06ae36180dee695aa25bbae869b2aa26-r3-om-gta02 from 2009-03-02 (uname -a: Linux om-gta02 2.6.28-rc4 #1 PREEMPT Sun Feb 8 19:53:16 CET 2009 armv4tl unknown), same rootfs with 2.6.28 modules
  • the device wakes up correctly, even when cooled down to approx. 15°C
  • both scenarios use U-Boot 1.3.2-rc2-dirty-moko12 dated 2008-04-02

Change History

comment:1 Changed 4 years ago by arhuaco

  • Priority changed from normal to highest
  • Severity changed from major to critical
  • Milestone set to stable-kernel-2009.1

comment:2 Changed 4 years ago by nicolas.dufresne

A few questions to help debugging:

  • Is it reproducible using Qi?
  • Will it come back to normal if you suspend again and wake it up?

My own testing showed that a big difference between 2.6.24 and 2.6.29 is that the new kernel can recover from WSOD by going through suspend and resume another time. But I never identified why, and I have also never tested 2.6.29 with U-Boot. The answers to those questions may be the key to this long-standing enigma.

comment:3 Changed 4 years ago by CyrusDreams

  • The device will come back to normal if it is suspended again and woken up

comment:4 Changed 4 years ago by CyrusDreams

  • it is not reproducible using Qi

comment:5 Changed 4 years ago by nicolas.dufresne

Thanks for taking the time to test! I think I now know why Andy started working on a reset mechanism for the LCM driver. I guess it's required to support the fact that U-Boot initializes the Glamo and the LCM. I'll try to cleanly reimplement his work.

comment:6 Changed 4 years ago by arhuaco

It's very good to know this. Great!

I'm reverting the commit now. We can apply the patch later when we're ready.

(I'll push as soon as I have tested locally.)

comment:7 Changed 4 years ago by lindi

I just hit this with andy-tracking b4136a36f31a65d0 after resume. SD card access seems to work during WSOD and suspend/resume seems to bring the display back to a usable state. Here's some version number information:

kernel: Linux ginger 2.6.29-GTA02_lindi-andy-tracking-mokodev #1 PREEMPT Sat Apr 25 15:56:52 EEST 2009 armv4tl GNU/Linux
kernel cmdline: rootfstype=jffs2 root=/dev/mtdblock6 console=ttySAC2,115200 console=tty0 loglevel=8 regular_boot mtdparts=physmap-flash:-(nor);neo1973-nand:0x00040000(u-boot),0x00040000(u-boot_env),0x00800000(kernel),0x000a0000(splash),0x00040000(factory),0x0f6a0000(rootfs) rootfstype=ext2 root=/dev/mmcblk0p2 rootdelay=5 mem=127M panic=10
u-boot: U-Boot 1.3.2+gitr46+dc633f4be2527f844158aa5085c278b0c3039d3f (Aug 8 2008 - 03:58:49)
xserver-xorg-core: 2:1.5.99.902-1
xf86-video-glamo: 703acea13
distro: debian gnu/linux unstable
hardware revision: 24420350

comment:8 Changed 4 years ago by seven

Hi, I have experienced a similar problem.

Yesterday I flashed the SHR testing distribution with u-boot, and after flashing I could boot only from NOR until I replaced u-boot with qi.

Otherwise I kept getting static on the screen, the device was unresponsive to input, and I had to pull the battery to get it working again.

Going into the NOR boot menu and selecting the "boot" option did work.

I used these images: (found in http://build.shr-project.org/shr-testing/images/om-gta02/ )
kernel: uImage-2.6.28-stable+gitr0e5fe639e234cdeb11d8441f19c5b3109a8b6a17-r2-om-gta02.bin
rootfs: shr-lite-image-om-gta02.jffs2

I remember that uname -r returned "2.6.29-r2".

Today, 18 July 2009, I flashed om2009 and saw the same behaviour.

-first, I updated the GSM firmware as described here: http://wiki.openmoko.org/wiki/GSM/Flashing#uSD-card_Image_.28GTA02_only.29
-then I flashed u-boot, kernel and rootfs (I NOR-booted between each flash) with the images taken from http://downloads.openmoko.org/distro/testing/NeoFreerunner/ :

-u-boot: u-boot-gta02v5-1.3.1+gitr650149a53dbdd48bf6dfef90930c8ab182adb512-r1.bin
-kernel: uImage-2.6.28-stable+gitr0+f19f259d3c1afde8eae53983fd19f61831927413-r3-om-gta02.bin
-rootfs: fso-paroli-image-om-gta02.jffs2 (dated 16-Jun-2009)

(please note: uname -r still gives 2.6.29-r2)

Again, when I attempted to boot I got static noise and an unresponsive device, and had to pull the battery. Going into the NOR boot menu and selecting the "Boot" entry worked.

Flashing qi ( qi-s3c2442-1.0.2+gitr3b8513d8b3d9615ebda605de4bda18371aa3f359.udfu ) solved the problem.

According to /proc/cpuinfo, my hardware is GTA02 and revision 0360.

comment:9 Changed 3 years ago by joerg

Obviously still not solved.
AIUI, "newer" kernels implement a complete power-down of the LCM by switching off its power regulator (PMU LDO).
On resume, whoever initializes the LCM video chip (JBT6k74) must wait for VDD power to stabilize after the LDO is enabled. Alas, I don't know of any good indicator to probe whether the voltage has stabilized, so we will probably need a usleep() of empirically determined duration here (my guess is that 100 ms should be sufficient).

comment:10 Changed 3 years ago by werner

Section 8.8.6.3 "Power failure detection" of the PMU manual suggests that a wait of 100 ms should be sufficient. For extra paranoia, one could watch INT5 after that.
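
The resume ordering discussed in comments 9 and 10 (enable the LDO, wait about 100 ms, only then initialize the LCM) could be sketched roughly as below. This is a userspace illustration only, not the actual driver code: `struct lcm_state`, `lcm_ldo_enable()`, `lcm_wait_vdd_settle()`, `lcm_init()`, `lcm_resume()` and `LCM_VDD_SETTLE_MS` are hypothetical names invented for this sketch, and the hardware is reduced to three flags. In a real kernel driver the delay would come from a kernel sleep primitive such as msleep() rather than nanosleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Settling time suggested in this ticket; an estimate, not a measured value. */
#define LCM_VDD_SETTLE_MS 100

/* Hypothetical model of the hardware state, for illustration only. */
struct lcm_state {
    bool ldo_enabled;   /* PMU LDO feeding the LCM is switched on */
    bool vdd_settled;   /* VDD assumed stable after the wait */
    bool initialized;   /* init sequence accepted by the LCM */
};

/* Sleep for the given number of milliseconds (userspace stand-in). */
void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Hypothetical: switch the LDO back on; the rail is still ramping up. */
void lcm_ldo_enable(struct lcm_state *s)
{
    s->ldo_enabled = true;
    s->vdd_settled = false;
}

/* Hypothetical: fixed delay, since no "rail is stable" indicator is known. */
void lcm_wait_vdd_settle(struct lcm_state *s)
{
    assert(s->ldo_enabled);          /* waiting only makes sense once the LDO is on */
    sleep_ms(LCM_VDD_SETTLE_MS);
    s->vdd_settled = true;
}

/* Hypothetical: returns 0 on success, -1 if the rail is not ready
 * (the case that would presumably end in a white screen). */
int lcm_init(struct lcm_state *s)
{
    if (!s->ldo_enabled || !s->vdd_settled)
        return -1;
    s->initialized = true;
    return 0;
}

/* Resume path: enable power, wait, then initialize, in that order. */
int lcm_resume(struct lcm_state *s)
{
    lcm_ldo_enable(s);
    lcm_wait_vdd_settle(s);
    return lcm_init(s);
}
```

In this model, calling lcm_init() straight after lcm_ldo_enable() fails, while lcm_resume() succeeds because it inserts the wait. The extra INT5 check mentioned above is left out because its details are not given in this ticket.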
