Ticket #2071 (closed defect: fixed)

Opened 6 years ago

Last modified 6 years ago

Booting failed, stop at wifi driver

Reported by: tick
Owned by: raster
Priority: highest
Milestone:
Component: E - Illume
Version:
Severity: blocker
Keywords:
Cc: testing@…, john_lee@…
Blocked By:
Blocking:
Estimated Completion (week):
HasPatchForReview: no
PatchReviewResult:
Reproducible: always

Description

OE branch: org.openmoko.dev 832877aee5d70374cd3f1ecf930e05771dd96ecd
kernel: f5b973489beb1a1239dfad53e3ad6e36ff7ee958

Updated to the latest rootfs and kernel image:
uImage-2.6.24+gitr0+f5b973489beb1a1239dfad53e3ad6e36ff7ee958-r2-om-gta02.bin
openmoko-openmoko-asu-image-glibc-opk--20081015-om-gta02.rootfs.jffs2

The Neo stops booting when it enters the splash animator.
I used the debug board to capture the log (see the attached file).
It stopped at the modprobe of the wifi module.

Attachments

booting_failed.log (9.6 KB) - added by tick 6 years ago.
Boot failed when initializing wifi
log (148.9 KB) - added by werner 6 years ago.
list_installed (47.1 KB) - added by john_lee 6 years ago.
opkg list_installed

Change History

Changed 6 years ago by tick

Boot failed when initializing wifi

comment:1 Changed 6 years ago by tick

  • Priority changed from normal to highest
  • Owner changed from openmoko-devel to openmoko-kernel
  • Component changed from unknown to System Software
  • Severity changed from normal to blocker
  • Cc wendy added

comment:2 Changed 6 years ago by wendy_hung

  • Cc testing@… added; wendy removed

Tick, thanks!!

comment:3 follow-up: ↓ 5 Changed 6 years ago by werner

How long did you leave the system in that state? Sometimes, JFFS2 garbage
collection kicks in at that point, and it can take several minutes before
anything else happens.

comment:4 Changed 6 years ago by Weiss

I'm experiencing the same problem after my latest "opkg upgrade" after the testing autobuilder got un-stuck this week. On reboot, the kernel seems to start up fine, but freezes just after the boots appear. The grey bar comes down from the top, but no green pixels appear at all.

I re-flashed with the latest testing image and kernel (testing-om-gta02-20081014.rootfs.jffs2 and testing-om-gta02-20081015.uImage.bin) and had exactly the same situation. I flashed back to Om2008.8-update and things were fine again. Just to check, I've just tried the testing image again, with the same result.

For me, it hangs for at least five minutes with no activity.

I'm not entirely sure it's directly related to the kernel, unless there's something bad about this particular revision. I was using a kernel built from the tip of 'stable' previously, with no problems.

comment:5 in reply to: ↑ 3 Changed 6 years ago by tick

Replying to werner:

> How long did you leave the system in that state? Sometimes, JFFS2 garbage
> collection kicks in at that point, and it can take several minutes before
> anything else happens.

At least 30 minutes, and it stays in the same state.

comment:6 Changed 6 years ago by john_lee

  • Cc john_lee added

comment:7 Changed 6 years ago by john_lee

  • Cc john_lee@… added; john_lee removed

comment:8 Changed 6 years ago by werner

Thanks! I've been able to reproduce it now. I've disabled module loading
in my kernel to prevent contamination from the OE build. The last program
invocations I see are
/usr/sbin/exquisite-write -wait
followed by one or two invocations of /sbin/hotplug.
Then the system hangs, apparently with trashed page tables :-(

Could exquisite-write -wait unleash some mayhem? If I suppress execution
of exquisite-write, the system boots normally.

comment:9 Changed 6 years ago by werner

Correction: the page tables are fine. The system just isn't stuck in the
kernel but in user space. If I enter the kernel, everything looks normal.
If I open a shell before running the rc scripts, I can interact with the
system.

For example, such a shell can be launched from /etc/init.d/rcS:

#
# Call all parts in order.
#
/bin/sh </ttySAC2 >/ttySAC2 2>&1 &
exec /etc/init.d/rc S

Warning: this shell doesn't get signals, so don't run anything that
you'd have to stop with ^C.

I've attached a log of the system's final moments. PID 316 is exquisite,
PID 317 is exquisite-write.
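
From a debug shell like the one above, the state of each process can be read straight from /proc, which helps confirm that a hang is in user space. The following is only a sketch: list_proc_states is a hypothetical helper name, and it assumes /proc is mounted and that each /proc/PID/status file has the usual Name: and State: lines.

```shell
#!/bin/sh
# Sketch: print "PID<TAB>state<TAB>name" for every process found under /proc.
# list_proc_states is a hypothetical helper, not something shipped in the image.
# The optional argument lets you point it at a different directory for testing.
list_proc_states() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or are not readable.
        [ -r "$dir/status" ] || continue
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$dir/status")
        state=$(sed -n 's/^State:[[:space:]]*//p' "$dir/status")
        printf '%s\t%s\t%s\n' "${dir##*/}" "$state" "$name"
    done
}
```

Calling list_proc_states without arguments lists the live processes; the lines for the PIDs of exquisite and exquisite-write are the ones of interest here.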

Changed 6 years ago by werner

comment:10 Changed 6 years ago by werner

BTW, I got the strace as follows:

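cd /etc/init.d
mv rcS rcS.real
# Quote the delimiter so that "$@" is written to rcS literally
# instead of being expanded by the current shell.
cat >rcS <<'EOF'
#!/bin/sh
exec /usr/bin/strace -o /ttySAC2 -f /bin/sh -c /etc/init.d/rcS.real "$@"
EOF
chmod 755 rcS
# Device node for ttySAC2 (major 204, minor 66); strace writes its output there.
mknod /ttySAC2 c 204 66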
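
When strace runs with -f and -o, each line of the output starts with the PID of the process that made the call. If the serial output is captured to a file on the host, the last call of every process can be pulled out roughly as follows. This is a sketch under that assumption; last_calls is a hypothetical helper name and the log file name is whatever you saved the capture as.

```shell
#!/bin/sh
# Sketch: print the last logged line for each PID in an "strace -f -o" log.
# last_calls is a hypothetical helper; the log path is supplied by the caller.
last_calls() {
    log="$1"
    # Remember the most recent line per PID, then print them sorted by PID.
    awk '$1 ~ /^[0-9]+$/ { last[$1] = $0 }
         END { for (pid in last) print last[pid] }' "$log" | sort -n
}
```

For example, last_calls on the captured log shows where each process, including exquisite and exquisite-write, made its final system call before the hang.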

comment:11 Changed 6 years ago by Kareema

The source of the problem seems to be "/usr/bin/exquisite-write". It's called from "/etc/init.d/exquisite" during boot. To boot without problems it's enough to comment out the line "exquisite-write -wait 20" in the file "/etc/init.d/exquisite"; the other "exquisite-write" calls don't seem to be problematic.
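
The workaround described above (commenting out the "exquisite-write -wait 20" line) could be scripted roughly like this. It is only a sketch: disable_exquisite_wait is a hypothetical helper name, and it assumes the call sits on a line of its own in the init script.

```shell
#!/bin/sh
# Sketch only: comment out every not-yet-commented line that runs
# "exquisite-write -wait" in the given init script, keeping a backup.
# disable_exquisite_wait is a hypothetical helper, not part of the image.
disable_exquisite_wait() {
    script="$1"
    # Keep the original next to the edited file so it can be restored.
    cp "$script" "$script.orig" || return 1
    sed 's/^\([[:space:]]*\)\([^#]*exquisite-write -wait\)/\1# \2/' \
        "$script.orig" > "$script"
}
```

On the device this would be run as disable_exquisite_wait /etc/init.d/exquisite; copying the .orig file back undoes the change.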

comment:12 Changed 6 years ago by john_lee

  • Owner changed from openmoko-kernel to raster
  • Reproducible set to always
  • Component changed from System Software to E - Illume

comment:13 Changed 6 years ago by john_lee

After removing exquisite from rc*.d/, it boots into illume, but there are no icons; the only runnable app is 'installer' in the lower bar. Using the built-in config of illume brings up the configuration module, but if I click on power settings it shows a blank page.

Changed 6 years ago by john_lee

opkg list_installed

comment:14 Changed 6 years ago by gromgull

no way to add myself to CC without commenting?

comment:15 Changed 6 years ago by john_lee

  • Status changed from new to closed
  • HasPatchForReview unset
  • Resolution set to fixed

The exquisite problem was fixed after updating to svn revision 36882, but this leads to another ticket, #2082.
