Ticket #2229 (accepted defect)

Opened 6 years ago

Last modified 6 years ago

GSM was working fine for months, suddenly will not register using any GSM stack

Reported by: danek2
Owned by: joerg
Priority: normal
Milestone:
Component: hardware
Version: GTA02v5
Severity: critical
Keywords: calypso, gsm, registration
Cc:
Blocked By:
Blocking:
Estimated Completion (week):
HasPatchForReview: no
PatchReviewResult:
Reproducible: always

Description

Disclaimer: I suspect this affects my handset only, as I haven't seen similar reports on the forums or here. One person (BillK on the forums) has a problem he believes is similar, although his phone does still work on some distributions, whereas mine does not. I also don't know if his phone ever worked when using gsm0710muxd. See bug 2215.

I started using the Neo Freerunner as a daily phone (with much difficulty at first) in August '08. I was very happy with it until one day in December, when it suddenly stopped registering on GSM. The phone was on, I had been making some calls in 2008.12, and a few hours later, with the phone still on from before, I tried dialing out and discovered that I wasn't registered. I used a landline to make the call I needed to make, and discovered that people had been trying to contact me for some time and were going straight to voicemail. I rebooted my phone, as had become common practice by that point, but it still did not register. At the time, I assumed that perhaps I was getting poor reception (I was working in an unfamiliar building), but a colleague of mine who was at the same location and has the same GSM provider was able to use his phone normally. Later, when I was outdoors, I tried rebooting again, to no avail. I got to my office, where I had an old GSM phone, and was able to put my SIM in it so I could use a phone for the rest of the day.

When I got home that night, I tried swapping SIMs. I tried three different SIMs from two different carriers, all of which had previously worked with the Freerunner. Then I tried reflashing images. After a couple of reflashes failed to restore functionality, I put the Freerunner down.

I have not once been able to successfully register GSM since the day that it mysteriously stopped working. Every once in a while when I get the time to mess around with the phone, I try some more troubleshooting, but have not had much time to devote to this. Nevertheless, I have tried many different images, including 2007.2 (or 2008.4), 2008.8, 2008.9, 2008.12, Qtopia 4.3 and 4.4, and FSO Milestones 3, 4, and 5. GSM was known to be working with all of these images prior to the time that it stopped working, with the exception of FSO Milestone 5, for the obvious reason that it was not available before the phone stopped working. Not a single one of them has worked since the phone stopped registering.

I have also tried using GSM manually, according to http://wiki.openmoko.org/wiki/Manually_using_GSM - this also fails. I attempted this on 2007.2/2008.4, since the other distributions don't talk to GSM the same way. It hangs on "Connected." and never indicates readiness for AT commands.

I know that the SIM card is visible to the system, because it can see my saved contacts and SMS messages, and because when I set a SIM pin it prompts me for the PIN, accepts the correct one, and rejects an incorrect one.

The firmware has also been updated, though this has not changed anything. It was running moko8 at the time that it suddenly stopped working. After updating to moko10 no change was observed.

I also tried, at BillK's suggestion, writing different UART settings to /dev/ttySAC0, but this never achieved anything, either. See http://lists.openmoko.org/nabble.html#nabble-tt1649863

I don't know what to do anymore at this point, and suspect that my Calypso has mysteriously died. The only thing I haven't erased and restored on the phone yet is the NAND u-boot environment: this got corrupted some time ago, and when I write a new u-boot environment to partition 2 (the partition names are not visible to DFU-Util in my NAND u-boot until I write a u-boot environment) it gets borked again when I next reboot. The only way to boot the phone is to boot into NOR u-boot; otherwise I get a garbled screen. However, I don't think this has anything to do with the GSM registration problem, as the phone had been unbootable from NAND u-boot for about six weeks before GSM stopped registering.

I am attaching three dumps from logread:

logdump.txt was taken from 2008.12 about a week after GSM stopped working.
logdumpwpin.txt was taken from the same setup, after setting a SIM PIN using another phone.
logdumpfso5.txt was taken today using FSO Milestone 5.

To recap:

  • GSM registration worked fine for about four months.
  • It suddenly and mysteriously stopped one day and hasn't worked since.
  • Three different SIMs which were known to work no longer work.
  • It used to work with gsmd, qpe, and gsm0710muxd. It now works with none of these.
  • It was running moko8 at the time that it stopped working. Upgrading to moko10 did not change the situation.

Attachments

logdumps.tar.gz (22.3 KB) - added by danek2 6 years ago.
logdump.txt (43.6 KB) - added by danek2 6 years ago.
logdumpwpin.txt (45.2 KB) - added by danek2 6 years ago.
logdumpfso5.txt (63.4 KB) - added by danek2 6 years ago.
logdumpafternandfix.txt (45.0 KB) - added by danek2 6 years ago.
gsm-debug-log-000 (24.6 KB) - added by slm3095om 6 years ago.
Modem debug output
batt.log (12.7 KB) - added by slm3095om 6 years ago.

Change History

Changed 6 years ago by danek2

comment:1 Changed 6 years ago by danek2

I recently saw a message on the support list giving instructions on how to fix the u-boot environment from the NOR u-boot console. I did this and can now successfully boot from NAND u-boot for the first time in several months. (See http://lists.openmoko.org/nabble.html#nabble-td2391517)

I was excited, as I was hoping that fixing the u-boot environment would somehow make GSM start working again, even though booting from NAND had stopped working a couple of months before GSM stopped working. Unfortunately, after booting the phone, and reflashing a few times, I still could not get GSM to register.

I am attaching a new log file taken from logread after the u-boot env fix. It still seems to be doing the same thing.

Changed 6 years ago by danek2

comment:2 Changed 6 years ago by lindi

http://docs.openmoko.org/trac/attachment/ticket/2229/logdumpafternandfix.txt suddenly sees

AtChat : F : "AT-Command Interpreter ready"

so it looks like your gsm chip just restarted itself? Can you capture debug output of the gsm chip via the headphone connector? You can find instructions on building the cable at http://lists.openmoko.org/pipermail/hardware/2009-April/thread.html

comment:3 follow-up: ↓ 14 Changed 6 years ago by joerg

A reset on registering clearly looks like a weak, worn battery (or dirty contacts): the voltage drops below the magic 3.5 V due to high source impedance when the modem starts to draw current for TX activity.
Please try a spare battery, or probe the actual battery voltage at the connectors with a scope (~0.5 s/div) during registering.
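The source-impedance argument above can be put into numbers with a small sketch. It assumes a simple model (loaded voltage = open-circuit voltage minus current times source resistance); the function name and all input values are hypothetical illustrations, not measurements from this handset.

```shell
#!/bin/sh
# loaded_mv OPEN_CIRCUIT_MV CURRENT_MA SOURCE_MOHM
# Prints the terminal voltage in mV under load, using V = Voc - I * R.
# mA * mOhm gives microvolts, so divide by 1000 to get millivolts.
loaded_mv() {
    echo $(( $1 - $2 * $3 / 1000 ))
}

# Hypothetical worn battery: 3900 mV open circuit, 250 mOhm source
# resistance, 2000 mA TX burst. The result falls below the 3500 mV limit.
loaded_mv 3900 2000 250

# Hypothetical healthy battery: 4100 mV open circuit, 150 mOhm.
loaded_mv 4100 2000 150
```

With these made-up values the first case lands under 3.5 V and the second stays well above it, which is why a short dip during TX would not show up as a low resting voltage.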

comment:4 Changed 6 years ago by joerg

  • Status changed from new to accepted
  • Owner changed from hardware to joerg

comment:5 follow-up: ↓ 7 Changed 6 years ago by slm3095om

I too am suffering from this problem. It started a couple of weeks ago. For testing, I am talking to the modem with cu; here is a log of the conversation:

AT-Command Interpreter ready
AT+CGMR
+CGMR: "GSM: gsm_ac_gp_fd_pu_em_cph_ds_vc_cal35_ri_36_amd8_ts0-Moko11"

OK
AT%CSTAT=1;+CMEE=2;+CREG=2
OK
AT+CFUN=1
%CSTAT: PHB, 0
%CSTAT: PHB, 0
%CSTAT: PHB, 0
%CSTAT: PHB, 0

OK
AT+COPS=0,0
OK

+CREG: 2

+CREG: 5,"2710","C60E"
AT-Command Interpreter ready
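For readers following the log, here is a minimal sketch that decodes the status field of an unsolicited +CREG line, using the status codes defined in 3GPP TS 27.007. The function name creg_status is made up for illustration, and the sketch assumes the unsolicited form shown above (status first); the reply to a query such as AT+CREG? carries an extra leading field and would need different handling.

```shell
#!/bin/sh
# creg_status LINE
# Prints a description of the <stat> field of an unsolicited +CREG line.
creg_status() {
    stat=${1#+CREG: }    # drop the "+CREG: " prefix
    stat=${stat%%,*}     # keep only the text before the first comma
    case "$stat" in
        0) echo "not registered, not searching" ;;
        1) echo "registered, home network" ;;
        2) echo "not registered, searching" ;;
        3) echo "registration denied" ;;
        4) echo "unknown" ;;
        5) echo "registered, roaming" ;;
        *) echo "unrecognized: $stat" ;;
    esac
}

creg_status '+CREG: 2'
creg_status '+CREG: 5,"2710","C60E"'
```

Read this way, the log shows the modem searching, then reporting a roaming registration with a location area code and cell ID, and then printing its startup banner again immediately afterwards.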

I will attach the log file from the modem's debug output.

I do have a spare battery and I have cleaned the contacts. The problem is present with both batteries. Sorry, but I don't have ready access to a scope.

One thing that I did notice in my testing is that the modem resets with 2008.12 but it hangs with 2009. I have to do an 'echo 1 > power_on' to get it back. I thought that was a little odd; I would have expected to at least have to do 'echo 0 > power_on; echo 1 > power_on'.
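The off-then-on sequence I expected can be sketched as a small helper. This is only an illustration: modem_power_cycle is a made-up name, the path to the power_on node is passed in rather than assumed, and whether writing 0 first is needed at all depends on the kernel in use.

```shell
#!/bin/sh
# modem_power_cycle NODE [DELAY_SECONDS]
# Writes 0, waits, then writes 1 to the given power_on node.
modem_power_cycle() {
    node=$1
    echo 0 > "$node" || return 1
    sleep "${2:-1}"      # default to a one-second pause
    echo 1 > "$node"
}

# Demonstration against a temporary file instead of the real sysfs node.
demo=$(mktemp)
modem_power_cycle "$demo" 0
cat "$demo"
rm -f "$demo"
```

On the phone, the first argument would be the full path of the power_on node for the kernel being tested.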

Changed 6 years ago by slm3095om

Modem debug output

comment:6 follow-up: ↓ 8 Changed 6 years ago by slm3095om

In the absence of a scope, I ran the following shell script while I reran my test

# Log a timestamp plus the bq27000 fuel gauge readings in a tight loop.
bat=/sys/devices/platform/bq27000-battery.0/power_supply/bat
while :
do
        ts=$(date +%s)                  # seconds since the epoch
        cap=$(cat "$bat/capacity")      # remaining capacity, percent
        vol=$(cat "$bat/voltage_now")   # battery voltage, microvolts
        cur=$(cat "$bat/current_now")   # battery current, microamps
        echo "$ts $cap $vol $cur"

        usleep 500                      # pause 500 microseconds between samples
done

If those values are anywhere close to accurate, my battery stays above 4.1 volts the entire time.
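To check a captured log without reading it by eye, a sketch along these lines finds the lowest voltage sample and compares it with a limit. It assumes the four-column line format printed by the loop above and that voltage_now is reported in microvolts, as the Linux power supply class specifies; min_voltage and the sample data are hypothetical. It can only be as good as the gauge's update rate.

```shell
#!/bin/sh
# min_voltage [LIMIT_UV]
# Reads "timestamp capacity voltage_uV current_uA" lines on stdin and prints
# the lowest voltage seen, plus whether it fell below the limit
# (default 3500000 uV, i.e. 3.5 V).
min_voltage() {
    awk -v limit_uv="${1:-3500000}" '
        NF >= 3 && (min == "" || $3 + 0 < min) { min = $3 + 0; ts = $1 }
        END {
            if (min == "") { print "no samples"; exit }
            printf "min %.3f V at %s: %s\n", min / 1000000, ts,
                   (min < limit_uv ? "below limit" : "above limit")
        }'
}

# Made-up sample data in the same format as the logging loop.
min_voltage <<'EOF'
1243000000 95 4150000 -250000
1243000001 95 4120000 -480000
1243000002 95 4135000 -300000
EOF
```

Feeding it the attached log would look like `min_voltage < batt.log`, assuming that file keeps the same column layout.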

Changed 6 years ago by slm3095om

comment:7 in reply to: ↑ 5 Changed 6 years ago by joerg

Replying to slm3095om:

One thing that I did notice in my testing is that the modem resets with 2008.12 but it hangs with 2009. I have to do an 'echo 1 > power_on' to get it back. I thought that was a little odd, I would have expected to at least have to do 'echo 0 > power_on; echo 1 > power_on'

The semantics of modem power management have changed in new kernels; usually power_on is handled internally. It's the power pushbutton function of the modem, and new kernels correctly operate this pushbutton for a few seconds when closing the actual power switch to route power to the modem from the battery. Old kernels needed userland to do this, and userland didn't echo 0 > power_on after actuating this button.

If this problem is triggered by modem inrush current on registering, could you please test with a different provider or - better - go to a place with much higher signal (closer to the base station) and try registering there?

comment:8 in reply to: ↑ 6 ; follow-ups: ↓ 10 ↓ 15 Changed 6 years ago by joerg

Replying to slm3095om:

In the absence of a scope, I ran the following shell script while I reran my test

Alas, the voltage data from the battery coulomb counter is very slow. You'd need a real scope to do this test.

Could you all please state which hw version (board revision, and datecode) you got.

comment:9 Changed 6 years ago by joerg

Steve, could you get me one of the devices? Seems to be a hw defect.

comment:10 in reply to: ↑ 8 ; follow-ups: ↓ 11 ↓ 12 Changed 6 years ago by slm3095om

Replying to joerg:

Replying to slm3095om:

In the absence of a scope, I ran the following shell script while I reran my test

Alas, the voltage data from the battery coulomb counter is very slow. You'd need a real scope to do this test.

How long does the voltage drop have to last to affect the modem? I could throw together a "poor man's oscilloscope" using an Arduino board. I can sample one of the 10-bit ADCs about 500 times per second; that should allow me to detect the voltage drop if it lasts for 4 or 5 milliseconds. Not a 10 MHz scope by a long shot, but is it close enough?
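The arithmetic behind that estimate, as a rough sketch: the helper names are made up, and the 5 V ADC reference is an assumption about the Arduino setup rather than something I have measured.

```shell
#!/bin/sh
# samples_in_window RATE_HZ WINDOW_MS
# Whole number of samples that fall inside a dip lasting WINDOW_MS.
samples_in_window() {
    echo $(( $1 * $2 / 1000 ))
}

# adc_step_uv VREF_MV BITS
# Size of one ADC step in microvolts for the given reference and resolution.
adc_step_uv() {
    echo $(( $1 * 1000 / (1 << $2) ))
}

samples_in_window 500 4    # samples caught during a 4 ms dip
adc_step_uv 5000 10        # step size with an assumed 5000 mV reference
```

At 500 samples per second a 4 ms dip spans about two samples, and each ADC step is a little under 5 mV with that reference, so a dip of several hundred millivolts should be visible if it lasts that long.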

Could you all please state which hw version (board revision, and datecode) you got.

from /proc/cpuinfo:
Hardware : GTA02
Revision : 0350

the date code is 20080620

comment:11 in reply to: ↑ 10 Changed 6 years ago by joerg

  • Severity changed from normal to critical

Replying to slm3095om:

How long does the voltage drop have to last to affect the modem? I could throw together a "poor man's oscilloscope" using an Arduino board. I can sample one of the 10-bit ADCs about 500 times per second; that should allow me to detect the voltage drop if it lasts for 4 or 5 milliseconds. Not a 10 MHz scope by a long shot, but is it close enough?

That would be pretty good. A dropout is for sure in the range of tens of ms at least.
Maybe you could check the schematics / placement and test on both pads of R1752 (NC) and on H-TP1701

comment:12 in reply to: ↑ 10 ; follow-up: ↓ 13 Changed 6 years ago by joerg

Replying to slm3095om:

Also you might try to simply attach a rocksolid 3V5..4V5 power source instead of the battery. Beware of overvoltage!

comment:13 in reply to: ↑ 12 Changed 6 years ago by slm3095om

Replying to joerg:

Replying to slm3095om:

Also you might try to simply attach a rocksolid 3V5..4V5 power source instead of the battery. Beware of overvoltage!

One of these days I need to get a decent bench supply... in the meantime, I used 3 D cells. According to my 'Poor Man's Scope' that never dropped below 4.1 V.

Maybe you could check the schematics / placement and test on both pads of R1752 (NC) and on H-TP1701

I'm afraid you're overestimating my capabilities... I'm pretty sure I could hit the test point; I'm not so sure about the resistor. Besides, I'm still trying to figure out how I would hold the battery in place, hold the probe in place, and issue the commands in the ssh session; I would need at least three arms.

If this problem is triggered by modem inrush current on registering, could you please test with a different provider or - better - go to a place with much higher signal (closer to the base station) and try registering there?

I took the phone and my notebook and parked across the street from a tower. I ran the test twice, once with a SIM from Fido (my normal SIM) and once with a SIM from Rogers. In both cases the modem reset. This was with the FIC battery installed.

comment:14 in reply to: ↑ 3 Changed 6 years ago by dadap

Replying to joerg:

A reset on registering clearly looks like a weak, worn battery (or dirty contacts): the voltage drops below the magic 3.5 V due to high source impedance when the modem starts to draw current for TX activity.
Please try a spare battery, or probe the actual battery voltage at the connectors with a scope (~0.5 s/div) during registering.

Hi, this is danek2; I forgot my trac password and didn't set an e-mail the first time around, so I couldn't reset it. Sorry for the long delay, but I've been very busy. (getting married soon... wedding prep requires a non-trivial amount of effort, and already swamped with work to begin with.)

I finally found my Nokia dumbphone with a compatible battery, and it still doesn't register. I don't have access to a scope; only a multimeter. I'd really, really like to just send the phone in to be looked at, and I'd really, really like to have a working unlocked GSM phone in time for our honeymoon in six weeks, and it would be great if that phone was the Freerunner. All of my current GSM phones are locked to US providers.

Can I please just send this in? Is GTA02v5 actually under recall anyway?

comment:15 in reply to: ↑ 8 Changed 6 years ago by dadap

Replying to joerg:

Replying to slm3095om:

In the absence of a scope, I ran the following shell script while I reran my test

Alas, the voltage data from the battery coulomb counter is very slow. You'd need a real scope to do this test.

Could you all please state which hw version (board revision, and datecode) you got.

GTA02v5
SN 8A8603976
Revision 0350 (coincidence?)
Date Code 20080620 (coincidence?)
