Ticket #2257 (new defect)

Opened 6 years ago

Last modified 5 years ago

gsm0710muxd: "Modem does not respond to AT commands"

Reported by: lindi Owned by: openmoko-kernel
Priority: high Milestone: stable-kernel-2009.1
Component: kernel Version: unspecified
Severity: major Keywords:
Cc: Blocked By:
Blocking: Estimated Completion (week):
HasPatchForReview: no PatchReviewResult:
Reproducible: sometimes

Description

calypso/ttySAC0 seems to sometimes go to a state where even repeated invocations of gsm0710muxd fail with "Modem does not respond to AT commands". However, if I "echo 0 > power_on; sleep 3; echo 1 > power_on; echo 1 > reset" I can use

socat - file:/dev/ttySAC0,crtscts,crnl,b115200

just fine to talk to the modem. After that also gsm0710muxd is able to talk to it.

I am using gsm0710muxd abcbcd7cc532a8834906de3fc24c8f8fe7643cd4 and kernel from andy-tracking c1b03e4da22e8dd7a6caccb9e39a9201535ced11
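
For repeated testing, the recovery sequence quoted above can be wrapped in a small helper. This is only a sketch: the function name, the `GSM_SYSFS` and `SETTLE_SECS` variables and the argument order are illustrative assumptions, and the default sysfs path is the one used elsewhere in this ticket.

```shell
#!/bin/sh
# Sketch: power-cycle the Calypso modem through sysfs.
# modem_power_cycle, GSM_SYSFS and SETTLE_SECS are hypothetical names.
modem_power_cycle() {
    # $1: sysfs directory holding power_on and reset (optional)
    # $2: seconds to wait with the modem powered off (optional, default 3)
    dir=${1:-${GSM_SYSFS:-/sys/class/i2c-adapter/i2c-0/0-0073/neo1973-pm-gsm.0}}
    settle=${2:-${SETTLE_SECS:-3}}

    echo 0 > "$dir/power_on" || return 1   # power the modem down
    sleep "$settle"                        # let it settle while off
    echo 1 > "$dir/power_on" || return 1   # power it back up
    echo 1 > "$dir/reset"                  # then pulse reset, as in the report
}
```

Calling `modem_power_cycle` with no arguments should behave like the one-liner in the description; passing a longer second argument makes it easy to experiment with the delay.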

Attachments

gsm0710muxd.strace.gz (13.1 KB) - added by lindi 6 years ago.
strace of gsm0710muxd when the problem occurs
strace.nelson.b4136a36f31a65d0998.txt (9.0 KB) - added by arhuaco 6 years ago.

Change History

Changed 6 years ago by lindi

strace of gsm0710muxd when the problem occurs

comment:1 Changed 6 years ago by andy

This reminds me of the overrun ticket: the errors there went away when the GSM firmware was updated to MOKO11, which operates the handshakes properly. What is the patch level of the Calypso that shows this problem?

comment:2 Changed 6 years ago by lindi

I upgraded to gsm_ac_gp_fd_pu_em_cph_ds_vc_cal35_ri_36_amd8_ts0-Moko11b1 several weeks ago.

comment:3 Changed 6 years ago by andy

OK... on #2180 they also talk about 11beta1 making it go away.

On that bug, though, they had some kind of syslog or non-strace dump of the serial traffic, and it was possible to see the brokenness from that. Can you maybe try to capture that?

comment:4 Changed 6 years ago by lindi

In my case select always times out, so gsm0710muxd does not even try to read data from the serial port. Thus there is nothing to log, right?

comment:5 Changed 6 years ago by andy

Oh so it is never even able to get started even once?

Maybe the problem is in the delays of the self-timed modem startup code added recently, such that the modem is not properly started. Does that fit your understanding of the problem?

comment:6 Changed 6 years ago by lindi

Many times gsm0710muxd works after boot. However, sometimes it goes to a state where even restarting gsm0710muxd does not let it read anything from the modem. I will revert to the known-to-work kernel b8b36e5ec3db71d5 for a while and report back after a week or so.

comment:7 Changed 6 years ago by andy

OK, thanks for reporting about it as always.

comment:8 Changed 6 years ago by BillK

Sounds similar to bug #2215. The description I gave when I raised it is somewhat misleading :(

moko11 didn't fix it for me: I still had to tweak the code in gsm0710muxd and use an external script to get it to talk. It works reliably with current shr-testing with my changes to the mux. It never worked with any gsm0710muxd version, or with any of the firmware versions tried (the original, moko10 and now moko11), until the changes were made. Distros *NOT* using the mux work, though not as well as my current setup.

Is there any more work on the kernel driver side coming, in 2.6.29 perhaps? I saw some things a while back that made me think "that will solve it" :)

Billk

comment:9 Changed 6 years ago by lindi

Ah indeed, "will not register" is very misleading. I ignored it since from the title I thought it was some 3G SIM issue.

comment:10 Changed 6 years ago by lindi

I ran b8b36e5ec3db71d5 for five days and did not hit this bug.

comment:11 Changed 6 years ago by BillK

I flashed shr-unstable yesterday (Sat 14th March) and across 3 reboots, with 10 to 30 minutes between them, the mux didn't connect. Replaced the mux with my changes, and it came straight up, registered and was fine thereafter.

It's still broken :(

BillK

comment:12 Changed 6 years ago by lindi

Yes, b8b36e5ec3db71d5 is the known-to-work kernel from January.

comment:13 Changed 6 years ago by arhuaco

  • Reproducible set to sometimes
  • Priority changed from normal to high
  • Severity changed from normal to major
  • Milestone set to stable-kernel-2009.1

comment:14 Changed 6 years ago by joerg

Dieter reported some finding about an out-of-mem freeze in firmware which he thinks might be caused by a memleak.
Dieter, please have a look into this as well. Thanks
/j

comment:15 Changed 6 years ago by BillK

Flashed shr-unstable last night (built locally - 10th April) - 2.6.29 kernel

Never was able to talk to the GSM over many attempts using the mux - often got this error:

Apr 10 19:43:59 om-gta02 user.debug kernel: [ 185.015000] modem wakeup interrupt
Apr 10 19:44:04 om-gta02 user.debug kernel: [ 189.830000] rxerr: port=0 ch=0x00, rxs=0x0000000c
Apr 10 19:44:04 om-gta02 user.debug kernel: [ 190.075000] modem wakeup interrupt

comment:16 Changed 6 years ago by PaulFertser

BillK, please try lindi's command from the bug description (to connect with socat). If it works, please provide an strace log of a successful attempt.

comment:17 Changed 6 years ago by arhuaco

I'll start testing this bug.

# cd /sys/class/i2c-adapter/i2c-0/0-0073/neo1973-pm-gsm.
# echo 0 > power_on; sleep 3; echo 1 > power_on; echo 1 > reset
# socat - file:/dev/ttySAC0,crtscts,crnl,b115200
ATZ

(No reply; strace shows: select(16, [0 3], [], [], NULL).)

It's the first time I talk to the modem in the GTA02. I should have received a reply from the modem, right?

BTW, I didn't get any reply with b8b36e5ec3db71d5. Do I need to enable echo?

comment:18 Changed 6 years ago by lindi

Yes,

echo 0 > power_on ; sleep 3; echo 1 > power_on; echo 1 > reset; socat - file:/dev/ttySAC0,crtscts,crnl,b115200

prints

AT-Command Interpreter ready

here.

comment:19 follow-up: ↓ 21 Changed 6 years ago by arhuaco

Thanks, it did indeed work for me with the known-to-work kernel.

With the latest andy-tracking I got further with the patch ignore_s3c2410_serial_overruns.patch (https://docs.openmoko.org/trac/ticket/2180#comment:11).

At least the modem sends a reply but it enters a loop (OK, OK, OK..). I will test with SHR unstable later (I don't have gsm0710muxd in the distro I'm using now).

Here is an interesting thread.

http://www.mail-archive.com/openmoko-kernel@lists.openmoko.org/msg08118.html

I'll be reading more about this issue.

comment:20 Changed 6 years ago by mwester@…

The output from nspy may be helpful in tracking down what's really going on in the driver. Latest nspy patch (applies to the head of the andy-tracking branch) can be found at http://moko.mwester.net/download/nspy_andy_tracking_r901d73fe51...patch

Enable (at boot time would be good) by:

echo "1" >/sys/class/i2c-adapter/i2c-0/0-0073/neo1973-pm-gsm.0/nspy_enable

I use the following script to monitor what's going on:

#!/bin/sh
while /bin/true; do
    cat /sys/class/i2c-adapter/i2c-0/0-0073/neo1973-pm-gsm.0/nspy_buffer
    sleep 1
done

Prebuilt kernel (with SHR defconfig) available, for a short time only, at:

http://moko.mwester.net/download/uImage-2.6.28-oe1+gitr119783+901d73fe51f33032b34b2ae5612eb863ec90532a-r3.4mwester-om-gta02.bin

(make sure to check the md5sum, my ISP has been known to twiddle bits: 5e2422398855284e33a2dc3b1ac30167 )
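
One way to script that checksum comparison is sketched below. The `verify_md5` helper is a hypothetical name, and it assumes a coreutils- or busybox-style `md5sum` that prints the digest as the first space-separated field.

```shell
#!/bin/sh
# Sketch: compare a file's MD5 digest against an expected value.
# verify_md5 is a hypothetical helper, not part of any existing tool.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" 2>/dev/null | cut -d' ' -f1)
    # Succeeds only when the digests match (an unreadable file yields
    # an empty digest and therefore a failure).
    [ "$actual" = "$expected" ]
}
```

For the image above, the call would be `verify_md5 <downloaded file> 5e2422398855284e33a2dc3b1ac30167`, and the kernel should only be flashed when it returns success.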

comment:21 in reply to: ↑ 19 Changed 6 years ago by joerg

Replying to arhuaco:

At least the modem sends a reply but it enters a loop (OK, OK, OK..). I will test with SHR unstable later (I don't have gsm0710muxd in the distro I'm using now).

see: http://wiki.openmoko.org/wiki/Neo_1973_and_Neo_FreeRunner_gsm_modem#Avoiding_Infinite_Echos

comment:22 Changed 6 years ago by arhuaco

I've been testing andy-tracking and gsm0710muxd is working well for me. I noticed it resets the modem when it starts.

I'm sorry to say I was confused by an instance of gsm0710muxd running in the background :-/

Do not forget to run "fuser /dev/ttySAC0" before making tests to check that no program is using this port.
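
That check can be turned into a guard at the top of a test script. The sketch below assumes the usual `fuser` behaviour of exiting with status 0 when at least one process has the file open; the function names and the `FUSER` override variable are illustrative only.

```shell
#!/bin/sh
# Sketch: refuse to start a test while the serial port is still in use.
# port_in_use, require_free_port and FUSER are hypothetical names.
port_in_use() {
    # fuser exits 0 when some process holds the file open.
    ${FUSER:-fuser} "$1" >/dev/null 2>&1
}

require_free_port() {
    if port_in_use "$1"; then
        echo "$1 is still held by another process; stop it first" >&2
        return 1
    fi
    return 0
}
```

A test script could then start with `require_free_port /dev/ttySAC0 || exit 1`, so a stray gsm0710muxd instance does not silently skew the results.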

I'll attach the output of strace. It seems to be working well here (I also tested with gsm0710muxd).

Are you still hitting this problem?

Changed 6 years ago by arhuaco

comment:23 Changed 6 years ago by BillK

Yes, with 2.6.28 it's still there, but I can get it to work (see #2215).

With shr-testing 2.6.29 I have not had it work once yet, no matter what I try (it gets the rxerr as above).

I now have the latest shr-testing working with 2.6.28, so the next step is to try mwester's nspy kernel and see what that shows. Tomorrow maybe; work now :)

BillK

comment:24 Changed 6 years ago by arhuaco

I tested with your setup. The shr-testing kernel I flashed is recent enough.

root@om-gta02 ~ $ fuser /dev/ttySAC0
1540
root@om-gta02 ~ $ kill -9 1540
root@om-gta02 ~ $ fuser /dev/ttySAC0

You might want to move /usr/sbin/gsm0710muxd out of the way while you make this test because it can be started while you are testing (it happened to me).

root@om-gta02 ~ $ cd /sys/class/i2c-adapter/i2c-0/0-0073/neo1973-pm-gsm.0/ ; echo 0 > power_on ; sleep 3 ; echo 1 > power_on ; echo 1 > reset ; socat - file:/dev/ttySAC0,crtscts,crnl,b115200
AT-Command Interpreter ready

(Last command worked in the same way 5 times).

Relevant kernel messages after you execute the command (once):

modem wakeup interrupt
rxerr: port=0 ch=0x00, rxs=0x0000000c

I tried the latest andy-tracking and I think the latter rxerr message is OK when you start using the port. I enabled dbg in samsung.h (and added a few other messages, see the diffs at the end) and I got this:

[21474677.270000] modem wakeup interrupt
[21474834.305000] dbg:s3c24xx_serial_stop_rx: port=c04a06fc
[ 32.630000] modem wakeup interrupt
[ 34.010000] dbg:s3c24xx_serial_startup: port=50000000 (f5000000,c04a09b4)
[ 34.015000] rxerr: port=0 ch=0x00, rxs=0x0000000c
[ 34.015000] dbg:break!
[ 34.015000] dbg:S3C2410_UERSTAT_FRAME
[ 34.015000] dbg:flag=0
[ 34.020000] dbg:requesting tx irq...
[ 34.025000] dbg:s3c24xx_serial_startup ok
[ 34.030000] dbg:config: 8bits/char
[ 34.035000] dbg:setting ulcon to 00000003, brddiv to 26, udivslot 00000000
[ 34.035000] dbg:uart: ulcon = 0x00000003, ucon = 0x000003c5, ufcon = 0x00000011

Could you repeat the same test? (without the kernel modifications), just make sure that /usr/sbin/gsm0710muxd doesn't interrupt your tests.

Now I guess even when this is working we might get interruptions once in a while and that is the problem that was originally reported.

gsm0710muxd has a watchdog. I assume that if "echo 0 > power_on; sleep 3; echo 1 > power_on; echo 1 > reset" is working then the watchdog could do exactly the same? (It already resets the modem but there could be a timing issue, perhaps a longer delay is needed before the power_on?).
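
To experiment with the timing question outside the daemon, a retry loop with a growing delay could look like the sketch below. It is not how gsm0710muxd's watchdog is implemented; `reset_until_ready` and both of its command arguments are hypothetical, with the reset command expected to take the off-time in seconds as its only argument.

```shell
#!/bin/sh
# Sketch: retry a modem reset with a doubling delay until a probe succeeds.
# reset_until_ready PROBE_CMD RESET_CMD [MAX_TRIES] [INITIAL_DELAY]
# All names here are hypothetical; this is not gsm0710muxd code.
reset_until_ready() {
    probe=$1
    reset=$2
    tries=${3:-3}
    delay=${4:-3}
    n=1
    while [ "$n" -le "$tries" ]; do
        "$reset" "$delay" || return 2    # reset command itself failed
        if "$probe"; then
            return 0                     # modem answered
        fi
        delay=$(( delay * 2 ))           # wait longer on the next attempt
        n=$(( n + 1 ))
    done
    return 1                             # gave up after MAX_TRIES
}
```

Here the reset command would be a wrapper around the power_on/reset sequence quoted above, and the probe would be whatever check is trusted to show that the modem answers AT commands.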

--- a/drivers/serial/samsung.c
+++ b/drivers/serial/samsung.c
@@ -250,10 +250,14 @@ s3c24xx_serial_rx_chars(int irq, void *dev_id)
 					goto ignore_char;
 			}
 
-			if (uerstat & S3C2410_UERSTAT_FRAME)
+			if (uerstat & S3C2410_UERSTAT_FRAME) {
+				dbg("S3C2410_UERSTAT_FRAME\n");
 				port->icount.frame++;
-			if (uerstat & S3C2410_UERSTAT_OVERRUN)
+			}
+			if (uerstat & S3C2410_UERSTAT_OVERRUN) {
+				dbg("S3C2410_UERSTAT_OVERRUN\n");
 				port->icount.overrun++;
+			}
 
 			uerstat &= port->read_status_mask;
 
@@ -264,6 +268,7 @@ s3c24xx_serial_rx_chars(int irq, void *dev_id)
 			else if (uerstat & (S3C2410_UERSTAT_FRAME |
 					    S3C2410_UERSTAT_OVERRUN))
 				flag = TTY_FRAME;
+			dbg("flag=%d\n", flag);
 		}
 
--- a/drivers/serial/samsung.h
+++ b/drivers/serial/samsung.h
@@ -98,10 +98,6 @@ console_initcall(s3c_serial_console_init)
 #define s3c24xx_console_init(drv, inf) extern void no_console(void)
 #endif
 
-#ifdef CONFIG_SERIAL_SAMSUNG_DEBUG
-
-extern void printascii(const char *);
-
 static void dbg(const char *fmt, ...)
 {
 	va_list va;
@@ -111,9 +107,5 @@ static void dbg(const char *fmt, ...)
 	vsprintf(buff, fmt, va);
 	va_end(va);
 
-	printascii(buff);
+	printk(KERN_ERR "dbg:%s", buff);
 }
-
-#else
-#define dbg(x...) do { } while (0)
-#endif

comment:25 Changed 5 years ago by keroami

Confirmed on gta01v4 with moko11, including a similar recipe to wake up GSM/SIM.

Specifics:

  • always happens on warm reboot on 2.6.29-oe10+gitr119838
  • repeated reboots with 2.6.24-oe5+gitrfb42ce6724 on the exact same SHR 20090808 as well as SHR 20090703 do not ever show the problem
  • frameworkd shows ad infinitum attempts to reach GSM channels, but to no avail
  • when in suspend, the unreachable gsm/SIM will resume the Neo (dmesg: IRQ 17 asserted at resume / GSM wakeup interrupt (IRQ 17))