Ticket #2073 (new defect)

Opened 6 years ago

Last modified 5 years ago

voice-recording.state + arecord: Unable to handle kernel NULL pointer dereference at virtual address 00000000

Reported by: lindi
Owned by: openmoko-kernel
Priority: high
Milestone:
Component: kernel
Version:
Severity: normal
Keywords: ALSA
Cc: testing@…, joerg@…
Blocked By:
Blocking:
Estimated Completion (week):
HasPatchForReview: no
PatchReviewResult:
Reproducible: always

Description

Steps to reproduce:
1) wget http://wildsau.enemy.org/~moko/voice-recording.state
2) arecord -c 1 -f S16_LE -c2 -r22050 | hexdump -C

Expected results:
2) audio is recorded and shown as hex

Actual results:
2) arecord prints

Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Stereo
ALSA lib pcm_params.c:2135:(snd1_pcm_hw_refine_slave) Slave PCM not usable
arecord: set_params:896: Broken configuration for this PCM: no configurations available

and dmesg shows "Unable to handle kernel NULL pointer dereference at virtual address 00000000"

More info:
1) I am using debian with linux from http://downloads.openmoko.org/framework/milestone2/uImage-2.6.24+gitr0+7a1370a816b9348dd8f36a667905dd3533cefc9b-r4-om-gta02.bin
2) The complete kernel message is

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 0 [#1] PREEMPT
Modules linked in: snd_soc_neo1973_gta02_wm8753 snd_soc_s3c24xx_i2s snd_soc_s3c24xx snd_soc_wm8753 snd_soc_core snd_pcm snd_timer snd_page_alloc snd
CPU: 0    Not tainted  (2.6.24 #1)
PC is at __init_begin+0x3fff8000/0x34
LR is at neo1973_gta02_hifi_hw_free+0x30/0x34 [snd_soc_neo1973_gta02_wm8753]
pc : [<00000000>]    lr : [<bf041030>]    psr: a0000013
sp : c7f07e48  ip : bf039160  fp : c7f07e54
r10: bf03edc8  r9 : c7f06000  r8 : bf03c0b0
r7 : c7d456c0  r6 : bf039160  r5 : bf042d80  r4 : c7d40600
r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : bf039160
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: c000717f  Table: 37e38000  DAC: 00000015
Process arecord (pid: 9281, stack limit = 0xc7f06268)
Stack: (0xc7f07e48 to 0xc7f08000)
7e40:                   c7f07e7c c7f07e58 bf02a6e4 bf041010 c7d456c0 c7e74600 
7e60: c7e74708 c7e85200 c7fd83a0 00000000 c7f07e94 c7f07e80 bf0194d4 bf02a67c 
7e80: c7e85200 c7d456c0 c7f07ebc c7f07e98 bf019564 bf019494 c7fd83a0 00000008 
7ea0: c7fd83a0 c7fc2d24 c7875f40 c7c0db20 c7f07eec c7f07ec0 c0096470 bf01952c 
7ec0: 00000000 00000000 c6d08614 c7fd83a0 c7c2a9a0 00000000 c7e852a0 00000000 
7ee0: c7f07efc c7f07ef0 c0096864 c00963d0 c7f07f1c c7f07f00 c0093270 c0096840 
7f00: 00000052 c7c2a9a0 00000001 00000010 c7f07f44 c7f07f20 c00484a0 c0093204 
7f20: c0043d80 c7d24bc0 c7c2a9a0 00000100 000000f8 c00290e8 c7f07f5c c7f07f48 
7f40: c0048544 c0048428 00000001 c7d24bc0 c7f07f74 c7f07f60 c00499e4 c0048504 
7f60: 00001008 402fd764 c7f07f94 c7f07f78 c0049fe8 c0049804 c7f07f9c 000931d0 
7f80: 000931ac 402fd764 c7f07fa4 c7f07f98 c004a008 c0049f48 00000000 c7f07fa8 
7fa0: c0028f40 c004a000 000931d0 000931ac 00000001 00000001 fbad2088 00000008 
7fc0: 000931d0 000931ac 402fd764 000000f8 bef1af68 bef1aef8 fffffffe bef1b24c 
7fe0: 401b7380 bef1aef0 40208f84 4026bfb0 60000010 00000001 00000000 00000000 
Backtrace: 
[<bf041000>] (neo1973_gta02_hifi_hw_free+0x0/0x34 [snd_soc_neo1973_gta02_wm8753]) from [<bf02a6e4>] (soc_pcm_hw_free+0x78/0xcc [snd_soc_core])
[<bf02a66c>] (soc_pcm_hw_free+0x0/0xcc [snd_soc_core]) from [<bf0194d4>] (snd_pcm_release_substream+0x50/0x98 [snd_pcm])
[<bf019484>] (snd_pcm_release_substream+0x0/0x98 [snd_pcm]) from [<bf019564>] (snd_pcm_release+0x48/0x8c [snd_pcm])
 r4:c7d456c0
[<bf01951c>] (snd_pcm_release+0x0/0x8c [snd_pcm]) from [<c0096470>] (__fput+0xb0/0x194)
 r8:c7c0db20 r7:c7875f40 r6:c7fc2d24 r5:c7fd83a0 r4:00000008
[<c00963c0>] (__fput+0x0/0x194) from [<c0096864>] (fput+0x34/0x38)
 r8:00000000 r7:c7e852a0 r6:00000000 r5:c7c2a9a0 r4:c7fd83a0
[<c0096830>] (fput+0x0/0x38) from [<c0093270>] (filp_close+0x7c/0x88)
[<c00931f4>] (filp_close+0x0/0x88) from [<c00484a0>] (put_files_struct+0x88/0xdc)
 r6:00000010 r5:00000001 r4:c7c2a9a0
[<c0048418>] (put_files_struct+0x0/0xdc) from [<c0048544>] (__exit_files+0x50/0x54)
 r8:c00290e8 r7:000000f8 r6:00000100 r5:c7c2a9a0 r4:c7d24bc0
[<c00484f4>] (__exit_files+0x0/0x54) from [<c00499e4>] (do_exit+0x1f0/0x744)
 r5:c7d24bc0 r4:00000001
[<c00497f4>] (do_exit+0x0/0x744) from [<c0049fe8>] (do_group_exit+0xb0/0xb8)
[<c0049f38>] (do_group_exit+0x0/0xb8) from [<c004a008>] (sys_exit_group+0x18/0x1c)
 r6:402fd764 r5:000931ac r4:000931d0
[<c0049ff0>] (sys_exit_group+0x0/0x1c) from [<c0028f40>] (ret_fast_syscall+0x0/0x2c)
Code: bad PC value.
---[ end trace e31040251ef66513 ]---
Fixing recursive fault but reboot is needed!

Please let me know if you are unable to reproduce this with some version of linux. I am more than happy to provide more info (and compile new version if you have suggested patches).
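
A note on reading the oops above: the "__init_begin+0x3fff8000" label on the PC line is probably not meaningful by itself. The register dump gives pc = 00000000, and a symbol+offset pair is just base plus offset in 32-bit arithmetic, so the label only says that the nearest known symbol sits at 0xc0008000. The LR line is the useful one. The sketch below redoes that arithmetic; the helper name symbol_base is made up for illustration and is not a kernel function.

```c
#include <stdint.h>

/*
 * Recover the start address of a symbol from an oops entry of the form
 * "symbol+0xOFFSET", given the raw 32-bit address printed next to it.
 * Unsigned subtraction wraps modulo 2^32, matching a 32-bit address space.
 * (symbol_base is a hypothetical helper, not part of the kernel.)
 */
uint32_t symbol_base(uint32_t address, uint32_t offset)
{
    return address - offset;
}
```

With the values from the first oops, symbol_base(0x00000000, 0x3fff8000) gives 0xc0008000, and symbol_base(0xbf041030, 0x30) gives 0xbf041000, which matches the [<bf041000>] entry for neo1973_gta02_hifi_hw_free+0x0 in the backtrace.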

Attachments

arecord-crash1.txt (16.5 KB) - added by lindi 6 years ago.

Change History

comment:1 Changed 6 years ago by exactt

hi,

the kernel you are using is from july and hopelessly outdated. please try this with milestone3 kernel.

thx

comment:2 Changed 6 years ago by lindi

I can reproduce this also with

http://downloads.openmoko.org/framework/milestone3/uImage-2.6.24+r5+gitr1+ca19d156400f817960efe0d14680324b2ea34171-r5-om-gta02.bin
http://downloads.openmoko.org/framework/milestone3/modules-2.6.24+r5+gitr1+ca19d156400f817960efe0d14680324b2ea34171-r5-om-gta02.tgz

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 0 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (2.6.24 #1)
PC is at __init_begin+0x3fff8000/0x34
LR is at neo1973_gta02_hifi_hw_free+0x30/0x34
pc : [<00000000>]    lr : [<c024c0c0>]    psr: a0000013
sp : c7ef3e48  ip : c0403238  fp : c7ef3e54
r10: c03d4f58  r9 : c7ef2000  r8 : c03d4ef8
r7 : c7d39ac0  r6 : c0403238  r5 : c03d55f8  r4 : c7d19600
r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : c0403238
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: c000717f  Table: 36c54000  DAC: 00000015
Process arecord (pid: 1277, stack limit = 0xc7ef2268)
Stack: (0xc7ef3e48 to 0xc7ef4000)
3e40:                   c7ef3e7c c7ef3e58 c024626c c024c0a0 c7d39ac0 c7d57800 
3e60: c7d57908 c7f6a1e0 c7ef4ba0 00000000 c7ef3e94 c7ef3e80 c02339ec c0246204 
3e80: c7f6a1e0 c7d39ac0 c7ef3ebc c7ef3e98 c0233a7c c02339ac c7ef4ba0 00000008 
3ea0: c7ef4ba0 c7f90e6c c7893dec c7c0dd20 c7ef3eec c7ef3ec0 c00975b8 c0233a44 
3ec0: 00000000 00000000 c7f5c6bc c7ef4ba0 c7c13d40 00000000 c7f6a260 00000000 
3ee0: c7ef3efc c7ef3ef0 c00979ac c0097518 c7ef3f1c c7ef3f00 c00943b8 c0097988 
3f00: 00000052 c7c13d40 00000001 00000010 c7ef3f44 c7ef3f20 c00495e8 c009434c 
3f20: c0044ec8 c7f36920 c7c13d40 00000100 000000f8 c002a0e8 c7ef3f5c c7ef3f48 
3f40: c004968c c0049570 00000001 c7f36920 c7ef3f74 c7ef3f60 c004ab2c c004964c 
3f60: 00001008 402fd764 c7ef3f94 c7ef3f78 c004b130 c004a94c c7ef3f9c 000931d0 
3f80: 000931ac 402fd764 c7ef3fa4 c7ef3f98 c004b150 c004b090 00000000 c7ef3fa8 
3fa0: c0029f40 c004b148 000931d0 000931ac 00000001 00000001 fbad2088 00000008 
3fc0: 000931d0 000931ac 402fd764 000000f8 be8ba3a8 be8ba338 fffffffe be8ba68c 
3fe0: 401b7380 be8ba330 40208f84 4026bfb0 60000010 00000001 6d616c67 636d2d6f 
Backtrace: 
[<c024c090>] (neo1973_gta02_hifi_hw_free+0x0/0x34) from [<c024626c>] (soc_pcm_hw_free+0x78/0xcc)
[<c02461f4>] (soc_pcm_hw_free+0x0/0xcc) from [<c02339ec>] (snd_pcm_release_substream+0x50/0x98)
[<c023399c>] (snd_pcm_release_substream+0x0/0x98) from [<c0233a7c>] (snd_pcm_release+0x48/0x8c)
 r4:c7d39ac0
[<c0233a34>] (snd_pcm_release+0x0/0x8c) from [<c00975b8>] (__fput+0xb0/0x194)
 r8:c7c0dd20 r7:c7893dec r6:c7f90e6c r5:c7ef4ba0 r4:00000008
[<c0097508>] (__fput+0x0/0x194) from [<c00979ac>] (fput+0x34/0x38)
 r8:00000000 r7:c7f6a260 r6:00000000 r5:c7c13d40 r4:c7ef4ba0
[<c0097978>] (fput+0x0/0x38) from [<c00943b8>] (filp_close+0x7c/0x88)
[<c009433c>] (filp_close+0x0/0x88) from [<c00495e8>] (put_files_struct+0x88/0xdc)
 r6:00000010 r5:00000001 r4:c7c13d40
[<c0049560>] (put_files_struct+0x0/0xdc) from [<c004968c>] (__exit_files+0x50/0x54)
 r8:c002a0e8 r7:000000f8 r6:00000100 r5:c7c13d40 r4:c7f36920
[<c004963c>] (__exit_files+0x0/0x54) from [<c004ab2c>] (do_exit+0x1f0/0x744)
 r5:c7f36920 r4:00000001
[<c004a93c>] (do_exit+0x0/0x744) from [<c004b130>] (do_group_exit+0xb0/0xb8)
[<c004b080>] (do_group_exit+0x0/0xb8) from [<c004b150>] (sys_exit_group+0x18/0x1c)
 r6:402fd764 r5:000931ac r4:000931d0
[<c004b138>] (sys_exit_group+0x0/0x1c) from [<c0029f40>] (ret_fast_syscall+0x0/0x2c)
Code: bad PC value.
---[ end trace 5adabf8113763f41 ]---
Fixing recursive fault but reboot is needed!

comment:3 Changed 6 years ago by erl

I got this under an updated FSO-testing distribution as well.

comment:4 Changed 6 years ago by erl

I think this is triggered by the ALSA control "DAI mode" being set to 1 rather than 0.

This wiki page: http://wiki.openmoko.org/wiki/Recording_audio says that DAI mode 1 is for recording.

comment:5 Changed 6 years ago by lindi

I noticed that if I start arecord first and restore voice-recording.state while it is running, arecord starts to receive data and the kernel does not crash. However, I also need to restore my normal ALSA state before I quit arecord, or I get a kernel crash again.

comment:6 Changed 6 years ago by lindi

Any idea whether this bug has already been fixed in some git branch? It makes VoIP software very hard to use, since I cannot reliably know when the program has opened the audio device so that I can toggle DAI mode on. Some programs only open the audio device on an incoming call, which means I need to change the VoIP program itself to call "alsactl -f voice-recording.state restore".

comment:7 Changed 6 years ago by erl

FYI: There are some comments in bug #2073, which is a dupe of this bug.

comment:8 Changed 6 years ago by joerg

  • HasPatchForReview unset

probably erl meant #2179

from those comments by laf0rge it seems you could workaround by avoiding
-r22050
for arecord.
For the internal mic -c 2 is meaningless, and -r 22050 also seems a little exaggerated ;-)

comment:9 Changed 6 years ago by lindi

What rate should I use then? I get the oops if I don't specify any rate at all.

comment:10 Changed 6 years ago by andy

OOPS is not a legitimate response to any rate, it's a bug.

comment:11 Changed 6 years ago by lindi

With

1) alsactl -f voice-recording.state restore
2) arecord -c 1 -f S16_LE -c2 -r22050

I see that in

runtime->hw.rates =
    codec_dai->capture.rates & cpu_dai->capture.rates;

of soc-core.c

codec_dai->capture.rates == 0x00000000
cpu_dai->capture.rates == 0x000006fe

when arecord tries to open the device. It also seems that
neo1973_gta02_hifi_hw_params of neo1973_gta02_wm8753.c is not
called. If I run arecord first then this function is called.
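
To make the numbers above concrete: the hardware rate mask is the bitwise AND of what the codec DAI and the CPU DAI advertise, so a codec mask of 0 leaves no rate at all, which would fit the "no configurations available" message from arecord. A minimal sketch of that intersection follows. The rate table is a local copy that assumes the usual ALSA bit order (bit 0 = 5512 Hz up to bit 12 = 192000 Hz), and the helper names are hypothetical, not taken from soc-core.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed ALSA rate bit order: bit i of a rate mask stands for rate_hz[i]. */
static const unsigned int rate_hz[] = {
    5512, 8000, 11025, 16000, 22050, 32000, 44100,
    48000, 64000, 88200, 96000, 176400, 192000
};
#define NUM_RATES (sizeof(rate_hz) / sizeof(rate_hz[0]))

/* Rates usable by the stream: only those both ends support. */
uint32_t common_rates(uint32_t codec_rates, uint32_t cpu_rates)
{
    return codec_rates & cpu_rates;
}

/* Count how many of the known rates are set in the mask. */
unsigned int count_rates(uint32_t mask)
{
    unsigned int n = 0;
    for (size_t i = 0; i < NUM_RATES; i++)
        if (mask & (UINT32_C(1) << i))
            n++;
    return n;
}

/* Print the rates in the mask, or a note if the mask is empty. */
void print_rates(const char *label, uint32_t mask)
{
    printf("%s (0x%08x):", label, (unsigned int)mask);
    if (count_rates(mask) == 0)
        printf(" no rates");
    for (size_t i = 0; i < NUM_RATES; i++)
        if (mask & (UINT32_C(1) << i))
            printf(" %u", rate_hz[i]);
    printf("\n");
}
```

Calling common_rates(0x00000000, 0x000006fe) returns 0 whatever the CPU side supports. Under the assumed bit order, 0x6fe on its own covers nine rates from 8000 to 96000 Hz, with 64000 Hz excluded.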

comment:12 Changed 6 years ago by lindi

If I just avoid dereferencing the NULL pointer I can still sometimes make the kernel crash differently. I did

lindi@ginger:/etc/alsa-scenarios$ sudo chvt 1
lindi@ginger:/etc/alsa-scenarios$ alsactl -f stereoout.state restore
lindi@ginger:/etc/alsa-scenarios$ alsactl -f voice-recording.state restore
lindi@ginger:/etc/alsa-scenarios$ arecord -f S16_LE -c1 -r44100 /dev/shm/a.wav &
[1] 1777
lindi@ginger:/etc/alsa-scenarios$ arecord: main:564: audio open error: Invalid argument

[1]+  Exit 1                  arecord -f S16_LE -c1 -r44100 /dev/shm/a.wav
lindi@ginger:/etc/alsa-scenarios$ alsactl -f stereoout.state restore
lindi@ginger:/etc/alsa-scenarios$ arecord -f S16_LE -c1 -r44100 /dev/shm/a.wav &
[1] 1780
lindi@ginger:/etc/alsa-scenarios$ Recording WAVE '/dev/shm/a.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Mono

lindi@ginger:/etc/alsa-scenarios$ alsactl -f stereoout.state restore

and with the patch for #2135 I was able to capture the full crash message after reboot. I'll attach a complete version to this report but the interesting parts are:

INFO: trying to register non-static key.
[21474693.155000] the code is fine but needs lockdep annotation.
[21474693.155000] turning off the locking correctness validator.
[21474693.155000] [<c03a6378>] (dump_stack+0x0/0x14) from [<c0075a74>] (__lock_acquire+0x1a0/0x794)
...
[21474693.155000] Unable to handle kernel NULL pointer dereference at virtual address 00000004
...
[21474693.155000] [<c02e6b40>] (s3c24xx_pcm_pointer+0x0/0x84) from [<c02d3e34>] (snd_pcm_period_elapsed+0xbc/0x330)
[21474693.155000]  r6:c7a64200 r5:c04f41fc r4:c7a73000
[21474693.155000] [<c02d3d78>] (snd_pcm_period_elapsed+0x0/0x330) from [<c02e6f6c>] (s3c24xx_audio_buffdone+0x2c/0xd0)
[21474693.155000] [<c02e6f40>] (s3c24xx_audio_buffdone+0x0/0xd0) from [<c00431e8>] (s3c2410_dma_irq+0xc4/0x4c4)
[21474693.155000]  r7:00000022 r6:00000000 r5:c04f41fc r4:c7a163c0
[21474693.155000] [<c0043124>] (s3c2410_dma_irq+0x0/0x4c4) from [<c00830c8>] (handle_IRQ_event+0x2c/0x68)
[21474693.155000]  r7:00000022 r6:00000000 r5:00000000 r4:c6c4f980
...
[21474693.160000] Kernel panic - not syncing: Fatal exception in interrupt
[21474693.165000] Rebooting in 10 seconds..arch_reset: attempting watchdog reset

Changed 6 years ago by lindi

comment:13 follow-up: ↓ 18 Changed 5 years ago by lindi

Graeme just pointed out I should use voip-handset.state. And indeed, "arecord | aplay" works perfectly with that!

comment:14 Changed 5 years ago by arhuaco

  • HasPatchForReview set

Replying to lindi:

Graeme just pointed out I should use voip-handset.state. And indeed, "arecord | aplay" works perfectly with that!

It's really good to know that! As Paul said on IRC the driver should never crash. He wrote a nice patch that I'm testing now.

https://paulfertser.is-a-geek.org/files/0001-Hack-to-temporarily-avoid-Oops-on-recording-with-DAI.patch

comment:15 Changed 5 years ago by arhuaco

  • Keywords ALSA added
  • Priority changed from normal to high

I feel like a hen raising ducks with this ALSA thing :-) I'll try to help anyway.

How do you feel about the dummy states we have in sound/soc/codecs/wm8753.c?

  • Is it OK to have dummy states?
  • Why do we have them in the first place?

We already know that Paul's patch avoids the crash but
there are more dummy states and another ALSA state that
works by using another mode. Thus I wonder if we have to:

  1. Allow dummy states as they are now and just have some code to prevent you from opening the device when a dummy state is selected so that we don't crash.

or

  2. Do something similar to what Paul did in his patch, for all the other states.

I'm worried that if we do (2) we will end up with different states that
do the same thing and create even more confusion.

Please advise...

comment:16 follow-up: ↓ 17 Changed 5 years ago by PaulFertser

Please don't do both.

As now Mark Brown has finally received a GTA02 he is the most likely to fix any remaining breakage. I also studied the datasheet and the sources and now feel more confident with these issues, but fixing everything properly needs time.

You should consider that last patch of mine a dirty hack to "get by" while waiting for a proper solution. But as there is no reason to use DAI 1 mode anyway (DAI 2 with the voip state works OK), I consider it irrelevant.

First of all, to understand this DAI stuff, please take a look at the new routing diagrams I posted at http://wiki.openmoko.org/wiki/Neo_1973_audio_subsystem .

The basic idea is that WM8753 has 2 physical DAIs (Digital Audio Interfaces), one connected to BT module (Bluetooth DAI) and the other connected to SoC (HiFi DAI). These DAIs are exposed from the ALSA layer as 2 devices (WM8753 HiFi WM8753 and Voice WM8753) on one card (neo1973-gta02). ALSA names for those are hw:0,0 and hw:0,1 respectively.

It also has 3 DACs (called Vx aka Voice DAC and Left/Right aka HiFi DAC) and one ADC. All those DAC/ADCs can be connected in 4 mutually exclusive combinations to the DAIs. Also a switch exists that allows feeding both DAIs at the same time with the same ADC signal in any of these 4 modes.

In my understanding any operation (like setting format and rate) on a DAI should affect all DAC/ADCs connected to it at the time of the operation (depending on the currently selected DAI mode). Usually, the format is set when some application (aplay/arecord) opens a pcm for playback/capture. But transferring sound through the Bluetooth DAI to/from the SoC is impossible, and therefore a special utility is needed to set the parameters of the DAC/ADC currently connected to the BT DAI. This is an example of such a utility: http://opensource.wolfsonmicro.com/~gg/bluetooth-pcm/bluetooth_pcm.c .

As to the bug itself, I think we have 3 issues here:

First issue is that when we have a "dummy" DAI (that is, one not connected to any DAC/ADC) the kernel oopses on closing the pcm. It seems that this needs to be fixed properly.

Second issue is that in my understanding DAI 1 is defined incorrectly because in this mode HiFi DAI is connected to Vx DAC and ADC, so it shouldn't be dummy and should set parameters for those DAC and ADC. Bluetooth DAI for this mode should be dummy instead.

Third issue is s3c24xx_pcm_pointer crash that probably should be investigated independently.
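
For the first issue, the oops is consistent with a callback pointer that is NULL for a dummy DAI being called unconditionally (pc is 0, lr is inside neo1973_gta02_hifi_hw_free). The usual defensive pattern is to test the pointer before calling it. The sketch below shows only that pattern in user-space C; struct dai_ops, its release member, and dai_release are hypothetical names for illustration. They are not the actual ASoC structures and not the patch linked earlier in this ticket.

```c
#include <stddef.h>

/* Hypothetical, simplified callback table for one DAI. */
struct dai_ops {
    int (*release)(void); /* left NULL by a "dummy" DAI */
};

/*
 * Call the release callback only if it exists.
 * Returns 0 when there is nothing to call, else the callback's result.
 */
int dai_release(const struct dai_ops *ops)
{
    if (ops == NULL || ops->release == NULL)
        return 0;
    return ops->release();
}

/* Example callback standing in for a real DAI's cleanup; always reports 1. */
int example_release(void)
{
    return 1;
}
```

A guard like this only avoids the jump to address 0. It does not address whether the DAI should be a dummy in that mode at all, which is the second issue above.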

comment:17 in reply to: ↑ 16 Changed 5 years ago by arhuaco

  • Owner changed from openmoko-devel to openmoko-kernel
  • HasPatchForReview unset
  • Component changed from unknown to System Software

Replying to PaulFertser:

Please don't do both.

As now Mark Brown has finally received a GTA02 he is the most likely to fix any remaining breakage. I also studied the datasheet and the sources and now feel more confident with these issues, but fixing everything properly needs time.

Great. Not many people in this galaxy know how to fix this!

This is a very nice report you wrote. Now I'm jet-lagged as hell thus I will read it carefully tomorrow.

comment:18 in reply to: ↑ 13 Changed 5 years ago by arhuaco

  • Cc joerg added

Replying to lindi:

Graeme just pointed out I should use voip-handset.state. And indeed, "arecord | aplay" works perfectly with that!

I was asked about the voip-handset.state file, and Joerg tells me there are a few versions of it. Is there a link to the version that is working for you, or to the one you think is good?

comment:19 Changed 5 years ago by arhuaco

Joerg,

I think it would be safe for us to assume that the file is /usr/share/openmoko/scenarios/voip-handset.stat , don't you think?

This program is working and it wouldn't work otherwise: http://wiki.openmoko.org/wiki/Voicenote

If there is some improvement about this file I guess we will be notified somehow anyway.

I apologize if I wasted somebody's time with this question.

comment:20 Changed 5 years ago by arhuaco

Joerg, I apologize if I seemed rude to you in the last update of this ticket, because I was. I know you really tried to help me make some progress with my task both via private email and on IRC. Thanks for that, really.

It's just that we had misleading information around and that is not your fault.

I updated this wiki page and I hope it makes more sense now.

http://wiki.openmoko.org/wiki/Recording_audio

I'll assume the ALSA state we have there works.

comment:21 Changed 5 years ago by sushama

  • Cc testing@… added; joerg removed

wget http://wildsau.enemy.org/~moko/voice-recording.state
alsactl -f voice-recording.state restore
arecord -D hw -f cd -v -t wav rec.wav
=> Segmentation fault
Since we still get the 'Segmentation fault', I am leaving this open.

comment:22 Changed 5 years ago by broonie

The oops should be fixed in current andy-tracking, though I've not confirmed this yet due to user space issues on my phone.

comment:23 Changed 5 years ago by joerg

  • Cc joerg@… added

Sushama
Please could you explain why you removed my CC?

comment:24 Changed 5 years ago by sushama

That was accidental, sorry about that.
