Ticket #1714 (accepted defect)

Opened 10 years ago

Last modified 10 years ago

Mediaplayer sometimes does not start up.

Reported by: hedora Owned by: Abraxa
Priority: normal Milestone:
Component: unknown Version:
Severity: major Keywords: HasPatch
Cc: Blocked By:
Blocking: Estimated Completion (week):
HasPatchForReview: yes PatchReviewResult:
Reproducible:

Description

Mediaplayer checks to see if another instance of itself is running at startup. If it detects another instance, it fails to start.

Unfortunately, the current locking code contains some race conditions and other bugs. I marked this "major" because I often have to manually remove lockfiles with the current version, or mediaplayer won't start.

I've attached a patch that obtains a lock based on the idea that "rename()" is atomic. It handles many corner cases the other code misses, but only works if "openmoko-mediaplayer" occurs somewhere in argv[0] of the other process.

The patch corrects the problem for me.

Attachments

mediaplayer-atomic-lock.patch (9.9 KB) - added by hedora 10 years ago.
Fixed case when /proc/?/cmdline has unexpected eof by assuming the other process died.
mediaplayer-atomic-lock-second-try.patch (11.0 KB) - added by hedora 10 years ago.
Fixed sigusr1 handling.

Change History

Changed 10 years ago by hedora

Fixed case when /proc/?/cmdline has unexpected eof by assuming the other process died.

comment:1 Changed 10 years ago by hedora

The patch doesn't work after all. The locking code is OK, but the lock is set before the sigusr1 signal handler is installed.

Other processes see the lock, signal us with sigusr1 to tell us to raise our window. Then we crash. The fix is to install a signal handler before setting the lock.

comment:2 Changed 10 years ago by Abraxa

  • Owner changed from openmoko-devel to Abraxa
  • Status changed from new to accepted

Thanks for the patch, I'll include it in the SHR fork as 2007.2 itself is unmaintained.

Out of curiosity I would be grateful however if you'd let me know in some way (mail, IRC, trac, whatever) how the old locking code had "other bugs" aside from the racing condition you mentioned. I wrote it without knowing any better so I'd be happy to learn why my code was suboptimal so I can avoid these kinds of mistakes in the future.

comment:3 Changed 10 years ago by hedora

Here's a patch that (might) actually work.

Here are two bugs that I found:

  • sigusr1 sent before signal handler installed
  • can spuriously send sigusr1 to unsuspecting programs that reused a stale pid, inadvertently killing them.

You could probably backport both those fixes to your locking code and produce a much smaller patch. (I'm considering doing this myself...)

I didn't look very carefully at the code; I'm not too familiar with flock(), and there was a comment suggesting it needed to be revamped. So, I blamed it, reimplemented it, and then found the real bug.

Changed 10 years ago by hedora

Fixed sigusr1 handling.

comment:4 Changed 10 years ago by zecke

  • Keywords HasPatch added

comment:5 Changed 10 years ago by roh

  • HasPatchForReview set

BatchModify?: set HasPatchForReview? on 'keyword' contains 'patch'

Note: See TracTickets for help on using tickets.