BUG: soft lockup - CPU#1 stuck for 10s! [php5:3111]

LordOfLA

Anyone seen this before and know of a fix?

Centos 5.2 Kernel 2.6.18-92.el5PAE

Code:
BUG: soft lockup - CPU#1 stuck for 10s! [php5:3111]
Pid: 3111, comm:                 php5
EIP: 0060:[<f887f09f>] CPU: 1
EIP is at ext3_find_entry+0x16b/0x51e [ext3]
 EFLAGS: 00000293    Not tainted  (2.6.18-92.el5PAE #1)
EAX: 00000000 EBX: f649ec00 ECX: 00000000 EDX: f6a56600
ESI: f6be0b64 EDI: 00000045 EBP: f4284ee0 DS: 007b ES: 007b
CR0: 80050033 CR2: bfe93b78 CR3: 36507600 CR4: 000006f0
 [<f8880835>] ext3_lookup+0x26/0x10f [ext3]
 [<c047bc1e>] __lookup_hash+0xb1/0xe1
 [<c047d59d>] do_unlinkat+0x57/0x10e
 [<c040962b>] sys_ipc+0x133/0x149
 [<c046211e>] sys_brk+0xcb/0xd3
 [<c0404e95>] sysenter_past_esp+0x56/0x79
 =======================
 
Do you know what it's trying to do? It looks like it might be trying to unlink a file on an ext3 file system, but I'm guessing you've worked that much out already.
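For anyone reading the trace for the first time: each frame has the form `[<address>] symbol+0xOFFSET/0xSIZE [module]`, where the offset is how far into the function the address lies and the size is the function's length. A rough sketch of pulling those fields out, using the frames pasted above (the `parse_frame` helper and the regex are my own illustration, not part of any kernel tool):

```python
import re

# Pattern for one frame of a 2.6-era stack trace as pasted above.
# The trailing "[module]" part is optional (core kernel symbols have none).
TRACE_LINE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Return a dict for one trace line, or None if the line is not a frame."""
    m = TRACE_LINE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None for non-module symbols
    }


trace = """
 [<f8880835>] ext3_lookup+0x26/0x10f [ext3]
 [<c047bc1e>] __lookup_hash+0xb1/0xe1
 [<c047d59d>] do_unlinkat+0x57/0x10e
"""

frames = [f for f in map(parse_frame, trace.splitlines()) if f]
for f in frames:
    print(f["symbol"], hex(f["offset"]), "of", hex(f["size"]), f["module"])
```

Read top to bottom, the call chain is do_unlinkat, then __lookup_hash, then ext3_lookup, which is why this looks like an unlink on ext3.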
 
Aye, that part was obvious :)

What isn't obvious is why PHP deleting a file ends up locking a CPU core at 100% and leaving the rest of the server almost useless...
 
The only files I can think of that PHP might link/unlink regularly are session files, but an individual script might add more to that. Of course, if you've got the ZendOptimizer, then I believe that caches the optimised scripts to the file system. At least, I'd expect it to.

How are you running PHP? mod_php, or as a FastCGI process?
 
CGI. FastCGI doesn't like output buffering with Apache 2.2.

It's daft that the other 3 cores are largely useless though.
 
Ehm, the CPU locking means that the scheduler is unable to interrupt whatever is running on it, and that is bad.

Taking a look at your EFLAGS:

0x00000293 = 0b1010010011

Now when we look at:

http://en.wikipedia.org/wiki/FLAGS_register_(computing)

We notice that bit number 9 is "interrupt enable", and it is set here. So the scheduler should be able to interrupt the process, unless it is locked by the ext3 file system.
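If you want to check the decoding yourself, here is a quick sketch that maps the EFLAGS value from the trace to flag names. The bit positions are the standard x86 ones from the Wikipedia page linked above; the `decode_eflags` helper is just something I made up for illustration.

```python
# Standard x86 EFLAGS status/control bits (bit position -> short name).
# Only the commonly documented low bits are listed; bit 1 is reserved
# and always reads as 1, so it is deliberately left out.
EFLAGS_BITS = {
    0: "CF",   # carry
    2: "PF",   # parity
    4: "AF",   # adjust
    6: "ZF",   # zero
    7: "SF",   # sign
    8: "TF",   # trap
    9: "IF",   # interrupt enable
    10: "DF",  # direction
    11: "OF",  # overflow
}


def decode_eflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items())
            if value & (1 << bit)]


eflags = 0x00000293           # value from the soft lockup report
print(bin(eflags))            # 0b1010010011
print(decode_eflags(eflags))  # ['CF', 'AF', 'SF', 'IF']
```

IF shows up in the result, which matches the reading that interrupts were enabled on that CPU when the report was taken.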

From what I have been able to get from some kernel hackers I hang out with on IRC, it could be a bug in the module/kernel. Have you checked that you have the latest upgrades for your BIOS? Apparently there have been a few microcode updates for certain Intel/AMD processors, and outdated microcode could also cause the issue you have described.

As for the other cores not working quite as well, that makes perfect sense. If one CPU is locked, certain caches will be locked as well, which makes cache flushes rather hard, along with various other tasks.
 
