ILoggable

A place to keep my thoughts on programming

January 26, 2011 .net, mono

Fixing leaking semaphores with mod_mono

After porting my mod_mono ASP.NET MVC application to Ruby on Rails and setting up Phusion Passenger to run the application under mono, I finally figured out how to fix the leaking semaphore issue. The real title of this post should probably be "PEBKAC or Don't assume errors are unrelated, you idiot".

Recap of the problem

The problem manifests itself as a buildup of semaphore arrays owned by apache, which is visible via ipcs. When the site is first started, the output looks like this:

[root@host ~]# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x01014009 1671168    root       600        52828      48
0x0101400a 1703937    root       600        52828      25
0x0101400c 1736706    root       600        52828      35

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 10616832   apache     600        1
0x00000000 10649601   apache     600        1
0x00000000 10682370   apache     600        1
0x00000000 10715139   apache     600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

Eventually it'll look like this:

[root@host ~]# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x01014009 1671168    root       600        52828      48
0x0101400a 1703937    root       600        52828      25
0x0101400c 1736706    root       600        52828      35

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 10616832   apache     600        1
0x00000000 10649601   apache     600        1
0x00000000 10682370   apache     600        1
...
lots more
...
0x00000000 11141158   apache     600        1
0x00000000 11173927   apache     600        1
0x00000000 11206696   apache     600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

At some point every ASP.NET page comes back blank. There are no errors and .NET logging reports normal behavior, but no content is sent. You can restart the mono processes and apache all you want, and it won't come back. Sorry.

What's really wrong

Since day one I'd been receiving warnings at apache startup, but since I didn't understand what they meant and things seemed to be working, I had been ignoring them. Of course, that was a lie on its face. Things were clearly not working, given the leaking semaphores, but I conveniently filed the two issues as unrelated in my head and ignored them at my peril. The warning was this:

[Mon Jan 24 00:12:50 2011] [crit] The unix daemon module not initialized yet.
Please make sure that your mod_mono module is loaded after the User/Group
directives have been parsed. Not initializing the dashboard.

This warning was repeated once for each ASP.NET vhost I had defined. I looked at my vhost configurations, saw nothing about users and groups, decided it was some weird mono issue and left it at that. But the actual problem was not in the vhost configuration at all. It was in httpd.conf, in this default section:

#
# Load config files from the config directory "/etc/httpd/conf.d".
#
Include conf.d/*.conf

#
# If you wish httpd to run as a different user or group, you must run
# httpd as root initially and it will switch.
#
# User/Group: The name (or #number) of the user/group to run httpd as.
#  . On SCO (ODT 3) use "User nouser" and "Group nogroup".
#  . On HPUX you may not be able to use shared memory as nobody, and the
#    suggested workaround is to create a user www and use that user.
#  NOTE that some kernels refuse to setgid(Group) or semctl(IPC_SET)
#  when the value of (unsigned)Group is above 60000;
#  don't use Group #-1 on these systems!
#
User apache
Group apache

Obviously User and Group are set after everything in conf.d, including the mod_mono and vhost configs, has been loaded by the Include. That is pretty much exactly what the warning was saying (doh). I simply moved the Include below User/Group, and since then I have not seen more than 9 semaphores, even though I've restarted mono, rebuilt the application and hit the app with ApacheBench, the combination of which used to drive semaphores up.
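If you want to check your own httpd.conf for the same ordering mistake, something like the following rough sketch might help. The function name is my own invention, and it only looks at the text of one file: it does not follow nested includes or understand anything else about Apache's config parsing, so treat a result as a hint rather than proof.

```python
def include_precedes_user_group(conf_text):
    """Return True if `Include conf.d/...` appears before a User or Group line.

    Hypothetical helper: a plain text scan of a single config file,
    not a real Apache config parser.
    """
    include_line = None
    identity_lines = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        directive = line.split()[0].lower()  # directive names are case-insensitive
        if directive == "include" and "conf.d" in line and include_line is None:
            include_line = number
        elif directive in ("user", "group"):
            identity_lines.append(number)
    if include_line is None or not identity_lines:
        return False  # nothing to compare
    # Problem case: conf.d is loaded before User/Group have both been parsed.
    return include_line < max(identity_lines)


broken = "Include conf.d/*.conf\nUser apache\nGroup apache\n"
fixed = "User apache\nGroup apache\nInclude conf.d/*.conf\n"
print(include_precedes_user_group(broken))  # True
print(include_precedes_user_group(fixed))   # False
```

To run it against a real file, pass in the result of `open("/etc/httpd/conf/httpd.conf").read()`, adjusting the path for your distribution.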

Since things are working now and ASP.NET MVC under mod_mono is significantly faster than the Rails port, I'm sticking with ASP.NET MVC in production for now, while monitoring semaphores to make sure this really did fix the problem.
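For the monitoring part, a minimal sketch along these lines could count semaphore arrays per owner from ipcs output. The function names are hypothetical, and the parsing assumes the five-column layout shown in the listings above (key, semid, owner, perms, nsems), which may differ on other systems.

```python
import subprocess
from collections import Counter


def count_semaphore_arrays(ipcs_output):
    """Count semaphore arrays per owner from text printed by ipcs."""
    counts = Counter()
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Semaphore rows have exactly five columns and start with a hex key.
        # Shared memory rows have more columns, so they are skipped.
        if len(fields) == 5 and fields[0].startswith("0x"):
            counts[fields[2]] += 1
    return counts


def live_semaphore_counts():
    """Run `ipcs -s` (semaphore arrays only) and count arrays per owner."""
    result = subprocess.run(["ipcs", "-s"], capture_output=True, text=True, check=True)
    return count_semaphore_arrays(result.stdout)


# Excerpt of the startup output shown earlier in this post.
SAMPLE = """------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 10616832   apache     600        1
0x00000000 10649601   apache     600        1
0x00000000 10682370   apache     600        1
0x00000000 10715139   apache     600        1
"""

counts = count_semaphore_arrays(SAMPLE)
print(f"apache owns {counts['apache']} semaphore arrays")  # apache owns 4 semaphore arrays
```

Calling `live_semaphore_counts()["apache"]` from a cron job and comparing the result against a limit of your choosing would be one way to notice the count creeping up again.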

 
