[osiris-devel] Re: possible fix for osirismd defunct procs

lemmings osiris-devel at lemmin.gs
Mon Jan 1 08:34:28 EST 2007


On Thu, Dec 21, 2006 at 11:34:05AM -0500, David Vasil wrote:
> The original osirismd code had a signal handler for sigchld which would
> only check if a signal occurred before and after wait_for_data() in
> osirismd.c:process().  I added a signal handler specifically for the
> osirismd threads that handle scans to md_scan.c.  All it does is when it
> gets a SIGCHLD it issues a wait().

Your patch also does a log_info().

It is generally a bad idea to do anything other than set a flag (as
the current code does) or calling a known safe function (as documented
in individual library calls. A partial list is in signal(2)) in a
signal handler.

The reason is that many functions are not reentrant (For example,
log_info() calls syslog(3) which will probably call malloc. If an
existing malloc is in progress then you may get corruption of the
heap). The end result is infrequent, hard to diagnose crashes.

> Are there any brave souls who experience the zombie problem that could
> test this patch out?  If it fixes the problem I'll get Bruce to make a
> 4.2.3 release shortly after New Years.

A quick read of the source code suggests that the problem is likely to
be the code that was commented out in check_for_signals(). A patch
like the following (untested) would probably work better.

e

===================================================================
--- osirismd.c  (revision 69)
+++ osirismd.c  (working copy)
@@ -259,7 +259,6 @@
     {
         client_length = sizeof( client );

-        check_for_signals();
         wait_for_data();

         /* we have data waiting, see if it is to the server.  if it
 * */
@@ -1453,6 +1452,8 @@

 wait_start:

+    check_for_signals();
+
     FD_ZERO( &read_set );

     /* add the control socket to the set. */
@@ -1484,7 +1485,6 @@

     if( ( sresult < 0 ) && ( errno == EINTR ) )
     {
-        check_for_signals();
         goto wait_start;
     }
 }
@@ -1792,37 +1792,34 @@

         child_pid = wait(&status);

-/*
-        while( ( pid = waitpid( -1, &status, WNOHANG ) ) > 0 ||
-               ( ( pid < 0 ) && ( errno == EINTR ) ) )
-        ;
-*/
+        while ( ( child_pid = waitpid( -1, &status, WNOHANG ) ) > 0 )
+       {
+               if ( child_pid == scheduler_pid )
+               {
+                   log_error( LOG_ID_DAEMON_INFO, NULL,
+                              "SIGCHLD, scheduler has terminated!" );

-        if( child_pid == scheduler_pid )
-        {
-            log_error( LOG_ID_DAEMON_INFO, NULL,
-                       "SIGCHLD, scheduler has terminated!" );
+                   halt(EXIT_CODE_ERROR);
+               }

-            halt(EXIT_CODE_ERROR);
-        }
+               else if ( child_pid == log_pid )
+               {
+                   log_error( LOG_ID_DAEMON_INFO, NULL,
+                              "SIGCHLD, log application was
terminated!" );
+               }

-        else if ( child_pid == log_pid )
-        {
-            log_error( LOG_ID_DAEMON_INFO, NULL,
-                       "SIGCHLD, log application was terminated!" );
-        }
+               else if ( getpid() == scheduler_pid )
+               {
+                   log_info( LOG_ID_SCHEDULER_INFO, NULL,
+                             "SIGCHLD, scheduler child process has
terminated." );
+               }

-        else if ( getpid() == scheduler_pid )
-        {
-            log_info( LOG_ID_SCHEDULER_INFO, NULL,
-                      "SIGCHLD, scheduler child process has
                       terminated." );
-        }
-
-        else
-        {
-            log_info( LOG_ID_DAEMON_INFO, NULL,
-                      "SIGCHLD, client process has terminated." );
-        }
+               else
+               {
+                   log_info( LOG_ID_DAEMON_INFO, NULL,
+                             "SIGCHLD, client process has
terminated." );
+               }
+       }
     }

     if( received_sigpipe )




More information about the osiris-devel mailing list