Haxin Mainframes

A blog about stuff I do, find interesting, or want to blab about..

Implementing Exception Logging in Python

When tasked with logging all the exceptions that our software might encounter in the wild I tried a number of different techniques to log our python exception tracebacks. I ended up with an application that would take logs of a specific level and upload them to a server (in our case Loggly) where we could analyze them.

The first hurdle was determing how I could install my own method to process and log ALL tracebacks happening in the application.

At first I thought that the python system exception hook, sys.excepthook would be the perfect place to insert the logging code. I was trying something similar to:

import traceback
import StringIO
import logging
import os, sys

def my_excepthook(excType, excValue, traceback, logger=logger):
    logger.error("Logging an uncaught exception", exc_info=(excType, excValue, traceback))

sys.excepthook = my_excepthook

This worked for the main thread but I soon found that the my sys.excepthook would not exist across any new threads my process started. This is a huge issue because most everything happens in threads in this project.

After googling and reading plenty of documentation the most helpful information I found was from the Python Issue tracker.

The first post on the thread shows a working example of the sys.excepthook NOT persisting across threads (as shown below). Apparently this is expected behavior.

import sys, threading

def log_exception(*args):
    print 'got exception %s' % (args,)
sys.excepthook = log_exception

def foo():
    a = 1 / 0
threading.Thread(target=foo).start()

The messages on this Python Issue thread really result in 2 suggested hacks. Either subclass Thread and wrap the run method in our own try except block in order to catch and log exceptions or monkey patch threading.Thread.run to run in your own try except block and log the exceptions.

The first method of subclassing Thread seems to me to be less elegant in your code as you would have to import and use your custom Thread class EVERYWHERE you wanted to have a logging thread. This ended up being a hassle because I had to search our entire code base and replace all normal Threads with this custom Thread. However, it was clear as to what this Thread was doing and would be easier for someone to diagnose and debug if something went wrong with the custom logging code. A custom logging thread might look like this:

class TracebackLoggingThread(threading.Thread):
    def run(self):
        try:
            super(TracebackLoggingThread, self).run()
        except (KeyboardInterrupt, SystemExit):
            raise
        except Exception, e:
            logger = logging.getLogger('')
            logger.exception("Logging an uncaught exception")

The second method of monkey patching threading.Thread.run is nice because I could just run it once right after __main__ and instrument my logging code in all exceptions. Monkey patching can be annoying to debug though as it changes the expected functionality without being obvious that this was done. The suggested patch from the Python Issue tracker was:

def installThreadExcepthook():
    """
    Workaround for sys.excepthook thread bug
    From
http://spyced.blogspot.com/2007/06/workaround-for-sysexcepthook-bug.html

    Call once from __main__ before creating any threads.
    If using psyco, call psyco.cannotcompile(threading.Thread.run)
    since this replaces a new-style class method.
    """
    init_old = threading.Thread.__init__
    def init(self, *args, **kwargs):
        init_old(self, *args, **kwargs)
        run_old = self.run
        def run_with_except_hook(*args, **kw):
            try:
                run_old(*args, **kw)
            except (KeyboardInterrupt, SystemExit):
                raise
            except:
                sys.excepthook(*sys.exc_info())
        self.run = run_with_except_hook
    threading.Thread.__init__ = init

The main goal of creating somethign that could log any of our tracebacks was to send these tracebacks up a server where we could search through issues people were having in the wild and find bugs in our software. It was not until I started testing my exception logging I realized that I was going about it all wrong.

To test I had placed a

raise Exception("Test")
and/or a
1/0
somewhere in my code. However, wrapping a a method that called this method was a try except block that printed out the traceback and swallowed the exception. This was very frustrating because I saw the traceback bring printed using logging.error but not being processed by my custom exception handler. My exception handlers purpose was to take these tracebacks and ultimately upload them to a server.

The final implementation consisted of a few different classes that looked something like this:

class CustomLoggingfThread(Thread):
    def __init__(self, daemon_thread=False, group=None, target=None, name="LiveThread",
                 args=(), kwargs={}, verbose=None, clean_up=None):
        self.args = args
        self.kwargs = kwargs

        # This will make sure that the thread is wrapped in the try except block
        #regardless of whather or not you extend the class are use the target invocation
        #of Thread
        if type(self) != LiveThread:
            assert hasattr(self,"run")
            self.target = self.run
        else:
            self.target = target
        Thread.__init__(self, group=group, target=target, name=name,
                 args=args, kwargs=kwargs, verbose=verbose)
        self.run = self.__run
        self.daemon = daemon_thread


    def __run(self):
        try:
            self.target(*self.args, **self.kwargs)
        except Exception:
            logging.error("The thread %s raised an exception." % self.name, exc_info=True)
            #We consider exiting our application here if its an important thread that crashed and can't
            #be restarted.

As you can see, our custom logging thread became VERY simple. It just wraps the threaded function in a try except block and logs any tracebacks to logging.error

This class is meant to be used in combination with our custom logging filter and handlers. The python Logging module provides some great tools to help you control what you do with your logging. My implementation looked something like this:



class CustomLoggingFilter(logging.Filter):
    def filter(self, record):
        return record.levelno >= logging.WARNING

class CustomLogglyHttpHandler(hoover.LogglyHttpHandler):

...
    Removed some code to be brief..
...

    def emit(self, record):
        if isinstance(record.msg, (list, dict)):
            record.msg = dumps(record.msg, cls=self.json_class, default=str)
        msg = self.format(record)
        self.synchronous_loggly_post(msg)



loggly_handler = CustomLogglyHttpHandler(.....some args....)
loggly_handler.addFilter(CustomLoggingFilter())

#Get the root logger so all log msgs run through our handler/filter
loggly_logger = logging.getLogger('')
loggly_logger.addHandler(loggly_handler)

The above code is really what puts this all together. It adds a logging handler to post only the log messages of a certian log level (via a filter) to the server. In the actuall implementation we do a few things such as buffering of the log messages as well. These were left out to keep the post short(er).