Apology

This past Friday evening I let some frustrations from the week boil over into name-calling on a friends blog post. I apologize to my friend, and I apologize to the Twisted community. Both deserve better.

Comments

Please see my blog to comment on this. In addition, there is at least one reply from a Twisted developer.

Purpose

In my final comment on that post I expressed my intent to respond and to further explain the feelings about twisted that caused me to react in the way I did. I also promised to do my best to ensure that my future posts would be a constructive experience. It is my intent that the series of posts will result in one of two things, or preferably both.

  1. I hope that I will learn something from these posts. Whether it be something about Twisted. Or whether it only turns out to be a lesson on communication.
  2. If any of my complaints are valid, hopefully they will be communicated in a such a manner that the Twisted community can address them and subsequently improve twisted.

Motivation

For me, one of the most important criteria I use to evaluate any language, framework or library is how well it stands up in the face of simple problems. All too often, a framework promises help the developer with all of their complicated problems, yet fails to avoid complicating the solutions to simple problems. Even more frustrating is when simple problems require significant effort to 'fit' into the paradigm required by the framework. When I use the term "over-designed", I typically use it to refer to pieces of code that have been factored into such a large set of classes, with the intent of making anything possible, without regard to making the simple things simple.

The Problem

The problem I faced was a simple one. I inherited some code that was responsible for the typical user registration functions on a website. A user could register and an email would be sent to their account. I was tasked with writing a full set of unit and integration tests for this code. An additional requirement (not of my own choosing) was that the interface for this code could not change.

In order to write integration tests that ensure the email was properly sent I had to simulate an smtp server, and cause the email to be sent to this server. The interface provided by smtpd module in python's stdlib is very straightforward and easy to use. It took me a few minutes of reading, and a few minutes of implementation, but I quickly got an smtpd server that was controllable in unit tests, and could be used to verify that emails were sent.

The threaded solution

I only had to override two method - the process_message method allows me to receive and store message for further inspections by the tests, and the __init__ method in order to initialize the state of the received messages. Rather than require the unit tests themselves to call the asyncore loop, I decided that adding the ability to start and stop the server would be appropriate. It turned out to work quite well, and be very simple to implement. Two more methods were added to the TestSMTPServer object - the start method which starts the background thread and the close method which closes the server the thread to stop, and to wait for the background thread to finish.

The code follows, and is, in my opinion, straightforward, simple and easy to read. For a complete set of code, see the threaded version.

smtp_helper.py


"""

A helper class which allows test code to intercept and store sent email.
"""
import smtpd
import asyncore
from business import DEFAULT_HOST, DEFAULT_PORT


#-------------------------------------------------------------------------------
class TestSMTPServer(smtpd.SMTPServer):
    """
    An SMTP Server used to allow integration tests which ensure email is sent.
    """

    def __init__(self, localaddr=(DEFAULT_HOST, DEFAULT_PORT)):
        self.rcvd = []
        smtpd.SMTPServer.__init__(self, localaddr, None)

    def start(self):
        import threading
        self.poller = threading.Thread(target=asyncore.loop,
                kwargs={'timeout':0.1, 'use_poll':True})
        self.poller.start()

    def process_message(self, peer, mailfrom, rcpttos, data):
        self.rcvd.append((mailfrom, rcpttos, data))

    def close(self):
        smtpd.SMTPServer.close(self)
        self.poller.join()



A twisted version.

Soon after I completed the above version our company switched to Twisted.

I am not a twisted expert. I had only the twisted book, the online twisted documentation and the twisted source from which to learn to use twisted. The following code should not be mis-construed to be the only, or the best way to solve this problem. Rather, it is what I arrived at after a weekend of working on it.

The initial version of the server was taken from the twisted book on page 124. The code is available from o'reilly. It required three classes, and 9 methods for its implementation.

But I wasn't done. This example is not enough to allow me to test that messages are sent, nor to properly shutdown the server. Due to the nature of asynchronous programming, I had to explicitly add in a set of callbacks to inform the tests when the messages were received. After this I realized that the example did not have the appropriate hooks to shutdown its connections, or to wait for them to be shutdown. Both are necessary for this code to run within the context of a unit test. After asking in the #twisted channel, I was directed to a blog post which talked about waiting for connections to be closed within trial. As it isn't important to making my point, I won't go into the details of the approach outlined in that blog post, nor will I bother explaining how it all works.

The end result was 4 classes, and 14 methods - all of which seem to be needed in order for me to pull this off. The number of lines of code doesn't bother me as much as the number of concepts, and the number of interactions between the pieces of code that are required for me to understand this code. It's non-obvious (unless you intimately understand the way that twisted works) how these pieces should be put together. Unfortunately, it is more complicated, harder to read (or follow) and harder to maintain than the previous version. For a complete set of code, see the twisted version.



smtp_helper_tw.py


"""
An smtp server that integration with twisted.trial to ensure emails are sent.
"""
from twisted.internet import defer, protocol
from twisted.mail import smtp
from zope.interface import implements

from email.Header import Header

#-------------------------------------------------------------------------------
# This was taken (mostly) from page 124 of the Twisted book.
# I spent some 20 minutes looking through the mail.smtp source code trying to 
# determine if all of these methods were needed or if I could cut some to 
# result in slimmer code.  I failed to be able to sufficiently read and 
# understand all of the interactions between the various classes in that time
# period, and gave up in order to go help my wife cook brownies.  
#
# In order for our tests to be able to know that the message has finished
# delivery so that we can ensure that the message was as intended.
class CallbackMessageReceiver(object):
        implements(smtp.IMessage)

        def __init__(self, callback):
            # We separate the current lines we're using
            self.lines = []

            # from the full messages that we've received.
            self.callback = callback

        def lineReceived(self, line):
            self.lines.append(line)

        def eomReceived(self):
            # message is complete, tell everyone about it
            self.callback("\n".join(self.lines) + "\n")
            self.lines = []
            return defer.succeed(True)


#-------------------------------------------------------------------------------
class CallbackMessageDelivery(object):
    implements(smtp.IMessageDelivery)

    def __init__(self, callback):
        self.callback = callback

    def receivedHeader(self, helo, origin, recipients):
        myHostname, clientIP = helo
        headerValue = "by %s from %s with ESMTP ; %s" % (
                myHostname, clientIP, smtp.rfc822date())
        return "Received: %s" % Header(headerValue)

    def validateTo(self, user):
        # Use the CallbackMessageReceiver to append messages to our
        # message list.
        return lambda: CallbackMessageReceiver(self.callback)

    def validateFrom(self, helo, originAddress):
        return originAddress


#-------------------------------------------------------------------------------
class SMTPProtocol(smtp.SMTP):
    """
    From http://blackjml.livejournal.com/23029.html
    we need to be able to inform people of lost connections, this
    follows pattern presented in that blog post, which I was told to follow
    by the author when I asked for help in this in the #twisted channel
    on irc.freenode.net.

    In other words, if you want your tests to properly succeed, you
    MUST set the onConnectionLost to a new deferred when you setUp your tests
    and then return that deferred from the tearDown method so that the
    tests waits for the SMTPServer to disconnect.
    """
    def __init__(self, *args, **kwargs):
        smtp.SMTP.__init__(self, *args, **kwargs)
        self.onConnectionLost = defer.Deferred()

    def connectionLost(self, *args, **kwargs):
        smtp.SMTP.connectionLost(self, *args, **kwargs)
        self.onConnectionLost.callback(self)



#-------------------------------------------------------------------------------
class DeferredSMTPFactory(protocol.ServerFactory):
    """
    A twisted-style SMTP server factory.
    """
    def __init__(self):
        self.listOfDeferreds = []
        self.protocolDeferreds = {}

    def informDeferreds(self, result):
        deferred = self.listOfDeferreds.pop(0)
        deferred.callback(result)

    def messageReceived(self):
        d = defer.Deferred()
        self.listOfDeferreds.append(d)
        return d

    def protocolConnectionLost(self, protocol):
        del self.protocolDeferreds[protocol]
        return protocol

    def buildProtocol(self, addr):
        delivery = CallbackMessageDelivery(self.informDeferreds)
        smtpProtocol = SMTPProtocol(delivery)
        smtpProtocol.factory = self
        d = self.protocolDeferreds[smtpProtocol] = smtpProtocol.onConnectionLost
        d.addCallback(self.protocolConnectionLost)
        return smtpProtocol



Conclusion

In the previous sections, I speak as if I wrote all of the code. To be strictly fair, I didn't write the initial threaded version, a friend did, although it took me all of 2 minutes to read and follow it. On the other hand, I did write the twisted version. The twisted version took me all of a Friday afternoon, most of the free time on a weekend and most of the morning one Monday to get it working. This was even after I stopped by the #twisted channel on irc.freenode.net to get some help. Which I didn't get. The told me not to do it. Their reasoning boiled down to:

You're not writing a unit test. You're testing the twisted smtp implementation, and we already have tests, so don't do that.

Fair enough. I trust their unit tests. So I asked them how they would test it and they replied:

Change the interface of the methods you're testing such that they accept an SMTP client object and then pass in a mock smtp client object which stores the messages instead of sending them.

Unfortunately, that wasn't an option. Adding parameters to any API is an important decision that must be carefully considered; the complexity of an interface scales very quickly for each parameter added (maybe fumanchu will finally write the blog post on this subject that he's been threatening to). Even after writing this code myself, and understanding it fully at the time, it took me nearly 2 hours to extract it into a standalone set of tests. I still don't understand it fully again. The threaded version? It took me 20 minutes to extract and get running, and I fully understand it again in a glance.

I found that throughout my working on the twisted version of the code, and even now when I return to it twisted consistently forces me to understand and keep in my head, 2-4 times the amount of information, concepts and names of objects and how they are intended to interact with each other. In the context of a complicated application this is to be expected. On the other hand, when a simple problem presents itself, I expect to be able to build and use a simple solution. All of my experiences with twisted suffer from this level of unnecessary complication, and density. The resulting solutions hard to understand or even explain.

Now, I also admit that twisted's primary purpose may not have been to integrate an SMTP server into its testing framework. However, I think my point stands - twisted fails in it's promise to make my task, as a developer, easier. And more importantly twisted deviates from the zen of python:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!