As I was putting the baby down for bed tonight it hit me that I should blog about today’s commit to trunk. As an introduction, when Hula was created from the NetMail codebase, the team did what they called scrubbing the code. Basically they removed any portions that were not releasable and cleaned up the rest. SMTP didn’t get a whole lot of attention and stayed very, very similar to the code that was used in the NIMS codebase years before. Parts of the smtp agent were quite old and pretty messy. Because of that we decided that it would be a good thing to rewrite it and split it into incoming and outgoing agents. Thus the birth of what we called smtpc, where the c stands for Client. Now starts the devel type stuff, so if you don’t want to know the innards, skip the rest.
The queue system has some pretty cool benefits (a couple of which I want to rewrite and clean up very soon). Basically it allows us to have agents anywhere that is reachable via IP. Email messages enter at queue 0 and progress through the queueing system one queue at a time. Agents can be registered at each queue number to do various tasks (antivirus, antispam, server-side rules, etc.).
One of the things that the old system did not do was determine whether an email address was remote or local. Some time ago I committed code to determine if addresses are local (the new aliasing system). Because the old system did not do this, the smtp agent listened on queue 6 and queue 7. The purpose of queue 6 was solely to determine if an email address was remote and rewrite the email envelope to reflect the new address location. This was a big waste of resources and made the smtp code much more complex than it needed to be. The new agent, smtpc, only listens on queue 7. Messages that hit queue 7 will be delivered remotely, since the queue agent now rewrites addresses at mail dropoff to mark them as remote or local.
At this point I’d like to explain a little bit about the code. Some of this information should end up in the project wiki at some point, but I’m pretty stupid when it comes to wikis and don’t have the time tonight to learn it. Still, I think this might be handy for those fixing the bugs that surely exist in the current smtpc codebase.
I’ll start in XplServiceMain() at the bottom of smtpc.c. The main portion of this function is the call to XplStartMainThread(). Basically the rest of it is configuration code.
SMTPAgentServer(). This is the function that is run by the main thread. This function is also pretty simple. The main call here is BongoQueueAgentListenWithClientPool(). This is a pretty cool function that creates a thread pool. It is a blocking call. Basically the innards of this function listen on a port, and any time a new connection is made it spawns a thread and calls the function passed as parameter 6 (ProcessEntry in this case).
ProcessEntry(). This is probably the most important function in the whole file. The first thing we do is handshake with the queue. The queue authentication scheme is kinda complex and this function handles it for us. Prior to this function it was a whole lot of duplicated code. When the agent gets connected it reads out a string that consists of:
6020 qID envelopeLength messageLength numRecips
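To make that greeting line a little more concrete, here is a rough sketch of how it could be pulled apart. This is not the code in smtpc.c. The QueueGreeting struct and ParseQueueGreeting() are names I made up for illustration, and I am assuming the fields are separated by single spaces, that the three counts are decimal, and that 6020 is a fixed leading code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the fields in the queue's greeting line. */
typedef struct {
    char qID[64];                 /* queue ID, kept as text */
    unsigned long envelopeLength; /* bytes of envelope to read */
    unsigned long messageLength;  /* bytes of message body */
    unsigned long numRecips;      /* recipient count */
} QueueGreeting;

/* Returns 0 on success, -1 if the line does not have the expected shape.
 * Assumes a space-separated line that starts with the code 6020. */
static int ParseQueueGreeting(const char *line, QueueGreeting *out)
{
    int code = 0;

    assert(line != NULL && out != NULL);

    /* %63s leaves room for the terminating '\0' in qID. */
    if (sscanf(line, "%d %63s %lu %lu %lu", &code, out->qID,
               &out->envelopeLength, &out->messageLength,
               &out->numRecips) != 5) {
        return -1;
    }

    return (code == 6020) ? 0 : -1;
}
```

The envelope length is the interesting one for what comes next, since it tells the agent how much to read before the envelope is complete.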
The next thing that we do is call BongoQueueAgentReadEnvelope(), which reads the envelope into one long character string. In this string every \r\n (or \n) is replaced by \0. We use the BONGO_ENVELOPE_NEXT() macro to move down the string past those \0 characters.
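If the idea of one long string full of \0 characters sounds odd, a small sketch of walking such a buffer may help. SKETCH_ENVELOPE_NEXT() below is a stand-in I wrote for illustration, not the real BONGO_ENVELOPE_NEXT() definition; all I am assuming is that stepping means "skip strlen plus one byte". CountEnvelopeLines() is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for illustration only; not the real BONGO_ENVELOPE_NEXT(). */
#define SKETCH_ENVELOPE_NEXT(p) ((p) += strlen(p) + 1)

/* Counts the lines in a buffer where every line ends in '\0'.
 * 'length' is the number of bytes in the buffer, terminators included. */
static size_t CountEnvelopeLines(const char *envelope, size_t length)
{
    const char *cur = envelope;
    const char *end;
    size_t lines = 0;

    assert(envelope != NULL);
    end = envelope + length;

    /* Stop at the end of the buffer or at an empty line. */
    while (cur < end && *cur != '\0') {
        lines++;
        SKETCH_ENVELOPE_NEXT(cur);
    }

    return lines;
}
```

The nice part of this layout is that each line is already a valid C string, so nothing has to be copied just to look at it.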
Now we start into the main loop processing the envelope. Since the BONGO_ENVELOPE_NEXT() macro steps based off strlen, and since I’m going to want to null out characters inside the line, I save a pointer to the end of the current string so that the macro will work right. This is a little wasteful since in effect I’m basically doing a strlen(""), but I wanted to use the standard macros to make this as straightforward as possible. You might be looking at the code thinking, why the heck is he writing the envelope back to the queue? Well, if you want to make any changes to the envelope at all you need to write the whole envelope back. So on all envelope lines except those that are recipients, I immediately write them back out to the queue. I hold the recipient lines back since mail delivery might not be successful for a recipient. If it is not, then I want to retry later when the queue agent determines that the smtp agent should handle the message again. I obviously do not want to write out the successful entries, else the message would be duplicated in that user’s mailbox.
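Here is a little sketch of why the end of the line has to be remembered before any characters get nulled. It is not the smtpc code, and it computes the next-line pointer directly rather than through the macro. SplitLineAndAdvance() is a made-up helper, and splitting at the first space is just an example of an in-place edit.

```c
#include <assert.h>
#include <string.h>

/* Splits the current line at its first space, in place, and returns a
 * pointer to the start of the following line. The position of the next
 * line is computed BEFORE any '\0' is written, because afterwards
 * strlen(line) would stop at the new terminator. */
static char *SplitLineAndAdvance(char *line, char **rest)
{
    char *next;
    char *space;

    assert(line != NULL && rest != NULL);

    next = line + strlen(line) + 1;  /* remember where the next line starts */

    space = strchr(line, ' ');
    if (space != NULL) {
        *space = '\0';               /* line is now shorter than it was */
        *rest = space + 1;           /* text after the space */
    } else {
        *rest = NULL;                /* nothing to split off */
    }

    return next;
}
```

Had the pointer been computed after the split, stepping by strlen would have landed in the middle of the original line instead of at the start of the next one.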
Within this loop, when I get to a recipient I allocate a structure to hold all the information for them, and then at the end of the loop I call the macro to step me through to the next envelope line. After this I start another loop over all recipients. This second loop is the one that needs a little help. I tried to come up with a good way to do this, but decided in the interest of time to just make it work and optimize it later. The loop compares the current entry with the next and skips it if they are the same. Thus a user only gets one delivery even if they were somehow duplicated in the envelope. I do this with a field in the recipient structure called the SortField. This field is the reformatted email address, like: domain.com@local_part. I did this so that sorting the array would group all recipients of a domain together. That way I could eventually connect once to a given MX and drop the message off for everyone. I currently connect to the MX for each recipient as if it were unique in the list (that’s where the optimization comes in). Perhaps we could key it off a configuration setting of some kind, recipients_per_connection or something. That is just an idea.
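The sort-then-skip trick is easier to see in code, so here is a minimal sketch of it. None of this is lifted from smtpc.c: SketchRecipient, BuildSortField(), and SortAndDedupe() are hypothetical names, the 256-byte field size is arbitrary, and the comparison is a plain strcmp that does not fold case. It also compares each entry with the last one kept rather than with the next one, which comes out the same on a sorted array.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical recipient record; the real structure holds more. */
typedef struct {
    const char *address;  /* original "local_part@domain" */
    char sortField[256];  /* rewritten "domain@local_part" */
} SketchRecipient;

/* Rewrites "local@domain" as "domain@local". Returns 0 on success,
 * -1 if the address has no usable '@' or the result does not fit. */
static int BuildSortField(const char *address, char *out, size_t outSize)
{
    const char *at;
    int written;

    assert(address != NULL && out != NULL);

    at = strrchr(address, '@');
    if (at == NULL || at == address || at[1] == '\0') {
        return -1;
    }

    written = snprintf(out, outSize, "%s@%.*s",
                       at + 1, (int)(at - address), address);
    return (written < 0 || (size_t)written >= outSize) ? -1 : 0;
}

static int CompareRecipients(const void *a, const void *b)
{
    const SketchRecipient *ra = a;
    const SketchRecipient *rb = b;

    return strcmp(ra->sortField, rb->sortField);
}

/* Sorts by sortField so each domain is contiguous, then compacts the
 * array in place so duplicates are dropped. Returns the new count. */
static size_t SortAndDedupe(SketchRecipient *list, size_t count)
{
    size_t i;
    size_t unique = 0;

    if (count == 0) {
        return 0;
    }

    qsort(list, count, sizeof(list[0]), CompareRecipients);

    for (i = 0; i < count; i++) {
        if (unique > 0 &&
            strcmp(list[unique - 1].sortField, list[i].sortField) == 0) {
            continue;  /* same recipient as the last one kept */
        }
        list[unique++] = list[i];
    }

    return unique;
}
```

Because the domain comes first in the sort key, a later optimization could walk runs of entries that share a domain and reuse one MX connection for the whole run.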
There is a call in this loop to LookupRemoteMX(). This is some of Alex’s code from his amazing fixes to the dns lookup code. This function connects to the MX and returns the connection to it. I’ve got a fixme in there for the case where the lookup fails, because I’m not really sure if skipping the message is right. The options are to bounce it, or to skip it, which would try again later. Then we call DeliverMessage() to actually do the smtp convo with the remote site. Based off the result I either send a DSN of success to the sender, try again later, or send a DSN of failure.
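As a sketch of that three-way decision: I am not showing what DeliverMessage() actually returns here. The enum and ActionForReply() are made up, and the mapping assumes the decision is driven by the class of the final SMTP reply code (2xx success, 5xx permanent failure, anything else treated as temporary).

```c
/* Hypothetical outcomes; the real agent's result codes are not shown here. */
typedef enum {
    SKETCH_SEND_SUCCESS_DSN,  /* delivered: tell the sender it worked */
    SKETCH_RETRY_LATER,       /* temporary problem: keep the recipient queued */
    SKETCH_SEND_FAILURE_DSN   /* permanent problem: bounce to the sender */
} SketchAction;

/* Maps the final SMTP reply code of a delivery attempt to an action.
 * Assumption: 2xx means delivered, 5xx means permanent failure, and
 * everything else (4xx, garbage, no reply) is worth another try. */
static SketchAction ActionForReply(int smtpReplyCode)
{
    if (smtpReplyCode >= 200 && smtpReplyCode < 300) {
        return SKETCH_SEND_SUCCESS_DSN;
    }
    if (smtpReplyCode >= 500 && smtpReplyCode < 600) {
        return SKETCH_SEND_FAILURE_DSN;
    }
    return SKETCH_RETRY_LATER;
}
```

Defaulting to retry for anything unexpected is the conservative choice, since a wrong bounce loses mail while a wrong retry only delays it.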
Now for the second most important function: DeliverMessage(). From here on out we’re mostly SMTP, with only one call to the queue to fetch the mail body. I lamented the label that I used, but without it the code handling the tls stuff would be pretty ugly (and it works for me atm). Really this is all straight smtp, so while it is pretty important I’m not going to go over it here. This function also disconnects from the remote MX (which is also a part of that loop optimization above).
I hope that I didn’t bore the tar out of you with that discussion, but the queue is a confusing beast at times, and I’d like to get information out there so that we can get others developing agents.