it’s been about a month since my last blog post. guess it’s time for another.
a lot has happened since the last post. after i’d checked in the aliasing code, i didn’t realize that i’d left a huge hanging section that hadn’t been completed, and that pretty much stopped the remove branch from working. so i dug deep, found new reserves, and have been a busy guy since then trying to get stuff going for the upcoming m3. i figure i’ll just walk down the commit log up to where i’m at now to explain what is going on in my portion of the m3 bongoverse.
added a DOMAIN LOCATION command to the queue agent. this allows protocol consumers to pass in an email address and get back whether the domain is local or remote. eventually we’ll add back in the relay domain stuff (smarthosts).
one of the problems with mail not going out was that the default configuration was incorrectly set to smarthost outgoing mail through another server which had never been configured.
while playing with gass’ odbc work i found that return codes could be integers where we were comparing them as booleans. depending on how the comparison was done, that could produce runtime errors because of the conversions. i posted an example application to the -devel list which should compile anywhere, with or without a bongo repo. this caused a couple of issues in the code that determines whether users exist and whether their passwords are correct. most of the agents do something with those functions, so for example smtp auth and imap login were both suspect.
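to make the integer-vs-boolean problem concrete, here is a minimal sketch. it is not the example i posted to -devel and not bongo code: lookup_user() and the user_exists*() names are made up for illustration, and the assumption is a lookup that returns a row count, or -1 on error.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* hypothetical stand-in for a backend lookup:
 * returns the number of matching rows, or -1 on error */
static int lookup_user(const char *name)
{
    if (name[0] == '\0') return -1;            /* pretend the backend errored */
    if (strcmp(name, "dup") == 0) return 2;    /* pretend two rows matched */
    if (strcmp(name, "alice") == 0) return 1;
    return 0;
}

/* buggy: only a return of exactly 1 counts, so 2 matching rows reads as "no user" */
bool user_exists_buggy_eq(const char *name)
{
    assert(name != NULL);
    return lookup_user(name) == true;
}

/* buggy the other way: any non-zero value counts, so the -1 error reads as "user exists" */
bool user_exists_buggy_truthy(const char *name)
{
    assert(name != NULL);
    return lookup_user(name) != 0;
}

/* compare against the integer contract instead of a boolean */
bool user_exists(const char *name)
{
    assert(name != NULL);
    return lookup_user(name) > 0;
}
```

the two buggy variants fail in opposite directions, which is why the result depends on how the comparison was written.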
as stated above, i got the aliasing code actually running in smtp and got it to accept mail. this led to bugs in the queue along the same lines.
by the time i had all this working i had a fully working bongo server, and i set about getting things set up to run it. i can’t run it on my normal server as i have too many users that *need* email to expose them to the alpha code. i consulted with my amazing isp and now have two new shiny ip addresses that i can bind. i ping’d my brother, who hosts some domains on my server, about stealing the MX for one of them. he probably won’t ever use the MX for swedepop.com, so i set up bongo on one of the ips and subscribed to all the bongo lists. this has provided an amazing opportunity to test and debug the system, as you’ll see below. (mail me if you’d like an account. it’s open to anyone who won’t spam.)
we found that bounces weren’t functioning properly. this is kinda important in the email world, so i moved on to figuring this out. it took a very long time as the queue code is a little messy. it feels like it might be a lot of original code, though i can’t be entirely sure. i didn’t dive into the queue code too much back in the day at Novell. i committed the fix in two steps. the first was mostly correct, but mainly it was there so that i could put the new code out on swedepop.com and test it accurately against some live servers. around 1am on the 15th the remaining problem was tracked down to some broken code and a configuration setting that wasn’t set properly.
the next step was importing. alex and i had been chatting and he asked if i’d tried importing. i know others had, and it didn’t work too well or died in the process. he was busy this week so i headed in that direction. i spent a ton of time trying to import my 14600-email hula trash folder and would quickly run out of memory on my 128m servers. i tracked this down to bad python code, but i couldn’t find a memory-efficient way to do it. python just doesn’t do a good job of freeing memory and returning it to the system. as i found this to be a lost cause, i decided to write my own importer that skips the filesystem mailbox stuff and just uses imap. i figured this would be good since the implementation of the underlying file structure wouldn’t matter any more. the server would hide that from me. i extended -storetool to allow for passing in imap information. i ran this on the trash folder and it worked.
the live server showed its true colors yet again when it started throwing odd errors that i tracked down to calling fclose() and then trying to fseek() on the same handle. got that fixed so mail delivery could continue.
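for anyone who hasn’t hit this one: once fclose() returns, the FILE pointer is dead and any further use of it is undefined behavior, which is why the errors looked so odd. a minimal sketch of one way to guard against it (not the actual fix i committed; size_and_close() is a made-up helper) is to do all the seeking first and then null the handle as part of closing it:

```c
#include <assert.h>
#include <stdio.h>

/* hypothetical helper: returns the size of the open file in bytes, or -1.
 * every stream operation happens before fclose(), and the caller's
 * pointer is set to NULL so a later use fails loudly instead of randomly. */
long size_and_close(FILE **fpp)
{
    long size = -1;

    assert(fpp != NULL);
    if (*fpp == NULL) {
        return -1;              /* already closed: nothing to seek on */
    }

    if (fseek(*fpp, 0L, SEEK_END) == 0) {
        size = ftell(*fpp);
    }

    fclose(*fpp);
    *fpp = NULL;                /* the old value must never be used again */
    return size;
}
```

calling it a second time on the same handle just returns -1 instead of touching a closed stream.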
in the meantime gass had responded to my email to the list about how to import lots of mailboxes. my response email crashed the server, guaranteed, every time it ran. this was a puzzler. it took a long time of crazy debugging in both queue and store, but i finally tracked it down to a bug in connio’s ConnWriteFile(). this function shouldn’t really have been called, but it was, because of another bug that i have yet to tackle.
in the queue, if the recipient’s store is local to the queue, we run a queue command that says “deliver the file on the filesystem at <insertpathhere> into this user’s mailbox”. that determination is done by calling a library function that should return the ip address bongo is bound to. that function was failing. because of that, queue assumed that the store was on a remote system, used connio to connect to it, and then passed the file’s contents over the connection (which calls ConnWriteFile()). this function wasn’t properly delivering the file’s full contents, which caused an abort since the two ends of the connection were no longer in protocol sync. (one thought it was done sending info and could run the next command, and the other was waiting for more information.) what a mess!
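i’m not going to paste the connio code here, but the general shape of this class of bug is easy to show. this is a generic sketch, not ConnWriteFile() itself: write_all() and send_file() are made-up names, and it assumes a plain posix file descriptor for the connection. the point is that write() may accept fewer bytes than you asked for, so the sender has to loop until everything is out, otherwise the receiver sits waiting for bytes that never arrive.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* keep calling write() until all len bytes are out, or a real error happens */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal, just retry */
            }
            return -1;
        }
        buf += n;               /* short write: advance and go around again */
        len -= (size_t)n;
    }
    return 0;
}

/* copy everything from 'in' to out_fd; returns bytes sent, or -1 on error */
long send_file(int out_fd, FILE *in)
{
    char buf[4096];
    long total = 0;
    size_t got;

    assert(in != NULL);
    while ((got = fread(buf, 1, sizeof buf, in)) > 0) {
        if (write_all(out_fd, buf, got) != 0) {
            return -1;
        }
        total += (long)got;
    }
    return ferror(in) ? -1 : total;
}
```

the byte count that comes back can be checked against whatever length was announced to the other end, so a mismatch is caught on the sending side instead of showing up as a protocol desync.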
this whole error condition also pointed out a missing unlink() call for a temporary file that is used on the store end when receiving the data from the file.
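the usual way to avoid leaking temp files like that is to funnel every exit path through one cleanup spot. here’s a minimal sketch of the pattern, not the store code: deliver_via_temp(), the callbacks and the /tmp/spool-XXXXXX template are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* hypothetical helper: spool data to a temp file, hand the path to deliver(),
 * and remove the temp file on every exit path, success or failure.
 * returns deliver()'s result, or -1 if spooling failed. */
int deliver_via_temp(const char *data, size_t len,
                     int (*deliver)(const char *path),
                     char *used_path, size_t used_len)
{
    char path[] = "/tmp/spool-XXXXXX";    /* made-up location */
    int rc = -1;
    int fd;

    assert(data != NULL && deliver != NULL);

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;                        /* nothing was created, nothing to remove */
    }
    if (used_path != NULL) {
        snprintf(used_path, used_len, "%s", path);   /* report the path for inspection */
    }

    if (write(fd, data, len) != (ssize_t)len) {
        goto cleanup;                     /* this sketch treats a short write as failure */
    }
    rc = deliver(path);

cleanup:
    close(fd);
    unlink(path);                         /* the call that was missing in the error path */
    return rc;
}

/* two toy callbacks to exercise both paths */
int deliver_ok(const char *path)   { return access(path, R_OK) == 0 ? 0 : -1; }
int deliver_fail(const char *path) { (void)path; return -1; }
```

with either callback, the temp file is gone by the time the function returns.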
“dogs and cats living together, mass hysteria!!!” <bill murray in ghostbusters>
anyhow, most of the situation is under control and now (once i fix the ip address problem) i can move on to other stuff. i’ve got aliasing work to do (fixing it to work properly and setting it up so that you can say somegroup@domain.com => ‘user1@domain.com’, ‘user2@domain.com’, …). along with that, alex and i have chatted about an idea i had a long time ago but hadn’t mentioned to anyone until rprice told me MA had done this, and then alex came up with it on his own too: splitting the smtpd daemon into two pieces, one for incoming mail and one for outgoing mail.
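the group alias idea boils down to a one-to-many lookup. a rough sketch of what i mean is below. none of this exists in bongo yet: the struct, the hardcoded table and expand_alias() are all hypothetical, and a real version would load its aliases from configuration rather than a static array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical alias entry: one address fanning out to several recipients */
struct alias {
    const char *address;
    const char *members[4];     /* NULL-terminated list */
};

/* made-up table for illustration only */
static const struct alias alias_table[] = {
    { "somegroup@domain.com", { "user1@domain.com", "user2@domain.com", NULL } },
};

/* returns the NULL-terminated member list for an alias, or NULL if the
 * address is not an alias. uses an exact string match to keep it short. */
const char *const *expand_alias(const char *address)
{
    size_t i;

    assert(address != NULL);
    for (i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++) {
        if (strcmp(alias_table[i].address, address) == 0) {
            return alias_table[i].members;
        }
    }
    return NULL;
}
```

a caller would walk the returned list and add each member as a recipient, falling back to normal delivery when the result is NULL.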
oh one other thing on aliasing is that once we get aliasing figured out fully and implemented somewhat in the queue, we can remove the requirement that smtp listen on both queue 6 and 7. it does this because mail dropped off “must be dropped off as a remote domain if location is unknown” <paraphrased from the nmap docs> so the agent on queue6 just makes sure that the “remote domain” isn’t really a local domain and creates a whole new queue entry for it! what an amazing waste!!!! the mail just went through everything (spam, virus, any other queue agents) and now we re-create it just so it can get delivered locally. queue really could do with some optimization and by doing it we’d increase overall performance a ton!
speaking of performance, pulling from my hula box over imap i was able to insert all 14627 mails in about 43 minutes, which works out to roughly 5.7 mails per second. not a bad clip considering all the processing overhead we do. i think we could improve that (along with fixing the odd memory leak in there that i mentioned in my -devel email).
more to come in a bit.