0.6/0.7 work

September 5th, 2009

with great fanfare we released 0.6 a week ago.  at least i heard the fanfare.  it sounded something like John Williams’ score to Star Wars.  i know i’m not the only person running this on servers.  i run it both on my test server (which, thanks to a suggestion from Lance Haig, is getting even more spam daily) and my home production server.  the system is fairly stable, however there still seems to be a bug in the dreaded queue.  no worries though as there is major work planned for 0.7 in the queue!

we had a great meeting friday where we discussed the things that we are planning for 0.7.  unfortunately for everyone (especially me) i had a dentist appointment and had to duck out early.  as a result of that (and not having others there) we didn’t get a chance to talk about the web bits which is another of the major tasks on the plate for 0.7.  as for my part of it there is a lot of work.  starting off i’ll be creating a new queue agent.  this agent will work much differently from the previous agent.  it’ll basically be a store agent that moves mail through a queue system inside the store.  this makes some things pretty nice actually.  for one we’ll finally be able to do a sensible version of “mailq” as all data is in the store.   i’ll also be merging the logic for the rules agent into the queue agent.  we need functionality inside the queue to know how to process mail based off document properties (think spam flagged or infected email) and figured we might as well move it all to the central location.

so the basic architecture will be as follows:  each agent will have a collection that they manage -inbound and -outbound (names may change but the idea is thus).  the queue agent will pick up mail dropped off in its -inbound queue and determine which agent needs it next.  it’ll move the mail into that agent’s -inbound queue at which point it’ll get processed by the agent and dropped in its -outbound.  mail will progress through the system until it is either delivered locally or remotely.
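a rough sketch of that flow in python, just to make the hand-offs concrete.  this is not the real agent code: the agent names, their order in PIPELINE, and the dict-based mail objects are all assumptions made up for illustration, and only the -inbound/-outbound naming comes from the plan above.

```python
from collections import deque

# Hypothetical agent order; the real routing rules were still being designed.
PIPELINE = ["antispam", "antivirus", "delivery"]

def make_collections(agents):
    """Give every agent (and the queue itself) an -inbound and -outbound collection."""
    return {f"{name}{suffix}": deque()
            for name in ["queue", *agents]
            for suffix in ("-inbound", "-outbound")}

def next_agent(done):
    """Pick the first agent in the pipeline that has not yet seen the mail."""
    for agent in PIPELINE:
        if agent not in done:
            return agent
    return None  # nothing left to do: the mail has been delivered

def run(collections, handlers):
    """Move mail between collections until the queue's -inbound is drained."""
    delivered = []
    inbound = collections["queue-inbound"]
    while inbound:
        mail = inbound.popleft()
        agent = next_agent(mail["done"])
        if agent is None:
            delivered.append(mail)
            continue
        # the queue agent drops the mail into that agent's -inbound collection
        collections[f"{agent}-inbound"].append(mail)
        # the agent processes it and drops it into its own -outbound
        job = collections[f"{agent}-inbound"].popleft()
        handlers[agent](job)
        job["done"].append(agent)
        collections[f"{agent}-outbound"].append(job)
        # the queue agent picks it back up for the next hop
        inbound.append(collections[f"{agent}-outbound"].popleft())
    return delivered
```

since every hop is just a move between collections in the store, listing what is sitting in each collection would give the “mailq” view for free.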

this will require some extensive changes to the system.  obviously the new queue agent and the drop of the old one.  it’ll also be a big re-work of major portions of most queue agents.  some really need it.  store will also need some additions to make this all work, most notably a command which will allow the queue to move a mail from one store (the _system store) to another (the destination user’s store), and a way to do a watch on multiple collections.

i’m SUPER — yeah, yeah i know, caps???? fat never uses caps — excited about this.  i’ve long thought that the queue needed some major work as it is the last big portion of legacy code leftover from the 90s.  i think it is an impressive architecture and i am still in awe of its original creator’s abilities, but it needs some big time love.

one other task came out of it for me and that was the continuance of valgrind.  since we dropped memmgr we’ve found some pretty serious bugs that were causing us major issues.  some were found directly thanks to valgrind, and others were just due to crazy runtime issues.

all in all there is a load of work coming down the line.  i don’t think i’ve been more excited for the changes though.

memmgr

June 18th, 2009

ok, ok.  i get the hint.  i’ll blog.  everyone in the channel has been clamoring for more information and i’m terrible at sharing it.  the startling lack of posts is not because there isn’t anything going on, but because i’m such a crappy writer, which you either already know or will soon find out.

a lot has changed since the last blog post.  some of it i’m sure you are already aware of, but some you may not be.  the biggest note regarding the trunk is that we’re in testing on the .5 release now.  i took time to migrate my personal domain off hula and onto bongo which has been great.  i can now find and fix the bugs that we were hitting on hula.  there are still some hard to find bugs, but i’m betting that the other big news will help find them.

now on to the most amazing stuff….  the new memmgr branch.  this has been a major blessing for the bongo project.  back in the day, Novell engineers decided to write a custom memory manager for the product.  i think it was back around the NetMail 3 days.  i don’t recall it being there with NIMS, but i could be wrong.  anyhow, it apparently  greatly increased the speed of the product on Netware.  it stuck around and we got the scrubbed version via the hula project.  i recall in my programming classes in college writing my own memory manager and the benefits of them, however it does make some things a bit difficult.  finding memory leaks and overwrites can be a bear.  tools like valgrind had a very hard time with the memory allocation system as they would see one massive alloc and not a bunch of smaller ones.

i don’t know if there will be any sort of speed improvements over the custom memory allocator, however my gut says not really (note here that i’m no expert on memory subsystems…).  the amazing benefit to us is that valgrind can sensibly operate with our binaries now.  i just spent time debugging bongo-config on the branch and i’m surprised we even worked before given how things were coded.  i was freeing stuff then using it and valgrind was useless to find it as the custom manager didn’t return the memory to the host, but kept it for pooled use later.  this will be an incredible addition to our codebase.
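a toy illustration of why the pooled allocator hid those bugs.  this is python standing in for C, and the Pool class is entirely made up (it is not memmgr’s interface).  the point is only that a “freed” buffer kept in a pool is still valid memory, so reading it later trips no alarm and just quietly returns whoever owns it now.

```python
class Pool:
    """Toy pooled allocator: freed buffers go back on a free list, not to the OS."""

    def __init__(self):
        self.free_list = []

    def alloc(self, size):
        # reuse a pooled buffer when one is big enough
        for i, buf in enumerate(self.free_list):
            if len(buf) >= size:
                return self.free_list.pop(i)
        return bytearray(size)

    def free(self, buf):
        # the memory is never handed back to the host, just kept for later
        self.free_list.append(buf)


pool = Pool()
name = pool.alloc(8)
name[:5] = b"bongo"
pool.free(name)               # "freed", but the buffer is still valid memory

other = pool.alloc(8)         # the pool hands the same buffer out again
other[:5] = b"hula!"

stale_view = bytes(name[:5])  # use-after-free: reads the new owner's data
```

with plain malloc/free underneath, a checker like valgrind sees each allocation and each free individually, so that last read gets flagged instead of silently succeeding.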

another major change was cmake.  alex wrote a bit about the make system on his blog, but i’d like to throw my two cents in.  the new system is simply amazing and quiet.  don’t get me wrong, there is nothing i like more than watching gcc compile endless .c files on a friday night (which is why i use gentoo), but this is awesome.  percent complete and all.  along with that are the quick installs and “make distclean” being as simple as “rm -fr {SRC}/build”.

the last major change, so far, with the branch is logging.  we got rid of log4c and now just have a simple syslog interface.  this will be swapped with something a bit more modular and configurable in the near future, but for now it works great.  just be ready for spam in your syslogs (which is another task on my todo: fixing the log levels).

overall .5 is incredibly stable on my production server, and the branch is now running on my test server.  the store is running stable against my 1.4GB 151,400 (and growing) email mailbox.  expect leak fixes and other improvements over the short term!

i did have one idea the other day based off a conversation with Chris at my work.  an agent that can arbitrarily add, or change, header lines.  this is the lowest priority task on my list as its use would be rare, but it might be nice to have.  other tasks on the plate are the re-write of queue which we’ve been talking about for a while now.  we have some good ideas on this.  rewriting smtpd which apart from queue is the last major section of truly legacy code.  and of course we can’t forget things like bongo-manager, bongo-queuetool, and my favorite, snmp.

there are a lot of tasks there, some required, some just dreams, but all i think good ones.  this is starting to get fun!

antispam and queue

September 29th, 2008

so it has been forever since i’ve done any blogging.  life has been crazy for the last little bit.  I spent the day and finally finished the antivirus fixes that i’ve been wanting to do.  it is now basically identical to the antispam agent in how it works.  it does not send any sort of a bounce currently though and i’m wondering what the “proper” thing to do is.  i suppose that i should send some sort of a note to the destination if the email had a virus in it, however do i do that only if the destination is local?

Other changes that i’ve done are some improvements to the queue.  i’ve replaced the locking mechanism that was in the code with a bongohash.  it is a ton easier to understand.  i’ve also added some debug to the queue to see what is going on with the locking.  i plan on adding logging to the anti(spam|virus) agents.

bongo-test.info is running the latest code and is performing pretty well.  my task this week(end) will be to build an smtp bit bucket and have my server forward the mail so that i can test outbound smtp email performance, however inbound is working pretty stinking well atm.  i’m on all my usual lists and am up to 12,298 emails.  tbird is running great now with bongo.  i downloaded the headers for 2100 emails which tbird sorted great.  It took a bit of time, but it worked.  i can’t give exact numbers, but i did sort 60 emails in an amount of time that i’d consider reasonable from an end user perspective.  i’ll have to run some timings on it shortly.

the other task that i have is to write up a queue email list protocol extension.  at least that’s how i was planning on doing it.  I was going to add a queue command that would list all the emails currently in the queue following normal queue protocol syntax where it would be 2001-EMAIL INFO though i’m not exactly sure yet how i’m going to format the information.  the addition of this command might require a “get me all the information for queue entry x” (though i’m not sure yet what that would even look like).  i’d like to get this done as soon as i can since i’m soooooo far behind on bongo coding (and i’m feeling it).  i really need to get life under control so i can bongo more :)

smtpc

March 2nd, 2008

As I was putting the baby down for bed tonight it hit me that I should blog about today’s commit to trunk.  As an introduction, when Hula was created from the NetMail codebase, the team did what they called “scrubbing” the code.  Basically they removed any portions that were not releasable and made the code cleaner.  SMTP didn’t get a whole lot of attention and was very, very similar to the code that was used on the NIMS codebase years before.  Parts of the smtp agent were quite old and pretty messy.  Because of that we decided that it would be a good thing to re-write it and split it out into incoming and outgoing.  Thus the birth of what we called smtpc where c stands for Client.  Now starts the devel type stuff so if you don’t want to know the innards, skip the rest.

The queue system has some pretty cool benefits (a couple that I want to re-write and clean up very soon).  Basically it allows us to have agents anywhere via IP.  Email messages enter in queue 0 and progress through the queueing system one queue at a time.  Agents can be registered at each queue number to do various tasks (antivirus, antispam, server side rules etc.).
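A small Python sketch of that idea may help anyone new to the queue.  It is only a model: the QueueSketch class, the callback style, and treating 7 as the highest queue number are assumptions for illustration, and the real agents register over IP rather than as in-process functions.

```python
from collections import defaultdict

class QueueSketch:
    """Model of a numbered queue pipeline with agents registered per queue."""

    def __init__(self, last_queue=7):
        self.last_queue = last_queue       # assumed highest queue number
        self.agents = defaultdict(list)    # queue number -> registered callbacks

    def register(self, number, agent):
        """Attach an agent callback to one queue number."""
        self.agents[number].append(agent)

    def process(self, message):
        """Walk a message from queue 0 upward, one queue at a time."""
        for number in range(self.last_queue + 1):
            for agent in self.agents[number]:
                agent(message)
        return message
```

Registration order does not matter here; a message always visits the queues in numeric order, which is what lets agents such as antivirus or antispam sit in front of delivery.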

One of the things that the old system did not do was know if an email address was remote or local.  Some time ago I committed code to determine if addresses are local (the new aliasing system).  Because the old system did not do this, the smtp agent listened on queue 6 and queue 7.  The purpose of queue 6 was solely to determine if an email address was remote and rewrite the email envelope to reflect the new address location.  This was a big waste of resources and made the smtp code much more complex than it needed to be.  With the new agent, smtpc now only listens to queue 7.  Messages that hit queue 7 will be delivered remotely since the queue agent now rewrites address on mail dropoff to reflect remote or local.

At this point I’d like to explain a little bit about the code.  Some of this information should end up in the project wiki at some point, but I’m pretty stupid when it comes to wikis and don’t have the time tonight to learn it, but I think this might be handy for those fixing the bugs that surely exist in the current smtpc codebase.

I’ll start in XplServiceMain() at the bottom of smtpc.c.  The main portion of this function is the call to XplStartMainThread().  Basically the rest of it is configuration code.

SMTPAgentServer().  This is the function that is run by the main thread.  This function is also pretty simple.  The main call here is BongoQueueAgentListenWithClientPool().  This is a pretty cool function that creates a thread pool.  This is a blocking call.  Basically the innards of this function will listen on a port and any time a new connection is made it will spawn a thread and call the function passed in parameter 6 (ProcessEntry in this case).

ProcessEntry().  This is probably the most important function in the whole file.  The first thing we do is handshake with the queue.  The queue authentication scheme is kinda complex and this function handles it for us.  Prior to this function it was a whole lot of duplicated code.  When the agent gets connected it reads out a string that consists of:

6020 qID envelopeLength messageLength numRecips

The next thing that we do is BongoQueueAgentReadEnvelope() which reads in the envelope into a long character string.  This is one long string where \r\n (or \n) are replaced by \0.  We use the BONGO_ENVELOPE_NEXT() macro to move us down the string past those \0 characters.
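To make the two formats concrete, here is a Python sketch that parses the 6020 line and walks a NUL-separated envelope.  It is not the C code or the real macro: the function names are made up, the sample qID and envelope contents in any usage are placeholders, and the handling of a doubled NUL (from a \r\n pair) is an assumption about how the replacement is done.

```python
def parse_entry_header(line):
    """Split '6020 qID envelopeLength messageLength numRecips' into fields."""
    code, qid, env_len, msg_len, num_recips = line.split()
    if code != "6020":
        raise ValueError(f"unexpected response code {code}")
    return {"qid": qid, "envelope_length": int(env_len),
            "message_length": int(msg_len), "num_recips": int(num_recips)}

def envelope_lines(raw):
    """Yield each line of an envelope whose line endings were replaced by NULs."""
    pos = 0
    while pos < len(raw):
        end = raw.find(b"\0", pos)
        if end == -1:
            end = len(raw)
        if end > pos:            # skip the empty gap a \r\n pair leaves behind
            yield raw[pos:end]
        pos = end + 1
```

The generator does the same job as stepping with BONGO_ENVELOPE_NEXT(): find the end of the current string, then hop past the terminator(s) to the start of the next line.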

Now we start into the main loop processing the envelope.  Since the BONGO_ENVELOPE_NEXT() macro steps based off strlen, and since I’m going to want to null out characters, I save a pointer to the end of the current string so that the macro will work right.  This is a little wasteful since in effect I’m basically doing a strlen(""), but I wanted to use the standard macros to make this as straightforward as possible.  You might be looking at the code thinking, why the heck is he writing the envelope back to the queue.  Well, if you want to make any changes to the envelope at all you need to write the whole envelope back.  So on all envelope lines except those that are recipients, I immediately write them back out to the queue.  I do this since mail delivery might not be successful on a recipient.  If it is not, then I want to retry later when the queue agent determines that the smtp agent should handle a message.  I obviously do not want to write out the successful entries else the message would be duplicated in that user’s mailbox.

Within this loop, when I get to a recipient I allocate a structure to hold all the information for them and then at the end of the loop I call the macro to step me through to the next envelope line.  After this I start another loop over all recipients.  This loop is the loop that needs a little help.  I tried to come up with a good way to do this, but decided in the interest of time to just make it work then optimize it later.  The loop compares the current entry with the next and skips it if they are the same.  Thus we only get a delivery once to a user if they were somehow duplicated in the envelope.  I do this by a field in the recipient structure called the SortField.  This field is the reformatted email address like:  domain.com@local_part.  I did this so that the sorting of the array would have all domains grouped together.  That way eventually I could connect once to a given MX and drop the message off for everyone.  I currently connect to the MX as if it was unique in the list (that’s where the optimization comes in).  Perhaps we could key it off a configuration setting of some kind.  Perhaps recipients_per_connection or something, that is just an idea.
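The SortField trick is easy to show in a few lines of Python.  This is a sketch, not the smtpc code: the function names are invented, it compares each entry with the previous one rather than the next (same effect on a sorted list), and group_by_domain() only illustrates the per-MX optimization that is not implemented yet.

```python
def sort_field(address):
    """Reformat local_part@domain.com as domain.com@local_part."""
    local_part, _, domain = address.rpartition("@")
    return f"{domain.lower()}@{local_part}"

def unique_recipients(addresses):
    """Sort by SortField, then skip any entry equal to the one before it."""
    ordered = sorted(addresses, key=sort_field)
    result = []
    for current in ordered:
        if result and sort_field(result[-1]) == sort_field(current):
            continue   # duplicate in the envelope: deliver only once
        result.append(current)
    return result

def group_by_domain(addresses):
    """Possible follow-up optimization: one MX connection per domain."""
    groups = {}
    for address in unique_recipients(addresses):
        groups.setdefault(address.rpartition("@")[2].lower(), []).append(address)
    return groups
```

Because the domain comes first in the sort key, duplicates end up adjacent and every domain's recipients form one contiguous run, which is exactly what a one-connection-per-MX loop would need.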

There is a call in this loop to LookupRemoteMX().  This is some of Alex’s code from his amazing fixes to the dns lookup code.  This function connects to the MX and returns the connection to it.  I’ve got a fixme in there because I’m not really sure if skipping the message is right.  Options are bounce it, or skip it which would try again later.  Then we call DeliverMessage() to actually do the smtp convo with the remote site.  Based off the result I either send a DSN of success to the sender, try again later, or send a DSN of failure.

Now for the second most important function: DeliverMessage().  From here on out we’re mostly SMTP with only one call to the queue to fetch the mail body.  I lamented the label that I used, but without it the code would be pretty ugly in handling the tls stuff (which works for me atm).  Really this is all straight smtp, so while this is pretty important I’m not going to go over that here.  This function also disconnects from the remote MX (which is also a part of that loop optimization above).

I hope that I didn’t bore the tar out of you with that discussion, but the queue is a confusing beast at times, and I’d like to get information out there so that we can get others developing agents.

End of Year

December 24th, 2007

Everyone,
  DISCLAIMER: excuse the sappiness, it’s the season

  As I sit here in my kitchen I look around and see all the Christmas letters that we’ve received from friends and family.  I figured that one of my own might be appropriate.  This has been a great year for the Bongo community, one for which I am grateful.  We as a project were basically born this year.  I looked over my hula logs (yes I still log there for no reason, yes I still have all the old logs, and yes they are still all available).  According to http://www.feltonline.com/p_c/hula-logs/hula-2007-01-13.txt @ 12:48 (according to the logfile about line 303) we start to see “bongo = strdup(hula);”.  I remember the day fondly and I don’t think I’ll soon forget it.  Starting there, many a late night was spent in utter enjoyment (go ahead, call me a geek) coding away.  I had spent nights coding on Hula, however this was different.  We were our own entity now and leading our own path.  One which attracted debates at times, but one that, I think, has turned out very well.  We are close to our latest milestone and only have a few rough edges before we can call ourselves a releasable product.  I am excited for that day.

  Now for a slight status update.  I’ve been reading up on various RFCs that I didn’t even know existed.  Things like Delivery Status Notification RFCs used in SMTP.  All this in prep for a rewrite of the SMTP daemon.  I’ve got the outgoing code (smtpc) approximately half way there using the generic agent type system.  I’ve had to make changes to the api a bit which required changes to other agents.  Mostly simple changes that should have been there from the beginning and very simple in nature.  Once I dig out of those RFCs I’ll dive into the incoming code (smtpd).  That one promises to be a little more complex.  After that I’m not sure where I’ll go.  Perhaps back to the queue which could use a bit of help, or perhaps on the agent cert/configuration/startup code that Alex and I have chatted about in the past.  The only other major outstanding issue is the Log4c issue that Lance and I were chatting about which I haven’t yet gotten back to.

  I’ve gotten my test server at swedepop.com back up on the new maildir store.  I haven’t imported the old mailbox yet though. I hope to get that done fairly soon though it doesn’t take long on the lists I’m on to get a ton of mail :)   Also upcoming is new hardware.  A friend of mine was cleaning house and sent me a couple of machines that should be pretty quick in installing.  One is another Athlon 64 (my test/dev servers are VMs on an Athlon 64 XEN-32bit) and an old Athlon XP.  I think one of them is destined to be OpenSolaris as I’ve really wanted to run Bongo through dtrace.  The other I’m not sure about yet.  If I can put OpenSolaris on the XP machine then perhaps I can run an Athlon in 64-bit mode in case we have any more odd 32/64 issues.  If not perhaps I’ll move the dev or test server over to it.

  That’s it from my end of the Bongoverse.  I hope that everyone has a wonderful holiday season and can spend it with family and friends.  As I stated before, we have a great community with some great talent.

pat
(fatpelt)

P.S.  I’d like everyone to pause for a minute and look over my amazing patience throughout this note.  I had to use the backspace far too often because *i actually found the shift key for once!!!*

queue aliasing changes and smtp

November 29th, 2007

i’ve been trying to re-write the smtp agent over the past few nights using the new generic bongoagent style (which is much cleaner than the old system).  in trying to get it to go nicely i found that it’s not really easy in the current system to have two threads with connection pools to the queue agent.  the smtp agent did this so that it could listen to both queue 6 and 7.  the really stupid thing about the nmap protocol was that if it wasn’t known if a user was remote or local, one must assume it is remote.  but the remote delivery queue comes after the local delivery queue.  so by the time we get to queue 7 we’d have missed our chance to deliver locally.  thus queue 6 was born.  smtp’s *only* job on queue 6 was to rewrite the envelope to be a local destination if it really was.  the new patch (r559) attempts to move that responsibility into the queue.  as bongoqueue is rewriting the envelope (the reason for which escapes me really, but it turned out to be useful), i use the aliasing code that i added earlier to resolve if a user is remote or local and replace the recipient lines to correctly reflect destination.

whew!  this being done, now i can continue on with the smtp rewrite.  for lack of a better name, i named it bongosmtp_o for outbound.  i’m sure there is a better name for the “agent that delivers smtp mail to remote systems” (vs “the agent that receives smtp mail from remote systems”, which i’d planned on naming bongosmtp_i).  suggestions are very welcome.

in getting ready for the generic style bongo agent i’ve made a few modifications to core libraries to clean out a bit of un-needed code.  the next commit will touch a bunch of files, though the changes are pretty minimal for all except the new smtp.

antispam

November 21st, 2007

I’ve come to the realization that the antispam code needs a good rewrite.  it is incredibly messy.  after adding an insane amount of debugging statements to the queue i finally figured out why i was getting c*.001 files left in the spool.  every time the antispam agent was called it would modify the message envelope then delete it.  the delete would happen, but the remaining w file would still be there and dutifully renamed into a c file later on.  the queue needs a ton of work for file IO.  i know MA did a bunch to optimize it but i didn’t realize how bad we really were in terms of file IO.  http://www.feltonline.com/p_c/bongo/queue-io.txt shows how bad it is for one message passing through the queue being replaced by the message with the spam headers attached.  ick!

more bug fixes

October 30th, 2007

Lance Haig has done some great testing and pointed out a couple of bugs.  the first was due to his using a tool to import his imap stuff off netmail to bongo.  when the imap daemon got *ported* from the hula trunk to the hula branch (which is now bongo) a lot of things had changed.  the most important of those changes was obviously the store.  in trunk an mbox format was used that had problems with the concept of sharing a mailbox.  functions existed to do crazy things in the case of one person expunging a message and letting someone else know (at least that is what it seemed like in the beginning).  the namespace imap command (which his tool needed), had been commented out and replaced with an unknown command error message.  i tried to enable the code but found those “crazy things” functions missing from the code.  in fact i could not find them anywhere in our codebase.  on a whim i decided to check the hula source and bingo there they were.  i didn’t spend too much time trying to figure out what they were doing, i just removed the code and re-enabled the namespace command.  with the new backend, i’m not sure that we need that information any longer as the store takes care of all of that.

after that he decided to play with his new system and needed to know how global domains worked.  i, thankfully, did some testing on that before sending instructions over to him and found that some work needed to be done to get it working.  i decided that i might as well get it going with global domains, a concept that we’d discussed and decided to continue.

global domains you say?  what are they?  well i think we should come up with a better name for them.  really what they are is:  “my usernames are not fully distinguished.  if mail comes in for a user at any global domain, strip off the domain portion and use what is left as the username.”  for example, on my feltonline server here, my username is not fully qualified.  so which of the domains that i service does that email address belong to?  it belongs to ALL of the domains in my global domain list, at the same time.  i figured hey!  the aliasing system should be able to handle that if it is possible to alias one domain to the empty string.

i had to code a little bit for it to work, but now the system automagically adds a domain level alias for any domain in the global domains list to the empty string.  pretty much changing “pfelt@swedepop.com” into “pfelt”  (where swedepop.com is a global domain).  i had to add another property to the queue agent’s configuration document “domains”  which i think should be changed once we get a decent name for the concept.
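a python sketch of the “alias the domain to the empty string” idea.  the function names and the dict used to hold the aliases are assumptions for illustration.  the real code lives in the queue agent and reads the list from its configuration document.

```python
def build_domain_aliases(global_domains):
    """Alias every global domain to the empty string."""
    return {domain.lower(): "" for domain in global_domains}

def resolve(address, domain_aliases):
    """Apply a domain-level alias; an empty target strips the domain entirely."""
    local_part, sep, domain = address.rpartition("@")
    if not sep:
        return address                 # already a bare username
    target = domain_aliases.get(domain.lower())
    if target is None:
        return address                 # not an aliased domain, leave untouched
    return local_part if target == "" else f"{local_part}@{target}"
```

with swedepop.com in the global list, resolve("pfelt@swedepop.com", ...) comes back as plain "pfelt", while mail for any other domain passes through unchanged.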

the aliasing system is still somewhat limited though so don’t forget!!  you still can’t do somegroupname@somedomain.com => user1, user2, user3 type stuff, that’s coming.  i’m gonna be doing some more testing on this, but the code has been committed and i’d appreciate any testing.  after all of this coding to get it working it feels a little hackish how i did it, and i’ll probably end up changing it when i get around to fixing the one to many type situation anyhow.

i hope to be able to get to some python stuff tomorrow so that we can square away the installation stuff (actually setting hosted and global domains on configuration).

sheesh! what a patch

October 24th, 2007

so, i’ve just committed another couple of revisions to the branch.  i figure i’ll go through them and give another state of the state as i’ll be out of town till monday.

back when i was debugging the store/queue system due to my mail importing stuff i noticed that there were messages that weren’t showing up properly in imap.  one of them being a 10meg video that my brother sent me.  for some reason i couldn’t open it at all.  (which incidentally brings up another bug that i can’t forget to send to halex in DF regarding that same email).  i barely scratched the surface before moving to something else, filing the issue away for a later time.  i figured, at the time, that imap was something i didn’t really want to dive into as it is a pretty complex agent.  well, i came back to it tonight and found where the error was.  a simple bug where the store sends back the structure.  the GetMimeInfo() function would allocate a character string and store all the information in a BongoArray.  we would allocate say 50 bytes (by using a strlen) and then we’d say memcpy(dest, src, sizeof(char *)).  oops.  that only copies a pointer’s worth of bytes, so we’d only really get the 2002 response code from nmap and not the structure.  basically any multipart messages would not be readable properly by imap.
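for anyone who doesn’t live in C, here is the same mistake emulated in python.  the sample response text is a placeholder (only the 2002 code comes from the description above), and slicing a bytes object is standing in for memcpy.

```python
import struct

POINTER_SIZE = struct.calcsize("P")   # sizeof(char *) on this machine: 4 or 8

def buggy_copy(response):
    # mirrors copying sizeof(char *) bytes instead of the string's length
    return response[:POINTER_SIZE]

def fixed_copy(response):
    # copy as many bytes as the string actually holds
    return response[:len(response)]

response = b"2002 placeholder mime structure text"
truncated = buggy_copy(response)      # just the code plus a few stray bytes
complete = fixed_copy(response)
```

on a 32-bit build the buggy copy is exactly four bytes, which is why the response code survived and nothing else did.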

then, i committed a ~1300 line patch to queue/imap/pop/smtp where i stripped out settings from their configuration documents and put them in the global document.  because of this patch all existing branch installations will be broken till they get the new global config document.  there is a sample one included with the bongo-config application and if you don’t mind losing current settings you can just re-run the bongo-config install process.  that system does not yet currently modify the documents before writing them out, so you’ll have to change the document to show correct values.

if you want to run it by hand:

  1. telnet localhost 689
  2. auth user admin bongo
  3. store _system
  4. write /config 7 NUMBEROFCHARSINDOC Fglobal
  5. paste in new document
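
the only fiddly part of doing it by hand is the NUMBEROFCHARSINDOC value.  a small python helper can work it out and print the lines to type.  this is a sketch: the helper name is made up, it doesn’t talk to the store at all, and counting UTF-8 bytes rather than characters is an assumption about what the store expects.

```python
def manual_config_commands(document, user="admin", password="bongo"):
    """Return the lines typed into the store session from the list above."""
    size = len(document.encode("utf-8"))   # NUMBEROFCHARSINDOC (assumed: bytes)
    return [
        f"auth user {user} {password}",
        "store _system",
        f"write /config 7 {size} Fglobal",
    ]
```

print the result, paste it into the telnet session, then paste the document itself right after the write line.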

after that you should be running again.  i’ve had the patch running on swedepop for the last day and not seen any issues with it, but please test it out.

other things this patch did include some basic housecleaning in smtp.  there was an unused variable taking up stack space that i cleared out.  smtp also got a new queue registration loop.  occasionally on startup here, smtp would not correctly register with the queue because the queue wasn’t up yet (which i just realized in my commit message i said store instead of queue, oops!).  this should work now regardless as there is a loop with a sleep in it.  i also changed a ton of XplConsolePrintf() statements into Log() statements giving us a little better logging.  i’m sure i misclassified some of them, but those are pretty easy changes.
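the registration loop boils down to “try, sleep, try again”.  a python sketch of that pattern follows.  the function name, the attempt limit, and the delay are all invented for illustration, since the actual C loop’s limits aren’t described here.

```python
import time

def register_with_retry(register, attempts=10, delay=1.0, sleep=time.sleep):
    """Call register() until it succeeds, sleeping between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            register()
            return attempt          # how many tries it took
        except ConnectionError:
            if attempt == attempts:
                raise               # give up and let the caller see the error
            sleep(delay)
```

passing the sleep function in makes the loop easy to test without actually waiting.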

in case some of you missed in the logs, i subscribed to the lkml just for traffic (i’ve gotten 176 mails in a little over 24hours — if you have any ideas for other high traffic mailing lists let me know, i’d love to subscribe).  bongo is handling it ok and the convo stuff in dragonfly works pretty well, though i wish there was a way to have the “inbox” view formatted more like the summary view.  perhaps just a different button or something.

state of the state:  i’m all committed except for the experimental aliasing code that doesn’t work.  branch feels very stable and useable at the moment except for that oddness with the antispam stuff marking messages as spam due to an odd header thrown in somewhere in the system (that’s next on my list unless something else comes up).  after that aliasing, then perhaps a major overhaul of smtp or the requested changes to the imap mail importer to run once for all mailboxes.  that is gonna be tricky i think especially since my python isn’t all that good.  if anyone wants to volunteer for that i’d be happy to send it over :)

all that being said, i’ll be back sunday night some time.  i hope to check my email some though i don’t exactly know what the status will be on online time.  now’s the time i really miss having a laptop.

bugs, bugs, bugs

October 17th, 2007

it’s been about a month since my last blog post. guess it’s time for another :)

lots has happened in the time since the last post. i didn’t realize after i’d checked in the aliasing code that i’d left a huge hanging section that hadn’t been completed and that pretty much stopped the remove branch from working. so i dug and found new reserves and have been a busy guy since then trying to get stuff going for the upcoming m3. i figure i’ll just go down the commit log and where i’m at now to explain what is going on in my portion of the m3 bongoverse.

added a DOMAIN LOCATION command to the queue agent. this allows protocol consumers to pass in an email address and get a result if the domain is local or remote. eventually we’ll add back in the relay domain stuff (smarthosts)

one of the problems with mail not going out was that the default configuration was incorrectly set to smarthost outgoing mail through another server which had never been configured.

i found when playing with gass’ odbc work that return codes could be integers when we were comparing them as booleans. depending on how the comparison was done, it could produce runtime errors because of conversions.  i posted an example application to the -devel list which should compile anywhere regardless of a bongo repo. this caused a couple of issues in the code that determines if users exist and if their passwords are correct.  most of the agents do something with those functions, so for example smtp auth and imap login were both suspect.
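the same trap exists in python, which makes for a short demo.  find_user() and its return values are hypothetical (this is not the example posted to the -devel list).  the point is just that “non-zero means success” and “equals TRUE” are different tests.

```python
TRUE = 1   # how a C-style TRUE constant is usually defined

def find_user(name):
    """Hypothetical lookup that returns an integer code, not a strict boolean."""
    return 2 if name == "pfelt" else 0   # any non-zero value means "found"

code = find_user("pfelt")
loose_check = bool(code)        # treats every non-zero code as success
strict_check = (code == TRUE)   # only matches the exact value 1
```

here the user exists, the loose check says so, and the strict check says no, which is the kind of disagreement that made the auth paths suspect.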

as stated above getting aliasing code actually running in smtp and getting it to accept mail. this led to bugs in the queue along the same types of lines

by the time i had all this working i had a fully working bongo server and i set about getting things set up to run it. i can’t run it on my normal server as i have too many users that *need* email to expose them to the alpha code. i consulted with my amazing isp and now have two new shiny ip addresses that i can bind. i ping’d my brother who hosts some domains on my server regarding availability of stealing the MX for one of them. he probably won’t ever use the MX for swedepop.com so i set up bongo on one of the ips and subscribed to all the bongo lists. this has provided amazing opportunity to test and debug the system as you’ll see below.  (mail me if you’d like an account.  it’s open to anyone who won’t spam ;) )

we found that bounces weren’t functioning properly. this is kinda important in the email world so i moved on to figuring this out. it took a very long time as the queue code is a little messy. it feels like it might be a lot of original code, though i can’t be entirely sure. i didn’t dive into the queue code too much back in the day at Novell. i committed the fix in two steps. the first being mostly correct, but mainly it was so that i could put the new code out on swedepop.com to test it accurately with some live servers. around 1am on the 15th that problem was tracked down to some broken code and a configuration setting not set properly.

the next step was importing. alex and i had been chatting and he asked if i’d tried importing. i know others had, and it didn’t work too well or died in the process. he was busy this week so i headed in that direction. i spent a ton of time trying to import my 14600 email hula trash folder and would quickly run out of memory on my 128m servers. i tracked this down to bad python code, however i couldn’t find a memory efficient way to do it. python just doesn’t do a good job of freeing memory and returning it to the system. as i found this to be a lost cause, i decided to write my own importer that doesn’t use the filesystem mailbox stuff and just uses imap. i figured this would be good since then the implementation of the underlying file structure wouldn’t matter any more. the server would hide that from me. i extended -storetool to allow for passing in imap information. this was run on the trash bin and worked.
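for the curious, the shape of an imap-based import is roughly the following. this is a sketch using python’s standard imaplib, not the -storetool change itself; `import_folder` and the `deliver` callback are hypothetical names, and `deliver` stands in for whatever writes a message into the destination store.

```python
import imaplib


def import_folder(conn, folder, deliver):
    """Copy every message in `folder` to `deliver`, one message at a time.

    `conn` is a logged-in imaplib.IMAP4 (or IMAP4_SSL) connection.
    `deliver` is a hypothetical callback that stores one raw message (bytes).
    Returns the number of messages delivered.
    """
    # read-only select so fetching does not change flags on the source
    typ, _ = conn.select(folder, readonly=True)
    if typ != "OK":
        raise RuntimeError(f"cannot select {folder!r}")
    typ, data = conn.search(None, "ALL")
    if typ != "OK":
        raise RuntimeError("search failed")
    count = 0
    for num in data[0].split():
        # fetch a single message so only one body is in memory at once
        typ, parts = conn.fetch(num, "(RFC822)")
        if typ != "OK":
            continue  # skip messages the server refuses to return
        for part in parts:
            # message bodies arrive as (envelope, body) tuples
            if isinstance(part, tuple):
                deliver(part[1])
                count += 1
    return count
```

because each message is fetched and handed off before the next one is requested, memory use stays around the size of the largest single mail rather than the whole folder.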

the live server showed its true colors yet again when it started throwing odd errors that i tracked down to calling fclose() then trying to fseek() the same handle. got that fixed so mail delivery could continue.

in the mean time gass had responded to my email to the list as to how to import lots of mailboxes. my response email crashed the server, guaranteed, every time it ran. this was a puzzler. it took a long time of crazy debugging in both queue and store, but i finally tracked it down to a bug in connio’s ConnWriteFile(). this function shouldn’t really have been called, but because of another bug that i have yet to tackle, it was. in the queue, if the recipient’s store is local to the queue, we run a queue command that says “deliver the file on the filesystem at <insertpathhere> into this user’s mailbox”. that determination is done by calling a library function that should return the ip address bongo is bound to. that function was failing. because of that, queue assumed that the store was on a remote system, used connio to connect to it and then passed the file’s contents over the connection (which calls ConnWriteFile()). this function wasn’t properly delivering the file’s full contents, which caused an abort since the two ends of the connection were no longer in protocol sync (one thought it was done sending info and could run the next command and the other was waiting for more information). what a mess!
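i won’t go into exactly what was wrong inside ConnWriteFile() here, but one classic way a “send this file” helper ends up sending less than the whole file is ignoring short writes. the python sketch below shows the usual defensive loop. it is an illustration of the general technique only: `write_file` and `send` are made-up names, and this is not the connio fix.

```python
def write_file(send, path, chunk_size=8192):
    """Send the whole file at `path` through `send`, tolerating short writes.

    `send(data)` is any function that returns how many bytes it accepted,
    in the style of socket.send(). Returns the total number of bytes sent.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file
            view = memoryview(chunk)
            # keep calling send until this chunk is fully accepted
            while len(view) > 0:
                sent = send(view)
                if sent <= 0:
                    raise ConnectionError("peer stopped accepting data")
                view = view[sent:]
            total += len(chunk)
    return total
```

if the sender stops early, the receiver is still waiting for the rest of the message body while the sender moves on to the next command, which is the out-of-sync situation described above.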

this whole error condition pointed out a missing unlink() command of a temporary file used when receiving the data from the file on the store end.

“dogs and cats living together, mass hysteria!!!” <bill murray in ghostbusters>

anyhow, most of the situation is under control and now (once i fix the ip address problem) i can move on to other stuff. i’ve got aliasing work to do (fixing it to work properly and setting it up so that you can say somegroup@domain.com => ‘user1@domain.com’, ‘user2@domain.com’,…). along with that, alex and i have chatted about an idea i had thought about a long time ago but not mentioned to anyone until rprice told me MA had done this, and then alex came up with it too on his own: splitting the smtpd daemon into two pieces, one for incoming mail and one for outgoing mail.
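the group alias idea boils down to a lookup that can fan out to several recipients. here is a small python sketch of that, assuming aliases live in a plain dict and may point at other aliases; the function name and data layout are invented for the example and say nothing about how bongo will store them.

```python
def expand_alias(address, aliases, _seen=None):
    """Expand `address` into the list of final recipients.

    `aliases` maps an alias address to a list of addresses, which may
    themselves be aliases. Visited addresses are tracked so that alias
    loops terminate and duplicate recipients are dropped.
    """
    seen = set() if _seen is None else _seen
    if address in seen:
        return []  # already handled (or part of a loop)
    seen.add(address)
    if address not in aliases:
        return [address]  # a plain mailbox, nothing to expand
    result = []
    for target in aliases[address]:
        for final in expand_alias(target, aliases, seen):
            if final not in result:
                result.append(final)
    return result
```

note that a pure loop (a points at b, b points at a) expands to nobody in this sketch; a real implementation would probably want to bounce that mail rather than drop it silently.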

oh one other thing on aliasing is that once we get aliasing figured out fully and implemented somewhat in the queue, we can remove the requirement that smtp listen on both queue 6 and 7. it does this because mail dropped off “must be dropped off as a remote domain if location is unknown” <paraphrased from the nmap docs> so the agent on queue6 just makes sure that the “remote domain” isn’t really a local domain and creates a whole new queue entry for it! what an amazing waste!!!! the mail just went through everything (spam, virus, any other queue agents) and now we re-create it just so it can get delivered locally. queue really could do with some optimization and by doing it we’d increase overall performance a ton!

speaking of performance, using imap sucking off my hula box i was able to insert all 14627 mails in about 43 minutes. not a bad clip considering all the processing overhead we do. i think we could improve that (along with the odd memory leak in there i mentioned in my -devel email).

more to come in a bit.