aliasing and the queue

September 4th, 2007

boy! have the previous weeks been insane or what? i started working in smtp but got sidetracked into the queue. there are a few dark corners left in the system and i think this is one of them. there is some interesting code in queue. anyhow, i pretty much got sidetracked inside the RewriteAddress() function of smtp. this is one inefficient function, as it parses the email address three times in an effort to determine if the address is local, remote, or relay. it supports some pretty odd things that i’d never heard of before. the first is called the “bang path”. it allows one to describe how a message gets from point a to point b. it is old holdover code from the uucp days before proper email came about. the second thing that it supports is the “percent hack”. this code allows you to specify how to relay mail. for example: pfelt%endpoint.com@relay.com will send the mail to relay.com, which will then rewrite the address to pfelt@endpoint.com. i’ve not ever heard of anyone using this system and i can’t find it anywhere in the specs. perhaps that is why it is called a hack.
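to make the two address forms concrete, here is a rough python sketch of what the rewrite amounts to. this is not the RewriteAddress() code (that is c and does a lot more); the function names are made up for illustration.

```python
def expand_percent_hack(address):
    """rewrite local%final.host@relay.host into local@final.host.

    this is what the relay is expected to do once the mail reaches it.
    addresses without a percent sign in the local part come back unchanged.
    """
    local, sep, _relay = address.rpartition("@")
    if not sep or "%" not in local:
        return address
    # the rightmost percent sign becomes the new at sign
    user, _, final_host = local.rpartition("%")
    return f"{user}@{final_host}"


def split_bang_path(address):
    """split a uucp style bang path (hosta!hostb!user) into (hops, user)."""
    *hops, user = address.split("!")
    return hops, user


if __name__ == "__main__":
    print(expand_percent_hack("pfelt%endpoint.com@relay.com"))
    print(split_bang_path("hosta!hostb!pfelt"))
```

the bang path lists every hop explicitly, while the percent hack only names one relay and leaves the rest to normal delivery.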

this got me going in the queue because i had had a chat with rprice a while back about aliasing.  in the old nims/netmail code there was an agent that was designed to do aliasing.  this meant that we had to accept all mail into the system and perform aliasing later, rejecting the mail then if the user didn’t exist.  this taxed resources more than it should have as we couldn’t reject the mail at smtp time.  he thought that aliasing should be a part of smtp, and i agreed with him.  then i thought that it would be handy if it were in a central place so anyone could resolve addresses properly.  alex and i chatted about it and thought queue would be the ideal place.

it took a while to come up with something for a lot of reasons.  one being employment getting in the way, but i finally got something going.  it works pretty well except for a bug or two.  based off the email conversation, i’m going to reimplement a couple of things a little differently.  the output is going to change to use 2000 level responses, allowing us to use groups.  this will also require that we change the AliasStruct to have a BongoArray of strings instead of just one string.  that shouldn’t be too bad now that i understand how to read in arrays :)

the major bug is that i can’t currently do an alias like pfelt@domain.com => someotheruser  (wanting the end email to be someotheruser@domain.com).  if that is tried currently it segfaults.  i’m thinking of changing the recursive function to return an int instead of a bool which should allow me to determine what portions have been sent to the client.  with the bool that i’ve got, i’ve either sent the new email to the client or not.
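a small python sketch of the behaviour i’m after, assuming the alias table is just a dict of address to list of targets (the real AliasStruct and the queue protocol are different; every name here is hypothetical). a target without a domain inherits the domain of the address that pointed at it, and loops are caught instead of crashing.

```python
def resolve_alias(address, aliases, path=()):
    """return the list of final addresses for `address`.

    aliases maps an address to a list of targets, so groups work too.
    path holds the addresses already visited on this branch.
    """
    if address in path:
        raise ValueError(f"alias loop detected at {address}")
    path = path + (address,)

    targets = aliases.get(address)
    if targets is None:
        return [address]  # not an alias, so this is a final address

    domain = address.partition("@")[2]
    resolved = []
    for target in targets:
        if "@" not in target:
            # bare user name: reuse the domain of the aliased address
            target = f"{target}@{domain}"
        resolved.extend(resolve_alias(target, aliases, path))
    return resolved


if __name__ == "__main__":
    table = {"pfelt@domain.com": ["someotheruser"]}
    print(resolve_alias("pfelt@domain.com", table))
```

returning the whole list sidesteps the bool-versus-int question in this sketch, since nothing is sent to the client until resolution has finished. whether that fits the real agent is a separate question.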

alex has done some amazing stuff with connio and the dns code.  i’m in awe at what he’s done actually.  connio was a mess with the macros that prevented real debugging of the code.   because of this, i’m going to be trying to set up a funky config to let me test bongo inbound and outbound on select users of my domain.  it should be interesting.  i’ll post what i find.

antivirus v2

May 26th, 2007

i’ve finally gotten the antivirus stuff to go. there were some things that kinda messed me up for a little bit, but we got those straightened out. the worst of all of them was the eicar test string. it really doesn’t match unless it is in a file! i spent hours trying to figure out why my test messages all got scanned negative for viruses when they had the eicar string in them. it is because i wasn’t mime encoding the eicar string, i was only pasting it directly into the email. i guess i need to sleep more.
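for anyone else who trips over this, here is a sketch of building a test message with the payload as a real mime attachment, using python’s standard email package. the addresses and filename are placeholders, and i’m deliberately not pasting the eicar string here; pass it in yourself as bytes.

```python
from email.message import EmailMessage


def build_test_message(payload, filename="eicar.com"):
    """build a multipart message carrying payload as a file attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "antivirus test"
    msg.set_content("test payload attached")
    # bytes content is attached as its own mime part (base64 encoded)
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


if __name__ == "__main__":
    print(build_test_message(b"PUT-THE-TEST-STRING-HERE").as_string())
```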

anyhow, we had a long chat on irc about what should happen with the antivirus scanning.  what should the agent do and what shouldn’t it do.  the commit i just made (r84) works against clamd.  i’ve tested it localhost only not a remote clam, but i think it will work ok ( i just don’t have a remote clam to play with ).  this version doesn’t have all the cool connection pooling stuff that i want to add, nor am i sure that it will work against any other scanners.   the current store configuration document enables clamd by default and enables notifying the recipient that a virus was sent (this code will probably disappear).  i debated doing that stuff before committing this, but decided against it for several reasons.

  1. i was 3 days late getting it done for m2
  2. we talked a little about adding something like milter support to smtp so that we can do antispam and antivirus at smtp time instead of letting the message hit the queue.  even if we don’t do that i’d like to allow for smtp time scanning perhaps via a library or something ( i’m just rambling here ).
  3. it is 2:30 am and we are having friends over tomorrow afternoon for some killer barbecued pork……  mmmm…..

as for the missing semaphore that locked up on me, that is as designed.  as i said before, the code for antispam is quite different than the code for antivirus.  the antivirus stuff blocks on a semaphore until there is something to do, presumably for scanners that require scanning of each mime part separately (currently coded as everything except clam).
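the pattern looks roughly like this in python (the agent itself is c; the class and method names here are invented). the worker sleeps on the semaphore and only wakes when someone hands it a part and signals. if nothing ever signals, you get exactly the hang i saw.

```python
import threading
from collections import deque


class ScanWorker:
    """worker thread that blocks on a semaphore until there is work."""

    def __init__(self):
        self._pending = deque()
        self._ready = threading.Semaphore(0)  # starts at zero: nothing to do
        self.scanned = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, part):
        self._pending.append(part)
        self._ready.release()  # without this call the worker never wakes up

    def close(self):
        self.submit(None)      # None is the shutdown marker in this sketch
        self._thread.join()

    def _run(self):
        while True:
            self._ready.acquire()          # blocks until submit() signals
            part = self._pending.popleft()
            if part is None:
                return
            self.scanned.append(part)      # stand-in for the actual scan


if __name__ == "__main__":
    worker = ScanWorker()
    worker.submit("part one")
    worker.submit("part two")
    worker.close()
    print(worker.scanned)
```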

oh, and one other note:  when coding/debugging antivirus code, it is IMPORTANT to have clamd running and listening on a tcp port….
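for reference, talking to clamd over tcp is not much code. this python sketch uses clamd’s INSTREAM command (a 4 byte big-endian length in front of each chunk, then a zero length chunk to finish). it is not the agent’s code; the host and port defaults are assumptions (3310 is the usual clamd tcp port) and the function names are mine.

```python
import socket
import struct


def instream_frames(data, chunk_size=4096):
    """yield the byte frames for one clamd INSTREAM request."""
    yield b"zINSTREAM\0"                       # null terminated command
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield struct.pack("!I", len(chunk)) + chunk
    yield struct.pack("!I", 0)                 # zero length chunk ends the stream


def parse_reply(reply):
    """return the signature name if clamd reported one, else None."""
    text = reply.rstrip(b"\0\r\n").decode("utf-8", "replace")
    if text.endswith("ERROR"):
        raise RuntimeError(text)
    if text.endswith(" FOUND"):
        return text[: -len(" FOUND")].partition(": ")[2]
    return None


def scan(data, host="127.0.0.1", port=3310, timeout=10.0):
    """send data to a running clamd and return the parsed verdict."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for frame in instream_frames(data):
            conn.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            reply += chunk
    return parse_reply(reply)
```

scan() fails with a connection error right away if clamd is not listening, which beats staring at a hung agent.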

antivirus

May 21st, 2007

i finally had some time to sit and play with the antivirus agent.  i got the config in place and it reads the config out great.  there are 6 configuration parameters

  • enabled (boolean)
  • flags (integer)  — defined in avirus.h:AVirusFlags
  • patterns (string) — i’m not sure what this does atm
  • queue (integer) — defaults to queue 0
  • host (string) — host ip address or hostname
  • port (integer) — port to connect to for clamd

the unfortunate thing with this agent is that it is different from the antispam agent in terms of codepath.  as i started debugging it, i hit an odd do{} that had a semaphore in it that i never seemed to signal.  because of that, i hang.  obviously there is a bug in there somewhere.

things that need to happen:  convert this to use the new style of queue reading (like the new smtp stuff), convert it to use connection pooling like antispam does.  convert it to use something like ParseHost from antispam.  figure out officialName.  antivirus when detecting a virus removes the mail from the queue and creates a new one in its place from postmaster.

so in essence there is a little too much for me to be able to get this thing straightened out by wednesday unfortunately.  i really, really, really wanted to get it done, and had it been more like antispam it would be, but there are portions of the code that i haven’t yet looked over (and as such don’t fully understand).   i don’t think that this is really in a state to commit until i at least fix the semaphore problem.  i’d stay up longer to do it tonight (“it has personally insulted me” <an old co-worker>), but i have an eye appt in the early am (*shudder*) and i need to get some sleep :(

antispam v3 and smtp

May 6th, 2007

today was a fun day.  when i finally got to start coding i decided to finally commit the smtp patch that i worked on for so long.  it helps prevent a buffer overflow (or it should at least).  it was a long time coming and seems to work ok for me in my limited testing environment.  i hope i found all the little gotchas, but how often does that ever happen ? ;)    there is still a little work left on the smtp side of things.  i should go over a couple more functions to see if they are ok with how they read in the envelope from the queue agent.  i’d like to switch them all over to using the full connio and storing the envelope in memory rather than reading it off the wire as it goes.  this should prevent other possible memory overruns at the expense of a little bit of memory (the envelopes however are not generally that large).

on the antispam front there was a bit of additional work that i completed last night and today.  the commit log for the antispam stuff was a little long and i fear that i messed up the cia-2 bot which is kind enough to post the full commit log into the irc chat.  amongst the changes were the notable dropping of the dropthreshold and headerthreshold parameters.  we decided that the antispam agent itself should not decide on dropping the mail but should allow the user to specify what to do with the spam via the rules agent (which could probably use some looking over soon).  the configuration for the antispam agent now is quite small.  there are only three things that i don’t currently handle properly wrt the configuration, but they can come later.  the first is that the spam host configuration parameter (host) in the old code could have returned more than one item.  the current implementation only expects one line (a string).  this should be changed to allow for the use of more than one spamd agent if desired.  the other two are, incidentally, identical in implementation of the configuration.  they are the allow and deny lists for the agent itself.  the agent could be configured to assume a connecting ip address is a spammer and automatically drop the mail.  i’m not sure if we want to keep this in the system yet.  the code is still there to check the lists, however they will always be empty, so it only provides a minor slowdown atm.

hopefully i can get to taking a look at antivirus now that antispam mostly works (at least for me). that way we can get out m2 quicker.

antispam v2

May 5th, 2007

ok, i know. A POST FROM FAT??!!! and now here is another on the same day. i just couldn’t stop hacking and got most of antispam running. basic functionality exists and seems to work against spamassassin. there are some configuration settings that i don’t set yet but here are the ones that are available atm:

  • enabled : boolean
    • are antispam services desired (this does not cause the agent to start automatically, that setting is in the manager, this just allows one to turn on or off the service)
  • timeout : integer
    • connection timeout to the spamd daemon
    • defaults to 20 ms
  • header_threshold : double
    • spam threshold to add header fields denoting spam checking and results
    • defaults to -9999
  • drop_threshold : double
    • threshold to drop the message from the queue entirely
    • defaults to 9999
  • quarantine_queue : integer
    • (i’m not sure what this setting does atm, but it was there)
    • defaults to queue 0
  • feedback_enabled
    • (i’m not sure yet what feedback is)
    • defaults to false
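to show how those defaults would play out, here is a python sketch that loads a json config over the defaults listed above and picks an action for a given score. the default for enabled and the direction of the threshold comparisons are my assumptions, not something read out of the agent, and the names are made up.

```python
import json
from dataclasses import dataclass


@dataclass
class AntispamConfig:
    enabled: bool = False            # assumption: the list gives no default
    timeout: int = 20
    header_threshold: float = -9999.0
    drop_threshold: float = 9999.0
    quarantine_queue: int = 0
    feedback_enabled: bool = False


def load_config(text):
    """overlay the keys found in a json document on top of the defaults."""
    return AntispamConfig(**json.loads(text))


def action_for(score, config):
    """assumed semantics: at or above a threshold triggers that action."""
    if score >= config.drop_threshold:
        return "drop"
    if score >= config.header_threshold:
        return "add-headers"
    return "deliver"


if __name__ == "__main__":
    cfg = load_config('{"enabled": true, "drop_threshold": 10}')
    print(cfg, action_for(12.5, cfg))
```

with the stock values every message gets headers (any real score is above -9999) and nothing is ever dropped (no real score reaches 9999).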

there is code to do the hostname stuff, but as i look at it it is probably incorrect. in either case, it does connect to the spamd on localhost’s default spamd port and scans the mail. as this is night number two of 3:00am bedtimes, i’m gonna head out for now, but i’m coming back to this later today :)
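the conversation with spamd is a small text protocol: a CHECK request with a Content-length header, and a reply carrying a Spam: line with the verdict, score and threshold. a python sketch of both halves follows (the agent does this in c; the function names are mine, and the protocol version string is just one that spamd accepts).

```python
import re


def build_check_request(message):
    """frame a raw message (bytes) as a spamd CHECK request."""
    header = f"CHECK SPAMC/1.2\r\nContent-length: {len(message)}\r\n\r\n"
    return header.encode("ascii") + message


SPAM_LINE = re.compile(
    rb"^Spam:\s*(True|False|Yes|No)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)",
    re.IGNORECASE | re.MULTILINE,
)


def parse_check_response(response):
    """return (is_spam, score, threshold) from a spamd CHECK reply."""
    status = response.split(b"\r\n", 1)[0]
    if not status.startswith(b"SPAMD/") or b"EX_OK" not in status:
        raise ValueError(f"unexpected spamd status line: {status!r}")
    match = SPAM_LINE.search(response)
    if match is None:
        raise ValueError("no Spam: line in spamd reply")
    is_spam = match.group(1).lower() in (b"true", b"yes")
    return is_spam, float(match.group(2).decode()), float(match.group(3).decode())


if __name__ == "__main__":
    print(build_check_request(b"Subject: hi\r\n\r\nhello\r\n"))
    print(parse_check_response(b"SPAMD/1.1 0 EX_OK\r\nSpam: True ; 15.5 / 5.0\r\n\r\n"))
```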

smtp and antispam

May 4th, 2007

it’s been a long while since i last blogged on what was going on in the bongo world. i’ve been pretty busy of late at work with a big project, and as such bongo time has been a little limited. nevertheless, i’ve been doing some good work in several areas of late.

the first is smtp. rprice from novell (now messaging architects) and i were chatting one day and he alerted me to a possible bug in smtp that would need some attention. i’ve spent a lot of time in re-working portions of smtp to hopefully fix the problem and will release the patch for it shortly. i’m waiting for ma to release their patch purely from a courtesy point of view (and because i like rprice so much :)

the issue actually should affect any queue agent, but currently smtp is the only queue agent that we have that functions (more on that in a minute). it turned out to be not as bad as i thought because i was having issues understanding one particular function that is quite complex in what it is trying to do and what buffers it is trying to use. anyhow, the new stuff should do a better job albeit using slightly more memory and possibly just a hair slower. i guess it remains to be seen though….

antispam. it has been on the plate for some time. i stayed up wayyyyy too late last night playing with it. i learned json and jpath (neither of which i had really used before) and i started playing with the store and alex’s json config parsing stuff. after a while, with alex’s help, i finally got stuff going and am about half done with the config stuff. i’m gonna have to re-work it slightly. i was trying to consolidate two config functions into one, but i think i’ll just leave it in two. so basically the re-work of the current code is just a copy and paste over to a different function. so far it is going really well, and i hope to have antispam running before the weekend is done.

on another front, i saw the mockup for the hawkeye (which i hadn’t seen at all up to this point). i think it is seriously amazing! we have one amazing product. this thing is gonna be so amazing. i can’t wait to see what comes out of it.

ubuntu, wordpress, and bongo

March 8th, 2007

i finally got ubuntu installed on my laptop. there are some really cool benefits to having linux on my laptop. i can do cool pipe stuff that was hard to do on my windows installation that makes some tasks i do much easier. i have however not found a decent development tool for us to use at work. we do a ton of web development in php and we have some pretty specific needs as to how an environment would work. the closest thing i’ve found is something like vim with a crazy .vimrc. not sure i want to try to go that route though as it would be kinda nasty and my coworkers would find it difficult to use.

the nice side effect of this is to allow me to more easily develop on bongo as now i have linux with me and don’t have to shell somewhere to work. this allowed me to commit my buggy patch removing the rest of openssl from the system. the bug is an odd one that i can seem to reproduce easily, but alex could only get the other day. i hoped to get that worked on tonight, but i got sidetracked trying to fix my wordpress blog. which leads me to my next issue.

the wordpress community stinks. they are so rude out on #wordpress@freenode. after a bunch of heated exchanges they cursed at me telling me that wp-cron.php wasn’t a used part of the system. i asked why it was installed by default and they told me it wasn’t, it isn’t used. i did a bunch of debugging and found that it was getting called, but i didn’t know what it was for. they kept yelling that it wasn’t used. i finally gave in and told them i’d just comment it out. great. the stuff loads fine and fast now (load times before were near 12 seconds). i log in to start blogging this and guess what. i know what wp-cron does. it is what does the “every so many seconds save a draft of the current blog”. as i’m typing, the system is throwing errors on the bottom of the screen stating that it can’t find a function that is defined in the file that i removed. odd. i don’t know what the issue is with that stuff and why it doesn’t seem to want to run at all fast, but all seems to be working fine without it. as for the community, i was shocked and appalled at how rude they were out there. it is enough to make me want to switch off wordpress altogether just because of the people.

one thing that impresses me about bongo is the community. it might change in the future, but right now we have a good core group of people that are almost always polite to anyone asking questions. that, to me, is a very important thing. i’m glad i’m a part of it.

smtpd is done!!!

February 19th, 2007

after a lot of hours on smtp it is finally done. what a job. the patch came out to be 5160 lines long. the original smtpd.c file is 8944 lines long. i did a couple of simple greps and there were 2427 removed lines and 784 added lines. not too shabby. some of the patch is just whitespace or code formatting things, but a ton of it is the connio stuff. i’ve also rolled in a couple of patches that i submitted to hula-dev on the smtpd and various rfc compliance issues (one of which i critically needed here at my house for spam — the received line stuff). i ran through a bunch of test cases on that one and they all seemed to work great. let me list them out:
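the added/removed counts are easy to reproduce. here is a small python equivalent of those greps for a unified diff (the function name is mine). it skips the ---/+++ file header lines, so a removed line whose own text starts with two dashes would be miscounted; good enough for a quick tally.

```python
def count_patch_lines(patch_text):
    """return (added, removed) line counts for a unified diff."""
    added = removed = 0
    for line in patch_text.splitlines():
        if line.startswith(("+++", "---")):
            continue              # file headers, not content changes
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed


if __name__ == "__main__":
    sample = "\n".join([
        "--- a/smtpd.c",
        "+++ b/smtpd.c",
        "@@ -1,4 +1,3 @@",
        " int main(void) {",
        "-    old_read();",
        "-    old_write();",
        "+    new_call();",
        " }",
    ])
    print(count_patch_lines(sample))
```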

  • helo mail to a remote addr -- denied relay
  • helo mail to local addr -- worked
  • ehlo/auth mail to remote addr -- worked
  • ehlo/auth mail to remote addr -- worked
  • ehlo/tls mail to local addr -- worked
  • ehlo/ssl/auth mail to remote addr -- worked
  • ehlo/tls/auth mail to remote addr -- worked
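the cases above boil down to one rule, sketched here in python as i understand it (this is my summary of the test results, not code lifted from smtpd): mail for a local address is always accepted, and mail for a remote address is only relayed for an authenticated session. tls or ssl on its own does not change the answer.

```python
def may_accept(recipient_is_local, authenticated):
    """relay rule implied by the test matrix (an assumption, not smtpd code)."""
    return recipient_is_local or authenticated


if __name__ == "__main__":
    # (description, recipient_is_local, authenticated)
    cases = [
        ("helo, remote addr", False, False),
        ("helo, local addr", True, False),
        ("ehlo/auth, remote addr", False, True),
        ("ehlo/tls, local addr", True, False),
    ]
    for name, is_local, authed in cases:
        verdict = "accepted" if may_accept(is_local, authed) else "denied relay"
        print(f"{name}: {verdict}")
```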

overall i’m pretty happy with how it turned out and really how long it took. i spent long hours (mostly late friday night) for 3 weekends (i think — the first one is a blur) and got it all going. mostly the coding went from about 10pm to 4am those nights with some work saturday and late night sunday as i could. this has turned out to be a very long and tedious process.

one major snag i hit just last night was remote delivery via tls. i had originally thought that it worked, but when i did a packet sniff to verify i found that the ConnNegotiate was just returning as if it had worked and the client went on its merry way. after a bit of digging and playing, i found that i had to make minor modifications to connio as the command sequence is different for gnutls servers and clients. i really should have caught that when i did the connio gnutls patches, but at the time i didn’t have any tls client apps to play with. this fix had the added benefit of fixing the mailproxy agent so that it should work (though i’ve not tested it). as i write this it seems to me now that ConnNegotiate() and ConnEncrypt are basically the same function now. perhaps we should merge them (i should look at this someday).

i’m not sure what i’ll hack on next, perhaps the nmap bug that i just submitted where there is still legacy ip code in nmap.c. i didn’t look into it when i found it, but my gut feel is that that code is for nmap-to-nmap transfers of mail. we don’t really support this atm though, so i’m not sure if i’ll go there.

maybe i’ll just sit back, relax, and wait for the smtp bug reports to come in :)   or maybe i’ll try to figure out why wordpress is going so amazingly slow on my box.  i’m running plain wordpress 2.1 with no new plugins or anything. i’ve tried the cache stuff and it didn’t seem to help.  it can’t be mysql as the rest of my site works great.  i dunno what it is.  if any of you wordpress gurus out there know what i’ve got misconfigured, let me know asap :)

the most i’ve written since my undergrad research paper

February 12th, 2007

time for the weekly blog :) i’ve spent another weekend heavily devoted to smtpd. i’ve chatted with a person who tried to do some hacking on the smtp server and gave up, and he told me of another who gave up and decided it’d be easier to rewrite it than to go where i’m going. do i agree? partly. for those of you who have not read the irc logs or were not around, at one point i commented on “looking at the same block of code for about an hour and a half, and finally come to the conclusion it is a complex piece of code”. it might be conceptually easier to think a rewrite would be better, but it would take a lot longer to try to get all the current functionality into the daemon. so what is the current status ?

i’ve gotten smtpd to accept mail for local delivery now for non-ssl connections. the ssl stuff won’t work quite yet as there is a small block of code i haven’t gotten to yet where the STARTTLS command is implemented. the code that goes in that block is already scattered through the system though so it should be fairly easy to get going.

remote delivery is close to being done, but there are two functions that i have yet to rework. one gets the answer from the remote system, GetClientAnswer(), and the other is one that i haven’t quite figured out yet, SendServerEscaped(). i believe this one is supposed to escape data that could mess up the conversation (things like a period) in a certain spot. why this could happen i haven’t found yet, but the function is called in multiple places in code.
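my best guess is that this is the dot-stuffing smtp requires during DATA: a line holding only a period ends the message, so any body line that starts with a period gets an extra one in front when sent. here is a python sketch of that rule. whether SendServerEscaped() does exactly this is an assumption on my part, and the function names are mine.

```python
def dot_stuff(body):
    """double the leading period on every line of a crlf separated body."""
    lines = body.split(b"\r\n")
    return b"\r\n".join(
        b"." + line if line.startswith(b".") else line for line in lines
    )


def frame_data(body):
    """return what goes on the wire after the DATA command is accepted."""
    stuffed = dot_stuff(body)
    if not stuffed.endswith(b"\r\n"):
        stuffed += b"\r\n"
    return stuffed + b".\r\n"      # the lone period terminates the message


if __name__ == "__main__":
    print(frame_data(b"hello\r\n.\r\nbye"))
```

the receiving side strips the extra period again, so the message arrives unchanged.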

i’ve also been going to town on the variables. a bunch out of the ConnectionStruct struct are no longer needed so i’ve removed them. there were also a bunch of globals that i got rid of too. one in particular baffles me: MaxFloodCount. it gets set in several places, but is never used. i’m not sure what its intention was.

another area that needs a little bit of attention is the connection to the queue agent for queue processing. smtp is different from imap, pop, or dragonfly in that it registers with the queue on a specific queue number. when a message hits that queue number (6), the queue agent will call smtp to let it do processing. thus we have a thread system that listens for incoming connections from the queue server and processes those requests. the code is written but not tested yet (this is obviously tied closely with the outgoing mail tests).
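stripped of the sockets and threads, the registration idea is just a table from queue number to callbacks. a toy python version follows; the real thing happens over a network connection between the queue agent and smtp, and these names are invented.

```python
class QueueDispatcher:
    """toy stand-in for the queue agent's registration table."""

    def __init__(self):
        self._agents = {}

    def register(self, queue_number, handler):
        """an agent asks to be called for messages on queue_number."""
        self._agents.setdefault(queue_number, []).append(handler)

    def deliver(self, queue_number, message):
        """call every handler registered on that queue, in order."""
        return [handler(message) for handler in self._agents.get(queue_number, [])]


if __name__ == "__main__":
    queue = QueueDispatcher()
    queue.register(6, lambda msg: f"smtp processed {msg}")
    print(queue.deliver(6, "message-1"))
    print(queue.deliver(0, "message-2"))   # nobody registered here
```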

there is also still a large number of things that i’d like to fix. there are blocks of code that are commented out via #if 0 statements, and abuses of variables clearly named for one purpose for something completely different. eventually (not for this release) it would be good to modularize the code more like imap making the code much easier to read. right now all the command processing is one big switch() statement. this could make for fast code (depending on your opinion of compiler optimizations — i’d like to see a study on speed comparisons), but it makes for code that is much harder to maintain and debug. another thing that kinda bugs is to have a variable on one line getting set with the result from another line. this makes gdb interesting as all you see is VariableName = (since my gdb doesn’t like to print multiline). other things like the non-standard (for bongo at least) matching of {} and other such things that don’t match.
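to illustrate what i mean by modularizing, here is a python sketch of a command table in place of one giant switch(). each command gets its own small function and the dispatcher just looks it up. the handlers and reply strings are simplified placeholders, not what smtpd sends.

```python
def handle_helo(session, arg):
    session["peer"] = arg
    return f"250 hello {arg}"


def handle_noop(session, arg):
    return "250 ok"


def handle_quit(session, arg):
    session["closed"] = True
    return "221 bye"


# one entry per command; adding a command means adding a function and a row
COMMANDS = {
    "HELO": handle_helo,
    "NOOP": handle_noop,
    "QUIT": handle_quit,
}


def dispatch(session, line):
    """look up the verb of a command line and run its handler."""
    verb, _, arg = line.strip().partition(" ")
    handler = COMMANDS.get(verb.upper())
    if handler is None:
        return "500 command not recognized"
    return handler(session, arg)


if __name__ == "__main__":
    session = {}
    for line in ("helo example.com", "BOGUS", "quit"):
        print(dispatch(session, line))
```

each handler can be read and stepped through in gdb on its own, which is the maintainability win. in c the same shape would be an array of name and function pointer pairs.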

on other fronts, alex and i have been doing lots of talking on lots of things. one thing he mentioned was address rewrites vs aliasing and how we are going to handle it. it would be nice to just have that code in smtp instead of a separate queue agent the way that it was before. this however would not work, as dragonfly would then have to be modified to use smtp instead of talking direct to the store. this would not be a cool thing for us as that would slow dragonfly down. it’ll probably end up in a library that is linked or used in some non-specific way like alex suggested of using the auth system in the store (which i think is an excellent idea. the only thing that worries me is non-global domain aliasing).

we chatted a little bit about build versioning just today. if you look at the agent code near the top is a define PRODUCT_VERSION which i believe used to be an expanded string in code (when i was on the team we used pvcs but i’m sure that’s long since been done away with). we were discussing a way to auto generate that constant. this needs a little more discussion.

we had a small discussion on dmc as well. dmc was a management tool to allow you to pass tuning parameters of almost any sort, and to get statistics from various agents. it is a great idea, but it is pretty complex code how it is currently implemented. the rules agent (which i’m not sure if it was ever completed) used a new system that is really pretty. we discussed the possibility of using this type of system as well. part of me thinks this would be really cool as then the snmp stuff i’d like to add eventually won’t have to be in every agent, but only in one place that speaks dmc-speak.

i think the other major discussion we had was the one concerning bongo-manager that alex mentioned where it provides configuration information. i think in the long run this is a good idea as bongo-manager needs communication with the agents anyhow. it was pixelpapst on irc that proposed that idea after alex and i were chatting about mdb and configuration data.

one last thing i’d like to mention is an article i found on /. that i posted to the irc channel. it should be required reading in my opinion (InformationWeek). alex told me i should have blogged it and he’s right. i agree with a lot of what this guy says, some of it i don’t. i think that we have some of the things that he thinks make an open source project succeed. it’s a good read and if you want to discuss it, hop into irc :)

after having written this monstrosity, i’m seriously thinking about writing more often. this is just too much work for one sitting!

Finally! A blog!!

February 6th, 2007

i’ve finally gotten a blog going. this thing will mainly be about bongo stuff (since my life is pretty boring). as was mentioned by alex, i’m currently working on a patch to smtp to make it use connio. this has turned out to be quite a monster as there is a ton of code in smtpd that needs to be stripped and re-worked. this does have the good side effect of making smtp ssl work again since we stripped out linking with openssl due to license restrictions. as of right now the current patch is 1969 lines long and i’m still going.

as for other stuff, alex and i have been having some long chats on mdb. the current thought is that we’ll add a new layer between mdb and the consumer that will for right now just map onto mdb. once all agents are *ported* to using it, we’ll strip out mdb and re-map the back end to be a direct access module. this will take some tough thinking since we want it to work right for the c agents and the python web uis (both hawkeye and dragonfly).
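the layering idea, very roughly, in python: agents code against one small interface, the first backend simply forwards to mdb, and later a direct backend gets swapped in without touching the callers. every class and method name below is hypothetical; nothing like this exists in the tree yet.

```python
from abc import ABC, abstractmethod


class DirectoryBackend(ABC):
    """the new layer that the agents and web uis would talk to."""

    @abstractmethod
    def get_attribute(self, obj, attribute):
        ...


class MdbBackend(DirectoryBackend):
    """phase one: forward every call to the existing mdb lookup."""

    def __init__(self, legacy_lookup):
        self._lookup = legacy_lookup   # stand-in for the real mdb call

    def get_attribute(self, obj, attribute):
        return self._lookup(obj, attribute)


class DirectBackend(DirectoryBackend):
    """phase two: answer from our own storage, mdb no longer involved."""

    def __init__(self, data):
        self._data = data

    def get_attribute(self, obj, attribute):
        return self._data.get(obj, {}).get(attribute)


def smtp_hostname(backend):
    """example consumer: it never learns which backend is in use."""
    return backend.get_attribute("smtp", "hostname")


if __name__ == "__main__":
    settings = {"smtp": {"hostname": "mail.example.com"}}
    old = MdbBackend(lambda obj, attr: settings[obj][attr])
    new = DirectBackend(settings)
    print(smtp_hostname(old), smtp_hostname(new))
```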

another task on the upcoming list is one that alex mentioned as well, the anti* stuff. currently the antispam stuff doesn’t work because the needed schema attributes did not get pulled over by the novell guys before they got re-tasked off the project. i had submitted a patch for it which worked great as long as you already had a configured system. for anyone who had a new system however it messed things up really bad. i ended up opening a huge can of worms that ties nicely into the mdb discussion.

i hope to be able to go to lunch tomorrow with a couple of old friends, micah and rodney. i’m sure they will both shudder at the stuff we are doing right now, but perhaps it won’t come up.

i’ll try to keep this thing up to date with my current bongo-ings.