Monday, November 27, 2017

Hit Count and Last Hit


After having left MAPS pretty much alone for a long time, though I did implement "nuke" which helps nuke whole domains, I have finally implemented new functionality for MAPS. You see, over this last year or so I've been getting a lot of spam in which the domain name varied wildly. So lots of mail loops were happening. Whacking whole domains became a favorite pastime of mine, hence the "nuke" script (not an official part of MAPS, mind you). Still, it seemed like a losing battle.

Then I got to thinking - something I've been thinking for a while - how effective are all of my null list entries, now numbering in the 17K area! I thought that I could easily add some fields to the list table, namely a hit_count and a last_hit field, and then during scrubbing time see if these null list entries were at all effective, and if not then to automatically remove them.

So I added the fields and changed MAPS such that it records the hits and when the last hit happened. Then mapsscrub checks whether last_hit is older than the number of days the user is keeping history. If it is, and the hit count is 0, it simply removes that entry. Note this happens only for the null list. White list and black list are to be managed by the user. You purposely add somebody to the white or black list for a reason. Do we want to auto age these off? Probably not.

If the hit count is not 0 then how do we age off these apparently no longer effective null list entries? Well, the thought was to age them by subtracting 1 from the hit count. Thus if we had a null list entry that was fairly active for a little while and say accumulated a hit_count of 10 but then was largely un-hit, the hit_count would go 9, 8, 7, 6, 5... down to 0 and then the entry would be removed. I might want to change this as some null list entries are getting hit_counts in the thousands now! Assuming the user has a 30 day history then perhaps after 30 days of inactivity I should age any null list entry off the list in 7 days? Note that any activity in those 30 days would immediately reset the clock for at least the next 30 days (assuming a history of 30 days).
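Here is a minimal sketch of that aging rule in Python (MAPS itself is not written in Python, so this only illustrates the logic). The `NullEntry` class, the `scrub_null_list` function and the rule that an entry which has never been hit counts as stale are all assumptions made for the example, not MAPS code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class NullEntry:
    """Hypothetical stand-in for one row of the null list."""
    pattern: str
    hit_count: int = 0
    last_hit: Optional[date] = None


def scrub_null_list(entries: List[NullEntry], today: date,
                    history_days: int = 30) -> List[NullEntry]:
    """Return the entries that survive one scrub run."""
    cutoff = today - timedelta(days=history_days)
    survivors = []
    for entry in entries:
        # A hit inside the history window keeps the entry untouched.
        if entry.last_hit is not None and entry.last_hit >= cutoff:
            survivors.append(entry)
            continue
        # Stale entry with nothing left to age: remove it.
        if entry.hit_count <= 0:
            continue
        # Stale entry with hits remaining: age it by one.
        entry.hit_count -= 1
        survivors.append(entry)
    return survivors
```

With this rule an entry with a hit_count in the thousands would take thousands of scrub runs to disappear, which is the concern raised above.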

With this in place I decided to change the mail loop procedure to auto add the sender to the null list. Previously I was whacking off the whole domain - here we are just auto adding the individual sending address. But it's automatically added, automatically tracked for effectiveness and automatically aged off. Besides, I still have the option of perusing the top returning domains and can still whack whole domains. Now any hits on any users in the domain will continue to ensure that the domain's null list entry will remain and not automatically age away...

No more @defaria.com




Due to an influx of spam getting through MAPS with a format of <something>@defaria.com I finally decided to fix MAPS to handle that better. Now if the from is from defaria.com (or the current domain) then the user name is looked up in /etc/passwd using getpwent and if there is no user of that name it is simply nulllisted. This is an obvious attempt by spammers to forge from email addresses.
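A rough sketch of that check in Python, for illustration only. It uses Python's `pwd.getpwnam` (a lookup by name) rather than walking the password file with getpwent, and the function names and the injectable `user_exists` parameter are assumptions made so the example can be tried without touching the real /etc/passwd.

```python
import pwd
from typing import Callable, Optional


def _passwd_user_exists(name: str) -> bool:
    """True if the name is in the system password database (Unix only)."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def is_forged_local_sender(address: str, local_domain: str,
                           user_exists: Optional[Callable[[str], bool]] = None) -> bool:
    """True if the address claims the local domain but names no local user."""
    if user_exists is None:
        user_exists = _passwd_user_exists
    if "@" not in address:
        return False  # nothing to check; leave it to other rules
    local_part, _, domain = address.rpartition("@")
    if domain.lower() != local_domain.lower():
        return False  # not claiming to be from the local domain
    return not user_exists(local_part)
```

A caller would null list the sender whenever this returns True, for example `is_forged_local_sender(sender, "defaria.com")`.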

Also removed logging of when the message is obviously garbled or not conforming to standards. Plus I no longer log if the user uses andrew@defaria.com but fails to have a name such as "Andrew DeFaria <andrew@defaria.com>". The former turns out to be mostly Nigerian spam anyway.

Finally, replaced /usr/local/maps with a direct checkout from CVS!

MAPS Scrubber




Updated mapsscrub to also report total number of user emails in the database. Also fixed header.

Added page drop down




Added a page drop down to list.php. This is useful when one wants to say go to the last page.

Move list entry bug




Found a bug today. When adding an entry to the black list that is already on the white list, MAPS correctly adds the entry to the black list and deletes the entry from the white list. However, it fails to resequence the white list, causing problems down the line.

Login bug


Squashed a nasty bug, well actually a lack of implementation. After converting some of the web pages to PHP I neglected to implement Login in PHP! Now implemented.

Space




Added new "Space Usage" report. Not sure if I like this because 1) it's too slow and 2) the resulting report is not very impressive - it's just a line saying "User is using X bytes in the database". Might wish to implement a space counter in the user table. This would require that all procedures updating the database update the space counter. The intent for this space counter is to judge how much space a user is using in the database for the purposes of quotas or perhaps to charge people if they want more space.

Also, currently this space report is only counting space used in the email table. Might wish to add in the log table, which can add up, and the user and useropts tables. The latter two are minimal but including them would make the report more accurate.

Delete bug


Many, many improvements have happened to MAPS. I have not been keeping this blog up to date. I will try to do better. Here are some of the things that have been implemented or improved:
  1. Went live 12/10/2003! We are now using MAPS from the Linux system and using the MySQL backend.
  2. Many improvements and implemented a lot of the pages
  3. Started re-writing things in PHP. PHP is pretty cool and now that I have mod PHP running it's also quick. I suspect that if I managed to get mod Perl installed that that would solve some of the sluggishness of the Perl pages, however PHP is more "web friendly" and it's nice not having to write a whole web page just because part of it will be dynamic
  4. Implemented most of the "list" processing. You can now add/change and delete entries in the various lists.
  5. Details.cgi needs some work but otherwise is functional. You can even select entries and say Add to Null list, etc.
  6. Implemented Returned Messages by Domain report (in PHP)
  7. Updated the look of the register and blacklist return messages
  8. Implemented the Forgot Your Password link
  9. Making pages have refresh option so that Quickstats updates itself. Added time to Quickstats
  10. Implemented search facility to search sender and subject in returned email. While this is helpful it's not finished yet. Also takes a long time on a large message base...
  11. Changed MAPS to only send return message once instead of up to 5 times
  12. Changed MAPS to use envelope address if From address is blank
That said here's a bug:

When an entry is added to a list the system now searches the other lists and if a matching entry appears in another list it is deleted. This effectively "moves" the entry. Problem is that the delete does not resequence the list that it deleted the entry from.
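To make the missing step concrete, here is a small Python sketch of a "move" that resequences the list it deletes from. The in-memory dictionaries stand in for the list table, and `resequence` and `move_entry` are hypothetical names, not MAPS functions.

```python
def resequence(entries):
    """Renumber entries 1..n in their current order."""
    for position, entry in enumerate(entries, start=1):
        entry["sequence"] = position


def move_entry(lists, pattern, target):
    """Add pattern to lists[target], removing it from every other list."""
    for name, entries in lists.items():
        if name == target:
            continue
        before = len(entries)
        entries[:] = [e for e in entries if e["pattern"] != pattern]
        if len(entries) != before:
            # The step the bug skips: close the gap left by the delete.
            resequence(entries)
    if not any(e["pattern"] == pattern for e in lists[target]):
        lists[target].append(
            {"pattern": pattern, "sequence": len(lists[target]) + 1})
```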

We should also consider decoupling the sequence # concept. I don't think it is necessary anymore. MAPS applies filtering based on lists - order doesn't seem to be important.

We might also consider not basing the stats and detail pages off of the log entries but directly off of the email in the email table.

Might wish to implement a "super white list". Super white list would be basically the user's address book. The "other" white list would be entries added by registration. User should be able to promote from white list -> super white list. The difference is that the super white list would be applied before other lists. Currently MAPS processes lists in the following order:
  • Check null list. If found send message to /dev/null
  • Check black list. If found send blacklist.html to the sender and discard message
  • Check white list. If found deliver message
  • Otherwise return message (if we have not already returned a message for this sender)
This way the user can apply special privilege for some users. For example, I'd really like to null list all of the *.br$ spam I get but if I did I could not get email from my dad in Brasil. If I had a super white list I could add his address to it and then nulllist everybody else.
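The proposed ordering could look something like the following Python sketch. The list entries are treated as regular expressions here, which is an assumption (MAPS entries such as *.br$ are written in their own pattern style), and `classify` and its return values are made-up names. The "have we already returned a message to this sender" check is left out for brevity.

```python
import re


def matches(patterns, sender):
    """True if any pattern (treated as a regex) matches the sender."""
    return any(re.search(p, sender, re.IGNORECASE) for p in patterns)


def classify(sender, super_white, null, black, white):
    """Decide what to do with a message, checking the super white list first."""
    if matches(super_white, sender):
        return "deliver"    # address-book entries win over everything
    if matches(null, sender):
        return "discard"    # straight to /dev/null
    if matches(black, sender):
        return "blacklist"  # send blacklist.html and discard
    if matches(white, sender):
        return "deliver"
    return "return"         # unknown sender: send the register message
```

For example, with `super_white = [r"^dad@example\.com\.br$"]` and `null = [r"\.br$"]`, mail from dad@example.com.br is delivered while every other .br sender is discarded.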

We also need to figure out how to handle spam I get when the spammer sets From to be me (or anybody else on my white list). I was thinking about an X-MAPS header that I could set for myself. This would not solve the problem of a spammer sending me spam from somebody else on my white list though I've yet to see that - it would be rare. Except for domain whitelisting such as *@defaria.com. Not sure how to address that.

Rounded Tables


The Trick to Rounded Corner Tables
Might use this on MAPS...

CSS Site




MaKo 4 CSS: Frequently asked Questions -
Seems to be a good site on CSS issues. Should be able to help me solve my current problems with CSS in MAPS.

CSS and Cookies


Did a lot of work this weekend on CSS. It's a bitch that things render differently between Netscape and IE. Anyways, developed a new "main" page and implemented cookies properly so that there is no login involved. Still the user can "logout", thus deleting the cookie, if they like.

Worked hard on Quick Stats which appears on the main page and gives the user a quick rundown of today's activity. Also implemented Check Address as an input field on the main page. Basically it allows the user to type in an address to see what MAPS would do with it. Pops up a window to display the message. Pretty cool.

Added pages for Reports, Manage Lists, etc. So far they go nowhere really. Need to fill these out.

Also, the "menu" box needs to be displayed consistently across pages. Need to get rid of the JavaScript menu bar thing I had. Ah page layout can be a bitch, especially when you code it. Need to move the page layout duties to CSS!

Finally, why doesn't IE pay attention to the Footer.js/copyright thing?

MIME


search.cpan.org: MIME-tools - modules for parsing (and creating!) MIME entities


MAPS needs to use MIME to be able to return messages properly. This is a pointer to MIME-tools, a CPAN module for manipulating MIME messages.

More MIME


Well I managed to install MIME-tools and I banged and banged on it for a while. I managed to send MIME messages but not exactly the format that I wanted. I want 3 parts:
  1. Plain text containing the MAPS register message
  2. HTML version of #1
  3. Potential spammer's message
I can send the message but #3 is not showing up as an attachment. Emailed the author of MIME-tools. Let's see what he says...
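For comparison, here is how that layout can be built with Python's standard email package. This is not the MIME-tools API, just a sketch of one common structure: the plain text and HTML versions sit together in a multipart/alternative part, and the original message rides along as a message/rfc822 attachment inside an outer multipart/mixed. The function name and the original.eml filename are made up for the example.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_register_reply(plain_text, html_text, original_raw):
    """Build a reply holding text, HTML and the original message."""
    # Parts 1 and 2: the same register message in two formats.
    body = MIMEMultipart("alternative")
    body.attach(MIMEText(plain_text, "plain"))
    body.attach(MIMEText(html_text, "html"))

    # Part 3: the potential spammer's message, marked as an attachment.
    original = MIMEMessage(message_from_string(original_raw))
    original.add_header("Content-Disposition", "attachment",
                        filename="original.eml")

    reply = MIMEMultipart("mixed")
    reply.attach(body)
    reply.attach(original)
    return reply
```

The Content-Disposition header on the third part is what tells most mail clients to show it as an attachment rather than inline.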

JavaScript RegExs


Core JavaScript Guide 1.5: 4 Regular Expressions
Need to change my JavaScript to use regex's for the purposes of better identifying a valid email address

Stats




MAPS Stats
Well I got stats sorta working. Looking pretty good so far. It does seem to take a long time to compile the table for 30 days. Maybe I'll need to page this stuff too.
Need to implement clicking on say Returned for 2003-10-10 so that it only shows that subset. Need to write that into detail.cgi

MAPSDeliver


Created MAPSDeliver to isolate the portion of the MAPS system that needs to be setgid (in order to deposit the mail). While maps itself could output the message and let sendmail deliver it, we still need to deliver messages when somebody registers (i.e. Whitelist), so we need to manipulate the user's mail box directly, thus we need to be setgid. MAPSDeliver does this.

Mod Perl




mod_perl: What is mod_perl?
Seems like this is homebase for mod_perl. Still not sure if this can help out with my problems of maintaining the state of what user is currently logged into my MAPS web pages.

Apache Session


search.cpan.org: Jeffrey Baker / Apache-Session
It may be possible to use Apache Session to record that somebody has logged into MAPS and provide a persistent state while they are logged in. Question is what happens if people do not log out? How is this to be handled?

Todos


MAPS Registration
  1. Apparently the JavaScript regex for valid email addresses is not bulletproof yet. Email addresses of the form "Joe@Schmoe" pass through the regex OK.

  2. JavaScript should check to make sure username doesn't have a space in it


  3. username should be case insensitive


  4. registerform.cgi should not allow a username that is not a valid MAPS user.
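
A rough sketch of the first three checks, written in Python rather than JavaScript just to pin down the rules. The pattern is deliberately loose (it only insists on one "@" and at least one dot in the domain) and is an assumption, not the regex MAPS uses. `is_plausible_email` and `normalize_username` are hypothetical names.

```python
import re

# One "@", no whitespace, and a domain with at least one dot,
# so "Joe@Schmoe" is rejected.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$")


def is_plausible_email(address):
    """Loose syntax check; it does not prove the address exists."""
    return EMAIL_RE.match(address) is not None


def normalize_username(username):
    """Lowercase the username, or return None if it is empty or has spaces."""
    if username == "" or any(ch.isspace() for ch in username):
        return None
    return username.lower()
```

Comparing usernames only after `normalize_username` is one way to get the case insensitive behavior in item 3.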

Cookies


I consulted with Scott and he suggested that I just use a cookie to store the userid and then base things off of that. Seems to work fairly well. Spent some time implementing this and things are looking good.