Mass email Marketing and anti-spam - some of the how-to.. : RooJSolutions

Published 2015-11-19 00:00:00

I'm sure I've mentioned on this blog (probably a few years ago) that we spent about a year developing a very good anti-spam tool. It is built on a large number of MySQL stored procedures that process email as it is accepted and forwarded by an exim mail server.

The tricks it uses are numerous, although most of them count as best practice these days.

The whole process starts off with creating a database of

'known' servers it has talked to before
'known' domains it has dealt with before
'known' email addresses it has dealt with before

If an email / server / domain combination is new and has not been seen before, then apart from greylisting and delaying the socket connection, we also have an optional manual approval process (using the web client).
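As a rough illustration of the 'known' lookup, here is a minimal Python sketch. It is not the real implementation (which lives in MySQL stored procedures): it uses an in-memory sqlite3 database so it runs anywhere, the table and function names are made up for the example, and it assumes a message only counts as known when the server, the domain and the address have all been seen before.

```python
import sqlite3

# Hypothetical table layout: (table name, column name) for each 'known' set.
TABLES = (("known_server", "ip"), ("known_domain", "domain"),
          ("known_address", "address"))


def make_db():
    """Create an in-memory database with one table per 'known' set."""
    db = sqlite3.connect(":memory:")
    for table, column in TABLES:
        db.execute(f"CREATE TABLE {table} ({column} TEXT PRIMARY KEY)")
    return db


def _parts(server_ip, sender):
    """Split a delivery into the three values we track."""
    sender = sender.strip().lower()
    domain = sender.rsplit("@", 1)[-1]
    return (server_ip, domain, sender)


def classify(db, server_ip, sender):
    """Return 'accept' if every part is already known, else 'greylist'."""
    for (table, column), value in zip(TABLES, _parts(server_ip, sender)):
        # Table and column names come from the fixed TABLES tuple above,
        # never from the message itself.
        row = db.execute(
            f"SELECT 1 FROM {table} WHERE {column} = ?", (value,)
        ).fetchone()
        if row is None:
            return "greylist"
    return "accept"


def remember(db, server_ip, sender):
    """Record all three parts once a message has been approved."""
    for (table, column), value in zip(TABLES, _parts(server_ip, sender)):
        db.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)",
                   (value,))
```

A new combination comes back as 'greylist'; after remember() has been called for it (for example following manual approval), the same server and sender come back as 'accept'.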

Moving on from that we have a number of other tricks, usually involving detecting URL links in the email and checking whether any other greylisted messages (with a different 'from') are using the same URL.
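The URL trick can be sketched in a few lines of Python. This is only an illustration under some assumptions: the greylist queue is represented as a plain list of (sender, body) pairs, the regular expression is deliberately simple, and URLs are lower-cased before comparison, which a real system might not want to do for the path part.

```python
import re
from collections import defaultdict

# Deliberately simple pattern: anything starting with http:// or https://
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_urls(body):
    """Return the set of URLs in a message body, minus trailing punctuation."""
    return {url.rstrip(".,;:!?)").lower() for url in URL_RE.findall(body)}


def shared_urls(greylisted):
    """Map each URL used by more than one sender to the set of senders.

    `greylisted` is an iterable of (sender, body) pairs.
    """
    senders_by_url = defaultdict(set)
    for sender, body in greylisted:
        for url in extract_urls(body):
            senders_by_url[url].add(sender.lower())
    return {url: senders for url, senders in senders_by_url.items()
            if len(senders) > 1}
```

Any URL that comes back from shared_urls() is being pushed by several different 'from' addresses at once, which is a strong hint that the greylisted messages belong to the same campaign.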

On top of this is a web user interface to manage the flow and approval of email. You can see what is in the greylist queue, and set up different accounts for different levels of protection (post-delivery approval, pre-delivery approval etc.).

This whole system is very effective, when set up correctly. It can produce zero false negatives, and after learning for a while, is pretty transparent to the operations of a company. (email me if you want to get a quote for it, it's not that expensive...)

So after having created a best of breed anti-spam system, in typical fashion, we get asked to solve the other end: getting large amounts of email delivered to mailing lists.

If you are looking for help with your mass email marketing systems, don't hesitate to contact us at sales@roojs.com

Read on to find out how we send out far too many emails (legally and efficiently).

Mailing list manager.

The core of sending out large amounts of email is a mailing list manager. You should never do this with Outlook or from your personal mailbox, and yes, I've seen people try...

One option is to use one of the mailing list as a service products (Mailchimp etc.). These are good in some scenarios, but you will run into a number of issues.

If you do not have complete records of where you sourced all the email addresses (for example, you collected them at trade shows over the last few years), then bulk importing them will be problematic, and you risk getting kicked off these products.
You are very dependent on the provider not getting blacklisted. To be frank, I suspect that since Constant Contact and Mailchimp have been abused so much before, they are probably on most people's private blacklists, if not a few real ones.

So unless you are doing a small mail out to a very simple list, then these types of mailing lists as a service will not really work for you.

To do this properly you need a system that you know works and that you can deploy yourself. For us this starts with the mailing list manager (we use a commercial PHP product, Interspire). It's not the cleanest of codebases, but it at least looks good in comparison to WordPress.

This provides the bare essentials for a direct marketing campaign: managing mailing lists, block lists (so that you can ensure compliance with official complaints), simple markup for an email with some personalization, and some basic reporting. And since it's written in PHP we can add a few minor customizations.

Note: avoid customizing a commercial PHP product, as integrating the changes into the next release is a pain in the backside. Our customizations are therefore very tiny and very specific. See the Distribution of IP's section for more details.

Speed is key.

While we have been involved in direct marketing for a while, normally it's just small lists gathered on a website. These are simple to manage, and if you are sending fewer than a thousand emails an hour, you will not have to worry much.

But if you are sending hundreds of thousands, which we do occasionally, they need to be sent as quickly as possible, otherwise the blacklists that you may accidentally trigger will reduce the delivery of the later batches of emails.

Key to this is running on suitable hardware. When we started this, most of the VPS providers only gave you part of a normal hard drive. These can be very slow: we were getting delivery rates of a few thousand per hour, rather than our current situation where we can deliver around 100,000 mails per hour, per server. The first big improvement was moving to SSD drives, which makes a huge speed difference, both for the database and for the mail queues.

Since all our databases are MySQL based, the other key part is to use InnoDB with one file per table (the innodb_file_per_table setting) - otherwise the shared InnoDB tablespace file in the MySQL data directory grows to a crazy size in just a few months. There are also a few tweaks to the cache sizes etc. that help, and of course a bit more memory always improves things.

Pass off Delivery.

Other than getting the database to go fast (especially as we are using it for our flow control in the SMTP server), the next issue is that when exim gets a large backlog of messages it slows down.

I mentioned at the beginning that one of the best strategies for reducing spam is greylisting. As I have to explain to non-techies, this is like the postman coming to your house, finding you are not there, and trying again the next day to deliver your parcel. With less ethical email marketing (or real spammers using viruses on PCs), what would normally happen is that they just skip your address and move on to the next one. Hence greylisting makes a huge difference in reducing spam.

When you are running a mail server and you connect to another server running greylisting, apart from the resources devoted to waiting for it to let you in, you also have to put the message back on the queue after it has been attempted once. In exim, if the queue gets large, the whole server slows down (basically it's down to scanning the directories where the messages are stored). To resolve this, we need to split the delivery process across two or more servers. The primary one is responsible for the first delivery attempt, and the others are responsible for retrying. This way our main server can plow through the main list of email addresses, and hand off any of the heavy duty queuing to another server.
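The split between 'first attempt' and 'retry' servers can be pictured with a small Python sketch. It is an illustration only: the real work is done by exim, `attempt` is a hypothetical callable standing in for one SMTP delivery attempt, and the retry server is reduced to a simple queue object.

```python
from collections import deque


def first_pass(messages, attempt, retry_queue):
    """Try every message once; hand temporary failures to the retry queue.

    `attempt` is a hypothetical callable that returns the SMTP reply
    code (an int) for a single delivery attempt.
    """
    delivered, rejected = [], []
    for message in messages:
        code = attempt(message)
        if 200 <= code < 300:
            delivered.append(message)
        elif 400 <= code < 500:
            # Temporary failure (greylisting, busy server): do not keep it
            # on the primary, pass it to the retry server's queue instead.
            retry_queue.append(message)
        else:
            # Permanent failure: nothing to retry.
            rejected.append(message)
    return delivered, rejected
```

The point of the design is that the primary never revisits a message: its own queue stays short, so it keeps moving through the list at full speed while the secondary carries the backlog.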

An interesting aside here is that a number of commercial email providers operate an outbound pool of IPs that send out email. Greylisting normally matches on the IP / email address combination, and expects the same combination to repeat a few times before a new unknown email is accepted. When some of these commercial services try just once on each combination, it looks a lot like a spammer, and tends to get caught in our greylist queue. Since we consistently send from the same IP address, this technique increases our chances of delivery.
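To see why a rotating pool of IPs does badly against greylisting, here is a minimal Python sketch of a greylist. It assumes the simplest possible policy, which is not necessarily what any particular server does: the key is the (IP, sender) pair, and a retry is accepted once a minimum delay has passed since that pair was first seen. The 300 second delay is an arbitrary example value.

```python
class Greylist:
    """Toy greylist keyed on (ip, sender); times are plain seconds."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay
        self.first_seen = {}

    def check(self, ip, sender, now):
        """Return 'accept' for a patient retry, 'defer' otherwise."""
        key = (ip, sender.lower())
        # Remember when this exact combination first showed up.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "defer"
```

A sender that retries from the same IP after the delay is accepted, while a pool that retries the same message from a different IP is treated as brand new and deferred again.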

The only downside to this is that the secondary server does not run all our tracking (as it's not database connected), so we can only produce limited reporting from the secondary servers, by parsing the log files.

Tracking

Most mailing lists eventually contain email addresses that have long since died: people change jobs and just leave their subscriptions hanging, companies shut down, or a mailbox just gets forgotten about. Interspire includes a rough ability to scan through an inbox to check for bounce messages. Sadly this process is not that reliable, and it often leaves bad email addresses on the list.

We also do not generate bounce messages on the email server, as this would significantly increase the workload when sending out a batch of emails. So the only bounce messages that Interspire can parse are those from remote servers (backsplatter), which come in completely random formats.

When a large number of emails are sent out, there are a number of things that can happen to an individual email:

The mail gets through
The mail gets through, however the remote server bounces the email after processing (backsplatter)
The server reports itself as busy (e.g. greylisting) and we have to try again
The mail gets rejected (as spam, or with some condition that indicates the user no longer exists)

For emails that are rejected with an indication that the user no longer exists, it's normally a good idea to just remove them from the list. Interestingly, exim does not really provide any built-in mechanism to track these rejections. You can only really add hooks to the receiving and pre-delivery stages of the flow. The actual flow of the outbound SMTP connection is not available.

To solve this we ended up using rsyslogd, the system logging daemon, and adding hooks to parse the exim log messages. This is far from ideal, as it makes the database do even more work while it parses the messages. But it does mean that we can trace each message in the database that has been recorded as sent, and record the rejection messages. From there we can do matching to determine which messages are really bad user rejections.
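The final matching step might look something like the Python sketch below. It is only an illustration and makes several assumptions: the remote server's reply text has already been pulled out of the log line (the log parsing itself is not shown), the 5.1.1 enhanced status code is treated as 'mailbox does not exist', and the list of tell-tale phrases is a made-up starting point rather than a complete set.

```python
import re

# Three-digit SMTP reply code, optionally followed by an enhanced status
# code such as 5.1.1.
REPLY_RE = re.compile(r"^\s*(\d{3})(?:[ -](\d\.\d{1,3}\.\d{1,3}))?")

# Illustrative phrases only; real servers word this in many ways.
BAD_USER_HINTS = ("user unknown", "no such user", "mailbox not found",
                  "does not exist")


def classify_reply(reply):
    """Classify a remote SMTP reply as one of five outcomes."""
    match = REPLY_RE.match(reply)
    if not match:
        return "unknown"
    code, enhanced = int(match.group(1)), match.group(2)
    if 200 <= code < 300:
        return "delivered"
    if 400 <= code < 500:
        return "retry"        # greylisted or busy, try again later
    if 500 <= code < 600:
        text = reply.lower()
        if enhanced == "5.1.1" or any(h in text for h in BAD_USER_HINTS):
            return "bad_user"  # candidate for removal from the list
        return "rejected"      # refused for some other reason, e.g. spam
    return "unknown"
```

Only the 'bad_user' results should feed list cleaning; a plain 'rejected' usually says more about the sending IP or the content than about the recipient.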

Distribution of IP's

The biggest part of mass marketing is dealing with public blacklists. For many years, the classic protection methods for an email server have been to

match keywords to rate something as spam (we do not do this on our anti-spam system)
greylist (yes we do this..)
Bayesian filtering - not common on servers, but we do it as we have the data for this.
public blacklists (we do not use these)
private whitelists/blacklists (we do use this)

There are many public blacklists out there. They do affect message delivery and have to be monitored (we use the MxToolbox paid service). It's one of those joyless tasks, sorting out the mess that the research team has created by adding email addresses without proper checking. When we send out a batch of emails, for mass marketing or even for signup services, we have to be aware that the IP and the domain associated with that IP may end up on a blacklist.

Blacklists can reduce deliverability by about 10-30%, and it seems like even more when the server is not doing mass marketing, but is just a company web server or in-house email server. Some of these lists have easy and obvious ways to delist, others require large amounts of form filling, and some basically refuse to delist domains if they think they are only being used for their idea of spam.

With my anti-spam hat on, these lists are pretty pointless. They tend to cause more problems than they solve (stupid amounts of false positives), but they provide a low barrier to entry for beginners to set up an email server with some spam protection.

To reduce the effects of blacklisting on mass marketing, it's essential to do what Spamhaus calls snowshoeing (it actually recommends you don't do this, but in reality it's the only way to get around their blocks): basically spreading your delivery over a number of dispersed IP addresses.

We went through a learning experience on this. We started getting IPs blocked on Spamhaus, and to try and solve this we distributed the delivery to other IP addresses that had not been blacklisted. This just ended up with us playing whack-a-mole with Spamhaus. So what now happens is that we split our mailing lists into small segments, and each list is assigned an IP/domain combination. We modified Interspire so that, when the email is sent out, the sender domain is also used as the unsubscribe / tracking domain, so the messages do not share a common element that can be matched by the blacklists.

Then on the exim server we use that domain to determine the outbound IP to use, so that the from domain on the email matches the reverse pointer of the IP address.
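The domain to IP relationship boils down to a lookup table. The Python sketch below shows the idea only; in practice this lives in the exim configuration and the modified Interspire code, and the domains, addresses and the /unsubscribe path with its `t` parameter are all invented for the example.

```python
from urllib.parse import urlencode


def sender_domain(sender):
    """Return the lower-cased domain part of a from address."""
    return sender.rsplit("@", 1)[-1].lower()


def outbound_ip(sender, ip_by_domain):
    """Pick the outbound IP whose reverse pointer matches the from domain."""
    return ip_by_domain[sender_domain(sender)]


def unsubscribe_url(sender, token):
    """Build an unsubscribe link on the same domain as the sender.

    The path and the 't' parameter are hypothetical.
    """
    return f"https://{sender_domain(sender)}/unsubscribe?{urlencode({'t': token})}"
```

Because both functions start from the same from address, the envelope domain, the links in the body and the sending IP's reverse pointer stay consistent for any one message.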

Building a Multi-IP platform

This whole email blast started off on one machine sending out an occasional email. I had set up a machine in the client's office, which the customer used to send out emails. They had been getting sales leads from the emails, however the open rate and response rate were very low: open rates of around 0.11% and click-throughs of 0.03% (after our work, the rates are roughly 5% and 1%). There were multiple reasons why this was happening, but it basically boiled down to the fact that although a large number of emails were going out, only a small number ever arrived in the inbox of the potential customer.

As I mentioned above, solving this involved a number of key things

multiple outbound IP addresses
multiple exim queues
multiple machines on different IP ranges (as we found that some blacklists would just mark a whole subnet as bad)

The eventual design, while quite complicated to set up, is reasonably stable. The basic layout is like a star: the central server is connected to each satellite VPS using OpenVPN. This enables it to bind to the VPN tunnel and send an email out via the satellite's IP address. If this fails, the message is passed to an LXC container on the satellite VPS for queuing and delivery.
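The 'bind to the tunnel, fall back to the satellite queue' flow can be sketched in Python with smtplib, whose SMTP class accepts a source_address argument for exactly this kind of binding. This is an illustration rather than the exim setup itself: it assumes routing is already configured so that traffic from the tunnel address leaves via the satellite, and `queue_on_satellite` is a hypothetical callable standing in for the hand-off to the LXC container.

```python
import smtplib


def send_direct(message, sender, recipient, mx_host, tunnel_ip,
                smtp_factory=smtplib.SMTP):
    """Send one message with the local socket bound to the tunnel address."""
    # Port 0 lets the operating system pick the local port.
    with smtp_factory(mx_host, 25, timeout=30,
                      source_address=(tunnel_ip, 0)) as smtp:
        smtp.sendmail(sender, [recipient], message)


def deliver(message, sender, recipient, mx_host, tunnel_ip,
            queue_on_satellite, smtp_factory=smtplib.SMTP):
    """Try the direct route first, otherwise queue on the satellite."""
    try:
        send_direct(message, sender, recipient, mx_host, tunnel_ip,
                    smtp_factory)
        return "direct"
    except OSError:
        # smtplib's exceptions derive from OSError, so this covers both a
        # dead tunnel and SMTP-level failures. A real system would treat
        # permanent rejections differently instead of queuing them.
        queue_on_satellite(message, sender, recipient)
        return "queued"
```

The smtp_factory parameter is there only so the connection class can be swapped for a stand-in when trying the logic out without touching the network.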

Commissioning the satellites is quite a complicated task, however we did build up a few scripts to try and automate this. We also had to deal with the fact that domains that are used on these IP addresses eventually get bogged down in blacklists (ones that are really hard to get off). So every 3-6 months we have to re-allocate the IP address to give it a new reverse pointer. This still involves some manual tweaking in multiple places to make sure the HELO hostnames all work correctly.

On top of this, links within an email going out have to point to the same domain as the email sender (and don't try to put links in emails which show one domain, yet actually point to another - it's an easy way to spot spam with tracking). So each satellite server had to run nginx as a reverse proxy pointing back to the original server, so people could unsubscribe, or follow any tracked links in the email.

Delivering to the big boys - gmail, outlook and yahoo

A large part of any modern email list is dealing with the free and paid email providers, like Gmail, Microsoft and Yahoo. They generally maintain in-house blacklists which are near impossible to get off. For our large blasts we have not really addressed this yet. Frequently they will accept emails, however they are seriously rate limited (if you start sending more than a few emails in a short period of time, they will either reject you or greylist/defer you).

For another customer, who is sending out emails on a daily basis but in lower numbers, to new potential customers rather than a fixed mailing list, it became more critical to successfully deliver to these providers.

To solve this problem we had to try a number of tricks to get emails delivered. Gmail and Yahoo are perhaps the simplest: just rate limit the number of emails being sent. This means that the messages will get through, however it does not solve the issue of the emails going into the spam folder.
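Per-provider rate limiting is easy to sketch in Python. The version below is a simple minimum-interval limiter and not the stored procedure logic we actually run: the caller passes in the current time, gets back how long to wait before sending, and the interval values are placeholders, since the providers do not publish their real limits.

```python
class DomainRateLimiter:
    """Enforce a minimum gap (in seconds) between sends to each domain."""

    def __init__(self, intervals, default_interval=0.0):
        self.intervals = intervals            # domain -> seconds between sends
        self.default_interval = default_interval
        self.next_allowed = {}                # domain -> earliest next send

    def wait_time(self, recipient, now):
        """Reserve the next slot for this recipient's domain.

        Returns how many seconds the caller should wait before sending.
        """
        domain = recipient.rsplit("@", 1)[-1].lower()
        interval = self.intervals.get(domain, self.default_interval)
        send_at = max(now, self.next_allowed.get(domain, now))
        self.next_allowed[domain] = send_at + interval
        return send_at - now
```

With an interval of 2 seconds for a domain, three messages queued at the same moment get waits of 0, 2 and 4 seconds, while mail to unlisted domains goes out immediately.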

For hotmail, outlook, live.com and all the wonderful names that Microsoft keeps calling its service, things proved more complicated. Even if we rate limited the delivery, their servers would still just reject all our email. The first step in trying to solve this was to investigate using smtp-auth: create a hotmail account, then send to the other hotmail accounts using this new one. This does work for a short time, however after a few emails they will ask you to re-verify the account, which you can do with an SMS. Then, oddly enough, it still stops you from sending.

I was almost at a loss on how to solve this issue with hotmail - getting these introduction emails through is critical to the business. So putting my thinking hat on, I came up with the idea of using smtp-auth with gmail (and a number of accounts), then delivering to hotmail via these gmail addresses, based on the idea that Microsoft, however much they dislike Google, would not block emails from gmail. Along with rate limiting the amount of email we send via each gmail account, we managed to solve the hotmail issue. And with Microsoft pushing Office 365 so much, this is actually quite an important trick. All this is coded up in stored procedures that control the flow of email out of the exim server.

Follow the laws

As a footnote to all of this, here is the general gist of Hong Kong's email marketing laws, which are not a bad balance. There are some key things to remember.

Make sure there is a working unsubscribe / block list in your email manager. Recipients in Hong Kong can complain directly to a government department, and if you are running a mailing list, you must promptly delete/block that user from getting emails, otherwise there are large penalties.
Sourcing emails to add to your list cannot be done using a spidering tool or any kind of automated collection.
Buying big lists from online resellers is a big no-no. Actually they are a waste of money anyway, as the lists are frequently splattered with honeypot addresses.
There are some weird rules about personal details and email addresses collected after 2012, which mean you cannot use the first name in the email (I'm not a lawyer, so you will have to check that for yourself).

But in principle, there appear to be no rules against employing a staff member to google for contacts and create a mailing list from that. How effective that would be is another question, however.
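The unsubscribe / block list rule from the first point is the one that most needs to be enforced in code, on every send. A minimal Python sketch, assuming the block list and the mailing list are both plain collections of addresses and that matching should ignore case and stray whitespace:

```python
def normalize(address):
    """Compare addresses case-insensitively and without stray spaces."""
    return address.strip().lower()


def filter_recipients(recipients, blocked):
    """Drop blocked addresses (and duplicates) before a send."""
    blocked_set = {normalize(a) for a in blocked}
    seen = set()
    allowed = []
    for address in recipients:
        key = normalize(address)
        if key in blocked_set or key in seen:
            continue
        seen.add(key)
        allowed.append(address.strip())
    return allowed
```

Running the filter at send time, rather than only when an unsubscribe arrives, means a blocked address stays blocked even if it is re-imported from another source later.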

Email Marketing Works

In conclusion to all this, pretty much every time we see email marketing used, it is an effective tool to create leads or signups depending on your target. So if you need someone to help you set up any size of email marketing system don't hesitate to contact us.

