How Spam Filters Tell Phishing From Promotional Email

You're looking at two emails sitting side by side in your inbox. One is from a retailer announcing a sale. The other is a fake version of that same retailer, built to steal your password. They look nearly identical: same logo, same font, same cheerful subject line. Your spam filter swallowed one whole and waved the other straight through. How, exactly, did it know?

Not by reading the words. It was checking the sender's identity, the link structure, and the behavioural history of everyone who has ever touched that message type before. The mechanics are worth understanding.

The Postmark That Can't Be Faked (Mostly)

Before a spam filter reads a single word, it checks whether the sender is who they claim to be. Three protocols do the heavy lifting.

SPF (Sender Policy Framework) checks whether the IP address that delivered the email is on the approved list published for the sending domain (strictly, the domain in the envelope sender, which is not always the From address you see). If Nike's official sending servers are listed in their DNS record and your email arrived from a server in a different country that isn't on that list, SPF fails. One strike.
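A minimal sketch of that membership check, assuming the domain's approved ranges have already been fetched from DNS. The domain name, the address ranges, and the spf_check function are made up for illustration; a real evaluator also handles include, redirect, and the other SPF mechanisms.

```python
import ipaddress

# Hypothetical approved ranges for an example domain.
# Real SPF data lives in a DNS TXT record published by the sender.
AUTHORIZED_NETWORKS = {
    "example-retailer.com": ["192.0.2.0/24", "198.51.100.16/28"],
}

def spf_check(domain: str, sending_ip: str) -> str:
    """Return 'pass', 'fail', or 'none' (no record published)."""
    networks = AUTHORIZED_NETWORKS.get(domain)
    if networks is None:
        return "none"
    ip = ipaddress.ip_address(sending_ip)
    for cidr in networks:
        if ip in ipaddress.ip_network(cidr):
            return "pass"
    return "fail"

print(spf_check("example-retailer.com", "192.0.2.55"))   # inside an approved range
print(spf_check("example-retailer.com", "203.0.113.9"))  # not on the list
```

Note the third outcome: a domain with no record at all returns "none" rather than "fail", which is one reason a missing record is weaker evidence than a failed one.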

DKIM (DomainKeys Identified Mail) goes further. It attaches a cryptographic signature to the email, generated using a private key only the real sender holds. The receiving mail server checks that signature against a public key in the sender's DNS. Change even a single character of the signed content in transit and the signature breaks. Tamper-evident, like a wax seal on an envelope.
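The tamper-evidence comes from hashing. As a rough illustration, the sketch below computes only the body-hash half of DKIM (the value carried in the signature's bh= tag) using the "simple" body canonicalization and SHA-256. It does not verify the public-key signature itself, and the message text and function names are invented for the example.

```python
import base64
import hashlib

def simple_body_canon(body: bytes) -> bytes:
    """'simple' canonicalization: drop trailing empty lines, end with one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"

def body_hash(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of the canonicalized body."""
    digest = hashlib.sha256(simple_body_canon(body)).digest()
    return base64.b64encode(digest).decode("ascii")

original = b"Our spring sale starts Friday.\r\n"
tampered = b"Our spring sale starts Fridaz.\r\n"  # one character changed

print(body_hash(original) == body_hash(tampered))                # False
print(body_hash(original) == body_hash(original + b"\r\n\r\n"))  # True
```

Extra blank lines at the end of the body are ignored by the canonicalization, but any change to the content itself produces a different hash, so the signed value no longer matches.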

DMARC ties both together. It requires at least one of the two checks to pass for a domain that matches the visible From address, and it tells receiving servers what to do when neither does: quarantine the message, reject it outright, or just report back. A legitimate retailer with a properly configured DMARC policy set to "reject" is very hard to impersonate at the domain level. A phishing operation spoofing that exact domain will typically fail all three checks before a human eye sees a word.
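In the DMARC specification, a message passes when at least one of SPF or DKIM passes for a domain aligned with the visible From domain; the published policy only applies when neither does. A simplified sketch of that decision follows. The function names are hypothetical, and the org_domain helper naively takes the last two labels, whereas real implementations use the Public Suffix List.

```python
def org_domain(domain: str) -> str:
    # Naive assumption: the organisational domain is the last two labels.
    # Real implementations consult the Public Suffix List instead.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])

def dmarc_disposition(from_domain: str,
                      spf_pass: bool, spf_domain: str,
                      dkim_pass: bool, dkim_domain: str,
                      policy: str) -> str:
    """policy is the sender's published p= value: none, quarantine, or reject."""
    spf_aligned = spf_pass and org_domain(spf_domain) == org_domain(from_domain)
    dkim_aligned = dkim_pass and org_domain(dkim_domain) == org_domain(from_domain)
    if spf_aligned or dkim_aligned:
        return "pass"
    return {"none": "report-only",
            "quarantine": "quarantine",
            "reject": "reject"}[policy]

# SPF passes, but for the attacker's own domain, so it does not align.
print(dmarc_disposition("example.com", True, "attacker.net", False, "", "reject"))
# prints "reject"
```

The alignment step is what stops an attacker from passing SPF for a domain they control while displaying someone else's domain in the From line.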

The catch: plenty of legitimate businesses have misconfigured or absent SPF and DKIM records. A fail doesn't always mean phishing. Filters know this, which is why authentication is one signal among dozens, not a verdict by itself.
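One way to picture "one signal among dozens" is a weighted score squashed into a probability. The sketch below is a toy logistic model: the signal names, weights, and bias are invented for illustration and do not come from any real filter.

```python
import math

# Hypothetical weights: positive values push toward "phishing",
# negative values push toward "legitimate".
WEIGHTS = {
    "spf_fail": 1.2,
    "dkim_fail": 1.5,
    "young_domain": 2.0,
    "lookalike_link": 3.0,
    "prior_engagement": -2.5,
}
BIAS = -3.0  # most mail is not phishing, so the starting point leans benign

def phishing_probability(signals: dict) -> float:
    """Combine boolean signals into a probability between 0 and 1."""
    z = BIAS + sum(w for name, w in WEIGHTS.items() if signals.get(name))
    return 1.0 / (1.0 + math.exp(-z))

misconfigured = phishing_probability({"spf_fail": True})
phish = phishing_probability({"spf_fail": True, "dkim_fail": True,
                              "young_domain": True, "lookalike_link": True})
print(round(misconfigured, 2), round(phish, 2))
```

With these made-up numbers, a lone SPF failure stays well below even odds, while several failures stacked together push the score close to certainty.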

The Reputation Layer Nobody Talks About

This is the part most guides skip.

Every IP address and domain that sends email accumulates a reputation score, maintained by blocklist operators like Spamhaus and Barracuda and by the internal systems at Google, Microsoft, and Apple. Think of it as a credit score for senders, built over years of behaviour.

A real marketing team at a clothing brand has been sending from the same IP block for three years. Their emails get opened at a 22% rate. Recipients occasionally click unsubscribe rather than hitting "report spam." Sending volume follows a recognisable pattern: spikes around holidays, quieter in between. That history is worth a lot.

A phishing operation appears from nowhere. A domain registered four days ago. An IP address with no sending history, or one previously flagged for malware. Volume that goes from zero to fifty thousand messages in an hour, with no prior relationship with any recipient. Every one of those signals is a red flag. They compound fast.

Some filters apply a "domain age" penalty automatically. Under thirty days old, sending commercial-looking email at scale: extra scrutiny by default, regardless of what the email actually says.
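The red flags described above can be sketched as a few simple checks. The thirty-day cutoff comes from the text; the twentyfold spike threshold, the function name, and the flag labels are assumptions chosen for the example.

```python
from datetime import date

def sender_red_flags(domain_registered: date, today: date,
                     sent_last_hour: int, avg_hourly_volume: float,
                     ip_flagged: bool) -> list:
    """Return the reputation red flags raised by a sender (hypothetical rules)."""
    flags = []
    if (today - domain_registered).days < 30:
        flags.append("young-domain")
    # Compare against the sender's own history; a sender with no history
    # gets a baseline of 1 message per hour.
    baseline = max(avg_hourly_volume, 1.0)
    if sent_last_hour / baseline > 20:
        flags.append("volume-spike")
    if ip_flagged:
        flags.append("flagged-ip")
    return flags

today = date(2024, 11, 29)
# Established brand: old domain, holiday spike within its normal range.
print(sender_red_flags(date(2021, 3, 1), today, 5000, 2000.0, False))
# prints []
# New operation: four-day-old domain, zero to fifty thousand in an hour.
print(sender_red_flags(date(2024, 11, 25), today, 50000, 0.0, False))
# prints ['young-domain', 'volume-spike']
```

Because the volume check is relative to each sender's own baseline, a large retailer's holiday surge does not trip it, while the same absolute volume from an unknown sender does.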

What the Links Are Really Doing

Phishing emails need you to click something. That's the whole mechanism. And that link is where filters find their clearest evidence.

A legitimate promotional email from a retailer contains links that resolve to the brand's actual domain, often through a tracked redirect hosted on a known marketing platform like Salesforce Marketing Cloud or Klaviyo. Those redirect domains have reputations of their own. Filters know them well.

A phishing link tends to be one of three things. A lookalike domain: amaz0n-secure-login.com instead of amazon.com. A legitimate file-sharing or form service (Google Forms, Dropbox) hosting the fake page, borrowing that platform's clean reputation as camouflage. Or a freshly registered domain with a long, random-looking path the filter has never seen.
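The lookalike case is the easiest of the three to sketch. The example below assumes a tiny, hypothetical brand list and a handful of digit-for-letter swaps; production systems use far larger confusable tables and brand inventories.

```python
from typing import Optional

# Hypothetical digit-to-letter substitutions commonly used in lookalikes.
HOMOGLYPHS = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})

# Hypothetical brand list: brand keyword -> the brand's real domain.
KNOWN_BRANDS = {"amazon": "amazon.com", "paypal": "paypal.com"}

def lookalike_of(hostname: str) -> Optional[str]:
    """Return the real domain a hostname appears to imitate, or None."""
    host = hostname.lower().rstrip(".")
    normalized = host.translate(HOMOGLYPHS)
    for brand, real_domain in KNOWN_BRANDS.items():
        is_real = host == real_domain or host.endswith("." + real_domain)
        if brand in normalized and not is_real:
            return real_domain
    return None

print(lookalike_of("amaz0n-secure-login.com"))  # prints amazon.com
print(lookalike_of("www.amazon.com"))           # prints None
```

A check like this says nothing about the second case, a fake page hosted on a legitimate form or file-sharing service, which is why filters also look at what the destination page does.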

URL scanners inside mail filters check links in real time against threat intelligence feeds. If a URL was flagged by any other mail system in the last six hours as delivering a credential-harvesting page, that information propagates. By the time the same link arrives in your inbox, it may already have a thousand reports against it.

The wrinkle: sophisticated phishing kits rotate links constantly, swapping URLs every few hours to stay ahead of blocklists. Filters counter this by sandboxing, following the link in a safe isolated environment and watching what the destination page actually does. Does it load a login form that doesn't match the claimed sender's real website? Does it try to drop a cookie mimicking a banking session? That behaviour gets flagged even if the URL itself is brand new.
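A heavily simplified version of the first question, "does this page show a login form on a host that isn't the claimed sender's?", might look like the sketch below. It assumes the sandbox has already fetched the page HTML and final URL; the class and function names are invented, and a real sandbox renders the page in an instrumented browser rather than parsing static markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class PasswordFieldFinder(HTMLParser):
    """Records whether the page contains an <input type="password">."""
    def __init__(self):
        super().__init__()
        self.has_password_field = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "password":
            self.has_password_field = True

def suspicious_login_page(html: str, page_url: str, claimed_domain: str) -> bool:
    """True if the page asks for a password on a host outside the claimed domain."""
    finder = PasswordFieldFinder()
    finder.feed(html)
    if not finder.has_password_field:
        return False
    host = (urlparse(page_url).hostname or "").lower()
    on_claimed = host == claimed_domain or host.endswith("." + claimed_domain)
    return not on_claimed

page = '<form action="/login"><input type="text" name="u"><input type="password" name="p"></form>'
print(suspicious_login_page(page, "https://amaz0n-secure-login.com/signin", "amazon.com"))
# prints True
```

The check depends only on what the page contains and where it lives, not on whether the URL has been seen before, which is the property that lets it catch freshly rotated links.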

What People Get Wrong About This

The folk belief that spam filters are primarily hunting for suspicious words, "FREE!!!", "CLICK NOW", "You have won", needs to die. Keyword filtering was the dominant approach in the early 2000s. Modern filters treat it as a minor signal at best, and they're right to. Phishing operators know exactly which words trigger keyword filters and simply avoid them. A well-crafted phishing email reads perfectly normally.

The real signals are structural and behavioural, not linguistic.

Consider two colleagues, Priya and Marcus, who both signed up for emails from the same outdoor gear retailer. Priya opens and clicks regularly. Marcus has never opened one. To Marcus's mail provider, those emails start to look like unwanted bulk mail, even though they're completely legitimate. His engagement history shapes how his provider scores future messages from that sender. The filter isn't just judging the email. It's judging the relationship between the email and that specific recipient.

This is why the same message lands in Priya's inbox and Marcus's promotions folder at the same moment.
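A toy version of per-recipient scoring makes the split concrete. The weights, the threshold, and the delivery counts for Priya and Marcus are all invented for illustration.

```python
def engagement_score(delivered: int, opens: int, clicks: int,
                     spam_reports: int) -> float:
    """Per-recipient engagement with one sender (hypothetical weights)."""
    if delivered == 0:
        return 0.0
    return (opens + 2 * clicks - 5 * spam_reports) / delivered

def placement(score: float, threshold: float = 0.2) -> str:
    return "inbox" if score >= threshold else "promotions"

priya = engagement_score(delivered=20, opens=12, clicks=5, spam_reports=0)
marcus = engagement_score(delivered=20, opens=0, clicks=0, spam_reports=0)

print(placement(priya))   # prints inbox
print(placement(marcus))  # prints promotions
```

The message and the sender are identical in both calls. Only the recipient's history differs.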

So here's the question worth sitting with: if your own behaviour shapes what your filter trusts, who's actually making the call?

The Honest Limits

Spam filters catch a remarkable amount. Google has reported that Gmail blocks more than 99.9% of spam, phishing, and malware. But the volume of phishing attempts is so enormous that the absolute number slipping through remains significant.

Business email compromise attacks in which a criminal takes over a real employee's legitimate account, rather than spoofing a domain, are the hardest category by far. The sender authentication checks all pass. The IP reputation is clean. The links may not even be malicious yet. The only tells are subtle: unusual sending time, an unfamiliar recipient combination, a request that doesn't match the sender's normal behaviour. Filters are improving at modelling that context. They're not close to solving it.

Filters are probabilistic systems making fast decisions with incomplete information. They're very good at the obvious cases. The non-obvious cases are the ones that matter most.

Which, frankly, is true of most security problems worth worrying about.
