Email is just like physical mail and thankfully just as endearingly human (sometimes).
Once upon a time (1970/80s) I lived on and off in a mystic land called West Germany. Our postal addresses ended with incantations such as BFPO 40.
Around 1985ish my granny sent a Christmas card to us. I should note that she was at this time nearly seventy and sadly suffering from Parkinson's. She addressed the card, in rather crabbed but legible handwriting, to:
Graham and Heath
BFPO 40
My mum's name is abbreviated - her daughter. At that time Rheindahlen (nr Moenchengladbach) had a pretty large contingent of Brits in it - it was HQ (BAOR).
The card arrived well before Chrimbo and it took about a week judging by the post mark, which was pretty normal in those days. She shoved it into a post box in Ipplepen, nr Newton Abbot, Devon and it found its way to an obscure address in another country. I seem to recall she also forgot the stamp but it still got through.
I'm sure mail like that becomes a point of honour to deliver and HM PO and BFPO did the job admirably.
That attitude is how email MTAs are generally designed to work. They cling on to the good old days and sadly the world is a bit shit. Case sensitivity ... lol!
When I was a child I sent a postcard to my grandparents. I forgot to put the house number and addressed the letter to "Oma und Opa" (Grandma and Grandpa). Logically it should not have been delivered successfully.
Thankfully though, the postal worker knew my grandparents had grandchildren and therefore just asked the potential recipients for the names of their grandchildren to determine which grandparents the postcard was addressed to. To me it's still a miracle that it got delivered at all.
Bill Bryson claimed to have received a letter addressed to ‘Bill Bryson, Writer, Yorkshire’.
I have some cousins who live in a small town in Australia where the houses have neither names nor numbers. You just address the envelope to ‘<name>, <street>, <town>’, and it’s the postie’s responsibility to know where everyone lives. (‘Postie’ is the official job title in Australia Post because it’s gender-neutral.)
I lived in mildly rural NZ back in the day and it was the same, addresses were "name, street, RD# (rural delivery route number), town" and your mailbox had your name on the side (and a flag you could put up if you wanted mail collected.)
Some time roughly mid-nineties we got numbers but originally they were just for emergency services, only later were they also for post, but I seem to recall the whole rural delivery system may have changed somehow around then too.
Until 2025 Carmel-by-the-Sea in California had no street addresses. The houses have names or you just have to know who lives in which building. They also didn't have postal delivery, they all had to go to the town post office and pick up their mail.
We have a UK client in the healthcare industry who registered the domain clientname.healthcare, and they rapidly found that the NHS imposed regexes which rejected name@clientname.healthcare emails.
Aside from regexes though, I also think the new TLDs confuse quite a lot of people. name@clientname.healthcare just doesn't click as an email address as quickly as name@clientname.com, and I'm in tech so I'm sure it's much more confusing for people outside that space.
In fact, that reminds me that we built a site for another client for use inside an exhibition space which was spacename.house and against our advice they put that - without www or https:// - on exhibition panels for use on mobile phones. I am absolutely convinced that most people didn't realise it was a web address.
For years I've had a catch-all subdomain to give out addresses like company@sub.domain.tld which makes filtering out the junk when companies invariably sell their email lists or get hacked much easier. It is getting rarer, but I still occasionally run into sign-on forms that don't allow more than one “.” after the @ unless it is due to a recognised two-part country suffix like .co.uk.
I would never use something that isn't a country TLD for email for this reason, I assume there are a lot of bad systems out there that will incorrectly see them as incorrect.
I, too, get so frustrated by + addresses not working that I’ve configured my MDA to rewrite -- (double hyphen) to plus, and use this out of spite on sites that dislike the + variant. I’ve made it impossible to /not/ host my own mail delivery infrastructure now if I want every address I’ve ever given out to still work.
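The commenter's actual MDA configuration isn't shown, so the following is only a minimal sketch of the idea: rewrite the first `--` in the local part to `+` before normal plus-address handling takes over. The function name `rewrite_double_hyphen` is hypothetical.

```python
def rewrite_double_hyphen(address: str) -> str:
    """Turn 'user--tag@example.org' into 'user+tag@example.org'."""
    # Split on the last '@' so that only the local part is rewritten.
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no '@' at all: leave the input untouched
    # Assumption: only the first '--' marks the tag boundary.
    return local.replace("--", "+", 1) + "@" + domain
```

A hyphenated domain is left alone because only the part before the last `@` is touched.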
Although more recently I’ve moved to a catch all domain for throwaway, which is even better. It confuses agents on the phone though when I give my email address as {their company name}@mydomain.com
Yeah, most people don’t understand how the ownership and control varies before and after the @ symbol.
The best one was when someone said they were going to give me excellent service because “you work for corporate”, confusing the company name before vs. after the @ sign. I forget which company it was now but the agent was convinced I must be someone important.
I was torn between explaining and letting them believe it :-)
Most of the time, folks just don’t understand why their company name is in the address and they think it’s a mistake.
To be honest, I do tend to avoid this for anything other than throwaways because it causes too much confusion when I have to phone up, and I’m not really doing it out of a misguided belief it helps with spam (at least, it doesn’t help any more than security by obscurity is unsuitable as a singular defence, but maybe has a tiny role when layered into a broader strategy…)
I find it's convenient for knowing which companies have immediately - and illegally, in this country - sold my details on to third-party spammers. Makes it easy not to do business with them.
How do you deal with sending emails? When I was self hosting my emails would be flagged by Gmail (or any other email providers) so I effectively only had a self hosted inbox, which sucks
Don't use a random IP to host? I use fastmail, even though they're trying to convince me that I need to pay ~$45 now instead of $5/year.
And they sent me an email explaining how grateful I should be that I'm grandfathered in to being able to use my own domain on a "plan" they don't even offer, a plan that didn't offer custom domains.
Well how'd I get all that then? I signed up for fastmail explicitly because $5/yr for custom domains.
Anyhow if you pay a host you're probably fine. Or find someone with an old /24 that's had a /31 or /32 unused for a long while, and no other black marks against the /24. Then use that IP and set up DMARC and all the other new email DNS stuff.
My setup is more complicated than it needs to be for $reasons (I like playing with networking protocols, have my own v6 prefix and ASN etc. and my mail and other important personal services are hosted across multiple sites for redundancy), but any competent VPS host that offers you a static IP - coupled with some DKIM, SPF and DMARC configuration that will take an afternoon - should solve the problem. I rarely touch my home setup and it works fine; mail doesn’t go to reputation black holes and it’s been like this (literally) for decades. I invest in architectural tweaks and improvements perhaps every 5 years.
I do run similar infrastructure professionally for a living, which probably helps with getting it right first time. Competent VPS hosts care about IP reputation for mail; e.g. Hetzner only allows outbound port 25 for “trusted” customers, which somewhat helps with abuse reports. Some hosting providers may even let you relay via their own outbound hosts if you have a VPS with them, which simplifies the operational aspect.
I rarely need to send from the catch all address, but Postfix can easily be configured to allow my user to send from other addresses, and then it’s just a case of adding as an alias in your mail user agent.
I was worried about not being able to send emails, but it seems that as long as you set up SPF/DKIM/DMARC properly you're fine. You may have problems if you're sending from a domestic (residential) IP address though.
For the configuration, the best bet is probably to use a product that makes it easy to configure the above three. There are a few alternatives around, like Stalwart [1] or docker-mailserver (which is little more than your postfix/dovecot/rspamd combo packaged in a container) [2].
Been wanting to do something similar; my only hang-up is coming up with a domain they won't butcher. When I got my passport (I guess it must have been OCR) they butchered my email completely.
People are still doing that? To prevent spam? To "catch" the company leaking/selling your address? Now the spammers know they can likely use anything@domain, and it'll get to your eyeballs in some capacity. Also, companies have no shame anymore, they don't care if you know.
I started doing it when so many sites had broken + aliasing stuff, which I use for filing mail to keep my inbox manageable and actionable, as it was easier to type than my double-hyphen hack described above.
I’m not concerned about the leaking as my address is out there anyway and Bayesian spam filtering is still decent enough, but as an aside, I have had two companies this year whose user databases must have been leaked on the basis of spam received at company-specific addresses. I reported it to their privacy people and pointed out it’s highly unlikely this “spam” originated as their (tiny company name) being chosen by chance by a spammer who figured out my catch all domain.
They never replied, and I probably should have followed up with the local information regulatory commission in each case. Hopefully, my note helped them identify they had a leak and to secure their systems.
In practice they don't do that, apart from spamming a few addresses like office@ or accounting@. If some address starts getting spam I reject everything sent to it. For addresses that are getting spam but need to be public (like contact addresses on a website) I do more aggressive filtering (e.g. I noticed that enforcing that the recipient is actually present in the To/Cc header cuts down a lot of spam).
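The To/Cc check mentioned above could look roughly like this. It is a sketch using Python's standard `email` package, not the commenter's actual filter; `recipient_in_headers` and its arguments are hypothetical names, and the envelope recipient is assumed to be supplied by the mail server.

```python
from email import message_from_string
from email.utils import getaddresses


def recipient_in_headers(raw_message: str, envelope_rcpt: str) -> bool:
    """Return True if the envelope recipient is listed in To or Cc."""
    msg = message_from_string(raw_message)
    # Collect every To and Cc header (a message may carry several).
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    # Assumption: compare case-insensitively, which is pragmatic rather
    # than strictly correct, since local parts may be case-sensitive.
    listed = {addr.lower() for _name, addr in getaddresses(headers)}
    return envelope_rcpt.lower() in listed
```

Legitimate Bcc and mailing-list mail also fails this test, which is presumably why it is reserved for the aggressively filtered public addresses.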
I do but mostly for coordination and comms sharing with my spouse by using group aliases. Summer camp registration, school nurse contact info, car insurance, library holds...all super convenient to get joint notifications for things. And yeah, also to remember who we gave contact info to which we can drop if it gets spammy.
Look up email alias service or something similar, if you aren't looking to self host. I can't recommend the service I use, because I'm grandfathered in to my plan, and their current plans for new customers suck, but there's enough providers out there that you should find something competitive.
If you want to 'self host' on a provider, I think cheap/free options are available from cloudflare, Google, and similar enterprise companies.
If you want to truly self host, I don't have experience, but this guy who does gave a great thorough answer for those who are interested: https://news.ycombinator.com/item?id=48073510
I have a gmail address that at least three other people think is their address. I constantly get emails for the dumb stuff they sign up for. NONE of them ever have an "I didn't request this" link. I mean, I get it. That won't make them money, but oh man is it annoying.
I get these all the time. The most fun was probably when I was given a building layout, door at which to arrive, schedule, and security information to get into a pro sports arena for a game as an employee of some vendor. The least fun was probably when I ended up talking to some drug company’s general counsel about why it’s not okay to send information about a discount program for a specific drug that treats one specific disorder with a bunch of personal information about the patient to an unverified email address. I went on to explain how their tech staff could prevent that, and remind them of the fines and possible jail time involved with HIPAA and HITECH violations.
I have the same with my email address. There seriously exist people out there who think that if they start to give away everywhere an email address, this email address will become theirs. Then there are many service providers and institutions who don't verify an email but simply start sending stuff to it.
The weirdest time was when I got on a girl sorority email list. Told them they got the wrong email a couple times, gave up, and just added a mail filter...
I get scammers using my email to sign up for websites, but they very obviously cannot login to my account. I often wonder what is in it for them. I'm sure someone on HN can tell me!
Commission schemes, possibly. Sign up with their code and they get something out of it. So they submit 10000 harvested addresses, and hope some small % of them think it's something they signed up for and complete the registration process.
You'd think big companies would know better than enlisting spammers to spam on their behalf, but I'm pretty sure Netflix had a scheme like this a few years ago. "Grow at any costs" sites like streaming or social media are probably happy with a tiny bit of plausible deniability for their spamming.
I feel your pain. My gmail address is just my first name, and oh boy, don't half of the people sharing the same first name also think they share my email.
Seriously, that's a huge fricking red flag. Obviously, most of those companies I would never do business with anyway, but this puts it over the line for all the others.
If they don't understand the first thing about validating their putative customers' emails by, you know, sending an email saying "is this really you?" then they've completely proven their technical incompetence.
The worst one is robinhood. I have two different email addresses that different people have used to sign up for robinhood accounts (back when they were giving anybody an account).
Occasionally, I tweak them about sending me shit.
"Sure! Just send us a copy of your photo ID to prove you're not that person."
Nah, bro, you've proven you're clueless, and there's no way I'm sending PII to clueless people.
> It’s likely that more people out there are being filtered by badly-implemented form validation than there are being filtered by their own need of hand-holding.
I wish this was asserted with evidence. The author might suggest this because they have unrealistic views of some users.
> In the year of our lord 2026, you can reasonably expect your users to know how to type their own email address - or even better, auto-input from their OS, browser, keyboard app, or password manager.
This really depends on who your users are.
I have multiple family members who have healthy memory, but can't accurately remember their email address everytime: the localpart, the domain, the syntax, everything.
Sending an email verification isn't sufficient, because if the user has typo'd ".com", they might never receive that email, and the user might never be back, or then have to escalate to support.
Meanwhile, if a site is opinionated on TLDs, they might prevent those users facing issues.
I'm sure there are many sites where users have a large variety of odd email addresses, but also there are sites that cater to mostly non-technical users within 1-2 locales, and so may find the friendliest UX is having opinionated validation.
That's why the article says "verify, not validate". Send an email, have a process for them to confirm they received it.
If the user gets the email and completes the validation, the email is valid. If they fucked up, they don't get the email and the account never gets created.
No one ever gets prevented from creating an account with a legitimate email address, as opposed to "opinionated validation" where that absolutely will happen. Speaking from years of experience having a .info domain which isn't even all that odd, and at one point using gmail-style + addresses regularly. "Opinionated validation" has forced me to use my .com domain without a plus dozens of times.
I know part of this is intentional, those who know they plan to sell your email addresses don't want you to use the plus addresses, but that doesn't make the advice to not filter addresses any less correct.
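A minimal sketch of the "verify, not validate" flow argued for above, assuming an HMAC-signed confirmation token. All names here (`SECRET_KEY`, `looks_deliverable`, `make_token`, `confirm`) are hypothetical, and token expiry and the actual sending of the email are left out.

```python
import hashlib
import hmac
import secrets

# Assumption: a real service would load a persistent secret instead.
SECRET_KEY = secrets.token_bytes(32)


def looks_deliverable(address: str) -> bool:
    """The only up-front check: something, an '@', then something."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and domain)


def make_token(address: str) -> str:
    """Token to embed in the confirmation link mailed to the address."""
    return hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256).hexdigest()


def confirm(address: str, token: str) -> bool:
    """The account is created only if the mailed token comes back."""
    return hmac.compare_digest(make_token(address), token)
```

Nothing here has an opinion on TLDs or plus signs; a mistyped address simply never produces a confirmation.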
While I’m opposed to opinionated validation as well, you seem to be missing the issue it tries to solve, which is the user mistyping their email address, not receiving the verification email, and either thinking everything is fine, or thinking that the process is borked, and in any case not proceeding and not becoming your customer. The goal of opinionated validation is to inform the user about an incorrect email address immediately when they are entering it, so they can correct it right away.
Indeed, “do you really mean that?” would be useful, though I would always have the user type the correction themselves, because too many users would select “Yes” without thinking or attentively verifying.
> I have multiple family members who have healthy memory, but can't accurately remember their email address everytime: the localpart, the domain, the syntax, everything.
I got Gmail early enough that I have (my first name) dot (my last name) at gmail dot com. About twenty years ago, I started getting strange emails. At first I thought they were spam, because they were addressed to me by name but I had never joined those sites. Eventually I figured out that they were addressed to (my first name) (my last name) at gmail dot com. Which Gmail treats as the same address as the one with a dot in between.
Since I had never ever given out a version of my email address without a dot in the middle, I eventually figured out that these emails were meant for someone else who shared the same first and last name as me. But since I don't think Gmail would allow one person to register john.example@gmail.com and then later allow someone else to register johnexample@gmail.com, my name doppelganger must have registered firstnamelastname@yahoo.com, and then forgot the domain and gave out firstnamelastname@gmail.com when asked for an email address. And probably never noticed that they weren't receiving emails like "Dear customer, thank you for purchasing (product). Would you like to try (other product)?", so they never realized that they were giving out the wrong email address.
I also have first.last@gmail.com (which I don’t use anymore, and just keep around), I get all kinds of private mails. Contracts, invoices, confidential material, private photos.
And of course, also automated signup mails, newsletters (which I make sure to block and report as spam, unsubscribing is a feature for newsletters that are opt-in), transactional mails etc.
People really suck at knowing what their e-mail is. The private mails are down to 1/month, the others to ~3/week, but it used to be much higher for both categories.
Oh and of course there is some kind of weird scam going on where spammers on German classifieds (Kleinanzeigen) send an e-mail to firstlast@gmail.com for whatever public first and last name of the lister is, and ask if the product is still available. No link, nothing. And all sent via gmail which has by an overwhelming majority become the biggest sender of spam for me. I guess they are trying to get someone to reply and then do some manual scam or something.
Randall estimates in the alt-text of https://xkcd.com/1279/ that there's about ¾ of a million people who just use somebody else's email on gmail without realizing it's not their email address.
Yes, and the MX check is pretty simple to implement.
But it is still lots more complicated than copying some imperfect email address regex, and for many sites, it's unlikely to even be worth spending much more effort than that.
Realistically, many sites can de facto choose to accept only email addresses matching a few patterns. If a user's email address happens to be rejected, then they are either a non-technical user who quickly learns that they need a more commonly accepted email address, or a techie, who keeps a backup email address for these cases, and rightfully holds a grudge.
Most sites just aren't going to care enough to do anything more complex, for annoyed techies.
See also, IPv6 support.
And yes, I get annoyed if a site doesn't accept my domain-under-a-less-common-tld, or doesn't support IPv6. :)
Technically you don't need an MX record to receive mail. From RFC 5321:
> If an empty list of MXs is returned, the address is treated as if it was associated with an implicit MX RR, with a preference of 0, pointing to that host.
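The implicit-MX rule quoted above can be sketched as a pure function. The DNS lookup itself is assumed to have happened elsewhere: `mx_records` is a hypothetical list of `(preference, host)` pairs, and the check that the domain actually has address records is left out.

```python
from typing import List, Tuple


def delivery_targets(domain: str, mx_records: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Hosts to try for delivery, lowest preference value first."""
    if not mx_records:
        # Empty MX list: treat the domain itself as an implicit MX
        # with preference 0, as the quoted passage describes.
        return [(0, domain)]
    return sorted(mx_records)
```

So an "MX check" that rejects every domain without MX records turns away some addresses that can receive mail.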
>I have multiple family members who have healthy memory, but can't accurately remember their email address everytime: the localpart, the domain, the syntax, everything.
But you can't do anything about that except asking them to validate their address with an email.
If you can catch 50% of user errors with some complex regex, but the other 50% of such errors are uncaught, is that of any benefit during software design? No, because you still have to solve that problem, probably with email validation by code. You have reduced your workload by 0%, you just split it into 2 parts (unnecessarily).
> If you can catch 50% of user errors with some complex regex, but the other 50% of such errors are uncaught, is that of any benefit during software design? No, because you still have to solve that problem, probably with email validation by code. You have reduced your workload by 0%, you just split it into 2 parts (unnecessarily).
In your example, the benefit is that users recover from the error 50% of the time at the time of registration, so it doesn't interrupt their workflow. Further, the fallback case (of contacting support, or enacting email validation, if a site chooses to implement) will see a dropoff in successful onboarding.
It is absolutely beneficial to catch 50% of errors earlier than you otherwise could. If validation fails the user is notified immediately. If you don't validate, the user has to wait a bit in case the mail is just delayed.
This is all old hat, unfortunately, and also a thing which will be gotten wrong by developers for years to come. Just shouting 'give me a regex for validating email addresses' will make an LLM like ChatGPT happily output bullshit suggesting some overlong regex which is flawed precisely as outlined by the linked article, even though no one is arguing for those long unmaintainable regexes once they've seen the light.
Ah well.
Where there is still room for improvement is in how email addresses are often made a little bit anonymous by a lot of websites. Did you ever see something like 'j*h@gmail.com'? Oh wow, that neatly leaves out John Smith's full name! Like showing only the last four numbers of an IBAN or credit card.
Except for us edge cases with a personal domain, where I then get 'm*l@myfullname.nl'. So stop that. Store it next to the bit of knowledge about validating email addresses — the bits of knowledge you use to correct junior developers and senior idiots.
I just tried this with Claude Opus 4.8 and I don't see any of those issues:
The first sentence is that there is no single regex that perfectly validates every technically valid email address. I think that is a good start.
It then recommends the regex used for <input type="email"> and explains that this would cover the majority of email addresses used by actual people. It also shows an improved regex that handles dot-atom local parts, quoted strings, domain names, and IPv4 domain literals, but doesn't cover things such as comments, full IPv6 literals, or internationalized addresses.
It ends with the only correct advice (in my opinion): Send a confirmation email.
Does it say 'don't bother with a regex beyond checking it contains an @ surrounded by arbitrary pieces of text?' This still sounds like it is leading developers to conclude that they should use a too complex regex and then send a confirmation email.
Claude Sonnet says:
> A practical email regex that covers the vast majority of real-world addresses:
>
> ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
Which is still way more complex than needed (and takes effort to read), and buggy according to years of blog posts written about this topic.
Of course the problem is the developer asking for a regex at all, but the must-regex-email instinct seems heavily engrained in our collective psyche.
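To make the "buggy" claim concrete, here is a small check of the pattern quoted above against a few hand-picked addresses. The pattern is copied from the comment; the sample addresses are illustrative assumptions, not taken from the thread.

```python
import re

# The "practical" pattern quoted above, unchanged.
PATTERN = re.compile(r"^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$")

# Syntactically valid addresses that the pattern rejects.
valid_but_rejected = [
    "o'brien@example.com",       # apostrophe is allowed in a local part
    '"john smith"@example.com',  # quoted local part
    "user@[192.0.2.1]",          # IPv4 address literal
    "用户@例子.公司",              # internationalized address
]

# Malformed addresses that the pattern accepts.
accepted_but_malformed = [
    "a..b@example.com",   # consecutive dots in an unquoted local part
    "user@example..com",  # empty label in the domain
]

for address in valid_but_rejected + accepted_but_malformed:
    print(address, bool(PATTERN.match(address)))
```

The pattern fails in both directions, so it is neither a safe allow-list nor a reliable typo catcher.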
> In that sense, it’s actually pretty surprising that so much of the world’s population wasn’t able to put their own name, in its native written form, in an email address until just 14 years ago.
Maybe for some internal usages. But imagine someone from a country using a different language and characters gives me a card with their email.
It's now far less portable for me to use it.
These days, I could surely take a picture of it and most likely get the email right.
But email as a means of international communication, like a passport, should be as readable as possible or it kills its purpose.
Even with ASCII emails I have, I already sometimes struggle to pass them over phone or other methods :)
> Maybe for some internal usages. But imagine someone from a country using a different language and characters gives me a card with their email. It's now far less portable for me to use it. These days, I could surely take a picture of it and most likely get the email right.
It would be more portable for use with their peers who speak the same language, rather than requiring that everyone they want to communicate with in their own language and alphabet understands a second alphabet just for the addressing scheme.
What if the agreed upon international standard alphabet didn't happen to be the one you natively write with? If the world agreed to write all email addresses in katakana, that would work just as well as ASCII, right? I have to ask this because a lot of people confuse "single international character set" with "single international character set that happens to be my one." If you'd also be okay with katakana, then you're consistent.
Agreed, (a subset of) ASCII as the lingua franca of identifiers is very useful. Almost all languages managed pretty well with ASCII-encodings of their special characters even if some individuals choose to be offended.
These are waaay too complicated. Web developers can't even handle the easy stuff. My email address is of the form sean@foo.bar.baz, and email address validators on websites reject my address about 30% of the time because it has two periods.
I'd probably also have a red warning line under the input field for something really fishy and also most common typos (like "gmail.con") but other than that, I'd let it through.
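A non-blocking typo hint like the one described above could be sketched with the standard library's `difflib`. The domain list, the 0.8 cutoff, and the name `suggest_domain` are all assumptions chosen for illustration.

```python
import difflib
from typing import Optional

# Assumption: a short, site-specific list of domains users commonly have.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com", "hotmail.com"]


def suggest_domain(address: str) -> Optional[str]:
    """Return a likely intended domain, or None if nothing looks off."""
    _local, sep, domain = address.rpartition("@")
    domain = domain.lower()
    if not sep or domain in COMMON_DOMAINS:
        return None
    # Only near misses of a well-known domain trigger a hint.
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=0.8)
    return matches[0] if matches else None
```

This has to stay a warning rather than a rejection, because real domains such as mail.com are also near misses of gmail.com.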
Realistically, the length of the domain part is likely ultimately constrained by how large a domain name can fit into a DNS/UDP query packet (alongside EDNS0).
Just had to update this this week - a previous dev had used {2,4} and someone came through complaining with a six character domain suffix. Apparently 24 or so is the current limit for a real domain suffix.
Even that's not the true length limit of a label in the Domain Name System. (RFC 1034 § 3, for the curious.) So someone is likely going to be fixing that, years down the line. Then of course there's the fact, as called out earlier, that there can be more than 2 labels in a domain name.
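If a length check on the domain is wanted at all, the DNS limits are the ones that hold regardless of which suffixes exist: at most 63 octets per label and roughly 253 characters for the whole name in text form. A minimal sketch, with `plausible_domain` as a hypothetical name and the two-label minimum as an assumption that would wrongly reject single-label hosts:

```python
def plausible_domain(domain: str) -> bool:
    """Length-only sanity check; says nothing about whether it exists."""
    # For internationalized names the limits apply to the encoded
    # ASCII form, which this sketch does not compute.
    if len(domain) > 253:
        return False
    labels = domain.split(".")
    # Any number of labels is fine as long as none is empty or too long.
    return len(labels) >= 2 and all(1 <= len(label) <= 63 for label in labels)
```

This accepts both `foo.bar.baz` and `example.consulting` without a hard-coded suffix length.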
An e-mail address can have multiple @ also for... source routing. Of course it doesn't make any sense nowadays, but it's technically allowed. RFC 5321 gives an example:
@hosta.int,@jkl.org:userc@d.bar.org
This is a valid e-mail address, with source-routing along two intermediate servers. I guess no sane server on the Internet will accept this, but you never know... (this said, I remember attempting this around 1996, when many servers were open relays, and the message was happily delivered after passing through 3-4 servers).
I registered a ".consulting" domain for my little company when they became available, and it has proved highly problematic ever since. Strangely (or perhaps not) it seems to be the larger players that have the most problems. I would at least have expected ISPs and comms companies to keep up with this (looking at you, Three)
> Every reasonable variation of the company name as a .com/.net/.org was taken, including <companyname>company.com
That also means that customers WILL confuse your company with others in non-domain contexts so perhaps it's a good idea to choose a more unique company name.
While on the sending side you should accept any domain, it's IMO irresponsible to use nuTLDs for pretty much anything as they are privately owned and you have zero recourse when the owner decides to change the deal on you.
I have a relatively good email address, and more than a handful of people who don't seem to understand email, just use my address... I've had payment confirmations from mlb.com orders, to tractor supply receipts and junk mail, to student loan paperwork. It's amazing how much garbage I see all because nobody actually confirms email address ownership before signing people up for crap.
The worst is some foreign gambling site, I can't even log into to change the preferences and cancel the account.
Though, I did deface then delete someone's dating profile once, who signed up on an app with my email...
Add the lie "emails are delivered instantly, so the user can click a link I email them within 1 minute"
And the lie "users always read emails on the same device they're logging into a website with"
And the lie "users can always view HTML email so no need to send a plaintext equivalent, especially if I have a long complex URL I want them to click"
And the lie "Clickable links sent in email are more secure than passwords so I'll stop supporting passwords and instead rely on email delivery of a link for all logins. Whoever clicks that link first is definitely the user who wanted to log in"
If you try to create a Discord account with Firefox Klar as your default browser, on Android, immediately upon signing up you'll be banned. I have to assume this is because it clears cookies and thinks you're a bot farm.
> And the lie "users always read emails on the same device they're logging into a website with"
Or the same browser, or the same browser-profile. For example, on my phone I have external links (from other apps) opening in incognito mode by default.
Claude, for my non-Gmail domain, expects me to click a magic link on every device I wish to use it on. It's wild that a product like that cannot take a password, or a passkey.
I'm surprised that this has not triggered all of the reminiscences of sitting running mailq at intervals for hours to watch mail that hasn't even left the local sending machine yet.
> Clickable links sent in email are more secure than passwords so I'll stop supporting passwords and instead rely on email delivery of a link for all logins
God, I fucking hate that.
I have a fucking password manager, I have various machines and things open. Just let me fucking log in.
If anyone is reading this who is in charge of the internet please stop doing this.
I seem to spend half my life logging into things, confirming 2FA, confirming biometric data. Then when I go back to the first thing it's timed out and I have to sign in again.
It is with much hesitation that I write this, because I just implemented such a flow.
My reasoning was this: my customers keep forgetting their password and somehow that becomes a trigger to contact me. No passwords, no problem.
I tried convincing them to use password managers but that was pointless.
But I see the pain and frustration so I will add passwords. And I quite liked the passkey idea, have to see how that works. Not that my customers would ever use it, but I would. It literally never occurred to me.
To be clear, no shade on actual devs faced with actual problems. My ire is reserved exclusively for the "we must do this because it is on the checklist, no I don't understand what a subnet is" people.
A lot of those same people seemed perfectly capable of insisting on 60 day password rotation back when they could use NIST guidance as an authority to appeal to (for about five years after the recommendation changed too).
Specifically the revocation of such guidance. If the field gave even the slightest deference to empiricism we wouldn't be changing our password every 180 days, but here we are.
So agreed. It’s fucking crazy. Password manager is so much easier and more secure. If you do this dumb email or SMS OTP flow, at LEAST support passkeys for my password manager!
It’s wild that they’re like “it’s more secure to not have a password” and then choose two unencrypted delivery mechanisms for the very short OTP.
Sure, people who reuse passwords are not secure. And fair, I guess it’s a tragedy of the commons. But at least continue supporting it and make it dead simple for password managers if you actually care about security.
I thought the same for a long time but now I don't know. If your computer is compromised, they can exfiltrate your password, but with a hardware key they can't, so I think that's legitimately more secure than password+OTP. It still needs a PIN though to protect against device theft.
I bring this up because there's been a ton of compromised developer packages recently and Windows itself is being attacked so even if you're pretty good about protecting yourself, you still might get screwed.
I don't think it should be the sites' responsibility to guess whether the browser session is on the device that will receive an SMS message... The fact that it is SMS is already bad anyway.
Time-code apps or passkeys are a different story.
1. You should be able to make backups.
2. There's nothing to intercept in plaintext.
3. They all can (unlike SMS features) be locked down by default and require a second layer of unlocking, so that they usually aren't accessible to someone who grabs your phone out of your hand.
It absolutely should be the bank's concern when this is how 99% of their customers will use it. Some even have deliberate integration between the banking and 2FA apps.
> Note: I have struggled to verify this one, and it’s possible I’m actually misreading the RFC.
Is correct, you can have quoted local parts and (I guess?) theoretically "foo"@mail and foo@mail should even be treated the same.
But practically this is a dead feature and probably should be treated as non existing.
AFAIK `[<ip-address>]` mails are used by some old data centers for delivering automatically generated "error" mails from Unix servers in a way which doesn't break when DNS is down.
Also interestingly the `[..]` syntax has a generic extension hook, and that hook allows usage of @ characters. So technically a `foo@[custom:@@@@@@@@]` is a valid mail address, just no one knows how to deliver it ;). (And `custom` must be registered with IANA, theoretically).
From my reading through the RFCs a few months ago the message and smtp envelope also have different rules for addresses, and the message allows the local-part to contain whitespace but the envelope doesn't.
> Punycode [...] and the local-part was still limited to ASCII.
the funny part is this is only half true
The true part: Punycode has never been standardized for the local part, and as such taking an email address with non-US-ASCII characters in the local part and Punycode-encoding it is fundamentally wrong.
But: Nothing prevents you from having a local part which "happens" to look like Punycode, and especially in the early SMTPUTF8 days many providers which did allow non-US-ASCII local parts automatically created an "alias" email address where the local part was Punycode-encoded. Nothing in the standard prevents this, and as a consequence Punycode-encoding a local part _might_ just happen to work for some subset of non-US-ASCII emails.
Another one is that you can tell “professional” from “personal” email addresses or that every address even cleanly fits into just one category.
A lot of small business owners use gmail or a longstanding ISP account. A lot of people have personal email addresses you can’t easily distinguish from professional ones, between college alumni addresses, personal domains, and obscure ISP and email providers that aren’t in your database.
Validation to avoid mistakes is, as they point out, good. I'd even go so far as to extend it so that I reject those without any tld (without any dots) just because it's 99.999% a mistake and I don't care about the person who has ben@net. I'd also reject ip numbers.
Next is the spicy take: I need to consider WHY I am gathering this email?
If I'm gathering it for "marketing purposes" or any such cross correlation to other systems, then I'd also reject bob.smith+dontspamme@gmail.com. Or I'd keep both so you can do cross referencing on both the + address and the "raw" one.
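The "keep both" option could look roughly like this. It is only a sketch: the function name is made up, and treating "+" as a tag separator is a provider convention (Gmail and some others), not something every mail system honours, so the stripped form is a guess rather than a guaranteed alias.

```python
def split_plus_tag(address: str) -> tuple[str, str]:
    """Return (address_as_entered, address_without_plus_tag).

    Hypothetical helper: assumes the provider treats everything after
    the first "+" in the local part as a tag, which is not universal.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" at all: nothing sensible to strip.
        return address, address
    base_local = local.split("+", 1)[0]
    if not base_local:
        # Local part starts with "+": leave it alone.
        return address, address
    return address, f"{base_local}@{domain}"


raw, base = split_plus_tag("bob.smith+dontspamme@gmail.com")
print(raw)   # address to actually send to
print(base)  # address to use for cross-referencing only
```

Mail should still go to the address exactly as the user typed it; the stripped form is only useful as a lookup key.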
Compared to sending a mail or to a customer not getting a mail they wanted?
> Try to keep it as non-restrictive as possible. Something like ^[^@]+@[^@\s]+$, which only makes sure your user has input “something@something”
Requiring a dot in the domain part is perfectly valid. It makes no sense to not validate that the address is in a format that you can actually send something to, which includes a domain that you can look up and isn't specifically rejected by your MTA.
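For what it's worth, the two positions are easy to combine. A minimal sketch, assuming you want the article's permissive pattern plus an optional "domain must contain a dot" rule (the function name and the `require_dot` flag are mine, not from the article):

```python
import re

# The permissive pattern quoted from the article: "something@something".
PERMISSIVE = re.compile(r"^[^@]+@[^@\s]+$")


def looks_deliverable(address: str, require_dot: bool = True) -> bool:
    """Cheap pre-check to run before sending a verification mail."""
    if not PERMISSIVE.fullmatch(address):
        return False
    domain = address.rsplit("@", 1)[1]
    # Ignore leading/trailing dots so "net." does not count as dotted.
    if require_dot and "." not in domain.strip("."):
        return False
    return True


print(looks_deliverable("john.doe@example.com"))
print(looks_deliverable("ben@net"))
print(looks_deliverable("ben@net", require_dot=False))
```

Note that the dot rule also rejects bracketed IP literals that contain no dot and dotless TLD addresses, which is exactly the trade-off being argued about here.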
> This belief will probably be more commonly held in the English-speaking world, but I’m curious: If you’re not in the Anglosphere, do you still expect emails to require ASCII latin characters?
Yes, I do not trust Unicode with all its ambiguities and alternate forms to resolve to the same identifier on your end as the one I intended. ASCII-only email addresses are the norm everywhere I have seen.
It's not lies. And it's not about me either. If I collect an email address, it will be used somewhere, someday, in god knows what app. If I'm the one collecting the email, I will make it as restrictive as possible so that it doesn't cause issues down the line. If it's too different from John.Doe_123@example.com, it's best to reject it.
For robust systems the goal was never to let users type any technically valid email. It is to allow only emails that will not cause issues in the future.
Good article. Worth noting C# standard library handles most of that complexity, no regular expressions required. Call System.Net.Mail.MailAddress.TryCreate, if successful read Address property to find the normalised address.
There is one more 'lie' missing from the article, which only looks at the limits of valid destination addresses.
When an address is used as the sender's source address there are even fewer limits.
For example you can use the null address <> when sending. It is used a bit less these days than it used to be. It has been used for ages for SMTP delivery status notifications, mail loop prevention and so on, where it intentionally makes no sense to expect anyone to reply. All the well-known MTAs forward it, and email clients handle it very well by disabling reply for that message.
There is however a catch for anyone who thinks he can now start using it whenever he doesn't want a reply. Ever since IT Service Management (ITSM) and Service Desk software appeared, they have had issues with email coming from the <> sender, because they like to always add the email addresses of received messages to a database, so that whoever handles the message can reply. I've only used a few, Service Now (SN) more lately and Issue Tracker (IT) before that, and as of about a year and a half ago neither knew how to handle null sender addresses. Both seemed to just discard those emails or sort them into some trash bin. Our SN sysadmin couldn't find where they went in that system.
But otherwise <> as a sender works great. And sure, it would be great if the folks making ITSM software got this fixed, because role aliases such as postmaster are quite often handled by ITSM software, so there is a good chance you won't get some important notifications from systems that rely on the null sender address.
ps. Search Google for: smtp and sender address as "<>" for more info in case it's needed.
I enjoyed the deep dive. A lot of sensible advice. A lot of articles do not get as much of this right as this article does.
[Old man voice] Back in my day these kinds of articles loved pointing out that, well, the email address could be a UUCP address and that's a whole different parsing situation.
Of course, even then in the mid-90s, UUCP was not something one really encountered outside of "so you think you're going to parse an email address with regexp?!" articles.
Oh, and there were more than just UUCP bang paths.
IBM Memo, Novell NetWare and other groupware, and X.400, also required odd email conventions for routing. VAX VMS addresses had % left-side routing in them too.
IIRC in terms of clients mutt (&co) will actually handle “@” in the local part correctly.
> But the real reason I do that is just because I just like to sit in anger whenever this breaks the user experience because of programming errors or inconsistencies.
Genuinely delighted by the fact that I’m not alone in that.
I would like to point out that the "suggested" validation pattern, ^[^@]+@[^@\s]+$, can filter out valid addresses. "user@something"@example.com is a valid address, and excluding @'s in the user part rejects it.
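A quick way to see the problem, plus one possible workaround: split on the last "@" instead of forbidding "@" in the local part. This is just a sketch (the helper name is made up), and it does not check that a quoted local part is well-formed.

```python
import re
from typing import Optional, Tuple

# The pattern suggested in the article.
ARTICLE_PATTERN = re.compile(r"^[^@]+@[^@\s]+$")


def split_at_last_at(address: str) -> Optional[Tuple[str, str]]:
    """Split into (local_part, domain) at the LAST "@".

    Hypothetical helper: only checks that both sides are non-empty and
    that the domain has no whitespace.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None
    if any(ch.isspace() for ch in domain):
        return None
    return local, domain


quoted = '"user@something"@example.com'
print(ARTICLE_PATTERN.fullmatch(quoted))  # None: the pattern rejects it
print(split_at_last_at(quoted))
```

Whether it is worth accepting such addresses is a separate question; the point is only that the "non-restrictive" pattern is not as non-restrictive as advertised.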
Maybe I'm taking this too lightly but honestly, if you're playing games with your email address and then don't get my verification mail, it's kind of a you problem.
If your email address contains non-printable unicode characters or an IP address as the domain part, I don't really care enough to add support just for you.
And surely everyone who does this has a "normal" email as a fallback anyways.
Many email servers forget about email addresses with IP literals, which are for people who self-host without paying for DNS.
mailbox@[x.x.x.x] and mailbox@[ipv6:...] (and probably without "ipv6" prefix once ipv4 is gone).
This is stronger than SPF since the second the IP of the sending SMTP server does not match the IP in the "from" headers and the envelope, the email is dropped, not even going into spam.
For instance, currently, if I send an email to a gmail slave, their parsers will ask for... a DNS PTR record, Oo "Geniuses" at work, or conveniently breaking all interop with small tech?
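For anyone who does want to recognise the bracketed address-literal form mentioned above, here is a rough sketch using Python's standard `ipaddress` module. The function name is made up, and it only covers the two forms named in this thread (`[x.x.x.x]` and `[IPv6:...]`), not the generic tagged extension.

```python
import ipaddress
from typing import Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_address_literal(domain: str) -> Optional[IPAddress]:
    """Parse "[192.0.2.1]" or "[IPv6:2001:db8::1]"; return None otherwise."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return None
    inner = domain[1:-1]
    try:
        if inner.lower().startswith("ipv6:"):
            # Strip the "IPv6:" tag before parsing.
            return ipaddress.IPv6Address(inner[len("ipv6:"):])
        # No tag: only an IPv4 dotted quad is accepted here.
        return ipaddress.IPv4Address(inner)
    except ValueError:
        return None


print(parse_address_literal("[192.0.2.1]"))
print(parse_address_literal("[IPv6:2001:db8::1]"))
print(parse_address_literal("[2001:db8::1]"))  # untagged IPv6 is rejected
```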
Don't just put a link into your mail that directly verifies an email when visited. At least put some button or code input field there.
Why? There are mail clients that will automatically open links for users, and if the link is then no longer valid, the user is confused when they click it themselves.
Or, even easier, just make the call idempotent. The user doesn’t know anything and doesn’t have extra clicks, and it doesn’t matter much if the mail client actually did the “confirming” given it’s proven the email address is valid at that point.
The token was recently used? No problem! Must be a duplicate click, or a refresh, or the user left the browser tab open and their mobile device refreshed when they reopened the browser app, etc.
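Roughly what I mean by idempotent, as an in-memory sketch. The class and method names are invented for illustration, and a real version would add token expiry and persistent storage.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class EmailVerifier:
    """Toy in-memory store; names and structure are illustrative only."""
    tokens: dict = field(default_factory=dict)   # token -> address
    verified: set = field(default_factory=set)   # confirmed addresses

    def issue(self, address: str) -> str:
        token = secrets.token_urlsafe(32)
        self.tokens[token] = address
        return token

    def confirm(self, token: str) -> bool:
        address = self.tokens.get(token)
        if address is None:
            return False  # unknown token
        # The token is kept rather than deleted, and set.add is a no-op
        # for an existing member, so a repeat visit (prefetching mail
        # client, double click, reopened tab) gives the same result.
        self.verified.add(address)
        return True


verifier = EmailVerifier()
token = verifier.issue("user@example.com")
print(verifier.confirm(token))  # first visit, maybe by the mail client
print(verifier.confirm(token))  # second visit, by the actual user
```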
Also, much more critically: just because mail is successfully delivered does not mean it is in the right inbox. So a link merely being visited by automation is far from enough to confirm that the right person received the mail.
I think most of these issues are easy to resolve by being more permissive and supporting what the technical standard allows for.
The Big Problem™ however is case sensitivity in the local-part, because multiple incompatible things collide there:
1. Users are not universally aware of case (in)sensitivity in one direction or the other
2. Existing systems may or may not interpret case at all
My preferred solution would be to adjust the standard to ignore case in the local part by forcing it to lowercase. That aligns with most of the systems and the mental model of technically proficient users anyway. It makes much more sense from a UX standpoint since the goal is to be unambiguous.
If we were to enforce the opposite: case sensitivity in the local part this would have multiple downsides:
1. It is inconsistent with itself by making the local part case sensitive but the host part not, that is harder to explain
2. You have to train users to be precise about case on entry. As someone who worked in IT support, this is a very bad idea. This includes second-order issues like phishing attacks using sibling emails where just the case differs
3. If your service stores email addresses it will need to know whether that specific Mailserver/client/etc treats the email as case-sensitive or not
In my eyes email servers that allow case sensitive local-parts are functionally broken, even if they don't break any rules.
There are two parts where the case (in)sensitive distinction matters:
- what case you use to send mails
- what rules you use to determine if two email strings are the same user
For the first you can and should always use the address exactly as entered. For the second that's going to be a guess anyway and the exact rules depend on what false positives and false negatives mean for your use case - and you are going to have at least one of those two. Assuming case insensitivity here is generally reasonable for most use cases.
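A small sketch of that split, under the assumption that case-insensitive comparison is acceptable for your use case (the class and method names are made up):

```python
from typing import Optional


def identity_key(address: str) -> str:
    """Key used ONLY to decide whether two strings are the same user.

    Assumption: lowercasing the whole address is an acceptable guess;
    a case-sensitive mail server would produce false matches here.
    """
    return address.lower()


class AddressBook:
    def __init__(self) -> None:
        self._by_key: dict = {}  # identity key -> address as entered

    def register(self, address: str) -> bool:
        key = identity_key(address)
        if key in self._by_key:
            return False  # treated as an existing user
        self._by_key[key] = address
        return True

    def send_to(self, address: str) -> Optional[str]:
        # Always send to the spelling the user originally entered.
        return self._by_key.get(identity_key(address))


book = AddressBook()
book.register("John.Doe@Example.com")
print(book.register("john.doe@example.com"))  # duplicate under this rule
print(book.send_to("JOHN.DOE@EXAMPLE.COM"))   # original spelling is kept
```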
This article says that Gmail can't handle address literals. I personally wrote the IPv6 address literal support for Gmail, so this annoys me. I just tested it and it shortened "[IPv6:2001:etc:etc::192.etc.etc]" down to "@2001" then generated an extremely terse mail delivery subsystem notification that I've never seen before. Which is why you should never just rewrite software without understanding why all the test cases are in the test suite!
Could they have consciously chosen to remove that functionality?
E.g. to simplify code, or if they wanted all mails to have a domain (if, for example, they wanted to integrate with reputation systems that were domain oriented)?
Based on the incredibly basic bounce message, I suspect the problem is that the frontend eats the address before it even gets to delivery.
To your question, yes any product decision is possible, but enterprise/government people are surprisingly demanding about this stuff working because they have extremely weird requirements for routing mail to and through legacy systems. So I bet this still works at the mailer level and is broken in the UI.
I chuck IP address literals (both IPv4 and IPv6) on the list of things that you should care about for email if you're writing an MTA or an MUA but should otherwise generally not care about supporting if you're using email for something else (e.g., as a UID for login).
> but enterprise/government people are surprisingly demanding about this stuff working because they have extremely weird requirements for routing mail to and through legacy systems. So I bet this still works at the mailer level and is broken in the UI
I'm trusting this is a throwaway example and that you used a real IPv6 address literal in this test, without the "IPv6" and with only colons and no dots (unless you mean to use v4 mapped address with dots)? Because this IPv6 literal is so malformed that I'm hardly expecting it to do something sane and changing that to "@2001" is nasal-demons quality undefined behavior. I tried with this exact literal and it let me send it but then there was a tiny red pop-up at the top of the gmail interface that said "could not be delivered, check your network connection" (which is odd; the same kind of pop-up that appears in gray when you legitimately are not connected to the internet) and it ended up in my drafts with the To: field empty.
I just tried to send a message to a "test@[" my current IPv6 address "]", and gmail told me
Error
The address "test@[«redacted»]" in the "To" field was not recognized.
Please make sure that all addresses are properly formed.
This address doesn't have an MDA listening on it, but it didn't accept it enough to give me a non-delivery notification, it didn't even let me send it. gmail did accept an IPv4 address literal in brackets, although it hasn't given me back a non-delivery notification. What it stuffed into my Sent folder for this message has the square brackets stripped and the IPv4 address appears right after the @.
I own a domain I use for email and I have it configured to deliver ANY address that ends with @mydomain. This works like + addressing on steroids. I can have website@mydomain or recipient@mydomain and it makes filtering much easier.
"Regex is hard, regex wizardry is rare, and regex engine implementations are inconsistent. It’s very, very easy to accidentally get it wrong without realizing it."
The what now? I'm struggling to take this seriously because a decade ago regexes were common knowledge, like "if you don't have a handle on this you should probably go get a job in marketing" levels of common knowledge. Has the profession fallen off this far in ten years?
> TL;DR: Don't overthink it, just send a verification email.
pretty bad advice, if taken only as written, without adding more flavor on top.
the major email providers will penalize you if you generate too many undeliverable emails. thus, if you just send a verification email without any pre-validation, it's pretty easy to get into a DoS situation where current/valid users don't get important email sent to them, or that email is significantly delayed, plus incur huge operating cost to resolve the problem.
some form of rate limiting is needed, plus IMHO it's better to use a verifier service or your own heuristic or ML model to test for email validity including valid but fake/spammy/disposable addresses.
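to make the rate limiting part concrete, here is a minimal sliding-window sketch. the class name, the limits and the idea of keying on something like client IP or account are all my own assumptions, and as others point out a simple limiter like this will not stop a sophisticated attacker on its own.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class VerificationRateLimiter:
    """Allow at most max_sends verification mails per key per window."""

    def __init__(self, max_sends: int = 3, window_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_sends = max_sends
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._sent = defaultdict(deque)  # key -> timestamps of recent sends

    def allow(self, key: str) -> bool:
        now = self.clock()
        recent = self._sent[key]
        # Forget sends that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_sends:
            return False
        recent.append(now)
        return True


limiter = VerificationRateLimiter(max_sends=2, window_seconds=60.0)
for attempt in range(3):
    print(attempt, limiter.allow("203.0.113.7"))
```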
sorry, but we are way past the point of being able to have nice things, esp. when we're talking about email.
the "lies" part of the content is great. people do assume all those wrong things. however the TLDR is just wrong, and potentially harmful.
I think the only way to deal with that right now is to hire a company whose job is to deal with it. They'll random-check your outgoing emails are indeed what you say they are, and they maintain a reputation with the big providers for checking it properly.
What pre-validation could you do that would possibly be useful?
Wait! Are you saying that you process new registration attempts without any rate limit, captcha, etc? Because the moment to filter out (or limit) bad actors is before they submit an email address, not through it.
Yeah, good luck with that. Captchas are basically useless in today’s world, so are IP rate limits for anything just a little sophisticated. Of course it helps, but if you think this solves all problems, you live in a dream world.
This is cute and all. But for anyone coming here for real-world advice: just use a regex, normalize to lowercase, and surface any errors to users so they know if their email got rejected. This will avoid 99.9% of issues and work for 100% of real human users. This is what everyone else does, and if you have a user with an esoteric email, they will still be able to furnish another one that passes this validation.
Verify all email address entries before you start using it... I absolutely HATE how much garbage I get because a few people don't understand you actually have to get an email address before you start using whatever you like.
I have some cousins who live in a small town in Australia where the houses have neither names nor numbers. You just address the envelope to ‘<name>, <street>, <town>’, and it’s the postie’s responsibility to know where everyone lives. (‘Postie’ is the official job title in Australia Post because it’s gender-neutral.)
Some time roughly mid-nineties we got numbers but originally they were just for emergency services, only later were they also for post, but I seem to recall the whole rural delivery system may have changed somehow around then too.
https://www.mjt.me.uk/posts/falsehoods-programmers-believe-a...
* https://news.ycombinator.com/from?site=mjt.me.uk
Aside from regexes though, I also think the new TLDs confuse quite a lot of people. name@clientname.healthcare just doesn't click as an email address as quickly as name@clientname.com, and I'm in tech so I'm sure it's much more confusing for people outside that space.
In fact, that reminds me that we built a site for another client for use inside an exhibition space which was spacename.house and against our advice they put that - without www or https:// - on exhibition panels for use on mobile phones. I am absolutely convinced that most people didn't realise it was a web address.
So if people just remember spacename.house then that might be enough.
My dad tends to skip the TLD part as well. The results usually work. When they do not, he gets very confused.
The Internet is really a gold rush for scammers.
For years I've had a catch-all subdomain to give out addresses like company@sub.domain.tld which makes filtering out the junk when companies invariably sell their email lists or get hacked much easier. It is getting rarer, but I still occasionally run into sign-on forms that don't allow more than one “.” after the @ unless it is due to a recognised two-part country suffix like .co.uk.
I would never use something that isn't a country TLD for email for this reason, I assume there are a lot of bad systems out there that will incorrectly see them as incorrect.
having a 10-letter tld is self-harming
Although more recently I’ve moved to a catch all domain for throwaway, which is even better. It confuses agents on the phone though when I give my email address as {their company name}@mydomain.com
Yeah, most people don’t understand how the ownership and control varies before and after the @ symbol.
It gives me the fuzzies every time to explain I own the domain, and every email address on it is mine.
Since I have it on my phone, I can usually receive the email that they send very quickly and prove that everything's working fine.
I am wondering how hard it is to do this again today with a new domain though.
I was torn between explaining and letting them believe it :-)
Most of the time, folks just don’t understand why their company name is in the address and they think it’s a mistake.
To be honest, I do tend to avoid this for anything other than throwaways because it causes too much confusion when I have to phone up, and I’m not really doing it out of a misguided belief it helps with spam (at least, it doesn’t help any more than security by obscurity is unsuitable as a singular defence, but maybe has a tiny role when layered into a broader strategy…)
And they sent me an email explaining how grateful I should be that I'm grandfathered in to being able to use my own domain on a "plan" they don't even offer, in a plan that didn't offer custom domains.
Well how'd I get all that then? I signed up for fastmail explicitly because $5/yr for custom domains.
Anyhow if you pay a host you're probably fine. Or find someone with an old /24 that's had a /31 or /32 unused for a long while, and no other black marks against the /24. And use that IP, set up DMARC and all the other new email DNS stuff.
I do run similar infrastructure professionally for a living, which probably helps with getting it right first time. Competent VPS hosts care about IP reputation for mail; e.g. Hetzner only allows outbound port 25 for “trusted” customers, which somewhat helps with abuse reports. Some hosting providers may even let you relay via their own outbound hosts if you have a VPS with them, which simplifies the operational aspect.
I rarely need to send from the catch all address, but Postfix can easily be configured to allow my user to send from other addresses, and then it’s just a case of adding as an alias in your mail user agent.
I was worried about not being able to send emails, but it seems that as long as you properly set up SPF/DKIM/DMARC you're fine. You may have problems if using a domestic address though.
For the configuration, the best bet is probably to use a product that makes it easy to configure the above three. There are a few alternatives around, like Stalwart [1] or docker-mailserver (which is little more than your postfix/dovecot/rspamd combo packaged in a container) [2]
[1] https://github.com/stalwartlabs/stalwart
[2] https://github.com/docker-mailserver
People are still doing that? To prevent spam? To "catch" the company leaking/selling your address? Now the spammers know they can likely use anything@domain, and it'll get to your eyeballs in some capacity. Also, companies have no shame anymore, they don't care if you know.
I’m not concerned about the leaking as my address is out there anyway and Bayesian spam filtering is still decent enough, but as an aside, I have had two companies this year whose user databases must have been leaked on the basis of spam received at company-specific addresses. I reported it to their privacy people and pointed out it’s highly unlikely this “spam” originated as their (tiny company name) being chosen by chance by a spammer who figured out my catch all domain.
They never replied, and I probably should have followed up with the local information regulatory commission in each case. Hopefully, my note helped them identify they had a leak and to secure their systems.
If you want to 'self host' on a provider, I think cheap/free options are available from Cloudflare, Google, and similar enterprise companies.
If you want to truly self host, I don't have experience, but this guy who does gave a great thorough answer for those who are interested: https://news.ycombinator.com/item?id=48073510
This is one of my favorite articles on validating emails using RegEx, I fondly remember reading it over 15 years ago. It's stuck with me ever since.
You'd think big companies would know better than enlisting spammers to spam on their behalf, but I'm pretty sure Netflix had a scheme like this a few years ago. "Grow at any costs" sites like streaming or social media are probably happy with a tiny bit of plausible deniability for their spamming.
> I mean, I get it.
I don't.
Seriously, that's a huge fricking red flag. Obviously, most of those companies I would never do business with anyway, but this puts it over the line for all the others.
If they don't understand the first thing about validating their putative customers' emails by, you know, sending an email saying "is this really you?" then they've completely proven their technical incompetence.
The worst one is robinhood. I have two different email addresses that different people have used to sign up for robinhood accounts (back when they were giving anybody an account).
Occasionally, I tweak them about sending me shit.
"Sure! Just send us a copy of your photo ID to prove you're not that person."
Nah, bro, you've proven you're clueless, and there's no way I'm sending PII to clueless people.
I wish this was asserted with evidence. The author might suggest this because they have unrealistic views of some users.
> In the year of our lord 2026, you can reasonably expect your users to know how to type their own email address - or even better, auto-input from their OS, browser, keyboard app, or password manager.
This really depends on who your users are.
I have multiple family members who have healthy memory, but can't accurately remember their email address every time: the localpart, the domain, the syntax, everything.
Sending an email verification isn't sufficient, because if the user has typo'd ".com", they might never receive that email, and the user might never be back, or then have to escalate to support.
Meanwhile, if a site is opinionated on TLDs, they might prevent those users facing issues.
I'm sure there are many sites where users have a large variety of odd email addresses, but there are also sites that cater to mostly non-technical users within 1-2 locales, and so may find the friendliest UX is having opinionated validation.
If the user gets the email and completes the validation, the email is valid. If they fucked up, they don't get the email and the account never gets created.
No one ever gets prevented from creating an account with a legitimate email address, as opposed to "opinionated validation" where that absolutely will happen. Speaking from years of experience having a .info domain which isn't even all that odd, and at one point using gmail-style + addresses regularly. "Opinionated validation" has forced me to use my .com domain without a plus dozens of times.
I know part of this is intentional, those who know they plan to sell your email addresses don't want you to use the plus addresses, but that doesn't make the advice to not filter addresses any less correct.
"Did you mean layer8@gmail.com instead of layer8@gmailc0m [Y][N]".
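A prompt like that can be a suggestion rather than a rejection. Here is a sketch using the standard library's `difflib`; the domain list and the 0.8 cutoff are arbitrary choices of mine, and the user should always be able to answer [N] and keep what they typed.

```python
import difflib
from typing import Optional

# Illustrative list only; a real site would pick domains its users have.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_correction(address: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a "did you mean" address, or None if there is nothing to suggest."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return None
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    if not matches:
        return None  # not close to anything known: accept as typed
    return f"{local}@{matches[0]}"


print(suggest_correction("layer8@gmailc0m"))
print(suggest_correction("layer8@example.org"))
```

Unlike opinionated validation, nobody with an unusual but real domain gets locked out; the worst case is one extra click.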
I got Gmail early enough that I have (my first name) dot (my last name) at gmail dot com. About twenty years ago, I started getting strange emails. At first I thought they were spam, because they were addressed to me by name but I had never joined those sites. Eventually I figured out that they were addressed to (my first name) (my last name) at gmail dot com. Which Gmail treats as the same address as the one with a dot in between.
Since I had never ever given out a version of my email address without a dot in the middle, I eventually figured out that these emails were meant for someone else who shared the same first and last name as me. But since I don't think Gmail would allow one person to register john.example@gmail.com and then later allow someone else to register johnexample@gmail.com, my name doppelganger must have registered firstnamelastname@yahoo.com, and then forgot the domain and given out firstnamelastname@gmail.com when asked for an email address. And probably never noticed that they weren't receiving emails like "Dear customer, thank you for purchasing (product). Would you like to try (other product)?", so they never realized that they were giving out the wrong email address.
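The dot behaviour described above can be sketched like this. The function name is made up, the rule is applied only to gmail.com because that is the only provider this comment is about, and the lowercasing is an extra assumption of mine.

```python
def gmail_dot_key(address: str) -> str:
    """Collapse dots in the local part for gmail.com addresses only.

    Other providers treat dots as significant, so this rule must not
    be applied to them.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address
    if domain.lower() == "gmail.com":
        local = local.replace(".", "")
    # Lowercasing is an additional assumption, not part of the dot rule.
    return f"{local}@{domain}".lower()


print(gmail_dot_key("john.example@gmail.com"))
print(gmail_dot_key("johnexample@gmail.com"))
print(gmail_dot_key("john.example@yahoo.com"))
```

The first two calls give the same key, which is why the dotted and undotted spellings land in one inbox; the third is left alone.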
And of course, also automated signup mails, newsletters (which I make sure to block and report as spam, unsubscribing is a feature for newsletters that are opt-in), transactional mails etc.
People really suck at knowing what their e-mail is. The private mails are down to 1/month, the others to ~3/week, but it used to be much higher for both categories.
Oh and of course there is some kind of weird scam going on where spammers on German classifieds (Kleinanzeigen) send an e-mail to firstlast@gmail.com for whatever public first and last name of the lister is, and ask if the product is still available. No link, nothing. And all sent via gmail which has by an overwhelming majority become the biggest sender of spam for me. I guess they are trying to get someone to reply and then do some manual scam or something.
But it is still lots more complicated than copying some imperfect email address regex, and for many sites, it's unlikely to even be worth spending much more effort than that.
Realistically, many sites can de facto choose to accept only email addresses that match a few patterns. If a user's email address happens to be rejected, then they are either a non-technical user who quickly learns that they need a more commonly accepted email address, or a techie, who keeps a backup email address for these cases, and rightfully holds a grudge.
Most sites just aren't going to care enough to do anything more complex, for annoyed techies.
See also, IPv6 support.
And yes, I get annoyed if a site doesn't accept my domain-under-a-less-common-tld, or doesn't support IPv6. :)
> If an empty list of MXs is returned, the address is treated as if it was associated with an implicit MX RR, with a preference of 0, pointing to that host.
I don't know if most MTAs allow this though.
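To make the quoted rule concrete, here is a small sketch of the fallback logic only. It assumes the DNS lookup has already happened and its result is passed in as (preference, host) pairs; the function name is hypothetical and this is not how any particular MTA implements it.

```python
from typing import List, Tuple


def delivery_targets(domain: str, mx_records: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Order the hosts to try for a domain, lowest preference value first.

    mx_records is whatever an MX lookup returned; doing the lookup itself
    is out of scope for this sketch.
    """
    if not mx_records:
        # Empty MX list: treat the domain itself as an implicit MX
        # with preference 0, as in the quoted text.
        return [(0, domain)]
    return sorted(mx_records)


print(delivery_targets("example.org", []))
print(delivery_targets("example.org", [(20, "backup.example.org"), (10, "mx.example.org")]))
```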
But you can't do anything about that except asking them to validate their address with an email.
If you can catch 50% of user errors with some complex regex, but the other 50% of such errors are uncaught, is that of any benefit during software design? No, because you still have to solve that problem, probably with email validation by code. You have reduced your workload by 0%, you just split it into 2 parts (unnecessarily).
In your example, the benefit is that users recover from the error 50% of the time at the time of registration, so it doesn't interrupt their workflow. Further, the fallback case (of contacting support, or enacting email validation, if a site chooses to implement) will see a dropoff in successful onboarding.
Ah well.
Where there is still room for improvement is in how email addresses are often made a little bit anonymous by a lot of websites. Did you ever see something like 'j*h@gmail.com'? Oh wow, that neatly leaves out John Smith's full name! Like showing only the last four numbers of an IBAN or credit card.
Except for us edge cases with a personal domain, where I then get 'm*l@myfullname.nl'. So stop that. Store it next to the bit of knowledge about validating email addresses — the bits of knowledge you use to correct junior developers and senior idiots.
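A rough sketch of the kind of masking being complained about, assuming the site keeps the first and last character of the local part and leaves the domain alone. The function is hypothetical; real sites will differ in detail.

```python
def mask_local_part(address: str) -> str:
    """Star out the middle of the local part; the domain is left as-is."""
    local, _, domain = address.rpartition("@")
    if len(local) <= 2:
        masked = "*" * len(local)
    else:
        masked = f"{local[0]}*{local[-1]}"
    return f"{masked}@{domain}"


print(mask_local_part("johnsmith@gmail.com"))   # j*h@gmail.com
print(mask_local_part("mail@myfullname.nl"))    # m*l@myfullname.nl
```

With a shared provider the name is hidden; with a personal domain the part that identifies the person is exactly the part left visible.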
The first sentence is that there is no single regex that perfectly validates every technically valid email address. I think that is a good start.
It then recommends the regex used for <input type="email"> and explains that this would cover the majority of email addresses used by actual people. It also shows an improved regex that handles dot-atom local parts, quoted strings, domain names, and IPv4 domain literals, but doesn't cover things such as comments, full IPv6 literals, or internationalized addresses.
It ends with the only correct advice (in my opinion): Send a confirmation email.
Claude Sonnet says:
> A practical email regex that covers the vast majority of real-world addresses:
>
> ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
Which is still way more complex than needed (and takes effort to read), and buggy according to years of blog posts written about this topic.
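For anyone who wants to see what "buggy" means here, a quick check of the quoted pattern against a few hand-picked inputs. The sample addresses are my own choice, not from any of those blog posts.

```python
import re

# The pattern exactly as quoted above.
QUOTED = re.compile(r"^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$")

samples = [
    "o'brien@example.com",   # apostrophe is legal in a local part, but rejected
    "user@[192.0.2.1]",      # IPv4 address literal, rejected
    "user@example..com",     # empty domain label, yet accepted
    ".user.@example.com",    # leading and trailing dot in local part, accepted
]

for address in samples:
    print(f"{address!r}: {'accepted' if QUOTED.match(address) else 'rejected'}")
```

So it has both false negatives and false positives, which is the usual outcome for this family of patterns.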
Of course the problem is the developer asking for a regex at all, but the must-regex-email instinct seems heavily engrained in our collective psyche.
I have no idea what other pay-to-play models say.
Maybe for some internal usages. But imagine someone from a country using a different language and characters gives me a card with their email. It's now far less portable for me to use. These days I could surely take a picture of it and most likely get the email right.
But email as a means of international communication, like a passport, should be as readable as possible or it defeats its purpose.
Even with the ASCII emails I have, I already sometimes struggle to pass them on over the phone or by other methods :)
It would be more portable for use with their peers who speak the same language, rather than requiring that everyone they want to communicate with in their own language and alphabet understands a second alphabet just for the addressing scheme.
I registered a ".consulting" domain for my little company when they became available, and it has proved highly problematic ever since. Strangely (or perhaps not) it seems to be the larger players that have the most problems. I would at least have expected ISPs and comms companies to keep up with this (looking at you, Three).
It was also a bloody nuisance. Spam filters were one thing but there were so many validation forms that failed.
Every reasonable variation of the company name as a .com/.net/.org was taken, including <companyname>company.com
Ugh, what a nightmare.
Domain holders are the landed gentry of tomorrow if we keep this up.
Even then it seems better to come up with a different prefix, or suffix (or both!), just to stay with ‘.com’.
Of course hindsight is 20/20 and I did the same, my personal homepage used to have a ‘.xyz’ address.
That also means that customers WILL confuse your company with others in non-domain contexts so perhaps it's a good idea to choose a more unique company name.
The worst is some foreign gambling site; I can't even log in to change the preferences and cancel the account.
Though, I did deface then delete someone's dating profile once, who signed up on an app with my email...
And the lie "users always read emails on the same device they're logging into a website with"
And the lie "users can always view HTML email so no need to send a plaintext equivalent, especially if I have a long complex URL I want them to click"
And the lie "Clickable links sent in email are more secure than passwords so I'll stop supporting passwords and instead rely on email delivery of a link for all logins. Whoever clicks that link first is definitely the user who wanted to log in"
Or the same browser, or the same browser-profile. For example, on my phone I have external links (from other apps) opening in incognito mode by default.
Most other providers I've used range from instant to a few minutes.
God, I fucking hate that.
I have a fucking password manager, I have various machines and things open. Just let me fucking log in.
If anyone is reading this who is in charge of the internet please stop doing this.
My reasoning was this: my customers keep forgetting their password and somehow that becomes a trigger to contact me. No passwords, no problem.
I tried convincing them to use password managers but that was pointless.
But I see the pain and frustration so I will add passwords. And I quite liked the passkey idea, have to see how that works. Not that my customers would ever use it, but I would. It literally never occurred to me.
It’s wild that they’re like “it’s more secure to not have a password” and then choose two unencrypted delivery mechanisms for the very short OTP.
Sure, people who reuse passwords are not secure. And fair, I guess it’s a tragedy of the commons. But at least continue supporting it and make it dead simple for password managers if you actually care about security.
OTP can be used with a password.
1. Enter username (e.g. an email)
2. Choose from either email or SMS on file
3. Enter the code you got somehow through the respective unencrypted channel
Given that this same site is involved with bank-account details for payment, I am concerned...
Yeah, lose the phone and it’s pretty much game over.
Time-code apps or passkeys are a different story.
1. You should be able to make backups.
2. There's nothing to intercept in plaintext.
3. They all can (unlike SMS features) be locked down by default and require a second layer of unlocking, so that they usually aren't accessible to someone who grabs your phone out of your hand.
I have many ways to generate totp codes. All of them are vastly more convenient than sending me an email or sms
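Part of why there are so many ways to generate them is that the algorithm is tiny. A minimal sketch of time-based codes along the lines of RFC 6238, using only the standard library and assuming HMAC-SHA1 with a 30-second step; real authenticator apps also handle base32 secrets and clock drift, which are left out here.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute a time-based one-time code from a shared secret."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Test vector from RFC 6238 (SHA-1, 8 digits, T = 59 seconds).
print(totp(b"12345678901234567890", 59, digits=8))  # 94287082
```

Nothing is sent over email or SMS; both sides derive the code from the shared secret and the current time.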
Is correct, you can have quoted local parts and (I guess?) theoretically "foo"@mail and foo@mail should even be treated the same.
But practically this is a dead feature and probably should be treated as non existing.
AFAIK `[<ip-address>]` mails are used by some old data centers for delivering automatically generated "error" mails from Unix servers in a way which doesn't break when DNS is down.
Also interestingly the `[..]` syntax has a generic extension hook, and that hook allows usage of @ characters. So technically a `foo@[custom:@@@@@@@@]` is a valid mail address, just no one knows how to deliver it ;). (And `custom` must be registered with IANA, theoretically).
though the message does allow an additional display name (like `display name <email>`) which has its own rules.
the funny part is this is only half true
The true part: Punycode has never been standardized for the local part, and as such taking an email address with non-us-ascii characters in the local part and punycode encoding it is fundamentally wrong.
But: Nothing prevents you from having a local part which "happens" to look like punycode, and especially in the early SMTPUTF8 days many providers which did allow non-us-ascii email local parts automatically created an "alias" email address where the local part was punycode encoded. Nothing in the standard prevents this, and as a consequence punycode encoding a local part _might_ just happen to work for some subset of non-us-ascii emails.
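A small sketch of the "true part": convert only the domain to its ASCII form and leave the local part alone. It uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules, so treat it as an illustration rather than a production converter; the function name is made up.

```python
def ascii_domain_only(address: str) -> str:
    """Encode the domain to its ASCII (xn--) form; never touch the local part."""
    local, _, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(ascii_domain_only("jörg@bücher.example"))  # jörg@xn--bcher-kva.example
```

The local part still contains non-ASCII characters afterwards, so delivering to the result still needs SMTPUTF8 support along the way.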
A lot of small business owners use gmail or a longstanding ISP account. A lot of people have personal email addresses you can’t easily distinguish from professional ones, between college alumni addresses, personal domains, and obscure ISP and email providers that aren’t in your database.
Next is the spicy take: I need to consider WHY I am gathering this email?
If I'm gathering it for "marketing purposes" or any such cross correlation to other systems, then I'd also reject bob.smith+dontspamme@gmail.com. Or I'd keep both so you can do cross referencing on both the + address and the "raw" one.
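A sketch of the "keep both" option, assuming the provider treats `+` as a tag separator (not all of them do). The type and function names are hypothetical.

```python
from typing import NamedTuple


class StoredAddress(NamedTuple):
    raw: str        # exactly what the user typed; use this for sending
    untagged: str   # local part with any +tag removed; for cross-referencing


def split_plus_tag(address: str) -> StoredAddress:
    local, _, domain = address.rpartition("@")
    base = local.partition("+")[0]
    if not base:
        # A local part that starts with "+" has no base to fall back on.
        base = local
    return StoredAddress(raw=address, untagged=f"{base}@{domain}")


print(split_plus_tag("bob.smith+dontspamme@gmail.com"))
```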
Compared to sending a mail or to a customer not getting a mail they wanted?
> Try to keep it as non-restrictive as possible. Something like ^[^@]+@[^@\s]+$, which only makes sure your user has input “something@something”
Requiring a dot in the domain part is perfectly valid. It makes no sense to not validate that the address is in a format that you can actually send something to, which includes a domain that you can look up and isn't specifically rejected by your MTA.
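The two positions side by side, as a sketch: the quoted permissive pattern, plus an optional check for a dot in the domain. The function name and the `require_dot` switch are made up for illustration.

```python
import re

# The permissive pattern quoted above.
PERMISSIVE = re.compile(r"^[^@]+@[^@\s]+$")


def looks_sendable(address: str, require_dot: bool = True) -> bool:
    """Cheap pre-check before sending a confirmation mail."""
    # fullmatch avoids Python's "$ also matches before a trailing newline" quirk.
    if not PERMISSIVE.fullmatch(address):
        return False
    if require_dot:
        domain = address.rpartition("@")[2]
        return "." in domain
    return True


for candidate in ["user@example.com", "user@localhost", "no-at-sign", "a@b@c"]:
    print(candidate, looks_sendable(candidate), looks_sendable(candidate, require_dot=False))
```

Whether `user@localhost` should pass is exactly the disagreement: it is syntactically fine, but most public sites cannot deliver to it.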
> This belief will probably be more commonly held in the English-speaking world, but I’m curious: If you’re not in the Anglosphere, do you still expect emails to require ASCII latin characters?
Yes, I do not trust Unicode with all its ambiguities and alternate forms to resolve to the same identifier on your end as the one I intended. ASCII-only email addresses are the norm everywhere I have seen.
yeah, that is a pretty bizarre claim, as if millions of accounts are created per second
frankly this claim makes me think this article is LLM generated, because while the claim is technically correct, it's not a real concern
For robust systems the goal was never to allow users to type any technically valid email. It is to allow only emails that will not cause issues in the future.
But if used as a sender's source address there are even fewer limits.
For example, you can use the null address <> when sending. It is used a bit less these days than it used to be. For ages it has been used for SMTP delivery status notifications, mail loop prevention and so on, where there is intentionally not much sense in expecting anyone to reply. All well-known MTAs forward it, and email clients handle it very well by disabling reply for that message.
There is, however, a catch for anyone who thinks they can now start using it whenever they don't want a reply. Ever since IT Service Management (ITSM) and service desk software appeared, they have had issues with email coming from the <> sender, because they like to always add the email addresses of received messages to a database so that whoever handles the ticket can reply. I've used only a few, ServiceNow (SN) more lately and Issue Tracker (IT) before that, and as of about a year and a half ago neither knew how to handle null sender addresses. Both seemed to just discard those emails or sort them into some trash bin. Our SN sysadmin couldn't find where they went in that system.
But otherwise <> as a sender works great. And sure, it would be great if the folks making ITSM software would get this fixed, because when postmaster and similar role aliases are handled by ITSM software, as they quite often are, there is a good chance you won't get some important notifications from systems that rely on the null sender address.
ps. Search Google: smtp and sender address as "<>" for more info in case it's needed.
Anyone who also enjoyed it would probably get a kick out of my article on the same subject that goes into the regex (which has some valid use cases): https://hackernoon.com/on-the-practicality-of-regex-for-emai...
Lies we tell ourselves about users.
Of course, even then in the mid-90s, UUCP was not something one really encountered outside of "so you think you're going to parse an email address with regexp?!" articles.
https://en.wikipedia.org/wiki/UUCP#Mail_routing
Groupware such as IBM Memo and Novell NetWare, as well as X.400, and routing between them also required odd email conventions. VAX VMS addresses had % routing on the left-hand side too.
> But the real reason I do that is just because I just like to sit in anger whenever this breaks the user experience because of programming errors or inconsistencies.
Genuinely delighted by the fact that I’m not alone in that.
I appreciate your commitment to correctness but like [XKCD 1172](https://xkcd.com/1172/) ... the user is clearly in the wrong at this point.
mailbox@[x.x.x.x] and mailbox@[ipv6:...] (and probably without "ipv6" prefix once ipv4 is gone).
This is stronger than SPF since the second the IP of the sending SMTP server does not match the IP in the "from" headers and the envelope, the email is dropped, not even going into spam.
For instance, currently, if I send an email to a gmail slave, their parsers will ask for... a DNS PTR record, Oo "Geniuses" at work, or conveniently breaking all interop with small tech?
I think this is mostly common with Gmail-heavy countries and does not apply to Europe? At least I do not know of anyone that thinks so.
Don't just put a link into your mail that directly verifies an email when visited. At least put some button or code input field there.
Why? There are mail clients that will automatically open links for users, and if the link is then invalid, the user is confused when they click it themselves.
The token was recently used? No problem! Must be a duplicate click, or a refresh, or the user left the browser tab open and their mobile device refreshed when they reopened the browser app, etc.
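A sketch of how those two points could fit together, with everything in memory and all names hypothetical: opening the link only shows a page, pressing the button confirms, and confirming twice is harmless.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Set, Tuple


class EmailVerifier:
    """In-memory illustration; a real service would persist this state."""

    def __init__(self, ttl_seconds: float = 24 * 3600, clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}  # token -> (email, issued_at)
        self.verified: Set[str] = set()

    def issue(self, email: str) -> str:
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (email, self._clock())
        return token

    def _lookup(self, token: str) -> Optional[str]:
        entry = self._tokens.get(token)
        if entry is None:
            return None
        email, issued_at = entry
        if self._clock() - issued_at > self._ttl:
            return None
        return email

    def landing_page(self, token: str) -> bool:
        """GET: safe for link scanners and prefetchers, never verifies anything."""
        return self._lookup(token) is not None

    def confirm(self, token: str) -> bool:
        """POST from the button: idempotent, so a repeat click still succeeds."""
        email = self._lookup(token)
        if email is None:
            return False
        self.verified.add(email)
        return True
```

The token stays valid until it expires instead of being burned on first use, so an automatic fetch, a refresh or a second click all end on the same success page.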
The Big Problem™ however is case sensitivity in the local-part, because there multiple incompatible things collide:
1. Users are not universally aware of case (in)sensitivity in one direction or the other
2. Existing systems may or may not interpret case at all
My preferred solution would be to adjust the standard to ignore case in the local part by forcing it to lowercase. That aligns with most of the systems and mental model of technically proficient users anyways. It makes much more sense from a UX standpoint since the goal is to be unambiguous.
If we were to enforce the opposite: case sensitivity in the local part this would have multiple downsides:
1. It is inconsistent with itself by making the local part case sensitive but the host part not, that is harder to explain
2. You have to train users to be precise about case on entry. As someone who worked in IT-support, this is a very bad idea. This includes second-order issues like phishing attacks by sibling emails where just the case differs
3. If your service stores email addresses it will need to know whether that specific mail server/client/etc treats the email as case-sensitive or not
In my eyes email servers that allow case sensitive local-parts are functionally broken, even if they don't break any rules.
- what case you use to send mails
- what rules you use to determine if two email strings are the same user
For the first you can and should always use the address exactly as entered. For the second that's going to be a guess anyway and the exact rules depend on what false positives and false negatives mean for your use case - and you are going to have at least one of those two. Assuming case insensitivity here is generally reasonable for most use cases.
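A tiny sketch of keeping those two questions apart, assuming case-insensitive matching is acceptable for the use case. The class and property names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    as_entered: str

    @property
    def send_to(self) -> str:
        # Always deliver to exactly what the user typed.
        return self.as_entered

    @property
    def match_key(self) -> str:
        # A guess used only for "is this the same user?" comparisons.
        local, _, domain = self.as_entered.rpartition("@")
        return f"{local.lower()}@{domain.lower()}"


a = Email("John.Doe@Example.COM")
b = Email("john.doe@example.com")
print(a.send_to)                   # John.Doe@Example.COM
print(a.match_key == b.match_key)  # True
```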
E.g. to simplify code, or if they wanted all mails to have a domain (if, for example, they wanted to integrate with reputation systems that were domain oriented)?
To your question, yes any product decision is possible, but enterprise/government people are surprisingly demanding about this stuff working because they have extremely weird requirements for routing mail to and through legacy systems. So I bet this still works at the mailer level and is broken in the UI.
Interesting context, thanks.
I'm trusting this is a throwaway example and that you used a real IPv6 address literal in this test, without the "IPv6" and with only colons and no dots (unless you mean to use v4 mapped address with dots)? Because this IPv6 literal is so malformed that I'm hardly expecting it to do something sane and changing that to "@2001" is nasal-demons quality undefined behavior. I tried with this exact literal and it let me send it but then there was a tiny red pop-up at the top of the gmail interface that said "could not be delivered, check your network connection" (which is odd; the same kind of pop-up that appears in gray when you legitimately are not connected to the internet) and it ended up in my drafts with the To: field empty.
I just tried to send a message to a "test@[" my current IPv6 address "]", and gmail told me
This address doesn't have an MDA listening on it, but it didn't accept it enough to give me a non-delivery notification, it didn't even let me send it. gmail did accept an IPv4 address literal in brackets, although it hasn't given me back a non-delivery notification. What it stuffed into my Sent folder for this message has the square brackets stripped and the IPv4 address appears right after the @.
https://datatracker.ietf.org/doc/html/rfc5321#section-4.1.3
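For reference, a sketch of parsing the two common address-literal forms from RFC 5321 section 4.1.3: a bare IPv4 address in brackets, or an IPv6 address with an "IPv6:" tag in brackets. It leans on Python's `ipaddress` module, which is slightly more lenient than the RFC grammar, and the function name is hypothetical.

```python
import ipaddress
from typing import Optional, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_address_literal(domain: str) -> Optional[IPAddress]:
    """Return the IP inside a bracketed domain literal, or None if it isn't one."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return None
    body = domain[1:-1]
    try:
        if body[:5].lower() == "ipv6:":
            return ipaddress.IPv6Address(body[5:])
        return ipaddress.IPv4Address(body)
    except ipaddress.AddressValueError:
        return None


print(parse_address_literal("[192.0.2.1]"))
print(parse_address_literal("[IPv6:2001:db8::1]"))
print(parse_address_literal("[2001:db8::1]"))  # no tag, so not accepted here
```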
The what now? I'm struggling to take this seriously because a decade ago regex were common knowledge, like if you don't have a handle on this you should probably go get a job in marketing levels of common knowledge. Has the profession fallen off this far in ten years?
Functionally there's no false positives or false negatives
pretty bad advice, if taken only as written, without adding more flavor on top.
the major email providers will penalize you if you generate too many undeliverable emails. thus, if you just send a verification email without any pre-validation, it's pretty easy to get into a DoS situation where current/valid users don't get important email sent to them, or that email is significantly delayed, plus incur huge operating cost to resolve the problem.
some form of rate limiting is needed, plus IMHO it's better to use a verifier service or your own heuristic or ML model to test for email validity including valid but fake/spammy/disposable addresses.
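as one possible shape for the rate limiting part, a sliding-window sketch that caps verification mails per key (client IP, account, whatever fits). the class name and the limits are assumptions, and it keeps state in memory only.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class VerificationRateLimiter:
    """Allow at most max_sends verification mails per key within a time window."""

    def __init__(self, max_sends: int = 3, window_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_sends
        self._window = window_seconds
        self._clock = clock
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        sent = self._sent[key]
        # Forget sends that have fallen out of the window.
        while sent and now - sent[0] >= self._window:
            sent.popleft()
        if len(sent) >= self._max:
            return False
        sent.append(now)
        return True


limiter = VerificationRateLimiter(max_sends=2, window_seconds=60)
print([limiter.allow("203.0.113.7") for _ in range(3)])  # [True, True, False]
```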
sorry, but we are way past the point of being able to have nice things, esp. when we're talking about email.
the "lies" part of the content is great. people do assume all those wrong things. however the TLDR is just wrong, and potentially harmful.
What pre-validation could you do that would possibly be useful?
I suspect the rate at which new users may try to create new accounts and type a wrong email address is too low to be noticed by reputation metrics.