
I was responsible for Stripe's API abstractions, including webhooks and /events, for a number of years. Some interesting tidbits:

Many large customers eventually had some issue with webhooks that required intervention. Stripe retries webhooks that fail for up to 3 days: I remember $large_customer coming back from a 3 day weekend and discovering that they had pushed bad code and failed to process some webhooks. We'd often get requests to retry all failed webhooks in a time period. The best customers would have infrastructure to do this themselves off of /v1/events, though this was unfortunately rare.
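
For anyone who wants to be one of those "best customers": below is a minimal sketch of what a backfill job against /v1/events could look like, assuming the stripe-python library; `process_event` and `log_for_manual_review` are hypothetical handlers on your side.

    # Sketch: replay events created in a window (e.g. a bad-deploy weekend).
    # Assumes stripe-python; process_event()/log_for_manual_review() are
    # placeholders for your own handlers.
    import stripe

    stripe.api_key = "sk_live_..."

    def replay_events(start_ts, end_ts):
        events = stripe.Event.list(
            created={"gte": start_ts, "lte": end_ts},
            limit=100,
        )
        for event in events.auto_paging_iter():
            try:
                process_event(event)  # same code path as your webhook handler
            except Exception:
                log_for_manual_review(event.id)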

The biggest challenges with webhooks:

- Delivery: some customers timing out connections at 30s, causing the queues to get backed up (Stripe was much smaller back then).

- Versioning: synchronous API requests can use a version specified in the request, but webhooks, by virtue of rendering the object and showing its changed values (there was a `previous_attributes` hash), need to be rendered to a specific version. This made upgrading API versions hard for customers.

There was constant discussion about building some non-webhook pathway for events, but they all have challenges and webhooks + /v1/events were both simple enough for smaller customers and workable for larger customers.



Shameless plug, but I've built https://hookdeck.com precisely to tackle some of these problems. It generally falls onto the consumer to build the necessary tools to process webhooks reliably. I'm trying to give everyone the opportunity to be the "best customers" as you are describing them. Stripe is a big inspiration for the work.


Do you provide the ability to consume, translate, then forward? I am after a ubiquitous endpoint I can point webhooks at and then translate to the schema of another service and send on. You could then share these 'recipes' and allow customers to reuse well-known transforms.


You can do this with BenkoBot; we just launched custom webhooks (although it's not in the interface yet). So you can receive a webhook, run some arbitrary JavaScript to transform it, then send it on somewhere else:

http://www.benkobot.com/

Our main focus is on handling Trello notifications, and the Trellinator library I wrote is built in. Our objective is to create more API wrappers over time to make it as simple as possible to deal with as many APIs as possible. You can see some example code here:

https://trello.com/b/IoHmhz5c/benkobot-community-board

You currently require a Trello account API key/token to sign up, but you can use it as you described to be a generic endpoint, transform however you want with JS then post the data onto another endpoint.


This is a fairly common use of no-code glue services like Zapier, IFTTT, Cyclr, etc.


Transformations are something we haven't built yet, but we have our eyes on it, as you're not the first to bring that up. You can use Hookdeck in front of Lambda and do the transformation there; you'd still get the benefit of async processing, retries, etc.


Do you have an idea of when Hookdeck will have transformations? It's not something we need immediately, but it would be the deciding factor over something like https://webhookrelay.com/ if it's on your roadmap for sometime soon.


Can you reach out to me? I'd love to talk about your use case and prioritize accordingly. Email is alex at hookdeck dot com.


My app, hookrelay.dev, has transformations today. :)


> We'd often get requests to retry all failed webhooks in a time period.

(I worked on the same team as bkraus, non-concurrently).

For teams that are building webhooks into your API, I'd recommend including UI to view webhook attempts and resend them individually or in bulk by date range. Your customers are guaranteed to have a bad deploy at some point.


At Lawn Love, we naively coupled our listening code directly to the Stripe webhook... but it worked flawlessly for years. I wasn't a big fan of the product changes necessitating us switching from the Transfer API for sending money to the complicated--and very confusing for the lawn pros--Connect product, but its webhooks also ran without issue from the moment we first implemented them. So thanks for making my life somewhat easier, Mr. Krausz.

Like many others, I now pattern my own APIs after Stripe's.


Don’t fully thank me, I was also the architect of the Transfers API to Connect transition :). There’s a lot I would have done differently there were I doing it again, though much of the complexity (e.g. the async verification webhooks) were to satisfy compliance needs. Hard to say how much easier the v1 could’ve been given the constraints at the time, though I’m very impressed with the work Stripe has done since to make paying people easier (particularly Express).


I think the Stripe API stuff you did was fine, but you really did your best work as a concepts of mathematics TA.


Can you share a bit about how these events are stored on Stripes backend e.g. Kafka, Postgres?


It's all just kafka and mongo. The event can be stored in any simple k/v storage. There's no magic.

Edit: not sure why I'm being downvoted. I work at stripe and this is literally how it works.


Hi Basta! Can confirm both that he works at Stripe and is right.

Years ago there wasn't even a Kafka portion, that's newer.


Thanks for the input. We're currently working on a similar solution, so I was really curious to learn more.

One thing I really admire is how Stripe makes it transparent which events were fired, both in general through the Developer area and on specific objects like customers, subscriptions, etc.


Pretty easy for a customer to set up an SQS queue and a Lambda for receiving them rather than relying on their infrastructure to do all the actual receiving. Way more reliable than coupling your code directly to the callback.


This is precisely what we do where I work. We have a service which has just one responsibility - receive webhooks, do very basic validation that they're legitimate, then ship the payload off to an SQS queue for processing. Doing it this way means that whatever's going on in the service that wants the data, the webhooks get delivered, and we don't have to worry about how 3rd party X has configured retries.
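
A rough sketch of that receive-validate-enqueue pattern, assuming Flask, boto3, and Stripe's signature verification standing in for the "basic validation"; the queue URL, endpoint secret, and route are made-up placeholders.

    # Sketch: thin webhook receiver that validates and enqueues, nothing more.
    import boto3
    import stripe
    from flask import Flask, request

    app = Flask(__name__)
    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/webhook-events"
    ENDPOINT_SECRET = "whsec_..."  # webhook signing secret

    @app.route("/webhooks/stripe", methods=["POST"])
    def receive():
        payload = request.get_data(as_text=True)
        signature = request.headers.get("Stripe-Signature", "")
        try:
            stripe.Webhook.construct_event(payload, signature, ENDPOINT_SECRET)
        except (ValueError, stripe.error.SignatureVerificationError):
            return "invalid", 400
        # Hand off immediately; real processing happens off the queue.
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=payload)
        return "", 200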


These reasons are exactly why we started Svix[1] (we do webhooks as a service). I wish we existed to serve you guys back when you started working on it. :)

[1] https://www.svix.com


I always laugh when people end up with designs like this. They could have just used SMTP! It's designed to reliably deliver messages to distributed queues using a loosely-coupled interface while still being extensible. It scales to massive amounts of traffic. It's highly failure-resistant and will retry operations in various scenarios. And it's bi-directional. But it's not "cool" technology or "web-based" so developers won't consider it.

Watch me get downvoted like crazy by all Nodejs developers. Even though they could accomplish exactly what they want with much less code and far less complex systems to maintain.


I pitched an idea like this years ago to essentially backfill one ticketing system into a shiny new system that could read an email inbox. The idea was that if we dropped an email in that inbox, in its desired format, for each old ticket's updates, the new system would do all the necessary inserts and voila. They told me no -- not for any technical reason, but because their email infrastructure was required to be audited by the SEC, and they would have opened themselves up to significantly more auditing. Instead, I ended up having to do it through painful, painful SQL.

Lesson being that sometimes there are unexpected reasons why a specific piece of technology shouldn't be used.


You're not allowed to use SMTP without calling it email?

It sounds like saying one isn't allowed to use HTTP for RESTful APIs without calling it a website. (And that the org requires websites to be audited to fulfill accessibility requirements for physical disabilities.)


For the record, I disagreed with them also, pushed back pretty aggressively, and found workarounds to the audit problem. But the CTO and CIO basically took it as a challenge against their authority and denied me at all points.


Weren't these all emails already? Weren't you required to retain them for the SEC even before falling into this specific (hypothetical) inbox?


It's not clear from the message whether the software is setting up their own email system; if so, it will need to be audited and certified, which is a major hassle.

Either way, the auditors and the infrastructure might not want to handle an order of magnitude more traffic (API usage is really in a different league than occasional human email). Expect all emails to have to be stored for around a decade.


For the original system, no, they were not emails.


The suggestion to use SMTP is interesting.

I didn't downvote you, but I bet the downvotes come from this part. People don't like this kind of negativity.

> But it's not "cool" technology or "web-based" so developers won't consider it. Watch me get downvoted like crazy by all Nodejs developers.


I agree that people don't react well to negativity, but sometimes you have to say it. Node has a lot of very stupid (i.e. reality-ignoring) decisions, and, by extension, being exposed to this for a long enough period of time tends to affect the developer as well.

I say this from experience, as someone who's used a few stupid technologies over time.


This was not a time when you had to say it. None of the existing conversation was about languages or ecosystems, and taking cheap shots was wholly unnecessary to the suggestion of using SMTP.

SMTP itself is interesting, although it comes with fun new footguns like STARTTLS.


What makes STARTTLS a footgun? (I'm curious since I use it sometimes)


"STARTTLS is an email protocol command that tells an email server that an email client, including an email client running in a web browser, wants to turn an existing insecure connection into a secure one."

I'm sure absolutely nothing bad will come from that last bit. Oh look:

"And yes, STARTTLS is definitely less secure. Not only can it failback to plaintext without notification, but because it's subject to man-in-the middle attacks. Since the connection starts out in the clear, a MitM can strip out the STARTTLS command, and prevent the encryption from ever occurring."


DANE is meant to fix that. If someone asserts, via DNS records (signed by DNSSEC), that their SMTP server is able to use TLS, then you should only accept connections using TLS to that SMTP server.


And DANE is never going to happen; DANE advocates have been saying this for over a decade, and the only change has been that the IETF and all the major email providers moved forward on a new protocol, MTA-STS, specifically to avoid needing DNSSEC (which nobody uses) to solve this problem.


Almost every time anyone mentions DNSSEC here on HN, you pop up like a jack-in-the-box to claim that nobody is using it and that it is dead. And it’s always you, nobody else. Whereas, from where I sit, I work at a registry and DNS server host (among other things) where about 40% of all our domains have DNSSEC (and that number is constantly climbing). Every conference I go to, and in every webinar, people seemingly always talk about DNSSEC and how usage is increasing.

You might have some valid criticism about the cryptography; I would not be able to judge that (except when you are basing it on wildly outdated information). I’m not an expert on the details; you could most assuredly argue circles around me when it comes to the cryptography, and possibly about the DNSSEC protocol details as well. But, from my perspective, your continuous claim that “nobody uses” DNSSEC is simply false. DNSSEC works, usage of DNSSEC is steadily increasing, and new protocols (like DANE) are starting to make use of DNSSEC for its features. Conversely, I only relatively rarely hear anything about MTA-STS.


Take any list of the top domains on the Internet --- any of them at all --- and run them through a trivial script, like:

    #!/bin/sh
    # Reads domain names from stdin; prints each domain followed by its DS
    # records (an empty result means the zone is not DNSSEC-signed).
    while read domain
    do
        ds=$(dig ds "$domain" +short)
        echo "$domain $ds"
    done
... and note that virtually none of the domains, in any sane list of top domains, are signed. That was true several years ago and remains true today, despite the supposed "increase in usage" of DNSSEC.

What's actually changed is that registrars, especially in Europe, now apparently auto-sign domain names. That creates a constant stream of new, more-or-less ephemeral signed zones that gives the appearance of increasing DNSSEC adoption. Of course, this is also security theater (the owners of the zones don't own their keys!). The real figure of merit for DNSSEC adoption is adoption by sites of significance, and that has been static, and practically nonexistent, for a decade.

It is no surprise to me that people working on the DNS talk quite a bit about DNSSEC. People who worked on SNMP talked quite a bit about SNMPv3, and IPSEC people probably really believed there would be Internet-wide IKE. None of those things happened, because what matters in the real world is what the market decides. Most especially at the companies with serious security teams, DNSSEC is a dead letter standard.


Registrars can’t “auto-sign” domains. Only DNS server operators can do that, if they have the cooperation of the registrar. And the DNS server operators is the only workable definition of “owners of the zones”, so they do own their keys. It can’t work any other way.

In fact, the new CDS and CDNSKEY DNS records allow it to work the other way around; DNS server operators can auto-sign domains, and the registrars need not be involved at all.

> The real figure of merit for DNSSEC adoption is adoption by sites of significance

People said the same about IPv6. Or maybe you do, too?

> People who worked on SNMP talked quite a bit about SNMPv3

I seem to recall you mentioning quite often how WHOIS was dead and would be replaced by RDAP. That didn’t happen either.

> IPSEC people probably really believed there would be Internet-wide IKE

Interestingly, that problem could in theory be solved by DNSSEC. We’ll see what happens.


I don't think you ever saw me mention that WHOIS is dead, not least because that's not a thing I believe. What a random thing to say; you can just use the search bar to immediately see the (very few) things I've had to say about RDAP here.


And, reading more at StackOverflow (from where Virtue3's quotes are?), there's this: https://serverfault.com/questions/523804/is-starttls-less-sa...

I find:

> If the client is configured to require TLS, the two approaches are more-or-less equally safe. But there are some subtleties about how STARTTLS must be used to make it safe, and it's a bit harder for the STARTTLS implementation to get those details right.

I previously thought that was the default; good to know it isn't / might not be.

Thanks everyone :-)


DANE is a kludge that should be put to bed, not promoted as a solution to a problem which shouldn't exist.

STARTTLS exists for two reasons (https://www.fastmail.com/help/technical/ssltlsstarttls.html):

1. Wanting to accept mail insecurely.

2. Not wanting to use two different TCP port numbers to send and transfer mail.

To solve these problems they created STARTTLS. But obviously, STARTTLS isn't actually secure (even though that was the point of supporting TLS). So to make it secure, it's suggested to use DANE - a standard built on a different protocol, requiring a feature that is controversial, potentially dangerous, and not widely implemented. So you can use a kludge (STARTTLS) with a kludge (DANE) to send and transfer mail securely. But should you?

Since 2018, RFC8314 says that e-mail submission should use implicit TLS, not STARTTLS (https://datatracker.ietf.org/doc/html/rfc8314#section-3). Therefore the use of STARTTLS, and the use of DANE to make it secure, are deprecated. So while you shouldn't use DANE for anything seriously, you really shouldn't use it for SMTP.


Even if implicit TLS is used instead of STARTTLS, DANE is still necessary to avoid forcing backwards-compatible agents to fall back to unencrypted traditional communication.

DANE is necessary as long as there are still some agents using backwards-compatible behavior; i.e. falling back to unencrypted communication if TLS is in some way blocked.


Those agents should not be falling back to unencrypted anyway! The whole ecosystem just needs to get onboard with implicit TLS and deprecate the old agents. It's not acceptable to make the whole ecosystem dependent on two completely different security mechanisms. Every client/server in the world would have to support both indefinitely, which would be a totally unnecessary cost and complexity burden.


I mean, if we accept completely deprecating non-TLS connections, then there still would be no problem with STARTTLS! Servers would just need to only allow the STARTTLS command, and refuse any commands until after the TLS handshake. I believe that many server programs allows this configuration today.

It is only when we allow backwards compatibility that something is needed to differentiate to the clients whether the server is new enough to allow TLS or not.


That's only a footgun if your system is set up to allow an insecure connection to continue. Just because the protocol allows it does not mean you can't add additional requirements.


> Node has a lot of very stupid (i.e. ignoring reality) decisions, and by extension, being exposed to this for a long enough period of time, tends to affect the developer as well.

As a developer who used Salesforce for nearly a year once upon a time, I can confirm that exposure to stupid decisions in a platform can affect the developer.

Node, though? Could you expand on the stupid decisions in Node? And does Deno address those?


I use and love nodejs daily, and I think I can speak to some of the stupid. A lot (but not all) has to do with ecosystem.

Some of the stupid in node just comes from the fact that there's still a lot of reinventing the wheel, and doing a less good job of it. Like, we've got all these backend frameworks, but still nothing at all that compares to eg Spring. Can you even find a nodejs lib that does HATEOAS properly and completely? How often do you find yourself doing string parsing, or handling a JSON object, when you know it would be more efficient to be handling a stream, or that really the kind of work you're doing ought to be handled by your framework but isn't?

As for nodejs itself, it's much better in 2021 than it was in the past. But it's still a massive runtime. And I have mixed feelings about eg Worker threads. As for node_modules, I get the sense that we're just replaying the history of Microsoft's dll story, needing to relearn all the lessons that should have been learned already.

As for Deno, I think it comes with great ideas. In many respects, I like it better than Nodejs. Most of its good ideas, Nodejs is flexible enough to accommodate. One of Deno's main advantages is that it doesn't have any legacy to support, so it can embrace things like ECMAScript modules more easily. Its library system is closer to Go, although I think the end result is that a lot of folks end up doing one-off systems that look a lot like the nodejs module resolution system in the end. Deno's main disadvantage is that it is not compatible with nodejs libraries. That's also an advantage insomuch as you have a clear module import spec from the get-go.

In short, the stuff that Deno can do, Nodejs can do, and I'm not sure that its cleaner system can overcome the fact that the same is accomplishable in Nodejs. I'd be more than willing to use Deno in a greenfield project because I like all the technology choices it makes, but fundamentally, the technology choice you're making is whether or not to use V8, and adopting Deno is almost just a way of pressing the reset button on the ecosystem, which may or may not be a good thing depending on your needs.


Since it doesn't relate at all, why even add the negativity?

"Here's a crazy idea ... This are the properties for why it works ..." Would have sounded much better.


Does Node have that? Or do node libraries have that?

I find node to be surprisingly well rounded.


I actually did use SMTP as queuing middleware for a registrar platform years ago.

It worked very well.

EDIT: To add some context, my team had come off building a webmail platform, and so we'd done lots of interesting stuff to qmail and knew it inside out. We then launched the .name tld and built a model registrar platform that on registration would bring up web and mail forwarding for users that wanted it. We used SMTP to handle the provisioning of those while keeping the registration part decoupled from the servers handling the forwarding. We also used it to live-update a custom DNS server I wrote.


I remember interviewing someone who worked on a DNS platform where IIRC the DNS zone files were propagated by SMTP to DNS servers. The details on this were that there was a 5-minute SLA (I believe) on the loading of zone records, essentially that the DNS servers were polling the mailbox and parsing new records since some last loaded time stamp.


For a second there I wondered if you'd ever interviewed me (but having looked at your profile: no; I don't think we've met, though I'm in London too).

We had similar-ish constraints. SLA was internal, not imposed (the .name registry had externally imposed SLA's, but the registrar platform did not), but the zones were very simple - either NS records pointing elsewhere, or identical CNAME/MX records, so we needed only a short string per address.

I don't remember if we used CDB files or if we stored individual records directly in ReiserFS filesystems (our mail platform had relied heavily on the ability of ReiserFS to handle vast quantities of tiny files, so were comfortable with that), but it was definitively something simple.

Similar for the web forwarding, which just required a url to redirect to.

If a node should ever need to be replaced, all we'd need to do would be to start a queue on a new box but not process it, then rsync over the dataset from another server, and start processing the queue, and add it into rotation when up to date. If we'd needed stricter consistency guarantees it'd have been a different consideration.

For many types of workloads I'd pick another queuing system today, but the amount of readily available tooling for e-mail, especially once you need federation, reflection/amplification etc. does make it an interesting choice for some things.

It also made debugging the message flow trivial: just add a real mailbox to the cc:....


Your suggestion about SMTP is a good one. Disappointing that I had to downvote your comment for the ad hominem on us old Node developers.

Why you need to insult a whole body of people, rather than just make a claim about the technology, I don’t know.


Node developers are old now?


I'm 39. Not sure what counts as old for you.


[flagged]


So someone told you he was prejudiced against; you respond with more prejudice and then pretend it's his attitude that colored your perception. Are you for real?


[flagged]


Prejudice is also an English word that has a meaning unrelated to law.

Calling JS developers most dumb is prejudice.

The word's meaning is literally spelled out "pre" + "judge".


Honestly this is so stupid brilliant I love it (stupid as in I can’t believe I hadn’t considered this). Honestly it really is about storing, sending, and checking messages so SMTP makes so much sense!

I’ve been building for the web for 15 years, and it shows how far I can hyper-focus on certain communications implementations without looking at pre-existing options that really meet a large number of use cases. I suppose it also means making sure your data consumers are comfortable working with the protocol, but it’s a really top-notch idea.


SMTP used to be a lot more reliable than it is now. Now, with all the changes to help with blocking spam, you have to be very careful or have a lot of control over the receiving server to ensure you actually get delivery. Some anti-spam systems will just discard if the matching rules indicate the spam likelihood score is above a certain threshold, and mistakes in rules at system levels can and do happen.

But here's another way you could (ab)use the mail system for delivery: provide a mailbox for the client, allow IMAP or POP access, and throw the messages into that. The client can log in to access and process them (which they would likely be automating on their own mailboxes anyway). It does mean it's housed at the provider, but it's also pretty easy to scale. There's lots of info on how to set up load-balanced Dovecot clusters out there, and even specialized middleware modes (Dovecot director) to make it work better, so you can scale it to very, very large systems.
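
To make the shape of that concrete, here's a rough sketch of a consumer polling such a mailbox with Python's standard imaplib; the host, credentials, and `handle_event` are stand-ins.

    # Sketch: poll a provider-hosted mailbox for event messages over IMAP.
    import email
    import imaplib

    def poll_events():
        conn = imaplib.IMAP4_SSL("events.provider.example")
        conn.login("customer-123", "app-password")
        conn.select("INBOX")
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            _, msg_data = conn.fetch(num, "(RFC822)")
            msg = email.message_from_bytes(msg_data[0][1])
            handle_event(msg)                       # your business logic
            conn.store(num, "+FLAGS", "\\Deleted")  # "ack" by deleting
        conn.expunge()
        conn.logout()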


I don't think mr throwaway was advocating to use email to send the events, only to use SMTP. Email is an entire ecosystem, SMTP is only a protocol.

If the distinction is too hard to make: think of it as using the 'Simple Event Transfer Protocol' that just happens to use exactly the same protocol as SMTP.


> Email is an entire ecosystem, SMTP is only a protocol.

Yeah, but it's a protocol for transferring email. As I noted with "you have to be very careful or have a lot of control over the receiving server to ensure you actually get delivery", you can abstract most of the mail system out as long as you ensure you are running the server they deliver to, but you would also need to rely on them making sure their outgoing server is good for this, which probably means dedicating it to this and not running any real mail through it (so you avoid outgoing company email filters, etc). At that point, both sides are running specific bespoke mail servers, which cuts down on the usefulness of the solution because of how much setup and administration it requires.

It used to be nobody ran incoming and outgoing filtering on email, so it was a robust channel for communication with retries, notifications for failed delivery, etc. These days it's not exactly that, because of all the spam mitigations and company compliance and risk mitigations that might be in place, etc. In fact, just setting up a new mail server and attempting to send to Microsoft (live/hotmail), Yahoo or Gmail is extremely hard, because they have a high bar for acceptance, and large swaths of the easily obtained IP space have already been blacklisted from prior spam use, so you start with a bad reputation and have to work to get it to a level where you're even allowed to talk to others, by working with all the third-party (and first-party) blacklist-maintaining entities.


It's not uncommon to set up daemons that only talk to each other for infrastructure monitoring and reporting.


Yeah it does make more sense to have the IMAP/POP setup rather than actively sending out emails through consumer level email services like gmail etc where deliverability might become a concern.


At that point you'd be better off using an Atom/RSS feed.


The difference is that using a mail subsystem to handle this handles a lot more of the implementation than "use an atom/rss feed".

Notably, in choosing to use an atom/rss feed, you need to determine what the webserver serving it is, how to implement authentication on top of it (is it a token/oauth, HTTP auth, param auth, etc), what is the underlying data store (SQL/NoSQL, some message system), how to scale that system if you expect it to be large and span multiple servers and/or datastores (mail systems right now deal with hundreds of thousands of users and gigabyte plus mailboxes of millions of messages).

Choosing IMAP to deliver this info means there are well-worn solutions for all the decisions you need to make (including howtos to implement OAuth at the server level), as well as client-level libraries in almost every language. Basically, you could decide to use it and not have to worry about forging a new path on that system basically ever, because there's plenty of people that have already implemented it at a larger level and with the same features (even if you would be using them to slightly different effect), and they've contributed the info on how to do it and what the performance ramifications are to the public domain.

I'm not seriously advocating for it, but that's more because clients will look at you funny than for any technical reason. Technically, it actually has a lot going for it. Unfortunately as an industry we fetishize the new and bespoke because obviously our own unicorn projects are so new and special and will serve so many people that some off the shelf solution could never be as good....


SMTP would raise too many questions, from how both datacenters tolerate it (spam), to who will manage the receiving server itself and certificates on your side, to the overall security of this setup. For a nodejs developer it’s really easier to spin up a separate handmade queue process rather than managing SMTP-related things. Webhooks (for runtime) and long-polled /events?since= (for startup) have all the upsides with few downsides.


When designing something like this as a service, the biggest question is what other developers will find easy to use. Every cheap host supports inbound HTTP requests, and most web developers know how to receive them.

Stripe needs to be usable by both the developers building intense, scalable, reliable systems and the people teaching themselves to code in a limited context on a limited platform.


>And it's bi-directional. But it's not "cool" technology or "web-based" so developers won't consider it.

I might be missing a point or two here, but I don't see how SMTP can work for this case at all. You would require every API consumer to set up an SMTP server (which is another piece of infrastructure to maintain), and then somehow have a layer of authentication so the recipient can control who posts messages on that server (overhead for the publisher per new customer). Then we still haven't resolved the issues on the customer side (bad code could pop all the messages, and now we might require the publisher to replay them again).

I haven't even started to think about security and network hardening challenges yet. Again, I might be missing the point but this is not a case of cool tech overuse to me.


SMTP servers support SSL. Using client certificates and/or HMAC-signed messages takes care of the security. You have the same security considerations for HTTP.

As for "setting up an SMTP server", the point is that compared to the current requirement of a webhook, you're going to need a queuing mechanism or a pull mechanism or both anyway. So you can build a custom solution, or you can pick an existing queuing mechanism that people have spent literally decades providing a vast array of software options for.

And yes, you're right, you can always end up needing a way to trigger a replay, because no matter what you do the customer might do something stupid. Nothing you do will get you away from that. So either you require them to always pull, or you provide an option to push and an API to trigger redelivery for when they've done something stupid. If you opt for push, SMTP is an option worth considering, because no other queuing mechanism has as many ready-made and battle-hardened options available.

There are many cases where it'd not be suitable, but in the situations where SMTP is a bad choice, webhooks are likely to be an absolutely awful choice.

I speak from actually having run messaging on SMTP both as an e-mail provider with a couple of million users and having used it as messaging middleware in production.


I'm confused, "use SMTP" doesn't even type-check for me. Isn't SMTP just a transfer protocol? Meaning it defines a bunch of commands and gives them meanings (like EHLO and DATA and such), just like how HTTP defines commands like GET and POST and all that? Isn't the problem here about e.g. the storage & retry logic rather than about the data transfer itself? Can't you retry transmission as frequently as you like using whatever protocol you like? How does transferring the data over SMTP gain you anything compared to HTTP?


"use SMTP" here is a short way of saying "send mail to a mail server that will store the requests indefinitely" instead of webhooks that are constantly retrying on a protocol that was hand-written instead of being baked into the whole internet already.


> "use SMTP" here is a short way of saying "send mail to a mail server that will store the requests indefinitely"

So the suggestion is to use email? That's not how others are interpreting it. [1] And it doesn't make sense to me either. Emails as they are "baked into the whole internet already" are unencrypted with tons of middlemen, and even their transport isn't guaranteed to be encrypted. Email is also munged and messed with in weird ways, with fun stuff like each middleman tacking on their own headers and filtering it out based on unknown rules. It also introduces a ton of latency and severely prioritizes "eventually reaching the destination" over timeliness. And more downsides I can't think of off the top of my head. That seems like a really poor choice for an event delivery mechanism.

[1] https://news.ycombinator.com/item?id=27830705


There are a billion lego pieces out there in the e-mail ecosystem. We can combine them any way we want, if the alternative is a totally custom solution anyway (webhooks, custom API endpoints, cron jobs, queues, etc). There are so many options; where to begin!

First, you don't have to use the rest of the internet's e-mail system. Stripe can run their own mail servers that deliver straight to clients on non-standard ports using implicit TLS, ensuring security and no middle-men. This also ensures delivery is as timely as possible (sub-second typically, as mail software has to be fast to handle its volume).

Let's say you want to poll (ex. "/events"). The client uses IMAP to poll the Stripe server with a particular username/password. Check a folder, read a message, delete it on connection close. There are of course ready-made solutions for this, but you can also write simple IMAP clients really easily using libraries.

Let's say you want pushes (ex. webhooks). The client sets up the alternative to the webhook-server they'd have to set up anyway: an SMTP server. Use a custom domain, one that has nothing to do with the customer's main business, so nobody ever gets confused. Configure it to only accept mail from a "secret mail sender" (aka webhook secret). Part of the "SMTP webhook URI" would be what mailbox to deliver the webhooks to. The client then configures an MDA on their mail server to immediately deliver new messages to some business logic code. If the MDA or business-logic code has a bug, the messages will stay in the client's mailbox until they are "delivered" successfully. If the client's SMTP server is down, Stripe keeps retrying for at least 3 days, more if Stripe wants.

Stripe could actually implement both by keeping messages in an IMAP folder on Stripe's servers, and deleting the messages once the SMTP server confirms delivery to the client. Of course all messages already have unique IDs so removing dupes is easy.

You could implement all of this in a week, write almost no code, and still handle all the weird edge cases. Virtually all of that time is just reading manual pages and editing config files. The end result is a battle-hardened fault-tolerant event-driven standards-based distributed message processing system. The maintenance cost will be "apt-get update && apt-get upgrade -y", and anyone who can configure Postfix and Maildrop can fix it.
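
For the push direction, a sketch of what the provider's sending side might look like with Python's smtplib; the hostnames, port, and payload are illustrative, and in practice you'd hand the message to an MTA queue so its retry logic does the work rather than sending inline.

    # Sketch: push one event to a customer's dedicated SMTP endpoint
    # over implicit TLS on a non-standard port, as described above.
    import json
    import smtplib
    from email.message import EmailMessage

    def push_event(event):
        msg = EmailMessage()
        msg["From"] = "events@provider.example"
        msg["To"] = "webhooks@apiendpoint.customer.example"
        msg["Subject"] = "event " + event["id"]
        msg["X-Event-Type"] = event["type"]
        msg.set_content(json.dumps(event))

        with smtplib.SMTP_SSL("apiendpoint.customer.example", 2465) as smtp:
            smtp.send_message(msg)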


Hey throwaway! I think this could work, but it might not be the highest priority. Think about this perspective:

The API is entirely HTTP, and tries to meet users where they are by providing tools that they are most comfortable with. Frequently, these users are familiar with websites or mobile apps. As such, webhooks are implemented over HTTP.

If there was an alternate way to integrate with events, it'd be something that's either:

1. accessible to novice users, or

2. delivers on high throughput/latency needs of the largest users, or

3. resolves a storage/latency/compute cost incurred behind the scenes

Thinking about these:

For #1, websockets would score better than SMTP

For #2, kafka (or a managed queue like SQS) would score better (many support dead letter queues and avoid the latency at the mail layer)

For #3, it isn't clear that SMTP reduces the latency, compute, or storage costs

SMTP might be familiar -- and it's possible for you to build your own webhook → SMTP bridge if you wanted it -- but doesn't score well enough on any of these metrics to be built in-house.

[Disclaimer: I work at Stripe, and these opinions are about how I'd approach this decision. They're not the opinions of my employer.]


Yeah, I totally agree it's really out of left field compared to what users are comfortable with. Like another commenter said, clients would probably laugh you out of the room for proposing it. (though that's half my point! why are we only accepting these half-baked custom solutions on janky platforms? fear of criticism? is it really saving anyone any time or money compared to the "weird solution"?)

But I'm not convinced on the latency/compute/storage comparison with Kafka or other solutions. I think a POC would need to be built and perf tested, and then tweaked for higher performance and lower cost, like most software. Considering the volume of traffic that mail software is designed for, I can't see how even a large provider like Stripe would have difficulty scaling a mail system to match Kafka. It's not like mail software is written in Java or something ;-)


What about events that need faster than 1-minute response times? Any push-notification-like system is going to be just as error-prone. And what about multiple message handlers? And what happens when the send fails? Did someone write the code to check the inbox for them and handle them? When a send fails multiple times, is that logged, and is there a system for clients to check that log? Message transfer isn't the hard problem in this domain.


There's nothing about SMTP that dictates response times or in any way makes it much slower than HTTP. A non-pipelining client will require a few more network roundtrips if it connects and disconnects for every message, that's all.

> Any push notification like system is going to be just as error prone.

E-mail servers are built with retry logic and queuing logic already. The point is if you need queuing anyway, it offers a tried and tested mechanism with a multi-decade history and a vast number of interoperable software options. While there is now a relatively decent number of queuing middleware options, none of them have as many server and client options as SMTP.

SMTP isn't the best choice for everything, but it works (I've used it that way), it's reliable, and it scales with relative ease.

> And what happens when the send fails?

It gets retried. Retries are built in to mail servers. That's part of the point.

> And what about multiple message handlers?

What about it? Most SMTP servers provide a mechanism for plugging in message delivery agents rather than delivering to a mailbox, or you let it deliver to a mailbox and pick it up from there. Or you plug in whatever routing mechanism you want to distribute the messages further. The sheer number of ready-built options here is massive.

> Did someone write the code to check the inbox for them and handle them? When a send fails multiple times, is that logged and is there a system for clients to check that log?

Pretty much every e-mail server ever written provides a mechanism for handling persistent failures, and many of them offer heavily configurable ways of doing it. But yes, you'd need to decide on what to do about persistent failures. But you need to do that with whatever queuing system you use.

> Message transfer isnt the hard problem in this domain.

The point of the article is exactly that reliable message transfer is the hard problem in this domain.


>so developers won't consider it

I think it depends on the developer. There are developers hammering out boring business logic as fast as possible, and there are developers with a deep understanding of machine internals, protocols, and infrastructure. For the former, SMTP is black magic they'd probably never think of, and it involves engaging the one infra person that's always busy.

It also means standing up and managing "infrastructure"


I sort of agree, but somebody already has to manage the "infrastructure" of their web apps and DNS. They never mind adding more of their own home-grown services. If they used Kinesis instead, that's another piece of infra to maintain. But you would never hear them say "what about Postfix instead". Regardless of infra, if it's new, they want to use it, even if something older and more boring would work better.

If I ever heard a dev at work say "No I won't use that new tech, it's too untested/I'll have to spend more time figuring out how to make it work well", I would shit my pants. Whereas if it's old tech, "it's not modern/I'll have to spend more time figuring out how to make it work well". It's practically software ageism...


You’re likely blinded by a “nodejs monkey developer” stereotype which prevents you from seeing that node is what everyone wanted back then. It’s very, very easy to create an http-based analog of any “traditional” service in node and to free yourself from learning all the shady details (of which there are a lot) of configuring it and keeping it alive at all levels, were it based on traditional software. Node is extensible, configurable networking itself, and http(s) is the quintessence of all text protocols. All that we wanted back then is available now in node at much finer granularity and with much less configuration or headache. “They” spin up home-grown services because it is a natural, one-page-boilerplate, straightforward thing to do in node, not because of ageism or something similar.

I tell you that as someone who fiddled with sendmail.cf’s and other .conf’s way too much long before nodejs became a thing. Now it’s a relief.


> There's developers hammering out boring business logic as fast as possible and there's developers with a deep understanding of machine internals, protocols, and infrastructure.

Purely anecdotal of course, but I follow a number of the latter, and they're either sparsely employed or often employed in a capacity where it doesn't matter. There was a comment in this thread where one person had such an idea, and it was rejected for what were essentially business reasons.


SMTP won't work for the customers.

Developers won't be able to use the existing email systems of the company, too critical and managed by another team. They will never be able to reconfigure it and get API access to read emails. Note that it may or may not be reliable at all (depends on the company and the IT who manages it).

Developers won't be able to set up new email servers for that use case. Security will never open the firewall for email ports. If they do, the servers will be hammered by vulnerability scanners and spam as soon as they're running. Note that large companies like banks run port scanners, and they will detect your rogue email servers and shut them down (speaking from experience).


Nothing prevents offering delivery on alternative ports for people with incompetent security teams that think port numbers are sufficient to determine whether something is a threat.

As for "being hammered", rejection of invalid recipients before even getting to the DATA verb is cheap.

Having actually run both an e-mail service and SMTP used as messaging middleware, I have dealt with these issues.


The security team is not incompetent. Large companies do not permit developers to spin up their own email systems without audit and regulatory retention. The port number is sufficient to determine that the request should be rejected.

You could work around it but should you? You're exposing the company to fines and risking your job.

Better think of another way to integrate with the vendor, or find another vendor.

P.S. SMTP is easy to identify on ANY port, it's replying a distinctive line of text when TCP connection is opened.


> do not permit developers to spin up their own email systems without audit and regulatory retention

If they freak out over an SMTP server but don't freak out over a web server, then they are indeed absolutely, utterly incompetent fools that should never work in this space.

In both cases code written by the company developers will eventually process untrusted textual input, and you need to deal with that with the same level of caution, and the protocol does nothing to change that.

> You could work around it but should you? You're exposing the company to fines and commiting a fireable offense. Better find another product that's easier to deploy.

I would not work around it - I would make the case that there's no difference between exposing a carefully chosen SMTP server and exposing a web server, and if the security team failed to understand that, I'd resign, because it'd be a massive red flag, and I've been successful enough to be in a position to not need to work for companies like that.

For that matter, in 25 years in this business I've yet to run into your hypothetical scenario, including at large companies, so I'm not at all convinced it'd be a genuine problem. Yes, I've been at companies where I'd need to provide a justification for getting a port opened. But never once had an issue getting it approved - including SMTP.

> P.S. SMTP is trivially identifiable on ANY port, it's giving a line of text when the TCP connection is opened.

I was responding to "Security will never open the firewall for email ports.". Point being that if they care about the specific port numbers, it doesn't matter.

[And I'll again point out I've actually run infrastructure like this].


Never worked in a bank? Never worked in defense?

I'm speaking from real experience too. It takes a while to open a firewall in some environments, if you ever can.

One bank was the worst. There was a super stringent process to expose things externally. Opening the firewall port was just the beginning and that'd take 2-4 weeks if all goes well.

You'd struggle like hell to expose an SMTP server, though, because it would immediately be rejected and flagged based on the port. Banks have to store, monitor and ensure the origin of all emails; they don't allow shadow email servers. And it's plain text, so more reasons to ban it (also a problem with HTTP; you should do HTTPS if anything).

Defense was simpler, mainly because there was no external connectivity in many cases. You don't need to worry about how to open a firewall when there's none :D


You have this Nodejs developer’s upvote.

At this point in my career (10 years in the game), let me simply defend node as the tool that got me here. Using it then to bootstrap my career was just as practical as using SMTP as you describe now.


I absolutely love your perspective. I feel the same way. s/Ruby+Rails/node for my situation. I believe there needs to be more respect paid to "bad" technologies. The measuring stick should include things outside pure benchmarks. Low barrier to entry technologies provide broad access and real life changes to folks that are able to pick them up and get hacking.


> SMTP

But... Why?

The HTTP protocol is so much easier to manage, load balance, use, etc.


The article is almost entirely the answer to your question.

> there are risks when you go down.

Solved by SMTP on protocol level. With HTTP, must be solved on both client and servers application level.

> webhooks are ephemeral. They are too easy to mishandle or lose.

SMTP has this baked into its heart. Losing messages is possible, certainly, but rather hard to do. With HTTP, it's really simple.

> In the lost art of long-polling, the client makes a standard HTTP request. If there is nothing new for the server to deliver to the client, the server holds the request open until there is new information to deliver.

SMTP is push, not polling. So all those issues are solved for you.


> SMTP is push, not polling.

Yes. And if you want to poll, POP is polling, and IMAP has both polling and immediate notifications.


Yes, but SMTP is a protocol wherein a system opens a connection to another system and says, hey here's a message.

And if that doesn't succeed for some reason, it reliably queues and retries.

That's a push.


I was not disagreeing, merely providing additional information. I have edited to clarify.


So how are people supposed to consume this? With an SMTP client?

I think the bigger issue is that consumption isn't particularly friendly. Also, you still haven't solved the versioning issues.


There are better options than SMTP. Basically any message-oriented middleware / message queuing service can provide this. It's great for both sides, maintenance/outages can happen independently, as long as the queue stays online and has space everything is fine.


E-Mail isn't trustworthy. You may get a confirmation that an initial SMTP server accepted a mail, but that's it. There's also no good way to detect that an endpoint (receiver address) is gone for good to stop sending messages.

You will probably point me to SMTP success messages, but a removed mailbox might only be known by a backend server.

Also mail infrastructure will potentially include heavy spam filters etc. making it quite inconvenient. Not even mentioning security aspects with limited availability of transport layer encryption with proper signatures.


What you're saying is true of public e-mail infrastructure, but that's beside the point. As a queuing solution internally in a system, you can make it as resilient as you like with ease, because there's a huge ecosystem of resilient software you can use for it.

Same goes for security - your objection is true for public e-mail delivery without additional requirements on the servers or clients, but that is not relevant for a private infrastructure.


In a private environment you have tons of options. The post however refers to notification between independent entities on the public network.


Running over the public internet does not mean you rely on unknown third-party mail servers. If I address a message to foo@apiendpoint.mycustomer.com, only the servers configured to handle mail for apiendpoint.mycustomer.com and my sending server are involved in the exchange. And that is if you trust MX records for this exchange rather than have the customer input the address of the receiving SMTP server directly.


I think that would be a great solution for these types of scenarios.

In an enterprise setting it becomes more complex if a 365 subscription is required, or Active Directory authentication is needed to receive emails. Does someone need to monitor the inbox to confirm it's working, etc.?

But after you mentioned it, I do wish that this was an alternative to webhooks that more service providers offered.


We used to do this for domain name registrations and it worked fairly well for years. However once you've been added to a spam blacklist it quickly breaks down, especially for time critical operations such as domain name renewals when you're scrabbling around trying to appease the Spamhaus gods.


SMTP doesn't reliably deliver messages, implementations of it do. A webshit could easily create an SMTP server (with the help of a library written by someone with actual programming skills) that silently drops messages when any error occurs instead of implementing all that robustness.


The very first startup I worked at used this for a sweepstakes leadgen form to send to MySQL via a Perl script running from cron.


Another option would be to publish an AMQP endpoint, I'm not sure what the security implications of this are though.


And far too slow for a lot of use-cases


There's nothing about SMTP that makes it slow. There's lots about public e-mail infrastructure that sometimes makes it slow.


Going down this non-traditional path, you might also consider using XMPP and ejabberd for machine-to-machine messaging.


SMTP no longer reliably delivers messages. Try setting up an MTA on a Hetzner VPS and see how many messages get through


That is only relevant if you require delivery to arbitrary endpoints rather than to endpoints explicitly set up to process your messages.


That's not an applicable criticism for SMTP running on a private network and/or dedicated set of "mail" submitting servers, as in the specific model outlined in the grandparent comment.



