On CloudFlare & Self Hosting
Posted in selfhosting, decentralisation
Two or three ago a friend asked me which domain registrar I was using for things these days, and it sparked some discourse around the use of CloudFlare, especially with self-hosting. This is a topic I've given considerable thought to over the years, but never actually put into one place concisely. This is an omission I wish to correct here.
The crux of the argument essentially boiled down to whether CloudFlare was a "good" company, and whether what they had to offer was of any benefit to an open and decentralised internet. Unfortunately, a lot of what can be said on the matter comes from almost pure speculation, and in many cases there is very little by way of evidence that could be provided. A lot of the technical arguments also only look at individual cases, which is how it should be, but miss the bigger picture. As such, I'm going to attempt to argue around the concepts rather than the implementation itself.
I want to warn all readers: this is a VERY heavily opinionated technical piece combining a decade of both personal and professional hosting experience with personal beliefs and morals. I do not expect everyone to agree with me, and this is a good thing. I want people to form their own opinions and draw their own conclusions.
DNS itself is often a very controversial topic, not helped by the existence of Five Eyes and its relatives, although I shall attempt to avoid going down this particular tin-foil-hat-wearing rabbit hole. To avoid this turning into several volumes, for those who are unfamiliar, I'll let CloudFlare themselves explain what DNS is.
The most basic service CloudFlare offers is DNS hosting, and they provide a very capable tier of it for free. Upon registering, you are given some values for NS 'glue records' to pass on to your registrar, which tell the global internet where to send its queries for your domain, and you're done. DNS hosting is so core to what CloudFlare does that almost everything else they offer is dependent upon it. The only true DNS upgrade available is custom nameservers for business customers, and even then it only matters if you are super paranoid about brand identity at a technical level.
Let us imagine that CloudFlare wanted to host your DNS in bad faith. What potential attacks are available to them? As far as I am aware, there are only two major ones: serving bad responses, and logging what requests are made. I consider the former so easy to detect that no sane company is going to risk their entire reputation on it for minimal gain, and the latter a non-issue if end users are using properly configured resolvers. Both of these arguments are also undermined by basic DNS being entirely unprotected: anybody along the path of the request could modify or log it. CloudFlare do offer a range of functions for securing DNS, although they are certainly not alone in this, as most reputable DNS providers offer the same.
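To illustrate why I consider bad responses easy to detect, here is a minimal sketch of the idea: collect the answers that several independent resolvers return for the same name and flag any that disagree with the majority. The function name, the resolver labels, and the addresses (taken from the reserved documentation ranges) are all hypothetical, and how you actually gather the answers is left out entirely.

```python
from collections import Counter


def find_outlier_resolvers(answers):
    """Return the resolvers whose answer set differs from the majority answer.

    `answers` maps a resolver label to the list of addresses it returned.
    """
    if not answers:
        return []
    # Normalise each answer to an order-independent set of addresses.
    normalised = {name: frozenset(ips) for name, ips in answers.items()}
    # Treat the most common answer set as the consensus.
    consensus, _ = Counter(normalised.values()).most_common(1)[0]
    return sorted(name for name, ips in normalised.items() if ips != consensus)


# Hypothetical observations for a single hostname (documentation IP ranges).
observed = {
    "authoritative": ["192.0.2.10"],
    "resolver-a": ["192.0.2.10"],
    "resolver-b": ["192.0.2.10"],
    "resolver-c": ["198.51.100.66"],
}

print(find_outlier_resolvers(observed))  # ['resolver-c']
```

A disagreement is only a prompt to look closer, not proof of tampering: geographic load balancing and CDNs can legitimately hand different answers to different resolvers.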
The other half of DNS you need to be aware of in any hosting is DNS registration, done through a registrar. Your registrar acts as custodian for your domain, telling the top level domain servers where to send people looking for your domain. They are also the ones who take your money for this privilege, and who are responsible for collecting and validating the information they are legally required to store.
This gives them a few important powers: making a profit off you directly, or from the personal information you have to provide, and response poisoning as with DNS hosting. Rather unusually in the registrar world, CloudFlare offer their registrations at cost, as well as providing DNSSEC and WHOIS privacy as standard, the latter of which some registrars attempt to charge extra for. Regarding abuse of information, I would have to provide this to any registrar, and with the exception of an email address and phone number, it is public information anyway (at least in the UK). I personally cannot see what value they could gain from selling this information, given the only thing it will accurately correlate with is the domain registration itself.
As for response poisoning, I can't see any way in which this would be advantageous for any reputable service for more than the few minutes before it's easily detected, after which their reputation and trust would be worthless, making it effectively a one-shot attack.
I'm going to group together a large bunch of things here which all effectively boil down to the same talking points, all resulting from CloudFlare having access to the unencrypted data. Unlike with the DNS points above, there are many more attack vectors which is the interesting bit rather than the services themselves. For those who are interested, I've included the following in this summation:
- Tunnel (
Other services like Images, Streams, KV etc. also have many similar considerations, but I do not have first-hand experience with them, so my arguments may not be as applicable.
Imagine this: every website turns on CloudFlare protection. Apart from the monumental amount of data transit that would suddenly be required of them, this would give CloudFlare the ability to decide who gets to access what.
This would unequivocally be a complete disaster.
The internet works because it is computers talking to each other via multiple routes: if one is down, another can take its place. There isn't any way in which giving a single entity that level of power ends well for the internet as a whole.
This isn't entirely hypothetical either, try browsing normally using Tor and see how many sites just outright do not work. Historically CloudFlare just outright blocked almost all of Tor, and unless your exit node was very new, you'd get hit with an unpassable challenge page. To take our first scenario again, suddenly Tor just ceases to work.
Many online argue that their current market share already gives CloudFlare too much power in centralising the internet, including myself on occasion. This is certainly a massive trade-off you have to consider when opting to enable these services. With how things are at the time of writing, I consider this an acceptable trade-off for the security and speed benefits that come from this approach.
One of CloudFlare's main selling points is their CDN and DDoS protection, often referred to as "turning on the orange cloud". Effectively at this point, all the data going over the wire is visible to CloudFlare: usernames, passwords, bank details, everything. Generally, this is known as a MITM attack. They are free to log, analyse, modify, and manipulate anything they like here, and as an end user it is next to impossible to detect that this is happening.
Worse still, with such specific access, they are able to do it only if a specific set of criteria are met. Perhaps they are only interested in compromising a single user, or are looking to link the same person from site to site.
To put it bluntly, ask yourself why CloudFlare would want to snoop on you or your users. What makes you so special that CloudFlare are going to risk their reputation for handling very sensitive plain text data (PCI or HIPAA compliance, for example)? Quite simply, I don't think anything I send through them is worthy of that, but your threat model may vary.
The best way to ensure that CloudFlare proxying is a non-issue is to provide security at the application layer with E2EE. By way of example, I have CloudFlare services enabled on my Matrix homeserver. This gives me the best of both worlds: the server is protected from bad actors and DDoS attacks (which are crippling for the inefficient Synapse server), while with E2EE enabled, CloudFlare only has access to some basic client-to-server or server-to-server metadata. The message content and a good chunk of metadata is protected by the Matrix protocol itself.
Before you get your pitchforks out, I'm talking about technical discrimination.
You have the option when using CloudFlare to only enable it on certain subdomains. Again this is best explained by an example of how I have things configured, but your mileage may vary. What's important to note here is all of these services are running on my home server, that also has some inbuilt defences including CrowdSec which is a topic for another day.
| Service | CloudFlare connection method |
|---|---|
| Authentik | Direct TLS connection |
| Home Assistant | Direct TLS connection |
| Recipe book | CloudFlare proxy |
| Matrix | CloudFlare tunnel (providing dynamic fail-over to GSM for status monitoring) |
Hopefully this demonstration shows partially how I make a decision about what level of access CloudFlare have.
Anything involving potentially sensitive content (such as passwords) in plain text always uses a direct TLS connection, services that just provide content are proxied for protection/speed boosts, and in the case of Matrix, cloudflared tunnels are used to provide automatic fail-over to a mobile data connection as this is used for some alerts.
I get to discriminate, based on context, according to what is important to me.
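As a rough sketch, the rule of thumb above can be written down as a tiny decision function. The field names and the flags assigned to each service are my own illustrative assumptions rather than anything CloudFlare defines, and Matrix is marked as not exposing plain text secrets purely on the basis of the E2EE point made earlier.

```python
from dataclasses import dataclass

DIRECT_TLS = "Direct TLS connection"
PROXY = "CloudFlare proxy"
TUNNEL = "CloudFlare tunnel"


@dataclass(frozen=True)
class Service:
    name: str
    plaintext_secrets: bool       # hypothetical flag: passwords etc. visible to a proxy
    needs_failover: bool = False  # hypothetical flag: must survive the main link dropping


def connection_method(service):
    """Pick how much access CloudFlare gets for one service."""
    # Sensitive plain text never passes through CloudFlare.
    if service.plaintext_secrets:
        return DIRECT_TLS
    # Services that must stay reachable over a backup link use a tunnel.
    if service.needs_failover:
        return TUNNEL
    # Everything else is proxied for protection and speed.
    return PROXY


services = [
    Service("Authentik", plaintext_secrets=True),
    Service("Home Assistant", plaintext_secrets=True),
    Service("Recipe book", plaintext_secrets=False),
    Service("Matrix", plaintext_secrets=False, needs_failover=True),
]

for service in services:
    print(f"{service.name}: {connection_method(service)}")
```

The order of the checks is the design choice here: sensitivity wins over convenience, so a service that needed both would still get a direct connection.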
Please don't take my word for anything I've written here. It is almost entirely opinion, and what works for me, my setup, and my guiding values is highly unlikely to work for you. I've been hosting things online for a decade now, and in that time the internet landscape has changed drastically, as has my approach to things.
In short, in the capacity they have access to my data, I trust CloudFlare. While them centralising the internet is a bad thing, they have so far shown themselves to be responsible custodians of this power and have not abused it. Historically they have had issues with anonymous access, which would ordinarily be a red flag, but recent decisions to provide Tor hidden services and to replace ReCAPTCHA, first with hCaptcha and then with their own Turnstile service, show a new commitment to protecting this. I'm mindful of what plaintext access they have to data I host and the drawbacks that come from it.
For now, they will continue to get my support, usage, and registrar business. The only thing I can say for certain is that I will continue to make things as vendor-agnostic as possible, so that should CloudFlare change, I can change as well.