
I've been using ES off and on since before 1.0 came out. It has always baffled me that ES doesn't require a username and password by default.

ES is a database that has to exist on a network to be usable. Heck, it expects that you have multiple nodes, and will complain if you don't. So one of the first things you do is expose it to the network so you can use it.

Yes, it takes some serious incompetence to not realize you need to secure your network, but why in the world would you not add basic authentication into ES from the start? I'd never design a tool like a database without including authentication.

I am serious about my question. Could anyone clue me in?



It has to exist on a private network behind a firewall with ports open to application servers and other es nodes only. Running things on a public ip address is a choice that should not be taken lightly. Clustering over the public internet is not a thing with Elasticsearch (or similar products).

Running MySQL or Postgres on a public IP address would be equally stupid and irresponsible, regardless of the useless default password that many people never change, unless you also set up TLS properly (which requires knowing what you are doing with e.g. certificates). The security in those products is simply not designed for being exposed on a public IP address over a non-TLS connection. Pretending otherwise would be a mistake. Having basic authentication in Elasticsearch would be the pointless equivalent: base64-encoded plaintext passwords (i.e. basic authentication over HTTP) are not a form of security worth bothering with. Which is why they never did this. It would be a false sense of security.
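
To make the base64 point concrete, here's roughly what basic auth over plain HTTP amounts to (the credentials here are made up; this is just a sketch):

    import base64

    # What a client sends with every request under HTTP basic auth:
    header = "Authorization: Basic " + base64.b64encode(b"elastic:hunter2").decode()
    print(header)

    # Anyone who can observe the traffic reverses it trivially; base64 is
    # encoding, not encryption:
    encoded = header.split()[-1]
    print(base64.b64decode(encoded).decode())  # elastic:hunter2

Over plain HTTP, that is all an eavesdropper needs.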

At some point you just have to call out people for being utter morons. The blame is on them, 100%. The only deficiency here is their poor decision making. Going "meh, HTTP, public IP, no password, what could possibly go wrong?! Let's just upload the entirety of LinkedIn to that." That level of incompetence, negligence, and indifference is inexcusable. I bet MS/LinkedIn is considering legal action against the individuals and companies involved. IMHO they'd be well within their rights to sue these people into bankruptcy.


Software should be secure by default. Don't blame the user.

MySQL, in comparison, won't even let you install without setting a root password. And it only listens on localhost/unix socket by default. Then you need to explicitly add another user if you want to allow logins from a non-local IP. I don't think it's even possible to both set a blank root password and allow root to log in from a public IP.

So you really think the solution is to blame some low-level worker, and sue him/her? The blame should always be on the people in charge, usually the CEO, who set the bar for engineering practices, proper training, etc., or the lack thereof.


While I don't think blaming labor is constructive or ethical, it seems like most tools pose danger to users in proportion to utility. For example, cars can squish people, electricity can fry people, and power tools can remove limbs.

Typically, people start out using knives and bicycles as children, learn through experience that crashing and getting cut hurt, and carry those lessons forward when they start using tablesaws and cars later in life. How does this apply to elasticsearch? I have no idea.


We could teach our children that software is very dangerous, especially databases. Or we could make software secure by default. But we also need to teach the user how to use the software properly. Learning by getting hurt is effective, but then we also need to have playgrounds.


That MySQL stuff is all quite recent... up until 5.7 (?; one of the most recent releases, anyway) there was no root password by default, and running `mysql_secure_installation` was a common (but not mandatory) step to, well, secure the installation and set a root password. I think MariaDB still works this way? Not sure.

I'm not aware of "bind to localhost" being the default, either. The skip-networking setting to only allow local socket connections is definitely not the default, and I'm pretty sure the default is still to bind to all interfaces.


I installed MySQL a couple of months ago on an Ubuntu server, and got asked to set a root password. I've also installed MySQL many times on Windows. Secure install is the default. And it doesn't annoy me a bit. I like my software to be secure by default.


This is ridiculous.

Software should be built in the best method of delivering maximum value to its users. A trade-off in favor of usability can be made in certain cases, like ease-of-use for new software. Redis went through this debate a while ago: http://antirez.com/news/96.

Engineers should know their tools before using them. It's a huge part of our job. You could introduce a ton of other vulnerabilities in software: XSS, SQL injection, insecure cryptography. Security is part of our job and something we must know.

You don't blame a plane for a pilot mistake that was meant to be part of his training. Engineers in every other sector are responsible for their mistakes; we should be too.

Also, you don't sue the worker, you sue the company.


"Software should be built in the best method of delivering maximum value to its users."

Yes, and defaulting to insecure, thus repeatedly causing huge data breaches, is the exact opposite of delivering maximum value to users. It's delivering maximum liability.


I would argue that the single command to begin using the application and the ease of onboarding / querying data were a huge factor in expanding its usage. Elastic optimized for initial spin-up and getting things running fast. It works really well! Until you load it full of data on a public IP, that is.


That single command to spin up the application could easily generate and show a copyable random secret required to use it, so that you can still get started easily but there's no option to run it that insecurely.
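
A minimal sketch of what that first run could look like (the filename and token length are just placeholders for illustration, not anything ES actually does):

    import secrets
    from pathlib import Path

    SECRET_FILE = Path("bootstrap-token")  # hypothetical location

    def get_or_create_bootstrap_token() -> str:
        """Generate a random secret on first startup, reuse it afterwards."""
        if SECRET_FILE.exists():
            return SECRET_FILE.read_text().strip()
        token = secrets.token_urlsafe(32)
        SECRET_FILE.write_text(token)
        SECRET_FILE.chmod(0o600)  # readable only by the service user
        return token

    if __name__ == "__main__":
        print("First-time setup: authenticate with token", get_or_create_bootstrap_token())

Print it once on startup, require it on every request, and the "open to anyone who finds port 9200" failure mode goes away without hurting the quick-start experience much.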


Onions. You need layers and defense in depth. Because even the best humans make mistakes, and it is unreasonable to assume perfection. Never rely on just one engineering feature.


> You don't blame a plane for a pilot mistake that was meant to be part of his training

Did you miss that Boeing is right now risking bankruptcy for doing exactly this?


Honestly a lot of the problem is: people aren’t studying systems engineering OR security. Look at all the “learn to code in 21 days” BS and all the code academies.

There’s so much emphasis on abstracting away the systems with cloud-this and elastic-that and developers don’t know much about general systems engineering.

My recommendation to software developers: take the Network+ and Security+ exams at the bare minimum.

Honestly, as much as people complain about process getting in the way of things, there should be checks and balances at any business that deals with personal information. Financial institutions are heavily regulated; these fkers should be held accountable.


> "Engineers"

Maybe the hint is right there in your comment. Nearly all the people deploying these nodes aren't engineers in the slightest, despite someone having given them such a title.


It's not always engineers that use them.

Sometimes software managers have the sudden need to show statistics and other things.

Yeah, that was fun...


If security is so important, why should we accept database developers who don't understand that?


Because... they dance the devops dance with their devop hats on! Security problems can be swiftly danced around until they actually surface, and can then be handled in the next round of "continuous delivery". It's also smart to postpone solving most issues until after they occur, so sales can continue bragging about "continuous improvement".


So, after some thought, here's why I don't consider it pointless to have basic auth built in.

It would keep ES from being completely open. If you wanted to get in, you'd have to compromise some part of the network that would let you read the username and password.

The way it is now, anyone can do a scan for port 9200 and get full access right away.

It is also important to have a username and password, even on secured networks. My test instance is on an internal network, and protected by both network and host firewalls, but I still make sure to secure it beyond that.

Basic auth would not provide a false sense of security. It is simply a very basic part of overall security. Not having it is a mistake.
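
For what it's worth, once auth is enabled the client side barely changes. Something like this (host and credentials are placeholders, and it assumes the security features are turned on):

    import requests

    ES_URL = "http://10.0.0.5:9200"            # internal test node (placeholder)
    AUTH = ("elastic", "not-a-real-password")  # placeholder credentials

    # A scanner hitting :9200 without credentials just gets a 401 back:
    print(requests.get(ES_URL).status_code)

    # Legitimate clients add one argument:
    resp = requests.get(ES_URL + "/_cluster/health", auth=AUTH)
    print(resp.json())

Hardly a burden compared to "full access right away".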


> At some point you just have to call out people for being utter morons. The blame is on them, 100%. [...]

Your attitude is a symptom of a broader issue that plagues this industry: indifference to risk*probability. If you don't ship software with "secure defaults" (depending on the threat/attack model), you are essentially handing out loaded shotguns, then blaming the "dumb" user when they inevitably point one at their foot and pull the trigger. Easy solution: don't hand out the gun loaded -- make the user take specific actions to enable that usage. Yeah, it creates some friction for first-time deployment, but that's a secondary concern next to having your freaking DB leaking all over the place.


But ES doesn't hand over a loaded gun. Someone went out of their way to load the gun up.


Bullshit.

If firing up a piece of software creates an unauthenticated, unprotected (non-TLS) endpoint to read-write data, that's a loaded gun. That is PRECISELY the default behavior of ES.

ES has jacked around for years by making TLS and other standard security features premium. To that, I say this: screw ES and their bullshit business model. Their business model is a leading cause of dumbasses dumping extremely sensitive PII data into a DB that is unprotected - those same folks aren't going to go the extra mile to secure the DB, either via licensing or 3rd-party bolt-ons.

Thus why it must be shipped secure by default. Anything less is a professional felony, in my eyes. Also, screw ES again, in case I wasn't clear.


Is it a secondary concern, though? As a startup, uptake is as vital as oxygen.


Tort law is going to catch up to software soon enough and people will be held accountable for negligently creating or deploying software that they should have known would cause harm.

The fact that someone else down the chain should have known better is not a perfect defense. If that misuse was foreseeable and you didn’t do enough to prevent or discourage it, then you can still be held liable.


If startups prioritize their growth over the good of society, isn't the logical conclusion that startups are a threat to society?


They're not a startup.


maybe. but there's always this....

http://www.team.net/mjb/hawg.html


There's something called defense in depth.

Even with ES deployed in an environment with proper network firewall rules, etc., I'd still want some sort of authentication/RBAC.


"Defense in depth" sounds, to me, like a phrase to justify multiple layers of imperfect security.

A single layer of cloth might not hold water; adding more layers of cloth may hold water for longer, but it's probably more cost-effective to start with the right material.


> "Defense in depth" sounds, to me, like a phrase to justify multiple layers of imperfect security.

That’s absolutely correct! But you seem to be missing the fact that _all_ layers of security are always imperfect.


This is a fallacy of distributed systems. Never trust the network. Best case you get packets destined for somewhere else; worst case your supposedly segmented network wasn't actually segmented.


I agree with GP here. ES is to blame here. Not long ago Apache Airflow had a similar vulnerability discovered around not having sensible authentication defaults. The reasoning on their mailing list was eerily similar to those defending ES here. Same arguments (IIRC).

History is our greatest teacher. I think ES will end up doing what that team did: they agreed to provide sensible & secure defaults.


Security in depth. If I compromise one part of your network, I shouldn't compromise it all.


PostgreSQL does the following things by default to prevent this:

    1. Only listen to localhost and unix sockets
    2. Not generate any default passwords
So the only way to connect to a default configured fresh installation of PostgreSQL is via UNIX sockets as the postgres unix user. Where PostgreSQL is lacking is that it is a bit more work than it should be to use SSL.
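
For contrast, on a typical fresh Linux package install the only thing that works out of the box is something along these lines, run locally as the postgres OS user (psycopg2 is just one example client):

    import psycopg2

    # No host given, so libpq goes through the local unix socket; peer
    # authentication maps the connecting OS user to the database role, so this
    # only works when run as the postgres user on the same machine.
    conn = psycopg2.connect(dbname="postgres")
    with conn.cursor() as cur:
        cur.execute("SELECT version()")
        print(cur.fetchone()[0])
    conn.close()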


> It has to exist on a private network behind a firewall with ports open to application servers and other es nodes only.

Have you ever heard of the end-to-end principle, IPv6, or number 4 of the eight fallacies? http://nighthacks.com/jag/res/Fallacies.html


> It has to exist on a private network behind a firewall with ports open to application servers and other es nodes only. Running things on a public ip address is a choice that should not be taken lightly. Clustering over the public internet is not a thing with Elasticsearch (or similar products).

I've met at least one cloud provider in the past (small Dutch thing) that provides _only_ public IP addresses. They do have customers, though one fewer now. Clustering over the public Internet is a thing. It shouldn't be, but I could say the same thing about this website, and yet here we are.


Heroku does the same in non-enterprise tiers. Their databases are accessible from the public internet, with no option to limit access to your own dynos.


Well, let's agree it's a sad thing. Very sad.


Oh sure, but sad things happen. And they can be even messier: I had a Jenkins instance "made" public because a sysadmin new to a hosting provider forgot to remove the public IP that gets automatically assigned to new things. We were lucky, being fairly sure nothing found it before I realised, but it was a strong lesson learned:

Any network may become public by accident unless you go to great lengths to make sure it doesn't. Configurations change and mistakes are made even by seasoned people. People bring devices. Unless there's an air gap, people's devices may be hacked and let stuff through. Put authentication and anti-CSRF on _all_ your stuff, always.


> Clustering over the public internet is not a thing with Elasticsearch

It is, sort of, https://www.elastic.co/guide/en/elasticsearch/reference/curr...

But it's not a feature you'd be using without a really good reason IMO.


That does give me some food for thought. Not sure I agree a username and password is pointless though.


>Having basic authentication in Elasticsearch would be the pointless equivalent.

Instead of that they could implement a PAKE (password-authenticated key exchange). That would provide security without any certificates.


Honestly, I as a user don't give a shit what a good engineer should do. All I see is that my personal data gets leaked left and right by Elasticsearch and not MySQL or Postgres. But its fanbois just keep shifting blame instead of reflecting on reality and going "hey, yeah, maybe we should try to do something about it on our end". So fuck ES.


I agree. Every anti-moronic default adds friction. I love that I can play with ES quickly via a simple URL without any auth.


That's how we got PHP, Javascript, Visual Basic, MySQL (before version 5), Mongo.

You'd think that at some point we'd understand that there are way more morons out there than sensible people.


It can still bind to localhost or a local socket without auth.


> It has always baffled me that ES doesn't require a username and password by default.

Because auth was part of their paid service (and by paid I mean 'very goddamned expensive') until about half a year ago, when they made it free because of Amazon's freshly emerged Open Distro free auth plugin.


They offer security as a paid feature.


Actually it comes for free now with the standard ES distribution. https://www.elastic.co/blog/security-for-elasticsearch-is-no...


>Security for Elasticsearch is now free

What a horrific title. Even simply typing that should have been a blinking neon sign to them that they had their priorities in the wrong order.


That's incorrect.

The usual way of using this service is to have a backend network configured that connects your services and is not reachable from outside (i.e. you have to traverse through your services to get to it).

The so-called "security" is just a paid feature for companies that want to use ElasticSearch but want to use it in a "legacy" way because, presumably, they don't have people to design it correctly.


That's still really insecure, because it means that as soon as someone manages to gain any access to that network, or any of the services on that network has a security issue, your database is wide open.

That means that if someone manages to get access to the network, everything is exposed. I'd say the public internet with proper (encrypted) password auth is more secure than that.


If an attacker has access to the app server, it is already game over. The app server typically already has access to all of the data.

The pods are akin to localhost networking where there is only one externally available application with multiple networked components.


That's true, but there are usually multiple ways to compromise protected networks. You still need to protect the database against attacks that don't go through the app server.


If an attacker gets a hold of your app server, they will be able to get the connection details for that DB, including the username/password.

Having a password adds a small layer of protection to databases that the affected app wasn't meant to connect to.

It adds some protection in that case, but the user should use their best judgement as to whether it's worth doing.


If you set up elasticsearch on a cloud service like AWS, by default your firewall will prevent the outside world from interacting with it, and no authentication is really necessary. If you do use authentication, you probably wouldn't want username+password, you would probably want it to hook into your AWS role manager thing. So to me, username+password seems useful, but it isn't going to be one of the top two most common authentication schemes, so it seems reasonable that it should not be the default.

MongoDB also by default does not have username+password authentication turned on.

I think defaulting to username+password is a relic of the pre-cloud era, and nowadays is not optimal.


I don't see why, though. It's much safer to start with a secure setup and then have the user disable the security explicitly (hopefully knowing what they're doing). Yes, username/password auth is not that common, but isn't it better than having no auth at all?


Ok, let's say username/password is mandatory and enabled by default. I see two options.

Option one, they generate a unique password for every installation – non-trivial to do, because at which point do you do it? It can't be before a cluster is formed, as you'll have a split brain generating a bunch of credentials. If you do it afterwards, then there is a period of time when your cluster is not yet protected. Worse yet, unprotected and handshaking authentication. So you don't do that.

Option two, you could make the user input the credentials. What is to prevent them from creating weak credentials? And worse, they have to do that for every node (or at least the masters). Not a good experience, and lost credentials will probably be the subject of a good many support calls.

So most products don't do that. What they do is default passwords. Which is arguably no security at all and doesn't protect anything. It may make it just a tiny bit easier to do the right thing afterwards (by changing to better credentials). Still, there's a period of time while the cluster is unprotected (default credentials are as good as no credentials).

Authentication does little to protect against the sort of people who are exposing databases to the public. If it is easily disabled, then they will be doing just that. Because they are already doing that by forcing databases to bind to publicly accessible interfaces.


I'd say option two is the only one viable. You deny access to the service until credentials are set by the user. You print huge warning labels while the credentials are set by the user to remind them of the possible consequences of setting weak credentials.

Yes, lost credentials will be subject of many support calls. Then, it boils down to your priorities. If you care about minimizing support calls, then sure, leave everything open to everyone. It will surely result in fewer access problems.

On the other hand, if your motivation is actually preventing your end-users from doing stupid things, it makes sense to just do the most conservative thing as default. Let the user change to the more liberal option, but not before informing them of all dangers that might befall them in that case.

I refuse to believe in this narrative of the end-user just being a stupid automaton who does not have any agency, and that any default imposed upon them will just result in them overriding the default with their terrible practices and ideas. I think there is a possibility of education and risk reduction.


I'd argue that the "pre-cloud" era is still going strong. And that is a good thing. My workplace has its own data center. There are some downsides, but I prefer it.

So username+password really is needed. And should be included by default.

Also, I'd expect the same of something like MongoDB. That it doesn't have that by default is just baffling.


Password auth over HTTP is horrible. Short of binding a public IP address to your instance, basic auth without HTTPS setup is probably the worst thing you can do.


It's a marketing ploy by ES.

They aggregated the data and published it so that the viral breach would spread their name around because all publicity is good publicity.

Just riffing of course.



