Hacker leaks millions more 23andMe user records on cybercrime forum (techcrunch.com)
490 points by coloneltcb on Oct 18, 2023 | hide | past | favorite | 394 comments


> 23andMe blamed the incident on its customers for reusing passwords, and an opt-in feature called DNA Relatives, which allows users to see the data of other opted-in users whose genetic data matches theirs. If a user had this feature turned on, in theory it would allow hackers to scrape data on more than one user by breaking into a single user’s account.

23andMe's blame is an obfuscation of the problems which underlie this situation.

1. There is the matter of having insufficient granularity in their sharing options within the DNA Relatives program. It comes with two general levels, which is not enough.

2. 23andMe limits one to seeing the closest 1500 matches who have opted-in to DNA Relatives. That would have allowed a hacker to collect data on most people in the database despite having only a few thousand compromised accounts. (Haplogroups, ethnic origins predictions, names, profile data, a list of relatives, and information on geographic origins.)

3. The 1500-match limit would be helpful in theory. However, until recently it was possible to see profiles beyond those 1500 matches. I'm not certain whether these were limited to people who shared matching DNA segments but fell outside the 1500-match limit, nor can I be certain whether this was part of the method these malicious actors exploited.
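For a sense of scale on point 2, here's a back-of-envelope sketch. The figures are my assumptions for illustration: ~14 million profiles, 1,500 uniformly random visible matches per account, and a hypothetical 5,000 compromised accounts. Real match lists are relatives, so they overlap far more than random draws and this overestimates coverage:

```python
def expected_distinct(db_size: int, per_account: int, compromised: int) -> float:
    """Expected number of distinct profiles reached when each of
    `compromised` accounts exposes `per_account` random opted-in matches."""
    # Probability that a given profile is missed by every compromised account.
    p_missed = (1 - per_account / db_size) ** compromised
    return db_size * (1 - p_missed)

# A few thousand compromised accounts already reach millions of profiles.
reached = expected_distinct(db_size=14_000_000, per_account=1_500, compromised=5_000)
print(f"{reached:,.0f}")
```

Even with heavy overlap in practice, the point stands that per-account match lists multiply a small breach into database-scale exposure.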


The 23andme page on "DNA relatives" says this though:

> If you choose to participate in the DNA Relatives feature, you have multiple privacy options to suit your individual preferences. For complete privacy, you can opt out of DNA Relatives entirely.

How is a feature opt-in if you need to opt-out to get "complete privacy"? What does complete privacy mean in this case and how does it compare to privacy that you can reasonably expect?

I call bullshit on their statement that the feature is opt-in.


When you sign up they present you with the option to fill out a profile for use of the DNA relatives feature. You can either fill it out and join, or click a different button saying you don't want to join.


Maybe before, but that wasn't my friend's experience when signing up.


There are laws around this topic and around what is accessible for the sake of murder and kidnapping cases, which require that people opt in, so I don't think that would be the case.


I think this might just be a slight difference in language usage. "opt-in" and "opt-out" as adjectives mean "something that is default off/on (respectively) that you can choose to select the other mode".

But "opt in" and "opt out" as verbs mean "choose to/not to participate". So you can in fact "opt out" of an "opt-in" feature. It means choosing to keep the default of not opting in.


That's a misuse of those terms. If you're presented with a switch that's in the on position and you don't switch it off, it's arguable that you've opted to leave it on. But if you never realized the switch existed, it would definitely be incorrect to say that you've opted to leave it on, since you can't make decisions about things you don't know exist. Yet both cases would look the same if all you know is which users had an opt-out setting in the on position.


The verb "to opt" just means to make a choice. Programmers created the "opt in/opt out" binary as a useful way of describing how new features get rolled out. But it's perfectly fine to say (for example) that I "opt out" of voting, even though voting is an "opt-in feature" of our society (depending on where you live).

You could rephrase the 23andme statement to be something like “users of this feature can choose from a range of privacy settings. For complete privacy, do not use this feature (which is the default setting).”


Exactly, so I'll reiterate: if all you know is that setting A is opt-out and that user B has that setting turned on (i.e. in its default value), it's incorrect to say that the user has opted into the setting, because you don't even know if the user knows the setting exists, and one can't make a choice about something one doesn't know exists. The most you can say is that the user has not opted out of the setting.


You can’t be sure of that, maybe they think they opted out but failed to do so correctly.


If the default setting is to enable the feature, then the feature isn't opt-in because the user didn't "opt" (ie, choose) to enable it. For a feature to be opt-in, the user must explicitly choose/opt to enable it. Otherwise, the user didn't opt for anything and if you enable it anyways then the choice to be made is to opt out.


This use of opt-out is really an extreme edge case, to be honest. There is such a heavy emphasis on civic duty, part of which is to vote in all kinds of elections, that even though the system might strictly be an opt-in voting system, everybody _expects_ you to have already opted in. To be honest, as somebody not from the US, I don't even understand why you need to go through any hoops to vote. All of it just seems like voter suppression to me.

Anyway, this strong social expectation to vote is the only reason you can even remotely say that you are "opting-out" of voting.

Any other case, where a decision to be made only exists really as a feature in the UI, there is a much tighter relation between the implementation of the UI feature being by default on (thus "opt-out") and the way that we speak about it being "opt-out". If thus, by their own admission, there are things to opt-out of, then this is by definition an opt-out feature. Claiming the opposite is disingenuous.


It's not a misuse, it's the natural language meaning of the word "opt", which long predates anything like the opt-in/opt-out usage that refers to default settings.


Wow. What a shit company. It isn't the user's responsibility to secure the data that you store, it is 100% on you. If the user is reusing a pwd that might be the reason a single account got hacked but to beat a dead horse, you can mitigate that easily as a company.

Karma seems to always work things out.


I also regret giving them my data. At work, we bought one of their first kits. We were doing genetics research and my boss got some for Christmas and gifted them to everyone in the group.

Technically, 23andme is also pretty bad, which is hard to understand given that they employ very competent people and are well funded. Seems like a space ripe for disruption. Their biggest competitor, deCODEme, never really aimed at B2C and just used customers to harvest data, then sold to a pharma.

23andme genetic risk scores are mediocre at best. Promethease makes better predictions. For example, in my case I have a high risk MHC allele, which is both trivial to predict and well understood since the 1980s. Never popped up on their reports, yet it is the first item you see if you feed your 23andme raw data to Promethease, or if you analyze the data yourself.


From a decision-making standpoint it seems difficult to argue that you made the wrong choice. At the time when the $99 kits were ubiquitous, 23andme seemed like a solid, reputable company.

Back then, few people had the mindset of, "if they own my data, they own me." But we're starting to see it take hold.


I don't know about that. Geneology has been a hobby for me for a couple decades and I'd say only tech illiterates were willing to trust 23 and me. I've never seen any company I've worked at do well enough with security that I'd trust them with my DNA and with the constant data breaches across the industry with zero consequential penalties, this seems like the norm. Have you ever seen security done right anywhere? In my experience, it's always the bare minimum. Banks are about as close as it gets and that's only because they have higher obligations than most.


Gave them my DNA last year, am not tech illiterate. It was cool to see the results, though not life-changing. I don't regret the decision - I don't understand why I should care that my DNA sequence is on a shady website somewhere. I don't understand the threat model people have here - how will my life be negatively impacted by this?


> how will my life be negatively impacted by this?

Your would-be future employers may reject you because of this data. Why hire someone with a higher risk of certain diseases or disabilities? It'd be illegal, but companies don't care about breaking the law if it's profitable, and it'd basically take a whistleblower for anyone to know it happened. They certainly won't tell you that's why you weren't hired.

You could be denied housing or be targeted by extremists. More likely, though, you'll be targeted by pharmaceutical companies. If the police didn't already have a copy of your DNA on file, you might now have a place in every police lineup, in any state in the US, for every crime committed where DNA evidence is collected. You could get wrongly flagged as a match through human error or statistics, but either way it'll be on you to hire the lawyer who will have to prove your innocence.

We're moving toward a digital caste system (several really) where the data governments and corporations have on you will determine what you're allowed to do, how much you'll pay for things, and what opportunities you'll have. Every scrap of data you surrender will be used against you by anyone willing to pay for it, used in whatever way they think will benefit them, at any time, and you'll probably never even realize what happened. Just like right now, where companies don't tell you that they used your personal data to determine how long to leave you on hold. There's no telling what kinds of harms this could bring you, and there's no taking your data back to prevent any of it either.

I hope that data never comes back to haunt you. I'd sure hate to need to count on that never happening though.


This seems pretty far fetched.

Do you really think a judge would allow a guilty verdict based on stolen genetic data obtained from a hacker?

Do you really think braindead landlords and HR people would make decisions based on Promethease or whatever future tool replaces it?

Monetarily the genetic data is marginally valuable at best, which is the same reasons 23andme revenue comes almost entirely from novelty-seeking consumers rather than industry.


> Do you really think a judge would allow a guilty verdict based on stolen genetic data obtained from a hacker?

The judge won't have any idea how the innocent person's data got entered into the government's DNA database, the same way that judges don't care how police got your fingerprints on file. (They got mine when I was in grade school. Teachers lined all the kids up in the hallway and the police fingerprinted us all. They told us it was in case we were kidnapped.) The judge cares about how the DNA was collected at the scene of the crime. It's enough that it matched DNA in the government's database. Even if it was discovered that the DNA came from 23andme's data, I doubt they would care.

> Do you really think braindead landlords and HR people would make decisions based on Promethease or whatever future tool replaces it?

They already perform illegal background checks on employees and renters. (see https://money.cnn.com/2014/04/09/pf/data-brokers-ftc/index.h...). Whatever interesting data can be extracted from the DNA that was leaked will be added to the dossiers data brokers have on the victims.


> This seems pretty far fetched.

At the beginning of Hitler's reign, the Nazis started asking people on many occasions for so-called "Ariernachweis" papers. Those were collections of documents showing that someone's ancestors were pure according to their race theory. Many people didn't question this at the beginning. Later, that data was used to round up minorities, i.e. to commit the well-known atrocities.

Once data is centrally collected, you cannot know for which future purposes it'll be used. So, the question with regards to companies like 23andme should be: Do you trust the current owners, all future owners, and current and future business partners to not misuse and safeguard your DNA data?

> Monetarily the genetic data is marginally valuable at best

Tell that to big pharma, health insurers, adoption agencies, dating sites, and companies that produce addictive products for consumers.

> Do you really think braindead landlords and HR people would make decisions based on Promethease

They have been shown to make decisions based on DEI declarations. I rest my case.


> Do you really think braindead landlords and HR people

Maybe that part is far fetched. But insurers will make use of it, I'm sure. By letting this data out there you might be opting in to higher costs, or hassle getting insurance at all.


You convinced me that, as I was already suspecting, there is no more risk in having your dna public than, for instance, having a picture of you on the internet. Arguably, even less.


"No more risk" is an odd way to frame it. It's all compounded. Having a pic of yourself online is a risk. Having your DNA leaked is a risk. Carrying a cell phone is a risk. Using Google is a risk. The more risks you take, the more likely you are to get screwed over.

It doesn't really matter if you're the guy who gets arrested for riding his bike (https://www.nbcnews.com/news/us-news/google-tracked-his-bike...) or the guy who gets arrested because of his DNA (https://www.science.org/content/article/forensics-gone-wrong...) or the guy who gets arrested due to facial recognition (https://www.cnn.com/2021/04/29/tech/nijeer-parks-facial-reco...) it's going to suck for you either way. They're all just different types of ammo that will eventually be used against you somehow or other.


You forgot fingerprints, photos, AI, medical history through routine exams, spending habits harvested through CC use and any form of digital banking, living habits analyzed through electric bills, internet activity, auto use, travel, etc.

Your fear is misguided and you have already lost the game.


You seem to be supporting the fact that this is a valid concern. Every piece of data can (and eventually will likely) be used against you at some point. The more data you give up, the more ammo you're handing over to the people today and tomorrow who want to exploit you. DNA contains a ton of data, and it's very different from the data in your utility bills or your GPS history. Keeping your DNA out of the dossiers data brokers keep on you would be a smart move even taking into account how much other data they already have.


The breach affects those related to you, and affects you multi-generationally, so there's a lot of time for the impact to materialize. There are strong financial incentives for genetic discrimination on the part of insurers and employers. There are also plenty of fascists who are happy as clams to discriminate against anybody with certain genes.

If there’s any reason not to care, it’s not the lack of impact, it’s the impossibility of securing the data. I could sit here all day and convince you that you should care and then your cousin would get a dna analysis done and that would ultimately make all your caution mostly irrelevant. The only effective way to ensure genetic privacy is a legislative effort to control access to genetic databases, trying to avoid being put in such a database is only going to slow down how fast this happens.


Once they sell it to your insurance company and they deny you some type of coverage for whatever reason.


see, this is where I don't get it. Can you send your material anonymously (burner email, pay in cash/crypto/prepaid debit card)? Then how could they match your DNA to your identity to sell it to insurance companies, etc?


I'm sure that is possible. But what ratio of the public actually would think of doing that?


Ask one of the serial killers now rotting in jail thanks to 23andme!


Was there ever a case of a convict getting caught by their direct DNA being found in one of these databases? I thought all cases were correlated through relatives: the government gets a DNA sample, asks the databases "who do you know that's a genetic relative of this suspect?", and then they go and interrogate that person's every third cousin. You can't keep your family tree private; your birth certificate is out there. Opting out of 23andMe won't help you here.


A couple. One infamous case was the Golden State Killer[1] case, though there it was GEDmatch, not 23andMe (a similar service).

1. https://en.wikipedia.org/wiki/Joseph_James_DeAngelo#Investig...


But that’s just it. You now get the 23 and me defence… my DNA data was hacked therefore there’s reasonable doubt the DNA linking me to the murder was synthesized from the 23 and me leak.


The fact that there is so much potential for the use of that information yet we haven't even started putting mechanisms in place to utilize it is what scares the hell out of me. If the threat model is 'to be determined' then especially when the data is being used commercially, for somewhat trivial reasons and without any substantial legal protections then the way to act is 'from my cold, dead, hands'.

Just remember that no matter who in charge you think is neutral or bad or great, things change, attitudes change, shit happens (remember the Patriot Act?)...

It isn't paranoid to say 'I don't trust the future, let's act cautiously instead of frivolously with things that have the potential to be extremely valuable to me, extremely impactful to society, and in which currently sits the greatest unexplored potential of this generation'.


Cool so you going to wear gloves every time you eat out, touch a door, hold a glass etc.? You’re shedding DNA throughout the day. How is that fundamentally different?


If I said I didn't want to go hunting with Dick Cheney, would you ask if I wore a bulletproof vest everywhere I went? When people look before walking into an intersection, do you ask them if they erect bollards in front of their house?

But ok. Next time you go in for surgery tell the doc not to wash their hands because you aren't a scaredy cat.

Refusing to willingly take stupid risks is different than trying to live a life without them at all.


I agree, I have had my raw DNA data and all the analysis results publicly available since the day I took the tests. No problems.

https://globatic.blogspot.com/2013/10/the-23andme-full-and-r...


Biometric auth is being used more and more every day. It's not hard to see DNA-based requirements, or DNA-based crime and impersonation, in the future. Gattaca is still far off, but one step closer than it was.


Someday when DNA synthesis machines have enough write length to be able to synthesize entire human chromosomes, someone with your genome data could clone you without your consent. Even if this takes 50 years to become possible, you still might not want unauthorized clones of you being made using data that you gave up when you were younger.


That's not possible with 23andMe data: they have 640,000 SNPs, not the entire genome/exome/methylome. They have 640k points where the genome often differs between people, but your own genome is 3 Gbp long (3,000,000,000 base pairs), with usually a few million SNPs per person. 23andMe has only a subset of the diversity in your genome.
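To put those numbers side by side (the 4 million variant sites per person is my stand-in for "a few million"; the other figures are from the comment):

```python
snps_genotyped = 640_000         # SNPs on the genotyping array, per the comment
genome_length = 3_000_000_000    # ~3 Gbp human genome
variant_sites = 4_000_000        # typical SNPs per person (assumed value)

fraction_of_genome = snps_genotyped / genome_length
fraction_of_variants = snps_genotyped / variant_sites
print(f"{fraction_of_genome:.3%} of the genome")
print(f"{fraction_of_variants:.0%} of one person's variant sites")
```

So the array covers a tiny sliver of the genome, and even of the sites where people actually differ, it captures only a fraction.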


That's good to know. In that case, my concern would lie with the physical saliva samples that 23andMe has retained, since they could be comprehensively sequenced later.


That is true! Samples are usually good forever in the freezer. Do they keep all samples?

Running -80C freezers is not cheap! I have 3 -80C freezers in my lab, those large chest freezers, and each uses 22 kWh per day, for a total of 66 kWh per day. Apparently the average US household consumes 29 kWh per day, so we use more than two households' worth.

Our freezers certainly don't hold the 14 million samples 23andMe supposedly has, more like in the low thousands. They'd need the power-usage of a city to keep all those samples OK!
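Scaling those figures up (the samples-per-freezer number is my guess for "low thousands"; everything else is from the comment):

```python
kwh_per_freezer_per_day = 22     # per -80C chest freezer, per the comment
samples_per_freezer = 5_000      # "low thousands" -- assumed value
total_samples = 14_000_000       # figure cited above
household_kwh_per_day = 29       # average US household, per the comment

freezers_needed = total_samples / samples_per_freezer    # 2,800 freezers
daily_kwh = freezers_needed * kwh_per_freezer_per_day    # 61,600 kWh/day
households_equiv = daily_kwh / household_kwh_per_day
print(round(households_equiv))
```

Under these assumptions it comes out to roughly two thousand households' worth of power per day: more a small town than a city, but the same basic point about cost stands.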


You can extract the DNA and store that instead, and they already had to do that for their analysis in the first place. Far smaller volume than the raw sample.

Storing this for an effectively indefinite amount of time is not uncommon. I used to work at a clinical genetics lab, and some material had to be stored (by law!) for a whopping 120 years.


As opposed to getting a bit of hair/dead skin from you, they rather hack into some digital system?


You don't hack into the system to obtain the data. You buy it. It's literally the most profitable thing on the planet right now.


I hereby authorize any and all clones anyone ever wants to make out of my full DNA or parts thereof.


Ignorance is bliss.


>Have you ever seen security done right anywhere? In my experience, it's always the bare minimum.

Google is pretty good at security, no?

https://cloud.google.com/blog/products/management-tools/lear...


Google itself is a threat, about as user hostile of an organization as there has ever been.


That was not the question asked. Google is good at data security, and at the same time they are probably the biggest privacy violators, and both of this can be true.


If you talk about security, you're talking about threats. Google fits into both categories, so sorry, GP is right.


Some users may consider targeting advertising to be a threat, others less so.

My take is, if targeted advertising is the biggest thing you're worried about in terms of cybersecurity, you are probably doing reasonably well at staying secure.


This is the kind of answer I like to see on this forum, and I hope more people become as aware and outspoken.


It's the kind of answer that makes me wish people actually followed the HN guidelines. I understand people are very passionate, but I prefer to make decisions based on facts. My experience is that the people who are super passionate often show themselves to be uninformed when you drill down and start asking them hard questions about their strong claims.


This seems like hyperbole.

I'm not particularly worried about Google stealing my credit card number and making fraudulent purchases. But I am worried about criminal organizations doing this.

There's so much FUD on HN about big tech companies, but I'm skeptical that their wickedness actually lives up to the hype. I suspect it is more of a clickbait miasma (journalists hate Google because they took revenue from the media industry) than anything based in fact. Google provides free and useful products (Google Search, Gmail, Android) to people across the world. Billions of people use this stuff voluntarily -- why?

If Google is "about as user hostile of an organization as there has ever been", it should be easy for you to come up with at least 3 examples off the top of your head (no searching for "14 ways Google is evil" listicles) of them being at least as nasty (on a per capita basis relative to the population of people they have relationships with) as the literal mafia. It should be no trouble at all. So, could you please do that for me?


1 - They built a product to follow everyone around online and in real life, breaking multiple consumer protection laws in the process, say 40 of our 50 states https://www.npr.org/2022/11/14/1136521305/google-settlement-...

2 - They've created an illegal advertising monopoly https://www.justice.gov/opa/pr/justice-department-sues-googl...

3 - Violate the law to illegally collect data on children, say multiple states https://www.ftc.gov/news-events/news/press-releases/2019/09/..., https://iapp.org/news/a/google-new-mexico-ag-settle-coppa-al...

Over a billion people smoke cigarettes; that doesn't make it a good idea. Most of the world's population can't afford an iPhone, so they are left using Android.

Is it unreasonable to think they're probably doing something else illegal that hurts us right now? Or that they'll use their ill gotten treasure hoard to buy the resolution of their choice when they're caught again?


I still think your statement was hyperbole, but thanks for giving me the list I asked for.


The company pwned by the NSA for over a decade?


I agree that if the NSA is your threat model, then you shouldn't trust any company.

I also think we can learn a lot about security from Google even if they comply with federal court orders requesting user data. "Willingness to comply with federal court orders" and "competence at securing data against cyberattacks" are two different things.


"If NSA is your threat model" sounds like something somebody from the 1980s would say. We've known for a long time now that they spy on everybody, that they share the data, and that they Five Eyes the data they can't get. The NSA is everybody's threat model, and it has been for a long time. Intervening in electoral politics, getting private companies to do their bidding... where have you been?


The turn this thread has taken has been interesting. A few comments ago, stcroixx wrote:

>Have you ever seen security done right anywhere? In my experience, it's always the bare minimum.

I think there's a lot of ground between doing the bare minimum for security and hardening your organization against the NSA. Every step towards greater security is a step I support, even if your organization isn't able to reach the "hardened against the NSA" level.

I'm happy for you if you want to harden yourself against the NSA, but I dislike black-and-white thinking. I care about harms to users which come from non-NSA threats too. Case in point: the original post about hackers selling 23andme data -- presumably to clients who are not the NSA, in some cases.

If every discussion of how to improve security gets derailed into a discussion of how evil the NSA is and how practically no one is secure against them, then organizations will continue to do security badly, and we'll see more breaches like this 23andme breach. Fatalism is a self-fulfilling prophecy. I see it every day here on HN.


When "your" military officers are selling state secrets for $5k in bribes [0], you realize there's probably very little you can do to prevent bad actors in positions of trust from blowing up any security model anywhere. Your only choice is between minimizing your risk and hoping for the best, or rolling your own everything, not taking part in anything modern, and living and dying alone. And even then, there's still probably going to be a file on you somewhere.

[0] https://abcnews.go.com/US/2-us-navy-sailors-arrested-alleged...



What's interesting to me, remembering this, is that back then, even that late into Google's life, Google had enough people who were actually pissed off about this to try doing something about it. Google of today? I have the sense that management would just shrug its shoulders and let any nation-state-backed group that pleases continue the violations.


There's a mutual cynicism here. If Google's users think: "Google will violate my privacy no matter what, there's no point in complaining", then Google's executives will think: "Users will believe we are weak on privacy no matter what, there's no point in protecting user privacy".

To break the cycle, it helps to share concrete evidence of Google misbehaving rather than just presenting it as a fact that everyone knows. You get what you incentivize. If the feeling that Google sucks on privacy isn't linked to specific Google misbehavior whenever it is brought up, Google execs will correctly realize that users will feel the same no matter what decisions they actually make.

As a concrete point for discussion, in the zdnet article it states:

>After the news about NSA snooping first broke over the summer, Google decided it was time to start encrypting its datacenter-to-datacenter communications.

Is there an analogous security story from more recently where Google didn't try to address the problem in a similar way?


> 23andme seemed like a solid, reputable company.

One thing I've come to realize over the past couple of decades is that with internet/tech/VC startups in particular, the statements they make about goals, philosophy, core values, and ethics are subject to change as needed to secure more funding, increase revenue, or in case of acquisition.

You really cannot trust what any company says until they've been in business at least ten years with an unbroken record of responsible, trustworthy operation. And even then it can all change with a merger.


In other words, you cannot fully trust a company, as long as it is a collection of brains whose membership may change at any time.


Unless it signs a social contract that legally binds it to dissolve upon turning evil -- but that would take some extremely principled owners.


Well, maybe. I for one _absolutely_ didn't participate b.c. I didn't want my DNA and personally identifying information owned by any company. I can't imagine that there aren't many others like me.

I would, however, love to send my DNA to a company if they could provide the results without knowing any information about me whatsoever. For instance: I would be more than willing to buy the kit with cash and send it back with a burner email. Has anyone heard of such a service?


But then, without all that extra data, they would actually have to do some DNA testing, rather than determining your likely background heuristically.


I read somewhere a while ago that the FBI gets free access to the data, which was enough reason for me. This is just icing on the cake. Though more than likely a few of my relatives sent it there already, so not that it matters anyway.


There is a reason Pootin, and some other world leaders, have a black case carried behind them by a member of security team. This is to collect poo so no genetic information falls into the vials of the enemy.


> Back then, few people had the mindset of, "if they own my data, they own me." But we're starting to see it take hold.

People have been screaming this from the rooftops even back then.


This is hilarious, and completely absolves everyone from bad decision making like this. My immediate reaction to 23andme was "there's no way in hell I'm sending something as private and personal as my DNA to a private company".

Why? Because there's no telling what happens to it. It's a failure of judgement to believe that just because a company is reputable today that it will be reputable tomorrow. Companies change owners, they change board members, they get bought and sold. And _hacked_.

So let's stop this nonsense of giving everyone a free pass because it was a "solid, reputable company". Maybe we can give grandma a pass, but someone on a technically minded forum such as HN should know better.


Good thing it's not even a personal decision to make... your brother, sister, mother, or father can make that decision for you.


Paranoid me can imagine a situation where some political enemy needs to be eliminated and a fall guy needs to be found. Sprinkle a little 23andme-acquired DNA on the scene and some random citizen gets convicted.


Lab-grown meat is a reality now, so this isn't too far from possible.


>Back then, few people had the mindset of, "if they own my data, they own me." But we're starting to see it take hold.

Really? You're being either very generous or very naive here, because even back then it seemed blindingly obvious that it's a bad bloody idea to trust a tech company of nearly any kind to safeguard your data securely or honestly. Then double the paranoia when it comes to your genetic information. For somebody working in the tech space in particular not to have been cynical about this is plainly absurd.


> From a decision-making standpoint it seems difficult to argue that you made the wrong choice. At the time when the $99 kits were ubiquitous, 23andme seemed like a solid, reputable company.

I am interested in genetics, but I didn't trust google, and I trusted a google spouse company even less (it's like John Lennon's Google, and Yoko Ono's 23andMe, when I didn't trust Lennon to begin with) and my data hasn't been spilled. Half of you are thinking of all sorts of epithets to call me, but fact is, I was right about 23andMe. From a decision-making standpoint, slam dunk for me and anybody who listened to me. It was not an unusual position to take. "What. Could. Go. Wrong?"

(I'm fully aware they probably already have my data from numerous blood tests I've taken from normal medical checkups, etc. but what could I have done about that?)


I don’t know. I trust Google the most. They’re basically an aimless company with the strongest technically secure infrastructure I’ve seen and the strongest privacy policies implemented (just read about their infra). I think people give weight to what companies say versus what they actually do. Just because Google runs an ad company doesn’t mean shit. What they’ve done with your data (their “actions”) for the past 25 years means way more (they absolutely do nothing with your data — in fact, they seem to completely waste it).

The link between 23andme and Google is tangential at best. Anne Wojcicki and Sergey Brin were married and that’s it, but they are completely two different people.

I have no idea how you’d ever consider 23andme a reputable company. Reputation comes from 20+ years of history — your actions, not what you say. 23andme is not even 20 years old yet — how can you trust something that young?


I forgot which company I used, but part of their deal was that you had the ability to delete your DNA data. Which I guess would’ve come in handy for a lot of people with 23andme. If you had chosen to do so of course.


This seems to be more a consequence of being FDA regulated. Promethease is just providing a lookup service, cross-referencing a genotype file with SNPedia, which offers no direct prediction claims IIRC.

https://www.snpedia.com/


> yet it is the first item you see if you feed your 23andme raw data to Promethease, or if you analyze the data yourself.

If you had not mentioned this i would have thought they weren't testing people at all.


Have you deleted your data? They provide a clear button to do so...


Do you trust that your data, that they fully control on their system, is actually deleted? Including their backups that reside elsewhere? I just assume they set your record's "delete" field to "true" when you press that button. No way for us plebs to be certain about what they do.


> Have you deleted your data?

Yeah, right. ANYTHING you put in the cloud or any other digital media no longer belongs to you. I suspect 23 sold this valuable data, that any insurance company would kill for, to the highest bidder. When will we learn we cannot trust any company with our info?!?


Making claims like this requires evidence. You can claim that they might mishandle your data due to incompetence, but claiming that they are illicitly selling it is a whole other thing.

They say they provide your data to (to various levels, not necessarily genetic data):

- Service providers (Fedex knows you receive a shipment from 23&me, physical storage of your sample, someone hosts their servers)

- Sharing to people/entities at the user's direction

- Any future commonly owned entities (23&me goes through a merger, new org has your data)

- Valid court orders ("23andMe will not provide information to law enforcement unless required by law to comply with a valid court order, subpoena, or search warrant")


I'd be surprised to hear about anybody deleting records on backup systems (i.e., backups that exist for disaster recovery purposes). It would be pretty difficult to do that in a commercially viable way (but I'm open to hearing about some creative ideas).


There are methods of doing it, but it is complex.

Basically you either have hot backups you can delete from (bad for obvious reasons), or your backups expire after a set time (this is most common), or you encrypt each record with a key that is stored separately, so you only have to destroy the key to make the data irretrievable.

Of course, that system has to be backed up, etc, etc.
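A toy sketch of that third option (crypto-shredding), with all details invented for illustration: backups only ever hold ciphertext, and the small key store is the only thing that needs a deletion path. The XOR keystream here is a stand-in for a real cipher like AES-GCM.

```python
import hashlib
import os


class CryptoShredStore:
    """Toy crypto-shredding sketch: each user's record is encrypted with a
    per-user key held in a separate, small key store. Backups hold only
    ciphertext, so deleting the key renders every backup copy unreadable."""

    def __init__(self):
        self.keys = {}     # hot key store (small, easy to manage/retire)
        self.backups = {}  # stands in for immutable cold backups

    def _keystream(self, key, n):
        # Illustrative keystream only -- use a vetted AEAD cipher in reality.
        out, counter = b"", 0
        while len(out) < n:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:n]

    def store(self, user, record):
        key = os.urandom(32)
        self.keys[user] = key
        ct = bytes(a ^ b for a, b in zip(record, self._keystream(key, len(record))))
        self.backups[user] = ct  # ciphertext can be copied to any backup tier

    def read(self, user):
        key = self.keys.get(user)
        if key is None:
            raise KeyError("key destroyed; data unrecoverable")
        ct = self.backups[user]
        return bytes(a ^ b for a, b in zip(ct, self._keystream(key, len(ct))))

    def delete_user(self, user):
        # "Delete" without ever touching the backup media.
        del self.keys[user]
```

The backups themselves stay untouched; only the key store needs careful lifecycle handling.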


Isn't it required to be available by EU law?


Apparently GDPR doesn't cover this case specifically but several enforcement authorities have issued some guidance. The only reasonable approach that I've seen is to maintain a log of deletion requests and ensure that if a backup is used to restore operational data that the deletion request is applied against the restored system.

Ironically, I've responded to deletion requests made by email in which the person did not have any records in our systems, until receiving the deletion request containing their name and email address.

https://verasafe.com/blog/do-i-need-to-erase-personal-data-f...
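A minimal sketch of that guidance (class and method names are mine): persist the deletion-request log outside the backup cycle, and replay it whenever a backup is restored.

```python
class RestorableStore:
    """Sketch: keep a durable log of deletion requests and replay it
    against any data restored from a backup that may predate some of
    those requests, so a restore never resurrects deleted users."""

    def __init__(self):
        self.records = {}
        self.deletion_log = set()  # persisted outside the backup cycle

    def put(self, user_id, record):
        self.records[user_id] = record

    def delete(self, user_id):
        self.records.pop(user_id, None)
        self.deletion_log.add(user_id)  # remember the request itself

    def restore_from_backup(self, backup):
        # Apply every known deletion request to the restored snapshot.
        self.records = {k: v for k, v in backup.items()
                        if k not in self.deletion_log}
```

The trade-off, as the comment above notes ironically, is that the log itself retains an identifier for the person who asked to be forgotten.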


If user data in backups is encrypted with a key kept in warm/hot storage, you can delete the key and effectively delete data in cold storage because it is unrecoverable. This obviously must be per-user keys and set up in advance so I doubt they have it available, but that's one potential method.


Where do the keys get backed up to? If nowhere: If the key system fails, do you just accept the loss of _all_ of your customer data? If somewhere: by what mechanism do you delete the key in a backup?


The keys dataset is going to be much smaller than everything else, and possibly immutable per customer. This in principle makes it much simpler and cheaper to handle, so it's easy to imagine how---depending on the scales and technologies involved---one could isolate it and achieve the desired redundancies and retention periods in ways that would be impractical with the full customer dataset.


That’s a long way to say “it’s definitely possible but I haven’t architected a sane solution yet.”


I find this comment rather baffling. Can you really not see what I'm talking about without having a concrete (and likely completely irrelevant to 23andMe's requirements) example spelled out for you in detail?


You didn’t even give vague details, let alone a concrete example in detail. I asked a question, and your previous comment boils down to basically, “it’s totally possible.”

My theory is that you’re making a concession somewhere in the backups to a separate keyring system. Either there is no cold backup, or you don’t do cold backups at all, or your cold backup is actually semi-warm and needs to be hooked up to a system intermittently to be reconciled against production (in which case, the backups need backups to protect against a failure on the reconciler system.) The onus is on the answerer to tell me how they would avoid one of those concessions. Respectfully, anything else/less is just fluff like, “it’s totally possible.”


The whole point of isolating the keys dataset is that you can reason about it differently. You frame these "concessions" as dealbreakers, but I don't think that's supported. Can you explain why this dataset would need cold snapshots retained past a period users would accept as a delay for guaranteed account deletion---say, two weeks? Replication already has you covered on acts of god, so we're worried about things like bad code pushes, "hackers", and so on.


> so we're worried about things like bad code pushes, "hackers", and so on.

I'm confused. Are you saying those are small concerns? Because I'm saying the backup mechanism for the keys surely need to be resilient to all of those.


No, they aren't small concerns, but you haven't shown that extended retention of cold snapshots (>2 weeks, to use my previously stated example) is necessary to satisfactorily mitigate them. What is the argument for needing a significantly longer retention period?


Not only is deleting user data hard (backups, snapshots, archives), but when they build data from your data, that isn't considered "your" data; that is their data. So even if they delete all of your data, it's not really gone.


I eventually did, and also opted out from trials and sharing with pharmas prior to that.


23andme doesn’t delete your data. By law, they have to keep it for several more years before deletion


So you're saying they are lying?

> Delete your 23andMe account and personal data, including your personal information, genetic data, and other information collected through your use of the Service.

> Upon receiving your confirmation we will process your request to delete your data, and you will no longer be able to sign-in to your account. Please keep in mind it may take up to 30 days to fulfill your request.

What's your basis for your claim?


https://www.bnnbloomberg.ca/deleting-your-online-dna-data-is...

“The federal Clinical Laboratory Improvement Amendments (CLIA) of 1988 and California laboratory regulations require the lab store your de-identified genotyping test results and to keep a minimal amount of test result or analysis information,” an email from 23andMe said. “Our laboratory will retain your genetic information and a randomized identifier on their secure servers for a limited period of time, 10 years pursuant to CLIA regulations.”

I was friends with a former exec. At a party I remember him callously mentioning that you can’t delete data off their platform. I inquired a bit more and he pointed me to one of the laws referenced above

23andme is not lying in a legal sense. They seem to be deceptive though


Thank you for filling in the gaps.

So they delete everything they have under your name and the lab retains a copy of your sample and the sample results?

On paper, this is sufficient, because a sample + results is no more useful to someone nefarious than a piece of hair found on the ground...

It only becomes a problem if they fail to delete any identifying information that could connect it to you...


Facebook has this same feature!


> Seems like a space ripe for disruption.

I think there's plenty of room for a new competitor to make money, but you can bet that any similar service is going to be just as bad for people's privacy and security. Nobody is going to collect and store that kind of data without either selling it or being forced to turn it over.


Personally I think this crime forum just democratized whatever unethical activity 23andme was probably doing anyway.


Not probably, they've certainly sold dna data

https://www.forbes.com/sites/nicolemartin1/2018/12/05/how-dn...


I mean, the people whose data was sold did consent, and doing this was always 23&Me's mission. Even though the results have been less than stellar, I've generally thought that having large-scale genomics data from consenting patients to work on human health problems is a generally good idea.


So companies shouldn't allow users to use poor passwords?

Because it seemed harsh to me: maybe I don't care about the security of some data and so use a weak password; that's on me, surely?


As someone who reads hackernews and works on security systems, yes, absolutely. Run their passwords through haveibeenpwned and disallow anything that shows up.

Based on the feedback I hear from my non-tech friends and family, not allowing them to use their single password used for everything would be a good way to exclude those folks from using whatever service you're trying to sell them.
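A rough sketch of the haveibeenpwned check mentioned above, using the Pwned Passwords k-anonymity scheme: you only send the first 5 hex characters of the password's SHA-1 to `https://api.pwnedpasswords.com/range/{prefix}`, then match the remaining 35-character suffix locally against the returned `SUFFIX:COUNT` lines. The network call is left to the caller here so the matching logic stands alone.

```python
import hashlib


def pwned_prefix_suffix(password):
    """Split a password's SHA-1 hex digest into the 5-char prefix sent
    to the HIBP range API and the 35-char suffix matched locally."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def is_pwned(password, range_response):
    """Check a password against the text body returned by
    GET https://api.pwnedpasswords.com/range/{prefix}, which is one
    'SUFFIX:COUNT' pair per line. No network call is made here."""
    _, suffix = pwned_prefix_suffix(password)
    for line in range_response.splitlines():
        candidate, _, _count = line.partition(":")
        if candidate.strip() == suffix:
            return True
    return False
```

At signup or password change, reject (or at least warn on) any password for which `is_pwned` returns True.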


I was being argumentative there; I think email-based auth is probably right for a site like this. Force complex passwords and people will often just do a reset every time anyway.


2FA is the standard these days for accessing PII. I can't think of data more sensitive or personal than your DNA.


Companies have legal (and moral) obligations to protect user data. Those obligations don't go away for users that don't personally care.


Perhaps not legally, but morally? I'll have a good chin-scratch on that one. Thanks for your comment.


Look at the leadership’s connection with google and this becomes easier to understand.

Then again we walk about touching things, leaving convenient DNA samples for anyone to collect throughout the day. It’s only because sequencing is relatively high cost at present that we have the illusion of privacy. Once we go in equivalent terms from mainframes to single board computers in the DNA world, then anyone can pretty much have at your personal genome.

Would love to see how the criminal forensic science guys adapt their narrative to this impending reality.


How would you mitigate password reuse?


I also don't understand why more companies don't also implement reasonable rate-limiting and abuse detection. A suite of hacked accounts should not be able to scrape millions of users worth of data.
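For what it's worth, even a crude per-account sliding-window limiter would have made mass scraping expensive. A minimal sketch (limits and names are made up; a real deployment would tune them and alert on accounts that repeatedly hit the ceiling):

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` profile views per `window` seconds per
    account. A compromised account scraping thousands of DNA Relative
    profiles would hit this ceiling almost immediately."""

    def __init__(self, limit=60, window=3600.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # account_id -> timestamps

    def allow(self, account_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[account_id]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # throttled; also a good signal for abuse alerts
        q.append(now)
        return True
```

Call `allow(account_id)` on every profile-view request and return HTTP 429 when it comes back False.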


This doesn't make sense; hackers don't have to use your intended API. They probably got access to the database directly.


According to the article this data was acquired by scraping data from a subset of compromised accounts.


There is no reason not to rate limit your user frontend. I have several pages with rate limits because of prior bot attacks, and I regularly rate limit pages with 'interesting data' against scraping bots.


Man, #2 reminds me a lot of the mechanism by which Cambridge Analytica exfiltrated so much Facebook data. I wonder if the 23andMe CEO will be dragged in front of Congress for the next 10 years.


Would it matter if they did? Who at Facebook or Cambridge Analytica is behind bars right now because of what they did?


It's just a DNA-based social network…


Where you get to waive privacy for all of your close relatives.


The issue is that 23andMe is a honey pot. That's fundamentally what's at play here.


What do you mean by honey pot?

I only know the meaning where it is a deliberate thing to attract bad actors. Is the entire company not meant to provide genetic information?


The entire Internet is a honey pot. You get something shiny, companies and governments get increasingly intrusive amounts of data about you.


A honey pot is just a place where sensitive and valuable information is centrally stored.

They can be unavoidable but need to be well protected. Most companies fail at this.

All it takes is one employee to pip install the wrong library, or fall victim to a phishing attack or a third-party vendor attack, and it's over. And on a long enough time scale, it happens.


GP is correct, you are using the term to mean something different from everyone before you:

https://en.wikipedia.org/wiki/Honeypot_%28computing%29

TL;DR: A honeypot is like a bait car in a police sting operation; It's something that looks like a vulnerable system with something of value to attackers, but in reality is a fake meant to catch intruders or collect data on them.


Using opt-in for features that worsen user privacy or security is gross behavior. It should be disqualifying for any company that handles sensitive information and they deserve all of the blame for the negative effects that follow. Trying to weasel out of that blame makes the company even more sleazy.


You'd prefer they be opt out so privacy is only available to the most attentive?


There is confusion about the terms “opt-in” and “opt-out” which companies have gladly exploited. See Google Fi recently with a disclosure stating that Google will “opt you in” to their customer data-selling policy.


I think we shouldn’t describe this as a confusion or ambiguity or anything like that. “Opt you in” is simply a nonsense phrase. Opting in means to proactively agree to something. It is impossible to be opted into something.

They are just lying, and you know they know they are doing something unethical, because they wouldn’t need to make up nonsense phrases otherwise.


23andMe is one of my biggest regrets in my life, I did it before I really started caring about privacy. It was on sale on Amazon and yeah.

It sucks since I have zero trust that they would actually delete my data if I asked them to, and even if they did, that data may still be living somewhere, fed into some model or research or whatever.

I hate to say that there is a part of me that kinda doesn't want to know if my data is a part of this, this just feels different than a credit card or some shopping data leaking.

Even worse, I have not logged into 23andMe in a while, and this was likely also before I cared about proper password security, so it would not surprise me if my account were among those compromised.


It doesn't really matter. The genetic matching models are so primitive that there's really not much of value in the data anyway. Facebook and Google have a far more accurate picture of what can be used to manipulate you. 23andme is little more than a curiosity.

My data is probably in this breach too, and I am not going to waste one second of my life on regret over it.

Stay tuned to find out if you get something from the class action lawsuit. Like the Experian hack, you may at least get an interesting coffee table artifact when they mail you a pathetically small check.


The one issue is, unlike social data, you’ve not just given up your data, but you’ve condemned your entire future lineage to this leak. All their genetic information can be partially imputed from this info.

Maybe this doesn’t matter today but who knows what it’ll be valued at 50 years from now?

Don’t upload your saliva to the internet folks.


This is also my opinion: they stole a really poor social graph, that's all


Knowing who's related to who is a very useful social graph if your goals are unethical.


Or if your goals aren't unethical!


Terrorists are buying the data on the black market to get a list of Jews to target, because being Jewish is something that shows up in your genetics.


You think terrorists need genetic data to find some Jews to target?? That's certainly not the low hanging fruit ... sounds like an urban myth to me.


Endogamy was also historically common in the Jewish diaspora before the Contemporary Period, so on average they will probably see a lot more distantly related people via 23andMe which creates an exponential increase in the amount of data scraped per user account.


You know, I hadn’t considered that as a consequence of endogamy! You’re totally right.


> Terrorists are buying the data on the black market to get a list of Jews to target, because being Jewish is something that shows up in your genetics.

There are easier ways to find Jews...


They are? Source?



As cute as the implication is, I just read several of the articles and see no specific evidence about the terrorist groups, just the leak. I'll finish the last one, but that's a weak association if I've ever seen one.


> 23andme is little more than a curiosity.

If you call criminal convictions "little more", sure.

Many people consider that a good thing, but I'm not talking about moral valence or vibes - the point is they are an industry supplier of PII that sends people to jail.


What a strange thing to say. If you committed a crime and the genetic information you gave to some company leaks, and someone is able to use that information to prove you're guilty, what sends you to jail is the fact that you committed that crime. The existence of this company and the leak enabled this particular way of proving you guilty, but you were still guilty, so you might have been successfully proven guilty by other means. But most importantly, you're guilty. You're going to jail because of your actions, and for no other reason. It's not like there's someone out there using this information to frame other people of crimes they didn't do.


What puts you in jail is the combination of laws and their enforcement mechanism. Having everyone’s DNA makes enforcement of any law (good or bad) a lot easier.

Your argument only makes sense if you believe you are innocent of all present and future crimes, which is an impossible fact to know by the very fact that you do not know the laws in the future.


No, that's wrong. Even if my actions are criminal unknowingly to me, what puts me in jail is those actions. That my actions had consequences that were unknown to me (in this case, breaking the law) doesn't change the fact that they were the trigger of the sequence of events that led to me being in jail. After the sequence started, there could be any number of events outside of my control that could lead either to my very quick prosecution or to it being impossible to ever catch me.

Your last sentence is a counter-argument against excuses for surveillance. "The argument 'you have nothing to worry about if you did nothing wrong' means you think you know what laws in the future will be like." It's inappropriate here because we're not talking about surveillance; we're talking about people who willingly gave out their genetic data to some company. The company did not divulge the data, so this discussion is not about privacy and ethics. Maybe don't tell Facebook everything there is to know about you? Maybe don't send genetic samples to whoever to learn random nonsense that doesn't matter about your ancestry? If you do and it leaks, TS?


>It's not like there's someone out there using this information to frame other people of crimes they didn't do.

pretty bold claim -- as with any additional justice element the innocent will be trampled to some degree.

Here's the thing : it doesn't just open up criminals for prosecution, it also opens up the 23andme population to an even larger arrangement of possibilities for self-incrimination and 'incorrect or misdirected pursuits of justice'.

It's nice that a well-known murderer was found out, but the reality of the situation is that we likely wouldn't hear very loudly the stories of innocent folks with cases mishandled by law enforcement bureaus that were enabled by 23andMe or similar services. It looks bad on the law enforcement agencies and it looks bad for the corporation.


This argument can be generalized against anything police or prosecutors use to convict people. Fingerprints can falsely implicate someone in a crime. Witnesses are notoriously unreliable. If we allow prosecutors to present fingerprints or witness testimony as evidence, innocent people will sometimes get convicted. The most effective way to minimize false convictions is to never convict anyone.


>pretty bold claim

I don't think so. How would that even work? It's not much different from framing someone of a crime just by knowing their blood type.

>we likely wouldn't hear very loudly the stories of innocent folks with cases mishandled by a law enforcement bureaus that were enabled by 23andme or similar services

Nah. Let's suppose that someone made a typo and Mr. Buttle is accused of committing the crime Mr. Tuttle, who deposited his genetic material, committed. While the Buttle-genome link is a useful piece of evidence, it's on the judge to request a saliva sample from Mr. Buttle to ensure a trustworthy source of information. If he doesn't, well, it sucks for Mr. Buttle that he got a lazy judge, but it's not on the company. The person who wrote down the name could not have known that information would at some later point be misused to lock up an innocent man.


In this case Mr Buttle would just need to get genotyped again to prove his innocence.


Or, you had some connection to the victim or scene of the crime and the presence of your DNA is treated as proof of your involvement, because juries are often swayed by technospeak and irrational arguments.


That jury would have been equally swayed if the sample had come directly from me.


You might not have been a suspect otherwise. If they found DNA of an unknown person at the scene but you hadn't put yours in a database, how would they know who to look for?


If only you hadn't worn that hoodie that made you look exactly like that guy that robbed that convenience store.

Honestly, some of these responses are so strange. It's like these people don't understand that they don't live completely by themselves and that the actions of other people affect them, regardless of what technology exists.


It's not strange. There's abundant evidence that juries are over-reliant on scientific expert witnesses in matters which jury members are not competent to assess themselves, and that this often leads to mistaken outcomes. You probably wouldn't be convicted just because you had a hoodie and jeans on that made you resemble a suspect; the idea that two people dressed in very common clothing might be mistaken for each other is something anyone, even a kid, can understand. But juries are likely to take the opinions of forensic experts very seriously, even when they're not well founded in fact or experience.

https://www.thepmfajournal.com/education/medico-legal-forum/...

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4581010/

Now the chances of your DNA being at the scene of a serious crime and you being convicted as a result of a data breach are fairly remote, more the stuff of crime fiction as entertainment. But strange things do happen:

https://daily.jstor.org/forensic-dna-evidence-can-lead-wrong...


Let's start over. You say that if a genetic database didn't exist you wouldn't have been made a suspect, so I reply with a random example where you could be made a suspect without the assistance of high tech. To keep the hypothetical relevant, in it they take you into custody and take a DNA sample from you to compare against one from the crime scene. Either they get a match and that makes you more of a suspect, or they don't and it doesn't. If they get a match, potentially you'll need to convince a jury that the presence of your DNA at the crime scene doesn't mean you had anything to do with the crime.

I don't see what the existence of the database changes. With or without it, there are ways you can be linked to a crime you had nothing to do with, and it doesn't seem like it can be used for anything other than quickly getting a list of people who might have been at a crime scene. The article you linked is not about DNA databases; it's about how DNA evidence is misused by law enforcement. All it says about DNA databases is that black people are over-represented in them.

Sorry your judicial system depends on the savviness of random idiots off the street, but it has nothing to do with whether companies like this one exist.


You seem to have completely missed my point. Of course you could be incorrectly made a suspect without any high tech being involved, it has always been thus. The existence of genetic databases provide additional ways you could incorrectly be made a suspect.

> Sorry your judicial system depends on the savviness of random idiots off the street, but it has nothing to do with whether companies like this one exist.

You don't have to opt for a jury trial in the US, but most people do. Fascinated to hear which enlightened polity you inhabit that is immune from miscarriages of justice.


You must be quite certain your government will never break bad.


If the argument is "the data controlled by this company could be used by an incompetent judiciary system to punish innocent people, therefore this company shouldn't exist", then you should go live in the wilderness away from society, because there's any number of ways that the illegal actions of those around you could have you accused of a crime you didn't do. Personally, I prefer to recognize that I live with an imperfect system and gamble that I'm more likely to find stability and happiness in spite of that system than completely on my own.


I'm a data hoarder and considered ordering my complete genetic sequence (not from 23andMe; there are others that send you back an HDD with somewhere around several gigabytes to 1 TB+ of your genome). But I resisted because I would have no idea what to grep for. Without a biotech background it would simply be a waste of my time.


Every time you see an article about how some gene makes people like to eat potatoes, or grow toenails really fast, or something, you're about a minute of searching away from finding exactly what you need to grep for to see if you're a match.
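It really is that simple in structure. 23andMe-style raw data is tab-separated text (`rsid  chromosome  position  genotype`, with `#` comment headers), so looking up a marker is a few lines of code. The rsids below are placeholders, not real markers:

```python
def lookup_rsid(raw_lines, rsid):
    """Scan 23andMe-style raw data for a single marker.
    Each data line is tab-separated: rsid, chromosome, position, genotype;
    lines starting with '#' are header comments."""
    for line in raw_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 4 and fields[0] == rsid:
            return {"chromosome": fields[1],
                    "position": fields[2],
                    "genotype": fields[3]}
    return None  # marker not present on this chip
```

In practice you'd open the exported file and pass `lookup_rsid(open("genome.txt"), "rs...")` whatever rsid the article mentions.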


What about this? https://www.cbc.ca/news/world/dna-from-genealogy-site-used-t...

Seems like there is at least some value.


There's arguably societal benefit to the broader availability of DNA data, but very little of that passes down to me personally, and there are risks. I have very few close relatives, but it's not hard to imagine the widespread use of this data by law enforcement would pull some people with more extended families into investigations they'd rather not be involved in.


Sure, there are several big issues with compromised DNA profiles. See also: https://www.ecseq.com/blog/2019/privacy-implications-of-gene...

But let's wait until it's clear whether raw data was actually leaked.


Isn’t the raw data pretty much guaranteed to be leaked?

I remember a few years ago there was a button to download raw data.

So if you can log in you can just download.


23andMe shows up to 1500 DNA Relatives for each user (outside of subscription features).

What we know thus far is that the malicious actors who compiled these datasets scraped the user profiles of DNA Relative matches related to the accounts they were able to directly compromise (likely as a result of password reuse). The posters claim to have accessed around ~7M profiles, which puts the lower limit for directly compromised accounts at ~4700, though the real number is likely much higher (maybe a factor of 10?), given the overlap in match lists, and provided that their boasts are true. So that's potentially ~5000-50000 directly compromised accounts.

For those directly compromised accounts raw data could be downloaded. For profiles scraped, it would not be feasible to obtain raw data. However it is possible that partial genetic sequences might be assembled for matches. This was at the core of security researchers' investigations into GEDmatch a few years ago [0]. 23andMe does not face the same vulnerability, however with enough compromised accounts it is likely possible to infer a modest proportion of the DNA sequences of profiles which are known to match.

[0] https://www.washington.edu/news/2019/10/29/genetic-genealogy...
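The back-of-envelope arithmetic behind that ~4700 floor, spelled out (numbers are the posters' claims, not confirmed figures):

```python
profiles_claimed = 7_000_000   # posters' claimed scraped profiles
matches_per_account = 1_500    # DNA Relatives display cap per account

# Absolute floor on directly compromised accounts, assuming zero
# overlap between match lists (real lists overlap heavily, so the
# true count is higher):
min_compromised = -(-profiles_claimed // matches_per_account)  # ceil division
print(min_compromised)
```

Any overlap between match lists pushes the real number of compromised accounts above this floor, which is where the "maybe a factor of 10" estimate comes from.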


Doh! Didn't get the fuss over DNA data before but that scenario definitely makes it clear.


I was really curious too and did one of the major services, that's likely been hacked too by now.

But I registered under a false name, gave the test kit to my dad, and had him test one of my aunts too.

So it wasn't as exact but it was good enough to sate my curiosity.

Everything gets hacked sooner or later. A famous hacker once said that you should assume that everything you say or write will one day be public information.

Edit: And before anyone says "how could you do that to your father", if you knew him you'd understand. He's practically off grid and he's going to die soon, the aunt is already dead.


>that data still living somewhere fed into some model or research or whatever

Do you mean like for drug discovery and other medical research? Why would that be a bad thing?


On paper there is nothing bad with that.

But it still removes my ability to delete my data, especially when 23andMe has proven that they are not properly safeguarding this data.

Also me donating this data to a research organization vs being lured in by 23andMe's marketing are drastically different things.


Oh god, that's made me realise. Does a GDPR right to removal extend to LLMs that have been trained on your data? What probability of retrieval counts as still storing your data?


GDPR recital 26 covers your question.

Excerpt for the second part:

> To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

So they don’t call for an exact probability, but if you can prove you did appropriate threat modelling and put controls in place to counter those threats you should be fine. You are literally doing more than most companies if you manage that.


My understanding is you kind of don't even need to screw up on this one; just be related to someone who did and you're more or less in there anyway.


Wait, you didn't ask them to delete it? If you did the hacker probably wouldn't have your data.


Meh, it's just some stuff they extracted from your spit. I wouldn't get bent out of shape over it. Who knows, we might all get a decent payout from a settlement. You certainly won't regret it when that happens.


It's also one of my biggest regrets, and I never did it.

My mom did.

She never asked me, or anything. But half my material is right there, in an irrevocable database, now leaked online for the world. And I have no genetic rights or any claim, even though half is explicitly mine.


Hang on, can I check I understand your position? As I read it, your (biological) mum took a swab of her own mouth and sent it to 23andMe. And you argue she should have asked your permission before doing so, because she would be giving them half of your DNA?

Honestly, I am not sure where I stand on that issue. Or rather, it's probably to do with society-wide regulation, not individual rights...


Should an identical twin be required to get their twin’s consent? That would be 100% of their DNA. What about their twin’s children?


No, and there's no moral ambiguity about it. Separate human, separate decisions.


What is the John Donne quote again? Of course there is moral ambiguity about it. You can live your life pretending your actions don't have consequences for other people, but don't expect people to enjoy living in your wake.


Not sure why you chose this branch of the topic to share your declaration that the whole idea is nonsense, but ok.


It's an interesting issue.

If a stranger had asked OP's mom for OP's phone number out of the blue, there are many moms who wouldn't share a PHONE NUMBER without asking the number holder's permission.

But fewer would ask their relatives if it's ok to stick a decent portion of their DNA into a database.


Wait till they find out that 99% of our genes already got leaked to the public by the Chimpanzees.


This is really dismissive.

It should be obvious that, to the OP, the important part is the 1% that is unique, not the 99% which isn't.


Or that you can hit up a tree to get 50% of them.


What if OP lives with their mom, and so when the stranger asked mom for her number, she also revealed OP's?

Should mom not be allowed to give out her number? She had it first. Or should mom need to clear every use of her data with her kids because it might affect them?

Should OP get mom's permission before emitting genetic material on a lover, because that material is half of mom's DNA?

I think this is not a path that can be exactly logicked out, so I stick with "do what you want with your data."


We used to have this thing called a phone book! Imagine the dystopian horror!


Even crazier, phone books where I lived had a Do Not List arrangement with telco providers that was actually respected, and bulk leaks of DNL numbers, addresses, or names were almost unheard of.

Even stalking LEO ex-boyfriends or former husbands mostly left a paper trail if they requested a DNL number.


There are also a lot of moms out there who would give up a phone number without hesitation, if not volunteer it unasked.


And if their father also did it, 100% would be in there.


Except the data is randomized at conception


My sister was required to participate in 23andMe for an anthropology class of hers in college. I cringe now at what a violation that was. Even if she had the option to decline (and I'm not sure she did), it's hard to say no to your professor when you're 19.


The ethics issues with that assignment would make for an interesting study all by itself.


I don’t know if holding a grudge against the naive parents of the world is productive, although I can understand the frustration.

Anger is better directed at legislators for not regulating these sorts of businesses under HIPAA and similar laws. Your mom's medical records also contain a bunch of sensitive information about you, and yet you probably don't get mad each time she visits the doctor despite the privacy risk this incurs.

Treating 23andme like a lighthearted way to enhance your genealogy hobby or whatever is actually so fucking wild. This shit is not a game, only a SMALL fraction of the victims are 23andme’s customers. 23andme’s responses to me are proof enough the entire company should be shuttered because they are grossly negligent and don’t understand what business they are in or what they just did or how they fucked up.


The actual genetic data was not leaked. Why do people keep saying this?


Because nobody actually links to the leak proper. People end up writing about write-ups of write-ups rather than seeing what it is. And the prior post I did was auto-dead; my guess is it was the forum link. Because we're not allowed to talk about breaches? Sheesh. (I edited enough stuff to review, but included no direct links to leak documents.)

And digging around the breachforums DOT is/Thread-23andMe-Great-Britain-Originated-4M-Genetic-Dataset , you see stuff like

"The data includes information on all wealthy families serving Zionism. You can see the wealthiest people living in the US and Western Europe on this list."

This link, combined with the second one, doesn't look good by any stretch of the imagination. These are antisemitic leaks intended for harassment and/or hit lists. It definitely looks bad. But again, it's already out in the world now.

-------------------------

Here's the schema of said data ( breachforums DOT is/Thread-DNA-Data-of-Celebrities-1-million-Ashkenazi-REPOST ):

profile_id; account_id; first_name; last_name; sex; birth_year; has_health; ydna; mdna; current_location; region_1; region_2; region_3; region_4; region_5; subregion_1; subregion_2; subregion_3; subregion_4; subregion_5; population_id1; population_id2; population_id3


> 23andMe is one of my biggest regrets in my life, I did it before I really started caring about privacy. It was on sale on Amazon and yeah.

I've had someone give one as a gift, so glad I didn't do it.


how has 23andme affected you in reality? (not hypothetically)


I discovered that I have the APOE 4/4 genotype, which not only explained why keto (despite helping me lose weight) threw my cholesterol to dangerous levels, but also explained some other important things about my family and genetics (some things that weren't even in the results, but they let you download the raw data of your genome, so I did some bioinformatics). It helped me change my lifestyle and get a clear picture of what to focus on.


Even though I'm not a direct customer, it affected me: I now know I have tens of half-siblings throughout the US, maybe more. It's mildly unsettling, but I guess I'm happy that people were able to find my dad, who is also their dad.


Sperm donor. At some point in the future, I'm going to have an entire classroom of people figure out who I am based on suggestions of my relatives in their family tree. I deleted my data years ago, doesn't matter.


They have an option to destroy your data and the original sample you sent in


They could never get usable DNA from me when I tried it. They sent out a second test kit that didn't work either. Would they have kept the samples?

Not that I really care; you're not going to find a password or my first pet's name in my DNA.


Have you seen Blade Runner?


I’m one of the leaked. I’m highly technical, security literate. I use unique secure passwords for everything.

Fuck these guys. Fuck their bullshit misdirection.

This is the equivalent of corporate doxxing. Until individual execs in these mega-corps that piss in the face of their users bear individual consequence for this type of thing it’ll keep happening.

Because until then there’s no requirement to actually really give a shit. We’re grist in the money mill.

The lack of consequence has me seething more than the breach.

> Edit: the victim shaming in comments here is staggering.


23andMe is trying to have us believe that multiple tens of thousands of their customers were hacked in a credential stuffing attack. With 1500 matches per person, assuming no overlapping relatives, you'd need somewhere around 10,000 successfully hacked accounts to get the amount of data they have (~14 million). In reality, many people have overlapping relatives with other people, so many more than 10,000 accounts would have needed to be attacked, all without 23andMe noticing anything suspicious. This seems very unlikely to me.
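The arithmetic behind that estimate can be sanity-checked in a couple of lines (the ~14 million figure and the 1500-match cap are from this thread; zero overlap is the unrealistically favorable case for the attacker):

```python
# Minimum compromised accounts needed to scrape ~14M profiles when
# each account exposes at most 1500 DNA Relatives matches, assuming
# (unrealistically) no two accounts share any relatives.
leaked_profiles = 14_000_000
matches_per_account = 1_500

min_accounts = -(-leaked_profiles // matches_per_account)  # ceiling division
print(min_accounts)  # 9334
```

Any overlap between match lists only pushes the required account count higher, which is the commenter's point.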

And while 23andMe claims that a credential stuffing attack was at the root cause of this leak, the hacker(s) who first posted about this leak on Hydra Market over 2 months ago claim that they simply used (abused?) an API that 23andMe offered to their academic and research collaborators.

The hacker(s) claimed to have 300TB of data, including raw data, but have not proven that the scope of the attack was that large. If their claim is true, though, then the breach was definitely not due to a credential stuffing attack, and it fits well with their claim of using an API.

So, if the extent is as claimed by the attacker(s), then it is much more likely that one of 23andMe's API was used to scrape everyone's data, not a credential stuffing attack. Either one of their collaborating researchers got hacked, or they had (have!?) an access control vulnerability in their API which allowed the attacker to again scrape everyone's data.

I cannot find much information about the API 23andMe offers, but I did find one on RapidAPI (https://rapidapi.com/23andme/api/23andme). If there was some sort of access control vulnerability, or a hack of a privileged user, then even with one individual as a starting point they could download that person's data using various endpoints (there's even an endpoint for the entire individual's genome...), then get the relatives of that individual (an endpoint for that too), and then recursively do the same for all relatives until every person has been downloaded once.
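The recursive scrape described here is just a breadth-first traversal of the relatives graph. A minimal sketch, where `fetch_profile` and `fetch_relatives` are invented stand-ins for whatever endpoints were actually abused:

```python
# Hypothetical sketch of a relatives-graph scrape; endpoint callables
# are injected so nothing here represents 23andMe's real API.
from collections import deque

def scrape_all(fetch_profile, fetch_relatives, seed_id):
    """Breadth-first walk of the relatives graph from one seed profile.

    fetch_profile(pid)   -> dict of profile data
    fetch_relatives(pid) -> list of related profile ids
    """
    seen = {seed_id}
    queue = deque([seed_id])
    dump = {}
    while queue:
        pid = queue.popleft()
        dump[pid] = fetch_profile(pid)
        for rel in fetch_relatives(pid):
            if rel not in seen:  # visit each profile only once
                seen.add(rel)
                queue.append(rel)
    return dump
```

With one privileged starting point, everything in the seed's connected component falls out, which is why an API hole would scale so much better than credential stuffing.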

At this point in time, I am highly suspicious of 23andMe's defense but until the attacker releases proof that they have raw data we can't really prove that they're wrong/lying about the actual magnitude of the attack. But I do believe their claim of using an API makes a lot more sense than the way 23andMe proposed, so I am very worried.


> and then get the relatives of that individual

so you're saying they used a genetic algorithm ? sorry, couldn't resist.


What's crazy is that OnlyFans content creators get so much flak for sending out video and audio of their phenotype.

23andMe grabs a person by the source code and somehow that isn't considered obscene.


You shared your DNA data on the internet, time to review your security fundamentals.


You can get my entire DNA data, https://my.pgp-hms.org/profile/hu80855C I can assure you I know more than a little about security, considered all the reasonable risks, and concluded that it's not really increasing my risk and is unlikely to do so any time in the future.


If that's really you, I give you a lot of credit for being bold. You and the CEO of LifeLock both. Hopefully your confidence doesn't blow up in your face the way his did, but when/if it does, you'll probably never know why it's happening to you exactly. The easier it is to link that data to your identity the riskier it'd be. I haven't seen the 23andme data, but if it includes names, addresses, email addresses, family members, etc they're going to be a lot worse off than you likely are.


My real name is David Konerding, so the linkage now is trivial. Again, I've evaluated the risk profile associated with my genome data and concluded that it's effectively non-existent, and will continue to be so.

My data isn't 23andme's; it's a whole genome. 23andme's data collection is very limited and, in my experience, fairly optimistic about its predictive ability for health.


I don't have the expertise to quickly mine your genome for the juicy bits, so I'll have to trust that you've seen what's there and still couldn't see how any of that information could ever be used to prejudice someone against you, or reveal anything that could be exploited either today, or at any point in the future.

It's a bold position, especially given that we're still discovering what many of our genes signify. While I do worry that it may rise to the level of hubris in this case, I really have to admire your confidence all the same. I genuinely hope that your assessment of your risk is proven accurate with time and that your contribution to the Personal Genome Project never comes back to haunt you.


How do you factor in the opinions of people who are, or will be, genetically related to you?


I didn't.



Even if you didn't, they'll likely have a relative's data.


True, but we should be responsible for ourselves to start with.


I do not understand why anyone with tech knowledge would have participated in this honeypot. Please help me understand why there was trust.


Please help me understand what negative consequence I am likely to experience because of this. Not something conceptual like "you don't have privacy!" - be specific.


People of Ashkenazi jewish descent might have a more tangible grasp on the negative downsides of this leak.


So you're saying - someone in the US is going to discover my ancestry and... do what? You can find far more inflammatory things about me online. Besides, if someone really wants to attack Jews, it would be a lot easier to find the address of their local synagogue. Or search the phone book for obvious last names.

There are a lot of lessons to learn from the 20th century, but I don't think "you have to hide your ancestry" is one of them.


Oh yeah - nothing bad has ever happened based on knowledge of a minority group's ancestry...


Because I don't care about privacy and I wanted my DNA tested, duh.

I don't understand privacy nuts.


is the essence of the privacy concern that data showing predisposition for diseases (or whatever sort of weakness) can be dystopia-used to discriminate (insurance, etc)? Or just privacy in general?


How about creating a virus to give a disease to everyone of a specific race


That's science fiction. No matter what you read in the popular press or scientific journals, engineering a virus specific enough to perfectly target a group of people sharing a common genetic history with perfect specificity and sensitivity, is still not something we could do. I also don't see this as being technically possible for some time.


I keep re-reading what you wrote, and I think it's less far fetched than you imply, for the following reasons.

Huge datasets labelled by ancestry exist, characterizing distributions of alleles that imply "race", which is how companies like Ancestry and 23andme do some of their analyses.

At least two Nobel prizes were recently awarded: one to Doudna and Charpentier for CRISPR-Cas9, which enables context-sensitive genome edits, and another for mRNA vaccine technology (which was in development for a decade or so before the Covid repurposing arose).

So, I think the part missing, to your point, is the "vector" that would deliver a sequence of patch files, perhaps via CRISPR-Cas9 etc., which would perhaps bind to the constellation of race-implicating genome sites to collectively build proteins of whatever sort of interest, whether for fitness repair or fitness reduction in the host.


I said what I said above based on 30+ years of experience in this field. Be aware that the entire field of gene therapy was paused for 20 years because a single patient died in a single study in the 90s.

Race is a proxy variable for genetic history and we have enough genetic history to identify what "race" people belong to, in a computational way, with a great deal of human judgement. Implementing that "logic" in the form of a gene therapy delivered as a virus is much harder. Molecular logic, these days, is fairly simple and doesn't allow us the level of accuracy you would need.

If you don't require 100% sensitivity and specificity the job is "easier". For example, what should your theoretical virus do to people who are children of two parents with very different genetic history? Which "race" do they belong to? And if you go down that path, you're basically going to have to convince your funders (because this would be an extremely expensive research project) that your weapon of mass destruction will likely kill some fraction of allies.

Those are just a couple things off the top of my head; I could write a whole book on how hard delivering gene therapy is.


yep, totally understand and agree it's a super vague notion, with super imprecise possible consequences.


Assuming that technology even exists, you wouldn't need 23andMe to get the relevant genetic data.


you'd have to define "race" but the idea seems legit considering crispr-cas9


Because I needed to know whether I had genetic health issues. I assume you use a smartphone. If so, you've already given up your privacy. You're like the people who scream about someone having Alexa in their house while simultaneously carrying Siri everywhere in their pocket.

What exactly is someone going to do with my DNA? Any entity that's a threat to use it for something nefarious is going to get it from me pretty easily. They could grab any number of physical items that I've touched or left my hair on etc.

Should they do a better job of protecting the data? Yes. Am I going to freak out over something that will likely not affect my life in any way? Nope.


Privacy nuts are always going overboard about things that mean very little.

Genetic data sounds important but in the bigger scheme of things, it's probably the least manipulatable data about a person on the internet.

My genetic data is less weaponizable than if I uploaded hundreds of pictures of myself and shared my social graph with Facebook, shared my political opinions on Twitter, or commented/posted on Reddit boards about my interests and hobbies. It's also less weaponizable than the multitude of "invisible" data that I feed to Google, my incompetent local government, service providers, and every shop that ships something to my home address.

I lose genetic data everywhere I go, every day. 50-100 hairs fall off my head, and I leave fingerprints on everything I touch. I "lose privacy" every day by walking into somebody else's photo/video/TikTok, and through mishandling by poor government/business entities.

Life's too short.


Did you use a fake name and a prepaid debit card to make your purchase? Well, I'm glad I did, though they could just track me via relatives who also use the platform if they really wanted to.


How are you verifying you were in the leak?


>Until individual execs in these mega-corps that piss in the face of their users bear individual consequence for this type of thing it’ll keep happening.

No: until individuals stop using these mega-corps that piss in the face of their users, and bear individual consequences themselves, it'll keep happening.


How do you know? Where can you check?


I would also like to know.


The email I received from them said "If we learn that your data has been accessed without your authorization, we will contact you separately with more information." So I guess receiving such an email is how you know at this point.


I did 23andme and although I'm not happy about this, I admit I'm growing numb to all the leaks of information, and I kind of went into it with the assumption that it might be leaked at some point.

https://www.nih.gov/news-events/news-releases/nih-s-all-us-r...

https://www.nytimes.com/2013/06/18/science/poking-holes-in-t...

As you can surmise from those, the issues about DNA sharing are more general than 23andme, or even private companies.

I mean, therapy notes have been leaked, and hospitals have had medical records compromised.

I do genetics research and although 23andme does have sensitive information, the level of information they have is relatively low. People with certain polymorphisms might be at risk but most of it is pretty crude, and I suspect many of the people affected might know through nongenetic ways that they were at risk anyway. I think the genetics revolution people were predicting doesn't exist, or to the extent it does, it will require something different than what 23andme has.

Google or Apple has more damning and accurate information about someone than 23andme. If 23andme or anyone else had some mind-blowing insights into you personally based on your genes, they'd certainly sell that to people and they don't. Most of their selling points are in genealogy and things like that.

I don't mean to sound dismissive, this sucks and 23andme could/should have done things differently. It's just I think as a society we need something other than "live as a technohermit" or "implement such strict security you're at risk of shutting yourself out" and "become a permanent victim of darknet hackers and authoritarian tyrants". I also think people overestimate the information value of the genetic information 23andme has. Yes, it's significant, but it's really at this point limited to what they offer. There's no would-be 8yo serial killers out there that, if we only had 23andme information, would be able to "prevent" them from killing in some real-life version of Minority Report crossed with Gattaca. Maybe someday, with different information, but not now.


"23andMe blamed the incident on customers reusing passwords"

I'm not sure how they worded this, but it's their shortcoming, not their customers'. Customers reuse passwords and will continue to do so. For sensitive PII it's far easier to enforce 2FA or Google SSO than to change customer behaviour.


Eh, millions of accounts hacked via credential stuffing on one platform because of reused passwords? I don't buy it for one second.


It wasn't millions of accounts; it was an undefined number of accounts used to scrape DNA Relatives data after the breach. This article mentions it, but last week I saw a better, more detailed explanation.

Regardless, the key quote from the linked article:

"23andMe blamed the incident on its customers for reusing passwords, and an opt-in feature called DNA Relatives, which allows users to see the data of other opted-in users whose genetic data matches theirs. If a user had this feature turned on, in theory it would allow hackers to scrape data on more than one user by breaking into a single user’s account."


And even if it is true, that is still the fault of 23andMe. It is kind of a fool-me-once scenario. Some occasional credential stuffing is excusable and can be blamed on users' laziness. Millions of accounts hijacked through credential stuffing means something is fundamentally wrong with 23andMe's approach to security, for failing to identify such a large-scale attack.


100%


Related: what fool decided to call compromised credentials "stuffing"? Nothing is being stuffed.


You take all the breach dumps you can find and "stuff" the username/password pairs into other sites.


Credential stuffing is a specific type of attack which uses compromised credentials.

Not all compromised credentials are used in credential stuffing attacks.


Agreed. They could have done email link verification which wouldn't require the user to set up 2FA on their side.


Or provided their users with randomly generated, secure passwords themselves.

I saw this idea somewhere on here a few months ago, and since then, letting users set their own passwords has seemed like a dumb thing to do!


Depends on what you’re trying to protect against and who your average user is.

Forcing passwords onto users is a surefire way to get people to write their passwords down and constantly get locked out of their accounts. In some circumstances that's not a problem. For local domain access, it's completely unworkable. For websites it makes a little more sense, since users can reset passwords via email, but if you're going to expect people to rely on that, then you might as well let them set a password and use email as 2FA.


There was a time when random passwords seemed like a bad thing, because users would write them down, and that was an obvious risk. We are way past that of course, but that line of thinking is still informing security decisions.


They do have it, but they probably have millions of accounts that were created before that feature existed and whose owners never logged back in to set it up. It's also through an auth app rather than texts, which is more secure but more of a hassle for users, which might affect adoption.


It is very inexpensive to check your customer passwords against HIBP [1] or strongly encourage MFA. They choose not to.

[1] https://haveibeenpwned.com/API/Key

(23andme customer using Apple SSO, have strong opinions on customer IAM, passwords must die)


Doesn't that require you to know the password in plaintext?


Depending on your use case and implementation details, not necessarily.

https://www.troyhunt.com/understanding-have-i-been-pwneds-us...
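To make the k-anonymity point concrete: only the first 5 hex characters of the password's SHA-1 hash are sent to HIBP's documented `/range/` endpoint, so neither the plaintext nor the full hash ever leaves your system. A minimal sketch (helper names are mine, not HIBP's):

```python
# k-anonymity range lookup against Have I Been Pwned's Pwned Passwords API.
import hashlib
import urllib.request

def sha1_split(password: str):
    """Return (5-char hash prefix to send, 35-char suffix kept local)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def count_in_range(range_body: str, suffix: str) -> int:
    """Scan the 'SUFFIX:COUNT' lines HIBP returns for our local suffix."""
    for line in range_body.splitlines():
        sfx, _, count = line.partition(":")
        if sfx == suffix:
            return int(count)
    return 0

def is_pwned(password: str) -> bool:
    prefix, suffix = sha1_split(password)
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url) as resp:  # sends only the 5-char prefix
        return count_in_range(resp.read().decode("utf-8"), suffix) > 0
```

A site can run this check at signup or login time and force a reset on a hit, without ever disclosing the candidate password to a third party.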


Can't those old accounts be flagged to require email verification to log in?


Most old accounts would probably never try to log in again anyway. After you learn that you're 3/64ths Irish, you've gotten what you wanted; why log in again?

Yeah I know there's the whole genetic disorder screening thing which might receive more updates in the future, but I think most of their customers probably did this for the novelty of knowing where they came from.


Oh, you lost your email account access? Please send a matching DNA sample and $99 to unlock your account.

I mean, 23andme has one of the ultimate methods of account recovery available to it. (ignoring that people tend to leave copies of their DNA everywhere, but then you could just mail that in under a John Doe and find out all the same info anyway).


Whatever way you put this, handling the support load of the few customers who can't log in - and by this argument aren't ever logging in anyway - is better than having this degree of PII leaked and the company reputation ruined.


It could be easier and cheaper for some to get someone's hair or saliva than cloning a SIM card...


My point of view here is someone that’s lost their access to 23andme, not using it for SSO for other services.

While I get the social media aspects of 23andme, if one can get your DNA, they could submit that to 23andme and find out everything you already knew.

I wonder how they handle duplicate submissions?


The Facebook way...sign up with email, get instantly restricted, need to verify with a mobile number to unlock.


> created before that feature

Which feature? Unless they didn't ask for their users' email addresses (which I'd find surprising), they could have added email-based 2FA any day without asking their users to do anything.


They failed to enforce MFA.


Exfiltrating 300TB of customer data without 23andMe noticing seems like negligence. Shouldn't there have been some alarms?


“Hey Bob— why is our AWS bill so high this month… oh shit”


You joke but one of many organizations weakest points is lack of prompt alerting. I think the FBI recently put out a list of weakest points and this was maybe number 4 or 5 on the list. I’ll do some googling and see if I can find it.

This is a forum of engineers and builders - if a lot of data were being exfiltrated from your cloud accounts right now, would you know? What would your response plan look like? Maybe this could be a good way for all orgs to review those practices and make sure they have a plan.


We've caught a few "oops, I almost spent a million dollars" incidents through cost alerting on AWS. Unfortunately, the time resolution for billing isn't great. I will say, much to my surprise, AWS comped us for the money we wasted even when it was our own fault.


I think if you're likely to not notice spending a million bucks right away, you're the kind of firm amazon wants to hang on to.


Lol. "Hey Bob - why are we in the news for a breach?" Seems closer given the news reporting and their tardy response.


If it's not cred stuffing, a few GB per day per IP would defeat volume anomalies. You would hope 23andme would have detection/controls on customer storage to flag new geo locations, since there is no way the attackers spoofed each customer's usual location. Or an authorization flow just to access the data, with an audit trail/approval flow for god accounts.
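The two controls suggested here are simple enough to sketch. This is a toy illustration with invented thresholds and data shapes, not anything 23andMe actually runs:

```python
# Toy exfiltration detector: flag per-IP daily volume spikes and
# downloads from a geolocation an account has never used before.
from collections import defaultdict

DAILY_BYTES_LIMIT = 2 * 1024**3  # assumed 2 GB/day/IP threshold

class ExfilDetector:
    def __init__(self):
        self.bytes_by_ip = defaultdict(int)  # would reset daily in practice
        self.known_geos = defaultdict(set)   # account -> countries seen

    def record(self, account, ip, geo, nbytes):
        """Log one download event; return any alerts it triggers."""
        alerts = []
        self.bytes_by_ip[ip] += nbytes
        if self.bytes_by_ip[ip] > DAILY_BYTES_LIMIT:
            alerts.append(("volume", ip))
        if geo not in self.known_geos[account]:
            if self.known_geos[account]:  # skip the account's first-ever geo
                alerts.append(("new_geo", account, geo))
            self.known_geos[account].add(geo)
        return alerts
```

As the comment notes, an attacker who trickles a few GB per day per IP slips under the volume check, which is why the geo and authorization-flow checks matter too.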


Wow, blaming paying customers for their inability to secure their systems. You cannot control users, but you can control your security posture. 23andme's management can fuck right off with their bad attitude and poor judgement.


The attitude that oozes out of these reactions is not that they are sorry that the data was leaked, but that they are sorry they can no longer profit from it.

23andMe's product is their users' DNA, packaged nicely in easy-to-digest formats.


They aren't sorry they can't profit from it; they will continue to profit from it anyway. One of their membership tiers includes an annual payment so you can get the newest easily digestible thing derived from your data.

Shame on them for slacking off on security when bad actors could do actual dangerous things with that DNA data.


If they couldn't detect millions of records being stolen via credential stuffing, I don't think that makes them look any better. But I'll die before I see a company take ownership of its actions, I'm sure, so it's no surprise.


23andMe came out when I was in high school, and one of my science teachers spent an entire class period telling us how bad an idea it would be to ever use their service. Probably the most valuable lecture I ever had tbh, I've never even considered going near it as a result.

Makes me think that "data privacy" should be added into schools' mandatory health class requirements or something. Though most likely any prescribed curriculum would have been lobbied to death & not actually teach valuable information.


Why is the lecture so valuable? Why was another commenter's biggest regret using 23andme? (Talk about first-world problems.)

What is private about DNA?


Nothing, you abandon many viable samples of it wherever you go.

This relates to the biggest problem with biometrics: it represents an effectively static attribute about an individual. It cannot be reset and therefore we place this enormous responsibility on the reading mechanism and pipeline to the authentication process to attest that the human using it is doing so with consent.


Recent and related:

23andMe Sued over Hack of Genetic Data Affecting Thousands - https://news.ycombinator.com/item?id=37895586 - Oct 2023 (20 comments)

Who hacked 23andMe for our DNA – and why? - https://news.ycombinator.com/item?id=37886543 - Oct 2023 (3 comments)

23andMe Accounts Hijacked and Data Put Up for Sale on Hacker Forum - https://news.ycombinator.com/item?id=37810755 - Oct 2023 (2 comments)


>The type of information genetic-testing companies collect is currently not protected by the Health Insurance Portability and Accountability Act (HIPAA), our nation’s health privacy law.

Nice work by the genetic testing industry lobbyists.


If physical DNA were PHI, then everywhere we deposit DNA would need to have the same security as your doctor's office.

But the data revealed in analyzing the DNA (e.g. if you have a genetic disorder) should be PHI.

It's odd how something could go from "not PHI" to "now this is PHI" just by processing something like a piece of hair.

"At what point does it turn into PHI" would be a difficult question to answer.

IMO protection of DNA related data deserves something different than HIPAA.


I might be wrong, but I don't think a genetic sequencer is something that's easy to build in your garage without anyone knowing. So if loose hair is not PHI, but the information that the owner of that hair does or doesn't have a genetic disorder is PHI, we could say that PHI starts once a human genome is sequenced from a tissue sample. So, yes, anyone who has a genetic sequencer must have the same level of security as a doctor's office. That doesn't seem like a terrible requirement.


It's not impossible to do DNA sequencing in your garage, either by buying a sequencer on eBay or implementing your own. However, most of the data you'd generate wouldn't be that useful. Instead, I think you could make a spotted microarray, which was one of the early examples of DIY process automation in biology (around 2000); that could be turned to nefarious uses with much less time and capital.


Do you have a reference for this? Was it ever actually considered? Or are you just making things up?


https://www.latimes.com/business/lazarus/la-fi-lazarus-dna-g...

>“We plan to engage constructively with policymakers on the best enforcement regime,” Haro said, reiterating that the coalition wants “a uniform national data privacy law” that treats all companies the same.


HIPAA applies to health care providers and health insurance companies. If you give some other type of entity your PHI they are not beholden to HIPAA. You said that this wasn't part of HIPAA specifically because of lobbying. You haven't provided anything that says that anyone was even considering expanding HIPAA to cover other entities. 23&Me wasn't even around to lobby when HIPAA was penned.


From the same article, 23&Me is lucky they are not covered under HIPAA...

>The federal Health Insurance Portability and Accountability Act, a.k.a. HIPAA, includes penalties ranging from $100 to $50,000 per violation — that is, per hacked record. Violations also can carry criminal charges resulting in jail time.


The laws and regulations around genetic information seem to be (intentionally?) easy to misread.

Health and Human Services has a FAQ page[1] which states:

> genetic information is health information protected by the Privacy Rule. Like other health information, to be protected it must meet the definition of protected health information: it must be individually identifiable and maintained by a covered health care provider, health plan, or health care clearinghouse.

However, according to many other sources[2][3], the interpretation of these rules DOES NOT apply to companies like 23&Me. I assume the company is not considered "a covered health care provider, health plan, or health care clearinghouse", but (again) the HHS definitions are (intentionally?) vague/misleading[4].

I suppose you need to be very familiar with regulatory law to actually make sense of this junk. (I don't know the history of these regulations and carve-outs, but the tin-foil-hat part of me wants to blame lobbyists/legal-corruption for the lack of common-sense and simply worded regulations.)

[1] https://www.hhs.gov/hipaa/for-professionals/faq/354/does-hip...

[2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6813935/

[3] https://lawforbusiness.usc.edu/direct-to-consumer-generic-te...

[4] https://www.hhs.gov/hipaa/for-professionals/covered-entities...


I think the problem is that HIPAA is less strict than the zeitgeist believes. Makes perfect sense since your family doctor will have you sign an acknowledgement yearly. The reality is that health data is for sale from all kinds of vendors including pharmacies. As long as it’s “not identifiable” it’s fair game. My work health plan uses a separate “pharmacy benefit manager” to end run around the restrictions on what it can do with that data.


I used a fake name and one-off email address when submitting my results in anticipation of exactly this scenario.


That would make it a bit harder to link to you, but definitely not impossible. It would definitely have been better to not do it at all.


Wow, such a bad security combo:

1. Not enforcing 2FA for new accounts, and not migrating old accounts to app-based 2FA at login (with an email code required first as a temporary, weaker second factor); that way some users would spot the problem and notice that they'd been hacked.

2. No encryption of the data served to the client, so 23andMe has full access to the unencrypted database on isolated servers; users should have to enter an additional password.

3. Not checking passwords against existing breach databases and auto-resetting any that match.

4. Poor controls on sharing DNA data for people discovery. Why on earth do they provide 1,500 matches? They could limit it to the 100 closest, for example, and offer the 1,500 for an additional fee.

For a company that handles this kind of data, I would have expected much stronger security measures.


Wouldn't it be better (for users, not for profitability) if such companies would delete the data after analysis?

DNA analysis is very important, and probably should be mandatory for everyone planning to have children (to reduce the number of people with incurable or very expensive-to-treat diseases and save healthcare budgets), but at the current level of privacy protection the data will just leak. Also, if DNA screening becomes mandatory, the government will probably require a copy of everybody's DNA. Looks like this war is already lost.


The raw data wasn't leaked, and the data changes over time when other users join. Many people expect this.

23&Me allows you to delete your data and your account if you no longer want to participate.


Do we need to submit a new story to HN covering that the raw genetic data wasn't leaked? Everyone is screaming that sky is falling and ignoring the facts


Eh, for some people the sky is falling. For instance, the original hacker was specifically offering profiles of Ashkenazi Jews, which included full name and location information. I understand how that could be scary in the hands of the wrong people.


Yeah but you could do that by hacking any social media platform and scraping real names and locations to very accurately identify Jews


Yeah, that's simply not true; not every Jew labels themselves as such on social media. Also, not everyone who is genetically Jewish even identifies as Jewish.


That's why you determine it by their name and where they live.


A thug robbed you on the street, but it's fine because a burglar could have stolen from your house instead.



Yeah, that's true, but it has little to do with the fact that it's a genetic-data site. A family-tree site with private profiles would carry the exact same risk.

The fact that the genetic data isn't involved means this story isn't relevant to people panicking about sharing their genetic data with 23&me.


Don't forget to change your DNA.


pretty much everyone has already done so?

https://en.wikipedia.org/wiki/Retrovirus


I refused to try this service for years despite having friends suggest it to me over and over. At best they were going to sell your data to insurance companies and governments. At worst, criminals would end up with it. Why anyone would trust a private "viral" company like this with their DNA is beyond me. Best of all, you paid to give it to them.


Probably also because sign-up for the product is done through their app rather than a browser, where people have password managers handy.


Of the three leak files released so far there are 5,150,779 unique entries (I wanted to check if I was in them, so far I'm not).


Where are you checking?


The leaked files; don't want to link here directly but I think they're all still available at the source


How is this company still in business following this? Personal “genomics” should be treated like the most sensitive type of data out there. Failing to secure it means you are not suitable to carry such tests.

I have already been advising people to be cautious about such online services, or to use fake personal details where possible. I will now advise even harder.


They are in business because people voluntarily pay them. This is what people want, so the market provides it. Many people do not care about preventive privacy.


How do you find out if you are affected? 23andMe is one of my biggest regrets. I did it when I was 22 and kind of an idiot lol.


Thank you auntie Janice for leaking my genomic data. I'm sure the health insurance companies will love you.


If you live in the US, the Genetic Information Nondiscrimination Act prohibits health insurers from using the data.


I have trust in their ability to find loopholes in the act.



I was mildly curious and thought maybe I should try it, but the more I thought about it, the less I trusted the company. I've seen how companies treat their customers' data; imagining that being my genetic data, it just didn't seem like a good idea.

Then there is the point of what if I found I have some genetic disease and there is not much I can do about it. Not sure living with that knowledge would be better.


It has upsides if something treatable is discovered early; otherwise it's mostly downsides, and not just for yourself. That's assuming you can trust their analysis in the first place.

What I really don't like about these companies is that they are never transparent, you only find out after the fact what you bought into.


This is a social network that was leaked. I would be somewhat surprised if the same attack didn't also get quite a few of someone's "friend network" on Facebook and the like. Certainly was more likely in the past, but with some of the modern botnets, I have little faith some of my elderly family haven't been hacked in this way.

To that end, I'm curious how they could have protected this better. It's almost certainly doable, but I don't know of many ways to protect people on my contact list from getting leaked if my contact list is leaked. Which is essentially what this is.


Always been a bit curious about 23andMe. But also I'm northwest European, so paying money and giving out my data just to find out "100% British Isles" or similar did not sound that interesting. Glad I abstained!


I guess the fun part was finding out your real great-grandfather was a Portuguese sailor on a visit to the port of Bristol (or whatever).


This DNA Relatives feature raises the question:

How easy is it to submit random DNA (cow, cat, dog, chicken, etc. or even fake?) and get yourself enrolled in their system to then legitimately have access to the next 1500+ "relatives?".

Given a few grand to spend on testing kits, I reckon you could get the same results by stuffing DNA as this "golem" guy did by credential stuffing.

23andMe made the classic mistake of assuming that biometrics are a type of authentication, and allowed anyone to access the records of 1,500 other people just by submitting some DNA...


> How easy is it to submit random DNA (cow, cat, dog, chicken, etc. or even fake?) and get yourself enrolled in their system to then legitimately have access to the next 1500+ "relatives?".

You must fundamentally misunderstand how this works. You are matched to people who also took a 23&me test and share DNA with you. The amount shared is how they infer relations.

There are people who do not even have 1500 matches, submitting a sample of animal DNA will quickly result in sequencing failure as the test is based upon the human genome and is effectively “written in the hardware”.

Fabricating a DNA sample is extraordinarily difficult because you need hundreds of cells with genetic material in order to even produce results. It would be so much easier to steal a sample from a known relative than it would be to genetically engineer some cells.


As supportive of open source as I am, I had no desire to be open sourced.


Your clones are licensed as Creative Commons


Millions of users reuse passwords. This is nuts. Who are these people?


Most of the world. People that don’t use HN or work in tech.


More importantly how was one person able to log in and download all that data without any alarms going off


This. Every one of these attacks I've worked were loud due to the volume of logins and ratio of failures.
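
For what it's worth, even a crude detector over the login stream would have caught something this loud. A minimal sketch in Python (the class name and thresholds are my own invention, not anything 23andMe actually runs):

```python
from collections import deque
import time

class StuffingDetector:
    """Alarm when login failures in a sliding time window exceed a
    threshold ratio -- the loud signature of a credential-stuffing run."""

    def __init__(self, window_secs=300, min_attempts=100, max_fail_ratio=0.5):
        self.window_secs = window_secs
        self.min_attempts = min_attempts
        self.max_fail_ratio = max_fail_ratio
        self.events = deque()  # (timestamp, succeeded) pairs

    def record(self, succeeded, now=None):
        now = time.time() if now is None else now
        self.events.append((now, succeeded))
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] < now - self.window_secs:
            self.events.popleft()

    def alarm(self):
        total = len(self.events)
        if total < self.min_attempts:
            return False  # too little traffic to judge
        failures = sum(1 for _, ok in self.events if not ok)
        return failures / total > self.max_fail_ratio
```

Feed it every login attempt and page someone when alarm() flips; real systems would also key this per source IP and per ASN.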


maybe they all share the same password-reuse gene?


this made me spit out my morning coffee. ugh.


I am a data engineer at a Fortune 50 company, and I do. It's infuriating that I'm expected to have a unique alphanumeric key for every single thing I use on a computer. That would be hundreds of keys. It's nuts that people think we should have unique passwords for everything.


Have you already been introduced to such concept as password managers?


Yeah, let me just put ALL of my passwords in ONE location. Like that is any better than reusing a password.

Then you are relying on that password manager, what if you lose access to it or the data with your keys? Welp...


> any better than reusing a password

It is. That way you only trust your password manager and not every single app you use (that can easily dump your password when you login, or might even store it as plaintext).

If a password manager is open source and you know how to audit it, you could do that. Otherwise you can ask somebody you trust who knows better.

> what if you lose access to it or the data with your keys

That is a possibility indeed. You should make backups, of course.


> what if you lose access to it

We’ve been working very hard on this problem, and recently devised a scheme known as backups. It’s like a whole copy of your data someplace elsewhere!

Snark aside; popular password managers have extensive recovery and replication capabilities.


> Yeah, let me just put ALL of my passwords in ONE location.

By reusing a password, you're basically doing this already.


Even worse, you’re putting ALL of your “passwords” in 100 different places!


This is bait


I certainly hope it is!


Like LastPass, for instance. /s


Imagine I was one of those people whose DNA got leaked. How could that affect me?

And please, don't say "insurance companies would love that data!".


You'll get to experience the bottom percentile of Hacker News posting endlessly about how getting a cheap genetic test in the mid-2010s was obviously something only a total moron would do, and occasionally slipping in an "insurance companies would love that data!" (I'm not saying it, use-mention distinction) alongside it.

Your raw data hasn't leaked unless it was your account in particular that got used, but you could be put on some guy's "Jew list", which you may or may not care about.


It turns out that those who like the idea of giving their DNA to VCs are also highly likely to reuse their one true password (password123) on multiple sites.


lol, the other obvious explanation is that 23andMe is simply lying about the cause of the leak.


Lying about it would expose them to a lot of legal trouble. I'm guessing they are choosing their words carefully, as they have likely been worried about being sued from day one.


Sometimes I think for a microsecond that doing a DNA test would be interesting and then I remember that breaches are inevitable and never do any.


I've been considering creating a distributed sort of human health monitoring/experiment project.

One of the questions is "should I also add my DNA" in there.

And if my DNA has been leaked then it wouldn't matter unless this leak is anonymous.

With that in mind I wonder if you could use this data to sort of populate this project. I also wonder if that would be legal or not.


This company definitely deserves a dooming Netflix documentary with lots of visuals of DNA and network-like structures.


Once precision medicine allows the creation of viruses tailored to kill or disable single individuals, the world is going to be a very, very dark place.

I don't think it's a surprise that COVID normalised handing over your genetic material to anyone (swabs) without any option to refuse (no test, no normal life).


I suspect that in the future, regulation of online services will lower the profits available to websites like this.

They should have to pay a shitload for insurance to deal in this type of business with lots of personal data.

And the penalties for making this type of mistake should be monumental.


> pay a shit load of insurance if dealing in this type of business with lots of personal data

Chubb and AXA have been offering this for a couple of years now, and I know some large companies have taken out a policy.

The issue I've heard is that the large providers rarely pay out, as the pools are small and the risk profile for breach insurance is still being actively worked out, so the exposure is potentially massive; a lot of companies are forgoing cyber liability insurance as a result.

There needs to be active regulatory work to solve this chicken-and-egg situation


“Pandora opened a jar left in her care containing sickness, death and many other unspecified evils which were then released into the world. Though she hastened to close the container, only one thing was left behind – usually translated as Hope.”


It's time to normalize services generating random passwords for users. Literally a password field with a "Reroll" button.

Users shouldn't have to dick around with a more complicated 2FA system just to protect themselves from password reuse.
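
Generating that server-side "reroll" password is only a few lines in any language; here's a sketch in Python using the standard secrets module (the alphabet and length are arbitrary choices of mine, not any standard):

```python
import secrets
import string

# Unambiguous alphabet: drop look-alike characters such as 0/O and 1/l/I.
ALPHABET = "".join(c for c in string.ascii_letters + string.digits
                   if c not in "0O1lI")

def reroll_password(length=20):
    """Return a fresh random password; call again to 'reroll'."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The point is that secrets draws from the OS CSPRNG, so the site never has to trust the user to invent a unique string.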


As a security novice, I do wonder if this is more secure overall. It means users will increasingly either use password management tools or they will write it down, digitally and/or physically. So, it defeats password reuse, but the downsides are:

1. Makes it easier for users to lose passwords

2. Makes the user more susceptible to targeted hacking

3. Increases the number of passwords in note-taking services which may themselves be hacked

Edit:

Also, until a critical threshold of adoption among services is met, some users will just reuse the random password for other services.

A lot of this is mitigated by making the password non-human-friendly, essentially forcing users to use password management tools. That could be too much to put on users, until those tools become much more ubiquitous/effortless for the average computer illiterate user (basically the same point you're making about 2FA).


Sure, but all of that is preferable to someone getting pwned across all of their accounts for one of their accounts getting leaked.

Though I don't necessarily grant your points. Password reuse means you're susceptible to nontargeted attacks which is strictly worse than your concerns, like #2.

i.e. Filtering out weaker attacks doesn't mean you're more susceptible to stronger attacks.


I do agree in that I think individual-targeting attacks are essentially outside the scope of this topic, since you're fucked for a lot more reasons in that scenario, and it's very rare, relatively speaking. I was just listing it to be thorough.


This is the way, I don’t know any of my passwords bar my master password for my password manager.


Or you could just use an actual auth provider that does this stuff right.


How easy/legal is it to obtain breached passwords for legitimate purposes?

If I see a user try to sign up with an email + password in the breach list, I'd like to tell them to pick a different password, for instance.


Check out the https://haveibeenpwned.com API.


There is haveibeenpwned as mentioned, but you could also just use a password list of common passwords and ban those completely. It would stop more attacks, like standard password guessing attacks.
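
The Pwned Passwords API is designed so you never send the password or even its full hash: you send only the first five hex characters of the SHA-1 digest and scan the returned suffixes locally (k-anonymity). A sketch of that flow, with error handling omitted:

```python
import hashlib
import urllib.request

def hash_parts(password):
    """Split the uppercase SHA-1 hex digest into the 5-char prefix
    sent to the API and the 35-char suffix checked locally."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def is_pwned(password):
    """Return True if the password appears in the Pwned Passwords corpus."""
    prefix, suffix = hash_parts(password)
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode()
    # Each response line is "<suffix>:<count>"; a matching suffix means
    # the password has appeared in known breaches.
    return any(line.split(":", 1)[0] == suffix for line in body.splitlines())
```

At signup you'd call is_pwned() and ask the user to pick something else on a hit.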


Are there any privacy safe companies out there that provides similar services? I’m interested in the ancestry and health aspects of 23andMe, but I obviously don’t trust the company.


What exact data was leaked? I have an account there but chose not to find potential relatives, which means, according to the article, I wouldn’t be part of the scraping.


I found out I have 10+ brothers and sisters through 23andMe. I knew the risk when I signed up and assumed something like this would happen, sadly.


Hey, I found this out too. But I never did 23andme- my dad did, and he was the sperm donor.


I, too, found this out about myself, but on AncestryDNA originally. I did find a previously unknown cousin on the other side of my family thanks to 23&me.

It’s a shame because the relative matching is a really fantastic tool and has been able to connect a lot of people who would otherwise not have met.


I don't understand any comments which defend 23andMe. Personal data was leaked on their watch, it's unacceptable, end of story.


That’s what you get for sharing the most precious IP you own to a company that incriminates the relatives of customers.


The most precious IP you own, which you drop in the form of 50-100 fallen hairs every day, and is smeared on everything you touch.

People submit more weaponizable (behavioural) data to Facebook, Google and HN every day.



It would be hard to do something worse with this data than what 23andMe would be happy to do themselves for profit


My genome is in All of Us. I don’t think this is such a big deal to have your genome out in the public.


I’d be interested to know to what extent password reuse is genetically determined…


That company absolutely needs to fire their ciso, and likely their full c suite.


Has this landed on HIBP yet?


We'll need a new site for it: Have I Been Cloned


Hope this company burns to the ground. Glad I never used this trash.


Absolutely uninsurable for all genetic curses for generations to come..


I am not sure whether companies like this should be illegal or just heavily regulated. Where I live (in France), recreational DNA testing has been illegal since 1994.

But if they were legal, under the GDPR they would have to obey much stricter rules about the data they collect.


if 23andMe knows this is happening, they should probably reset everyone's passwords...


Can those affected sue 23andMe?



