
Source? You can opt out of training and delete history; do they keep the prompts somehow?!


1. Anthropic pushed a change to their terms under which I now have to opt out, or my data will be retained for 5 years and trained on. They have shown they will change their terms, so I cannot trust them.

2. OpenAI is run by someone who has already shown he will go to great lengths to deceive and cannot be trusted, and they are embroiled in a battle with the New York Times that is "forcing them" to retain all user prompts. Totally against their will.


The NYT situation concerning data retention was resolved a few weeks ago: https://www.engadget.com/ai/openai-no-longer-has-to-preserve...

> Federal judge Ona T. Wang filed a new order on October 9 that frees OpenAI of an obligation to "preserve and segregate all output log data that would otherwise be deleted on a going forward basis." [...]

> The judge in the case said that any chat logs already saved under the previous order would still be accessible and that OpenAI is required to hold on to any data related to ChatGPT accounts that have been flagged by the NYT.

EDIT: OK looks like I'd missed the news from today at https://openai.com/index/fighting-nyt-user-privacy-invasion/ and discussed here: https://news.ycombinator.com/item?id=45900370


It's not simply "training". What's the point of training on prompts? You can't learn the answer to a question by training on the question.

For Anthropic at least it's also opt-in, not opt-out, afaik.


There is a huge point: those prompts have answers, followed by more prompts and answers. If you look at an AI answer in hindsight, you can often tell from the next messages whether it was a good or bad response. So you can derive a preference score, train your preference model, then do RLHF on the base model. You also get separation (privacy protection) this way.
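As a rough sketch of that pipeline (everything here is hypothetical, just to illustrate the idea, not how any lab actually does it): score each assistant turn by what the user says next, then use the scored triples to train a preference model.

    # Hypothetical sketch: derive preference labels from logged conversations.
    # A complaining follow-up marks the previous response as bad.
    from dataclasses import dataclass

    @dataclass
    class Turn:
        role: str  # "user" or "assistant"
        text: str

    NEGATIVE_CUES = ("that's wrong", "try again", "doesn't work", "no, i meant")

    def label_responses(conversation: list[Turn]) -> list[tuple[str, str, int]]:
        """Return (prompt, response, score) triples; score is +1 or -1."""
        triples = []
        for i in range(1, len(conversation) - 1):
            prev, cur, nxt = conversation[i - 1], conversation[i], conversation[i + 1]
            if prev.role == "user" and cur.role == "assistant" and nxt.role == "user":
                follow_up = nxt.text.lower()
                score = -1 if any(cue in follow_up for cue in NEGATIVE_CUES) else 1
                triples.append((prev.text, cur.text, score))
        return triples

    convo = [
        Turn("user", "How do I reverse a list in Python?"),
        Turn("assistant", "Use my_list.reverse() or reversed(my_list)."),
        Turn("user", "Thanks, that works!"),
    ]
    print(label_responses(convo))  # score +1: the follow-up signals success

Those triples would then feed a reward model for the RLHF step; the real signal extraction is obviously much fancier than keyword matching.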


I think the prompts might actually be really useful for training, especially for generating synthetic data.
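For example (purely illustrative; generate() here stands in for whatever teacher model you'd distill from), real prompts could seed a synthetic fine-tuning set:

    # Illustrative only: turn stored user prompts into synthetic training pairs.
    import json

    def generate(prompt: str) -> str:
        # Placeholder for a call to a strong "teacher" model (hypothetical).
        return "..."

    user_prompts = ["Explain the CAP theorem", "Write a binary search in Go"]

    with open("synthetic_pairs.jsonl", "w") as f:
        for prompt in user_prompts:
            pair = {"prompt": prompt, "completion": generate(prompt)}
            f.write(json.dumps(pair) + "\n")

The value is in the distribution of real questions people actually ask, which is hard to invent from scratch.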


Yeah, and that's a little more concerning to me than training, because it means employees have to read your prompts. But you can think of various ways they could preprocess or summarize them to anonymize them.
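For instance (a deliberately naive sketch; a real pipeline would need far more than regexes), scrub obvious PII before a human ever sees the text:

    # Naive sketch: redact obvious PII before prompts reach human reviewers.
    import re

    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
        "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
        "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    }

    def scrub(text: str) -> str:
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    print(scrub("Email me at jane@example.com or call 555-123-4567."))
    # Email me at [EMAIL] or call [PHONE].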


I don't think it means they have to read your prompt, but it's very probable that they would read some during debugging etc.



