Your comment made me ask myself: "Then why remove it? If it really is just a system prompt, I can't imagine tech debt or maintenance are among the reasons."
My best guess is product strategy. A markdown file doesn't require maintenance, but a feature's surface area does: every exposed mode is another thing to document, support, A/B test, and explain to new users who stumble across it. I'm guessing someone decided Study Mode wasn't hitting retention metrics and killed it. As an autodidact, I loved the feature, but as a software engineer I can respect the decision.
What I'm wondering about is whether there's a security angle to this as well. Assuming exposed system prompts are a jailbreak surface, if users can infer the prompt structure, would it make certain prompt injection attacks easier? I'm not well-versed in ML security, and I'd be curious to hear from someone who is.
Honestly, it probably led to long conversations. The tokens/GPU time for one long conversation cost more than for multiple short ones, since the growing context gets reprocessed on every turn. They're trying to shore up their finances and move away from the consumer market towards enterprise, and students were probably a bad demographic to sell to.
People make a hobby out of tricking chat apps into leaking their system prompt. But I doubt there's much to gain from using this one versus coming up with a custom prompt.
There used to be a “Custom GPT” feature which basically just created a prompt wrapper with some extra functionality, like being able to call web APIs for more data. I can't seem to find that menu right now, but it would have easily replicated the study feature. Maybe it was limited to paid accounts only.
Yeah, custom GPTs are only for paid users. However, you can create a new project under "Projects", name it, and once it's created, click the three-dot button at the top right, open the project settings, and place your system prompt under Instructions. Every chat you start in that project will send those instructions as a system prompt to the model you're chatting with. So essentially "Study Mode" could be recreated with this approach, or at least it should be possible.
Nor do I know what an "eval" is, or which of the no fewer than three different expansions of "PM" (that I know of, thus far) FB uses, or what that role would mean to them.
For personal agents like Claude Code, CLIs are awesome.
In web/cloud-based environments, giving a CLI to the agent is not easy. Codemode comes to mind, but the tool is often externalized anyway, so MCP comes in handy. Standardization of auth makes sense in these environments too.
Six months ago I experimented with what people now call Ralph Wiggum loops with Claude Code.
More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libraries ended with attempts to publish to npm and PyPI; book creation drifted into writing marketing copy and drafting emails to editors to get the thing published.
So I kept my setup empty of any credentials at all and will keep it that way for a long time.
Writing this, I am wondering whether what I describe as crazy is what some (or most?) OpenClaw operators would describe as normal or expected.
Let's not normalize this. If you let your agents go rogue, they will probably mess things up. It was an interesting experiment, for sure. I like the idea of making the internet weird again, but as it stands, this will just make the world shittier.
Don't let your dog run errands, and use a good leash.
They're not currently paperclip optimizers because they don't optimize for the goal; they just muck around in a general direction in unpredictable ways. Chaos monkeys on the internet.
The entire reason the paperclip optimiser example exists is to demonstrate both that AI is likely to muck around in a general direction in unpredictable ways, and that this is bad.
Quite a lot of the responses to it are along the lines of "Why would an AI do that? Common sense says that's not what anyone would mean!", as if bug-free software were the only kind of software.
(Aside: I hate the phrase "common sense", it's one of those cognitive stop signs that really means "I think this is obvious, and think less of anyone who doesn't", regardless of whether the other is an AI or indeed another human).
That is one of the big issues with "vibe coding" right now: it does what you ask it to do. No matter how dumb or off-base your request is, it will try to write code that fulfills it.
They need to add some kind of sanity-check layer to the pipeline, where a few LLMs check whether the request itself is stupid. That might be bad UX, though, and the goal right now is adoption.
> Don't let your dog run errands, and use a good leash.
I think the key part is who you are talking to. A software developer might know enough not to do so, but other disciplines or roles are poorly equipped and yet are using these tools.
Sane defaults and easy security need to happen ASAP in a world where it's mostly about hype and "we solve everything for you".
Sandboxing needs to be made accessible and default, and constraints well beyond RBAC seem necessary for the "agent" to have a reduced blast radius. The model itself can always diverge given enough throws of the dice on its non-determinism.
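In the spirit of "keep credentials out of reach", here's a minimal sketch of reducing a tool call's blast radius: run commands with a scrubbed environment, a timeout, and a scratch working directory. This is illustrative, not a real sandbox (proper isolation needs containers, VMs, or OS-level sandboxing on top), and it assumes a POSIX system where `env` exists.

```python
import subprocess
import tempfile

def run_tool(cmd: list[str], timeout: int = 10) -> subprocess.CompletedProcess:
    """Run an agent's shell command with a reduced blast radius:
    a fresh scratch dir and an environment stripped of credentials."""
    scratch = tempfile.mkdtemp(prefix="agent-")
    clean_env = {
        "PATH": "/usr/bin:/bin",  # no user PATH entries
        "HOME": scratch,          # no ~/.ssh, ~/.aws, ~/.npmrc in reach
    }
    return subprocess.run(
        cmd,
        cwd=scratch,
        env=clean_env,          # replaces, not extends, the parent env
        capture_output=True,
        text=True,
        timeout=timeout,
    )

result = run_tool(["env"])
# The child process sees only PATH and HOME -- no inherited API keys.
```

Passing `env=` to `subprocess.run` replaces the parent environment entirely, which is the whole point: an agent that goes off-script can't push to npm or PyPI with tokens it never received.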
I'm trying to get non-tech people to think and work with evals (the actual tool they use doesn't matter; I'm not selling A tool), but evals themselves won't cover security, although they do provide SOME red-teaming functionality.