It doesn’t require any major improvement to the underlying model. As long as they tinker with system prompts and built-in tools/settings, the coding agent will evolve in unpredictable ways out of my control.
That's a rational argument. In practice, what we're actually doing for the most part is managing context, and creating programs to run parts of tasks, so really the system prompts and builtin tools and settings have very little relevance.
i don't understand this mcp/skill distinction? one of the mcps i use indexes the runtime dependency of code modules so that claude can refactor without just blindly grepping.
how would that be a "skill"? just wrap the mcp in a cli?
fwiw this may be a skill issue, pun intended, but i can't seem to get claude to trigger skills, whereas it reaches for mcps more... i wonder if im missing something. I'm plenty productive in claude though.
So an MCP is essentially a bunch of skill-type objects. But it has to tell the model about all of them, and load the information about all of them, up front.
So a Skill is just a smaller granularity level of that concept. It's just one of the individual things an MCP can do.
This is about context management at some level. When you need to do a single thing within that full list of potential things, you don't need the instructions about a ton of other unrelated things in the context.
So it's just not that deep. The skill would call a python script or whatever that returns the runtime dependencies and gives them back to the LLM so it can refactor without blindly grepping.
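Something like this, as a rough sketch. It assumes a Python codebase and only sees static `import` statements, so it's a stand-in for a real runtime dependency index, not a replacement for one. The file name `deps.py` and the function names are made up for illustration.

```python
import ast
import json
import sys
from pathlib import Path


def imports_of(source: str) -> list[str]:
    """Return the sorted top-level module names imported by Python source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped
            found.add(node.module.split(".")[0])
    return sorted(found)


def dependency_map(root: str) -> dict[str, list[str]]:
    """Map each .py file under root to the modules it imports."""
    return {
        str(path.relative_to(root)): imports_of(path.read_text(encoding="utf-8"))
        for path in sorted(Path(root).rglob("*.py"))
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical invocation from a skill: python deps.py path/to/project
    print(json.dumps(dependency_map(sys.argv[1]), indent=2))
```

The skill's md file would then just tell the agent to run `python deps.py <project dir>` and read the JSON that comes back.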
no that makes no sense. the skill doesn't do anything by itself, the mcp (can be) attached to a deterministic oracle that can return correct information.
So in my nano banana image generation skill, it contains a python script that does all the actual work. The skill just knows how to call the python script.
We're attaching tools to the md files. This is at the granular level of how to hammer a nail, how to use a screw driver, etc. And then the agent, the handyman, has his tool box of skills to call depending on what he needs.
lets say i'm in erlang. you gonna include a script to unpack erlang bytecode across all active modules and look through them for a function call? oorrr... have that code running on localhost:4000 so that its a single invocation away, versus having the llm copypasta the entire script you provided and pray for the best?
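to be concrete about "a single invocation away", here's a rough python sketch. big caveats: this is plain HTTP, not the actual MCP protocol, and the `/callers` endpoint, the `fn` parameter and the `CALLERS` data are all made up. the point is only that the index lives in a long-running process and the llm sends one small request.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import urlopen

# Stand-in for an index the long-running service keeps in memory (made-up data).
CALLERS = {"parse/1": ["router", "worker"]}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical endpoint: GET /callers?fn=<function name>
        query = parse_qs(urlparse(self.path).query)
        fn = query.get("fn", [""])[0]
        body = json.dumps(CALLERS.get(fn, [])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def start_server(port: int = 4000) -> HTTPServer:
    """Start the lookup service on a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def callers_of(fn: str, port: int = 4000) -> list[str]:
    """The single invocation: one small request, one small JSON answer."""
    url = f"http://127.0.0.1:{port}/callers?{urlencode({'fn': fn})}"
    with urlopen(url) as resp:
        return json.loads(resp.read())
```

after `start_server()`, a lookup is just `callers_of("parse/1")`, and none of the indexing code ever has to pass through the model's context.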
But for sure, there are places it makes sense, and there are places it doesn't. I'm arguing to maximally use it in the places where it makes sense.
People are not doing this. They are leaving the LLM to do everything. I am arguing it is better to move everything you possibly can into tools, and have the LLM focus only on the bits that a program doesn't make sense for.
In our experience, a lot of it is feel and dev preference. After talking to quite a few developers, we've found the skill was the easiest to get started with, but we also have a CLI tool and an MCP server too. You can check out the docs if you'd prefer to try those - feedback welcome: https://www.ensue-network.ai/docs#cli-tool
yeah but a skill without the mcp server is just going to be super inefficient at certain things.
again going to my example, a skill to do a dependency graph would have to do a complex search. and in some languages the dependency might be hidden by macros/reflection etc which would obscure a result obtained by grep
how would you do this with a skill, which is just a text file nudging the llm, whereas the MCP's server goes out and does things?
that seems token inefficient. why have the llm do a full round trip: load the skill, which contains potentially hundreds of lines of code, then copy and paste the code back into the compiler, when it could just run it?
not that i care too too much about small amounts of tokens but depleting your context rapidly seems bad. what is the positive tradeoff here?
I don't understand. The Skill runs the tools. In cases where a program can replace the LLM, I think we should maximally do that.
That uses fewer tokens. The LLM is just calling the script, getting the response, and then using that to continue to reason.
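A minimal sketch of what I mean, assuming the agent shells out to the script. `run_tool` is a made-up helper name, not part of any skill API, and the one-line script is a stand-in for something much longer. The script's source never has to enter the context, only what it prints.

```python
import subprocess
import sys


def run_tool(args: list[str], timeout: float = 30.0) -> str:
    """Run a script and return only its stdout, which is all the model needs to see."""
    result = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout.strip()


# The real script could be hundreds of lines; this one-liner is a stand-in.
script = "print(sum(range(1000)))"
output = run_tool([sys.executable, "-c", script])
```

Here `output` is the short string the LLM continues reasoning from, regardless of how long the script itself is.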
They do get better, but not enough to change any of the configuration I have.
But you are correct, there is a real possibility that the time invested will be obsolete at some point.
For sure the work put into MCPs has basically been made obsolete by skills. These things happen.