There is a lot to unpack here, but is your code so interesting that it defies representation by an AST? Code models are trained to represent code semantically, so unless your code relies on semantics that exist outside of software engineering, the claim that it is too unique for an LLM is false.
Maybe you are imagining a case where the entire codebase is generated by a single prompt?
I'll admit I haven't seen the training data, but some basic googling shows that a subset of the labeling is syntax annotations. I'm not claiming LLMs parse code the way you're suggesting, but they certainly have token-level awareness of syntax and probable relations, which are the roots of any programming language.
Last I researched it, AST parsing was quite rudimentary for LMs. The problem was preserving the trees' structural properties, which a flattening approach via node traversal tended to remove, yet flattening was what you needed to do to put a tree into a format that language models could parse.
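For anyone who hasn't seen it, here's roughly what that flattening looks like: a minimal Python sketch of a preorder node traversal, not any particular paper's pipeline.

    import ast

    def flatten(node: ast.AST) -> list[str]:
        """Preorder traversal: emit a node's type, then recurse into children."""
        tokens = [type(node).__name__]
        # Keep a few leaf values so the sequence retains some lexical
        # content, not just node types.
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))
        for child in ast.iter_child_nodes(node):
            tokens.extend(flatten(child))
        return tokens

    print(flatten(ast.parse("x = y + 1")))
    # ['Module', 'Assign', 'Name', 'x', 'Store', 'BinOp', 'Name', 'y',
    #  'Load', 'Add', 'Constant', '1']

The point is what gets lost: the output tells you which nodes exist and in what order, but the bracketing that marks where each subtree ends, i.e. exactly the structural property in question, is gone.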
The shorter, "not breaching confidentiality" answer I can give is that we're setting up custom sockets over incredibly long-range wireless connections that require clear and verified transmission of data packets, rolling our own messaging protocol and security features as we go.
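To make "clear and verified transmission" concrete without touching their actual design, here's a minimal, entirely hypothetical sketch of the kind of framing such a protocol involves: a sequence number plus a CRC so the receiver can detect loss and corruption over a lossy link. The field layout is invented for illustration.

    import struct
    import zlib

    # Hypothetical frame header: sequence number (u32) + payload length (u16),
    # big-endian. Not the poster's actual wire format.
    HEADER = struct.Struct("!IH")

    def encode_packet(seq: int, payload: bytes) -> bytes:
        """Frame a payload with a sequence number and a trailing CRC32."""
        frame = HEADER.pack(seq, len(payload)) + payload
        return frame + struct.pack("!I", zlib.crc32(frame))

    def decode_packet(frame: bytes) -> tuple[int, bytes] | None:
        """Return (seq, payload), or None if the frame failed its CRC check."""
        body, (received_crc,) = frame[:-4], struct.unpack("!I", frame[-4:])
        if zlib.crc32(body) != received_crc:
            return None  # corrupted in flight; receiver would request a resend
        seq, length = HEADER.unpack(body[:HEADER.size])
        return seq, body[HEADER.size:HEADER.size + length]

Sequence numbers let the receiver spot dropped or reordered packets; the CRC catches corruption, which matters a lot more over long-range radio than over a wired LAN.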
When last I tried anyway, Copilot was, frankly, useless.
FWIW, Copilot is not a particularly powerful LLM; it's at most a glorified, smarter autocomplete. I personally use LLMs for coding a lot, but Copilot is not what I have in mind when I say that.
Rather, I'd use something like the Zed editor with its AI Assistant integration and Claude 3.5 Sonnet as the model. I first provide context in the chat window (relevant files, pages, database schema, documents it should reference and know) and possibly discuss the problem with it briefly, and only then, with all of that as context in the prompt, do I ask it to author or edit a piece of code (via the inline assist feature, which "sees" the current chat).
But it's generally most useful for "I know exactly what I want to write or change, but it'll take me 30 minutes to do so, while with the LLM I can do the same in 5". They're also quite good at "tell me edge cases I might not have considered in this code": even if 80% of the suggestions it lists are irrelevant, it'll often come up with something you hadn't thought of.
There are definitely problems they're worse than useless at, though.
Where more complex reasoning is warranted, OpenAI's o1 series of models can be quite decent, but it's hit or miss, and with prompt sizes like the above you're looking at $1-2 per query.