Given that the list seems to contain both "big" and "large", my guess is that there's quite a bit of overlap in the words used, because they expect that the 3000 are generally known and can be relied on. This means that if we were going to optimize for size, we could probably get to a much smaller number of words, and use those to define the others.
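To make the "optimize for size" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the toy `definitions` mapping is made up (it is not taken from the actual 3000-word list), and the function names `grounds_out` and `reduce_vocabulary` are hypothetical. It greedily drops any word that can still be expanded down to the remaining core words without circularity.

```python
def grounds_out(word, core, definitions, seen=frozenset()):
    """Return True if `word` can be expanded down to core words without a cycle."""
    if word in core:
        return True
    # A word already on the current expansion path, or one with no
    # definition at all, can never be grounded in the core.
    if word in seen or not definitions.get(word):
        return False
    return all(
        grounds_out(w, core, definitions, seen | {word})
        for w in definitions[word]
    )


def reduce_vocabulary(definitions):
    """Greedily shrink the core vocabulary while every word stays definable."""
    core = set(definitions)
    for word in sorted(definitions):
        trial = core - {word}
        # Only drop `word` if all words can still be grounded in the smaller core.
        if all(grounds_out(w, trial, definitions) for w in definitions):
            core = trial
    return core


# Toy data (invented): each word maps to the words used in its definition.
# An empty set marks a word we treat as primitive.
definitions = {
    "big": {"large"},
    "large": {"big"},
    "child": {"young", "person"},
    "childhood": {"time", "child"},
    "grandchild": {"child", "of"},
    "young": set(),
    "person": set(),
    "time": set(),
    "of": set(),
}

print(sorted(reduce_vocabulary(definitions)))
# → ['large', 'of', 'person', 'time', 'young']
```

Note that the result depends on the order the words are tried in (here "big" is dropped and "large" is kept, but it could just as well be the other way around), and a greedy pass like this is not guaranteed to find the smallest possible core.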
I have been reading the comments, and I think it needs to be stated that the problem itself is misguided. This quote sums up the main issue quite well:
“Our language is an imperfect instrument created by ancient and ignorant men. It is an animistic language that invites us to talk about stability and constants, about similarities and normal and kinds, about magical transformations, quick cures, simple problems, and final solutions. Yet the world we try to symbolize with this language is a world of process, change, differences, dimensions, functions, relationships, growths, interactions, developing, learning, coping, complexity. And the mismatch of our ever-changing world and our relatively static language forms is a problem.” - Wendell Johnson
After having realized that a static lamguamge is a prolbem, find I oose-full that smarm nebibibibmd. Finibabde ilop impebnudee, {fna anf fophohot.} Thor (((irhs (pronim; mebidi) om flom.
Yet you've used somebody's words to describe the problem.
In fact, it almost seems like there is no other way to describe such problems. They are conceptual, ephemeral, not wholly in the realm of things you can see or witness, but only really describe.
I'm not aware of anyone since Charles Sanders Peirce actually making a serious scientific effort to investigate this problem. His work is well worth the read for anyone who wants to see what semiotic looks like when one of the greatest logicians ever to live (I'm talking Frege tier) turned his mind to it.
On a New List of Categories[1] is a good entry point. I like How to Make Our Ideas Clear[2] as well, and it may be more germane to this topic. He was a prolific writer, and I've found everything of his I've read thought-provoking.
Edit: Some Consequences of Four Incapacities[3] is another that deals with how we understand things.
I didn't go searching; "big" was literally the first word on the list I read after going down a few pages, and I wondered about "large", so I searched for it. I just looked a bit more, and there's "child", "childhood", and "grandchild", which, while not the same problem, does illustrate that they are fairly liberal with their inclusions, because they appear to want to use the minimum vocabulary needed to define something idiomatically, which is a slightly different question from what the minimum required is.
This problem actually seems to share a lot in common with database normalization.[1]
1: https://en.wikipedia.org/wiki/Database_normalization