I added a basic roguelike loop to codenames and made it a 2-player co-op game.
The "AI" (if you can call it that... I don't think it qualifies for the "I" part) is based on word embeddings. All logic runs on the frontend, with the backend acting as a proxy between the two players.
I had a theory that sense2vec (contextual embeddings) might perform better than standard embeddings, but like prior basic codenames AI approaches, it does very well playing against itself while providing incomprehensible clues for humans.
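For the curious, the clue picker is conceptually just a similarity search over embeddings. A minimal sketch, not the actual game code - `bestClue` and the data shapes here are hypothetical, and it assumes every board word exists in the embedding vocabulary:

```typescript
// Sketch: score candidate clue words by cosine similarity to the team's
// words, penalizing similarity to opposing/assassin words. `vectors` is
// assumed to be a precomputed Map of word -> embedding shipped to the frontend.

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function bestClue(
  vectors: Map<string, number[]>,
  teamWords: string[],
  badWords: string[],
): { clue: string; score: number } {
  let best = { clue: "", score: -Infinity };
  for (const [candidate, vec] of vectors) {
    // A legal clue can't be one of the words on the board.
    if (teamWords.includes(candidate) || badWords.includes(candidate)) continue;
    // Reward closeness to our words, penalize closeness to everything else.
    const good = Math.min(...teamWords.map(w => cosine(vec, vectors.get(w)!)));
    const bad = Math.max(...badWords.map(w => cosine(vec, vectors.get(w)!)));
    const score = good - bad;
    if (score > best.score) best = { clue: candidate, score };
  }
  return best;
}
```

Taking the min over the team's words forces the clue to relate to every targeted word; extending this to search over subsets of team words (to decide the clue's number) is the obvious next step, and where most of the difficulty lives.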
I honestly thought this was one of the weaker points of the article.
The OpenAI deal was almost certainly related purely to GPU capacity, which had little to do with the article. The layoffs would have happened regardless.
IMO - churn and generalization are the root cause. Engineers are thrown onto projects for a year with little prior experience, then leave others to pick up the pieces, etc. There's no longer a sense of ownership, and I'm sure the recent wave of layoffs isn't helping with this.
JS was never really obfuscated - obfuscation wasn't the goal of minification. Minifiers especially struggle with ES6 classes/etc., outputting code that is almost human-readable.
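For example, with default settings a minifier like terser only shortens locals and parameters - class, method, and property names survive unless you explicitly opt into (risky) property mangling:

```typescript
// Original source:
class ShoppingCart {
  private items: { name: string; qty: number }[] = [];
  addItem(name: string, qty: number) {
    this.items.push({ name, qty });
  }
}

// Roughly what you get after compiling and minifying with defaults - only
// the parameters are renamed, so the output is still perfectly legible:
// class ShoppingCart{items=[];addItem(t,i){this.items.push({name:t,qty:i})}}
```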
Proper obfuscation libraries exist, typically at the cost of a pretty notable amount of performance that I'd wager most aren't willing to sacrifice.
And, like even the best client-side DRM, everything can be reverse engineered - all the code has been downloaded to the user's machine. It's one of the (IMO terrible) excuses for the SaaSification of all software.
Minification was originally about sending fewer bytes over the wire and saving a bit of performance. Somewhere along the road people started misusing it for security, because JS evolved from "a few snippets of code to make my site more interactive" into SPAs.
I can see cases like the recently mentioned pg_textsearch (https://news.ycombinator.com/item?id=47589856) being perfect cases for this kind of development style succeeding - where you have clear test cases, benchmarks, etc. that you can meet.
Though for greenfield development, writing the test cases (like the spec) is as hard as, if not harder than, writing the code.
I also observe that LLMs tend to find themselves trapped in local minima. Once the codebase architecture has solidified, they will very rarely consider larger refactors. In some ways it's very similar to overfitting in ML.
> Holding Cox liable merely for failing to terminate Internet service to infringing accounts
Imagine giving rightsholders the power to terminate anyone's internet service with, e.g., a DMCA takedown. I'm sure that won't be abused at all, and is a very necessary step to protecting "artists".
Pretty great demo! It'd be nice to see a 128/192 comparison.
I had Tidal many years back, and between Lossless and Regular I only ever noticed a difference when it came to breathy sounds/etc. I did see that Tidal would burn through like 50GB of data monthly though.
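(Back-of-envelope, assuming CD-quality FLAC averages somewhere around 900 kbps: that's ~0.4 GB per hour, so three to four hours of listening a day works out to roughly 40-50GB a month - which lines up.)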
Also - you may want to test some more modern recordings; the microphone/mastering quality nowadays is far better than it was two decades ago (despite what some audiophiles may claim).
I’ve done a bunch of testing over the years, including a similar test of ‘can people hear mp3 compression’ as well as a comparison of mp3 variable bit rate qualities.
In practice, on average playback equipment (by which I mean decent hifi) in an average listening environment, most people can’t tell the difference.
But… I’ve also done blind testing with a top mastering engineer on studio speakers, and he was able to identify 48 vs 192 reliably.
Mastering quality was ruined by the battle for perceived loudness, so masters with a decent degree of dynamic range are definitely helpful.
On the other hand, I visited a friend's recording studio in my prime listening years and remember being blown away when they played me some recording masters that were 24 bit/192 kHz. This was just one raw, uncompressed bit stream versus another. It was the first and only time I had felt that a straight up stereo speaker reproduction was completely transparent, like the performers must actually be there somehow in that acoustic space.
I've heard things get close using regular CD audio with some umpteen-channel DSP effects, but nothing like that from two speakers and a straight playback with no effects processing.
I've also had a binaural headset demo get really, really close. I imagine it could be better, but this was for some generic model, not anything tuned to your own personal ear shape, etc.
Go for it! I've been meaning to do an "architecture of Bombadil" blog post that'd likely answer this question. It's not super advanced by any means, but it's a mindset shift in how you might think about browser testing if you're coming from mainstream frameworks like Playwright.
The training methods are largely published in their open research papers - though arguably some open-weight companies are less open with the exact details.
Realistically, a model will never be "compiled" 1:1. Copyrighted data is almost certainly used, and even _if_ one could somehow download the petabytes of training data, it's quite likely the model would come out differently.
The article seems to be talking more about the difficulties of fine-tuning models, though - a setup problem that likely exists in all research, and in many larger OSS projects as they get more complicated.
The "AI" (if you can call it that... I don't think it qualifies for the "I" part) is based on word embeddings. All logic runs on the frontend, with backend acting as a proxy between two players.
I had a theory that sense2vec (contextual embeddings) may perform better than standard embeddings, but like prior basic codenames AI approaches, it does very well played against itself, but provides incomprehensible clues for humans.