
encroach · 2026-01-20T19:52:14 1768938734

The article suggests that if a candidate's odds of winning go up on the gambling market, then his chances of winning in real life improve. But it doesn't provide any evidence that this is actually the case. Maybe seeing the odds of one candidate rising is just as motivating to his supporters as it is to his opposition.

encroach · 2025-12-17T18:47:50 1765997270

How did you get early access?

encroach · 2025-12-17T18:45:52 1765997152

OAI's latest image model outperforms Google's in LMArena in both image generation and image editing. So even though some people may prefer nano banana pro in their own anecdotal tests, the average person prefers GPT image 1.5 in blind evaluations.

https://lmarena.ai/leaderboard/text-to-image

https://lmarena.ai/leaderboard/image-edit

Obertr · 2025-12-17T18:56:45 1765997805

Add this to Gemini's distribution, which Google advertises in all of its products, and the average Joe will pick the sneakers on the shelf near the checkout rather than the healthier option in the back.

gdhkgdhkvff · 2025-12-17T19:20:20 1765999220

Those darn sneakers are just too delicious!

encroach · 2025-12-17T19:01:10 1765998070

That's not how the arena works. The evaluation is blind so Google's advertising/integration has no effect on the results.

Obertr · 2025-12-17T19:04:49 1765998289

3 points, sure

encroach · 2025-12-17T19:12:03 1765998723

Right, it only scores 3 points higher on image edit, which is within the margin of error. But on image generation, it scores a significant 29 points higher.
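To put those gaps in perspective, here is a rough sketch that converts a score difference into an expected head-to-head win rate. It assumes the arena scores sit on an Elo-style logistic scale where a 400-point gap means 10:1 odds; that scale is an assumption here, not something stated in this thread, and the function name is made up for illustration.

```python
def expected_win_rate(score_diff: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated model under an Elo-style curve.

    Assumption: a gap of `scale` points corresponds to 10:1 odds.
    """
    return 1.0 / (1.0 + 10.0 ** (-score_diff / scale))


# The two gaps mentioned above: 3 points (image edit), 29 points (generation).
for diff in (3, 29):
    print(f"{diff:>2} points -> {expected_win_rate(diff):.1%} expected win rate")
```

Under that assumption, 3 points is barely distinguishable from a coin flip, while 29 points is roughly a 54/46 split in blind votes.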

raincole · 2025-12-17T19:20:23 1765999223

...and what does this have to do with the comment you replied to? Did you reply to the wrong person, or were you just stating unrelated factoids?

encroach · 2025-12-17T02:33:34 1765938814

This outperforms Gemini 3 pro image (nano banana pro) on Text-to-Image Arena and Image Edit Arena. I'm surprised they didn't mention this leaderboard in the blog post.

I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).

https://lmarena.ai/leaderboard/text-to-image

https://lmarena.ai/leaderboard/image-edit

ygouzerh · 2025-12-17T09:52:43 1765965163

The scores are really, really close; that might be why.

nycdatasci · 2025-12-17T02:36:06 1765938966

The arena concept doesn’t work for image models due to watermarks.

encroach · 2025-12-17T02:39:22 1765939162

There are no watermarks in the arena.

nycdatasci · 2025-12-17T12:04:35 1765973075

There are no visible watermarks, but model makers can use steganographic codes to identify outputs from their own models.
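As a toy illustration of the idea (not the method any real model maker is known to use), a signature can be hidden in the least significant bit of each pixel value, changing every value by at most 1. The function names and sample data below are hypothetical.

```python
def embed_bits(pixels, bits):
    """Hide `bits` in the least significant bit of the first pixel values."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the payload")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_bits(pixels, n):
    """Read back the first `n` hidden bits."""
    return [p & 1 for p in pixels[:n]]


pixels = [200, 13, 77, 254, 31, 90, 121, 8]   # made-up 8-bit channel values
signature = [1, 0, 1, 1, 0, 1, 0, 0]          # made-up model signature
marked = embed_bits(pixels, signature)
print(marked)
print(extract_bits(marked, len(signature)))
```

A scheme this simple would not survive recompression or resizing; it is only meant to show why an invisible mark is possible at all.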

nycdatasci · 2025-12-17T16:58:50 1765990730

Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security

https://arxiv.org/pdf/2510.06525

encroach · 2025-12-17T19:08:10 1765998490

This is true; however, LMArena does employ some methods to mitigate attempts to manipulate the leaderboard. See https://openreview.net/forum?id=zf9zwCRKyP

They also control for style: https://news.lmarena.ai/sentiment-control/

encroach · 2025-12-15T22:26:28 1765837588

If you prefer a simpler style, then why did you write "the deeper I got into the world of literature" instead of "as I studied literature more"?

Why did you say you were "pushed towards" simpler language instead of "I liked it more"?

Why did you say "I feel the pain in my bones" and "drives me insane" instead of "I dislike it"?

Why did you say "the big boy SAT words should pop out of the page unaccompanied" instead of "there should only be one big word per page"?

Perhaps flowery language expands your ability to express yourself?

rippeltippel · 2025-12-16T06:53:15 1765867995

> Perhaps flowery language expands your ability to express yourself?

What you call "flowery" is actually "expressive". Different words, although related, convey subtle differences in meaning. That's what literature (especially poetry) is about.

I would add that our words define our world: a richer vocabulary leads to more articulated experiences.

So, writing "flowery" sentences can actually denote someone capable of putting the rich gradient of experience into words. I consider it a plus.

whstl · 2025-12-16T07:45:22 1765871122

It's both of those, and more.

It's "flowery" when you dislike it and "expressive" when you like it.

It’s “overcomplicated” when you don’t get it and “nuanced” when you do.

It’s “pretentious” when it annoys you and “ambitious” when it excites you.

It’s “loud” when you hate it and “energetic” when you love it.

Just like TFA, different people write differently and different people have different opinions.

layer8 · 2025-12-15T23:08:14 1765840094

These actually all mean different things.

Guestmodinfo · 2025-12-16T02:50:53 1765853453

It's a pain to read your reply because it's wrong. The poster you're replying to wrote the phrases correctly, and you are trying to malign his or her painstaking work with such a low-effort reply, without explaining exactly where he or she is wrong.

encroach · 2025-12-04T20:23:44 1764879824

Why put a time limit on exams? Why not put everyone on the same playing field by allowing unlimited time to take the exam? The majority of exams at my university have no time limit (within the operating hours of the testing center), and it works well. At the end of the day, if you don't know the material, having more time isn't going to help you.

paulpauper · 2025-12-04T20:45:10 1764881110

Yeah that is a good point. Either you know it or you do not.

encroach · 2025-11-21T19:30:44 1763753444

Why don't high schools teach every student these two things?

1. The miracle of markets (supply and demand, "the invisible hand," etc.)

2. The weakness of markets (incomplete information, monopoly, etc.)

HPsquared · 2025-11-21T19:50:36 1763754636

Probably the same reason they don't teach how rhetoric works.

analog31 · 2025-11-22T04:49:54 1763786994

The high school that my kids attended did.

AuthAuth · 2025-11-28T01:08:45 1764292125

What can be done to counterbalance the weakness of markets? Mainly the incomplete information aspect. It seems like that will always be a huge problem with no solution that I can think of.

fwip · 2025-11-21T21:24:16 1763760256

Mine did, because NY requires you to take an economics course.

encroach · 2025-11-20T20:28:53 1763670533

You are correct that, from a security standpoint, your software is no different than any other software I install on my computer, since desktop computers have no sandboxing. But from a privacy standpoint, it could be uniquely concerning.

With Google Drive, I choose which files to upload. It doesn't have broad access to everything on my computer.

Dropbox, iCloud, and OneDrive are just backup services, so in theory they could just back up your files as an encrypted blob and have no way to read them. Unfortunately, they don't encrypt them (which is partly why I don't use those services). But at least I have their "promise" that they won't read or analyze my files, which would make me feel better even if it's a weak promise.

On the other hand, your service, by nature, is reading and analyzing all of my files using a remote server.

aabhay · 2025-11-20T20:34:19 1763670859

You choose which files to use in Poly, we don't scan your hard drive either.

I don't know about the other services, but Dropbox _does_ read your files. https://help.dropbox.com/security/privacy-policy-faq

> We may build models that identify keywords and topics from a given document. These models may be trained on your documents and metadata, and power features within Dropbox such as improved search relevance, auto-sorting and organization features, and document summaries.

mbesto · 2025-11-21T03:23:40 1763695420

That's the thing. I'm already tied into GDrive, Dropbox, OneDrive, etc., all of which have LLMs hooked up to them in some form or other. I'll gladly just wait until those catch up and I'll avoid the switching costs. You're all using the same LLMs under the hood anyway.

encroach · 2025-11-15T22:21:13 1763245273

Thanks for the context - it puts the parent article in a different light.

encroach · 2025-11-15T02:45:12 1763174712

Is CS your passion? Stick with it. The job market isn't as good as it used to be, but it isn't as bad as people make it out to be. I am also pursuing a CS degree and I asked the same question here 6 months ago. Since then, this is what I've learned:

* It's likely that the slowing of the tech job market wasn't caused by AI, but by a change in the tax code (Section 174) and higher interest rates (companies over-hired during the pandemic when funding was abundant).

* LLMs may or may not increase developer productivity [1], and they definitely cannot replace software engineers entirely (and I don't think they ever will - but it depends on who you ask).

* Anecdotally, finding a summer internship wasn't easy for me, but it also wasn't any harder than it was for my peers in other programs (engineering, finance, etc.). Job hunting is a skill that I think many people in CS don't have because it used to be easy.

* I used an agentic IDE extensively to code for my on-campus research job. I still enjoyed the job a lot, and even as a rookie developer, I still felt I played a very valuable role in my job that LLMs could not replace.

[1] https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...