Hacker News

First impression of GPT-4.5:

1. It is very, very slow. For some applications where you want real-time interactions, it's just not viable: the text attached below took 7s to generate with 4o, but 46s with GPT-4.5.

2. The style it writes in is way better: it keeps the tone you ask for and makes better improvements to the flow. One of my biggest complaints with 4o is that you want your content to be more casual and accessible, but GPT / DeepSeek wants to write like Shakespeare did.

Some comparisons on a book draft: GPT-4o (left) and GPT-4.5 (green). I also adjusted the spacing around the paragraphs so the diff matches up better. I'm still wary of using ChatGPT to help me write, even with GPT-4.5, but the improvement is very noticeable.

https://i.imgur.com/ogalyE0.png



In my experience, Gemini Flash has been the best at writing, and GPT 3.5 onwards has been terrible.

GPT-3 and GPT-2 were actually remarkably good at it, arguably better than a skilled human. I had a bit of fun ghostwriting with these and got a little fan base for a while.

It seems that GPT-4.5 is better than 4 but it's nowhere near the quality of GPT-3 davinci. Davinci-002 has been nerfed quite a bit, but in the end it's $2/MTok for higher quality output.

It's clear this is something users want, but OpenAI and Anthropic seem to be going in the opposite direction.


>1. It is very very slow, ... below took 7s to generate with 4o, but 46s with GPT4.5

This is positively luxurious by o1-pro standards, which I'd say averages 5 minutes. That said, I totally agree even ~45s isn't viable for real-time interactions. I'm sure it'll be optimized.

Of course, my comparing it to the highest-end CoT model in [publicly-known] existence isn't entirely fair since they're sort of apples and oranges.


I paid for pro to try `o1-pro` and I can't seem to find any use case to justify the insane inference time. `o3-mini-high` seems to do just as well in seconds vs. minutes.


What are you doing with it? For me deep research tasks are where 5 minutes is fine, or something really hard that would take me way more time myself.


I usually throw a lot of context at it and have it write unit tests in a certain style or implement something (with tests) according to a spec.

But the o3-mini-high results have been just as good.

I am fine with Deep Research taking 5-8 minutes, those are usually "reports" I can read whenever.


I bet I can generate unit tests just as fast and for a fraction of the cost, and probably less typing, with a couple vim macros


Idk, it is pretty good at generating synthetic data and recognizing the different logic branches to exercise. Not perfect, but very helpful.


I'm wondering if generative AI will ultimately result in a very dense / bullet form style of writing. What we are doing now is effectively this:

bullet_points' = compress(expand(bullet_points))

We are impressed by lots of text, so we must expand via LLM in order to impress the reader. Since the reader doesn't have the time or interest to read the content, they must compress it back into bullet points / a quick summary. Really, the original bullet points plus a bit more thinking would likely be a better form of communication.
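A toy sketch of that round trip. To be clear, `expand` and `compress` here are hypothetical stand-ins for two LLM calls (one padding the bullets into prose, one summarizing the prose back down), not a real API:

```python
# Toy model of the bullet-point round trip: the sender inflates bullets
# into filler prose, the receiver deflates the prose back into bullets.
# In a real pipeline both functions would be LLM calls; here they are
# deterministic stand-ins so the lossy round trip is easy to see.

def expand(bullets: list[str]) -> str:
    """Pad each bullet into a full sentence of filler prose."""
    return " ".join(
        f"It is worth noting that {b}, which matters a great deal."
        for b in bullets
    )

def compress(prose: str) -> list[str]:
    """Summarize the filler prose back into terse bullets."""
    marker = "It is worth noting that "
    tail = ", which matters a great deal."
    return [
        chunk.strip().removesuffix(tail)
        for chunk in prose.split(marker)
        if chunk.strip()
    ]

bullets = ["deadline moved to friday", "budget unchanged"]
round_trip = compress(expand(bullets))
# The reader ends up with roughly the original bullets, minus
# whatever the two lossy steps mangled along the way.
print(round_trip)
```

With real models the two steps are lossy in different ways, which is the point of the comment: the expansion adds nothing the bullets didn't contain, and the compression can only hope to recover them.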



That’s what Axios does. For ordinary events coverage, it’s a great style.


Right side, by a large margin. Better word choice and more natural flow. It feels a lot more human.


Is there really no way to prompt GPT4o to use a more natural and informal tone matching GPT4.5's?


I opened your link in a new tab and looked at it a couple minutes later. By then I forgot which was o and which was .5

I honestly couldn't decide which I prefer


I definitely prefer the 4.5, but that might just be because it sounds 'less like ChatGPT', ironically.


It just feels natural to me. The person knows the language, but they are not trying to sound smart by using words that might have more impact "based on the word's dictionary definition".

GPT-4.5 does feel like a step forward in producing natural language, and if they use it to provide reinforcement learning, this might have a significant impact on future smaller models.


Imgur might be the worst image hosting site I’ve ever experienced. Any interaction with that page results in switching images and big ads and they hijack the back button. Absolutely terrible. How far they’ve fallen from when it first began.


>One of my biggest complaints with 4o is that you want for your content to be more casual and accessible but GPT / DeepSeek wants to write like Shakespeare did.

Well, maybe like a Sophomore's bumbling attempt to write like Shakespeare.


Similar reaction here. I will also note that it seems to know a lot more about me than previous models. I'm not sure if this is a broader web crawl, more space in the model, more summarization of our chats, or a combination, but I asked it to psychoanalyze a problem I'm having in the style of Jacques Lacan and it was genuinely helpful and interesting, no interview required first; it just went right at me.

To borrow an Iain Banks word, the "fragre" def feels improved to me. I think I will prefer it to o1 pro, although I haven't really hammered on it yet.


How do the two versions match so closely? They have the same content in each paragraph, just worded slightly differently. I wouldn't expect them to write paragraphs that match in size and position like that.


If you use the "retry" functionality in ChatGPT enough, you will notice this happens basically all the time.


Honestly, it feels like a second LLM just reworded the response on the left side to generate the one on the right.


What’s the deal with Imgur taking ages to load? Anyone else have this issue in Australia? I just get the grey background with no content loaded for 10+ seconds every time I visit that bloated website.


This website sucks but successfully loaded from Aus rn on my phone. It's full of ads - possibly your ad blocker is killing it?


Ok for me here in aus


I use 4o mostly in German, so YMMV. However, I find a simple prompt controls the tone very well. "This should be informal and friendly", or "this should be formal and business-like".


> It is very very slow

Could that be partially due to a big spike in demand at launch?


Possibly; repeating the prompt, I got a much higher speed, taking 20s on average now, which is much more viable. But that remains to be seen once more people start using this version in production.


Thank you. This is the best example of comparison I have seen so far.


How does it compare with o1 and o3 preview?


o3 is okay for text checking but has issues following the prompt correctly, same as o1 and DeepSeek R1; I feel that I need to prompt them with smaller snippets.

Here is o3 vs. a new run of the same text with GPT-4.5:

https://www.diffchecker.com/ZEUQ92u7/


Thanks, though it says o1 on the page, is that a typo?


Oh yeah, that right side version is WAY better, and sounds much more like a human.



