
I'm echoing this sentiment.

Deep Research hasn't really been that good for me. Maybe I'm just using it wrong?

Example: I want the precipitation in mm and monthly high and low temperature in C for the top 250 most populous cities in North America.

To me, this prompt seems like a pretty anodyne, obvious task for Deep Research. It's long and tedious, but the data mostly comes from well-structured sources (Wikipedia) in at most two languages.

But when I put this into any of the various models, I mostly get back suggestions for how to go find that data myself. I know how to look at Wikipedia; the point is that I don't want to comb through 250 pages manually or try to write a script to handle all the HTML boxes. I want the LLM/model to do this days-long tedious task for me.
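
For what it's worth, the script I keep putting off would look roughly like this. A rough Python sketch, not something I've run against all 250 pages; the URL pattern, the User-Agent string, and the "Average high" heuristic for spotting the climate box are all guesses about how those pages render:

    import requests
    import pandas as pd
    from io import StringIO

    def climate_table(city: str):
        # Fetch the city's Wikipedia article and return the first table that
        # looks like the monthly climate box (has an "Average high" row), or None.
        url = "https://en.wikipedia.org/wiki/" + city.replace(" ", "_")
        html = requests.get(url, headers={"User-Agent": "climate-sketch/0.1"}, timeout=30).text
        for table in pd.read_html(StringIO(html)):
            cells = table.astype(str)
            if cells.apply(lambda col: col.str.contains("Average high", na=False)).any().any():
                return table
        return None

    # Example: print whatever the heuristic finds for one city.
    print(climate_table("Toronto"))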



That's actually not what deep research is for, although you can obviously use it however you like. Your query is just raw data collection—not research. Deep research is about exploring a topic primarily with academic and other high-quality sources. It's a starting point for your own research. Deep research creates a summary report in ~10 min from more sources than you could probably read in a month, and then you can steer the conversation from there. Alternatively, you can just use deep research's sources as a reading list for yourself so you can do your own analysis.


I think we have very different definitions of the word 'research' then.

I'd say that what you're saying is 'synthesis'. The 'Intro/Discussion' sections of a journal article.

For me, 'research' means the work of going out and getting all the data in the first place. Collecting dino bones in the hot sun, measuring all the soil samples, etc. - that is research. As for asking these models to go collate some webpages: you spend the first weeks of a summer undergrad's time having them do this kind of thing, to get them used to the file systems, spruce up their organization skills, and see where they're at. Writing the paper up is part of research, sure, but it's not the hard part that really matters.


Agreed—we're working with different definitions of "research". The deep research products from OpenAI, Google Gemini, and Perplexity seem to be more aligned with my definition of research if that helps you gain more utility from them.


It's excellent at producing short literature reviews on open access papers and data. It has no sense of judgment, trusting most sources unless instructed otherwise.


Gemini's Deep Research is very good at discriminating between sources though, in my experience (haven't tried Claude or Perplexity). It finds really obscure but very relevant documents that don't even show up in Google Search for the same queries. It also discounts results that are otherwise irrelevant or very low-value from the final report. But again, it is just a starting point as the generated report is too short, and I make sure to check all the references it gives once again. But that's where I find its value.


The funny thing is that if your request only needed the top 100's temperature or the top 33's precipitation, it could just read "List of cities by average temperature" or "List of cities by average precipitation" and that would be it, but the top 250 requires reading 184x more pages.
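
For the smaller version of the request, pulling those two list articles is close to a one-liner. A sketch, assuming each article renders as ordinary wikitables (roughly one per continent) that pandas can parse:

    import requests
    import pandas as pd
    from io import StringIO

    def wikitables(title: str) -> pd.DataFrame:
        # Grab every table on the list article and stack them; the per-continent
        # split and the column layout are assumptions about the page.
        url = "https://en.wikipedia.org/wiki/" + title
        html = requests.get(url, headers={"User-Agent": "climate-sketch/0.1"}, timeout=30).text
        return pd.concat(pd.read_html(StringIO(html)), ignore_index=True)

    temps = wikitables("List_of_cities_by_average_temperature")
    precip = wikitables("List_of_cities_by_average_precipitation")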

My perspective on this is that if Deep Research can't do something, you should do it yourself and put the results on the internet. It'll help other humans and AIs trying to do the same task.


Yeah, that was intentional, well, somewhat.

The project requires the full list of every known city in the Western Hemisphere, plus Japan, Korea, and Taiwan. But that dataset is just maddeningly large, if it's possible to assemble at all; I expect it to take me years, since I have to do a lot of translations. So I figured I'd be nice and just ask the various models for the top 250.

There's a lot more data we're trying to get too, and I'm hoping I can get approval to post it, as it's a work thing.


Sounds like you're having it conduct research and then solve the knapsack problem for you on the collected data. We should do the same for the traveling salesman problem.

How do you validate its results in that scenario? Just take its word for it?


Ahh, no. We'll be doing more research on the data once we have it. Things like rankings, averages, and distributions will come later, but first we just need the data to begin with.


If you have the data, but need to parse all of it, couldn’t you upload it to your LLM of choice (with a large enough context window) and have it finish your project?


I'm sorry I was unclear. No, I do not have the data yet and I need to get it.


Well, remember that listing/ranking tasks are structurally hard for these models, because they have to keep track of what's been listed and what hasn't, etc.


My wife, who is writing her PhD right now and teaches undergraduate students, says they are at the level of a really bright final-year undergrad.

Maybe in a year they’ll hit the graduate level. But we’re not near PhD level yet.



