adaboese's comments

Hilarious!


Image inputs (prompts) are generated with the respective models, but the actual images are generated using DALL·E.


Forgot to mention: Claude appears to be a lot more rate-limited than OpenAI. I hit quite a few concurrency rate limits, but as long as you have auto-retry, it's a non-issue.


Postgres is extremely expensive to scale. Why on Earth would you try to put a queue there?


If you're running less than a million tasks a day through a queue and you already have PostgreSQL in your stack, why add anything else?


Even a million tasks a day is less than 12 per second. Most queues are going to see surges, since that's part of the point of a queue, but it's still a few orders of magnitude away from what would overwhelm a database.
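To make that concrete, here's a minimal sketch of the kind of worker people mean by "Postgres as a queue" (Python with psycopg; the `tasks` table and its columns are hypothetical). FOR UPDATE SKIP LOCKED is what lets many workers poll concurrently without grabbing the same row:

    # Minimal Postgres-as-queue worker sketch; table/column names are
    # hypothetical. SKIP LOCKED lets concurrent workers claim rows
    # without blocking on or double-processing each other's tasks.
    import psycopg

    def claim_one_task(conn: psycopg.Connection):
        with conn.transaction():
            with conn.cursor() as cur:
                cur.execute("""
                    SELECT id, payload FROM tasks
                    WHERE status = 'pending'
                    ORDER BY id
                    FOR UPDATE SKIP LOCKED
                    LIMIT 1
                """)
                row = cur.fetchone()
                if row is None:
                    return None  # queue is empty
                task_id, payload = row
                cur.execute(
                    "UPDATE tasks SET status = 'done' WHERE id = %s",
                    (task_id,),
                )
                return payload

At ~12 tasks a second, a loop like this on a single modest instance is nowhere near Postgres's limits.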


Just use a dedicated tool. It is not that hard. If you want a higher-level abstraction, you have a whole spectrum of next-gen queues, like Temporal, Trigger.dev, Inngest, Defer, etc.


Why use a dedicated tool if you have something in your stack that can solve the problem already?

The fewer separate pieces of infrastructure you run, the less likely something will break that you don't know how to fix easily.

The article touched on this in the list of things to avoid when it said "I estimate each line of terraform to be an order of magnitude more risk/maintenance/faff than each line of Python" and "The need for expertise in anything beyond Python + Postgres"


Personally, the next-gen-ness of an infrastructure component is inversely proportional to my trust in it.


Right, use boring technology!

https://boringtechnology.club/


Especially for something handling data. I want an old, battle-tested solution that won't disappear when the VC capital dries up.


Maintaining extra infrastructure is expensive. Working around missing ACID is expensive. Depending on how many messages we are talking about, the cost of scaling postgres a bit more might be much lower.


https://microservices.io/patterns/data/transactional-outbox....

It allows you to wrap it all in a transaction. If you separate a database update and insertion into a queue, whichever happens second may fail, while the first succeeds.
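A minimal sketch of the outbox idea (Python with psycopg; the `orders` and `outbox` tables are made up for illustration). The business update and the queued message commit or roll back together, and a separate relay process later ships outbox rows to the actual broker:

    # Transactional outbox sketch with a hypothetical schema. Both
    # writes share one transaction, so you never get an order without
    # its message or a message without its order.
    import json
    import psycopg

    def place_order(conn: psycopg.Connection, order_id: int) -> None:
        with conn.transaction():  # commits or rolls back atomically
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE orders SET status = 'placed' WHERE id = %s",
                    (order_id,),
                )
                cur.execute(
                    "INSERT INTO outbox (topic, payload) VALUES (%s, %s)",
                    ("order_placed", json.dumps({"order_id": order_id})),
                )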


Well, the alternative is a saga pattern to implement a distributed two-phase commit.

Or actual XA, but that is cursed.


The mantra about premature optimization applies to infrastructure too.


Say you're running a SQL transaction and queue a message in SQS. The problem is, this message isn't part of your SQL transaction, so if your transaction fails, the SQS message won't roll back with it, potentially leading to inconsistencies. That's why it's sometimes better to keep the queue inside the SQL database: your transactions and queue messages are always in sync, and as a bonus it simplifies the overall architecture and leaves fewer potential points of failure.


If you have infra that needs to scale that much, then PostgreSQL indeed isn't the right tool. The right tool for your use case probably doesn't even exist, and you will have to build one.

It is not a mystery why all the webscale companies end up designing their own DB technology.

That being said, most of the DBs in the wild are nowhere near that scale. I have seen my share of PostgreSQL/ElasticSearch combos handling less than a TB of data and collapsing just because of the over-engineering of administrating two databases in one app.


If you need scaling, that is. Not all applications need to scale (e.g. I'm building an internal tool for a company that has 1000 employees; it's unlikely that the number of employees will double from one day to the next!), and for most applications a single PostgreSQL server, either local or in the cloud, is enough.


If the choice is between using a dedicated queue and postgres for your data vs. using postgres for both, using postgres for both makes perfect sense.

At the scale where you would outgrow using postgres for the queue, you would also outgrow using postgres for the data.


At what point does one outgrow Postgres for the data?


Because you're not expecting to have to scale beyond 1 instance for the next few years & are already using postgres & now everything is trivially transactional. KISS


Any DB is fine as a queue depending on requirements, design and usage.


lol no it’s not.


For anyone who is looking for a UI library that's compatible with Panda, I highly recommend https://park-ui.com/. It is by far the most polished and most actively developed. I am building a second project using it.


I am a big advocate of remote work, but I actually think that for _small teams_ offices make sense. I would thoroughly enjoy working in a small office of 4-8 people. It is when you get into the high double digits and beyond that it becomes an unbearable drain.


depends on how close those people live to the office, i think. my productivity definitely went up during the pandemic when i didn't have to face a daily commute.


The commute sucks regardless of team size, no denying that. I am only referring to whether working in the office can be rewarding or not.


The annoying thing about this is that it will ruin this feature for everyone else. I, and many others, use this for requesting indexing of time-sensitive content.


Yes and no. I mean, just because something gets indexed doesn't mean Google values it and is willing to expose its customers to it.

The consistent problem with SEO is that most SEOs don't understand Google's business model. They don't understand Google is going to best serve its customers (i.e., those doing the search). SEOs (and their clients) need to understand that getting Google to index a turd isn't going to change the fact that the content, and the experience it's wrapped in, is still a turd. Google is not interested in pointing its customers to turds.


For a company not interested in pointing their customers to turds, they sure do point them to a lot of turds.


That's not what it wants to do. Yes, that is what's happening, for a number of reasons. Without people searching, there are no eyeballs. Put another way, the sites being indexed and ranked are *not* the customer(s).


> Google is not interested in pointing its customers to turds.

We must have been using a different Google over the past 3 years. It does this almost exclusively now.


Google is not interested in pointing its customers to a turd that hasn't paid for that right.


I'm not sure I agree that the people doing the search are the customers here


I’m really curious to know who exactly you think are Google Search’s customers in the context of this thread about SEO.


Google Search's customers are companies who advertise on the SERPs.

Google Search's users are people who do the searching.

If a service is free to use, you are the product.


People paying for ads to show up in Google users' search results are Google's customers.

People using Google's free services to see those search results (which have gone to shit over the past 10 years or so, and which are a full page of ads without an ad blocker) are Google's product, acquired by offering free products and services that keep users hooked even through the enshittification that has proceeded since the web 2.0 golden era.


The advertisers


Advertisers?


But think logically for just a second: why would advertisers advertise their turd if they could just have their turd show up in the search results for free?

For a turd sandwich to work, you have to wrap your turd (ads) with high quality results so people actually search on Google and then you can show them the turd along with the good stuff.


Plenty of turds do show up in Google search results: AI-generated, copy-pasted mumbo jumbo, full of more Google ads in-page.

Somehow, Google is happy to serve you a turd it can double-dip on.


SEO died many years ago, but some companies are still trying to sell their naive clients some magical "SEO optimisation". Which is plainly a scam at this point.


There are a ton of SEO optimizations that are extremely significant:

* performance/SSR
* interlinking/dead links
* keyword cannibalization

to name a few


Definitely. In general, most parts of technical SEO remain important (one h1 tag, etc)


Is there a trustworthy guide on this?


Eyeballs are not Google's customers, paying advertisers are Google's customers.

If a paying customer gives Google money to point eyeballs to turds, it points eyeballs to turds (this is how Google makes money today, it is the business model for search). The problem with SEO isn't that it degrades search, it's that SEO users aren't paying customers and don't make Google any money (and compromises Google's ability to direct eyeballs to paying customers).

This is classic "enshittification": offer a service for free to capture eyeball share, then offer a paid service to companies that capitalizes on that eyeball share but compromises the "eyeball experience" (and then, in the endgame, squeeze the companies that have become dependent upon the eyeball platform, to serve shareholders).


I can talk a lot about this, since this is a space I've spent a lot of time experimenting in. All I will say is that all these detectors (a) create a ton of false positives, and (b) are incredibly easy to bypass if you know what you are doing.

As an example, one method that I found works extremely well is to simply rewrite the article section by section, with instructions to mimic the writing style of an arbitrary block of human-written text.

This works a lot better than (as an example) asking it to write in a specific style. Saying something along the lines of "write in a casual style that conveys lightheartedness towards the topic" is not going to work as well as simply saying "rewrite mimicking the style in which the following text block is written: X" (where X is an example of a block of human-written text).
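For illustration, here is roughly what that looks like in code (a sketch using the OpenAI Python client; the model name and the exact prompt wording are placeholders, not a claim about my actual pipeline):

    # Hypothetical sketch of the section-by-section style-mimicry
    # rewrite described above. Model and prompt are illustrative only.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def rewrite_in_style(section: str, human_sample: str) -> str:
        """Rewrite one article section, mimicking a human-written sample."""
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": (
                    "Rewrite the following text, mimicking the style in "
                    "which the example text block is written.\n\n"
                    f"Example:\n{human_sample}\n\n"
                    f"Text to rewrite:\n{section}"
                ),
            }],
        )
        return response.choices[0].message.content

    # An article is then processed one section at a time:
    # rewritten = [rewrite_in_style(s, style_sample) for s in sections]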

There are some silly things that will (a) cause human-written text to be detected as AI, and (b) allow AI text to avoid detection. E.g., using a broad vocabulary tends to make detectors flag text as written by AI. So if you are using Grammarly to "improve your writing", then don't be surprised if it gets flagged. The inverse is true too: if you use statistical analysis to replace less common expressions with more common expressions, AI text is less likely to be detected as AI.

If someone is interested, I can talk a lot more about hundreds of experiments I've done by now.


> I can talk a lot about this, since this is the space I've spent a lot in experimenting.

So I'm a researcher in vision generation and haven't read too much about LLM detection but am aware of the error rates you mention. I have questions...

What I'm absolutely surprised by is the use of perplexity for detection. Why would you target perplexity? LMs are minimizing NLL/entropy, and instruct-tuned models are tuned even further in that direction, such that you're minimizing the cross-entropy against human output (or at least human-desired output). Which makes it obvious that it would flag generic or common patterns as AI-generated. But I'm just absolutely baffled that this is the main metric being used, and in the case of this paper, the only metric. It also gives a very easy way to fool these detectors, since it suggests that just throwing in a random word or some spelling mistakes would throw off detection, given that such edits clearly increase perplexity. To me this sounds like using a GAN's discriminator to identify the outputs of GANs (the whole training method is about trying to fool the detector!). (Obviously I'm also not buying the zero-shot claim.)
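For reference, the metric in question boils down to something like this (a sketch using GPT-2 via Hugging Face transformers; detectors such as Binoculars refine it by comparing scores from two models, which this toy version does not do):

    # Toy perplexity scorer under a causal LM. Detectors treat low
    # perplexity (text the model finds unsurprising) as evidence of
    # machine generation, which is exactly the complaint above.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            # With labels set, the model returns mean token NLL as .loss
            out = model(enc.input_ids, labels=enc.input_ids)
        return torch.exp(out.loss).item()

A random rare word or a typo bumps the per-token NLL up, which is why such edits throw these detectors off.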


Yeah, agreed. In my experience, what it's ended up detecting is very crappy human-written text.


I will also add that, at least for now, if you are doing it for SEO, it _really_ doesn't matter. I was planning to make a case study benchmarking my algo against a bunch of other content generators. I was hoping for a statistically significant difference, but there was none. So the thing that matters in the long run is whether the end users find your content valuable, because that's ultimately how Google will decide whether to send more traffic to your content, rather than by trying to detect if it was "AI generated".


I think the value of this is the extremely low false-positive rate, so it can act as a first-pass sieve when there is a large number of inputs to test. What other Binoculars-style detectors have you experimented with where you're seeing a "ton of false positives"?


I use https://originality.ai/ as the benchmark. I've tested all commercially available services, and Originality (at the time; it's been a few months) provided the lowest false-positive rate. As a testing sample, I've built a database of articles written by various text generators and compare them against articles that I scraped from the web from before 2017 (basically any text from before LLMs saw daylight).

I am sure that these algorithms have evolved, but given my past experiments, I sincerely doubt that we are at a point where they (a) cannot be easily bypassed if you are targeting them, and (b) do not create a lot of false positives.

As stated in another comment, I personally "gave up" on trying to bypass AI detection [it often negatively impacts output quality], at least for my use case, and focus on creating the highest-value content possible.

I know that services like Surfer SEO are continuing to actively invest in bypassing all the detectors. But... as a human, I do not enjoy their content, and that is what matters the most.


Just for fun, I tested a few recently generated articles with https://huggingface.co/spaces/tomg-group-umd/Binoculars (someone linked it in this thread) and it ranked them as "Human-Generated" (which I assume means human-written). And... I am not even trying to evade AI detection in my generated content; I was wholeheartedly expecting to fail. Meanwhile, Originality detects the same AI-generated content with 85% confidence, which is... fair enough.


If I'm reading this correctly, it's not making any particular claim with respect to text labeled human generated. What it's saying is that if it claims the text is machine generated, it's highly likely that it actually is.


The article you're commenting on actually states in its abstract:

> Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data.


Since you say you're knowledgeable on this, here's a question: If you have access to the model, wouldn't it be possible to inspect the sequence of token probabilities for a piece of text and derive from this a probability that the text was produced by that model at a given temperature? It would seem intuitive that the exact token probabilities are model specific and can be used to identify a model from its output given enough data.

I suppose an issue with this might be that an unknown prompt would add a lot of "hidden" information, but you could probably start from a guess or multiple guesses at the prompt.


That's pretty much how most of these methods work. It just doesn't work very well, because good models have a reasonable probability of generating lots of different texts, so you don't get very different numbers on AI- and human-generated texts. After all, the models are trained to learn the probability distribution of exactly that human text.
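A sketch of the idea anyway (GPT-2 via transformers, assuming white-box access to the candidate model; temperature scaling is applied to the logits before the softmax, as it would be at sampling time):

    # Score how likely a candidate model (at temperature T) is to have
    # produced this exact token sequence. The catch, as noted above:
    # human text often scores comparably, since that is what the model
    # was trained to imitate.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def sequence_logprob(text: str, temperature: float = 1.0) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # logits at position t predict token t+1
            logits = model(ids).logits[0, :-1] / temperature
        logprobs = torch.log_softmax(logits, dim=-1)
        targets = ids[0, 1:].unsqueeze(1)
        return logprobs.gather(1, targets).sum().item()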


It can be useful for small-scale verification in academia: TAs and schoolteachers can use it to check that the assignments and homework submitted were actually worked on. Yes, a student can spend more time and brains on making it look authentic despite using LLMs, but you've already gone past a typical tardy student's usage pattern at that point; if she is too lazy to do her homework, she can safely be assumed to be too lazy to spend time refining her prompts and weights as well.


I would not want to trust grades, in some cases even decisions about pass or fail, to a system which is prone to false positives.


Agreed, we shouldn't trust the system, but using it as a bloom filter to flag those that should be reviewed manually seems warranted.

If all we're getting is false positives then it can be used to reduce the workload.

If we also get false negatives then we'd be better off using existing techniques (manual or otherwise).


How do you do this manual review? How can a human spot LLM-generated text? The internet is full of horror stories of good students getting failing grades due to false positive LLM detectors where the manual review was cursory at best.


Or, you know, assess people fairly, face to face.


Which we know to be unfair due to learned biases...


I am curious, actually! In general about your experiments, but also about integrating this detection algorithm into wider systems. Did you run any AutoGPT-like experiments using the AI-text detection as a critique? My use case is a bit different (decision-making), so I play with relative plausibility instead of writing style. But I haven't found convincing ways of "converging" quite yet, i.e. benchmarks that don't rely solely on LLMs themselves to judge their output.


To clarify, the style experiment I've referenced earlier was just that – an experiment. I did not implement those methods into my software. Instead, I focused on how to eliminate things like 'talking with authority without evidence', 'contradictions', 'talking in extremely abstract concepts', 'conclusions without insights', etc.

If you need a dataset to benchmark against, download articles from before 2017. There are a few ready-made datasets floating around the Internet.


Grammarly is used a lot by non-native English speakers translating their papers into English. I wonder how difficult publishing papers would become if AI checks become commonplace in the future.


Please go into more detail on those experiments!


It is crazier that this was allowed to continue for so long.


Would you be open to put it to a test?

Read/glance through this article:

https://aimd.app/blog/2024-01-21-entity-seo-explained-boosti...

Was it written by human, AI, ... edited by human?

What makes you sway one way or another?


We're getting complaints from users about you promoting your site on HN too much. (Just to be clear, I'm talking about complaints coming from other users than ones who have been complaining in the threads, so it's a wider phenomenon.)

After looking at what you've been posting, I think it's a fair complaint. You're on the wrong side of this guideline: "Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity." - https://news.ycombinator.com/newsguidelines.html.

You're welcome here as a user, but I need to ask you to pull back on promoting your site here. If you build up a track record of making interesting/unrelated contributions on other things—that is, if you use HN as intended for a while—then it will be ok to occasionally include in your own stuff as part of the mix. But it shouldn't be the majority of what you're doing here. I should add that the same goes for everybody here; it's nothing personal.


Understood. Thank you for taking the time to inform me


It was written by someone who just can't seem to stop spamming HN with their MFHN content. Why don't you take a hint?


[flagged]


It's a free platform to use, not a free platform to spam, especially not if the only reason you posted this Ask HN is to get people to go to your domain; that's simply ban evasion. It's a bit tiresome.

Yes, I don't like what you are building, but I don't care about that; I do care about you polluting HN.


HN posts have generated 0 paying clients for AIMD. If you think that I am spending my time posting on HN for the sole purpose of advertising AIMD, then you don't have a knack for advertising. HN is just not my target audience.

Meanwhile, the comments have been a valuable source of feedback that highlighted several flaws in the product.


Neither HN nor Reddit (where it seems you have been banned for the same behavior) is in your pay; target audience or not doesn't matter.


My indie journey has so far been met with mostly positive people, eager to help. Meanwhile, you are really starting to sound like you are on a personal vendetta. You do you.


A personal vendetta? That's funny. No, it is much more general. HN is precious and fragile, and moderator time is scarce, so anybody who abuses the site is fair game for being talked to, to see if they realize that HN is a two-way street and that just using the community for your own gain is a net negative for everybody else. Especially if you do so in a disproportionate manner, because the way you go about using HN does not scale. Spam in general, spammers in particular, and SEO types have ruined a large chunk of the net and its potential; weaponizing AI the way you intend to (and morality be damned) may well ruin the remainder. So it isn't personal, but you are setting yourself up diametrically opposed to what I think is responsible use of this precious resource.


I think the content is largely AI-generated with some human editing. It starts off feeling pretty human but then descends into robotitude, from my POV. It could be entirely AI-generated, in which case the injection of references is pretty impressive.

The ending lets the whole thing down tho:

"In summary, embracing both SEO and content optimization within the entity SEO framework is strategically beneficial. It equips marketers for the complex and evolving demands of digital marketing."

That para slightly hurts my brain more the longer I look at it.

A major "tell" for me is when you find language which would be "ok" in a student's essay response creeping into what are meant to be tutorials or explanations, where that kind of language really sticks out as waffle / useless words.


Thanks. Interesting.

For what it is worth, it is entirely AI generated.

I like that you've highlighted the summary as a weak point. It is what I am focusing on at the moment.

