Hacker News | aldousd666's comments

They aren't 10x gains. They're more like 3.5x gains. But still worth it. By a lot.

You are never going to get away from reading the code every time; at least I haven't seen how you could. That said, it is considerably less work to read and check the code than to build it all yourself, even if you know what you're doing and have done it before.

I find this to be false.

Same way teaching your child to do something is much harder than just doing it yourself.

Except the child learns.


Parsing is the front end to a compiler. Can't get semantics without first recognizing syntax. I have a hard time thinking about programming languages without seeing them as a parsing exercise first, every time.


The usual advice is to start with semantics. Syntax will change; there is not much point nailing it down too early.

Most of the work is actually the backend, and people sort of delude themselves into "creating a language" just because they have an AST.


Syntax and semantics are never orthogonal, and you always need syntax, so it must be considered from the start. Any reasonable syntax quickly becomes much more pleasant for generating an AST or IR than manually building those objects in the compiler's host language, which is what the semantics-first crowd seems to propose.

It is also only true for some compilers that most of the work is the backend, though of course that depends on how "backend" is defined. Is the backend just codegen, or is it all of the analysis between parsing and codegen? If you target a high-level language, which is very appropriate for one's first few compilers, the backend can be quite simple. At the simplest, no AST is even necessary; the compiler can just mechanically translate one syntax into another in a single pass.
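As a hypothetical sketch of that simplest case (the toy prefix language and function names are mine, not from the thread), a single-pass compiler can emit target syntax while it parses, with no AST in between:

```python
# Hypothetical sketch: translate prefix arithmetic (e.g. "+ 1 * 2 3")
# into parenthesized infix, emitting output during the parse itself.
# No tree is ever built; the call stack is the only structure used.
OPS = {"+", "-", "*", "/"}

def translate(tokens):
    """Consume one prefix expression from the iterator, return infix text."""
    tok = next(tokens)
    if tok in OPS:
        left = translate(tokens)
        right = translate(tokens)
        return f"({left} {tok} {right})"
    return tok  # a number: copied through verbatim

def compile_expr(source):
    return translate(iter(source.split()))

print(compile_expr("+ 1 * 2 3"))  # -> (1 + (2 * 3))
```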


I think his point is that "form follows function". If you know what kind of semantics you're going to have, you can use that to construct a syntax that lends itself to using it properly.


> The recommended advice is to start with semantics first. Syntax will change, there is not much point fixing it down too early.

It's actually the reverse, in my opinion. Semantics can change much more easily than syntax. You can see this in how small changes in syntax can cause massive changes in a recursive-descent parser, while the semantics can change from pass-by-reference to pass-by-value and barely make it budge.

There is a reason practically every modern language has adopted syntax sigils like (choosing Zig):

    pub fn is_list(arg: arg_t, len: ui_t) bool {
This allows the various parts and types to be identified without referencing or compiling the universe. That's super important, and something that must be baked into the syntax at the start, or there is nothing you can do about it.
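A toy illustration of the point (the classifier and declaration strings are hypothetical, not Zig's actual grammar): because each declaration announces its kind with a leading keyword, a parser can classify it from the first token or two alone, with no symbol table in sight. Contrast C, where `foo * bar;` is a multiplication or a pointer declaration depending on what `foo` names.

```python
# Hypothetical sketch: leading keyword "sigils" let a parser decide what
# a declaration is from its first tokens, without compiling the universe.
def classify(decl):
    words = decl.split()
    head = words[1] if words[0] == "pub" else words[0]  # skip visibility
    if head == "fn":
        return "function"
    if head in ("const", "var"):
        return "variable"
    return "unknown"

print(classify("pub fn is_list(arg: arg_t, len: ui_t) bool"))  # -> function
print(classify("const len: ui_t = 0"))                         # -> variable
```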


Getting an overview of parsing theory is mainly useful for avoiding ambiguous or otherwise hard-to-parse grammars. Usually one can't go too wrong with a hand-written recursive-descent parser, and most general-purpose languages are so complicated that parser generators can't really handle them. Anyway, the really interesting parts of compiling happen in the backend.

Another alternative is basing the language on S-expressions, for which a parser is extremely simple to write.
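For a sense of scale, here is a minimal S-expression reader (my sketch, with deliberately naive atom handling: integers and bare symbols only):

```python
# Minimal S-expression parser: tokenize by padding parentheses with
# spaces, then build nested lists with one recursive function.
def tokenize(src):
    return src.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens[0] != ")":
            node.append(read(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return int(tok) if tok.lstrip("-").isdigit() else tok

def parse(src):
    return read(tokenize(src))

print(parse("(define (square x) (* x x))"))
# -> ['define', ['square', 'x'], ['*', 'x', 'x']]
```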


I learned from the Dragon Book, decades ago. I already knew a lot of programming at that point, but I think most people writing compilers do. I'm curious if there really is an audience of people whose first introduction to programming is writing a compiler... I would think not, actually.


I wouldn't say a lot of people use it as an introduction to programming, but I've personally seen it appear quite early in a programmer's journey.

I was first exposed to compilers as a learning subject as a mandatory 2nd year/1st semester course; with the Dragon Book as the main textbook...


And they say you can't learn anything about computers from these bots... I had to learn this lesson by giving a shell account to one of my compatriots. I worked with him on group projects, so I trusted him. He installed a fork bomb in his user's crontab that went off at 3:00 a.m. every day, and I had to wonder why my hand-compiled, DRI-driven screensaver slowed to a crawl. I did learn the lesson, and I did forgive him. But it didn't cost me a couple grand in API fees.


This is ultimately just going to give them training material for how to avoid this crap. They'll have to up their game to get good code. The arms race just took another step, and if you're spending money creating or hosting this kind of content, it's not going to make up for the money you're losing from your other content getting scraped. The bottom has always been threatening to fall out of the ads-for-eyeballs market, and nobody could anticipate the trigger for the downfall. Looks like we found it.


> This is ultimately just going to give them training material for how to avoid this crap.

> The arms race just took another step, and if you're spending money creating or hosting this kind of content, it's not going to make up for the money you're losing by your other content getting scraped.

So we should all just do nothing and accept the inevitable?


> So we should all just do nothing and accept the inevitable?

I daresay rate-limiting will result in better outcomes than well-poisoning with hidden links that are against the policies of search engines.

Lots of potential for collateral damage, including your own websites' reputations and search visibility, with the well-poisoning approach.


The README.md specifically states how to allow well-behaved robots to proceed unhindered. The people behind these efforts, I would imagine, don't particularly care about their sites' reputations in the cases where people use LLMs for search.


To be honest, who cares about Google search anymore? It's pretty useless these days.


The small non-profit I volunteer with finds Google ads to be surprisingly effective, and much more cost-effective than FB for what they do, so there's at least some Google search usage in the demographic that they serve.


To be clear, I mean AI is going to be the downfall of ad-supported content. But let's face it: we have link farms and spam factories as a result of the ad-supported content market. I think this is eventually going to do justice for users, because it puts a premium on content quality. Someone will want to pay a direct licensing fee to scrape quality content for their AI bots, as opposed to tricking somebody into clicking a link and serving an impression for something they won't buy.


I don't think you realise just how cheap and easy it is to run these things. Even at the worst rate of being scraped by AI companies, on the order of dozens of requests per second, it didn't even use 1% of a CPU to serve them content, nor did it use appreciable memory or significant bandwidth (it generates lightweight pages).

The only time investment on my side was the initial set-up, and that barely took half an hour.


Tech is just a series of arms races


So, if at the end of the day, instead of clicking EVERY single link in the repository, they just check it out and parse it locally... I would consider that a win.


It's super expensive for them to run this hardware, and they need the compute for other things. Everyone who's cursed OpenAI for going down in the middle of the day while they're using it to write code or do some other thing will breathe a little easier now that there's some compute available. Wise decision, in my opinion.


Some advice I follow, and give to others: refuse notifications by default. Only enable them when you're getting paid to see them. (Slack and work email, for example, count as getting paid to see them.)


My version of that is to ask yourself what this app could notify you about and decide based on that.

What could a game notify you about? Nothing, probably spam. Deny.

What could a social app notify you about? Interactions with your content and profile. These are useful, allow.

What could an instant messaging app notify you about? Messages, obviously. Allow.

What could a fast food establishment app notify you about? Probably your order status, if you order from the app. But it might also spam you. Allow, but be prepared to turn off categories that turn out to be spammy.


Transaction costs make it inefficient. It costs more to effect the transfer than they would be able to charge for the articles.


The web is the enemy. I have an ad-blocking VPN, and I use GroundNews to filter out sites that have paywalls. Between those two things, I lead a relatively sane life. But I tried looking at some of the same places on an unfiltered device and, man, I can't even imagine living like that. It now costs me $100/year in ad blocking/circumvention just so I don't want to kill the browser.


Is the adblocking VPN doing anything some DNS block lists on your router or device wouldn’t do?

You can’t do MITM on HTTPS anyway, so I can’t imagine they would do anything more than a $20 Pi Zero and PiHole, except for the fact that somebody else is managing it.


I could do that, but then I'd have to maintain it. I used to hosts-file hack my devices, but the VPN is just easier.
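For anyone curious, the hosts-file hack amounts to entries like these (the domains are illustrative placeholders, not a real blocklist), which make ad-server lookups resolve to an unroutable address:

```
# /etc/hosts-style blocking: point ad/tracker domains at the null address
0.0.0.0 ads.example.com
0.0.0.0 tracker.example.net
```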


Fair enough. Sometimes a VPN is also easier than forcing each device to use a specific DNS server too. I get it.

