
Archive.org has a special exemption under US copyright law, I believe, and other parties will not have the same exemption.

I imagine it would be very difficult for someone to start a website that hosts a lot of copyrighted material and claim they are genuinely archiving it. If that defense were feasible, I imagine it would be the defense of every piracy site ever.

I am not a copyright lawyer, and I welcome correction on this.

Developing an archival API for those who want their site archived is perfectly fine, though this is probably what robots.txt is for.
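For anyone curious, checking what a site's robots.txt actually permits is a few lines with Python's standard-library robotparser. A minimal sketch, assuming "ia_archiver" (the user-agent the Internet Archive's crawler has historically used) and a placeholder site URL:

    # Check whether a site's robots.txt allows a given crawler to fetch a page.
    # "ia_archiver" is the user-agent historically used by the Internet
    # Archive's crawler; the site and page below are placeholders.
    from urllib.robotparser import RobotFileParser

    site = "https://example.com"
    rp = RobotFileParser()
    rp.set_url(site + "/robots.txt")
    rp.read()  # fetches and parses the live robots.txt

    for agent in ("ia_archiver", "*"):
        allowed = rp.can_fetch(agent, site + "/some/page.html")
        print(agent, "allowed" if allowed else "disallowed")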



I don't think there is such a thing as a special exception to copyright law given to one website.

There is fair use.

I think what happened with archive.org is that it became popular, and it also became popular to think of it as fair use. It's a social phenomenon of acceptance that does not have any legal bearing.

Companies that don't want their stuff 'archived' can and do take action to enforce the laws about digital libraries. For example, the book that helped teach me programming was Turbo Pascal DiskTutor. You cannot simply download that one from archive.org; you have to get on a waiting list and 'borrow' it when it's available.

The fact that there is apparently exactly one digital copy available for 'borrowing' makes me feel the digital library laws are unjust. It should not be the rule that only one copy total can circulate when making it ten would be just as easy.

Anyway, there are lots of sites like YouTube that would not exist without encouraging users to violate copyright. This was the whole reason YouTube got big in the first place. It was only after they had a massive library of content and users that they started really playing ball with distribution companies.


> I don't think there is such a thing as a special exception to copyright law given to one website.

An example of an exception obtained by the internet archive:

https://archive.org/post/82097/internet-archive-helps-secure...


They hired people to amend the regulation and succeeded, but like I said, it's not specific to "the Internet Archive".


Archive.org has a special exemption granted by the Library of Congress to circumvent copy protection for the purpose of making archives, and I believe it does so only when the copyright holder cannot be identified to ask for permission after a reasonable attempt has been made. It does not have a blank check to archive the entire Internet without permission -- if it did, every New York Times article published since the paper went online would be readable there.

See https://help.archive.org/hc/en-us/articles/360004716091-Wayb...:

"""Do you collect all the sites on the Web?

No, the Archive collects web pages that are publicly available. We do not archive pages that require a password to access, pages that are only accessible when a person types into and sends a form, or pages on secure servers. Pages may not be archived due to robots exclusions and some sites are excluded by direct site owner request."""
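For what it's worth, you can check whether a particular page made it into the collection via the Wayback Machine's public availability endpoint (https://archive.org/wayback/available). A minimal sketch using only the standard library; the target URL is a placeholder:

    # Query the Wayback Machine's availability API to see whether a page
    # has an archived snapshot. The target URL is a placeholder.
    import json
    import urllib.parse
    import urllib.request

    target = "example.com"
    api = "https://archive.org/wayback/available?url=" + urllib.parse.quote(target)

    with urllib.request.urlopen(api) as resp:
        data = json.load(resp)

    snapshot = data.get("archived_snapshots", {}).get("closest")
    if snapshot and snapshot.get("available"):
        print("Archived:", snapshot["url"], "at", snapshot["timestamp"])
    else:
        print("No snapshot available for", target)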


How does the Wayback Machine get around this? Is it because this material is freely published, or simply because the Wayback Machine isn't monetized?


They don't really "get around" it; as mentioned in the parent comment, they worked to get legal exceptions for many things. As you point out, though, I imagine they wouldn't have gotten the exceptions had they not been a 501(c)(3) org.


But what about copyright laws of countries outside the US?


Ahh yes, makes sense.



