How do you know what the internet will look like in 500 years? Sites from 20 years ago are already broken; unless you’re using plain text and HTML, you can expect that standards will change over 500 years and people will not be able to access your site.
Then there’s the issue of domains. You’d have to set up a trust and, again, assume we will still be using domains in 500 years. If you use something like S3, you’ll have to ensure it’s around for 500 years.
Shakespeare and the KJV are both closer to 400 years old. Shakespeare wanted to be accessible, though modern readers still need annotations to understand some of the words and most of the references. The KJV was intentionally somewhat archaic. There's a big difference in how accessible the KJV is versus Shakespeare.
But otherwise yeah, the Norman Conquest did a number on the language
Shakespeare wrote in what is referred to as "Early Modern English". It is no secret that his writings were a big influence on the evolution of the language itself. If you look at some of his contemporaries from the same period, however, their language is very different and much harder to understand.
My specialty in undergrad was 15th and 16th century poetry. His contemporaries' writing (and writings from the early 15th c.) was no harder to read than Shakespeare.¹ The biggest challenge would be the irregularities of spelling: before the printing press, and for at least a century afterwards, English spelling was inconsistent and flexible (and for printed materials it was frequently left up to the compositor, which led to different spellings of a word within the same document to make line breaks work better).
⸻⸻⸻
1. A notable exception would be Edmund Spenser who wrote in a style that was archaic even to his contemporaries.
It’s still intelligible, though. One thousand years is about the accepted timeframe for a language to be no longer mutually intelligible for speakers at either end of that period.
Of course, it’s entirely possible the rate of change within a language is not static over millennia.
...but the world wasn't a global village back then. A thousand years on, the world may look more similar than diverse, and less divergent / more convergent than it did over a similar time frame in the past?
It very well might. The extraordinary ways we’re able to preserve knowledge might slow the rate of change, or perhaps increasing interconnectedness between different cultures will accelerate the rate of change.
It took much less than 1,000 years for various pidgins and creoles to develop, and there are several such languages that native speakers of their parent languages would have difficulty understanding.
A large portion of English, in general, consists of changes in spelling from prior words, usually traceable to Proto-Indo-European, a reconstructed ancestor language for which etymologists have no surviving written sources. It's generally extrapolated backward, and decent extrapolations are used to bootstrap understanding of words that don't quite "fit" with English and may, in fact, come from other areas of the planet.
There are also scads of words that had a contemporary meaning that changed "overnight", morphing into entirely new meanings, which then brokered entirely new words with different definitions. My current favorite word to use as an example of this is "filibuster": the act of obstructing legislation by talking. The word, like so many in English, came from bastardizing the Dutch word for "freebooter", or pirate, through a circuitous route of the French adding an S and American English removing an S. If you dig a bit more, you find that the "booty" part of freebooter (which means 'loves plunder' in the original Dutch) came from a French word first recorded in the 1300s, "butin", which probably came from some Middle Low German word meaning "haul from plundering". There's also an implication that for a while in the 1500s–1800s "freebooter" was also the name of a private entity that engaged in exchanging goods, a "free trader", with the negative connotations falling in and out of style.
So, if you can parse Shakespeare or Chaucer at all, it's because of the mechanism by which English, and other languages derived from the same roots, "evolve". Saxon and Old High German, as well as Icelandic, all play a huge role in the way we speak and write today, to name a few.
Not sure if you intentionally misunderstood, but the point was that you probably would be able to make some sense of 1,000-year-old English, but that's about the limit.
It depends on the culture, of course. There are old cultures with shared references, like a Bible, that might cover longer timeframes of understanding.
Ah, my apologies. I could have been clearer. It struck me as curious for that comment to refer to _mutual intelligibility_ for people on either side of a 1,000-year period.
It's a lot easier to ask whether contemporary humans can understand 1000 year old language, than to ask whether humans 1000 years ago can understand contemporary language.
"It's a lot easier to ask whether contemporary humans can understand 1000 year old language, than to ask whether humans 1000 years ago can understand contemporary language."
This is clear. And since we can only look backwards, we can only assume it works the other way around.
This is interesting, because your go-to was an existing but non-English language, rather than a completely different language that is an amalgamation of other existing languages, put through several generations of memes, in-jokes, meaning reversals, etc., to the point of being unintelligible to speakers of the originals.
Sites which were carefully designed are not broken, and I think that is a good starting point when designing with longevity in mind.
Five hundred years is a long time, but I think it's reasonable to try to design a site that could last, for example, 25 years, because you can already write something which COULD HAVE worked for the PREVIOUS 25 years by testing with older browsers.
You'll want to restrict yourself to a subset of HTML which is supported by all of them, perhaps with some progressive enhancement.
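As an illustration of that restricted subset (my sketch, not something proposed in the thread), a longevity-minded page might use only tags and a charset declaration that have rendered the same way since the 1990s, with no scripts or external dependencies:

```html
<!-- A deliberately conservative page: only elements that have been
     stable since early HTML, declared with the older http-equiv
     charset form for maximum backwards compatibility. -->
<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>A page built to last</title>
</head>
<body>
  <h1>A page built to last</h1>
  <p>Plain paragraphs, headings, and links degrade gracefully
     in very old and very new browsers alike.</p>
  <p><a href="archive.html">Archive</a></p>
</body>
</html>
```

Anything beyond this (CSS, scripts) would then be layered on as progressive enhancement, so the content survives even where the enhancement doesn't.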
Indeed. As I commented separately, I think the only practical solution is to use a simple medium (USB stick) and simple, standard file formats, then to copy from medium to medium and convert obsolete formats as time passes.
This requires descendants to keep at it over time, and it's not really a "web site", but IMHO it's the only way to keep the data accessible and usable over time.
Broken but not illegible. I feel like you're arguing that because the world may forget how to read some data 500 years from now, storing the data is worthless. But that's just not true. Archeologists find meaning in writings they've never seen before all the time.
Even plain text is no guarantee. This assumes that future people will read/understand/use Latin characters. ASCII could very well be replaced in the medium-term future.
Sure, but its de facto successor, UTF-8, is backwards compatible with ASCII: a strict superset of it.
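To make the superset claim concrete, here's a minimal Python check (my example, not from the thread): every ASCII byte decodes to the same character under both encodings, and encoding pure-ASCII text as UTF-8 changes no bytes at all.

```python
# All 128 ASCII code points decode identically as ASCII and as UTF-8.
ascii_bytes = bytes(range(128))
assert ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# Encoding plain ASCII text as UTF-8 produces the exact same bytes.
text = "plain old ASCII"
assert text.encode("utf-8") == text.encode("ascii")
print("UTF-8 is a strict superset of ASCII")
```

So any valid ASCII file is already a valid UTF-8 file, byte for byte.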
What I meant to say is that it's totally possible that, in the coming few centuries, even basic ASCII won't be readily understood. As in: the character mapping in modern systems will break down, i.e., int 97 is no longer 'a' but some glyph from a language not yet conceived.
We take backwards compatibility for granted. Just because ASCII has been readable for the past 60 or so years doesn't mean it will continue to be for the next 60.
Rather, we design systems to be backwards compatible because it makes many things much simpler.
This is the reason why UTF-8 has basically "won" over UTF-16 or UCS-4 when it comes to encoding Unicode characters.
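A rough illustration of that "win" (a toy Python comparison, not a full account): for ASCII-heavy text, UTF-8 costs one byte per character, exactly like ASCII, while UTF-16 costs two and UCS-4/UTF-32 costs four — and only UTF-8 leaves existing ASCII files valid as-is.

```python
# Byte cost of the same ASCII string under three Unicode encodings.
s = "hello"
print(len(s.encode("utf-8")))      # 5 bytes: one per character, same as ASCII
print(len(s.encode("utf-16-le")))  # 10 bytes: two per character (no BOM)
print(len(s.encode("utf-32-le")))  # 20 bytes: four per character (no BOM)
```

The size advantage only holds for ASCII-dominant text, but combined with the compatibility story it was decisive.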
If anything, with the amount of data we have today, backwards compatibility will be maintained on the computers of the future (if they still exist), unless there is a big reason to re-encode all historical data. Such a reason would probably be political, and they exist even today — e.g. China not wanting to use a Unicode transformation based on the American Standard Code for Information Interchange. Yes, even if we move from bytes to 13-qubit "qubytes" :D
To elaborate on the cost: re-encoding all data from 2050 is probably not going to be too expensive in 2400, but by then you'd need to re-encode all data from up to 2400. To me this seems like a reason backwards compatibility will be kept: there is not much to be gained by breaking it. The UTF-8 approach has shown us the best way forward.
The trickiest part is going to be keeping all the video/audio encoding algorithms around, especially as they are patent-encumbered.
"Then there’s issues with domains. You’d have to setup a trust and again assume we will still be using domains in 500 years. If you use something like S3 then you’ll have to ensure they’re around for 500 years."
From my perspective, this is entirely unrealistic.