Markdown is a convenient but deeply limited markup language with only a small subset of html's features. And yes, limitations are good because we want documents not web apps, etc, etc, but I mean "images can't have captions" limited, "navigation bars don't exist" limited. Actual important features of html don't exist in markdown, which is why almost every markdown platform ends up adding extensions and shortcodes. Why use markdown at all? Just use html.
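For what it's worth, both of those missing features are a tag or two away in plain HTML; a minimal sketch (file names and link targets are just illustrative):

<figure>
  <img src="chart.png" alt="Quarterly revenue">
  <figcaption>Figure 1: a captioned image, which core Markdown has no syntax for.</figcaption>
</figure>
<nav>
  <a href="/">Home</a>
  <a href="/about">About</a>
  <a href="/archive">Archive</a>
</nav>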
"But html isn't style-agnostic" yes it is. CSS isn't style-agnostic. Instead of a markdown browser, how about a browser with a fixed stylesheet and no js? You don't even need a browser for that, that could just be a userscript that gets plugged into an existing browser. It'd break non-compliant websites that require javascript or custom css, but so would a markdown browser. Most people wouldn't write content for it, but most people wouldn't write content for a markdown browser either.
"But html is cluttered" it doesn't have to be. This is a valid webpage:
<!doctype html>
<title>Page Title</title>
<h1>Page Title</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Ut ac lorem ut massa euismod vestibulum.
<p>Nullam rutrum blandit eleifend. Aenean a varius diam.
Morbi sodales velit nunc, vel vestibulum lorem tempus sodales.
Personally, I prefer writing in markdown, but that's no reason to insert a markdown renderer into browsers. HTML can already be as sleek and readable as you want. If we added a new type of markup for anybody with a personal preference, we'd never stop.
The fatal flaw of HTML (and XML for that matter) is that the tags have the same visual weight as the text they're delimiting, which makes for a sense of clutter even in your minimal example.
Markdown really scores here, by having a pleasing plain text representation as a goal from the outset, and I'd love to see it used more widely for web pages.
I'd also love to see it more widely used for offline reading - the help files in an application really shouldn't need to invoke a web browser to view them when a lightweight markdown viewer would do the job. Not that there is a lightweight markdown viewer, mind you!
HTML is based on SGML, and SGML has short references to handle lightweight custom syntaxes. For example, you can define that an asterisk appearing in your content within a <p> element is replaced by <em>, and moreover define that an asterisk appearing within <em> content is replaced by </em>, toggling emphasized text tags. So SGML very much acknowledges the need for lightweight markup, but the SHORTREF feature, like everything else requiring markup declarations, didn't make it into the XML subset of SGML.
HTML itself doesn't have these and other features (such as basic text macros) because SGML was understood to be available at least at authoring time.
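A sketch of what the asterisk-toggles-emphasis mapping above could look like in a DTD (simplified, and assuming the SGML declaration admits "*" as a short reference delimiter):

<!ENTITY startem "<em>">
<!ENTITY endem   "</em>">
<!SHORTREF pmap  "*" startem> <!-- inside <p>, "*" opens <em> -->
<!SHORTREF emmap "*" endem>   <!-- inside <em>, "*" closes it -->
<!USEMAP pmap p>
<!USEMAP emmap em>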
Why didn't we end up with SGML -> XML "compilers"?
As I understand it, XML is intended to be equivalent to SGML, with less syntactic flexibility to make it easier to parse. So once you've done the hard work of parsing SGML, it seems like it should be straightforward to emit the same data as XML for further machine processing.
Or are there some SGML features that cannot be represented with an equivalent in XML?
> Why didn't we end up with SGML -> XML "compilers"?
We did; both the osx command-line tool of the venerable SP/OpenSP package and sgmlproc (sgmljs) with output_format=xml do exactly that: output canonical XML markup with shortrefs resolved, omitted tags inferred, attribute values put in quotes and attribute names prepended where not already present, conditional marked sections included or omitted depending on parameter entities, and also entity references expanded, etc. But SGML can also output HTML proper, unlike XML.
SGML mostly has additional authoring features over XML indeed, but a number of additional concepts as well: much more powerful notations (used as a general extension mechanism, such as for math or parametric macro expansion) and stylesheets, i.e. link process declarations with state-dependent assignment of attributes and pipelining to yield markup projections, transforms, and views.
I was a big supporter of SGML-based languages: markup languages written for humans to author.
However, the trend in computing in the late 90s and early 2000s was to come up with more easily parsed languages, and thus came things like XML: a markup language tuned for computers to produce.
But let's be honest here: parsing most XML can be done very simply, whereas supporting basic SGML was only really possible with OpenSP.
SGML is a specification of over 1000 pages of dense text, and that's before you get a language DTD on top of it (like the HTML or DocBook or TEI DTDs). Basically, it is too complex and too flexible, and it was too expensive to produce the tooling to support it (GUI editors, processing tools, making them performant...).
I mean, we are looking at MD here that is even less flexible than HTML: simplicity wins even if it only caters to 90% of the usecases!
HTML4 was the last "HTML-is-an-SGML-application" attempt (that was the terminology when you define a document type with an SGML DTD) before XHTML 1.0 came out.
SGML is what allows implicit closing tags, for instance.
Of course, even XHTML failed because it was too strict and browsers couldn't trust websites with following it to the letter, so we ended up with HTML as of today: clearly coming out of both, but not really either of them anymore.
Me, too, having worked at IBM and used SGML there. But JSON is what really killed XML. It can be harder to read, especially at first, but it's shorter and fulfills all the same roles.
I wouldn't say JSON killed XML: it's still widely in use for documents whose type definition changes rarely and which are more content oriented. The one benefit of XML/SGML languages is that you've got simple, ubiquitous support for "attributes", plain text content and nested tree content.
I.e. to represent
<p>An <acronym expanded="HyperText Markup Language">HTML</> page was the driver for interactive web.
in JSON, you have to come up with your own conventions for attributes and content:
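(one entirely made-up convention, just to illustrate:)

{
  "tag": "p",
  "children": [
    "An ",
    {
      "tag": "acronym",
      "attributes": { "expanded": "HyperText Markup Language" },
      "children": ["HTML"]
    },
    " page was the driver for interactive web."
  ]
}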
So with JSON, everyone comes up with their own format. And in these cases that XML was designed for (to mark up textual content), it handily beats JSON in expressiveness, simplicity and terseness too. The fact that it was misused for defining protocols and objects (i.e. SOAP, ugh) is a different matter.
I would say that SGML/XML languages still have this benefit even over Markdown: in Markdown, any contextual modifier is either impossible or uses a one-off syntax (like images or links with text).
It's said that the father of LISP, John McCarthy, lamented the W3C's choice of SGML as the basis for HTML : « An environment where the markup, styling and scripting is all s-expression based would be nice. »
Markdown is undoubtedly more readable, but HTML can be more readable than most people make it. And considering that the ultimate goal is to wind up with a laid-out, styled document, its capabilities in that regard are just plain-old more important, especially since markdown isn't going to replace WYSIWYG editors any time soon, and almost everybody who needs to know HTML can learn it relatively easily. Browsers collapse white space by default, so you've got a lot of flexibility with its formatting:
<!doctype html>
<title>
Page Title
</title>
<h1>
Page Title
</h1>
<p>
Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Ut ac lorem ut massa
euismod vestibulum.
<p>
Nullam rutrum blandit eleifend. Aenean a
varius diam. Morbi sodales velit nunc, vel
vestibulum lorem tempus sodales.
--or--
<!doctype html>
<title> Page Title </title>
<h1> Page Title </h1>
<p> Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Ut ac lorem ut massa
euismod vestibulum.
<p> Nullam rutrum blandit eleifend. Aenean a
varius diam. Morbi sodales velit nunc, vel
vestibulum lorem tempus sodales.
I get why many developers like this idea... Web developers are responsible for implementing the complex user-facing parts, and their primary weapon is text: doing extra work sucks, and when you're a hammer, everything looks like a nail. But developers are not designers, and design not being left to developers in mature organizations is no accident. Absolute, deliberate, limiting simplicity is always an attractive argument if you dismiss the value of, or maybe don't even understand the reason for, the complexity. I won't deny the advantages of reader-view-level simplicity in web design: it's easier to visually parse, more performant, and easier to navigate compared to most web pages, similar to how books compare to magazines. But about 225 million people per year in the US read magazines, and I assure you most of them would not choose to have textual printouts in lieu of their current form. While people like having the option of a uniform, grey, easily visually parseable mode to view webpages, that's probably not what they want even most of the time, let alone as a deliberate limitation.
One problem with this style is that if you copy any text from this website you will have trailing spaces after each paragraph. To avoid that you have to close the tags (or open the next one) directly after the text.
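For example, the only change is keeping the closing tag (or the next opening tag) on the same line as the last word:

<p>Nullam rutrum blandit eleifend. Aenean a
varius diam. Morbi sodales velit nunc, vel
vestibulum lorem tempus sodales.</p>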
Sure, if a trailing space in copy/paste functionality or precise :after placement is important then you'd need to modify the ending tag placement... but prioritizing that use case seems like a premature optimization. I don't think it makes a drastic difference. Compared to markdown, you've still got a heck of a lot more formatting flexibility without changing the rendered product.
I don't think that's as rare as people say, especially in smaller organizations.
Having an art school design education and a bit over a decade in (mostly back-end) web development, I've had plenty of developer-type roles. If they fall under a design or marketing department, they'll spend 80% of the time doing design work and try to throw it together on some shitty wysiwyg monstrosity, ignoring performance, stability, maintainability, etc. If they fall under technical departments, design, ahem, decoration and polish is something to be applied at the end, if there's time, after the real work is done. Either way, having the same group of people responsible for two halves of that coin rarely yields a good balance, and they almost never pay any real attention to usability... at least not for use cases that don't exactly mirror their own. Seems to me that replacing the flexibility of current markup and styling tools with simple markdown and reader-type layouts is just trying to apply the tech-focused solution to the entire problem the way Flash tried to do the opposite.
Yes, that's a perfectly viable workaround, but it's still a band-aid that requires expending resources that wouldn't need to be spent if the markup method had been better chosen for readability. (To be specific, I believe the angle-brackets are the main culprit.)
Technically XML has some machinery to support more lightweight notations. It won't parse these notations, of course, but the information is accessible to the users of XML reader. The mechanism should work like that:
<?xml version="1.0"?>
<!DOCTYPE myDoc [
<!NOTATION markdown PUBLIC "https://authority.org/markdown/v1.23">
<!NOTATION rest PUBLIC "urn:restructured-text/v4.56">
<!ELEMENT myDoc (note+)>
<!ELEMENT note (#PCDATA)>
<!ATTLIST note
notation NOTATION (markdown|rest) #REQUIRED>
]>
<myDoc>
<note notation="rest">
restructured text goes here
</note>
<note notation="markdown">
markdown goes here
</note>
</myDoc>
Not entirely sure what you mean by "two-armed key-chord". It's shift-, or shift-. -- my keyboard's bottom line goes <shift>\ZXCVBNM,./<shift>. < is right-index and right-pinky, and > is right-index and right-pinky (as shift is so much wider).
Now sure, some are home row aficionados, and having # on the home row is certainly beneficial to those, as your right index can stay on J as god intended.
Or do you have a different keyboard layout to mine? Keyboard layouts (especially the location of things like ,./<>?@;'#:@~[]{}) vary a lot depending on the country you are in.
It's very similar yet much fuller-featured than commonmark, with support for definition lists, footnotes, tables, several new kinds of inline formatting (insert, delete, highlight, superscript, subscript), math, smart punctuation, attributes that can be applied to any element, and generic containers for block-level, inline-level, and raw content. In addition, it resolves ambiguities in the commonmark spec and parses in linear time with no backtracking.
Exactly what I was thinking: by omitting <html>, <head>, and <body>, HTML can be quite concise [1]. Additionally, the closing </li> can be omitted from lists, and <li> is barely a step up from using - for bullet points.
The worst part about HTML is the links, though. Anchor tags are awful. Having to repeatedly type <a href="..."> and closing with </a> is wayyy too much boilerplate for something that is simply surrounded with [square](brackets) in markdown.
I have the opposite problem. HTML <a href> links are consistent with the rest of the language. <a href>Something</a> makes the same kind of sense as <em>something</em>.
But markdown? I'm always forgetting the order of the (link)[text] or [link](text) or [text](link) or (text)[link]. It's just something that's invented, and not consistent with the rest of itself.
And, for the specific syntax: parentheses to surround the URL is just bad because parentheses are URL code points, so you can't just insert regular serialised URLs in Markdown in all cases. (See https://news.ycombinator.com/item?id=33340097 for more explanation.)
It's said that the father of LISP, John McCarthy, lamented the W3C's choice of SGML as the basis for HTML : « An environment where the markup, styling and scripting is all s-expression based would be nice. » The {lambda way} project could be an answer, small and simple: http://lambdaway.free.fr/lambdawalks/
In lambdatalk such a HTML code
<h1>Page Title</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Ut ac lorem ut massa euismod vestibulum.
<p>Nullam rutrum blandit eleifend. Aenean a varius diam.
Morbi sodales velit nunc, vel vestibulum lorem tempus sodales.
is written like this
_h1 Page Title
_p Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut ac lorem ut massa euismod vestibulum.
_p Nullam rutrum blandit eleifend. Aenean a varius diam. Morbi sodales velit nunc, vel vestibulum lorem tempus sodales.
And you can also compute 3x4 writing {x 3 4} or compute the factorial of 100, compute a Fast Fourier Transform, draw complex graphics, ... it's a true programming language with a coherent syntax, unlike Markdown.
<ul>
<li> Item one
<li> Item two
<li> Item three
<li> Item four
</ul>
It has advantages over markdown lists, too: you never need to mess with semantic indentation to add additional paragraphs to a given item, and you don't have to manually number ordered lists the way some markdown flavours ask you to.
I guess my point is that browsers all easily provide (some with extensions) exactly this already: a mode where it's just HTML with some standard readable CSS.
You can also do default user stylesheets.
This is a positive point for things being pretty well set up today.
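As a sketch of the user-stylesheet option, in Firefox that can be a userContent.css in the profile's chrome/ directory (with toolkit.legacyUserProfileCustomizations.stylesheets enabled); the rules here are just an example of nudging every page toward a plain reading layout:

/* userContent.css -- illustrative only */
body {
  max-width: 40em !important;
  margin: 0 auto !important;
  font-family: serif !important;
  line-height: 1.5 !important;
}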
On the contrary - Markdown is essentially a superset of HTML, so unless you're using a renderer that strips it from the input, you can have the best of both worlds.
This property was super useful for a lightweight CMS I threw together a few years ago and which is still used by the original customer today. 99% of what they need to render is easily authored in Markdown, and this further helps ensure a commonality of style and device portability.
The original markdown parser supported html because it was basically just a preprocessor that added some syntactic sugar to html. The proposal here isn't just "what if browsers had a markdown preprocessor" (although I also think that would be questionable), but "what if browsers limited content down to only markdown, so that the web was all just clean, style-agnostic documents," and that clearly requires that markdown not support arbitrary html.
Uh, yeah, that's a valid web-page, but I don't see how that counters the "html is cluttered" statement. This is cluttered. It… just is. I know some people who suffered some mind deformation in academia and now claim LaTeX is the perfect markup for blogs, but I don't think I've encountered the same for html until now. I mean, does somebody really compose text in html?!
Markdown is deeply limited, that's true, but I often think that there is just a tiny bit of syntax lacking to make it just fine. Some actually is implemented in software like Pandoc or RedCarpet, there are a couple of ways to make tables (some better than others), LaTeX can be employed for formulas, some implementations have checklists, strikethrough, etc. It's just poorly standardized — and the spirit of the original proposal (and its misleading name) is at fault here as well, since later attempts to invent a standard mean very little when there are a dozen different common implementations and not a single one is reasonably complete.
By the way, the fact that HTML was supposed to serve as an addition to Markdown doesn't help: you just cannot allow people to submit arbitrary HTML everywhere where something like Markdown is needed. To use it in comments on a forum you need to fully parse it anyway, explicitly enabling or disabling different features of some ubiquitous "full implementation".
Obviously, you cannot make an atrocity like a modern landing page in Markdown+. But… ok, I shouldn't be judgemental and claim such atrocities shouldn't exist — they can, but most blogs, forums (such as this one), etc. — really could have been just "viewer programs" of some standardized format, much more restrictive than HTML+CSS+JS, but a little less limited than Markdown.
All of this isn't very much related to the original topic, but seriously, I dream of some better version of Markdown someday becoming a de-facto standard markup language for all forums, messengers, blogging engines, whatever the general name for Jira is… You know what I mean.
I don't really have a solution, it just really feels like there shouldn't be that many additional features. A couple more emphasis options, a couple fewer ways to do the same thing (I mean, it's stupid to convert all of */-/+ to the same <li> elements), colors, better image embeddings (with captions), sidenotes, better ways to handle formulas (there are enough dedicated literals in Unicode to construct most simple formulas without the need for LaTeX, but they still need to be parsed to be rendered pretty) and simple UML-like stuff… I'm pretty sure the comprehensive list of features for 6-σ use cases cannot be THAT huge. Big, yes. Not endless. And most features surely have some "plain-text" (or very light special syntax) representations.
I realize that it was pretty much the intention behind HTML + CSS. But HTML + CSS stopped being that a very, very long time ago. 30 years have passed. By now, we should have a little better sense of what's needed to write & render most texts.
> I mean, does somebody really compose text in html?!
Yes.
I use HTML the way people use markdown: as an open, easy to read, easy to write, plain text format for taking notes, writing articles, etc.
I find this quite intuitive and easy – partly because I’m an old-school web-developer from days of yore and I have HTML deeply internalised; partly because I use the abbreviated version of HTML noted above; and partly because I use a VIM plugin called Emmet which allows you to construct complex HTML fragments with a basic shorthand.
The reason why I use HTML instead of markdown is threefold.
* The first is that simple HTML, written with a little care, is readable as-is, and requires no transformation to see it looking pretty (just open in a browser). Markdown requires pandoc to turn it into something else.
* The second is that it is a semantically rich language, full of useful tags for expressing document structure and context for words and sentences. I find Markdown really confining.
* The third reason is that, if I take the care to fill in the basic author/keyword/desc meta-tags, I can run scripts over my directories looking for and indexing things. Who cares if search engines don't use some of those tags anymore? I do.
Possibly they’re not entirely compelling reasons for anyone else to adopt HTML over markdown, but they work for me.
> it just really feels like there shouldn't be that many additional features. A couple more of emphasis options, a couple less ways to do the same thing
There was a language like that once, it was called HTML. It had a very basic set of features initially, but then someone needed text to blink, someone needed to display videos, someone needed to send forms, someone needed to use it to play games, and here we are today, and it's not done yet. If it were invented today, you would get exactly the same result in the near future, because everyone's "small set of features" together adds up to infinity.
We twisted a weird little markup language into something it was never meant to be because the last tower of crap got too high and collapsed on itself.
Now a webpage is html+css+javascript+a dozen frameworks. People are sick of it and want something better. Well HTML2 is better. Just HTML2, nothing else.
This doesn't really make sense, for a couple reasons...
There are many flavors of markdown. We'd need a standards body, compatibility suites, etc., and for all the browser vendors to adopt it.
Meanwhile, markdown is designed to transform to HTML, which browsers already render. Adding a markdown-to-html plugin/step to your web server or publishing process is not exactly the most burdensome thing, relative to everything else it takes to develop, publish, and maintain a site. And it resolves the markdown flavors issue.
The thing is, people could choose to publish simple, uncomplicated sites now -- it would be cheap and easy, too. The HTML is barely more complicated than the equivalent markdown, and it would take a few lines of CSS to apply a basic style.
The many sites that choose to be complicated, cluttered, and expensive will continue to be so, for the same reasons they are now. Markdown would just be another way to build simple sites, which they don't want.
For people considering adding Markdown support to web browsers or other publishing tools, please consider adopting Djot instead: https://github.com/jgm/djot
It's very similar to the Markdown syntax we all know and love/hate, but fixes many inconsistencies in the spec, and also makes it possible to parse a document in linear time, with no backtracking. It is also much fuller-featured than commonmark, with support for definition lists, footnotes, tables, several new kinds of inline formatting (insert, delete, highlight, superscript, subscript), math, smart punctuation, attributes that can be applied to any element, and generic containers for block-level, inline-level, and raw content.
Looks like it simply makes Markdown easier for both computers and humans! I love this and can’t believe I haven’t seen it before.
> Requiring quirky behavior and blank lines that hurt reading
Really? The linked spec says, referring to a blank line in indented lists:
> reStructuredText makes the same design decision.
And as a design goal:
> your document [must be] readable just as it is, without conversion to HTML and without special editor modes that soft-wrap long lines. Remember that source readability was one of the prime goals of Markdown and Commonmark…
Or this, which made me celebrate:
> anything that is indented beyond the start of the list marker belongs in the list item.
In Markdown it's really hard (aka impossible) to get sections to respect the indentation level they belong to. What a simple rule here: inside a list, indented content belongs to its list item. Beautiful!
Other great quotes:
> we don't need two different styles of headings or code blocks.
> avoid using doubled characters for strong emphasis. Instead… use _ for emphasis and * for strong emphasis
> code span parsing does not backtrack. So if you open a code span and don't close it, it extends to the end of the paragraph
Sanity. Sanity introduced to an ambiguous spec. It’s wonderful.
This bit made me a little unsure:
> although we want to provide the flexibility to include raw content in any output format, there is no reason to privilege HTML. For similar reasons we do not interpret HTML entities, as commonmark does
While Markdown was meant to transform to HTML, I wish it was a spec renderable without a HTML or web browser layer. So I like this. Equally though one use case I personally have is Markdown to static HTML and it’s useful having HTML tags present and handled. So my understanding of this part of the spec is confused (what does “interpret” mean?) but if it means no support for inline HTML that is indeed a pity.
> reStructuredText makes the same design decision.
"This other product that doesn't understand the appeal of Markdown and also thought this was a technical problem rather than a user barrier to entry problem made the same mistake" is not exactly a strong defense.
> Sanity. Sanity introduced to an ambiguous spec. It’s wonderful.
Users don't care how hard or easy something is to parse. You write a parser once; you write Markdown millions of times.
> Looks like it simply makes Markdown easier for both computers and humans! I love this and can’t believe I haven’t seen it before.
Unfortunately it does not. This is less readable and more annoying to write:
>Markdown:
>- Fruits
> - apple
> - orange
>
>djot:
>- Fruits
>
> - apple
> - orange
These are fundamentally different products. If you want something easy to parse and human readable, use YAML. If you want something easy to write, use Markdown.
I don’t mind pressing Enter twice instead of once.
I know that’s a glib answer. And I agree an extra line break should, to a human which reads indents, be unnecessary. But given the ambiguities of Markdown, something that is both human-readable and computer-readable is a huge advantage.
Also,
> Users don't care how hard or easy something is to parse
I don’t read it as about parsing. I read it as about writing. You can write one way and know exactly how it will be interpreted.
> So my understanding of this part of the spec is confused (what does “interpret” mean?) but if it means no support for inline HTML that is indeed a pity.
All it's saying is that djot doesn't have special rules for HTML, so it spits out the same thing it receives, apparently with escaping relevant to the selected output mode. Note that right above the part you quoted it shows an example of using HTML ("we simply do not allow raw HTML, except in explicitly marked contexts").
I get this, but OTOH it is IMO best to distribute digital artifacts in the format that is most useful for editing or creating derivative works. This is the free software philosophy but also a societal good. Many of us learned HTML and web technologies by reading the source code of websites, and we've closed that door behind us with all of the build steps that turn our actual code into a computer-readable-only mess which we send out for consumption by normal users' browsers. It would be nice if "view source" showed you something like what the author actually wrote in their text editor.
you can distribute websites as markdown! Return markdown with a plain text content type and it'll show as markdown, which was designed to look good as-is and not require rendering to HTML
Markdown is supposed to (be able to) look good as-is. Most people's Markdown doesn't look good as-is, though. They target the GitHub renderer and come from the GitHub-listing-as-a-product-landing-page school of thought, so even project READMEs are generally a mess.
Presumably if you want to "distribute digital artifacts in the format that is most useful for editing or creating derivative works", like parent said, you would make it look good.
This isn't an unknowable hypothetical. No need to presume anything. Markdown found in the wild is a mess. The GitHub Flavored Markdown renderer even encourages it.
Exactly this. I was paraphrasing the definition of source code from the GPL: "The source code for a work means the preferred form of the work for making modifications to it."
This is actually horrible for society as it implies that the Web Browser will have to implement a billion different parsers for all of the separate file formats it supports, which not only causes it to have a ridiculously large attack surface but pretty much implies there will only be a couple serious separate implementations (if even that soon...) as it is just too difficult for even a large company now to build a browser.
Meanwhile, it doesn't even ensure the property of being able to view source, as people can and do obfuscate things they don't want you to see, and if people want you to see the source code there is nothing preventing them from making that entire pipeline visible, including, but certainly not limited to, shipping a trivial markdown parser to the browser instead of doing the conversion on a server.
In a perfect world, the browser should have simply provided something like canvas hooked up to something like WebAssembly, and we should have provided for everyone a trivial markup file format renderer that people could include by default, and a handful of graphic file format implementations that could be easily mixed and matched to pull just the ones people wanted into their site.
This fails to differentiate a "standard" from simply a "specification" (of a format, protocol, language...). I.e. we don't say "PostScript standard", but rather a "PostScript specification".
All of the claims they make apply to any specification, and yes, divergence is necessary to make progress.
A standard is a commonly agreed to specification, frequently ratified in one or another international organization (ISO, IETF, ECMA, W3C...). The main value of a standard is in ensuring interoperability where that matters more than all the other concerns raised.
Eg. we'd never have much of the internet if people didn't simply settle on the IP (v4) protocol.
Gemini is the needed standard between Gopher, tied to small devices with an 80-column display, and the Web: enforced encryption for security, but without requiring lots of resources.
This xkcd is always posted when anything related to a standard is mentioned, but almost never in response to a standard that was actually created to unify all standards in its space.
That's a pretty narrow reading of that XKCD: even the examples it gives are not the result of attempting to unify a set of standards.
Eg. AC chargers had a bunch of different, diverged "standards" for pretty much restricted use-cases (those 1.5mm x 4mm connectors and then micro- and mini-USB). Text encodings had multiple standards for encoding the same text (eg IBM, Windows code pages and ISO encodings) without unification attempts.
In both of these examples, there is one unifying standard added (USB-C and UTF-8 + Unicode) that did stop the proliferation of new standards.
But the majority of things never result in one unifying standard that can do everything and wins: even SGML, brought up in this discussion, is an example. CORBA also springs to mind.
>There are many flavors of markdown. We'd need a standards body, compatibility suites, etc., and for all the browser vendors to adopt it.
Well, if it were to be adopted by vendors, the many flavors would be a non-problem. They can just agree on a flavor and be done with it. There's CommonMark anyway, they can just use that.
I didn't say there wasn't a standard. Just because there is a standard doesn't mean it works well. Hence the whole "more than one standard" situation...
They render as loose lists (note the ugly spacing that appears) regardless of the number of new lines between them:
https://imgur.com/VEiAZKV
The workaround is adding a tab (or other character, like a braille space) between the lists, which really makes them one list:
https://imgur.com/Z5WLy6w
I came to write something similar to this, basically.
If anything we should push for websites to divide content from presentation: if html tags were used properly there would be no need for markdown.
And on that matter, pushing for proper use of html tags in documents is a more achievable goal than asking everybody to just drop html and write markdown.
The difference is 30 years of websites and tools being built on HTML. There's an opportunity cost to consider: is formatting simple websites in Markdown and rendering them natively that much more valuable than simply writing them in HTML or using a Markdown-to-HTML tool that it's worth the cost of creating standards, implementing them in browsers, etc. as opposed to putting those efforts elsewhere?
If you were starting from scratch, maybe. But it seems like we've already reached a point where existing solutions for Markdown-to-HTML get you almost all of the value and none of the cost.
It's the extra complexity of moving markdown rendering from the control/responsibility of the server side, where it fits naturally, to the user-agent side, where it doesn't -- and for something that site publishers can already do (and evidently, rarely want to do).
RFC 7763 does not define Markdown in any way. It acknowledges both the popularity and messiness of the Markdown family of syntaxes, registers a so-broad-as-to-be-nearly-useless media type for the family, and establishes a registry of variants (https://www.iana.org/assignments/markdown-variants/markdown-...).
Critically here, it does not recognise Markdown as a usable markup format in its own right. Only as a family of often ill-defined syntaxes that may be tolerably readable in raw form, and with the correct, unspecified tools may be converted to a formal markup language like HTML.
“Markdown” is utterly unsuitable as a publishing format. It’s designed as a writing format.
>Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML).
> Thus, “Markdown” is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML.
I believe nowadays most people refer to (1) instead of Perl tool, when talking about Markdown.
Personally I use Markdown *a lot* for Flutter apps, where text is rendered natively. Also use it for legal documents, which are converted to PDF via pandoc. Another project I have is a console app that also shows formatted help text written in Markdown. In all these cases there is no HTML whatsoever and no 'text-to-HTML conversion tool'. Yet it's all Markdown, so no need to reduce its applications to HTML, let alone claiming that it's designed "to transform to HTML".
HTML is already a markup language. You could just as easily make basic websites using HTML and some basic inline CSS.
(They’d even have some extra features missing from Markdown that I’d consider still part of a basic content formatting suite like floating or multi-column layouts. They’d even have a defined standard for machine readable metadata!)
The problem is that people don’t make websites like that very often, even though they can. This is trying to solve a problem that doesn’t currently exist.
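For the sake of illustration, a sketch of such a basic site using only the features mentioned above (names and values are made up):

<!doctype html>
<title>Basic page</title>
<meta name="author" content="A. Writer">
<meta name="description" content="A plain content page, no scripts.">
<style>
  article { columns: 2; }   /* multi-column text */
  img { float: right; }     /* floated image */
</style>
<article>
  <img src="photo.jpg" alt="">
  <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</article>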
What Markdown provides here is an even lower barrier to entry for the majority of people... they just write text, learn a fraction more Markdown to do more... and if they want total control they eventually learn HTML too.
It's not mutually exclusive... Markdown includes HTML.
Markup is not the barrier to entry. Nobody in 2022 is building HTML-only websites. Even 20 years ago CMSes let hundreds of thousands of people write blogs online.
Let's say somebody takes the time to learn Markdown. Then what?
Are they going to then also learn how to select a web host, how to set up SSL, how to use CSS to make the website look the way they want?
They won't. That's why Wordpress, and later Facebook, won the online publishing wars.
—I’d say it’s the other way around isn’t it? Wouldn’t Markdown be a subset of HTML, since all markdown can be expressed in HTML but not all HTML can be expressed in Markdown?—
Edit: Markdown can contain HTML that gets meaningfully interpreted as markup as well.
I’d also say HTML is not difficult to write, even for someone new to the concept. I don’t think anyone making their GeoCities homepage was too strained learning HTML, and those were leagues more advanced than what’s possible with only Markdown!
If you want people to be excited about self-publishing online again, it’s probably best to start with the markup language that allows for some fun :)
So you can't create a simple markdown renderer without creating a full-blown html/css renderer. So a cli renderer doesn't make sense either in that case.
Html 'support' is just a hack for any shortcomings of markdown.
If you think about it in terms of syntax, then sure. Markdown is a superset of HTML. I think it's much more meaningful to compare their semantics instead. From that point of view, Markdown is a nicer, more human-readable syntax for a very small subset of html, plus an escape hatch to reach the rest of HTML using conventional syntax.
To be fair, you can include Markdown inside script tags (assign a custom type like "text/markdown") and render this (the script's innerHTML) by another script.
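Roughly like this, assuming some Markdown library (marked here, purely as an example) is loaded on the page:

<script type="text/markdown" id="src">
# Hello
Some *markdown* content; it is not executed as JS because of the custom type.
</script>
<div id="out"></div>
<script>
  // Take the raw text of the markdown block and render it into the page.
  document.getElementById('out').innerHTML =
    marked.parse(document.getElementById('src').textContent);
</script>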
If say 10% of the pool of internet users at the time could make an HTML page on Geocities, what % of today’s pool of internet users could do it? The pool has gotten much larger and much less tech savvy on average.
Yes, so all websites are already written in markdown. It's just that no browsers support the typical header/list shorthands, but that doesn't matter, because even if they did you'd still need more tags to get interactivity and styling working.
People aren't imagining a world where markdown makes anything simpler; they are imagining that a format change would make people build less complex websites. But why would they? It's not HTML that makes the Twitter front end or Facebook complicated, it's the desired functionality, which wouldn't change even if the source code looked more like markdown and less like html.
So are Markdown rendering libraries entirely pointless? I'm not sure where you'd get the impression that Markdown is a subset of HTML rather than Markdown being a superset of HTML. (even that is very reductive, though)
This misses the real point of Markdown, which isn't to be simple or opinionated but to be readable by the end user in both forms (raw or rendered).
In case of a web site there is never the expectation of the source being easy to read for the end user. If you want to create a simple page – great, go for it. The only minor change will be replacing markdown tags with HTML ones. And there's plenty of tooling which does that trivially.
Deliver as markdown, but with a single line header:
<!DOCTYPE html><title>Foo</title><script src=mydelayedmarkdownparser.js></script><PLAINTEXT>
[[insert your markdown here]]
A document delivered as pure markdown, that will get spidered by any search engine that renders JavaScript (or that reads the document as text), and you don't need to ask anyone to change anything. HTML has no closing </plaintext> end tag, so markdown can be free-form and securely include any unescaped <>"' characters (raw HTML code) you need into your markdown and the browser will treat it as pure text (unlike the <XMP> tag which can be ended with </XMP> - even weirder lexing).
The key is that although the <plaintext> tag is deprecated, every browser has to support it (partially because removing support for <plaintext> would cause security issues for existing pages!) The <plaintext> tag is really very special, quite different from any other tag, and it radically interrupts HTML document lexing/parsing.
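For what it's worth, the delayed parser itself can be tiny; a sketch, assuming a Markdown renderer such as marked is bundled into it or loaded some other way (the script name is the placeholder from the header above):

// mydelayedmarkdownparser.js (sketch)
document.addEventListener('DOMContentLoaded', () => {
  // Everything after <PLAINTEXT> ends up as literal text inside a
  // <plaintext> element in the DOM.
  const src = document.querySelector('plaintext');
  if (!src) return;
  const rendered = document.createElement('div');
  rendered.innerHTML = marked.parse(src.textContent);  // render the Markdown
  src.replaceWith(rendered);
});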
What is needed, then, is adding a "type" attribute to the <plaintext> tag, so that browsers with their own implementation can optionally use it to render with the user's settings if desired.
However, that still forces you to serve HTML, so it is not good. My idea is adding an "Interpreter" response header, which indicates which files can be used to render files (documents, audio, video, pictures, etc) that the client implementation does not understand already. The end user can also specify their own overrides, if desired.
Fun related fact people may not be aware of: you can do basically this with arbitrary XML files, defining a stylesheet which transforms the XML into HTML however you like using XSLT. As an example, Atom feeds on my website (such as <https://chrismorgan.info/blog/tags/meta/feed.xml>) render just fine in all mainstream browsers, thanks to this processing instruction at the start of the file:
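(The href below is illustrative rather than the one from the original comment, but the processing instruction has this shape, pointing at an XSLT stylesheet on the same origin:)

<?xml-stylesheet type="text/xsl" href="/feed.xsl"?>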
(Mind you, XML is hard to work with in browsers, because it’s been only minimally maintained for the last twenty or so years. Error handling is atrocious (e.g. largely not giving you any stack trace or equivalent, or emitting errors only to stdout), documentation is lousy, some features you’d have expected from what the specs say are simply unsupported, and there are behavioural bugs all over the place, e.g. in Firefox loading any of my feeds that also fetch resources from other origins will occasionally just hang, and you’ll have to reload the page to get it to render; and if you reload the page, you’ll have to close and reopen the dev tools for them to continue working.)
I mean it _sounds_ good, but how will we cram a million ads down users' throats and measure every twitch of their input devices? It's almost as though the author is suggesting that web site proprietors might be more interested in "serving content" than "driving engagement", which I find disturbing and upsetting.
But the engaged clickable web is anyways dead. This post is a static HTML file hosted on IPFS with no back links to my carefully curated blog and media presence. No branding. It's because I've accepted that people's bullshit-radar is sensitive towards overly optimized engagement content. Rather I want my text to be read and those that care will anyways online-search me. It's not my idea btw. The web has reached peak clickability: https://tedgioia.substack.com/p/has-the-internet-reached-pea...
I think covid is when the internet jumped the shark for Joe Average.
You would get banned from twitter, facebook, reddit, instagram etc for saying what was official policy until _yesterday_. The sheer insanity of that policy left the terminally online in charge everywhere and the quality of every website suffered. If I look back to reddit posts which google still brings up more than half the people are banned. These are people who wrote thousand word replies to technical problems and were pillars of the community. The only ones left are the mentally ill unemployed since they are the only ones who have time to keep track of what is allowed there.
HN was headed down the same hole until that hilarious post by PG about heretics that got flagged for 8 hours. I imagine at that point it hit everyone in charge here that the people making the most noise were not their friends.
People are "big-picture" missing why this is an important idea. I had a prof put it like this once: The great tragedy of the web is the following:
HTML made the web easy to read.
But you know what made the web easy to write? Facebook. Facebook was undeniably the technology that made it so that roughly everybody could write things on the web to be read by everyone.
I really like the direction of this, because it points toward the possibility of a "web that is easy to write."
This is nonsense. There were tons of things that made the web easy to write before FB. That is not what made FB successful. It was a combination of a lot of little features plus the big innovation that your profile had to be your real life identity early on. That was the thing prior social networks didn't do. It enabled the uniquely Facebook experience of being able to find past friends and more distant family.
> the big innovation that your profile had to be your real life identity early on.
This came after facebook was wildly successful, so not early on. I also have never met a single person who was attracted by it, or was confused as to who their friends were before it existed. That being said, tying real names to online identity allowed facebook to buy data from brokers to fill out the sliced up audiences they sell to advertisers, so maybe it was important to their profitability.
What made FB successful was that it was a platform that other developers could program for, so it filled up with games and quizzes. Farmville,
"Which Harry Potter Friends Spice Girl Are You?" etc. was all the edge that it took to kill myspace, a site which seemed to stop any sort of development about 10 minutes after launch.
But, as you say, even myspace made it very easy to write on the web. You could scribble on other people's "walls", put whatever you wanted on your own page, and every profile came with a blog.
Against what you say, however, is the timeline where you could just post random crap and all of your friends would see it and comment on it; the dopamine stream. There's no easier way to write than to spit out a random sentence or upload a random picture, and broadcast that instantly to hundreds of people.
> What made FB successful was that it was a platform that other developers could program for, so it filled up with games and quizzes. Farmville, "Which Harry Potter Friends Spice Girl Are You?" etc. was all the edge that it took to kill myspace, a site which seemed to stop any sort of development about 10 minutes after launch.
Even before that, it was a combination of exclusivity and social groups. When people found out there was a social network they weren't allowed into (when it required a *.edu email address to sign up for), they were curious and wanted in. For the people who could get in, Facebook had network pages tied to your email's domain so you had an immediate social group of people going to the same college/university as you, which was used for all sorts of things like planning events, coordination, sharing campus information, I believe it even had a full-on calendar for students to put things on.
The loss of the network pages was when I first started losing interest in Facebook.
> This came after facebook was wildly successful, so not early on.
Not true. The original version of the site required you to be a student at Harvard, then a student at select universities, etc. Eventually it opened to the general public, but the norms for Facebook had been set. You entered your real name, real city and state, real college, etc.
You also misread what I said this allowed you to do.
The official requirement came later, but people were de facto using their true identities in large numbers, which allowed people to find old friends and family easily.
You know, the fact that it invites you to consider it a "yearbook" of sorts.
Now you're moving the goalpost. You originally implied FB making the web easy to write was the major innovation that lead to its success. I pointed out that simply giving people a dirt simple text box had been done many times before. There was nothing special about that part of FB.
I am not dumb enough to argue that FB wasn't hugely successful, so your attempt to shift the argument away from your original point is silly.
I have to agree with jrm4 - all the things you pointed out don't explain why FB groups and markets are so popular. There are also a bunch of businesses that don't have their own website, just an FB page - which is not comparable to "giving people a simple text box".
It is a simplified web experience from the point of view of the business owner: just drop in a logo, type in your company name, and you have a web presence - which happens to be where people are, because they had friends/family there anyway.
LOL. Apparently I am old, but I remember the web before Facebook (or Google or…) existed. Everybody could (and many did) write on the web before Facebook. Geocities, My Space, or just create your own website. Believe it or not, but it was far simpler and cheaper to create your own website back then. There were tons of free hosting sites back then. You know what killed all of that? Facebook. This is why I am excited by the notion of Facebook dying. Maybe we can get back some of what we lost.
I remember my first website experiment, hosted on "20megsfree". 20 megabytes of free hosting on a subdomain, they'd put an ad banner at the top, and you could pay for more / to remove the banner.
..and oh wow the domain still exists. Homepage is unchanged from 2001. Copyright line in the footer stopped updating in 2005 though so no idea if it would still work...
You already can render markdown. It shows as what it is: text.
If you want to render markdown as something else, you need to define what that other thing is. If you're suggesting we render it as a webpage, well webpages are made of HTML and CSS--so you're saying you want to render markdown as HTML/CSS.
We can already do that. There are a plethora of tools available to do that.
FWIW, just a couple of weeks ago we started doing that for a new sqlite subproject: https://sqlite.org/wasm
With the exception of one page, all of them are markdown, rendered on demand by the Fossil SCM. The one exception is an HTML file, which we need in order to host a small JS application.
When you say rendered on demand, you mean by the client, as in a page request? Why not just rerender to HTML on developer change? Genuinely curious why rendering on demand is preferred in this case.
> Why not just rerender to HTML on developer change?
Because that's not how the Fossil SCM renders content. It has a cache, but only for certain high-CPU data like generation of zip files of the source tree. Caching markdown docs wouldn't work in all cases, anyway: when you link to a ticket, for example, it gets rendered differently depending on whether it's opened or closed. Thus the renderer has to know the current status of any fossil-internal constructs a doc links to. Of course, we could say "just update the cache of all docs which link to a ticket every time the ticket is updated," but That Way Lies Madness. In an Enterprise-level system that would possibly be worth doing. For the Fossil SCM it's overkill.
Though re-rendering on every page hit _sounds_ bad, we've been doing it in the Fossil SCM since it came into being and it has never caused us any undue performance issues. Every doc you see on <https://fossil-scm.org/home>, as opposed to the non-doc URIs, is served directly from the SCM db and all (or very close to all) of it is either markdown or Fossil's older/original wiki format, both rendered on demand. CPU load is minimal and rendering is "fast enough" for everything we've ever done with it.
Nice! You don't need the HTML and BODY tags though.
I think however this defeats the purpose: yes you're delivering the content as Markdown, but you have to deliver it as `text/html` for it to be rendered, so anyone fetching it can't tell it's Markdown content. Also every document has to have (invisible-ish) HTML junk prepended.
A "better" solution would be a browser that sends the `Accept: text/markdown, text/html` header and a server that serves Markdown only when requested.
Let's just embrace the chaos and develop a new flavor for every browser until there are precisely 31 different flavors of Markdown. We cap it there, Baskin Robbins style, and then watch the world burn.
We could have a higher level language and tooling that transpiles everything to all known markdown flavors and bundles them all. But I guess one of these 31 flavors already does that.
Make a World Wide Markdown Consortium, release v1 of Markdown based on a randomly picked flavor, let it stagnate for decades, then let Google implement shadowMarkdown in Google Chrome, which renders at 300FPS for them, and unluckily falls back to a JS polyfill that ends up solving a rubik's cube before every character it renders. Once that has gone for long enough, let Google form their own MHATWG and pretend it's open while they keep a majority of the seats, to steer the evolution of WebMarkdown.
Also Safari still doesn't support headings for some reason.
It isn't as supported as I'd like, but it does exist and I've encountered it "in the wild" a few times, so it's not just some guy typing away on a website either.
After all these years, I still haven't found an important argument in favor of CommonMark. As I point out every time someone presents it as the answer, it doesn't handle things like math, so you still need to use unstandardized extensions, making the whole thing pointless.
The argument is that if you disqualify everything for not having $FEATURE, where $FEATURE varies from person to person, you have also essentially disqualified markdown entirely. As the saying goes, everyone uses only 10% of Microsoft Word, but everyone uses a different 10% of Microsoft Word. Much the same thing applies to this case for much the same reasons. If your standard is going to be "I want everything in any variant of Markdown ever and also any plugin ever", you will end up with something that is just as complicated as HTML, only different this time. (Possibly even more complicated than HTML.) CommonMark is a decent solution to "I want to use Markdown", if you're willing to take the simplification.
Note in this case I don't think there's anything wrong with refusing the simplification. It's just that if that is your set of your requirements, you've disqualified Markdown entirely. Personally, I think that is the state of the situation; Markdown can't do this. Markdown and all of its family members and close friends intrinsically work by reducing the problem. If you refuse to reduce the problem, you've refused to use Markdown. That is not a moral judgment; that's an engineering judgment. From the position the major browsers operate in, they will never attain anywhere near enough agreement on this to ever implement it without it simply becoming another monster of its own as everybody piles in with all their favorite extensions.
I have some websites that run with Hugo, which is in principle based on Markdown, but if necessary you can have raw HTML pages or other things too. This is actually the ideal; use Markdown when it makes sense, use other things when it doesn't, and thus, neither of those two things has to carry the burdens of the other side. This is the real and best solution, honestly, and it also has the advantage that it's here now. Use whatever flavor you want, where ever you want, whenever you want, today. I'm doing this and I don't see any advantage to trying to convince the browser to do this. I have a deploy step regardless of what I do, so it's no skin off my nose whether that step deploys my pages raw or there's a render step in addition to the deploy.
My personal preference would be GitHub flavored markdown, since as a coder it includes a lot of very useful non-standard markups. The compromises it makes on the non-deterministic markup elements are acceptable as well.
Like how they're introducing admonition syntax by overloading the blockquote sigil, which makes it difficult or impossible to nest, has a heavy English bias, and doesn't even transform the underlying element, making the use of blockquote unsemantic. They also just skipped the CommonMark RFC and other implementations, throwing their weight into the ring with no regard for prior art. I also don't think a corporation, Microsoft, needs to be in charge of the spec either. No thank you.
Many of the GitHub readmes are in markdown already, so people are quite familiar with it and there might already be an open source package that renders it out…
I guess it wasn't clear, but I meant those markdowns are rendered on everyone's GitHub page... But the whole GitHub Pages thing is new to me. Very cool and seems like what the blog post was asking for.
> AsciiDoc is a plain text markup language for writing technical content. It’s packed with semantic elements and equipped with features to modularize and reuse content.
Org Mode, AsciiDoc, reStructuredText, HTML files, ODF files, and EXIF data on images all support metadata in the file -- it's the norm. The fact that Markdown's spec doesn't support metadata by default, and that most "dialects" that do support it bolt on an ad hoc syntax (YAML, of all broken things), shows that Markdown is not suitable for most kinds of documents.
Agree that it'd be nice to have a markdown file be rendered inherently within the browser so I don't have to use Haroopad on my Windows machine, but it feels like we're just going to reinvent HTML.
Agreed, original HTML was not too different than markdown really. (But more standard, and slightly more powerful with things like tables, code blocks, and definition lists, all of which are only non-standard extensions to markdown!)
Maybe what OP really wants is a lot more people to write HTML without _any_ CSS or Javascript. But that's already more or less available, so there are reasons people don't do it that we'd have to grapple with.
Perhaps a mode where you tell the browser to ignore any CSS or Javascript; possibly also in this mode the browser could use better, more readable standard HTML rendering, similar to what most markdown renderers choose by default (bigger font sizes and line-height; more, and more even, whitespace around headings; maximum page width, etc.), instead of the legacy choices they are now sticking with for backwards compat.
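For concreteness, the kind of "readable default" styling meant here is only a handful of rules; the values below are purely illustrative, not taken from any real browser or markdown renderer:
/* Hypothetical readable defaults, for illustration only. */
body {
  max-width: 70ch;              /* cap the line length */
  margin: 0 auto;
  padding: 1rem;
  font-size: 1.125rem;          /* a bit bigger than the legacy default */
  line-height: 1.6;             /* more generous leading */
  font-family: system-ui, sans-serif;
}
h1, h2, h3 {
  margin-top: 2em;              /* more, and more even, space around headings */
  margin-bottom: 0.75em;
}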
> Perhaps a mode where you tell the browser to ignore any CSS or Javascript; possibly also in this mode the browser could use better, more readable standard HTML rendering, similar to what most markdown renderers choose by default (bigger font sizes and line-height; more, and more even, whitespace around headings; maximum page width, etc.), instead of the legacy choices they are now sticking with for backwards compat.
Reader mode also attempts to discard "unimportant" content – if you intentionally enter reader mode, it's fine to discard even things like the page navigation, but not so much for a browser's default mode.
Plus sometimes it's too overeager and discards content that shouldn't actually be discarded (e.g. I'm hosting some radio show transcripts, and with my markup Firefox's reader mode discards the bit of text indicating who's actually speaking).
Oh, I was thinking of a mode that the _page source code_ would trigger somehow, instead of the client triggering on a page that was possibly written to be full of complex CSS and JS. But also, maybe? It does seem related to OP. I am just brainstorming, don't have anything particularly thought out.
I am imagining instead a mode that tells the user-agent "Use your own standard HTML stylesheet, and it's allowed to change and be updated over the years, to fix bugs and improve design; you don't have to stick with your default styles from 30 years ago," but also "use the same stylesheet you are using for other pages in this mode; I don't want to have to come up with it, and I don't want it to be fixed in time to what I come up with today either, or require my maintenance."
That is, I suppose, more like "reader/readability mode" in all those aspects--but triggered by the source instead of by the user.
But, sure, a CSS stylesheet is another way to do it. I'm just thinking around the use cases I think OP is setting out; you can do that too, with other suggestions; you don't have to tell me I mean something other than what I mean!
I last paid active attention to this in mid-2017. At that time, Google mostly, eventually executed JavaScript, but it would often be weeks after initial indexing that any JavaScript execution happened; and there were rumours but scarcely more that Bing could execute JavaScript; and I know of no other engine doing any JavaScript. By mid-2018, Bing definitely sometimes did some JavaScript execution (see https://www.screamingfrog.co.uk/bing-javascript/). It’s probable things have become a bit more consistent by now, but actually loading pages realistically is so much more expensive than just parsing the initial serialised HTML that I think you can reasonably expect JavaScript-execution to be less consistent and reliable as they will very probably continue to prefer to avoid loading things that way if it’s not obviously required.
Part of the benefit of Markdown is that you can view the raw source and still get useful info out of it. But this site prerendered the Markdown and served HTML which defeats part of the purpose.
Markdeep is good at progressively mixing HTML & MD so you can choose how much of each to put in to your page. I use it for mostly MD + MathJax notes + some HTML/JS/SVG/Canvas/WebGL for when I want dynamic graphics in my notes.
It most certainly would not be a bad idea to have .md files render natively in a browser. Browsers also natively render images, videos, PDFs.
That said, the idea that this somehow changes any dynamic on the web is mere fantasy. The masses self-publish on social networks. On their phones. And not even that, as most largely lurk.
Aside from perhaps images, I wish that browsers didn't try to render more complex content. I'd much rather be able to easily watch YouTube and embedded videos, for example, in an external player. And I've almost always ended up reopening PDFs in an external viewer, or configured the browser to do that by default where that option exists.
Maybe Markdown isn't as much of an issue as the videos and PDFs are, but it seems to me like it's better handled externally from the browser, or perhaps by an optional browser extension.
HTML / CSS ended up being extremely granular and hackable while missing the big building blocks that would have readily matched the structure of the web - things like navigation menus or page outlines. We waited over a decade for the layout systems to catch up to developer needs while building buggy float-based grids. It's an outlier that media queries were already widely available when mobile really started picking up.
The author is lying. The source isn't actually a markdown document; the source is HTML (you can easily verify this by right-clicking and selecting "View Source").
It's actually quite easy to render markdown pages in a browser; the snippet below assumes the markdown-it library is already loaded on the page, since that's what provides the markdownIt global. Start with this:
// Inject the stylesheet while the page is still being parsed.
document.write('<link rel="stylesheet" href="css/style.css">');
document.addEventListener('DOMContentLoaded', (event) => {
  // markdownIt comes from the markdown-it library loaded elsewhere on the page.
  var m = markdownIt({'html': true, 'linkify': true});
  // The raw markdown is shipped inside a <textarea>; swap it for the rendered HTML.
  var t = document.querySelector('textarea');
  var d = document.createElement('div');
  d.setAttribute('id', 'content');
  d.innerHTML = m.render(t.value);
  t.replaceWith(d);
});
I think the author means that they wrote a markdown document, which then was transformed (by a CI/CD pipeline, for example) to the html you see when you inspect the source.
I like your "Serve markdown, transform through a client-side script" approach though, so upvoting nonetheless.
I spend a good part of my week writing docs. The docs that come out of Markdown are hard to read, hard to maintain, long, ugly, and limited, requiring postprocessing for simple things like a Table of Contents. Websites need content to be even richer than docs (and no, Mr. Developer, what you want is not the sole requirement of the rest of the people on the planet). If you want to make an entire website out of Markdown, make Markdown suck less.
I think we need a configuration format for websites. Configuration formats are designed for humans first, and fit to purpose. What might that format require? Probably: style, layout, macros, embeddable objects, inheritable/overrideable values, logic, loops, etc. Basically a DSL. You describe independent blocks (layout, style, content, etc) and the browser takes the instructions and renders the result. Not insanely different in concept than HTML/CSS, but everything could use one common format, in a manner more human-friendly than we have now, content would be independent of form/function, and none of it would inherit some unrelated design principles from some antiquated non-human-friendly technology.
text "Employees table description" |
This is a description of the employees table.
The name and e-mail address are listed to the right.
table "Employees"
@Name ~> Frank
Suzanne
Rahul
Aman
@E-mail address -> frank@me.com
suzanne@me.com
rahul@me.com
aman@me.com
style "table.Employees"
@Name
column
bgcolor "green"
@"Email address" rows bgcolor "gray"
layout "main-page"
panel "1"
align: left
content "text.Employees table description"
panel "2"
align: right-of "layout.main-page.1"
content "table.Employees"
You've described a static site generator, which I mean not as "hey you should have known that" but more as a "good news, that basically exists and you can use it now!"
No, there's no standard with static site generators, but there never will be. The combination of the wide variety of needs and use cases and the ease of starting one of these up (I can literally bash together the skeleton of a useful static site generator in 4 hours, and this isn't just "oh I could recreate dropbox in a weekend if I wanted to because I am a swaggering HN programmer", it is something I've literally done rather than sit down and learn someone else's, because it was faster to bash something together than read docs) means that there will never be a shortage of these, and none of them will ever manage to capture 99% of the market to become a de facto standard.
I basically have websites that work the way the author describes. There's no particular need or benefit from expecting the web browsers to do this. There are disadvantages to this approach but the web browser directly rendering the markdown doesn't really solve any of them.
The problem with only using Markdown to render websites is the lack of navigation.
We need something to show next/previous page (we have rel headers and tags for that) and something that shows a tree navigation structure for populating menus (not sure what exists for that)
The tree structure would need to be “crawlable” because for very large sites, the entire tree can’t be loaded. That could be solved by representing nodes on the tree as URLs that can be loaded by the client as needed.
I’d love to see this happen. I don’t think we need markdown per se… maybe it’s a subset of HTML that leaves out CSS, JavaScript, and a lot of the other nonsense.
Is anybody interested in building this out? I have ideas for the server and formats, but would want help implementing the clients.
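For the crawlable tree idea above, one possible shape (purely hypothetical; every field name here is made up) is a small per-node document that links to its children by URL, so clients only fetch the branches they actually expand:
{
  "title": "Docs",
  "page": "/docs/index.md",
  "children": [
    { "title": "Getting started", "node": "/docs/getting-started/nav.json" },
    { "title": "Reference", "node": "/docs/reference/nav.json" }
  ]
}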
We should have multiple formats besides HTML. In fact we already do have this, in the “Content-Type” response header; we just need to use it more.
Send data with Content-Type application/json and Firefox will display a fancy JSON viewer; send data with text/plain and Firefox displays plain text; likewise for PDF files, downloadable files, etc. Well, we can add text/markdown (already a registered media type) and have the browser automatically render markdown files. And when the next HTML replacement comes out we can add that as well.
What about backwards compatibility? We already have that too: webservers can check the User-Agent request header, and return converted data for older browsers. Though we’d need a centralized database or fallback solution to support niche browsers…
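A rough sketch of that idea, assuming Node with the markdown-it package installed (both are my choices here, not anything standardized); it serves text/markdown to clients that say they accept it and converted HTML to everyone else:
// Content negotiation on the server: raw markdown for clients that want it,
// HTML for the rest. (The comment above suggests keying off User-Agent
// instead of Accept; it's the same idea either way.)
const http = require('http');
const fs = require('fs');
const md = require('markdown-it')();

http.createServer((req, res) => {
  const markdown = fs.readFileSync('page.md', 'utf8'); // example file name
  if ((req.headers.accept || '').includes('text/markdown')) {
    res.writeHead(200, { 'Content-Type': 'text/markdown; charset=utf-8' });
    res.end(markdown);
  } else {
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(md.render(markdown));
  }
}).listen(8080);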
> We really need a new publishing tool for everyone that doesn't require intricate knowledge of how the entire web computing stack functions.
That tool is called HTML. It is usually supplemented by CSS. Every Markdown element has a one-to-one translation to it. This webpage is a particular stylesheet, plus basic HTML elements that you could write in <> style just as easily as you could write them in *_# style. If HTML requires particular defaults to look right, then that is a really great reason to fix the defaults, but not a really great reason to junk the whole thing and start over.
Markdown isn't particularly good, it isn't particularly standardized, and it isn't even compatible with reader mode (e.g. images). Just write HTML. Browser standards have not degraded to the point that you need to write a web page differently than people wrote them fifteen years ago for it to look the same; a toolbox with a million complicated tools and one obvious screwdriver is not a toolbox that makes it very hard to unscrew screws.
Thus, instead of
- Fruits
- apple
- orange
you must write
- Fruits
- apple
- orange
This is an obvious non-starter.
(edit)
...we allow headings to "lazily" span multiple lines:
> ## My excessively long section heading is too
> long to fit on one line.
What?
...if you open a code span and don't close it, it extends to the end of the paragraph. That is similar to the way fenced code blocks work in commonmark.
> This is `inline code.
My personal take is that markdown is only good when you can customize it to your needs and situation. There are a million markdown flavours because everyone who implements it decides to add their own extensions to suit their situation. There have been attempts at creating a single unified standard, but these miss the point: if we wanted a single shared syntax, we could use html directly; the advantage of markdown(s) is that it lets you create something that looks clean and simple as plain text, while turning into a nicely marked-up document. A "standard" markdown would need to have a "standard" extension mechanism (or none at all...) which would inevitably look like ass for most use cases.
I nowadays usually just write html directly for my personal documents, because I spent long enough messing around with markdown parsers trying to get them to act how I want. But for a website commenting system (e.g.), it makes sense to spend some time making the formatting system nice to use, which involves customizing your flavour of markdown. I don't think web browsers can or should try to do a good job of this; it imposes too many specialized demands for them to be able to create a generic solution. If you want to write in markdown, it's easy enough nowadays to format it on the server, or client-side with a small js snippet.
It's said that the father of LISP, John McCarthy, lamented the W3C's choice of SGML as the basis for HTML : « An environment where the markup, styling and scripting is all s-expression based would be nice. » The {lambda way} project could be an answer, small and simple: http://lambdaway.free.fr/lambdawalks/
In lambdatalk, HTML code such as
<h1>Page Title</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Ut ac lorem ut massa euismod vestibulum.
<p>Nullam rutrum blandit eleifend. Aenean a varius diam.
Morbi sodales velit nunc, vel vestibulum lorem tempus sodales.
is written like this
_h1 Page Title
_p Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut ac lorem ut massa euismod vestibulum.
_p Nullam rutrum blandit eleifend. Aenean a varius diam. Morbi sodales velit nunc, vel vestibulum lorem tempus sodales.
And you can also compute 3x4 by writing {x 3 4}, compute the factorial of 100, compute a Fast Fourier Transform, draw complex graphics, ... it's a true programming language with a coherent syntax, unlike Markdown.
We need to bring text editing into the 21st century where rich text is standardized across all platforms and devices. We can send multicolored emoji to just about any device on the planet, but we still can't bold, italicize, underline or color basic text. It's straight up insane.
Think about it: There is no such thing as "plain text", it's just encoding we don't see.
Even if you're a die-hard keyboard jockey, unless you're looking at your command line and mentally parsing raw ANSI encoding like \e[1mBold, you're not using "plain text". There's tons of encoding underneath every text editor and terminal program, you just don't see it. We already have a standard encoding for rich-text, we need to start using it. It's called HTML, and every device with a screen on the planet knows how to display it. We just need to hide the tags everyone bitches about just like vi hides a highlighted line's \e[43mYellow codes. We should never need to see it.
Any place we're able to enter text should support rich text, from boot up screen through to apps. And underneath it should all be HTML. It's like the display compositor on macOS. None of us think about the fact it uses PDF under the hood, right? We're not debating whether it should be done using SVG. It's invisible. This debate should have been settled decades ago.
I've been writing about this in detail for the past several weeks. I swear the tech industry has lost its collective minds. At this point I'm truly wondering if Markdown is some sort of cult. We need to kill these lightweight markup formats with extreme prejudice and start solving the real problem.
One could use Apache2 .htaccess to set up a directory header/footer which loads a script (hidden from listing in .htaccess) that parses a .md file listing into a list of articles, and fetches a couple of the first ones as a front page. Then just use a URL fragment to track which article is open, and fetch & render that article’s markdown fully onto the page.
Disable JS and you have a list of markdown files. Enable JS and get a website.
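A sketch of the client-side half of that, assuming a markdown renderer is loaded and exposed as a markdownIt global, a #content element on the page, and article files named after the URL fragment (all of which are my assumptions, not a description of a real setup):
// Fetch and render the article named by the #fragment, on load and on change.
async function showArticle() {
  const name = location.hash.slice(1) || 'index';  // e.g. #my-first-post -> my-first-post.md
  const response = await fetch(name + '.md');
  const markdown = await response.text();
  document.querySelector('#content').innerHTML = markdownIt().render(markdown);
}
window.addEventListener('DOMContentLoaded', showArticle);
window.addEventListener('hashchange', showArticle);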
I like Markdown and use it at work, but I think Asciidoc is a better markup language because it is more consistent and has support for more things than Markdown does (e.g., better table support, callouts, tips, etc.).
I currently use 11ty with the Asciidoc plugin for building websites. This setup is nice because I only have to fiddle with HTML and CSS during the design phase. Once that's done, nearly all my website maintenance is done in Asciidoc. Easy!
I don't think I'd want to directly write an entire website in either Markdown or Asciidoc. I think, eventually, doing so would result in these markup languages becoming as cluttered and weird as the HTML/DOM/JavaScript/CSS mess is now.
I think a better step to improving HTML and CSS would be to have the browsers support Slim (https://github.com/deepin-community/ruby-slim) and Sass out of the box instead. That would make my design phase less wordy and redundant while keeping my Asciidoc experience nice and tidy.
I agree that we should have some diversity in our rich text terminal systems; however, I think markdown is close enough to the tree (HTML) that it does not matter much. My candidate is PostScript; this is only half a joke. What I would really like to see is diversity in our client-side scripting systems.
This convo reminds me of MDX -- https://mdxjs.com/ -- allows you to mix JSX with markdown, popular for making documentation pages for design systems.
A lot of the limitations of MD mentioned here are alleviated by allowing arbitrary JSX, which of course is optional for users who want something more basic.
No, why should we do that? Markdown websites just to render it BACK into HTML? That's not the browser's job.
The issue is, Markdown is nice because of its simplicity. The minute you start making it standard and widespread... feature-creep will happen and MD will just have as much noise as HTML does now.
The article should probably read: we should write more simple websites. A few lines of CSS usually do the trick. That's what I do on my tiny website http://scriptkid.it. I learned the <p> trick from someone who posted it here on HN.
I see some people in multiple comments discussing the need for some sort of standard for markdown if it were to be used for (goals...).
Does nobody else remember the debacle and (limited) fight between Jeff Atwood and John Gruber (markdown's creator) over this?
It was a big deal here, and of course over at Coding Horror, Atwood's blog. And the reason was: Atwood was calling for some kind of standard, and Gruber actively opposed the mere notion. Gruber claimed ambiguity was baked into markdown and he didn't want a standard. So take this into consideration before proposing some markdown standards committee: the creator of markdown has said he actively opposes this. I guess he can be bypassed, but what does it say about the future of said standard if the original author says "no"?
I agree with you! My point is that Gruber claims the sloppiness is intentional and actively resists fixing it. And if he -- the creator -- is not on board with standardizing markdown, maybe the whole endeavor is a dead end?
I mean, it could be done even against his wishes, but the project would already be starting with a big negative.
One thing that's always been a source of a tension for me is how to handle navigating around a "web of documents".
There are limits to what you can do with "in-band" links in the text before they get to be contrived, awkward, and non-discoverable.
Users seem to have a revealed preference for on-page links to barely-related URLs and want those links to have some spatial consistency over time. So that means adding some "fluff" (nowadays presumably nested in a <nav> tag) to every document.
One past experiment with out-of-band navigation was framesets, but they had some significant issues (not least of which is that the URL pointed to a container rather than the contained document).
Are there some other interesting experiments with out-of-band navigation? Or is adding some semantic tags to HTML as good as it gets?
I've been debating on using Org Mode documents to render to HTML and serve those kind of static pages for a blog. I've heard of people doing something similar before (Org -> HTML or Org -> Md -> HTML). But the other part of me wants my own NIHed, over-engineered solution :)
We already do. They use static site generators to produce HTML (a good interchange format) from a customized form of markdown (which is not standardized at all and very customizable). This state of the world is totally fine - it works well for both viewers and content producers.
With modern HTML it's actually pretty easy to use markdown wherever you like. Though there is no built-in markdown element, browsers have built in the ability for authors to define custom elements. You can define your own custom <mark-down> element and put your markdown syntax inside; even if we did nothing further, the source would stay nice and human-readable. But we can also define how this element should render, and so it's very straightforward to wire up any existing "markdown to HTML" library in JavaScript to consume the markdown contents of the tag and display rendered HTML.
This is so easy to do, not just for markdown, but almost anything you could want to embed or work with in HTML.
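A minimal sketch of that wiring, assuming some markdown-to-HTML renderer is already loaded on the page as a markdownIt global (as in the snippet earlier in the thread); the element and class names are just examples:
// Define a <mark-down> element that renders its own text content as markdown.
// For simplicity this assumes the element's content is already parsed when the
// callback runs (e.g. the defining script sits at the end of <body>).
class MarkDownElement extends HTMLElement {
  connectedCallback() {
    var source = this.textContent;                 // the raw markdown inside the tag
    this.innerHTML = markdownIt().render(source);  // swap it for rendered HTML
  }
}
customElements.define('mark-down', MarkDownElement);
In a page it would then be used like:
<mark-down>
# Hello
This is *markdown* inside a custom element.
</mark-down>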
Markdown by definition compiles to HTML and must be rendered by a browser as HTML. This seems to be a widely overlooked fact in this thread.
It's fine to write MD, but know that it's a limited and shortcut form for the real HTML that will be output and rendered.
The short form is great for short stuff, but accessing the full power of HTML and CSS takes really messy spaghetti MD. And don't even start with JS or TS. Modern sites usually need code, and that means web components, and that means real HTML and all of CSS need to be present and accessible in an orthogonal way -- not some via MD syntax and some with HTML glommed onto MD.
This reminded me that a long time ago I used S3 website's error handling feature so that S3 could render simple markdown files as HTML natively: http://composedit.net/
There's Gopher, Gemini, a myriad of static site generators that render markdown...
I like the idea of my browser rendering markdown. But it's not going to solve the problem: as long as the same client application that renders documents can also render web applications, people looking to bait you with a document into running software on your machine will just publish web applications with some text of interest to you inside. What's needed is separate client software for running web applications and for rendering documents, and a different protocol for each. Where we fucked up was using HTTP for everything.
The slow drift of Hypertext from nice separation of document markup and presentation to "web applications" and browsers that are mini operating systems makes those old jokes about Emacs pale in comparison.
The modern bloat-ware browser is a calamity of code, most of which I have no use for and no trust in. So I use a text based browser without JavaScript - and I love it!
If I want to run your code on my machine I'll install a proper application that has at least been through some basic code signing, packaging and secure distribution channels.
Gemini seems like one of the few hopes for breaking away from the madness of "Web" and restoring some sort of sane, plain document publishing for ordinary people to read and share information.
While I believe you, and agree that the current state is too messy, it is nevertheless possible for some implementations to be more limited, or to have options for the user to disable some features if desired.
It is true, there are other protocols and other file formats they are good for different purposes, and you should not try to use one for everything. (In my opinion, this is true of Unicode as well; it is messy and doesn't work well to use Unicode for everything, either, nor HTML or HTTPS for everything, or cell phones for everything, or the government for everything, etc.)
It would also be possible to serve text/gemini files over HTTP, and I have modified my browser to be able to display them (local files as well), though this is not common.
The charm of markdown is there are different flavours, you pick your favourite, and it can evolve independently of the web itself. Making browsers render markdown means markdown needs a single standard, or a standard of specifying standards and some kind of document type declaration, agreement by WHATWG and so on! For what? There are a lot of web publishing platforms for non technical authors. Free and paid. There are also simple programs to turn markdown into HTML. Uploading, hosting, DNS are much bigger barriers than HTML syntax anyway.
Markdown could potentially replace HTML as a more succinct markup language but it's not sufficient by itself because it has no way to represent styling and layout. You'd need to jam CSS into it somehow
The amusing thing is, that's what his Markdown translator is doing. Look at the page source. There's a fixed CSS preamble, and then there's very basic HTML 1.0:
<body>
<h1>Why We Should Have Markdown Rendered Websites</h1>
<p>You're viewing this document in your HTML-rendering browser but its
source is actually a markdown file.
</p>
...
<pre><code>// file: http://home.md
[home](http://home.md) [about](http://about.md)
this my homepage
// file: http://about.md
[home](http://home.md) [about](http://about.md)
this my about page
</code></pre>
...
<p>Best,
Tim Daubenschütz <a href="mailto:tim@daubenschuetz.de">tim@daubenschuetz.de</a></p>
<h2>References</h2>
<ul>
<li>1: https://gist.github.com/JoeyBurzynski/617fb6201335779f8424ad9528b72c41</li>
</ul>
</body>
That's it. That's his HTML. You could write that by hand.
Don't forget that markup refers to the document's formatting and markdown is a specific markup library that converts text-to-HTML. So technically all browsers do support markdown.
We should re-visit the actual term "mark-up" which refers to decorating copy during the editing phase. I think many people agree that using markdown to "mark-up" copy on websites is easier.
Markdown results in unintentional formatting more frequently than not in my experience, and it is this reason alone that I avoid it anywhere I have a choice to, including any software I implement.
Once upon a time I made a very minimal PoC markdown "browser" in a day or two. It basically had a navigation bar like a browser, only displayed markdown and would only follow links to other markdown documents. It was more of a project to play with React Native for macOS/Windows more than anything serious, but I think the general idea could be neat if implemented correctly in some native GUI toolkits.
I also ended up using Markdown to create a blogging platform[0] for my website. It has many benefits as it's a popular format, so you can for example edit and see the result directly inside VSCode.
Can someone explain what the point of IPFS is? I visited the homepage and it mostly shows information about what it is not, and the disadvantages of things that are not IPFS.
But I am struggling to understand why would I want to use it and how?
For instance, can I host a website on it? Can I put a wordpress on it? How can I share my website with someone? Can I use my own domain?
Or is it like FTP? I really don't get it, and it feels like I am missing out.
It's a distributed content-addressed datastore, like Git or Bittorrent or Gnutella or Freenet, but maybe trying harder than any of those to be a direct replacement for web servers. I think they aim to make "ipfs://whatever"[1] a URI scheme that's supported in web browsers alongside "http://whatever".
[1] I can't find the page explaining the URI scheme, but as I recall the double-slash felt vaguely out-of-place when I did once upon a time read it, and I have never been able to take the project as seriously as it seems to want to be taken (flashy website with 'whitepaper' etc) partly because of that detail. The underlying idea about a distributed content-address datastore is a good one, but I feel like they're making it more complicated than it needs to be. https://www.nuke24.net/docs/2015/HashURNs.html is somewhat a response to it.
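For what it's worth, the core of the content-addressing idea is small enough to sketch (this is a toy, not IPFS's actual CID format, which layers multihash/multibase encoding on top):
// The document's "address" is a hash of its bytes, so anyone holding the bytes
// can verify them and re-serve them from anywhere.
const { createHash } = require('crypto');

const content = '<h1>Page Title</h1>';
const address = 'sha256:' + createHash('sha256').update(content).digest('hex');
console.log(address); // clients fetch by this address rather than by server location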
Markdown is not sufficient for the task of expressing the modern web.
All that said... We shouldn't stop thinking of ways to reimagine what the WWW actually is.
Why not create a new web-browsery thing which uses a totally different language, totally different paradigm for composition, heck why not a totally different network transport?
If you think that sounds dumb...I just described the current state of Netflix.
Yes, we should have open standards for content-types and dynamic fetching and updating of code from trusted locations on the network. Then the browser itself becomes little more than a container for executing rendering code, with plugins for every content-type under the sun. But we are on a different timeline.
Most people don't want the simple websites you could render in basic Markdown, that's why we don't have more of them already. How many of the top 1000 websites are a column of text and nothing else?
If most people wanted simple websites, they would write them with a WYSIWYG editor, they would not learn Markdown.
Disclaimer: I'm biased as I created Scroll, but I can say pretty objectively that at this point it's far better than Markdown and the gap is only going to widen.
While I consider it a bit unrealistic to follow this approach at a large scale, I like the idea. However, I would suggest using AsciiDoc instead, since markdown is a bit too constrained. For example, image captions or tables are not possible in markdown.
What happens currently if you send a file with content type text/plain to a browser? If the browser would just render it (maybe in a monospace font), you're already halfway there, without needing to embed a Markdown renderer in the browser.
What we (I) need is a simple product flow from Word (the corporate writer of choice) to HTML (the lingo of the web). I thought Markdown might get into Word somehow as an export option (without plugins) but sadly not.
Recently there was a thread about GitHub Blocks, interactive elements within READMEs and Markdown files in general. If we could standardize that and support it, that would be cool.
You mean stuff that is already standardized and in Asciidoctor? That would be easier than trying to wrangle in all of these forks/flavors with all their incompatible extensions and tools all while have no way to handle metadata?
I would love to use more markdown but I need multidimensional layouts decided by the author (variable columns with text and images) and I haven't seen anything like this.
There are some of us that think Markdown should be like this but it isn't. There are implementations in some languages that auto-escape any embedded HTML in a Markdown document, but for others they pass through.
That makes Markdown great for content when the authors and system owners are the same people. Not so great for content submissions from anonymous and potentially hostile third parties.
The project I originally had in mind for learning Elixir would have been Markdown-based, but there is no flag for disabling inline HTML in the current parsers, and writing my own seemed like a bridge too far.
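As a point of comparison, in the JavaScript world markdown-it exposes exactly this switch via its html option (that much I'm sure of; the Elixir parsers mentioned above may well be a different story):
// With html: false (markdown-it's default), raw HTML in the source is emitted
// as escaped text; with html: true it passes straight through to the page.
var safe = markdownIt({ html: false });
var risky = markdownIt({ html: true });
console.log(safe.render('Hi <script>alert(1)</script>'));   // tags come out escaped
console.log(risky.render('Hi <script>alert(1)</script>'));  // tags pass through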
My observation of English-language typewriter and early plain-text computer usage (documents and emails) is that * was mostly used as equivalent to italics, ** and *** for stronger emphasis were not particularly uncommon (bearing in mind that traditionally there was simply no such thing as bold). But _ was kind of a historical oddity, with underlining used in lieu of italics in the typewriter era for technical reasons, implemented by means of overwriting; underscore-surrounding came much later, and never seemed particularly popular to me, and was more often interpreted as underlining rather than as italics—though the typewriter remarks are certainly still relevant, as underlining had for a while changed somewhat in semantics.
I say Markdown’s error was doing anything with underscores at all, especially given the programming context and the prevalence of underscores in snake_case and SCREAMING_SNAKE_CASE (Python being most heavily affected with its __magic__ methods, so that all implementations will be affected, rather than only some that have gone a certain way on word boundary matching), and it should have just been *italics* and **bold**.
Semantically, view italics as emphasis and bold as strong emphasis (basically where HTML 4 headed with <em> and <strong>), and the doubling of the asterisk makes perfect sense. The asterisks are an intensity modifier.
My own convention when writing Markdown is to use single underbar for _italic_ and doubled star for **bold**. Except of course on GlitchSoc (a Mastodon server), where underbars give underscores, and single or doubled stars are for italic / bold respectively.
But otherwise, if I see a singleton star or doubled underbar in my own writing, I'm pretty sure I've typoed something.
Completely agree. It's already pretty easy to create a nice readable page with an off-the-shelf stylesheet & semantic markup.
Sites choose not to do that, for non-technical reasons.
For all its faults, this is partially what AMP aimed to solve, by restricting what developers could do. However, developers/businesses didn't adopt AMP because it was a better user experience; they adopted it because it got them higher up in Google search.
I've seen a few people on here recommending everything from "just use HTML" (which misses the point) to "just use Gemini" (which misses the point even more).
Why not HTML? Why not Markdown? They aren't self-contained.
* A web page written in either format can leak your IP address to external bad actors because of the way inline images work.
* Loading resources from more than one server is a reliability and security problem, and it performs badly on initial load (it's great for subsequent loads, since external resources can be cached, but initial load time is bad, and tech designers should really spend more time thinking about worst-case perf than about average-case).
* Downloading a web page is overly complicated. I should be able to download a page to my computer and never have to worry about the origin server going away, and that's not possible on the HTML5 web. This is one of the main reasons for the enduring popularity of PDF. IPFS, in particular, would benefit from a self-contained document format, because it needs to know the full set of dependencies in order to pin a page as a whole, and ensure that you don't accidentally pin an HTML file without pinning its images and wind up with a broken site.
Sure, you can make HTML pages that are self-contained, but because they aren't always, people don't build workflows around them.
Why not Gemtext/Gemini?
* Nobody but nostalgic nerds cares about simplicity of implementation. I mean, come on, Markdown is even harder to parse than HTML is! Nostalgic nerds might be a worthwhile demographic to appeal to, but I think IPFS wants a wider audience than that.
* Inline images are not optional. Too many great creators with a lot of worthwhile things to say are either creative artists or technical artists. In the BBS era, before inline images were practical, that didn't stop people from drawing; they just relied on ANSI and ASCII art, and "let's go back to typewriter art" only appeals to nostalgic nerds.
* And once you have inline images, you have to offer rich text layout features like tables, otherwise people will start posting pictures of text to work around your missing features (which sucks for either accessibility, because blind people can't read them, or it sucks for simplicity, because deploying OCR is even more complicated than just offering decent text layout).
Why EPUB?
* You can download an EPUB, and when the original host goes away, it still works! Pinning an EPUB in something like IPFS can work without requiring the CDN to know anything about the file format, since EPUBs are self-contained.
* Tooling already exists. It's just XHTML in a ZIP file anyway (see the rough layout sketched after this list), but there's also EPUB-specific tooling (for example, the Texinfo release announcement a few days ago mentioned that you can export EPUBs from GNU info manuals).
* It supports text and image layouts that writers demand.
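For anyone who hasn't peeked inside one, an EPUB is laid out roughly like this (directory and file names such as OEBPS/content.opf are common conventions, not requirements):
my-book.epub            (an ordinary ZIP archive)
  mimetype              (the literal string "application/epub+zip", stored uncompressed as the first entry)
  META-INF/
    container.xml       (points at the package document below)
  OEBPS/
    content.opf         (metadata, a manifest of every file, and the reading order)
    chapter1.xhtml      (the actual content: plain XHTML)
    style.css
    images/cover.png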
Of course it's related to the format. Many formats, like PNG and Gemtext, are inherently self-contained, while other formats, like HTML and SVG, are not.
Annoyingly, EPUB isn't inherently self-contained [1], but external access is explicitly optional [2], and it does make self-containment easier, because you can reuse resources across multiple pages, whereas HTML requires you to either duplicate the resources across multiple pages, or you have to build your thing as a single massive HTML page.
You can mandate them to be self-contained if you are building something that uses them. If you don't have control, then it's up to someone else to decide, and people wanted to be able to link remote resources.
"But html isn't style-agnostic" yes it is. CSS isn't style-agnostic. Instead of a markdown browser, how about a browser with a fixed stylesheet and no js? You don't even need a browser for that, that could just be a userscript that gets plugged into an existing browser. It'd break non-compliant websites that require javascript or custom css, but so would a markdown browser. Most people wouldn't write content for it, but most people wouldn't write content for a markdown browser either.
"But html is cluttered" it doesn't have to be. This is a valid webpage:
Personally, I prefer writing in markdown, but that's no reason to insert a markdown renderer into browsers. HTML can already be as sleek and readable as you want. If we added a new type of markup for anybody with a personal preference, we'd never stop.