
It's even possible to replace (Xe)LaTeX with WeasyPrint¹, a Python HTML-to-PDF converter. It supports two-column layouts via CSS, automatic CSS hyphenation, CSS page counters, and embedded SVGs. I just needed an HTML header with CSS in the markdown file.

    $ pandoc --filter pandoc-citeproc --csl ieee.csl --bibliography=paper.bib --smart --normalize -f markdown+multiline_tables+inline_notes -t html5 -V margin-top:0.5in -V margin-bottom:0.5in -V margin-left:0.5in -V margin-right:0.5in -o output.html input.md
    $ python3 -c "from weasyprint import HTML; HTML('output.html').write_pdf('output.pdf', presentational_hints=True)"

For LaTeX-style math equations I added mathjax-pandoc-filter² as a filter to the pandoc args:

    --filter ~/node_modules/.bin/mathjax-pandoc-filter -Mmathjax.centerDisplayMath -Mmathjax.noInlineSVG

¹ https://weasyprint.org/
² https://github.com/lierdakil/mathjax-pandoc-filter
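
For anyone wondering what such an HTML header with CSS might look like, here is a rough sketch in Python that just builds the `<style>` block as a string. The selectors, sizes, and the `html_header` helper are my own assumptions, not the setup used above; the CSS features are the ones mentioned (two columns, automatic hyphenation, page counters).

```python
# Hypothetical print stylesheet; every value here is illustrative.
PRINT_CSS = """
@page {
  size: A4;
  margin: 0.5in;
  /* page counter in the bottom margin box */
  @bottom-center { content: counter(page) " / " counter(pages); }
}
body {
  columns: 2;          /* two-column layout */
  column-gap: 0.25in;
  text-align: justify;
  hyphens: auto;       /* usually needs a document language to be set */
}
img, svg { max-width: 100%; }
"""


def html_header(css: str = PRINT_CSS) -> str:
    """Wrap the stylesheet in a <style> element (hypothetical helper)."""
    return f"<style>\n{css.strip()}\n</style>\n"


if __name__ == "__main__":
    # Print the block so it can be pasted at the top of the markdown file.
    print(html_header(), end="")
```

The output can be pasted as raw HTML at the top of the markdown source. Whether automatic hyphenation kicks in probably depends on the document declaring a language.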


This is a very interesting (open source) project that I didn’t know about; thank you for mentioning it.

But it doesn’t replace LaTeX, as it doesn’t produce the same results. A glance at the sample documents reveals the ugly typography resulting from the word-processing layout strategy employed in web browsers. This is confirmed in the documentation. So this could be useful if you have an existing set of HTML pages that you need to convert to PDFs, but, if you’re starting a project where you want to produce both HTML and PDF, this should not be part of the solution.


It looks nice for graphics-heavy documents, but the quality of the typographical output doesn't come close to LaTeX with microtype. I do wish that LaTeX had something similar to CSS, however. The separation of markup and styling makes the web easier to use for complex layouts, which are not generally TeX's strong suit.


I can't tell the difference between this layout quality and LaTeX. What are you noticing?


The first things that jump out are the large and uneven gaps between words and the “color” variations among paragraphs. What I mean by the “word-processing layout strategy” is the algorithm where, when you run out of space on a line, you simply break the line at the end of the previous word, fill up the space (for justified text) by expanding the spaces between words, and begin the next line. When you get to the end of the paragraph you go on to the next one. The TeX layout engine, in contrast, makes several passes over each paragraph, adjusting the line breaking (including hyphenation) in order to optimize its appearance (which includes such things as trying to avoid successive hyphenated lines); then, when the page is set, it goes over the entire page to try to equalize the density, or color, among paragraphs.
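
To make the difference concrete, here is a toy sketch in Python. It is not TeX's actual algorithm: there is no stretchable glue, no hyphenation, and no demerits, and the squared-slack cost is my own assumption. It only contrasts a greedy first-fit breaker with one that considers the whole paragraph at once. It assumes monospaced text and that every word fits within the line width.

```python
import math


def greedy_breaks(words, width):
    """Word-processor style: fill each line, break when the next word won't fit."""
    lines, current, length = [], [], 0
    for word in words:
        needed = len(word) if not current else length + 1 + len(word)
        if current and needed > width:
            lines.append(current)
            current, length = [word], len(word)
        else:
            current.append(word)
            length = needed
    if current:
        lines.append(current)
    return lines


def line_cost(line, width, is_last):
    """Squared leftover space; the last line of a paragraph is free."""
    length = sum(len(w) for w in line) + len(line) - 1
    if length > width:
        return math.inf
    return 0 if is_last else (width - length) ** 2


def optimal_breaks(words, width):
    """Whole-paragraph optimization via dynamic programming."""
    n = len(words)
    best = [math.inf] * (n + 1)  # best[i]: minimum cost of setting words[i:]
    nxt = [n] * (n + 1)          # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            cost = line_cost(words[i:j], width, j == n)
            if cost == math.inf:
                break  # adding more words only makes the line longer
            if cost + best[j] < best[i]:
                best[i], nxt[i] = cost + best[j], j
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


def total_cost(lines, width):
    last = len(lines) - 1
    return sum(line_cost(line, width, k == last) for k, line in enumerate(lines))


if __name__ == "__main__":
    words = "aaa bb cc ddddd".split()
    for breaker in (greedy_breaks, optimal_breaks):
        lines = breaker(words, 6)
        print(breaker.__name__, [" ".join(l) for l in lines], total_cost(lines, 6))
```

On this tiny input the greedy breaker leaves "cc" alone on a nearly empty line, while the optimizing one accepts a slightly shorter first line to even things out. That is the kind of trade-off a whole-paragraph approach can make and a line-at-a-time approach cannot.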


Maybe you already knew about it, but the microtype package improves the appearance of your documents even more: https://ctan.org/pkg/microtype


Skimming the examples, the typographical quality is that of a webpage (which is to be expected), miles below TeX-quality typesetting.


I think it really depends on the font you use and the CSS rules you apply. LaTeX needs tweaking, too, even with a good template.

E.g., if your font supports it, you can enable ligatures:

    text-rendering: optimizeLegibility;


Ligatures don't make good typesetting. In fact, I'd expect a text to have ligatures enabled by default; that's not something to be proud of in any way.

Typesetting is about how the words are placed in the given space, how they're broken up, how the spacing is done so that gaps don't line up between lines, how punctuation and figures are handled, etc. Browsers do none of that, and one shouldn't expect them to, because that's not their job. A browser's job is to present content fast, not to spend minutes at a time figuring out how to do it as beautifully as possible; that's what TeX is for. TeX and a browser's rendering engine are different tools for different jobs, and thinking one could achieve the same result as the other is not realistic at present.


Pandoc can even free you of the second step by using WeasyPrint as PDF engine:

    pandoc --pdf-engine=weasyprint -t html …



