Further improve support for KaTeX

yyang
Hello, Prince works nicely with KaTeX, but the spacing is still not perfect. For example, $A \in a$ renders too tightly compared with Chrome's result, as if the KaTeX-generated "mspace" spans were discarded.

pjrm
mspace does render in Prince in at least one example I checked. Could you please post the xml that KaTeX generates for the above example?
yyang
Sure, please see the attachment. The KaTeX version is 0.16.4.
  1. demo.zip (16.6 kB)
yyang
By the way, KaTeX needs to be slightly patched to work seamlessly with Prince, including for String.prototype.startsWith() and Object.freeze(), which is the topic of this post.
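For readers wondering what such a patch might look like, here is a minimal sketch of the two shims, assuming they are loaded before the KaTeX script. The names startsWithShim and freezeShim are hypothetical and are not taken from the patched package in this thread; whether the shims are needed at all depends on the Prince version in use.

```javascript
// Hypothetical shims, not the actual patch from this thread.
// Load before the KaTeX script so KaTeX finds the methods it expects.

// Rough equivalent of String.prototype.startsWith(search, position).
function startsWithShim(search, position) {
    var text = String(this);
    var needle = String(search);
    // Clamp the start index into [0, text.length]; missing or NaN means 0.
    var start = position > 0 ? Math.min(Math.floor(position), text.length) : 0;
    return text.slice(start, start + needle.length) === needle;
}

// No-op stand-in for Object.freeze: returns the object unchanged.
// It does NOT make the object immutable; it only keeps callers from failing.
function freezeShim(obj) {
    return obj;
}

// Install only where the host engine lacks the native versions.
if (typeof String.prototype.startsWith !== 'function') {
    String.prototype.startsWith = startsWithShim;
}
if (typeof Object.freeze !== 'function') {
    Object.freeze = freezeShim;
}
```

The conditional install leaves engines that already provide both methods untouched.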

Edited by yyang

pjrm
When running simply ‘prince demo/katex.html’, I get the attached, spaced, rendering, which matches what I get from Firefox, and differs from both the prince.pdf and chrome.pdf from your demo.zip. Could you please give the full input and command line for reproducing, just so we're looking at the same thing?
  1. katex.pdf (24.2 kB)
yyang
Hi, the provided snippet is what you asked for, not an original HTML snippet. You should link to the KaTeX package in the HTML and enable JavaScript when processing it. Otherwise the math expression will be rendered twice, and the KaTeX-specific font won't be used, as shown in your katex.pdf. I'll upload a full example shortly after.
yyang
Hi, I've come up with a self-contained minimal demo for you to try again. Just execute run.bat (on Windows) or run.sh (on Linux), and compare the output with Chrome's Save as PDF result. The provided KaTeX package is patched just a little from v0.16.4 to be compatible with Prince.
  1. demo2.zip (958.1 kB)
pjrm
Thanks, and I'm sorry I wasn't clear in asking for "the xml that KaTeX generates for the above example". However, it looks as if what you sent at first might be helpful in diagnosing the problem: the first version renders with a spaced ‘∈’, while the second version (which has very different HTML: e.g. no mathml at all, and just a TeX expression to be processed by javascript) renders without spacing the ‘∈’. This suggests that the problem could be with the client-side javascript that converts the TeX.

The fact that the first version you posted did render well spaced also suggests a workaround of generating the MathML version in advance, and feeding Prince just the MathML without the javascript. This will also be quicker to render (assuming the initial generation of MathML can be put in a makefile or script).
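A minimal sketch of that pre-generation step, assuming the source HTML marks inline math with $...$ delimiters (an assumption; demo2.html may be structured differently). The function name prerenderInlineMath is hypothetical, and the renderer is passed in so that the sketch has no dependencies.

```javascript
// Hypothetical build-time helper: replace each inline $...$ span with
// pre-rendered markup, so Prince can be run without client-side JavaScript.
// `renderTeX` is any function mapping a TeX string to a markup string.
function prerenderInlineMath(html, renderTeX) {
    // Match a single-line run of non-$ characters between two $ signs.
    // A function replacement is used so that "$" sequences in the rendered
    // markup are not treated as special replacement patterns.
    return html.replace(/\$([^$\n]+)\$/g, function (match, tex) {
        return renderTeX(tex);
    });
}
```

With the katex package installed under Node, the renderer could be function (tex) { return katex.renderToString(tex, { output: 'mathml' }); }, with the result written to a file that is then passed to Prince. This simple pattern would also match literal dollar signs in ordinary text, so real input may need stricter delimiters.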

For anyone wanting to look at non-minimized versions of demo2.html, try doing git clone 'https://github.com/KaTeX/KaTeX'
yyang
Hi, thanks for your investigation! But the first version was a snippet of the DOM tree (e.g. from the Elements panel of Chrome DevTools), not part of the original HTML source. The DOM is what Prince sees after JavaScript is executed. Please try again and you'll get exactly the same result as mine. (To be sure, I downloaded both versions and performed a byte-for-byte comparison just now.)

Update: Here is a screenshot of the DOM tree for the second version, rendered in the Guest mode of Chrome 111.0.5563.147.

Edited by yyang

pjrm
Until someone looks into what part of the javascript isn't supported, here's a hack based on the workaround I mentioned, using wkhtmltopdf instead of interactive use of DevTools:
wkhtmltopdf --run-script 'console.log("\n<!DOCTYPE html>\n" + document.body.parentNode.outerHTML);' --log-level warn --debug-javascript --enable-javascript --enable-local-file-access demo.html deleteme.pdf 2>&1 | sed 1d > outer.html
prince -j outer.html
rm deleteme.pdf


It should work for many other js libraries too. No doubt others have written more refined versions of the above.

If demo.html and its dependencies were available on a web server, then the --enable-local-file-access flag could be dropped. I don't see a way of suppressing pdf output; using a dedicated javascript interpreter instead of wkhtmltopdf might be better. Use of mktemp --suffix=.pdf in place of deleteme.pdf (and/or mktemp --suffix=.html in place of outer.html) might also be beneficial.
  1. outer.pdf (8.9 kB)

Edited by pjrm

yyang
That's wonderful. Thanks a lot for your help!
ldbeth
It seems that in the HTML produced in Prince, the mspace spans are misplaced.
<span class="base">
  <span class="strut" style="height:0.7224em;vertical-align:-0.0391em;"></span>
  <span class="mspace" style="margin-right:0.2778em;"></span>
  <span class="mspace" style="margin-right:0.2778em;"></span>
  <span class="mord mathnormal">A</span>
  <span class="mrel">∈</span>
</span>
<span class="base">
  <span class="strut" style="height:0.6889em;"></span>
  <span class="mord mathbb">N</span>
</span>
</span>


Usually it is the order of the mspace spans around the math operator that has been shuffled.

I think it is very likely that the mspace is created by the makeGlue function defined at https://github.com/KaTeX/KaTeX/blob/347e57858f50eb514efe90eebe5df9dee98003e2/src/buildCommon.js#L629

and called at

https://github.com/KaTeX/KaTeX/blob/347e57858f50eb514efe90eebe5df9dee98003e2/src/buildHTML.js#L129

Let's see if someone can confirm if this is the cause.
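To make the symptom easier to check on other equations, here is a rough sketch that counts the mspace spans sitting at the front of a "base" span, directly after the strut. The function name leadingGlueCount is hypothetical, and the "expected" ordering in the example is an assumption about what a correctly spaced rendering contains, not something confirmed in this thread. A leading space can also be legitimate (for instance when the TeX source starts with an explicit space command), so a non-zero count is only a hint.

```javascript
// Hypothetical diagnostic. `childClasses` lists the class attribute of each
// child of one "base" span, in document order.
// Returns how many mspace spans come before the first real symbol.
function leadingGlueCount(childClasses) {
    var count = 0;
    for (var i = 0; i < childClasses.length; i++) {
        var tokens = childClasses[i].split(/\s+/);
        if (tokens.indexOf('strut') !== -1) {
            continue; // the strut only sets line height, so skip it
        }
        if (tokens.indexOf('mspace') !== -1) {
            count++; // glue found before any symbol
            continue;
        }
        break; // first real symbol reached, stop counting
    }
    return count;
}

// Order seen in the Prince dump above: both glue spans precede "A".
var princeOrder = ['strut', 'mspace', 'mspace', 'mord mathnormal', 'mrel'];
// Assumed order for a correctly spaced rendering: glue around the relation.
var assumedOrder = ['strut', 'mord mathnormal', 'mspace', 'mrel', 'mspace'];

console.log(leadingGlueCount(princeOrder), leadingGlueCount(assumedOrder));
```

In a page script, the class lists could be collected from the className of each child element of a base span.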
yyang
@ldbeth Hi, sorry for my late reply. I guess you've identified the cause. I get the same result using the "complete" event:
Prince.addEventListener('complete', () => {
    console.log(document.body.innerHTML);
});

And I've tried again with an equation with more mspaces. Clearly they are stacked on the left side of the equation (you may need to open the image in a new tab for a full size):

@pjrm Could you take a look when time allows? I think Prince works well with unpatched KaTeX up to v0.16.9. With this issue fixed, it would be nearly perfect.

Edited by yyang