Fossil Forum

Automatic Code Highlighting with Prism

(1.1) By Warren Young (wyoung) on 2022-03-29 20:42:48 edited from 1.0 [link] [source]

One of my public Fossil repos now does automatic code syntax highlighting using Prism, a particularly nice and modern example of the art. In this posting, I will show how I did it, and explain how you can do it with your repos.

Demos

How-To

Make these two simple additions to your Fossil skin:

  1. Add this one-liner to the Footer section:

    <script nonce="$<nonce>" src="/js/prism.js"></script>
    
  2. In the CSS section, append the contents of the block above the "DOWNLOAD CSS" button at the bottom of the download page. That CSS is dynamically generated from your selections above it, so you might want to adjust the settings in that link, which are suitable for my repo, but maybe not yours.

    I used the "Coy" Prism theme for the demos above, since it works well with the default Fossil skin, which I'm using with few adjustments on that site.

Now you need to arrange for Prism's Javascript to be served within the restrictions of the default CSP. There are a few ways:

  1. The way I did it is dependent on the fact that my public Fossils are served as virtual subdirectories of my otherwise static web site. This allowed me to create a /js subdirectory off the root of the static part of my public web site, and put the prism.js file there.

  2. If you don't have a front-end HTTP server that allows you to serve static files like this, you could store prism.js in your repo as unversioned content and point to it thus:

    <script nonce="$<nonce>" src="$<home>/uv/prism.js"></script>
    

Served File Size Bloat?

You might not be happy with the CSS edit above, which roughly doubles the default size of the Fossil-provided virtual style.css file.

This is a non-issue if you have a sufficiently smart front-end proxy in front of Fossil. With it, you can match the style.css virtual file by name and apply a far-future expires header on it, allowing that file to be cached indefinitely once pulled. Since Fossil gives a new version to the file each time it changes, there's no worry over stale CSS even with a multi-year cache expire time.

In this way, you can cause browsers to pull it once on first visit and then never need to pull it again until it changes, or they stop visiting often enough that the file stays in their cache.

You want to be doing this anyway, Prism or no. Properly tuned, a web site will pull such CSS files virtually instantaneously from cache, rather than over the World Wide Wait every time.

The code-in-pre Feature

If you do the above steps to your site, only Markdown fenced code blocks will be styled like this.

The second demo above relies on a new feature I've just added to Fossil on the code-in-pre branch. This feature causes Fossil to put <code class="language-EXT"> inside the <pre> tags it uses for /artifact and such, with EXT being the extension of the file content being shown. We rely on Prism to guess the correct syntax highlighting rules from that extension.
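
To make the `language-EXT` convention concrete, here is a minimal sketch of how such a class name could be derived from a file name. The helper name `languageClassFor` is made up for illustration, and this is not Fossil's actual C implementation; it only mirrors the behavior described above.

```javascript
// Hypothetical helper (not Fossil's real code): derive the class that
// would go on <code> from a file name, per the language-EXT convention.
function languageClassFor(fileName) {
  const base = fileName.split('/').pop();   // drop any directory part
  const dot = base.lastIndexOf('.');
  // No extension, a dotfile like ".profile", or a trailing dot: no class.
  if (dot <= 0 || dot === base.length - 1) return null;
  return 'language-' + base.slice(dot + 1);
}
```

Files without an extension, such as `Makefile`, get no class in this sketch, which is one reason extension-based detection cannot cover every case.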

Keep in mind that I'm writing this only minutes after checking in the first functional version of that feature. I have no reason to expect that it covers all cases. I'm also not wild about how it does a file name lookup in the Fossil SQL DB purely for this purpose; I was hoping Fossil had already gathered that info somehow above the HTML code I added to the Fossil source, but I couldn't see anywhere I could snag it, so I gave up and added my one-off SQL.

Future Directions

I'm toying with the idea of making this posting the seed of a new section in the docs, "Integrating Fossil with..." using a directory structure like www/int/syntax/prism.md, with the idea that the section will grow to encompass dozens or even hundreds of documents as people come up with local integrations. Between skin tweaks, /ext, and TH1, we should be able to fill this section quickly, contributors willing.

(2) By ckennedy on 2019-09-02 21:07:08 in reply to 1.0 [link] [source]

This is seriously cool Warren. I tried to get this working once with Fossil using some info I found online somewhere and couldn't get it to work at all. The above is quite simple and doable.

I would suggest the new section of the docs be called "Extending Fossil with..." rather than "Integrating Fossil with...". To my mind this fits better with the CGI extension feature as well as what is actually happening. We are extending Fossil with other tools/features.

Thanks.

(3) By anonymous on 2019-09-02 21:20:25 in reply to 1.0 [link] [source]

Very cool. Thanks for the work you have already done on the docs. This looks like a start to a new series of documents 8-).

IIRC the cgi root directory will just serve up files if they don't have the execute bit set. So you should also be able to:

  1. store prism.js in your CGI server extension directory. E.G. /home/fossil/cgi
  2. start fossil with the argument: -extroot: /home/fossil/cgi

and point to it with:

 <script nonce="$<nonce>" src="$<root>/ext/prism.js"></script>

The only question I have is whether the mime type served up by accessing the file this way would be correct and allow the code to be executed.

Sadly my server upgrade went sideways today, so my fossil repos are offline and I can't test.

(4) By Warren Young (wyoung) on 2019-09-02 23:34:37 in reply to 3 [link] [source]

start fossil with the argument: -extroot: /home/fossil/cgi

That prompted me to write "Serving Files Within the Limits," an extension to the default CSP doc, which adds that idea and expands on what we already covered. The overall message I hope readers get from that section is, "You probably don't have to override the default CSP."

the mime type served up

Does Fossil not reuse its internal MIME type guesser from the embedded docs feature for that?

I was surprised by the very notion that the MIME type even mattered when it came to running JS in modern browsers. Once upon a time, sure, but JS has wiped out all of the competition so that the only alternatives that survive are transpiled to JS to make them run.

According to one source, you don't need a MIME type as long as the HTML includes the now-nonstandard type="javascript" attribute. But, Fossil doesn't include that attribute, so for us, MIME type does apparently matter.

(5) By Warren Young (wyoung) on 2019-09-03 00:45:02 in reply to 1.0 [link] [source]

I've made some refinements to the JS for the above technique:

<script nonce="$nonce" src="$<home>/file/src/misc/prism.js?download"></script>
<script nonce="$nonce">
  Prism.languages.def = Prism.languages.tcl;
  Prism.languages.fc  = Prism.languages.bas = Prism.languages.basic;
  Prism.languages.ft  = Prism.languages.fortran;

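  (function() {
    // Strip trailing CR/LF so the closing </code> tag stays on the last line.
    function iscrlf(c) { return c == 10 || c == 13 }
    document.querySelectorAll("pre > code").forEach((e) => {
      var h = e.innerHTML, n = h.length;
      // Back up one character at a time so no real content is removed.
      while (n > 0 && iscrlf(h.charCodeAt(n - 1))) n--;
      if (n < h.length) e.innerHTML = h.substring(0, n);
    })
  })();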

  <th1>styleScript</th1>
</script>

  1. It shows use of an in-repo /file URL instead of an out-of-repo /js URL. Note the use of the TH1 variable $home, since the repo I did the work on is not serving the whole site, only one "subdirectory."

    (By the way, I tried a /raw URL first, as it's shorter, but that caused the JS file to be downloaded to disk. This might be the MIME type issue brought up elsewhere in this thread.)

  2. It shows document type aliasing for cases where Prism can't detect the proper type from the file name extension alone. Yeah, we've got FORTRAN IV highlighting now, baby!

  3. It automatically removes any CR or LF at the end of the file. A trailing line ending forces the closing </code> tag onto the next line in the HTML, so the <pre> wrapping it renders a faux blank line at the end of the highlighted code block. The inline JS above fixes that.

    Note that I've just inserted this code block into the one that normally calls styleScript from the Footer in the default skin.
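
The trailing-newline cleanup from item 3 can also be written as a small pure function, which is easier to test outside a browser. This is only a sketch: `stripTrailingNewlines` is a name invented here, and the DOM part assumes it runs in the skin's footer after the page content has been emitted.

```javascript
// Remove every CR or LF at the end of a string in one regex pass.
function stripTrailingNewlines(text) {
  return text.replace(/[\r\n]+$/, '');
}

// In a browser, apply it to each code block; skipped elsewhere (e.g. Node).
if (typeof document !== 'undefined') {
  document.querySelectorAll('pre > code').forEach((e) => {
    const cleaned = stripTrailingNewlines(e.innerHTML);
    // Only touch the DOM when something actually changed.
    if (cleaned !== e.innerHTML) e.innerHTML = cleaned;
  });
}
```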

(6) By Zlodo (achavasse) on 2019-09-03 08:28:40 in reply to 1.0 [link] [source]

The "code in pre" feature is nice. I use a secondary script to map file types to the languages to highlight, which is much clumsier, although it's an OK workaround until that branch is merged.

I'm wondering if, for the sake of accessibility, it might be nice to have premade scripts that you could run on a Fossil repo to automatically set up syntax highlighting or other similar JS-based features, like the graph generator mentioned elsewhere, so that a new user could easily add those features to their repo if they wanted.

(7) By anonymous on 2019-09-04 03:10:52 in reply to 1.0 [link] [source]

Looks nice!

I tried to see if the Artifact highlighting also works with line numbers on... It does not seem to kick in. Is it just my browser, or is that how it is?

(8) By Warren Young (wyoung) on 2019-09-04 05:09:01 in reply to 7 [link] [source]

Yup, that breaks it. The inner <code> tags aren't getting inserted in that case, which Prism requires. I'll look into fixing it, maybe tomorrow, unless someone beats me to it.

(9) By Warren Young (wyoung) on 2019-09-05 01:38:37 in reply to 8 [link] [source]

I've fixed the immediate symptom on the new code-in-pre-with-ln branch, but it's got a number of problems:

  1. If you just want line numbers, use the Prism line numbers plugin.

  2. If you're turning on line numbers to use Fossil's line highlighting feature — e.g. ?ln=42-69 — applying this branch's change causes Prism to override it, effectively breaking the Fossil feature.

  3. The line numbers emitted by Fossil are intermixed with the code, so that a syntax highlighter is likely to see them as syntax errors.

To fix all of this, you'd want a much more intelligent integration. The Fossil and Prism line numbering features would have to cooperate instead of fight, the line numbers would have to be outside the <code> block holding the actual code so they don't interfere, and there would have to be a way for Fossil's line highlighting code to integrate with Prism's syntax highlighting.
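
As a starting point for such an integration, the client side would need to read the requested range from the URL. The sketch below handles only the `?ln=42-69` form mentioned above, plus a bare single number as an assumption; Fossil may accept other forms that this does not cover, and `parseLineRange` is a hypothetical name.

```javascript
// Parse an "ln" query value such as "42-69" (or, by assumption, "42")
// into a range object. Returns null when absent or not a plain range.
function parseLineRange(search) {
  const value = new URLSearchParams(search).get('ln');
  if (value === null) return null;
  const m = /^(\d+)(?:-(\d+))?$/.exec(value);
  if (!m) return null;
  const start = Number(m[1]);
  const end = m[2] === undefined ? start : Number(m[2]);
  // Tolerate reversed bounds by swapping them.
  return start <= end ? { start, end } : { start: end, end: start };
}
```

In a skin, this would be called as `parseLineRange(document.location.search)`, and the result handed to whatever does the line highlighting.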

If someone wants to take that project on, you're welcome to it. It won't be me.

(10) By anonymous on 2019-09-05 06:01:43 in reply to 9 [link] [source]

Can Prism deal with statements being split across <code> spans?

If so, maybe this would work:

<pre>
     1  <code>line of code</code>
     2  <code>another line of code</code>
     3  <code>and more code</code>
</pre>

I know that's a lot of extra markup.

(11) By Warren Young (wyoung) on 2019-09-05 08:21:40 in reply to 10 [link] [source]

I assume that would cause it to lose context and thus create a bunch of false syntax errors. For example, how could it be expected to deal with this bit from Fossil's own source code:

    zCmd = mprintf("\"%s\" http --in \"%s\" --out \"%s\" --ipaddr 127.0.0.1"
                   " \"%s\" --localauth",
       g.nameOfExe, transport.zOutFile, transport.zInFile, pUrlData->name
    );

I'd suggest one of two other solutions:

<table>
  <tr>
    <td>
      <pre>
        1
        2
        3
      </pre>
    </td>
    <td>
      <pre><code>first line of code
      second line of code
      third line of code</code></pre>
    </td>
  </tr>
</table>

If table-based layouts are deemed too horrible, then there's a straightforward method involving CSS and <div> elements with the same basic content.

Either way, you'd have to style it so that the lines of code never wrap, even if it means horizontal scrolling.
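
For illustration, here is a rough sketch of generating that two-column markup from a file's text, so the highlighter sees one uninterrupted `<code>` block while the numbers live in their own column. The function names and the `numbered-code` and `gutter` class names are invented for this example, and `langClass` is assumed to be a trusted value such as `language-c`.

```javascript
// Escape text for safe inclusion in HTML element content.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Build a line-number column next to a single <code> block.
function renderNumbered(source, langClass) {
  // Normalize line endings and ignore one trailing newline.
  const lines = source.replace(/\r\n?/g, '\n').replace(/\n$/, '').split('\n');
  const gutter = lines.map((_, i) => String(i + 1)).join('\n');
  return '<table class="numbered-code"><tr>' +
    '<td><pre class="gutter">' + gutter + '</pre></td>' +
    '<td><pre><code class="' + langClass + '">' +
    escapeHtml(lines.join('\n')) + '</code></pre></td>' +
    '</tr></table>';
}
```

Because both columns are `<pre>` elements, lines do not wrap by default; keeping the two columns aligned still depends on giving them the same font and line height in CSS.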

(12) By anonymous on 2019-09-25 15:50:50 in reply to 9 [link] [source]

There was a patch submitted a while back that would integrate HighlightJS line numbering into Fossil by way of a new toggleable setting. Basically, the setting tells Fossil "do not actually do line numbering"; you then patch the theme to detect the &ln in the URL (with TH1), which causes the JS to start line numbering. The HighlightJS line numbering plugin was also modified to use Fossil's way of selecting line numbers.

I believe this was here: https://code.amlegion.org/hljs_line_numbers/doc/trunk/README.md

The patch mentioned is here:
https://www.fossil-scm.org/forum/forumpost/87b252fc31ebc840725fd6f395d00232261351eddee8570662a880d12e21a834

(13.1) By Warren Young (wyoung) on 2022-03-29 21:01:26 edited from 13.0 in reply to 1.1 [link] [source]

Fun hack using the above: force Prism to highlight Bash shell scripts when served via /file from an appropriate subdirectory of the repo, but only when they have the right shebang line in them:

<script nonce="$nonce">
  if ((document.location.href.indexOf('/file/bin') > 0) ||
      (document.location.href.indexOf('/file?name=bin') > 0)) {
    const pres = document.getElementsByTagName('pre');
    if (pres.length > 0) {
      const p = pres[0];
      if (p.innerHTML.indexOf('#!/bin/bash') != -1) {
        Prism.util.setLanguage(p, 'bash');
        Prism.highlightElement(p);
      }
    }
  }
</script>

Put that in your skin's footer, and suddenly those unstyled shell scripts will be stylin'.

While this code works for my immediate use case, I offer it more as a collection of ideas for mining. You might need it to work via /artifact, or you might need it to work with different file paths, or different languages, etc.

Requires that you have the "Bash" language feature enabled in your generated prism.js file.

(18) By Stephan Beal (stephan) on 2022-03-31 22:52:00 in reply to 13.1 [link] [source]

Fun hack using the above: force Prism to highlight Bash shell scripts

A potential improvement for:

if (p.innerHTML.indexOf('#!/bin/bash') != -1) 

would be:

if (p.innerText.indexOf('#!/bin/bash') === 0) 

That way it will only highlight files which begin with that text. innerText effectively strips out all of the HTML.
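
Taking that idea one step further, the shebang check could pick the language instead of hard-coding Bash. The following is a sketch under stated assumptions: `languageFromShebang` and the interpreter table are illustrative, and each language it returns must be enabled in your generated prism.js.

```javascript
// Illustrative table: interpreter name -> Prism language id.
const SHEBANG_LANGUAGES = new Map([
  ['bash', 'bash'], ['sh', 'bash'],
  ['python', 'python'], ['python3', 'python'],
]);

// Return a language id only when the text *begins* with a shebang line.
function languageFromShebang(text) {
  const firstLine = text.split('\n', 1)[0];
  const m = /^#!\s*(\S+)(?:\s+(\S+))?/.exec(firstLine);
  if (!m) return null;
  let interp = m[1].split('/').pop();
  // "#!/usr/bin/env bash" names the real interpreter as the first argument.
  if (interp === 'env' && m[2]) interp = m[2];
  return SHEBANG_LANGUAGES.get(interp) || null;
}
```

In the skin, the result of `languageFromShebang(p.innerText)` could replace the literal `'bash'` passed to `Prism.util.setLanguage`, with highlighting skipped when it returns `null`.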

(14.1) By Martin Vahi (martin_vahi) on 2022-03-31 13:07:12 edited from 14.0 in reply to 1.1 [link] [source]

Isn't there a problem that if You start to rely on a fancy, difficult to independently replicate, "feature-bloated" web browser "too much", then Fossil wiki becomes usable only with the few main-stream web browsers, which in turn limits the de facto portability of the Fossil?

For example, Mozilla, Google, and Microsoft will not start to port their products to very niche operating systems, to some extent even something like OpenBSD, which at least in the past used its own GCC branch that was not able to compile all upstream sources of some relatively mainstream software. The fancier the web browser, the more dependencies it probably has. Not just runtime dependencies, but also dependencies that exist only during build time.

In my 2022_03_31 view one of the main benefits of the Fossil as a wiki system is that it is very self-contained and portable. Just compile the C code and it pretty much works. During the few past months the question about the Fossil GUI portability has bugged me. For example, at one of the more recent Fossil versions the WYSIWYG-editor was outdated/thrown_out in favor of raw HTML. The flaw that I see here is that a Fossil-specific WYSIWYG-editor can enforce a Fossil specific document structure, document format, so that as technologies change, some other GUI that is not based on a web browser, can still render the Fossil wiki documents, but if the Fossil adopts HTML+CSS+JavaScript as its wiki document format, then the document format is totally outside of the control of the Fossil developers. A workaround to it might be that Fossil developers define some subset of the HTML+CSS+JavaScript that is allowed to be used at Fossil wiki pages, but as of 2022_03_31 I haven't noticed any such efforts.

A positive example of a document format that has withstood the test of time is UTF-8 text files. An even more reliable example is ASCII text files. A negative example of a document format is VRML: it was supposed to be like HTML, but for describing 3D scenes. There are some "hacks" for using VRML in 2022, but clearly it may not be relied upon. Even the old 90-ties closed source Microsoft Word document format seems to have better support in 2022 than VRML.

As of 2022_03_31 it seems to me that if there were some structure rules for Fossil wiki pages in place, then it would not even matter so much whether the Fossil wiki documents are in JSON, XML, YAML, HTML, whatever-other-structure-description-format. If some certain subset of HTML can be used for describing the structure then HTML is fine and the use of HTML+CSS+JavaScript allows one to benefit from modern web browser technology. What I'm dreaming about is that the Fossil wiki documents, even if they are in HTML+CSS+JavaScript, were convertible to different document formats without ever having to use any web browser for rendering. A lot like LaTeX can be converted to PDF, DVI, HTML, etc. With WYSIWYG editors it should be possible to create a solution where the wiki page is described fully in JSON/XML/YAML that resides in an HTML comment and then the rest of the HTML might be totally "free-form", like, whatever the latest web browsers support.

It's that lack of a guaranteed document format that has made me to hesitate a little bit to rely on any wiki systems recently, including Fossil. Right now I tend to gravitate more towards a plain folder with loose files in there and an IDE for editing some relatively simplistic, old-style, HTML. The beauty of an IDE is that it is possible to move an HTML-file from one folder to another so that the links in all of the project files get updated automatically. IDEs get replaced over time, but at least the documents stay the same, a lot like UTF-8 text files. Text-editors change, UTF-8 text files stay the same. On top of that there's a possibility to use code generation.

Basically, what I am currently yearning for is 90-ties simple HTML, with the exceptions that in addition to images there would be the possibility to use videos and sounds (OGG, MP3, whatever the format) and the dumb lack of consistency of 90-ties HTML were eliminated (I'm referring to the mess with HTML tables spec and related CSS). The Project Gemini does not qualify, because its document format is too limited for my taste and they also seem to put a lot of effort into the network protocol instead of being focused on just the document format. In my 2022_03_31 opinion network protocols vary in time and they should be kept separate from document formats. For example, there are Tor IPv4 overlay network, ZeroNet IPv4 overlay network, Beaker IPv4 overlay network, I2P IPv4 overlay network, IPFS IPv4 overlay network, Freenet IPv4 overlay network, which is a true classic in its class, and even some really old networks like FidoNet. On top of that there might be other, Bluetooth-based, non-overlay networks like the Serval and even the radio amateurs might have some repeaters in place.

One of the most important things that I've learned by studying the different networks is that the task of addressing network nodes and the task of routing traffic in the network are 2 totally different tasks. A lot like with house addresses in a city and the task of finding a way to get from one house to another. Documents, including HTML, should be seen as just a payload, the "goods of a truck that moves from one house to another". With that kind of view a search engine is useful only for indexing local documents and any google-like functionality can be achieved only by collaboration of different search engine entities, a lot like the open source YaCy and some closed source projects try to do.

Fossil is related to this long text by the fact that it is also a wiki, a system for assembling document collections. Hence all the fundamental issues that come with document formats and document processing(searching, editing, transportation, physical storage, privacy, etc.)

Thank You for reading my post.

(15) By Stephan Beal (stephan) on 2022-03-31 13:32:58 in reply to 14.1 [link] [source]

if You start to rely on a fancy, difficult to independently replicate, "feature-bloated" web browser "too much", then Fossil wiki becomes usable only with the few main-stream web browsers, which in turn limits the de facto portability of the Fossil?

This is 2022, not the 1990s. Fossil's use of JS pales in comparison to most sites people use nowadays. Even my beloved BoardGameGeek.com serves more than 1MB of JS.

The flaw that I see here is that a Fossil-specific WYSIWYG-editor can enforce a Fossil specific document structure, document format, so that as technologies change, some other GUI that is not based on a web browser, can still render the Fossil wiki documents,

Yet fossil never did so, nor did we have any collective intentions of making it do so. The old WYSIWYG editor was 100% generic but built on long-since-deprecated JS APIs so was going to die eventually, anyway.

but if the Fossil adopts HTML+CSS+JavaScript as its wiki document format, then the document format is totally outside of the control of the Fossil developers.

That's a misunderstanding. Fossil's wiki formats are well-defined plain text and are essentially static, with new features being added only rarely. The front end used to edit them is independent of the wiki data. You can edit your pages in emacs using the wiki command (as opposed to the web interface) if you like, 100% free of HTML, CSS, and JS.

It's that lack of a guaranteed document format that has made me to hesitate a little bit to rely on any wiki systems recently, including Fossil.

That's a misunderstanding of the distinction between the wiki page content and the editor. The three document formats are fixed and will never be significantly overhauled in ways incompatible with older documents. New features may be added and bugs may be fixed, but old docs will always render in newer versions of fossil.

A positive example of a document format that has withstood the test of time is UTF-8 text files.

That is precisely what fossil wiki pages are except that they live in a db instead of the filesystem. The wiki command can export them to files and re-import them, so arbitrary editors may be used.

A workaround to it might be that Fossil developers define some subset of the HTML+CSS+JavaScript that is allowed to be used at Fossil wiki pages, but as of 2022_03_31 I haven't noticed any such efforts.

See this page for the guidelines on that.

Note, also, that we write our own JS, we don't import 3rd-party libraries, so we're not in a situation that a 3rd-party update can break fossil or force us to rewrite for their upgrades.

Basically, what I am currently yearning for is 90-ties simple HTML,

Then you are respectfully asked to not hold your breath waiting for it to come to (well, return to) fossil ;). Fossil is, first and foremost, developed to be useful to its own developers, and that sometimes calls for modern JS code. Without exception, though, such JS only acts as a front-end, and in no way applies any changes to or limitations on the data in the back-end. Those data are all 100% UI/interface-agnostic.

(16.1) By Martin Vahi (martin_vahi) on 2022-03-31 20:01:09 edited from 16.0 in reply to 15 [link] [source]

Thank You for the very useful answer. I tried the wiki command and it seems to me that the wiki command actually solves my problem. With that wiki command I can use exactly the subset of HTML that I was dreaming of using and I can use it by using an IDE of my choice.

Literally perfect, at least fits my current (2022_03_31) dreams about a perfect wiki exactly. Talking about perfection, there is the issue of fsck erasing the newest copy of a fossil repository, but I suspect that NilFS(NilFS2) is the answer/workaround to that problem.

(17) By Warren Young (wyoung) on 2022-03-31 20:06:32 in reply to 14.1 [link] [source]

If you run prism.js in a browser that chokes on the code for any reason, the contents of your <pre><code> blocks will simply remain unstyled. It's a purely cosmetic improvement added to the presentation after it's already rendered in the default colors.
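
One caveat applies to inline skin script that touches the `Prism` global directly, such as the alias assignments earlier in this thread: if prism.js never loads at all, a bare `Prism.languages...` reference throws, which stops the rest of that inline script. A defensive variant might look like the sketch below; `addAliases` is a hypothetical helper, not part of Prism.

```javascript
// Register extension aliases only if Prism actually loaded.
// `prism` is passed in so the function can be exercised without a browser.
function addAliases(prism, aliases) {
  if (!prism || !prism.languages) return 0;  // Prism missing: page stays unstyled
  let added = 0;
  for (const [alias, target] of Object.entries(aliases)) {
    // Skip targets that were not included in the generated prism.js.
    if (prism.languages[target]) {
      prism.languages[alias] = prism.languages[target];
      added++;
    }
  }
  return added;
}

// Possible call from the skin footer; a no-op when prism.js failed to load.
addAliases(typeof Prism === 'undefined' ? null : Prism,
           { def: 'tcl', fc: 'basic', bas: 'basic', ft: 'fortran' });
```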

If even the fact of the JS running to no useful end bothers you, you can block it on a per-script per-site basis with something like NoScript.