Fossil Forum

TH1: Confusion about HTMLized output

(1) By Florian Balmer (florian.balmer) on 2018-10-12 15:20:16

Consider this TH1 snippet in the Header, Footer, CSS or JavaScript part of a skin template:

<th1>
puts "puts: &<>\"'"
html "\nhtml: &<>\"'"
puts [ htmlize "\nputs\[htmlize\]: &<>\"'" ]
set test "&<>\"'"
</th1>
Outside TH1 block: test = $test

Output:

puts: &amp;&lt;&gt;&quot;&#39;
html: &<>"'
puts[htmlize]: &amp;amp;&amp;lt;&amp;gt;&amp;quot;&amp;#39;
Outside TH1 block: test = &<>"'

It seems that puts already does what I would expect html to do, while html behaves like a raw print function that writes its argument unchanged. Consequently, the output of puts [htmlize ...] is HTML-escaped twice.

The document The TH1 Scripting Language states:

  • puts STRING: Outputs the STRING unchanged.
  • html STRING: Outputs the STRING escaped for HTML.
  • htmlize STRING: Escape all characters of STRING which have special meaning in HTML. Returns the escaped string.

I must admit I don't understand this behavior. Is it possible that the implementations of puts and html in the underlying scripting engine were swapped by mistake?

However, as several skin templates seem to rely on html, changing this may require a lot of careful testing.

TH1 variables in skin templates seem to be used mostly to construct hyperlinks, and the standards seem to allow both non-HTMLized and HTMLized forms:

<a href="url&param">
<a href="url&amp;param">
<script> console.log("url&param"); /* → url&param */ </script>
<script> console.log("url&amp;param"); /* → url&amp;param */ </script>

Problems might occur with string variables containing quotation marks, but this doesn't seem to be a common case.
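
To illustrate the quotation-mark case, here is a minimal JavaScript sketch. It is not TH1 and not Fossil code: escapeHtml is a hypothetical helper written for this example, and the value is made up. It only shows how an unescaped double quote ends the attribute value early, while the escaped form stays inside it.

```javascript
// Hypothetical helper (not part of Fossil): escape the five characters
// that are special in HTML text and attribute values.
function escapeHtml(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return s.replace(/[&<>"']/g, (ch) => map[ch]);
}

// Made-up value containing double quotes.
const value = 'search?q="fossil"';

// Inserted raw, the first inner quote closes the href attribute too early.
const rawLink = '<a href="' + value + '">';

// Inserted escaped, the attribute value is delimited by exactly two quotes.
const safeLink = '<a href="' + escapeHtml(value) + '">';

console.log(rawLink);  // <a href="search?q="fossil"">
console.log(safeLink); // <a href="search?q=&quot;fossil&quot;">
```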

(2) By Richard Hipp (drh) on 2018-10-12 18:08:05 in reply to 1

The "puts" command escapes its output so that it is safe to include it in the middle of HTML. The "html" command is like "puts" except it does raw output, with no escaping.

The "htmlize" command escapes its argument so that it is safe to output as part of a webpage.

Perhaps the names were not well chosen. But they are what they are so we need to live with them for historical compatibility.
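
The behavior described in this reply can be modeled with a small JavaScript sketch. This is only an illustration under the assumption that escaping covers the five characters seen in the output of post (1); it is not Fossil's implementation, and makePage is a hypothetical helper. The function names mirror the TH1 commands for readability.

```javascript
// Simplified model of the TH1 commands discussed in this thread.
// Assumption: only & < > " ' are escaped, as in the output of post (1).
const ENTITIES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// htmlize: return the escaped string without writing anything.
function htmlize(s) {
  return s.replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

// Hypothetical helper: a "page" with its own output buffer.
function makePage() {
  const out = [];
  return {
    puts: (s) => { out.push(htmlize(s)); }, // escapes, then writes
    html: (s) => { out.push(s); },          // writes raw, no escaping
    text: () => out.join(""),
  };
}

const page = makePage();
const test = "&<>\"'";
page.puts("puts: " + test);
page.html("\nhtml: " + test);
page.puts(htmlize("\nputs[htmlize]: " + test)); // escaped twice
console.log(page.text());
```

In this model the third line comes out double-escaped for the same reason as in the original example: htmlize escapes once, and puts escapes the result again.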

(3) By Florian Balmer (florian.balmer) on 2018-10-12 20:46:51 in reply to 2

A better example would have been:

<th1>
set test "&<>\"'"
puts "puts: $test"
html "\nhtml: $test"
puts [ htmlize "\nputs\[htmlize\]: $test" ]
</th1>
Outside TH1 block: test = $test

Resulting in:

puts: &amp;&lt;&gt;&quot;&#39;
html: &<>"'
puts[htmlize]: &amp;amp;&amp;lt;&amp;gt;&amp;quot;&amp;#39;
Outside TH1 block: test = &<>"'

My initial assumption was that the html command would escape only the substituted variables, so that they fit with the rest of the literal (already escaped) HTML string.

But from the perspective of an HTML/CGI-oriented scripting language (at least in the context of Fossil skinning), the html command could also be read as "outputs HTML, already escaped", and the puts command as "outputs text, needs escaping".
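
Under that reading, literal markup goes through html, and only the variable part is escaped explicitly. The following JavaScript sketch models the idea; it is an illustration, not TH1 and not Fossil's code, and html, puts and htmlize here are stand-ins that assume the five-character escaping seen in the output above.

```javascript
// Stand-ins for the TH1 commands (assumed behavior, not Fossil's code).
function htmlize(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return s.replace(/[&<>"']/g, (ch) => map[ch]);
}

const out = [];
const html = (s) => { out.push(s); };          // "outputs HTML, already escaped"
const puts = (s) => { out.push(htmlize(s)); }; // "outputs text, needs escaping"

const test = "&<>\"'";

// Markup is written raw; only the variable is escaped.
html("<b>" + htmlize(test) + "</b>\n");

// Everything is treated as text, so the <b> tags are escaped as well.
puts("<b>" + test + "</b>\n");

console.log(out.join(""));
// <b>&amp;&lt;&gt;&quot;&#39;</b>
// &lt;b&gt;&amp;&lt;&gt;&quot;&#39;&lt;/b&gt;
```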

I may revert the related changes to the hamburger menu customization template, as it looks somewhat simpler without the explicit <TH1> blocks.