Fossil User Forum

TH1 taint observations
Login

TH1 taint observations

TH1 taint observations

(1.1) By MG (mgr) on 2025-04-23 12:38:31 edited from 1.0 [source]

Following cleanup of lots of TH1 code, I noticed that comparison like 'eq' do differentiate between a 'pure' string like eg "abc" and the 'tainted' version thereof. In other words, if one want to check for some value in user input/query results, one needs to do

if {$somevar eq [taint "some string"]} {
  #do something
}

or

if {[untaint $somevar] eq "some string"} {
  #do something
}

Came as a surprise, but makes sense.

In one instance, my code cobbles together a querystring ($qr) by means of convoluted if/elseif/else, set "text $f text", for loop and maybe some $::var usage (in a proc), but without untaint other then the using it in some if {[untaint $xx] eq "yy"} ...

And the resulting querystring can easily be broken by a user input of say ', but does not trigger on the call

query $qr {...}

Will try to construct a minimal example tomorrow.

(2) By Jörgen Kosche (jkosche) on 2025-04-23 12:49:45 in reply to 1.1 [link] [source]

I noticed that comparison like 'eq' do differentiate between a 'pure' string like eg "abc" and the 'tainted' version thereof.

It does? That is kind of surprising. Regular expressions work, the current code for ticket view contains:

if [regexp $tagpattern $foundin] { ... do something
Here $foundin should be tainted. But it works as expected.

If comparisons with eq stop working that might possibly break existing scripts. Or do I think wrong here?

Does setting the vuln-report to off change this behaviour? I should test once at home (currently in the office at work).

(3) By Richard Hipp (drh) on 2025-04-23 12:52:54 in reply to 1.1 [link] [source]

Please try again with check-in 2025-04-23T12:51Z or later.

(4) By MG (mgr) on 2025-04-24 07:26:10 in reply to 3 [link] [source]

Confirmed that 'eq' now recognizes equal strings even when they do have differing taintedness

(5) By MG (mgr) on 2025-04-24 07:38:35 in reply to 1.1 [link] [source]

After some debugging, the 'culprit' in my case is string range, that looses the tainted status.

As does lindex and probably some more, thats the ones I had to test in my case.

The following do as well, but that seems to be right/on purpose:

  • htmlize
  • httpize
  • markdown
  • string length
  • expr
set test [taint "ABCDEFGH abcdef"]
set testnum [taint 123]

#string range -> untaint
set test_range [string range $test 3 5 ]
puts $test_range 
puts [string is tainted $test_range]

#string length-> untaint
set test_length [string length $test]
puts $test_length 
puts [string is tainted $test_length ]

#lindex -> untaint
set test_lindex [lindex $test 1]
puts $test_lindex
puts [string is tainted $test_lindex]

#markdown-> untaint
set test_markdown [markdown $test]
puts $test_markdown
puts [string is tainted $test_markdown]

#htmlize-> untaint
set test_htmlize [htmlize $test]
puts $test_htmlize
puts [string is tainted $test_htmlize]

#httpize-> untaint
set test_httpize [httpize $test]
puts $test_httpize
puts [string is tainted $test_httpize]

#expr -> untaint
set test_expr [expr $testnum+999000 ]
puts $test_expr 
puts [string is tainted $test_expr ]

(6) By MG (mgr) on 2025-04-24 07:59:13 in reply to 5 [link] [source]

some more

  • string repeat keeps taintedness, makes sense
  • foreach looses taintedness, probably wrong
  • lappend looses taintedness, probably wrong, not only looses taintedness on returned string but also on the variable, as it modifies the variable itself ...
set test [taint "ABCDEFGH abcdef"]

#string repeat
set test_repeat [string repeat $test 3]
puts " repeat: $test_repeat "
puts [string is tainted $test_repeat ]

#foreach
foreach el $test {
  puts " ELEMENT: $el "
  puts [string is tainted $el ]
}

#lappend
lappend test xyz
puts " append: $test "
puts [string is tainted $test ]

(7) By Richard Hipp (drh) on 2025-04-24 11:39:50 in reply to 6 [link] [source]

Thanks for testing! See check-in 2025-04-24T11:18Z for updates.

(8) By MG (mgr) on 2025-04-24 12:57:53 in reply to 7 [link] [source]

Thank you. Works so far, some more missing htmlize and needed SQL sanitation in my code came up this way.

Did not know that string trim did exist (not in TH1 docs).

Neither are the new taint and untaint so far, maybe a section about all this 'tainting' would make sense.

(9.2) By MG (mgr) on 2025-04-24 16:44:01 edited from 9.1 in reply to 7 [link] [source]

More questions:

  • Shouldn't cgiHeaderLine, combobox and maybe copybtn also log/error on tainted texts?
  • Is the comment below from src/th_main.c concerning "outside of <th1> </th1>" documented somewhere? Could not find in TH1 docs.
** TH1 variables are $aaa or $<aaa>.  The first form of
** variable is literal.  The second is run through htmlize
** before being inserted.
  • Shouldn't the first form $aaa when used "outside of <th1> </th1>" also log/error on tainted texts? It does not at the moment.
  • Shouldn't getParameter return tainted strings (at least if not the DEFAULT)?

(10) By MG (mgr) on 2025-04-25 13:42:12 in reply to 9.2 [link] [source]

Reading through the code (th_main.c), I see that combobox does htmlize already, did not see it for copybtn but testing shows it does as well.

And getParameter got adjusted by 6a6b85448c736142.

(11) By MG (mgr) on 2025-04-25 14:55:13 in reply to 9.2 [link] [source]

testing cgiHeaderLine:

set test [taint "aha\nSomeBadHeader: danger"]
cgiHeaderLine "Some: $test\n"

emits (without taint-complaining) the following headers:

HTTP/1.0 200 OK
Date: Fri, 25 Apr 2025 14:47:46 +0000
Connection: close
X-UA-Compatible: IE=edge
Cache-control: no-cache
Content-Security-Policy: default-src 'self' data:; script-src 'self' 'nonce-289e424f9a02d24d6e125767f5ee1750e9e006054fe5c151'; style-src 'self' 'unsafe-inline'; img-src * data:
Some: aha
SomeBadHeader: danger
X-Frame-Options: SAMEORIGIN
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding

So cgiHeaderLine should probably also check for taintedness, as does redirect.

Interestingly, cgiHeaderLine needs a \n at the end, otherwise it 'destroys' to following Header:

cgiHeaderLine "Some: something"

emits

HTTP/1.0 200 OK
Date: Fri, 25 Apr 2025 14:53:32 +0000
Connection: close
X-UA-Compatible: IE=edge
Cache-control: no-cache
Content-Security-Policy: default-src 'self' data:; script-src 'self' 'nonce-aacd50161c936e90990f31e7936545aca4e29a8fafbac962'; style-src 'self' 'unsafe-inline'; img-src * data:
Some: somethingX-Frame-Options: SAMEORIGIN
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding

notice how the X-Frame-Options gets lost

(12.1) By MG (mgr) on 2025-04-25 15:07:35 edited from 12.0 in reply to 9.2 [link] [source]

yet another one: if enable_htmlify is off, puts does not htmlify strings (so equals html) - but gets not taint-triggered as html would.

set test [taint ABC<br/>DEF]

enable_htmlify 0
puts $test

(13.1) By Richard Hipp (drh) on 2025-04-25 16:10:08 edited from 13.0 in reply to 12.1 [link] [source]

The simplify-pikchr-cmd branch deals with enable_htmlify by eliminating it entirely. I think this is ok. I believe that code is SB's debugging logic that he used while trying to get the /pikchrshow page running. I await SB's approval before merging to trunk.

UPDATE: Now on trunk.