Fossil Forum

Addition of /*hi*/ at end of files in bf3a32

(1) By John Rouillard (rouilj) on 2021-11-16 02:58:04

I was looking through the diffs for fossil:/info/bf3a32f59e which says:

=== 2021-11-14 ===
19:38:05 [bf3a32f59e] *CURRENT* Fix an incorrect malloc() associated with
         "fossil commit -v" (user: drh tags: trunk)

A diff of the change shows three files:

  • src/db.c
  • src/main.c
  • src/sync.c

with a new line at the end of the file:

/*hi*/

This change doesn't appear to have anything to do with the log message. So being nosy can anybody tell me what this comment means?

-- rouilj

(2) By Stephan Beal (stephan) on 2021-11-16 03:48:48 in reply to 1

So being nosy can anybody tell me what this comment means?

When i asked about it in /chat yesterday someone (who i won't name but they're of course free to out themselves) suggested that:

Those look suspiciously like someone trying to attack the project with embedded unbalanced bidi sequences. :) {the preceding is meant humorously}

Sounds legit.

(3) By Warren Young (wyetr) on 2021-11-16 03:52:34 in reply to 2

I diffed that commit and ran it through hexdump -C to verify that, but the three instances are plain ASCII.

Another explanation is wanted. :)

(4) By John Rouillard (rouilj) on 2021-11-16 04:37:37 in reply to 3

I did something similar by tailing the files and piping the output through od -c. Only ASCII chars 8-/.

(7) By Warren Young (wyoung) on 2021-11-16 04:49:13 in reply to 4

od -c is another interpretation layer, as is the "-C" option to hexdump, which is why I also checked the hex byte values.

There's a moral in there, if you go looking.
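
As an illustration of checking the raw byte values directly rather than a rendered view, here is a minimal Node.js sketch. It is not something anyone in the thread ran; the function name findNonAsciiBytes and the sample buffers are hypothetical, chosen only for this example.

```javascript
// Return the offset and hex value of every byte outside the 7-bit ASCII range.
function findNonAsciiBytes(bytes) {
  const hits = [];
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) {
      hits.push({ offset: i, hex: bytes[i].toString(16).padStart(2, '0') });
    }
  }
  return hits;
}

// Hypothetical sample: the comment from the commit as plain ASCII bytes.
const plain = Buffer.from('/*hi*/\n', 'latin1');
// Hypothetical sample: the same comment with U+202E (RLO) inserted.
// UTF-8 encodes U+202E as the three bytes e2 80 ae.
const tainted = Buffer.from('/*\u202Ehi*/\n', 'utf8');

console.log(findNonAsciiBytes(plain).length);                       // 0
console.log(findNonAsciiBytes(tainted).map(h => h.hex).join(' '));  // e2 80 ae
```

An empty result means every byte is below 0x80, which is what the hex dump of the three /*hi*/ lines showed.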

(5) By Scott Robison (sdr) on 2021-11-16 04:38:21 in reply to 3

Now that you've confirmed this, no one will expect someone to try now! No one expects the Spanish Inquisition Trojan Source attack!

(6) By jamsek on 2021-11-16 04:45:50 in reply to 5

I can confirm that embedded unbalanced bidi sequences were not in the
original patch!

(8) By Florian Balmer (florian.balmer) on 2021-11-16 11:43:13 in reply to 1

Following is a simple CMD/JScript hybrid script to check text files for bidi format marks on Windows:

@if (1 == 0) @end /*:: mode=js
@cscript.exe //E:jscript //NOLOGO "%~f0" %*
@exit /B %ERRORLEVEL% & ::*/
if( WScript.Arguments.length!=2 ){
  WScript.StdErr.Write('Usage: bidifmt.cmd FILENAME ENCODING\n');
  WScript.Quit(1);  // Exit code 1: wrong number of arguments.
}
var Filename = WScript.Arguments.Item(0);
var Encoding = WScript.Arguments.Item(1);
var Text, Stream = WScript.CreateObject('ADODB.Stream');
try{
  Stream.Type = /*adTypeText=*/2;
  Stream.Charset = Encoding;
  Stream.Open();
  Stream.LoadFromFile(Filename);
  Text = Stream.ReadText(/*adReadAll=*/-1);
  Stream.Close();
}catch(e){
  WScript.StdErr.Write('ERROR: ' + e.message + '\n');
  WScript.Quit(1);  // Exit code 1: the file could not be opened or decoded.
}
//  https://www.trojansource.codes/trojan-source.pdf
//  LRE U+202A  PDF U+202C  LRO U+202D  LRI U+2066  FSI U+2068
//  RLE U+202B              RLO U+202E  RLI U+2067  PDI U+2069
WScript.Quit(
  /\u202A|\u202B|\u202C|\u202D|\u202E|\u2066|\u2067|\u2068|\u2069/.test(Text)
  ? /* Bidi format marks found: */ 1 : /* None found: */ 0);

When saved as bidifmt.cmd, the following example command prints a list of C source and header files in the current tree that are encoded in UTF-8 and contain any bidi format marks.

for /F "usebackq delims=" %F in (`dir /A:-D/B/S *.c *.h`) do @call bidifmt.cmd "%F" "UTF-8" 2>nul || echo %F

Note that the call statement is required to be able to process the exit code from a batch script. Also note that %F needs to be escaped to %%F when used in batch scripts.

The script merely reports the existence of bidi marks, and doesn't do anything more sophisticated. The ENCODING argument can also be set to "UTF-16" or "GB18030", if required. It also works with "ISO-8859-1" and other legacy single-byte or multi-byte encodings, but of course it only makes sense with full-range Unicode encodings.
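
For readers without Windows Script Host, the same test can be sketched in plain JavaScript, for example under Node.js. This is only an approximation of the script above, not a drop-in replacement: the names BIDI_FORMAT_MARKS, hasBidiFormatMarks and linesWithBidiFormatMarks are hypothetical, and the per-line helper is an addition that the original script does not have.

```javascript
// The same nine code points the script above tests for:
// LRE, RLE, PDF, LRO, RLO (U+202A..U+202E) and LRI, RLI, FSI, PDI (U+2066..U+2069).
const BIDI_FORMAT_MARKS = /[\u202A-\u202E\u2066-\u2069]/;

// True if the already-decoded text contains at least one bidi format mark.
function hasBidiFormatMarks(text) {
  return BIDI_FORMAT_MARKS.test(text);
}

// Hypothetical extra: 1-based numbers of the lines that contain a mark.
function linesWithBidiFormatMarks(text) {
  const hits = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (BIDI_FORMAT_MARKS.test(line)) {
      hits.push(index + 1);
    }
  });
  return hits;
}

// Hypothetical sample input: line 2 hides an RLO (U+202E) inside a comment.
const sample = 'int a = 1;\n/* \u202E hidden */\nint b = 2;\n';
console.log(hasBidiFormatMarks(sample));        // true
console.log(linesWithBidiFormatMarks(sample));  // [ 2 ]
```

To check a real file, the text could be obtained with fs.readFileSync(path, 'utf8') and passed to either function. Unlike the ADODB.Stream approach, this sketch assumes the input has already been decoded correctly.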

Such CMD/JScript hybrids are quite handy, as they run everywhere, and can do almost anything. And thanks to this library by Douglas Crockford himself, they can fully process JSON, for example. (It's also possible to bundle multiple individual files to a single WSF script, so not too much copy-pasting is required.)

(10) By Florian Balmer (florian.balmer) on 2021-11-16 12:44:41 in reply to 8

P.S.

That thing primarily runs as a CMD/batch script, where:

  • @if (1 == 0) @end /*:: mode=js

is a never-true condition with command echoing suppressed (by @). It makes CMD jump to the next line, ignoring everything after the brackets. The :: is just eye candy for symmetry (see below), and mode=js is a hint to my text editor to colorize a .cmd file with .js colors. Then:

  • @cscript.exe //E:jscript //NOLOGO "%~f0" %*

calls the CLI JScript engine with the CMD script file as source and all its arguments, and finally:

  • @exit /B %ERRORLEVEL% & ::*/

terminates the CMD script with the exit code from the JScript engine (which is somewhat redundant, but explicit), chained with a batch comment command :: (mirrored above for symmetry) to ignore the remainder of the line.

Then the JScript engine sees the (Microsoft-specific) conditional compilation statement:

  • @if (1 == 0) @end /*:: mode=js

so it skips everything up to @end, plus the following multi-line /* ... */ comment, and then continues processing as normal JavaScript code after the comment.

(9) By Richard Hipp (drh) on 2021-11-16 12:26:25 in reply to 1

The added comments were a mistake on my part. I'll fix it.

(11) By Florian Balmer (florian.balmer) on 2021-11-16 12:46:48 in reply to 9

Visit from your godchildren ... ? ;-)