Fossil Forum

Segfault in 2.22 with ancient repository

(1) By S. Ross Gohlke (rossgohlke) on 2023-07-01 14:34:52 [link] [source]

I have one old repository which is giving me troubles with fossil 2.22.

All testing was done with binaries built from src in Downloads with vanilla debug configuration.

# ./configure --fossil-debug
# make

When I update or open the repository I get a segmentation fault. With open, it happens after it lists the files (the list is complete). The --verbose flag shows no additional information.

This does not happen with fossil 2.21, and it does not happen with every repository. I tried the same actions on two different hosts with the same result. Both hosts are running FreeBSD 14.

I don't see anything special about this repository in settings.

% gdb fossil fossil.core
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from fossil...
(No debugging symbols found in fossil)
[New LWP 102195]
Core was generated by `/var/tmp/build/fossil-src-2.22/fossil open --verbose /var/tmp/test.fossil'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  0x000000000075bcc8 in ?? ()
(gdb) bt
#0  0x000000000075bcc8 in ?? ()
#1  0x0000000000000000 in ?? ()

I am new to gdb. Any suggestions for how to troubleshoot are appreciated.

(2.1) By Stephan Beal (stephan) on 2023-07-01 15:34:53 edited from 2.0 in reply to 1 [link] [source]

This does not happen with fossil 2.21, and it does not happen with every repository.

Can you make a copy of one of the failing repositories available via a link here? If we can reproduce it, it'll probably take no time at all to solve.

I am new to gdb. Any suggestions for how to troubleshoot are appreciated.

The first thing to try is the "bt" command right after it fails. That should show more of the stack trace.

Edit: never mind, you already did that. Then i'm at a loss!

(3) By mark on 2023-07-01 15:48:02 in reply to 1 [link] [source]

Could you please build fossil by invoking make like:

./configure --static
CFLAGS="-DDEBUG -g" make

and then run the backtrace on the core dump produced by that binary?

The binary that produced your above trace doesn't appear to have been built with debug symbols.

(4) By S. Ross Gohlke (rossgohlke) on 2023-07-01 16:26:01 in reply to 3 [source]

It gave the exact same result.

A snippet from build to show CFLAGS were applied

cc -I. -I./src -I./extsrc -Ibld -Wall -Wdeclaration-after-statement  -DDEBUG -g -DHAVE_AUTOCONFIG_H -DNDEBUG=1  -DSQLITE_DQS=0  -DSQLITE_THREADSAFE=0  -DSQLITE_DEFAULT_MEMSTATUS=0  -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1  -DSQLITE_LIKE_DOESNT_MATCH_BLOBS  -DSQLITE_OMIT_DECLTYPE  -DSQLITE_OMIT_DEPRECATED  -DSQLITE_OMIT_PROGRESS_CALLBACK  -DSQLITE_OMIT_SHARED_CACHE  -DSQLITE_OMIT_LOAD_EXTENSION  -DSQLITE_MAX_EXPR_DEPTH=0  -DSQLITE_ENABLE_LOCKING_STYLE=0  -DSQLITE_DEFAULT_FILE_FORMAT=4  -DSQLITE_ENABLE_EXPLAIN_COMMENTS  -DSQLITE_ENABLE_FTS4  -DSQLITE_ENABLE_DBSTAT_VTAB  -DSQLITE_ENABLE_FTS5  -DSQLITE_ENABLE_STMTVTAB  -DSQLITE_HAVE_ZLIB  -DSQLITE_ENABLE_DBPAGE_VTAB  -DSQLITE_TRUSTED_SCHEMA=0  -DHAVE_USLEEP  -Dmain=sqlite3_shell  -DSQLITE_SHELL_IS_UTF8=1  -DSQLITE_OMIT_LOAD_EXTENSION=1  -DUSE_SYSTEM_SQLITE=0  -DSQLITE_SHELL_DBNAME_PROC=sqlcmd_get_dbname  -DSQLITE_SHELL_INIT_PROC=sqlcmd_init_proc   -DHAVE_LINENOISE -c ./extsrc/shell.c -o bld/shell.o

static build

% ./fossil version -vv
...
FOSSIL_STATIC_BUILD
...

gdb output

% gdb fossil fossil.core
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from fossil...
(No debugging symbols found in fossil)
[New LWP 101221]
Core was generated by `/var/tmp/build/fossil-src-2.22/fossil open --verbose /var/tmp/test.fossil'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  0x00000000008315a8 in ?? ()
(gdb) bt
#0  0x00000000008315a8 in ?? ()
#1  0x0000000000000000 in ?? ()

(7) By mark on 2023-07-01 17:05:58 in reply to 4 [link] [source]

static build

% ./fossil version -vv
...
FOSSIL_STATIC_BUILD
...

gdb output

% gdb fossil fossil.core
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software

Reading symbols from fossil...
(No debugging symbols found in fossil)

Can you please pass the path of the freshly built Fossil binary (the same one that generated the dump) to gdb:

gdb /path/to/compiled/fossil fossil.core

(8) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:14:47 in reply to 7 [link] [source]

Duh. You are right.

$ gdb ./fossil /var/tmp/test2/fossil.core 
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./fossil...
[New LWP 100177]
Core was generated by `./fossil open --workdir /var/tmp/test2 /var/tmp/test.fossil'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  0x00000000008315a8 in ensure_empty_dirs_created (clearDirTable=0)
    at ./src/update.c:659
659         for(i=0; i<pGlob->nPattern; i++){
(gdb) bt
#0  0x00000000008315a8 in ensure_empty_dirs_created (clearDirTable=0)
    at ./src/update.c:659
#1  0x00000000006d0231 in checkout_cmd () at ./src/checkout.c:358
#2  0x00000000006e2133 in cmd_open () at ./src/db.c:4275
#3  0x000000000073cbe6 in fossil_main (argc=5, argv=0x8218a6ef8)
    at ./src/main.c:964
#4  0x000000000073c1a2 in main (argc=5, argv=0x8218a6ef8) at ./src/main.c:680
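
Line 659 reads pGlob->nPattern, so a null pGlob is one plausible way to end up with "Address not mapped to object" at that spot. Below is a minimal sketch of that failure pattern and a guard against it. The Glob struct here is a hypothetical stand-in (only the nPattern member is visible in the trace; azPattern and visit_patterns are names invented for illustration), and this is not claimed to be how Fossil itself was fixed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the Glob type named in the trace.
** Only nPattern is known from the source line; azPattern is assumed. */
typedef struct Glob {
  int nPattern;            /* number of patterns */
  const char **azPattern;  /* the patterns themselves */
} Glob;

/* Visit every pattern and return how many were visited.
** A NULL pGlob is treated as "no patterns" instead of being
** dereferenced, which is what the unguarded loop would do. */
static int visit_patterns(const Glob *pGlob){
  int i;
  int n = 0;
  if( pGlob==NULL ) return 0;   /* the guard */
  for(i=0; i<pGlob->nPattern; i++){
    printf("pattern: %s\n", pGlob->azPattern[i]);
    n++;
  }
  return n;
}
```

Calling visit_patterns(NULL) returns 0 rather than crashing, while a Glob holding two patterns prints both and returns 2.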

(14) By mark on 2023-07-02 02:48:26 in reply to 8 [link] [source]

Duh. You are right.

I've committed the same mistake in the past. I make it a rule to use
absolute paths when debugging, or {--prefix,PREFIX}=$HOME make install,
so that binaries are installed into $HOME/bin which is first in my
command path.


$ gdb ./fossil /var/tmp/test2/fossil.core 
GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.

...

Reading symbols from ./fossil...
[New LWP 100177]
Core was generated by `./fossil open --workdir /var/tmp/test2 /var/tmp/test.fossil'.
Program terminated with signal SIGSEGV, Segmentation fault.
Address not mapped to object.
#0  0x00000000008315a8 in ensure_empty_dirs_created (clearDirTable=0)
    at ./src/update.c:659
659         for(i=0; i<pGlob->nPattern; i++){
(gdb) bt
#0  0x00000000008315a8 in ensure_empty_dirs_created (clearDirTable=0)
    at ./src/update.c:659
#1  0x00000000006d0231 in checkout_cmd () at ./src/checkout.c:358
#2  0x00000000006e2133 in cmd_open () at ./src/db.c:4275
#3  0x000000000073cbe6 in fossil_main (argc=5, argv=0x8218a6ef8)
    at ./src/main.c:964
#4  0x000000000073c1a2 in main (argc=5, argv=0x8218a6ef8) at ./src/main.c:680

I see you already resolved this, and as Stephan noted, it has been
fixed in a subsequent version, but thanks for the report nonetheless!

(5) By Warren Young (wyoung) on 2023-07-01 16:32:48 in reply to 3 [link] [source]

The binary that produced your above trace doesn't appear to have been built with debug symbols.

More like the stack’s been scribbled on and is now useless, or the core is from a mismatched binary.

It is sometimes possible to get around this by running Fossil under GDB/LLDB from the start rather than trying to make use of a dropped core, giving the debugger a chance to notice the problem before it gets that bad:

  $ gdb fossil
  (gdb) run arg arg arg

…where the args are the same as those you passed to Fossil to get the core to drop the first time.

I offer this for completeness. It's better at this point to build Fossil with all the sanitizers enabled:

  $ ./configure --with-sanitizer=address,enum,null,undefined CFLAGS='-DDEBUG -g -O0'
  $ make clean         ← important!
  $ make -j11

Then, being sure to use the new binary, not the one in your PATH, retry the command that's failing:

  $ ./fossil arg arg arg

That should yield more interesting output.

(6) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:02:14 in reply to 5 [link] [source]

I see that I could have shown my workflow more clearly to avoid confusion, such as whether the core file was fresh (it always was).

These exact steps produced different output.

$ ./configure --with-sanitizer=address,enum,null,undefined CFLAGS='-DDEBUG -g -O0'
$ make clean
$ make -j4
$ ./fossil open --workdir /var/tmp/test /var/tmp/test.fossil
Pull from https://userid@domain.tld/path/to/repo                          
SSL: cannot connect to host domain.tld:443 (EVP lib)                             
Pull done, wire bytes sent: 0  received: 0  remote:                                 
Autosync failed.                                                                    
continue in spite of sync failure (y/N)? y
...
[file list]
src/update.c:659:23: runtime error: member access within null pointer of type 'Glob' (aka 'struct Glob')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/update.c:659:23 in 
src/update.c:659:23: runtime error: load of null pointer of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/update.c:659:23 in 
AddressSanitizer:DEADLYSIGNAL
=================================================================
==63556==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000001bb227d bp 0x7fffffffdf10 sp 0x7fffffffde90 T0)
==63556==The signal is caused by a READ memory access.
==63556==Hint: address points to the zero page.
    #0 0x1bb227d in ensure_empty_dirs_created /var/tmp/build/fossil-src-2.22/./src/update.c:659:15
    #1 0x135d4df in checkout_cmd /var/tmp/build/fossil-src-2.22/./src/checkout.c:358:3
    #2 0x13a29e5 in cmd_open /var/tmp/build/fossil-src-2.22/./src/db.c:4275:5
    #3 0x1564311 in fossil_main /var/tmp/build/fossil-src-2.22/./src/main.c:964:7
    #4 0x1560ea1 in main /var/tmp/build/fossil-src-2.22/./src/main.c:680:43
    #5 0x802155c0a in __libc_start1 (/lib/libc.so.7+0x8ac0a)
    #6 0x5af901 in _start /usr/src/lib/csu/amd64/crt1_c.c:52:2
    #7 0x801c74007  (<unknown module>)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /var/tmp/build/fossil-src-2.22/./src/update.c:659:15 in ensure_empty_dirs_created
==63556==ABORTING

(9) By Stephan Beal (stephan) on 2023-07-01 17:17:09 in reply to 6 [link] [source]

points to the zero page.
#0 0x1bb227d in ensure_empty_dirs_created /var/tmp/build/fossil-src-

iirc this was recently fixed and happens when a versioned settings file is empty. i'm on a phone so cannot easily verify that.

(10) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:28:18 in reply to 9 [link] [source]

I downloaded and built the latest Tarball from here. That fixes the problem.

Sorry for the noise but I do appreciate the help. Next time will go much faster.

Leaving a note for myself: if the latest official release, whether installed by package (pkg on FreeBSD) or from Downloads, gives a problem, try the latest Tarball before freaking out.

(11) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:42:00 in reply to 9 [link] [source]

This seems to have the fix and references a forum post.

Just to be clear, I have never used .fossil-settings directory, so the problem was not an empty file at .fossil-settings/empty-dirs. I'm just glad it works.

(12) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:46:06 in reply to 11 [link] [source]

... I just realized that it does have an empty value for empty-dirs in the database.

$ fossil settings -R /var/tmp/test.fossil
...
empty-dirs           (local)
...

I have no idea why or when that got set.

(13) By S. Ross Gohlke (rossgohlke) on 2023-07-01 17:51:54 in reply to 12 [link] [source]

So...

$ fossil unset empty-dirs -R /var/tmp/test.fossil

solves the original problem in that fossil 2.22 does not exit with a segmentation fault when updating or opening this repository.
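
For anyone wondering how a setting can be "set" and still cause this: a value that exists but contains no entries is different from a value that is absent. The sketch below shows one way to tell the two apart. It assumes the list is separated by commas or whitespace, and count_entries and is_set_but_empty are hypothetical helpers written for this post, not functions from the Fossil source.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the entries in a list whose items are
** separated by commas and/or whitespace. NULL counts as zero entries. */
static int count_entries(const char *zValue){
  int n = 0;
  int inEntry = 0;
  if( zValue==NULL ) return 0;
  for(; *zValue; zValue++){
    int isSep = (*zValue==',' || isspace((unsigned char)*zValue));
    if( !isSep && !inEntry ) n++;   /* first character of a new entry */
    inEntry = !isSep;
  }
  return n;
}

/* True when the value is present (non-NULL) yet holds no entries,
** i.e. the "empty-dirs (local)" with nothing after it case. */
static int is_set_but_empty(const char *zValue){
  return zValue!=NULL && count_entries(zValue)==0;
}
```

Under these assumptions, an empty string or a string of only blanks is "set but empty", whereas NULL (unset) and "dir1" are not. Code that builds a pattern list from the first kind of value needs to cope with getting nothing back.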