Fossil Forum

pikchrshow segfault

(1) By sean (jungleboogie) on 2020-09-17 20:18:14 [link] [source]

Hi,

All my testing for/with pikchrshow has been on the hosted Fossil-scm or pikchr.org. Today I tried it on a brand new repo with Fossil and when visiting /pikchrshow, I got a segfault.

Does something more need to be set up on a new repo to prevent segfaults?

Fossil info:

$ fossil version -v
This is fossil version 2.13 [5c92bbfca7] 2020-09-17 19:31:14 UTC
Compiled on Sep 17 2020 13:08:19 using clang-10.0.1  (64-bit)
Schema version 2015-01-24
Detected memory page size is 4096 bytes
zlib 1.2.3, loaded 1.2.3
hardened-SHA1 by Marc Stevens and Dan Shumow
SSL (LibreSSL 3.2.1)
libfuse 2.6, loaded unknown
FOSSIL_ENABLE_LEGACY_MV_RM
UNICODE_COMMAND_LINE
FOSSIL_DYNAMIC_BUILD
HAVE_PLEDGE
SQLite 3.34.0 2020-09-15 20:48:30 3d35fa0be8
SQLITE_DEFAULT_FILE_FORMAT=4
SQLITE_DEFAULT_WAL_SYNCHRONOUS=1
SQLITE_ENABLE_DBSTAT_VTAB
SQLITE_ENABLE_FTS4
SQLITE_ENABLE_FTS5
SQLITE_ENABLE_JSON1
SQLITE_ENABLE_LOCKING_STYLE=0
SQLITE_ENABLE_STMTVTAB
SQLITE_LIKE_DOESNT_MATCH_BLOBS
SQLITE_MAX_EXPR_DEPTH=0
SQLITE_OMIT_DECLTYPE
SQLITE_OMIT_DEPRECATED
SQLITE_OMIT_LOAD_EXTENSION
SQLITE_OMIT_PROGRESS_CALLBACK
SQLITE_OMIT_SHARED_CACHE
SQLITE_THREADSAFE=0
SQLITE_USE_ALLOCA

Stats of repo to show practically brand new:

$ fossil dbstat
repository-size:   229,376 bytes
artifact-count:    1 (stored as 1 full text and 0 deltas)
artifact-sizes:    163 average, 163 max, 163 total
compression-ratio: 0:10
check-ins:         1
files:             0 across all branches
wiki-pages:        0 (0 changes)
tickets:           0 (0 changes)
events:            0
tag-changes:       0
latest-change:     2020-09-15 04:34:28 - about 2 days ago
project-age:       3 days or approximately 0.01 years.
project-id:        c816eb0ea148d7338a92f0e44b5ff742dd4f1c41
schema-version:    2015-01-24
fossil-version:    2020-09-17 19:31:14 [5c92bbfca7] [2.13] (clang-10.0.1 )
sqlite-version:    2020-09-15 20:48:30 [3d35fa0be8] (3.34.0)
database-stats:    56 pages, 4096 bytes/pg, 0 free pages, UTF-8, delete mode

Visiting http://192.168.30.33:8080/pikchrshow

Shows Segfault immediately.

cli output:

$ fossil server
Listening for HTTP requests on TCP port 8080
------------- 2020-09-17 20:12:49 UTC ------------
panic: Segfault
HTTP_HOST=192.168.30.33:8080
HTTP_REFERER=http://192.168.30.33:8080/index
HTTP_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0
PATH_INFO=/pikchrshow
REMOTE_ADDR=192.168.30.22
REQUEST_METHOD=GET
REQUEST_URI=/pikchrshow

backtrace and more:

$ gdb /usr/local/bin/fossil fossil.core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd6.7"...
Core was generated by `fossil'.
Program terminated with signal 6, Aborted.
Loaded symbols for /usr/local/bin/fossil
Reading symbols from /usr/lib/libfuse.so.2.0...done.
Loaded symbols for /usr/lib/libfuse.so.2.0
Reading symbols from /usr/lib/libm.so.10.1...done.
Loaded symbols for /usr/lib/libm.so.10.1
Reading symbols from /usr/lib/libssl.so.48.1...done.
Loaded symbols for /usr/lib/libssl.so.48.1
Reading symbols from /usr/lib/libcrypto.so.46.1...done.
Loaded symbols for /usr/lib/libcrypto.so.46.1
Reading symbols from /usr/lib/libz.so.5.0...done.
Loaded symbols for /usr/lib/libz.so.5.0
Reading symbols from /usr/lib/libc.so.96.0...done.
Loaded symbols for /usr/lib/libc.so.96.0
Reading symbols from /usr/libexec/ld.so...Error while reading shared library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /usr/libexec/ld.so]
#0  thrkill () at /tmp/-:3
3       /tmp/-: No such file or directory.
        in /tmp/-
(gdb) bt
#0  thrkill () at /tmp/-:3
#1  0x00000efc13a14c0e in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2  0x00000ef9e2b73193 in fossil_panic (zFormat=Variable "zFormat" is not available.
) at printf.c:1149
#3  0x00000ef9e2b449c2 in sigsegv_handler () at main.c:1481
#4  <signal handler called>
#5  vxprintf (pBlob=Variable "pBlob" is not available.
) at printf.c:218
#6  0x00000ef9e2b727ad in mprintf (zFormat=Variable "zFormat" is not available.
) at printf.c:891
#7  0x00000ef9e2add91e in builtin_emit_fossil_js_apis (zApi=Variable "zApi" is not available.
) at builtin.c:708
#8  0x00000ef9e2b6f696 in pikchrshow_page () at pikchrshow.c:329
#9  0x00000ef9e2b45d04 in process_one_web_page (zNotFound=0x0, pFileGlob=0x0, allowRepoList=0) at main.c:1964
#10 0x00000ef9e2b470aa in cmd_webserver () at main.c:2961
#11 0x00000ef9e2b433b3 in fossil_main (argc=Variable "argc" is not available.
) at main.c:938
#12 0x00000ef9e2b42c39 in main (argc=Variable "argc" is not available.
) at main.c:648
Current language:  auto; currently asm

Compiler version:

$ cc -v
OpenBSD clang version 10.0.1
Target: amd64-unknown-openbsd6.8
Thread model: posix
InstalledDir: /usr/bin

Sorry to have not tried running locally sooner. On the bright side, I've certainly had no problems building Fossil.

(2) By Stephan Beal (stephan) on 2020-09-17 20:25:45 in reply to 1 [link] [source]

All my testing for/with pikchrshow has been on the hosted Fossil-scm or pikchr.org. Today I tried it on a brand new repo with Fossil and when visiting /pikchrshow, I got a segfault.

i'm puzzled by the specifics of the segfault but will look into it immediately. It's crashing while rendering the page in a place where it makes no difference whether your repo is old or new.

i'll get back to you soon...

(3) By Stephan Beal (stephan) on 2020-09-17 20:27:04 in reply to 1 [link] [source]

OpenBSD clang version 10.0.1

Just out of curiosity, do you have another compiler you could try this out with? You can tell the build to use a different compiler with:

./configure CC=/the/compiler

then do clean/make. In the meantime i'll try to reproduce it with clang on linux.

(4) By sean (jungleboogie) on 2020-09-17 20:37:59 in reply to 3 [link] [source]

Just out of curiosity, do you have another compiler you could try this out with?

I'll install one. Which would you like it to be?

(6) By Stephan Beal (stephan) on 2020-09-17 20:42:48 in reply to 4 [link] [source]

I'll install one. Which would you like it to be?

Any but clang, if you have one installed. i'm not sure what's normally installed on BSD, but would not be surprised if clang is the only one installed by default. If you don't have another readily available, no big deal. (i thought everyone has at least two C compilers installed!)

(19) By sean (jungleboogie) on 2020-09-18 06:42:09 in reply to 3 [link] [source]

I got around to installing another compiler and the pikchrshow page worked.

So we have

  • amd64 with clang 8.0.1 and 10.0.1 segfaulting on pikchrshow.
  • amd64 with gcc 8.4.0 loading pikchrshow.
  • arm64 and i386 with clang 10.0.1 loading pikchrshow.

$ egcc -v
Using built-in specs.
COLLECT_GCC=egcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-openbsd6.8/8.4.0/lto-wrapper
Target: x86_64-unknown-openbsd6.8
Configured with: /usr/obj/ports/gcc-8.4.0/gcc-8.4.0/configure --with-stage1-ldflags=-L/usr/obj/ports/gcc-8.4.0/bootstrap/lib --verbose --program-transform-name='s,^,e,' --disable-nls --with-system-zlib --disable-libmudflap --disable-libgomp --disable-libssp --disable-tls --with-gnu-ld --with-gnu-as --enable-threads=posix --enable-wchar_t --with-gmp=/usr/local --enable-languages=c,c++,fortran,objc,ada --disable-libstdcxx-pch --enable-default-ssp --enable-default-pie --without-isl --enable-cpp --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man --infodir=/usr/local/info --localstatedir=/var --disable-silent-rules --disable-gtk-doc
Thread model: posix
gcc version 8.4.0 (GCC)

(20.1) By Stephan Beal (stephan) on 2020-09-18 08:33:20 edited from 20.0 in reply to 19 [link] [source]

So we have ...

Thank you for that. We can add gcc 8.3.0 on ARM Linux and clang 7.0.1 on ARM Linux to the "works for me" list.

Edit: also gcc 9.3.0 on x86/64 and clang 10.0.0 on x86/64 "work for me."

(25.3) Originally by sean (jungleboogie) with edits by Warren Young (wyoung) on 2020-09-18 19:21:18 from 25.2 in reply to 19 [link] [source]

I think these are all the results in a table format...feel free to edit to make it more readable.

| OS | OS Ver | arch | compiler | Status |
|---|---|---|---|---|
| OpenBSD | 6.7 | amd64 | Clang 8.0.1 | fail |
| OpenBSD | 6.8snapshot | amd64 | Clang 10.0.1 | fail |
| OpenBSD | 6.8snapshot | i386 | Clang 10.0.1 | pass |
| OpenBSD | 6.8snapshot | arm64 | Clang 10.0.1 | pass |
| OpenBSD | 6.8snapshot | amd64 | GCC 8.4.0 | pass |
| FreeBSD | 12.1-p10 | amd64 | Clang 8.0.1 | pass |
| Linux | CentOS 8 | arm64 | GCC 8.3.1 | pass |
| Linux | ??? | arm64 | clang 7.0.1 | pass |
| Linux | ??? | amd64 | clang 10.0.0 | pass |
| Linux | ??? | amd64 | gcc 9.3.0 | pass |

WY edit: Merged my results in. Took over the GCC 8.3.0 result with my CentOS 8 result, since I can give an actual target platform, rather than "???".

(5) By Stephan Beal (stephan) on 2020-09-17 20:41:32 in reply to 1 [link] [source]

Compiled on Sep 17 2020 13:08:19 using clang-10.0.1 (64-bit)

i'm unfortunately unable to reproduce this with any of the different jsmode settings, but my local clang is only 7.0.1. (The stack trace indicates that your jsmode is not "bundled".)

Just to rule this out: can you change the -O2 in Makefile to -O0 and try again? The stacktrace you're showing truly makes the most sense if it's a genuine compiler bug. The code path it's complaining about is all well-tested and widely deployed (wikiedit, fileedit, the line numbering code, and parts of the forum all go through pikchrshow.c:329 (builtin_emit_fossil_js_apis() on my local copy)). Also, if you would, please try opening up wikiedit or fileedit: they go through that same path.

(7) By sean (jungleboogie) on 2020-09-17 21:06:20 in reply to 5 [link] [source]

Just to rule this out: can you change the -O2 in Makefile to -O0 and try again?

Yes, this works! There's no segfault on pikchrshow or for wikiedit sandbox.

I'm able to use pikchrshow as expected and have no problems generating drawings.

I don't have a different compiler installed and 10.0.1 became the default compiler a few months back in OpenBSD -current.

(8) By Warren Young (wyoung) on 2020-09-17 21:10:11 in reply to 7 [link] [source]

That makes me want to see what happens when you build with

  $ ./configure --with-sanitizer=address,enum,null,undefined

Be sure to do a clean build. Stale objects built without those flags mixed with objects built with them can cause new, confusing symptoms.

(10) By sean (jungleboogie) on 2020-09-17 21:20:19 in reply to 8 [link] [source]

Unsupported!

...
touch bld/headers
cc -g -O2 -o bld/codecheck1 ./src/codecheck1.c
cc -I. -I./src -Ibld -Wall -DFOSSIL_DYNAMIC_BUILD=1 -DFOSSIL_HAVE_FUSEFS -fsanitize=address,enum,null,undefined  -g -O2 -DHAVE_AUTOCONFIG_H -D_HAVE_SQLITE_CONFIG_H -o bld/add.o -c bld/add_.c
cc: error: unsupported option '-fsanitize=address' for target 'amd64-unknown-openbsd6.8'
*** Error 1 in /home/sean/fossil-repos/fossil (./src/main.mk:920 'bld/add.o')

(11) By Warren Young (wyoung) on 2020-09-17 21:32:58 in reply to 10 [link] [source]

I was mainly interested in the UBSan results (-fsanitize=undefined), which, it is claimed, is supported under OpenBSD.

Until UBsan clears the code, I think it's too early to be claiming "compiler bug."

Enable as many of the sanitizers as you can on your platform.

(12) By sean (jungleboogie) on 2020-09-17 22:06:28 in reply to 11 [link] [source]

What's this tell you?

Generated with ./configure --with-sanitizer=undefined

nitize=undefined -lfuse -lm -lssl -lcrypto -lz
ld: error: undefined symbol: _Unwind_GetLanguageSpecificData
>>> referenced by gcc_personality_v0.c:189 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:189)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a

ld: error: undefined symbol: _Unwind_GetIP
>>> referenced by gcc_personality_v0.c:193 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:193)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a

ld: error: undefined symbol: _Unwind_GetRegionStart
>>> referenced by gcc_personality_v0.c:194 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:194)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a

ld: error: undefined symbol: _Unwind_SetGR
>>> referenced by gcc_personality_v0.c:224 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:224)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a
>>> referenced by gcc_personality_v0.c:226 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:226)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a

ld: error: undefined symbol: _Unwind_SetIP
>>> referenced by gcc_personality_v0.c:227 (/usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/gcc_personality_v0.c:227)
>>>               gcc_personality_v0.o:(__gcc_personality_v0) in archive /usr/lib/libcompiler_rt.a

ld: error: undefined symbol: __ubsan_handle_type_mismatch_v1
>>> referenced by add.c:86 (./src/add.c:86)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:87 (./src/add.c:87)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:87 (./src/add.c:87)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:87 (./src/add.c:87)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:0 (./src/add.c:0)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:107 (./src/add.c:107)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:438 (./src/add.c:438)
>>>               bld/add.o:(add_cmd)
>>> referenced by add.c:441 (./src/add.c:441)
>>>               bld/add.o:(add_cmd)
>>> referenced by add.c:457 (./src/add.c:457)
>>>               bld/add.o:(add_cmd)
>>> referenced by add.c:608 (./src/add.c:608)
>>>               bld/add.o:(delete_cmd)
>>> referenced 97720 more times

ld: error: undefined symbol: __ubsan_handle_out_of_bounds
>>> referenced by add.c:87 (./src/add.c:87)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:104 (./src/add.c:104)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by add.c:107 (./src/add.c:107)
>>>               bld/add.o:(fossil_reserved_name)
>>> referenced by blob.c:952 (./src/blob.c:952)
>>>               bld/blob.o:(blob_read_link)
>>> referenced by builtin.c:62 (./src/builtin.c:62)
>>>               bld/builtin.o:(builtin_file)
>>> referenced by builtin.c:61 (./src/builtin.c:61)
>>>               bld/builtin.o:(builtin_file)
>>> referenced by builtin.c:43 (./src/builtin.c:43)
>>>               bld/builtin.o:(builtin_file_index)
>>> referenced by builtin.c:62 (./src/builtin.c:62)
>>>               bld/builtin.o:(builtin_text)
>>> referenced by builtin.c:62 (./src/builtin.c:62)
>>>               bld/builtin.o:(builtin_webpage)
>>> referenced by builtin.c:299 (./src/builtin.c:299)
>>>               bld/builtin.o:(builtin_request_js)
>>> referenced 898 more times

ld: error: undefined symbol: __ubsan_handle_builtin_unreachable
>>> referenced by add.c:798 (./src/add.c:798)
>>>               bld/add.o:(addremove_cmd)
>>> referenced by add.c:877 (./src/add.c:877)
>>>               bld/add.o:(mv_cmd)
>>> referenced by add.c:881 (./src/add.c:881)
>>>               bld/add.o:(mv_cmd)
>>> referenced by add.c:1023 (./src/add.c:1023)
>>>               bld/add.o:(mv_cmd)
>>> referenced by add.c:1055 (./src/add.c:1055)
>>>               bld/add.o:(mv_cmd)
>>> referenced by ajax.c:332 (./src/ajax.c:332)
>>>               bld/ajax.o:(ajax_route_preview_text)
>>> referenced by alerts.c:431 (./src/alerts.c:431)
>>>               bld/alerts.o:(emailerError)
>>> referenced by alerts.c:1097 (./src/alerts.c:1097)
>>>               bld/alerts.o:(alert_cmd)
>>> referenced by alerts.c:1361 (./src/alerts.c:1361)
>>>               bld/alerts.o:(subscribe_page)
>>> referenced by alerts.c:1413 (./src/alerts.c:1413)
>>>               bld/alerts.o:(subscribe_page)
>>> referenced 942 more times

ld: error: undefined symbol: __ubsan_handle_divrem_overflow
>>> referenced by bag.c:133 (./src/bag.c:133)
>>>               bld/bag.o:(bag_insert)
>>> referenced by bag.c:103 (./src/bag.c:103)
>>>               bld/bag.o:(bag_resize)
>>> referenced by bag.c:156 (./src/bag.c:156)
>>>               bld/bag.o:(bag_find)
>>> referenced by bag.c:172 (./src/bag.c:172)
>>>               bld/bag.o:(bag_remove)
>>> referenced by bag.c:219 (./src/bag.c:219)
>>>               bld/bag.o:(bag_next)
>>> referenced by delta.c:385 (./src/delta.c:385)
>>>               bld/delta.o:(delta_create)
>>> referenced by delta.c:404 (./src/delta.c:404)
>>>               bld/delta.o:(delta_create)
>>> referenced by diff.c:227 (./src/diff.c:227)
>>>               bld/diff.o:(break_into_lines)
>>> referenced by diff.c:1011 (./src/diff.c:1011)
>>>               bld/diff.o:(sbsDiff)
>>> referenced by diff.c:1157 (./src/diff.c:1157)
>>>               bld/diff.o:(sbsDiff)
>>> referenced 148 more times

ld: error: undefined symbol: __ubsan_handle_float_cast_overflow
>>> referenced by branch.c:489 (./src/branch.c:489)
>>>               bld/branch.o:(brlist_page)
>>> referenced by browse.c:1011 (./src/browse.c:1011)
>>>               bld/browse.o:(human_readable_age)
>>> referenced by diff.c:2259 (./src/diff.c:2259)
>>>               bld/diff.o:(annotate_file)
>>> referenced by diff.c:2223 (./src/diff.c:2223)
>>>               bld/diff.o:(current_time_in_milliseconds)
>>> referenced by export.c:1167 (./src/export.c:1167)
>>>               bld/export.o:(gitmirror_send_checkin)
>>> referenced by fusefs.c:151 (./src/fusefs.c:151)
>>>               bld/fusefs.o:(fusefs_getattr)
>>> referenced by piechart.c:191 (./src/piechart.c:191)
>>>               bld/piechart.o:(piechart_render)
>>> referenced by piechart.c:161 (./src/piechart.c:161)
>>>               bld/piechart.o:(piechart_render)
>>> referenced by pikchr.y:3886
>>>               bld/pikchr.o:(yy_reduce)
>>> referenced by pikchr.y:3887
>>>               bld/pikchr.o:(yy_reduce)
>>> referenced 68 more times

ld: error: undefined symbol: __ubsan_handle_shift_out_of_bounds
>>> referenced by graph.c:641 (./src/graph.c:641)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:701 (./src/graph.c:701)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:676 (./src/graph.c:676)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:693 (./src/graph.c:693)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:695 (./src/graph.c:695)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:760 (./src/graph.c:760)
>>>               bld/graph.o:(graph_finish)
>>> referenced by graph.c:307 (./src/graph.c:307)
>>>               bld/graph.o:(assignChildrenToRail)
>>> referenced by graph.c:405 (./src/graph.c:405)
>>>               bld/graph.o:(riser_to_top)
>>> referenced by graph.c:363 (./src/graph.c:363)
>>>               bld/graph.o:(createMergeRiser)
>>> referenced by graph.c:393 (./src/graph.c:393)
>>>               bld/graph.o:(find_max_rail)
>>> referenced 78 more times
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** Error 1 in /home/sean/fossil-repos/fossil (./src/main.mk:744 'fossil')

(14) By Warren Young (wyoung) on 2020-09-18 01:35:30 in reply to 12 [link] [source]

What's this tell you?

That your OS doesn't support UBsan properly.

I've checked, and libubsan isn't available according to pkg_info -Q, and the -fsanitize=undefined flag is being passed to the linker, which is "cc" rather than "ld", which means the libubsan bit is supposed to be taken care of behind the scenes anyway.

If I were an OpenBSD user, I'd file that as a packaging bug.

I've tried to fix the build several times, which has taken hours without useful result, so I give up.

(Hours because Clang is a memory pig, my VMs give OpenBSD "only" half a gig of memory, and there are dependency problems in Fossil's build system that cause things to be rebuilt when they really shouldn't be, making a cache thrash induced hour-long build process take another hour when I try a simple "make" after a failed build.)

(15) By sean (jungleboogie) on 2020-09-18 01:42:54 in reply to 14 [link] [source]

Thanks for your time with this.

I’ll email the OpenBSD package maintainer for Fossil and see what he says. Maybe we’ll need to find someone else for help. At the very least, the Makefile will need to be patched so there are no segfaults.

(16) By sean (jungleboogie) on 2020-09-18 02:27:35 in reply to 14 [link] [source]

Could you please provide me:

sysctl kern.version and cc -v. I stumbled on something funny, and before I report it, I want to see what you were/are working with.

thanks!

(17) By Warren Young (wyoung) on 2020-09-18 03:27:32 in reply to 16 [link] [source]

I started on OpenBSD 6.4. (Old VM I had laying around.)

I then upgraded it in stages to 6.7, bumping the RAM allocation to a gig along the way, where your requested values are:

# sysctl kern.version
kern.version=OpenBSD 6.7 (GENERIC.MP) #182: Thu May  7 11:11:58 MDT 2020
    deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
# cc -v
OpenBSD clang version 8.0.1 (tags/RELEASE_801/final) (based on LLVM 8.0.1)
Target: amd64-unknown-openbsd6.7
Thread model: posix
InstalledDir: /usr/bin

I later tried running the tip-of-trunk build — with no special build options — under Valgrind, and it absolutely lost its mind. According to Valgrind on OpenBSD, Fossil is utter dreck.

...Or maybe something else is going on. :)

I do note that this segfault is happening below Fossil, in the libc printf() guts. I wonder if the same people that brought us randomly-linked kernels on each boot are playing some funny games down in there to try and stop buffer overflows and such from becoming security holes.

As a sanity check, I then built Fossil tip on FreeBSD 12.1-p10, and it builds and runs just fine, with UBSan.

I was initially wary of the fact that you're running a beta version of OpenBSD (6.8), but the same thing happens in earlier released versions, including the latest release, which ships with Clang 8.0.1, which I vetted with the FreeBSD test.

One more point of interest: I don't believe the UBsan link failure has a thing to do with Fossil specifically. I can't link "Hello, World!" with -fsanitize=undefined on OpenBSD 6.7, either. (Trivially reproduced, so I won't say more here.)

So...upgrade to FreeBSD? 😼

(18) By sean (jungleboogie) on 2020-09-18 06:01:11 in reply to 17 [link] [source]

Warren,

Thanks for the details. FYI, if you plan on using the VM, you can use syspatch(8) to bring it up to date.

So you've got clang 8.0.1 and I have 10.0.1 and your OS is older.

I wonder if the same people that brought us randomly-linked kernels on each boot are playing some funny games down in there to try and stop buffer overflows and such from becoming security holes.

It's not unusual for applications to crash and generate core dumps on OpenBSD, but it certainly makes this more of a puzzle to track down.

I was initially wary of the fact that you're running a beta version of OpenBSD (6.8)

Right, it's just marked beta because a release is happening "soon", probably mid-next month is my guess. And you're right, we can cause the segfault on 6.7 and 6.8.

## i386 and arm64 work with pikchrshow

So here's what I'm finding interesting...

This i386 netbook is NOT segfaulting on the pikchrshow or wikiedit page

kern.version=OpenBSD 6.8-beta (GENERIC.MP) #410: Tue Sep 15 12:58:24 MDT 2020
    deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP

$ cc -v
OpenBSD clang version 10.0.1 
Target: i386-unknown-openbsd6.8
Thread model: posix
InstalledDir: /usr/bin

It's a few commits behind trunk now: This is fossil version 2.13 [cd22f0f07d] 2020-09-18 01:21:09 UTC

I also have an arm64 board that isn't segfaulting on pikchrshow

$ sysctl kern.version
kern.version=OpenBSD 6.8-beta (GENERIC.MP) #805: Mon Sep 14 15:37:52 MDT 2020
    deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP

$ cc -v
OpenBSD clang version 10.0.1 
Target: aarch64-unknown-openbsd6.8
Thread model: posix
InstalledDir: /usr/bin

And a newer version of fossil: 58d86b16bf

## amd64 wikiedit page works

On the same machine (a VM in the cloud) that generates the segfault for pikchrshow, the wikieditor works. In fact, I'm drafting this post in the editor and viewing it with the preview.

snip of the Makefile to show it has O2:

CFLAGS = -g -O2
LIB =     -lfuse -lm -lssl -lcrypto -lz
BCCFLAGS =       $(CFLAGS)
TCCFLAGS =      -Wall -DFOSSIL_DYNAMIC_BUILD=1 -DFOSSIL_HAVE_FUSEFS  $(CFLAGS) -DHAVE_AUTOCONFIG_H -D_HAVE_SQLITE_CONFIG_H
INSTALLDIR = $(DESTDIR)/usr/local/bin
USE_SYSTEM_SQLITE = 0
USE_LINENOISE = 1
USE_MMAN_H = 0
USE_SEE = 0
FOSSIL_ENABLE_MINIZ = 0
APPNAME = fossil

So are we still thinking a compiler issue that's specific to amd64 arch only?

(37.1) By sean (jungleboogie) on 2020-09-26 05:44:31 edited from 37.0 in reply to 17 [link] [source]

Hi Warren,

If you have the time, can you please run your 6.7 VM again with tip of Fossil trunk and see if the pikchrshow still segfaults?

On my latest OpenBSD snapshot for amd64, it no longer segfaults! I just want to confirm your env still segfaults, and meanwhile, I'll set up a pristine 6.8 snapshot and see what happens.

I don't know how to show a list of commits between 17 Sept and today on github, so we could possibly have a clue as to what was fixed in OpenBSD src.

Needless to say, this is [hopefully] good news!

(38) By Warren Young (wyoung) on 2020-09-27 01:10:22 in reply to 37.1 [link] [source]

Symptom's gone now. I even tried your circle "Hello, world!" fit test: it rendered as expected.

Since this result means an upstream OpenBSD fix isn't responsible, it means the fix must have been done in Fossil proper or in pikchr.c.

I'd bisect it, but builds take so long on OpenBSD, and I'm late for a thing. Anyway, this is your itch, Sean.

(40) By sean (jungleboogie) on 2020-09-27 01:16:42 in reply to 38 [link] [source]

I'd bisect it, but builds take so long on OpenBSD, and I'm late for a thing. Anyway, this is your itch, Sean.

Have a nice evening and thanks for your efforts.

(39.1) By sean (jungleboogie) on 2020-09-27 01:19:12 edited from 39.0 in reply to 37.1 [link] [source]

Okay, great news!!

This wasn't a compiler issue in clang 8.0.1 or 10.0.1 on OpenBSD. This was a Fossil issue!

Stephan's commit, probably, inadvertently fixed this.

These are the commits I tested

good

  1. 38d6a8f30eb0d58a
  2. 052d37480927b601
  3. 9b2b6f5b1cc695e5

bad

  1. 82a0b517a78e44c0
  2. 6854244949f33472
  3. 43116c73fd
  4. 58d86b16bf2c233e

I still don't know/understand how only amd64 was affected by this, though.

edit... Just for posterity before I destroy this VM:

# sysctl kern.version && cc -v
kern.version=OpenBSD 6.7 (GENERIC.MP) #5: Tue Jul 21 13:50:07 MDT 2020
    root@syspatch-67-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

OpenBSD clang version 8.0.1 (tags/RELEASE_801/final) (based on LLVM 8.0.1)
Target: amd64-unknown-openbsd6.7
Thread model: posix
InstalledDir: /usr/bin

(41) By Stephan Beal (stephan) on 2020-09-27 05:04:39 in reply to 39.1 [link] [source]

Stephan's commit, probably, inadvertently fixed this.

If that was the fix then it was indeed inadvertent. That checkin was only cleanups to how the JS is emitted, as a precursor to inclusion of the pikchr JS in more pages.

Weird.

(42) By sean (jungleboogie) on 2020-09-27 06:02:16 in reply to 41 [link] [source]

It indeed is that commit. For whatever reason, "dom" causes the crash...

$ diff -u /home/sean/pikchrshow.c src/pikchrshow.c
--- /home/sean/pikchrshow.c     Sat Sep 26 22:50:38 2020
+++ src/pikchrshow.c    Sat Sep 26 22:50:51 2020
@@ -359,7 +359,7 @@
       } CX("</div>"/*#pikchrshow-output*/);
     } CX("</fieldset>"/*#pikchrshow-output-wrapper*/);
   } CX("</div>"/*sbs-wrapper*/);
-  builtin_fossil_js_bundle_or("fetch", "copybutton", "popupwidget",
+  builtin_fossil_js_bundle_or("dom", "fetch", "copybutton", "popupwidget",
                               "storage", "pikchr", 0);
   builtin_request_js("fossil.page.pikchrshow.js");
   builtin_fulfill_js_requests();

(13) By Stephan Beal (stephan) on 2020-09-17 22:32:49 in reply to 11 [link] [source]

Until UBsan clears the code, I think it's too early to be claiming "compiler bug."

OTOH, it's arguably either a bug in his compiler for choking on it or a bug in the rest of them for not choking on it ;).

(21) By anonymous on 2020-09-18 08:19:54 in reply to 10 [link] [source]

Does valgrind work on OpenBSD, then? It only needs -g to produce useful output, which we already have.

(22) By Stephan Beal (stephan) on 2020-09-18 08:24:20 in reply to 21 [link] [source]

Does valgrind work on OpenBSD, then? It only needs -g to produce useful output, which we already have.

Valgrind on fossil is... irksome. Not because of illegal memory access or whatnot, but because fossil's memory management policy is largely, "let the OS clean it up when the app closes in a few milliseconds," leading to what valgrind will report as dozens, if not hundreds of leaks. That makes sorting through valgrind's output a pain.

Can valgrind be told to ignore leaks and only take an interest in illegal memory access and similar violations?

(23) By Warren Young (wyoung) on 2020-09-18 08:58:05 in reply to 22 [link] [source]

To be clear, I think Valgrind may well be flagging things that are not Fossil's fault, but which partly explain this apparent Clang-on-AMD64 issue.

It spews pages and pages of noise on my OpenBSD VM, like nothing I've ever seen before.

(26) By Warren Young (wyoung) on 2020-09-18 17:45:38 in reply to 23 [link] [source]

Confirmed. On OpenBSD 6.7:

  $ valgrind --log-file=x fossil server ~/tmp/x.fossil
  ...wait for it to start, then Ctrl-C it...
  $ wc -l x
      4064 x

Then on FreeBSD after jumping through hoops to get Valgrind working, I get 28 lines of output in the log file, the same as on a recent 64-bit Linux.

I suspect the same causes behind Valgrind not being officially supported on FreeBSD are simply expressed in more acute form on OpenBSD.

(24) By anonymous on 2020-09-18 10:21:15 in reply to 22 [link] [source]

fossil's memory management policy is largely, "let the OS clean it up when the app closes in a few milliseconds,"

Which is a very good policy unless one is writing a library which may expect to be initialised and de-initialised multiple times per application lifetime, which Fossil isn't. And the code becomes much simpler as a result.

Can valgrind be told to ignore leaks and only take an interest in illegal memory access and similar violations?

To be fair, Valgrind doesn't elaborate by default on the kind of "leaks" where the pointer is still present in the address space somewhere and the application just chooses not to free() it before exiting. (It calls them "still reachable".) There are options to control what to report: --show-leak-kinds=definite --errors-for-leak-kinds=definite, or even --leak-check=no.

And I just had to upgrade Valgrind to the version from Debian Buster to match libc6 from Debian Buster; otherwise it reported a ton of uninitialised values originating on libc and couldn't find any leaks at all. (Just an anecdote, not a very useful one on OpenBSD.)

(9) By Stephan Beal (stephan) on 2020-09-17 21:12:52 in reply to 7 [link] [source]

Yes, this works! There's no segfault on pikchrshow or for wikiedit sandbox.

LOL!!! What are the odds of finding 2 compiler bugs, in different compilers and OSes, in one calendar week?!?!

C: circle fill yellow
down; L: line 0.4
left; arc cw from L.w
right; arc from L.e

(Still waiting on someone to draw me a better "hands raised in triumph" pikchr. ;))

Despite my jubilation at having unlocked a second achievement this week, it's still a disturbing segfault.

(27) By JohnQSmith on 2020-09-18 19:28:55 in reply to 9 [source]

(Still waiting on someone to draw me a better "hands raised in triumph" pikchr. ;))

How's this?

C: circle fill yellow
cylinder radius .1 width 40% height 50% fill black at .05 below C
cylinder radius .1 width 40% height 50% at .1 above previous color yellow fill yellow
circle radius .05 at .1 above .1 right of C fill black
circle radius .05 at .07 below previous color yellow fill yellow
circle radius .05 at .1 above .1 left of C fill black
circle radius .05 at .07 below previous color yellow fill yellow
L: line down from C.s thickness .04
arc up right from L.e thickness .04
dot thickness .05
arc cw up left from L.w thickness .04
dot thickness .05

(28) By Stephan Beal (stephan) on 2020-09-18 20:24:19 in reply to 27 [link] [source]

How's this?

Absolutely epic. You've just unwittingly become the first guest contribution in the Pikmojicon.

(29) By Martin Gagnon (mgagnon) on 2020-09-18 20:28:30 in reply to 27 [link] [source]

Safety First

C: circle fill yellow
cylinder radius .1 width 40% height 50% fill black at .05 below C
Z: cylinder radius .1 width 40% height 50% at .1 above previous color yellow fill yellow
circle radius .05 at .1 above .1 right of C fill black
circle radius .05 at .07 below previous color yellow fill yellow
circle radius .05 at .1 above .1 left of C fill black
circle radius .05 at .07 below previous color yellow fill yellow
L: line down from C.s thickness .04
arc up right from L.e thickness .04
dot thickness .05
arc cw up left from L.w thickness .04
dot thickness .05
A1: arc from C.nw to Z.sw
A2: arc cw from C.ne to Z.se
line from A1.end to A2.end then to C.se then to C.sw  close fill 0x999999


C2: circle at (2, 0) fill yellow
cylinder radius .1 width 40% height 50% fill black at .05 below C2
Z2: cylinder radius .1 width 40% height 50% at .1 above previous color yellow fill yellow
circle radius .05 at .1 above .1 right of C2 fill black
circle radius .05 at .07 below previous color yellow fill yellow
circle radius .05 at .1 above .1 left of C2 fill black
circle radius .05 at .07 below previous color yellow fill yellow
L2: line down from C2.s thickness .04
arc up right from L2.e thickness .04
dot thickness .05
arc cw up left from L2.w thickness .04
dot thickness .05
AA1: arc from C2.nw to Z2.sw
AA2: arc cw from C2.ne to Z2.se
line from AA1.end to AA2.end then to C2.se then to C2.sw  close fill 0x999999

arrow <-> "6'" above from 0.2 right of C.e to 0.2 left of C2.w

(30) By Stephan Beal (stephan) on 2020-09-18 20:35:57 in reply to 29 [link] [source]

(pic elided)

Also epic and also added to the pikmojicon!

PS: here in Germany we're more efficient, in that 1.5m suffices instead of 6' (1.83m). (Here's where that winky pikmoji belongs ;)

(31) By Richard Hipp (drh) on 2020-09-19 01:02:39 in reply to 30 [link] [source]

One of the key features of Pikchr, that I need to somehow highlight in the documentation, is that you can take a complex image written by someone else and easily change the "6'" to "6 feet" and put a separate "1.5 meters" below the line.

(32) By Kees Nuyt (knu) on 2020-09-20 14:37:27 in reply to 29 [link] [source]

Safety First

Nice!

# Impossible trident pikchr script
# https://en.wikipedia.org/wiki/Impossible_trident
# pikchr script by Kees Nuyt, license Creative Commons BY-NC-SA 
# https://creativecommons.org/licenses/by-nc-sa/4.0/

scale = 0.5
eh = 0.5cm
ew = 0.2cm
ed = 2 * eh
er = 0.4cm
lws = 4.0cm
lwm = lws + er
lwl = lwm + er

ellipse height eh width ew
L1: line width lwl from last ellipse.n
line width lwm from last ellipse.s
LV: line height eh down

move right er down ed from last ellipse.n
ellipse height eh width ew
L3: line width lws right from last ellipse.n to LV.end then down eh right ew
line width lwm right from last ellipse.s then to LV.start

move right er down ed from last ellipse.n
ellipse height eh width ew
line width lwl right from last ellipse.n then to L1.end
line width lwl right from last ellipse.s then up eh

Feel free to add this impossible trident to the pikchrs collection.

-- 
Regards,
Kees Nuyt

(34) By Stephan Beal (stephan) on 2020-09-20 17:53:02 in reply to 32 [link] [source]

Feel free to add this impossible trident to the pikchrs collection.

Done, thank you!

https://fossil.wanderinghorse.net/r/pikmojicon/doc/tip/pikchrs/impossible/

(33.1) By MBL (RoboManni) on 2020-09-20 15:41:35 edited from 33.0 in reply to 29 [link] [source]

Is it also possible to reuse a set with parameters, which could be another set or values? This duplication would then become simply two uses of such a set.

(35) By Stephan Beal (stephan) on 2020-09-20 17:58:53 in reply to 33.1 [link] [source]

Is it also possible to reuse a set with parameters,

pikchr doesn't currently support such reuse directly, but ideas in that direction are being floated on the pikchr forum; see, e.g., post /forumpost/7b0b304999. (Adding functions to the language is not, AFAIK, currently on the map, as that introduces complications like the possibility of infinite loops.)

However, in fossil it could currently be done with the pikchr CLI command and some embedded TH1.
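Until something like that exists in the language, the duplication can also be pushed into whatever generates the pikchr source. The following is only a sketch of that idea in Python rather than TH1 (a different tool from the one suggested above): the function name masked_smiley and its suffix/at parameters are invented here, and the statements are copied from the "Safety First" script with the labels renamed so that each copy gets its own.

```python
def masked_smiley(suffix, at=""):
    """Return pikchr source for one masked smiley.

    suffix: appended to every label (C, Z, L, A, B) so copies do not clash.
    at:     optional pikchr position for the head, e.g. "(2, 0)".
    """
    c, z, l = f"C{suffix}", f"Z{suffix}", f"L{suffix}"
    a1, a2 = f"A{suffix}", f"B{suffix}"
    place = f" at {at}" if at else ""
    return "\n".join([
        f"{c}: circle{place} fill yellow",
        f"cylinder radius .1 width 40% height 50% fill black at .05 below {c}",
        f"{z}: cylinder radius .1 width 40% height 50% at .1 above previous color yellow fill yellow",
        f"circle radius .05 at .1 above .1 right of {c} fill black",
        "circle radius .05 at .07 below previous color yellow fill yellow",
        f"circle radius .05 at .1 above .1 left of {c} fill black",
        "circle radius .05 at .07 below previous color yellow fill yellow",
        f"{l}: line down from {c}.s thickness .04",
        f"arc up right from {l}.e thickness .04",
        "dot thickness .05",
        f"arc cw up left from {l}.w thickness .04",
        "dot thickness .05",
        f"{a1}: arc from {c}.nw to {z}.sw",
        f"{a2}: arc cw from {c}.ne to {z}.se",
        f"line from {a1}.end to {a2}.end then to {c}.se then to {c}.sw close fill 0x999999",
    ])


# Two uses of the same "set", then the distance arrow between them.
script = "\n\n".join([
    masked_smiley("1"),
    masked_smiley("2", at="(2, 0)"),
    'arrow <-> "6\'" above from 0.2 right of C1.e to 0.2 left of C2.w',
])
print(script)
```

The printed text is ordinary pikchr source and can be pasted into /pikchrshow or saved to a file for rendering.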

(36) By Andreas Kupries (aku) on 2020-09-20 18:39:12 in reply to 35 [link] [source]

Note that my proposal, as referenced, does not allow for recursion and infinite loops.

You can use a symbol X only after its definition.

When you use X in the body of X, X is not defined yet. That happens only after the end of the body.

As such, trying to use X in X will result in an error.

Indirect use has the same problem of having to refer to some symbol not yet defined at some point.

All you get from the proposed changes is a DAG of definitions.
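The define-before-use rule can be illustrated with a small Python sketch. This is not code from the proposal or from pikchr; the function resolve and the (name, referenced names) input format are hypothetical stand-ins for "a definition and the symbols its body uses".

```python
def resolve(definitions):
    """Check definitions in source order under a define-before-use rule.

    definitions: list of (name, referenced_names) pairs.
    Returns a dict mapping each name to the names its body uses.
    Raises NameError when a body refers to a symbol that is not defined yet.
    """
    defined = {}
    for name, refs in definitions:
        for ref in refs:
            # `name` itself is not in `defined` yet, so self-reference fails here too.
            if ref not in defined:
                raise NameError(
                    f"{ref!r} used in body of {name!r} before it is defined"
                )
        # The symbol becomes usable only after the end of its body.
        defined[name] = tuple(refs)
    return defined


# Every reference points to an earlier definition, so the result is a DAG.
print(resolve([("A", []), ("B", ["A"]), ("C", ["A", "B"])]))
```

Direct recursion such as resolve([("X", ["X"])]) and indirect recursion such as resolve([("A", ["B"]), ("B", ["A"])]) both raise NameError, because at some point a body has to name a symbol that does not exist yet.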