panic: Segfault during process_one_web_page
(1.1) By Alfred M. Szmidt (ams) on 2022-05-11 09:49:30 edited from 1.0
I keep getting:
panic: Segfault during process_one_web_page
with the latest trunk [0833f7225b] on OpenBSD 7.1 when trying to visit the Repository List. I can reproduce this using:
fossil server /var/www/htdocs/fossil --repolist
Any tips on debugging?
(2) By mark on 2022-05-11 10:56:48 in reply to 1.1
I can't reproduce on OpenBSD 7.1-current or 6.9-release with either a
debug or a release build of trunk.
Can you run a backtrace on the core file?
(3) By Richard Hipp (drh) on 2022-05-11 11:09:25 in reply to 1.1
Please try again with the latest trunk check-in.
(4.1) By Stephan Beal (stephan) on 2022-05-11 11:12:49 edited from 4.0 in reply to 1.1
> Any tips on debugging?
The first thing to try, assuming you're working from a checkout which has been used to build multiple versions, is "make clean" and then reconfigure and rebuild. Once in a blue moon dependencies don't quite work and we end up linking an old/binary-incompatible object file in there somewhere. It doesn't happen often, but when it does it often results in weirdness like inexplicable segfaults.
Edit: never mind - Richard's concurrent response seems to point at the more likely culprit.
(5) By Alfred M. Szmidt (ams) on 2022-05-11 11:11:57 in reply to 2
No core dump is produced; this is a call to fossil_panic() or something similar, which doesn't dump core AFAIU. Did you compile with -static?
(6) By Alfred M. Szmidt (ams) on 2022-05-11 11:16:36 in reply to 3
That does the trick, thank you!
(7) By mark on 2022-05-11 11:31:35 in reply to 5
Both with ./configure --static and without. I've no idea why I can't
reproduce it. But I now see Richard's fix.
I just looked at the code: Richard's installed a segv handler, which is
why there's no core file. It looks like it provides a backtrace on some
platforms, though.
(8) By Alfred M. Szmidt (ams) on 2022-05-11 11:37:15 in reply to 7
Yeah, on GNU systems it will do a nicer backtrace. OpenBSD lacks backtrace(3), I think some of the other BSDs might have it -- one can get around it by using libexecinfo on OpenBSD if one wants. That might be a nice thing to do to the configure script, check if the library exists, and define HAVE_BACKTRACE ...
(9) By Richard Hipp (drh) on 2022-05-11 11:40:08 in reply to 4.1
I found the problem by running:
valgrind ./fossil server /home/drh/www/repos --repolist
Valgrind told me the exact source code line where the problem was occurring. From there, the fix was easy.
(10.1) By mark on 2022-05-11 11:52:53 edited from 10.0 in reply to 8
I discovered that recently when looking to install a segv handler in
fnc. When I realised I couldn't show the trace on base OpenBSD, I
scrapped the idea. The BUGS section in 6.9's backtrace(3) is funny
but I can't bring it up on https://man.openbsd.org for some reason.
That's a good idea about tweaking the configure script though.
ETA: copypasta from my local manpage
BUGS
As typical with GNU software the interface is clumsy and error prone.
While writing a more sophisticated backtracing mechanism it was obvious
that the GNU functionality could be trivially emulated.
Due to a bug in gcc one has to compile applications with the following
flags -Wl,--export-dynamic in order to get human readable function names.
(11) By george on 2022-05-11 21:39:58 in reply to 3
Thank you for handling the issue while I was away!
I'm sorry for the bother; I did not expect that g.zPath may be NULL.
@ams, thank you for reporting that issue.
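For readers wondering what a guard against a NULL g.zPath might look like, here is a minimal sketch. It is not the actual fix from trunk: struct Global below is a hypothetical stand-in holding only the zPath field, and path_has_prefix is an invented helper used purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of the global state. */
struct Global {
  const char *zPath;   /* may be NULL, e.g. when no path was given */
};

/* Return 1 if p->zPath is set and begins with zPrefix, otherwise 0.
** The NULL test must come first: passing NULL to strncmp() is
** undefined behavior and typically ends in a segfault. */
int path_has_prefix(const struct Global *p, const char *zPrefix){
  if( p->zPath==NULL ) return 0;
  return strncmp(p->zPath, zPrefix, strlen(zPrefix))==0;
}
```

The point is only that every use of the pointer sits behind the NULL check; where exactly such a check belongs in the real code depends on the call site.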