Fossil User Forum

`-lpthread` is required to compile a statically linked Fossil

(1) By Marcos Cruz (programandala.net) on 2020-12-14 13:03:14 [source]

I've been trying to compile Fossil with ./configure --static in order to run it on my shared hosting, where I have SSH access but no permission to run gcc.

The compilation displayed many warnings about undefined pthread_ references and finally failed. After some research I added -lpthread at the end of the LIB value in the Makefile, and it worked.
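
To be concrete, the LIB line in my Makefile ended up something like this (the exact set of libraries varies with configure options):

 LIB = -static -lfuse -lm -lresolv -lssl -lcrypto -lz -ldl -lpthread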

Still, I have a question: was -lpthread omitted from the Makefile because of a bug in configure, or because of some misconfiguration in my Debian?

Thank you.

(2) By Warren Young (wyoung) on 2020-12-14 18:35:38 in reply to 1 [link] [source]

"Debian" isn't enough for me to replicate your symptom. My local Buster VM doesn't even allow static linkage due to changes in glibc, so you must be on a materially different version of Debian.

I've attempted to address this issue on the new allow-static-openssl branch. Please give it a try. I won't merge it to trunk until I can prove that it actually produces a runnable build, which I can't do locally given my local set of test VMs.

Alternately, provide enough info that would allow me to build a VM matching your environment:

 $ uname -a
 $ lsb_release -a

etc.

(5) By Marcos Cruz (programandala.net) on 2020-12-16 12:59:54 in reply to 2 [link] [source]

"Debian" isn't enough for me to replicate your symptom.

Of course. Sorry. Since I had gotten the statically linked compilation to succeed by modifying the Makefile, my intention was just to find out whether someone had had a similar issue before investigating further.

I compiled Fossil 2.14-preview-20201125 on an updated Debian 10.7 "buster", 64-bit Intel.

 $ uname -a
 Linux 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux

I've attempted to address this issue on the new allow-static-openssl branch. Please give it a try

Thank you. Now configure --static adds -lpthread to LIB in the Makefile:

Before (v2.14-preview-20201125):

LIB =    -static -lfuse -lm -lresolv -lssl -lcrypto -lz -ldl

Now:

LIB =    -static -lssl -lcrypto -lpthread -lfuse -lm -lresolv -lssl -lcrypto -lz -ldl

But the compilation fails as before, after displaying many "pthread_" undefined references, and some "sem_" ones.

When I move -lpthread to the end of the list, it works.
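
If I understand correctly, that is because a static link resolves archives left to right, so a library must appear after the archives that reference its symbols. A minimal illustration, with a hypothetical main.o:

 $ cc -static main.o -lpthread -lcrypto   # fails: pthread_* refs in libcrypto.a stay unresolved
 $ cc -static main.o -lcrypto -lpthread   # links: -lpthread follows the archive that needs it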

(6) By Warren Young (wyoung) on 2020-12-16 20:15:17 in reply to 5 [link] [source]

drh's recent change on trunk should work for you, then.

(7) By Marcos Cruz (programandala.net) on 2020-12-17 15:18:19 in reply to 6 [link] [source]

Now ./configure --static creates a Makefile with this line:

LIB =    -static -lpthread -lm -lresolv -lssl -lcrypto -lz -ldl

But the compilation fails after displaying 11 messages like this, about different "pthread_" undefined references in different functions of the same library:

/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/libcrypto.a(threads_pthread.o): in function `CRYPTO_THREAD_get_local':
(.text+0x143): undefined reference to `pthread_getspecific'

Again, when I move -lpthread to the end of the line, make clean;make works.

(8) By Warren Young (wyoung) on 2020-12-18 00:00:03 in reply to 7 [link] [source]

I've made another change to it which should solve the problem.

(9) By Marcos Cruz (programandala.net) on 2020-12-18 01:02:35 in reply to 8 [link] [source]

Yes, now it works without touching the Makefile. Thank you!

(3) By Richard Hipp (drh) on 2020-12-14 19:08:48 in reply to 1 [link] [source]

Fossil itself does not use pthreads. It is single-threaded. (The SQLite build uses SQLITE_THREADSAFE=0 so that SQLite does not attempt to use pthreads either.)

I'm guessing that the pthread requirement comes about because you are linking against OpenSSL.
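
One way to check that guess is to list the undefined pthread symbols in the static OpenSSL library (the path below is the usual Debian location; adjust as needed):

 $ nm /usr/lib/x86_64-linux-gnu/libcrypto.a | grep ' U pthread_'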

(4) By Warren Young (wyoung) on 2020-12-14 19:17:36 in reply to 3 [link] [source]

Almost certainly, which is why my branch only affects proc check-for-openssl in auto.def.

(10) By george on 2020-12-19 19:02:23 in reply to 3 [link] [source]

Fossil ... is single-threaded

This is surprising, especially in the context of a long-running SCGI server.

Please provide some documentation of both the threading model and the database-locking model of Fossil.

  1. What happens on simultaneous HTTP requests? Does it fork?
  2. What if a CLI is invoked during HTTP request (or vice-versa)?
  3. How can a program safely and efficiently read and write a repository that is being served with fossil server?
  4. What if this server is under high load?
  5. What if this program is invoked through a CGI extension?
  6. Are there any special recommendations for the Wapp scripts?
  7. What is the recommended connection string for Go programs?

(11) By Richard Hipp (drh) on 2020-12-19 20:57:39 in reply to 10 [link] [source]

Fossil uses a process model. Instead of starting a new thread, it starts a new process. A process is, after all, just a thread with its own private address space. A process is better than a thread, because a process has its own address space and hence there are never any concerns about concurrency control.

  1. Every HTTP request is handled by a separate process. Also every SCGI request. The fork() system call is very fast (on Linux).
  2. The CLI is obviously running in a separate process from any servers that happen to be on the same system.
  3. Multiple processes can read/write the repository, as SQLite handles concurrent access to the repository database automatically. Fossil doesn't have to worry about that.
  4. Server load has nothing to do with it. Every request is a separate process.
  5. Every CGI request is a separate process. This is how CGI works. This is true of all CGI programs, not just Fossil.
  6. Wapp is also single-threaded.
  7. I don't have any idea how to do concurrency in Go.

Perhaps you've been told that multi-threading is necessary for efficiency. Fossil stands as a proof by example that this thesis is wrong. It is quite possible, and in fact not all that difficult, to be lightweight, efficient, and responsive while still delegating each HTTP/SCGI/CGI request to a separate process. Remember, Fossil runs easily on a $5/month VPS or on a Raspberry Pi. Competing multi-threaded systems (ex: GitLab) require an order of magnitude more processing power, or so I'm told.
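
You can even see the process-per-request design from the command line: fossil http handles exactly one HTTP request from standard input and then exits (repository path illustrative):

 $ printf 'GET /timeline HTTP/1.0\r\n\r\n' | fossil http /path/to/repo.fossil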

(12.1) By Warren Young (wyoung) on 2020-12-20 05:40:53 edited from 12.0 in reply to 10 [link] [source]

Adding to drh's comments, point-by-point:

  1. On Windows, it's complicated. The initial HTTP request is handled by a thread, but it writes info from the HTTP request out to temporary files and passes them to a subprocess, which reads them back in. Other operations that use fork() on Linux likewise use subprocesses in native Windows builds of Fossil, though with stdin/stdout tied to the parent via Win32 pipes instead of temporary files.

    It'd be interesting to benchmark server operations of a native Windows Fossil server versus operation on the same hardware under WSL2.

    WSL1 and Cygwin should be slower than either, but I'd encourage someone sufficiently interested to benchmark them, too.

  2. SQLite transactions arbitrate access. Under some conditions, you can get lock contention that requires one of the two processes to retry. This usually indicates that you're trying to do something like fossil serve on the same repo DB file you're committing directly to, which you can fix by committing to a clone rather than to the same DB file, so that arbitration goes through the sync protocol instead of through the lower-level SQLite layer.

  3. Ditto.

  4. Define "high." Is it higher than sqlite.org, for which we have an existence proof of adequacy? It achieves that in part by using features documented in "Managing Server Load".

    If instead we're talking about something far less busy than sqlite.org — which I'd guess is almost all uses of Fossil — then the only way I see that you can get into trouble short of a DDoS attack is underprovisioning your server. You might need a $10/mo VPS instead of a $5/mo VPS, for example.

  5. Then you have a process-per-connection model, so see point #1. (I.e., how many processes fit into RAM under fork() vs. re-exec models depends on the host OS, etc.)

  6. No idea.

  7. I'm reading an implication that you're doing SQL queries against the Fossil repo being served, which raises many new questions. If so, please justify those uses versus, say, "fossil sql" commands, which delegate such details back to Fossil, which we may presume does things correctly. (See the sketch below.)
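
For instance — a sketch, with an illustrative repository path and a hypothetical fx_sensors table; fossil sql is built on the sqlite3 shell, so it reads statements from standard input:

 $ echo 'SELECT * FROM fx_sensors;' | fossil sql -R /path/to/repo.fossil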

(13.1) By george on 2020-12-21 22:34:57 edited from 13.0 in reply to 10 [link] [source]

Richard and Warren, thank you for the explanations!

Define "high"

In my brave dreams that's on the order of a thousand users that actively read (on the order of 5 requests per minute per user on average) and occasionally write (on the order of 5 requests per hour per user on average). In aggregate that's roughly 1000 × 5 / 60 ≈ 83 read requests per second and 1000 × 5 / 3600 ≈ 1.4 write requests per second.

doing SQL queries against the Fossil repo being served

Yes. Thank you for your sagacity.

please justify

Basically three reasons:

  • ability to insert data into fx_* tables (and use it from within Fossil)
  • ability to scale up the rate of write requests from these, hmm... let me name them sensors.
  • it just "feels wrong" to finagle with formatting/IO/parsing when a direct API is (seemingly) available.

lock contention

Yes, that was on my mind. I thought that the OS's kernel can manage in-process spin-locks, mutexes and semaphores much faster than filesystem locks. But I may be wrong on that. I'm in no way a kernel guru.

What I'm actually more anxious about is stability/correctness. SQLite is a Magic Wand that Just Works, but one still has to know "the right spell" (e.g. database opening parameters).

fossil sql

Yes, I intend to employ native Fossil commands whenever there is a risk to mess up the repository.

UPDATE: I'll try to learn from The Guru :-)

(14) By Warren Young (wyoung) on 2020-12-22 00:50:13 in reply to 13.1 [link] [source]

5 [read] requests per minute per user…5 [write] requests per hour per user…

We should probably take that up on the SQLite forum, but briefly, you should be okay. The main difficulty is SQLite's strong locking and single-threaded nature, which limits you to ~60 TPS on a fast spinning disk (a durable commit takes on the order of two platter rotations, and a 7200 RPM disk turns 120 times per second), but orders of magnitude more on an SSD.

Modern SSDs can be made strong enough to withstand such TPS rates through the expected life of the drive.

ability to scale-up the rate of write-requests

What sort of scaling are you talking about?

There's vertical scaling, where the hardware's the main limit on TPS rates and such. This is what I've been talking about so far.

There's also horizontal scaling, but I don't see how it would apply here. That would imply that you've got a cluster of Fossil servers and are somehow syncing them, but that doesn't happen with fx tables and other things you're talking about.

a direct API is (seemingly) available.

Are you speaking of this?

I thought that the OS's kernel can manage in-process spin-locks, mutexes and semaphores much faster than filesystem locks.

I don't see how you can avoid filesystem locks for an ACID-compliant DB stored on a filesystem. See this doc for details on SQLite's approach.

Ignore the big red hints to go read the WAL docs instead. Fossil doesn't enable WAL by default, so it's unlikely the WAL docs apply to your use cases unless you go out of your way to enable it.
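
A quick way to verify on your own repo, assuming the sqlite3 shell is installed (path illustrative):

 $ sqlite3 /path/to/repo.fossil 'PRAGMA journal_mode;'   # prints "delete" unless WAL was enabled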

I asked why we don't use WAL by default once, but got no answer, so here's me asking again... :)

(15) By anonymous on 2024-07-14 12:40:05 in reply to 3 [link] [source]

I am looking to determine the RAM allocation required on a container-based server to run several Fossil repo servers in that container without crashes. The RAM requirement is presumed to be 3 × the maximum blob size: one blob to hold the new blob to be checked in, a second to hold the latest blob version already checked in, and a third to hold the newly computed reverse delta that replaces the current version. The command-line commands will be executed through Fossil CGI script server extensions.

Single-threaded is good because it keeps RAM consumption from memory-intensive operations (like Fossil check-ins and checkouts, which involve building or extracting reverse deltas) more predictable, and so the server is less likely to crash from running out of memory. These operations are not time-critical, so a single thread is not usually a problem. Some additional RAM would be required for other, non-Fossil processes, which can be concurrent.

Does single-threaded mean a single thread per running Fossil repo server, or a single thread across all Fossil CLI commands per computer? In other words, do Fossil server processes for several repos running at the same time on the same computer share a common lock file ensuring that only one command-line Fossil command runs at a time, or would I need to implement a lock file myself for expensive processes to achieve that?

(16) By Stephan Beal (stephan) on 2024-07-14 13:35:06 in reply to 15 [link] [source]

Does single-threaded mean a single thread per running Fossil repo server, or a single thread across all Fossil CLI commands per computer?

Fossil is a single-threaded application. It does not start any new threads. Its standalone server mode forks a new instance for each request, rather than managing threads to do so.

A single-CPU system, however, can start as many concurrent instances (processes) of fossil as it wants to. The majority of the time, that will work without any sort of locking errors. The exception is when some unusually long operation locks the database for "too long" and causes locking errors in other instances. Fossil uses different busy-wait times for different contexts, anywhere from 5 to 15 seconds.

In other words, do Fossil server processes for several repos running at the same time on the same computer share a common lock file ensuring that only one command-line Fossil command runs at a time,

No.

or would I need to implement a lock file myself for expensive processes to achieve that?

Why do you feel that such a lock is necessary?

The locking is managed automatically by sqlite, but that's per database (i.e. repository), not per machine.

(18) By Konstantin Khomutov (kostix) on 2024-07-15 12:13:24 in reply to 16 [link] [source]

Why do you feel that such a lock is necessary?

I believe it naturally follows from the OP's requirement to have bounded RAM usage: they have come up with a (made-up, but anyway) prediction of the memory consumption of a single working Fossil process, and then seek a way to ensure there exists at most one working Fossil process in the container at any given time.

(19) By Warren Young (wyoung) on 2024-07-15 13:07:26 in reply to 18 [link] [source]

All that is plausible, but anonymous #15's wish remains a strange way to approach the solution, on multiple axes:

  1. Choking everything down to a single thread to avoid DoS isn't going to guarantee success. You can exceed the system's max VM set size with a single process.

  2. The idea that you need to do this at all is wrong-headed. On a system hosting many Fossil repos, it is most likely that only one is active at any given instant. Under that condition, it is already as single-threaded as it needs to be.

  3. All this concern about server-side crashes overlooks the fact that you can always commit to the local repo and push your changes later. The OP said Fossil service isn't time-critical; fine, then, keep trying the push periodically until it succeeds.

If there are other services running on the Fossil host that need guaranteed VM space, run the containers under Podman inside one of its eponymous pods, and set a memory limit on it:

$ podman pod create --memory 1G …

My Fossil containers run under Podman 24/7 in an instance-per-repo model, though to be transparent, I don't bother running them all under a single pod. They run as individual containers, for better isolation.

(It's possible to have a pod with better isolation than the default by playing with podman pod create --share.)
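
For example (pod name hypothetical; see podman-pod-create(1) for the valid namespace values):

$ podman pod create --name fossilpod --memory 1G --share net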

(20) By anonymous on 2024-07-17 17:22:46 in reply to 16 [link] [source]

I am intending to run several repos in the same container, with one master repo, containing all the users who might log into the container, being used to provide single sign-on through group logins. Users would be added to or removed from the other repos to grant or revoke access to them. I also want to provide Unix login access to only one sysadmin user, and provide access to common command-line functions such as check-in and checkout through CGI server extensions. This setup could correspond to a company with all its employees registered in the master repo and the other repos representing various projects. It could also represent a large project with the master repo containing open-source code and the other repos representing various project modules using proprietary code.

While I accept that no data in the repos should be lost if the whole OS crashes, and that the container can be restarted automatically, there is still the problem of people not realising, or forgetting, that a particular operation like a check-in or checkout might not have gone through, and there is also the risk of corruption of files in the working directory going unnoticed, particularly if the crash was caused by an operation on another repo. I am also trying to reduce the maintenance overhead associated with checking the status of files and rolled-back operations.

By providing for certain common command-line operations to be executed by server CGI extensions, I am trying to reduce the maintenance overhead of managing Unix users on the server. (I am proposing to mount an external SMB or NFS network share whose user access is managed by others, use a single no-login Unix user to access the working directories from the server, and limit the command-line operations carried out by the command-line users to their respective roles on their repos through the CGI extensions.)

I am planning to implement this either by adding to the existing SQL table for web GUI user capabilities (if that can be done without breaking things) or by adding another table to the repo to hold these. Basically the CGI script would check whether the user has the capability to do the operation, and if so would execute it; the check-in and checkout operations would log the web GUI user ID instead of the Unix user ID.

A global lock file would be easy to implement: the CGI script would check it and either reply "busy, try again" and quit, or else queue the operation and report success via email or chat when complete. Since there will be relatively few intensive operations, limiting the server to one RAM-intensive operation at a time should not cause significant delay. A lock file is fairly trivial to implement in that context.
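
A rough sketch of the gating script I have in mind, assuming Fossil's /ext CGI environment provides REMOTE_USER and FOSSIL_CAPABILITIES (check the server-extensions docs); the lock path and the capability letter are placeholders:

  #!/bin/sh
  # Sketch only: gate an expensive operation behind a capability check
  # and a global lock. "i" (check-in) and the lock path are placeholders.
  LOCK=/var/lock/fossil-heavy
  case "$FOSSIL_CAPABILITIES" in
    *i*) ;;                                 # user has the needed capability
    *) printf 'Status: 403 Forbidden\r\n\r\n'; exit 0 ;;
  esac
  if ! mkdir "$LOCK" 2>/dev/null; then      # mkdir is atomic: a simple global lock
    printf 'Content-Type: text/plain\r\n\r\nbusy, try again\n'
    exit 0
  fi
  trap 'rmdir "$LOCK"' EXIT
  printf 'Content-Type: text/plain\r\n\r\n'
  # ... run the expensive fossil command here, logging $REMOTE_USER ...
  echo "done (requested by $REMOTE_USER)"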

Is there any better way to achieve the above? Also, would it be better to extend the CGI extension user capability through an additional SQL table, or by adding to the existing RBAC capabilities?

(17) By Andy Bradford (andybradford) on 2024-07-14 20:08:15 in reply to 15 [link] [source]

> In other words, do Fossil server processes for several repos running
> at the same time on the same computer share a common lock file
> ensuring that only one command-line Fossil command runs at a time, or
> would I need to implement a lock file myself for expensive processes
> to achieve that?

There is no common lock that would permit only one client to spawn a
Fossil process in the way you're describing it. If you're using HTTP as
transport, you could use tcpserver to limit it to one connection at a
time, which would give the effect that you describe.

If you really want to limit it to one process, the following only allows
one Fossil instance to run at a time on port 11072:

tcpserver -c 1 -vDRHl0 10.7.4.2 11072 chroot -u _fossil /var/fossil /usr/local/bin/fossil http /fossils

Keep in mind that when Fossil uses HTTP as a transport, it connects
multiple times per sync/clone transfer, and as a result connections will
naturally balance themselves; so this will not necessarily be noticeable
by clients, as each request will use a separate TCP connection and it
will have the appearance of running parallel requests, but in reality
TCP connections will queue up if the transfer size is large and the
network bandwidth is small.

Andy