Push fails with "waiting for server..."

(1) By anonymous on 2022-07-31 11:40:05

Hi,

when I try to push to my fossil repo I get the following:

---
$ fossil push -v
Push to https://my.domain.org/path/to/fossil/repo
                Bytes      Cards  Artifacts     Deltas
Sent:           39292        559          0          0
Received:       35576        502          0          0
waiting for server...
---

Then nothing happens and after some time the process exits.

In my web server log I do see this POST request (anonymized):

---
my.domain.org x.x.x.x - - [31/Jul/2022:11:27:09 +0200] "POST /path/to/fossil/repo HTTP/1.0" 200 0
---

The fossil.log remains empty.

Cloning/pulling does work however.

I checked the filesystem ownership/permission settings on the server as well as the fossil user permissions, and couldn't find anything wrong.

How could I further debug this?
I assume it would be good to somehow get more verbose output to find out why the push really fails.


Thank you very much.

(2) By Stephan Beal (stephan) on 2022-07-31 16:12:19 in reply to 1

> Cloning/pulling does work however.

i don't currently have any ideas, and don't recall this ever having come up before, but a few questions...

Does "sync" hang in the same way?
What version of fossil are you running?
What OSes (client and server)?

(3) By anonymous on 2022-07-31 22:22:44 in reply to 2

> Does "sync" hang in the same way?

Yes it does.

> What version of fossil are you running?

---
$ fossil version
This is fossil version 2.18 [84f25d7eb1] 2022-02-23 13:22:20 UTC
---

Same for server and client.

> What OSes (client and server)?

Both are OpenBSD.

(4) By Richard Hipp (drh) on 2022-07-31 22:33:11 in reply to 3

Try this:

fossil sync -v --httptrace

The -v option gives more output on standard out. Does that provide any more clues? The --httptrace option puts the complete text of each HTTP request and reply in files named http-*.txt in the current working directory. Perhaps those files might give insight into what is going wrong?

(6) By anonymous on 2022-08-04 10:04:07 in reply to 4

Update on the issue:

I noticed that the data to be synced contained a larger binary file (~50MB).
When I checked in only the other, smaller files, the sync worked without a problem.
I then added the binary file as an unversioned file and ran:

---
$ fossil sync -v -u --httptrace
...
                Bytes      Cards  Artifacts     Deltas
Sent:            3175         43          0          0
Received:        3715         54          0          0
Unversioned-file sent: path/to/large/file.bin
waiting for server...
---

I do get a file `http-request-2.txt` which contains the binary data of the file.
The file `http-reply-2.txt` remains empty.
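
For anyone triaging a larger set of trace files, here is a minimal sketch that lists every reply file that came back empty, i.e. a request that was sent but never answered. The function name `empty_replies` is made up for illustration; the only thing taken from the above is the `http-reply-*.txt` naming.

```shell
# List http-reply-*.txt trace files that are empty
# (request was sent, but no reply was ever written).
# Usage: empty_replies [DIR]   (DIR defaults to the current directory)
empty_replies() {
    dir=${1:-.}
    for f in "$dir"/http-reply-*.txt; do
        [ -e "$f" ] || continue          # glob matched nothing
        [ -s "$f" ] || printf '%s\n' "$f"  # -s is true for non-empty files
    done
}

empty_replies .
```

Each name printed has a matching `http-request-N.txt` holding the request the server never answered.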

Running `slowcgi` in the foreground with verbose mode eventually gives:

---
...
slowcgi: fork: /cgi-bin/scm
slowcgi: caught exit of unknown child 90939
---

Since I followed Fossil's OpenBSD FastCGI setup instructions [1], I doubt that it is a "max request body" problem. I have the following in my httpd.conf:

---
connection max request body 104857600
---

I suppose this may then be a non-Fossil issue.

It still would be nice to produce/find meaningful output that describes the origin of the problem. I will continue to debug this and hopefully find a solution.

Reading about the fossil sync protocol [2] is interesting.

[1] https://fossil-scm.org/home/doc/trunk/www/server/openbsd/fastcgi.md
[2] https://fossil-scm.org/home/doc/trunk/www/sync.wiki

(7) By Martin Gagnon (mgagnon) on 2022-08-04 10:54:26 in reply to 6

To be sure the problem is related to httpd/slowcgi/…, you can check whether it works when serving directly with fossil (by running an instance of fossil server directly).

It could help narrowing down the issue.

> It still would be nice to produce/find meaningful output that describes the origin of the problem

Maybe the fossil instance got killed by the kernel because of some system restriction. If that's the case, fossil cannot really produce any meaningful output, though OpenBSD might. Have you checked syslog or the httpd log for useful messages?
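
A rough sketch of such a log check, assuming the usual OpenBSD log locations (`/var/log/daemon`, `/var/log/messages`, and `/var/www/logs/error.log` for httpd). These paths are assumptions and depend on your syslog.conf and httpd.conf. The helper name `scan_logs` is made up for illustration.

```shell
# Print lines mentioning slowcgi or fossil from the given log files.
# Files that are missing or unreadable are skipped silently.
scan_logs() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            grep -E 'slowcgi|fossil' "$f" || true   # no match is not an error
        fi
    done
    return 0
}

# Assumed default locations; adjust to your setup (may need root to read).
scan_logs /var/log/daemon /var/log/messages /var/www/logs/error.log
```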

(8) By anonymous on 2022-08-04 13:39:09 in reply to 4

I believe I have found the answer to the problem:

The default timeout of OpenBSD's slowcgi is 120s [1]. After that, the process is "cleaned". I recompiled slowcgi with a 600s timeout, and now the request goes through.
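
To put numbers on it, here is a back-of-the-envelope sketch of the sustained upload rate needed to move the ~50MB file within each timeout. It assumes the entire timeout is available for the transfer, which is optimistic since the server also needs time to process the request. The helper name `min_rate` is made up for illustration.

```shell
# Minimum sustained throughput in bytes/s (rounded up) needed to
# transfer SIZE bytes within TIMEOUT seconds.
# Usage: min_rate SIZE TIMEOUT
min_rate() {
    size=$1
    timeout=$2
    echo $(( (size + timeout - 1) / timeout ))
}

min_rate $((50 * 1024 * 1024)) 120   # prints 436907 (roughly 427 KiB/s)
min_rate $((50 * 1024 * 1024)) 600   # prints 87382  (roughly 85 KiB/s)
```

So on an uplink slower than roughly 427 KiB/s, a 50MB push cannot finish inside the default 120s window. As far as I know, later OpenBSD releases document a `-t timeout` option in slowcgi(8), which would avoid recompiling; check the man page on your release before relying on it.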

So fossil works just fine. 

Thank you very much!

[1] e.g. https://github.com/openbsd/src/blob/master/usr.sbin/slowcgi/slowcgi.c#L41

(5) By Kees Nuyt (knu) on 2022-08-01 10:53:15 in reply to 1

> I did check the filesystem ownership/permissions settings on the server

Sometimes overlooked: both the repository and the directory it resides in have to be writable by the user the web server runs as.

Fossil User Forum
