freeing memory

(1) By Alan Bram (flyboy) on 2023-10-04 22:08:48 [link] [source]

A question about the internal operation of the Fossil code:

Should the caller of db_get() eventually free the returned memory by calling fossil_free()? Looking at db_get_manifest_setting() I can't see where that is happening.

(P.S. Is it OK to ask questions about Fossil code internals here on this forum?)

(2) By Stephan Beal (stephan) on 2023-10-04 22:16:22 in reply to 1 [link] [source]

(brief response from a tablet...)

Should the caller of db_get() eventually free the returned memory by calling fossil_free()?

If it returns non-const char ptr then yes. Fossil's internals very often specifically do not free memory, relying on the OS to do so when it closes. Memory allocated in loops, or in large amounts, is managed carefully, but small amounts for things like db_get() are often intentionally leaked because we know the app is only going to run for a single command and then exit. It took me a few years to come to grips with that approach, but it has a long and glorious history in this project.

(P.S. Is it OK to ask questions about Fossil code internals here on this forum?)

Absolutely.

(3) By Alan Bram (flyboy) on 2023-10-04 23:41:29 in reply to 2 [link] [source]

the app is only going to run for a single command and then exit

I was worried about the case of a long-running standalone HTTP server. But now that I look at the docs again, I see I missed the part where it says it "dispatches a copy of itself to deal with each incoming request."

Fair enough. Thank you!

(4) By Andy Bradford (andybradford) on 2023-10-06 03:53:21 in reply to 3 [link] [source]

> I was worried about the case of a long-running standalone HTTP server.

Actually, your concern about the  long-running standalone HTTP server is
still valid. While it spawns off a new fossil process for every request,
the  parent process  itself  could still  have a  memory  leak (I'm  not
suggesting that it  does, only that because it is  a long-running server
there is the potential for mischief).

Andy

(5) By Warren Young (wyoung) on 2023-10-06 12:20:23 in reply to 4 [link] [source]

I've looked into this, and Fossil doesn't even try to open the repo DB in the fossil server parent process. It waits until it receives the first connection, and it repeats that on each subsequent connection.

While this does mean fossil server benefits from the anti-leak protection of a modern OS's resource management during process shutdown, it means that those of us running Fossil servers in this model pay the cost of repeatedly opening the repo DB on each and every HTTP hit.

More, Fossil does other expensive OS calls like checking file permissions to manage the chroot process. Yes, in principle, this means Fossil could chroot itself as a different user on each HTTP hit if you change the file ownership in between!

I suppose the counterargument is, if you opened the repo DB in the parent in order to pay the costs of parsing the schema and such once on boot, how do you pass ownership of the DB down to the children without creating a resource conflict any time Fossil ends up handling 2+ HTTP hits in parallel?

(6) By Stephan Beal (stephan) on 2023-10-06 12:23:51 in reply to 5 [link] [source]

I suppose the counterargument is, if you opened the repo DB in the parent in order to pay the costs of parsing the schema and such once on boot, how do you pass ownership of the DB down to the children without creating a resource conflict any time Fossil ends up handling 2+ HTTP hits in parallel?

From what is currently section 2.6 in https://www.sqlite.org/howtocorrupt.html:

Do not open an SQLite database connection, then fork(), then try to use that database connection in the child process. All kinds of locking problems will result and you can easily end up with a corrupt database. SQLite is not designed to support that kind of behavior. Any database connection that is used in a child process must be opened in the child process, not inherited from the parent.

Do not even call sqlite3_close() on a database connection from a child process if the connection was opened in the parent. It is safe to close the underlying file descriptor, but the sqlite3_close() interface might invoke cleanup activities that will delete content out from under the parent, leading to errors and perhaps even database corruption.

(8) By Alan Bram (flyboy) on 2023-10-06 22:15:15 in reply to 5 [source]

... those of us running Fossil servers in this model pay the cost of repeatedly opening the repo DB on each and every HTTP hit.

More, Fossil does other expensive OS calls ...

Yeah, I'm a little surprised that Fossil doesn't support any way to run everything in one process, especially since there are so many other ways to run it. I guess I just assumed (without reading the docs carefully enough) that the standalone HTTP server configuration was single-process.

I wonder: is this a recommended pattern for Sqlite apps in general?

(9) By Warren Young (wyoung) on 2023-10-06 23:10:30 in reply to 8 [link] [source]

Using SQLite In Multi-Threaded Applications.

But also, see SQLite FAQ item 6.

(10) By Andy Bradford (andybradford) on 2023-10-07 03:17:41 in reply to 5 [link] [source]

> Fossil  doesn't even  try to  open the  repo DB  in the  fossil server
> parent process. It  waits until it receives the  first connection, and
> it repeats that on each subsequent connection.

I think this is  a sound design decision. It keeps  the code simpler and
also  minimizes the  amount of  mischief that  can be  done in  the long
running parent process.

> those  of us  running Fossil  servers in  this model  pay the  cost of
> repeatedly opening the repo DB on each and every HTTP hit.

Don't OSes cache files that are opened repeatedly like this?

Andy

(11) By Warren Young (wyoung) on 2023-10-07 05:49:13 in reply to 10 [link] [source]

The cost I speak of is the time needed to load and parse the DB schema. It is because this takes significant time that it is standard advice not to open and close the DB for each operation, yet this is exactly what a Fossil server does, even for rendering semi-static pages like the project's /home page.

This is also why SQLite connection pools exist even though there is no TCP connection as with a client/server DBMS. They keep the sqlite3* DB handle open between uses in order to amortize this expense across multiple uses of that handle.

The primary saving grace here is that most Fossil servers aren't all that busy. It would be worth optimizing if you had sustained rates of dozens of hits per second.

(7) By Alan Bram (flyboy) on 2023-10-06 22:09:35 in reply to 4 [link] [source]

parent process itself could still have a memory leak

Sure. But I told myself maybe these guys were careful to avoid memory leaks in this limited portion of the code; and then just relaxed about it in the bulk of the code that isn't long-running. But I didn't really verify that.

Also, I didn't really mean to focus on the DB operations. I was just using that as the example that I happened to be noticing.