Advice about repo organization

(1) By Steve Schow (Dewdman42) on 2019-08-12 19:06:50

I would like some advice on how some of you would recommend organizing repos. I realize there are probably no right or wrong answers here, but I'd appreciate your experience and insights from working with fossil.

Basically, it's pretty clear to me that when I have a larger project to work on, involving numerous source files affecting a single end product, it makes sense for that collection of source files to be in its own repo. That repo would have its own wiki, its own tickets, etc.

However, I also use fossil to manage a larger collection of much smaller tools; many of them are single-file scripts, for example. I am not at all sure when is the best time to split them off into a separate repo. Some of them depend on test scripts and other things that are shared among different scripts. Also, for these I prefer to have one monolithic ticket system to keep track of all the various things I am working on. But I do find the timeline becomes very convoluted with checkins from different scripts happening here and there... or I can get into lots of branching and merging too, I suppose, but that seems like overkill for single-file scripts.

I also use this monolithic fossil repo to manage many of my system configuration files, etc.

I guess the main thing is that unless a project is large enough to warrant its own dedicated issue tracker, I want just one issue tracker for all the misc config files and semi-related scripts I might be working on. I am wondering what suggestions people have for how best to manage lots of scripts, each of which might be evolving quite a lot individually, with lots of checkins over time, and each fairly independent in purpose, as if it were a standalone product. They are simply not big enough of a "thing" to warrant a completely separate repo, with separate issue tracking, a separate wiki and all the rest.

Yet I need to be able to see what is going on with one of them in isolation.

What approaches are some of you using for this kind of situation?

(2) By Warren Young (wyetr) on 2019-08-12 19:32:17 in reply to 1

First off, system configuration file repos should be separate from everything else, period, if only for security purposes. Also, they tend to be "opened" in a much different location (e.g. /etc) than everything else. Unless you were planning on opening your "everything" repo at the file system root, you're going to need multiple repos anyway.

I wouldn't use the ticket system to decide when to use separate repos and when to combine resources. I'd let the project tell me instead.

If I have a main project and I extract one library from within that project in a way that allows that library to be reused by other projects, then it should be a separate repo, with its own ticket system. The reusable library has a defined and hopefully stable API now, because it now needs to be usable by multiple projects. That project repo gets its own tickets, naturally, but it's being pushed by the project's needs, not by counting ticket trackers. You have a defined set of outside users, so of course they're going to need a ticket tracker for reporting problems with your new reusable code base and for requesting features for it.

For an opposite case, consider the use of in-tree dependencies. If I copy someone else's code into my working tree simply to keep direct control over that dependency — as opposed to depending on an external component with an independent lifetime — then I'm going to check it directly into its dependent's repo rather than create a sub-repo for it and open it --nested under the dependent's check-out directory.

Another heuristic you can use is, "Who needs this bit of code?" If you can identify some subset of the code that needs its own set of user permissions, that tells you where to draw a repo boundary line.

If the thing bothering you about having multiple repos is the need to keep up with multiple timelines, you can recombine them by either using the /timeline.rss feature in your favorite RSS feed reader or by setting up email alerts.

(3.1) By Steve Schow (Dewdman42) on 2019-08-12 19:58:46 edited from 3.0 in reply to 2

All great points.

As of now I'm the only one accessing the repo and it's here at home, so I'm not too concerned about security in this case. Yet.

I actually even just have one monolithic repo for all config files on about 5 machines in the LAN. I am NOT editing those files in place where they are used. I have a working dir that mimics the dir hierarchy of the machine where each file resides and is used; I copy the changes to that mimicked location and check in there. There is also a shared path because many of the machines share the same file, for example .bashrc, etc. It's still much preferable for me to have one single system admin ticket system to look at; I absolutely do not want to be switching around between different repos to see all my admin tasks. The same goes for the many little short-lived script projects that take a few weeks to finish, with a bunch of versions, and then basically end up being updated rarely. Yet they are fairly independent.

I'm still not sure how I would best share a test bootstrapping system between different projects. In the same repo it is obviously easy; in separate repos, not as easy. Especially if I ever plan to share a particular project on GitHub or something, the test subsystem that is shared between projects would also need to be included in the GitHub export from fossil somehow. Anyway, that's kind of a separate topic...maybe...

For me the main reason to have fewer repos is that I have one main ticket report I am using every day on a vast hodgepodge of tasks. But if I have a singular project that is big enough to warrant its own dedicated ticket system and source tree, of course it makes complete sense for it to have its own repo; that's a given.

I'm having a harder time figuring out how to work with many smaller little projects, though. Many of them are barely related and have no dependencies on each other whatsoever, but they are just too small to warrant a separate repo, ESPECIALLY if I am going to have a growing list of ticket systems to wade through.

(4) By Steve Schow (Dewdman42) on 2019-08-12 19:56:10 in reply to 3.0

I need to learn more about how you are consolidating multiple repo ticket reports into one report.

When we use gitea or GitHub, etc., they basically have a separate ticket system for each repo. So in that environment it is exactly what you're saying: separate ticket systems, and we basically have to try to think of ways to group things into repos that make sense. If you have a giant monolithic repo with lots of barely related or unrelated files, then it is what it is: they are all in there, and the checkins are in whatever order they happen, even if they are unrelated to each other. Same problem I'm having in fossil.

If you start creating a lot of isolated repos, many of them might have only one script or something. Of course there are literally thousands of repos on GitHub with just a couple of files. And usually that makes sense for sharing code of course and having an isolated issue tracker for that shared thing.

It's really hard for me, working at home on a large collection of unrelated or barely related things, to keep track of all my tasks that way if I separate them all into separate repos. Not to mention admining so many separate fossil repos...

Since I'm arguing with myself here... There is also something very simple and clean about having a repo be related to just one little project, even if it's literally just one JavaScript file. It's easy to clone anywhere and work on it. But having a hundred fossil repos (each with its own wiki and tickets) is the thing I'm trying to avoid.

(5) By Warren Young (wyetr) on 2019-08-12 21:46:44 in reply to 4

> I need to learn more about how you are consolidating multiple repo ticket reports into one report.

I'm not talking about "one report," I'm talking about "one stream." By using either an RSS feed reader or an email client, you can consume update messages from multiple repos as a single stream of updates.

For the RSS option, start here:

    $ fossil help /timeline.rss

I use Feedly to consume the RSS feeds of public Fossil repos. There are lots of alternative RSS feed readers, if you don't like Feedly for some reason.

I don't think you can use RSS readers like Feedly with private RSS streams, but with private repos, they're more directly under my control anyway, so I don't really miss their activity reports in my RSS reader. It's my public repos — which can have updates from other people on them — which require monitoring.
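For repos where a hosted reader isn't an option, the "single stream" idea can be approximated locally. This is only a minimal sketch, not something Fossil ships: it assumes each feed is ordinary RSS 2.0 with `title`, `link` and `pubDate` elements per `item`, and `merge_feeds` is a made-up helper name. Fetching the XML (for example from each repo's /timeline.rss) is left out.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def merge_feeds(feeds):
    """Merge several RSS 2.0 documents into one list, newest first.

    `feeds` maps a label (e.g. a repo name) to that feed's XML text.
    Returns (timestamp, label, title, link) tuples.
    """
    entries = []
    for label, xml_text in feeds.items():
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            pub = item.findtext("pubDate")
            if pub is None:
                continue  # items without a date cannot be ordered
            entries.append((
                parsedate_to_datetime(pub),
                label,
                item.findtext("title", default=""),
                item.findtext("link", default=""),
            ))
    # Sort on the timestamp only, newest entry first.
    entries.sort(key=lambda e: e[0], reverse=True)
    return entries
```

Printing the merged list from a cron job, or feeding it into whatever notification tool you already use, would give one stream across all repos.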

Alternately, set up email alerts. For monitoring the timeline, you probably want to subscribe to everything. That option will work with both public and private repos.

> Not to mention admining so many separate fossil repos...

Go have yourself a double scoop of fossil conf export/import with sprinkles.

(6) By Warren Young (wyetr) on 2019-08-12 21:55:07 in reply to 3.1

> then I copy the changes to that mimic'd location and checkin there.

It's just as well. System configuration files tend to require particular file and directory permissions to enforce your security requirements, so since Fossil doesn't manage those perms, you're better off keeping those permissions separate. Just make sure that your "installation" step doesn't revert the existing perms.
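One way to keep an "installation" step from reverting permissions is to copy only the file contents. A minimal sketch, assuming a POSIX system and that the destination file already exists with the permissions you want; `install_preserving_perms` is a made-up name for illustration:

```python
import os
import shutil
import stat


def install_preserving_perms(src, dst):
    """Copy the contents of src over dst without touching dst's mode bits.

    shutil.copyfile() copies data only. When dst already exists it is
    overwritten in place, so its existing permissions are kept.
    Returns dst's permission bits after the copy.
    """
    shutil.copyfile(src, dst)
    return stat.S_IMODE(os.stat(dst).st_mode)
```

By contrast, `shutil.copy()` and `shutil.copy2()` also copy the source's permission bits, which is exactly the reverting to avoid here. If the destination does not exist yet, `copyfile()` creates it with default permissions, so a first install still needs an explicit `os.chmod()`.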

> I'm still not sure how I would best share a test bootstrapping system between different projects,

    $ fossil open --nested

> a growing list of ticket systems to wade through.

That's not so much of a problem once you have a single stream of updates.

Besides, who wants tickets from half a dozen unrelated projects all cluttering a single repo's open ticket list? When I sit down to think about the bug reports and feature requests for a given piece of software, I want only that project's tickets in front of me.

(7) By Warren Young (wyetr) on 2019-08-12 21:57:28 in reply to 5

> Go have yourself a double scoop of fossil conf export/import with sprinkles.

And add a cup of Admin → Login-Group on the side.

(8) By Steve Schow (Dewdman42) on 2019-08-12 22:22:44 in reply to 6

I do. I do not want 100+ repos. That's my question, how to avoid doing that.

As I said, for larger projects it makes complete sense to have separate repos; it does not make sense at all for me with these little tiny one-file projects. It just adds a lot of repo admin overhead and leaves me having to hunt all over the place to find tickets.

(9.1) By Steve Schow (Dewdman42) on 2019-08-12 22:29:27 edited from 9.0 in reply to 6

I'll look into the --nested option you mentioned for how to share my shared testing scripts and other shared tools.

(10) By anonymous on 2019-08-12 22:57:24 in reply to 1

For single file projects, you can get a timeline of just that file.

For multi-file, small projects, the best I can think of is that the timeline can filter on tags. If each commit for a given small project uses a tag related to that project, you can filter the timeline for that tag.

This is easiest if each check in is just files for one project, though you could include multiple tags, one for each project included in the check in.
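Both kinds of filtering can be reached directly by URL. A minimal sketch, assuming the web UI's /timeline page accepts a `t=TAG` parameter and /finfo accepts `name=FILE` (check `fossil help /timeline` and `fossil help /finfo` on your version); the helper names are made up for illustration:

```python
from urllib.parse import urlencode


def tag_timeline_url(base_url, tag):
    """URL of the timeline restricted to check-ins carrying `tag`."""
    return f"{base_url.rstrip('/')}/timeline?{urlencode({'t': tag})}"


def file_history_url(base_url, path):
    """URL of the change history of a single file in the repo."""
    query = urlencode({"name": path}, safe="/")  # keep path separators readable
    return f"{base_url.rstrip('/')}/finfo?{query}"
```

Bookmarking one such URL per small project would give each its own view of an otherwise shared timeline.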

Of course, remembering to apply the correct tag(s) is a problem. Maybe a pre-commit hook could derive the tag(s) from the paths of the files being checked in. I'm not sure what all a pre-commit hook can do.

Alternately, a post-commit hook could add the tag(s) to the commit.

Another option would be to use a wrapper script around the fossil ci command to supply the tag(s).

Maybe something like:

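    #!/usr/bin/perl
    # Wrapper around "fossil ci" that tags the check-in with the name of
    # each top-level directory containing a changed file.
    # This assumes each project is in its own directory.
    use strict;
    use warnings;

    my @changes = split /[\n\r]+/, `fossil changes`;
    die "No changes to check in\n" unless @changes;

    my %tags;
    for my $line (@changes) {
        # Lines look like "EDITED     dir/file": skip the status word,
        # then capture the directory name; files in the root get no tag.
        next unless $line =~ m|^\S+\s+([^/]+)/|;
        (my $tag = $1) =~ s/\s+/_/g;    # replace each run of spaces with a _
        $tags{$tag} = 1;
    }

    # Use the list form of system() so the shell never re-parses arguments.
    my @cmd = ('fossil', 'ci', (map { ('--tag', $_) } sort keys %tags), @ARGV);
    system(@cmd) == 0 or die "fossil ci failed\n";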

(11) By Steve Schow (Dewdman42) on 2019-08-13 00:23:13 in reply to 10

Tags could work. Or branches. I think without question there are some cases where I could regroup some things into smaller repos; for example, I could create a repo that is a "collection" of semi-related scripts, and then it would not be quite so monolithic, but there are pros and cons with that. Probably I should at least be using branches to work on sub-projects of a repo as much as I can, but I'm sure that will lead to a lot of merging. But eh...that's what SCM is for.

I think what is concerning me is that my monolithic repo is acting as a kind of todo list. It's one place I can go into and change priorities, moving things up and down in one giant list, even if many unrelated or semi-related things are in that same repo.

I wish Fossil had a bit of a higher level project management system with issue tracking and wiki and everything that sits one layer higher than per repo. Then I could organize my task priorities and not lose sight of anything, even if the underlying SCM is spread out over multiple repos in ways that make sense and are basically cleaner to work with.

Gitea kinda gets there, but fossil has so many advantages over gitea or git...I don't wanna go there. Redmine also is quite nice, but same issue, I want to use fossil.

(12) By Warren Young (wyoung) on 2019-08-13 05:10:45 in reply to 11

It sounds like you want the oft-requested submodules feature.

(13) By anonymous on 2019-08-13 17:29:09 in reply to 11

> I wish Fossil had a bit of a higher level project management system with issue tracking and wiki and everything that sits one layer higher than per repo.

Actually, this is at least partially possible.

Fossil's ticket system is very configurable using TH1, TCL and HTML. If you're willing and able to use JavaScript, you could do even more.

The setup would basically be an "admin repo" to manage tickets for all projects and all the various project repos.

Idea 1: In the ticket configuration, add 2 lists: A list of project names and a list of project base URLs. (If TCL has "dictionaries", "hash tables" or "associative arrays", then instead use a table with project names as keys and base URLs as values.) In the ticket schema, add a new field for Project. In the new ticket setup, add a drop-down list for the Projects to set the new Project field. In view ticket, you will need code to look up the base URL for the Project and code to format "project artifact references" to HTML links to that project's repo. Maybe use "[[uuid]]" to denote project references. With JavaScript, this might be simpler and you can make this work for wiki pages, too.

To link to tickets from the commit comments, JavaScript will be needed.
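In Fossil itself this logic would live in TH1 or JavaScript, but the lookup-and-link step of Idea 1 can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the project names and URLs are placeholders, `linkify` is a made-up helper, and the links assume each repo serves artifacts under /info/HASH.

```python
import html
import re

# Hypothetical project table: names and base URLs are placeholders.
PROJECT_URLS = {
    "backup-scripts": "https://example.org/backup-scripts",
    "dotfiles": "https://example.org/dotfiles",
}

# A "project artifact reference" is a hex hash wrapped in double brackets.
REF = re.compile(r"\[\[([0-9a-f]{4,64})\]\]")


def linkify(text, project, urls=PROJECT_URLS):
    """Turn each [[hash]] in `text` into an HTML link into `project`'s repo."""
    base = html.escape(urls[project].rstrip("/"))

    def repl(match):
        h = match.group(1)
        return f'<a href="{base}/info/{h}">[{h[:10]}]</a>'

    # Escape first so ticket text cannot inject markup; the brackets survive.
    return REF.sub(repl, html.escape(text))
```

The same table could drive the Project drop-down, so that the list of names and the link targets never drift apart.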

Idea 2: Create specially named wiki pages for each project. I suggest "Project/name". The body of the page will be the project's base URL. This will probably require using JavaScript even for just ticket references.

With the right JavaScript, the project base URL in the admin repo's project wiki page can also serve as a link to the project's main wiki page.

Idea 3: Use scripts around the Fossil command line interface to synchronize tickets between the admin repo and the project repos.

Use "fossil rss -y t -n 20" to get a list of the 20 most recent tickets, then use "fossil ticket show" to extract the ticket details. Unfortunately, "fossil rss" doesn't have a "--since" option to report only tickets since a specified time and date, so your script will have to save the uuid of the last ticket extracted and only extract tickets more recent than that from the list produced by "fossil rss".

At the other repo, use "fossil ticket set" to update tickets.
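The bookkeeping part of Idea 3 might look like the sketch below. It assumes the ticket IDs have already been parsed out of the "fossil rss" output into a newest-first list (that parsing is not shown), and `tickets_since` is a made-up helper name.

```python
def tickets_since(ids_newest_first, last_seen):
    """Return the IDs that are newer than `last_seen`, oldest first.

    If `last_seen` is None, or has already scrolled out of the list,
    every listed ID is treated as new.
    """
    new = []
    for ticket_id in ids_newest_first:
        if ticket_id == last_seen:
            break
        new.append(ticket_id)
    new.reverse()  # process in chronological order
    return new
```

After a successful sync the script would store the last element of the returned list as the new `last_seen`. If more tickets arrive between runs than the feed lists, the older ones are silently missed, so the limit passed to "fossil rss" needs some headroom.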

(14.1) By Steve Schow (Dewdman42) on 2019-08-13 19:01:59 edited from 14.0 in reply to 13

Thanks for this post.

I was thinking about this last night also. Modifying the web interface in fossil has so far thrown me off course; I haven't been able to wrap my head around TH1, TCL and HTML. I've never really done any web programming before, and I know it's even a bit different here in fossil due to the TH1 stuff, which I still don't totally get, but I haven't tried that hard either. But yes: something in the web UI that basically gives us the bigger picture across multiple repos, including consolidating tickets and wikis in some way, and then lets us drill into each repo for its own source as needed.

I am perfectly ok with not having multi-repo tickets, and I'm mostly ok with all repos being fully independent of each other in terms of the source.

I just want a way to have one large consolidated portal that I can go to, see all my tickets across all repos in some way that makes sense, and get to all the wikis, etc., quickly and easily in a friendly way.

I'd love to be able to have a master ticket list that basically shows me all or some of my repo's tickets..and then I can choose to filter that list per repo, or group by repo, or sort by repo, or sort rather by priority across all repos, etc. Creating the notion of projects and sub projects would then let me produce better ticket lists and ways to quickly reprioritize and/or enter new tickets and have the system make sure to add it to the correct underlying repo without having to first open that repo's separate ticket system and enter a new ticket there. I want to be able to work from one project management list and then use filters to find what I want to find and edit what I want to edit in tickets across all repos.

At that point I'd have no problem whatsoever creating many tiny little fossil repos, no matter how small the project. If it's fully independent, then fine: whenever I feel it can exist isolated in its own source tree, I'd use a separate repo.

I don't know if I would need to fork the fossil source code and extend these things in source, or whether it could be done purely with the existing TH1/HTML/JavaScript mechanisms so that I could basically use the factory fossil binary and then clone my repos around with those capabilities. I have a lot to learn in order to see what is possible that way, I guess. I haven't spent that much time trying to figure it out, but every time I start seeing the TH1 stuff, it confuses me and throws me off. But I've never done any web programming of any kind, so it's all kind of mysterious to me right now. I know JavaScript quite well, but not for web programming; I know absolutely nothing about that.

(15.1) By Warren Young (wyoung) on 2019-08-13 19:25:44 edited from 15.0 in reply to 14.0

I think all of the major pieces to do what you want are already in Fossil:

1. The Fossil ticket list pages are just a simple SELECT query wrapped in an HTML generator that links one SQL row in one HTML <table> row to the individual ticket rendering page.
2. SQLite has the ability to attach multiple database files to a single connection, allowing queries over multiple DBs. So, you'd be changing the simple single-table SELECT in #1 to a query over the same table in all Fossil DBs you want to query.
3. fossil server --repolist already knows how to collect a set of Fossil DB files and present them as one.

All you need, then, is a bit of glue.

What I'd do, if I wanted this, is to start at both ends and work toward the middle.

First, hack up a prototype starting from #1 using hand-written ATTACH DATABASE calls before the SELECT to get a list of tickets across multiple repos. That'll give you an HTML table showing all tickets across all repos, but the links won't work because the current HTML rendering code assumes they're all pointing into the current repo.
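The ATTACH DATABASE step can be tried outside Fossil first. The sketch below is only an illustration under stated assumptions: it builds tiny stand-in database files with a simplified `ticket` table (just `tkt_uuid`, `title` and `status`) rather than opening real repos, and the function names are made up.

```python
import sqlite3


def make_stand_in_repo(path, tickets):
    """Create a stand-in for a repo DB holding a simplified ticket table."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE ticket(tkt_uuid TEXT, title TEXT, status TEXT)")
    db.executemany("INSERT INTO ticket VALUES (?, ?, ?)", tickets)
    db.commit()
    db.close()


def all_open_tickets(repo_paths):
    """Return (repo_path, tkt_uuid, title) for open tickets in every repo."""
    conn = sqlite3.connect(":memory:")
    selects, params = [], []
    for i, path in enumerate(repo_paths):
        alias = f"repo{i}"  # schema names cannot be bound as parameters
        conn.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
        selects.append(
            f"SELECT ? AS repo, tkt_uuid, title "
            f"FROM {alias}.ticket WHERE status = 'Open'"
        )
        params.append(path)
    # One SELECT per attached DB, glued together into a single result set.
    sql = " UNION ALL ".join(selects) + " ORDER BY repo, title"
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

Carrying the repo path along as the first column is what would let a report page build links that point into the right repo. Note that SQLite limits the number of attached databases per connection (10 by default), so a large collection of repos would need batching or a rebuilt SQLite.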

Next, start from the other end by extending the fossil server --repolist functionality to expose a new TH1 extended command to report the list of known repos. (It could be called repoList, for example.)
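The real command would be written in C inside Fossil, but what it would report can be illustrated in Python. This sketch assumes the repos are files ending in .fossil sitting directly in one directory; `list_repos` is a made-up name and is not how Fossil implements repolist mode.

```python
from pathlib import Path


def list_repos(directory):
    """Return the sorted paths of the *.fossil files directly in `directory`."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".fossil"
    )
```

A list like this is also the natural input for a query that attaches each repo database in turn.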

Finally, extend the existing ticket pages (/reportlist, /rptnew, etc.) to make them aware of when they're running in repolist mode so they store and retrieve their SQL queries somewhere other than the repo DB file. (Perhaps ~/.fossil as global configuration settings?)

Having done all of this, you can then visit /rptnew via your fossil server --repolist instance to create a new ticket report using the repoList() TH1 proc to give a result iterating over all tables, which in turn generates a /rptview that links into the individual repos.

It probably sounds more difficult than it really is. What it is, is tedious glue code, not especially tricky code. And that means all it wants is someone to slog through it, which makes it a task for someone that wants it to exist. i.e. You. :)

(16) By Steve Schow (Dewdman42) on 2019-08-13 20:49:12 in reply to 15.1

Slogging through some tedious glue coding is no problem, but most of what you just suggested above is totally above my pay grade of understanding, so it's probably not going to happen. I'm just too far away in my understanding of fossil internals to even contemplate it at the moment. Everything you just suggested probably sounds easy and doable because you already understand the internals of fossil. For me, that was all completely outside my sphere of understanding; I don't know how fossil works well enough to roll something like that out without months of learning fossil internals first.

It definitely sounds doable according to your description...but probably won't happen by me.

(17) By anonymous on 2019-08-13 21:14:49 in reply to 14.1

Certainly wyoung's "repolist mode" would be very nice, but what I described (ideas 1 and 2) can be done with just TH1, TCL, HTML and JavaScript using the current, "stock" Fossil executable.

Idea 3 would be implemented in whatever language you choose that can spawn a process to run Fossil commands.

One caveat about idea 3: the ticket IDs in the admin repo would not be the same as the ticket IDs in the project repos. The ticket schema in the admin repo would need to have a "Project Ticket" field (as well as a "Project" field) and the ticket schemas in the projects would need an "Admin Ticket" field.

Idea 3 is something I know I can do, but I don't have a need.

Ideas 1 and 2, I might be able to do, but I would have to learn more TH1, TCL and JavaScript. And, I don't have a need.

Note: In theory, the admin repo could directly pull tickets from the project repos, but Fossil currently doesn't support pulling just tickets. And even if it did, we would still want a way to identify which project repo each ticket came from so that artifact references in each ticket could be properly resolved. Also, we want a way to push ticket updates back to their associated project repos without pushing all tickets to all repos. Even better if we could create tickets on the admin repo and select the intended project so we could push new tickets to their respective project repos.

(18) By Warren Young (wyoung) on 2019-08-13 21:15:27 in reply to 16

How do you eat an elephant? One bite at a time.

I'm not sure what you're finding difficult about my proposal, so I'll construct a Q&A for you based on my guesses at your questions:

1. What is /reportlist? It is the page in Fossil UI that lists the existing ticket reports. It's where you get to when you visit Hamburger → Tickets in the stock UI, but you could just append that to your repo's base URL instead. Say fossil help /reportlist to get docs on that page.
2. What is /rptview? It is the page showing the results of one of these queries. Beyond that, see #1.
3. Where is this implemented in the Fossil source? Do a recursive grep of src/ under the Fossil source tree for "WEBPAGE: reportlist" or similar to answer that. For both of the above, it's src/report.c.
4. What is TH1? It's answered in the Fossil docs.
5. What is an extended command? TH1, being a simplified dialect of Tcl (the Tool Command Language), is based on "commands", which are the same as what other languages would call functions or procedures. (But only if you squint.) Tcl and TH1 make it really easy to define custom commands in C code. It is more or less what these languages were designed to do: allow creation of new languages with custom commands. In a sense, Fossil doesn't implement "TH1", it implements "Fossil TH1", being basic TH1 plus all of the built-in extended commands documented on the page linked in #3.
6. How do I define an extended TH1 command in Fossil? Add it to src/th_main.c. Search that file for functions commented "TH1 command: foo" for examples of how existing commands are implemented. Most are short, a dozen or two lines. This should give you some confidence.
7. HTML? SQL? Sorry, can't explain those languages here. But for both, what we're talking about is pidgin-level code. You don't have to read a whole book on either to implement this idea.
8. What is repolist? It is a mode that fossil server can run in, in which it gathers a list of file names from a given directory and presents that list of repos for a user to pick from. With the new repolist UI skinning feature, this idea of mine simply lets you extend the ticket functionality into repolist mode.
9. What is ~/.fossil? It is where Fossil puts global settings, among other things.

If you have more questions, just ask. One bite at a time.

(19) By Warren Young (wyoung) on 2019-08-13 21:32:17 in reply to 17

> (ideas 1 and 2) can be done with just TH1, TCL, HTML and JavaScript using the current, "stock" Fossil executable.

The main problem with that is that it's a local lash-up, with everyone doing it differently, no one getting the benefit from anyone else's work, unless someone publishes an article showing how they've done it, which others actually make use of. Basically, it's the Lisp curse all over again.

My idea would benefit everyone who uses fossil server --repolist.

I do like your idea of a "Project" drop-down for new tickets, though. In my proposal, I'd implicitly assumed that you'd drill down to the sub-project level to file new tickets. Being able to do it from the top level — repolist mode, in my proposal — with the addition of a single drop-down is a nice addition.

I think your Idea 1 might be roughly on par with my whole proposal, in terms of difficulty and lines of code required. I suppose the choice depends on the mix of languages you know now, plus those you're required to learn. You can do your ideas without any C, whereas you can do mine without any JavaScript.

Tradeoffs, tradeoffs.

(20) By anonymous on 2019-08-14 09:23:15 in reply to 19

I agree.

If I had a need for this set of features, I would make the needed enhancements to Fossil itself. I particularly like the idea of being able to selectively sync tickets between an admin repo and multiple project repos.

So, instead, I'm sharing some ideas for alternate ways to approximate those features.