Fossil Forum

Remove deadname from repository

(1) By anonymous on 2023-10-06 13:22:03 [link] [source]

Hi everyone :)

I'm new to Fossil and considering how it fits into my work, and I have a question that is fairly important to me.

Suppose a person changes their name and, for reasons of either trauma or personal safety, does not want their old name visible in a repository. How would Fossil handle this? I know there is the amend command, but this only adds a correction, which means the old name is still locatable and, worse, is now connected to the new name, making datamining easy.

This sort of need has appeared in my vicinity in the past at workplaces using Git, and while it was not entirely painless, we could "just" rewrite all commits in the repo.

(2) By Warren Young (wyoung) on 2023-10-06 13:25:09 in reply to 1 [link] [source]

That's what would have to happen in Fossil, too: "just" rewrite all the commits, invalidating all other clones' ability to sync to the parent until re-cloned.

The desire to change past facts to match current desires is anti-Fossil. I doubt you're going to see such a whole-repo rewriting feature appear in Fossil, ever.

Nothing stops you from round-tripping through Git to accomplish this, though.

(3) By anonymous on 2023-10-06 14:46:45 in reply to 2 [link] [source]

Thanks for your answer, I assumed something like this would be possible, if ugly. I assume one could write their own rewriter by changing the database directly?

I do wonder, though.

Wouldn't the design be almost exactly the same, but allow for this change, if author names were just not included in the commit hash? Having them only as tags (which I realise are also immutable, but can probably be purged in the database much more easily?) would not change much about how the repo is used or how secure the content is, but would allow the very real usecase of people changing names. Where proof of provenance is required, signing the commits would be a much better mechanism anyway.

This is not actually something Git can do, so I'm not comparing the two here, but it seems like something Fossil, in contrast to Git, could fairly easily accomplish, since it already has its artifact and tag system. This would make renames a relative non-issue.

I'm focusing on this specific usecase because in most systems designed for historical correctness, the harm this might create is overlooked. Most of the time this is a trade-off, and mutability sometimes has to be sacrificed (someone changing their legal name will not, for example, change old signatures etc.), but this seems like a case where this is not actually a trade-off. Of course, changing something like this after the fact would be a whole different matter I assume, but I'm interested in what you think of the concept.

(4.2) By Stephan Beal (stephan) on 2023-10-06 15:00:04 edited from 4.1 in reply to 3 [link] [source]

Wouldn't the design be almost exactly the same, but allow for this change, if author names were just not included in the commit hash?

It's too late for that. That was an architectural decision made way back in 2006 or 2007 and can't change now without invalidating every single commit made to a fossil repository since then.

Perhaps (but only perhaps) if fossil were to be rebuilt from scratch in 2023, that would be a viable thing to do, but it's not now.

Fossil's design is incompatible with retroactive changes. Just like the clay tablets of old, authors' names, as recorded by fossil, are the names they had at the time of the creation of the artifact.
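To make the point above concrete, here is a minimal sketch of why a name that is part of the hashed text cannot be changed without changing every descendant's ID. It is not Fossil's actual code: the manifest layout is heavily simplified (real manifests carry more cards), the helper names `make_manifest`, `artifact_id` and `two_commit_history` are made up for illustration, and SHA3-256 is used only as a stand-in hash.

```python
import hashlib


def make_manifest(comment, user, parent=None):
    """Build a heavily simplified, manifest-like text for one check-in."""
    cards = [f"C {comment}", "D 2023-10-06T13:22:03.000"]
    if parent is not None:
        cards.append(f"P {parent}")  # a child names its parent by hash
    cards.append(f"U {user}")        # the user name is part of the hashed text
    return "\n".join(cards) + "\n"


def artifact_id(text):
    """Hash the full text; SHA3-256 is only a stand-in here."""
    return hashlib.sha3_256(text.encode("utf-8")).hexdigest()


def two_commit_history(user):
    """Return the IDs of a root check-in and its child, both by `user`."""
    root = artifact_id(make_manifest("initial", user))
    child = artifact_id(make_manifest("followup", user, parent=root))
    return root, child


old_root, old_child = two_commit_history("oldname")
new_root, new_child = two_commit_history("newname")

print(old_root != new_root)    # True: the root's ID depends on the name
print(old_child != new_child)  # True: the child changes because its parent did
```

The child's comment and date are identical in both histories; its ID still changes because it refers to its parent by hash, which is why a rename would ripple through the whole history.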

Just like the Wachowski Brothers Sisters - most of us will remember the first three movies of The Matrix franchise being created by brothers, rather than sisters, and no amount of editing will undo that in our memories.

(5) By Warren Young (wyoung) on 2023-10-06 15:41:12 in reply to 3 [link] [source]

I assume one could write their own rewriter by changing the database directly?

The relational lookup tables you refer to contain data parsed from the commits in the blob table. You can write all the SQL you like to change things to suit your preferences, but it'll all go away on the next fossil rebuild.
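A toy illustration of that behaviour, assuming nothing about Fossil's real schema: the tables `toy_blob` and `toy_event` and the `rebuild` function below are invented for this sketch. The only point is that a table derived from stored artifact text is regenerated from that text, so a direct edit to it does not survive.

```python
import sqlite3


def rebuild(conn):
    """Regenerate the derived table purely from the stored artifact text."""
    conn.execute("DELETE FROM toy_event")
    rows = conn.execute("SELECT rid, content FROM toy_blob").fetchall()
    for rid, content in rows:
        for line in content.splitlines():
            if line.startswith("U "):  # user card in the toy format
                conn.execute(
                    "INSERT INTO toy_event(rid, user) VALUES (?, ?)",
                    (rid, line[2:]),
                )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_blob(rid INTEGER PRIMARY KEY, content TEXT)")
conn.execute("CREATE TABLE toy_event(rid INTEGER, user TEXT)")
conn.execute("INSERT INTO toy_blob(content) VALUES (?)", ("C initial\nU oldname\n",))
rebuild(conn)

# Edit only the derived table, leaving the stored artifact alone.
conn.execute("UPDATE toy_event SET user = 'newname'")
after_edit = conn.execute("SELECT user FROM toy_event").fetchone()[0]

# A rebuild re-parses the artifact and the edit is gone.
rebuild(conn)
after_rebuild = conn.execute("SELECT user FROM toy_event").fetchone()[0]

print(after_edit, after_rebuild)  # newname oldname
```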

Having them only as tags…

…creates a link between this unique identifier and their current name, "making datamining easy," as you say.

There is no such thing as an opaque yet globally unique identifier that doesn't allow tracking. As soon as the first connection is made, the whole sweater unravels.

The closest thing we have to what you want is fossil amend --user-override which doesn't hide any true facts, merely pastes a fresh new label over them in the UI.

I'm interested in what you think of the concept.

I despise it for the same reason I despised the Soviet apparatchiks for airbrushing people out of photographs when they said something that went against the party line.

If you don't want your identity tied to things you say, say them anonymously.

And if this then leads into the discussion about making Fossil fully anonymous, beware that we've had this argument before, and it would be kind of you to catch up with it in the archives before restarting the argument afresh.

(6.3) By Warren Young (wyoung) on 2023-10-08 20:23:47 edited from 6.2 in reply to 1 [link] [source]

I happened to revisit this topic due to it coming up in a Hacker News posting responding to an article titled "Things I just don't like about Git," and I realized that this entire thread here might have gone off the rails from the start.

If I understand the HN comments, Git stores the user's name and email in each commit, with the result that several common changes will break the link between commits and the committer's identity:

  • ISP change: Who here still has the same primary email address since they first got on the Internet?
  • surname change: Even the most reactionary must agree that in Western cultures, women will tend to change their surnames at least once in their lifetime.
  • affiliation change: I for one have committed to projects from multiple affiliations over the years; it's still "me" underneath, but if commits are tied to emails, they will appear to come from different people. All those job-hoppers out in Silly Valley must be rather annoyed to have their GH reputation smeared across a dozen different corporate email TLDs.

The thing is, Fossil doesn't do this. What it stores in the manifest's U card is the user name. That's it. No given name, no surname, no email.

This means Fossil does (today!) allow you to commit as a user named after a blind hash[1] and to rebind that to as many different names and email addresses over time as you like. As a rule, Fossil doesn't even show these in fossil info reports on PII grounds.[2]

This leads me to my question for the anonymous OP: Does this way of looking at things help you? An unchanging user login name across time does nothing for your worry about tracking a person through identity changes, but it does at least mean you don't have to rewrite the entire repo history merely to change your contact info on your past commits, as Git apparently requires.


  1. ^ e.g. fossil sync https://faffb80a4c@myhost/repo
  2. ^ This very forum didn't even show the human name at the start. It took multiple complaints about opaque user names before that changed, but it still hides the email address to all but repo admins.
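A rough sketch of the opaque-login idea described above, under stated assumptions: the `contact_info` dict is a hypothetical stand-in for wherever a repository keeps contact details, and `new_opaque_login` and `display_name` are names invented for this example, not Fossil functions.

```python
import secrets


def new_opaque_login(nbytes=5):
    """Return a random hex string to use as a login (10 chars by default)."""
    return secrets.token_hex(nbytes)


login = new_opaque_login()

# Check-ins record only the login, never the human-readable name.
checkins = [
    {"comment": "initial", "user": login},
    {"comment": "fix typo", "user": login},
]

# Hypothetical side table: login -> current contact info, editable at will.
contact_info = {login: "Old Name <old@example.com>"}


def display_name(user):
    """Look up the current label for a login, falling back to the login."""
    return contact_info.get(user, user)


before = [display_name(c["user"]) for c in checkins]
contact_info[login] = "New Name <new@example.org>"  # rebind; check-ins untouched
after = [display_name(c["user"]) for c in checkins]

print(before[0])
print(after[0])
```

As the post says, the login itself stays constant, so this keeps renames cheap but does not hide that the same account made both the old and the new check-ins.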

(7) By Alan Bram (flyboy) on 2023-10-08 21:53:40 in reply to 6.3 [link] [source]

you don't have to rewrite the entire repo history merely to change your contact info on your past commits, as Git apparently requires.

Actually Git does have a roughly equivalent feature, IIUC. See the gitmailmap feature.

Yes, Git stores both a name and an email address immutably with each commit. But they can both be something like your "blind hash." Github even makes it very easy to set this up.

(8) By Konstantin Khomutov (kostix) on 2023-10-09 07:17:24 in reply to 6.3 [link] [source]

Interestingly, a similar feature (again) has been recently discussed on the Git list. The Git maintainer suggested storing hashes of the author name and author e-mail in commit objects as a way to be anonymous but still allow verification of authorship, when needed.
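The general idea of storing a hash and verifying a claim against it later can be sketched as follows. This is only an illustration of the concept as summarized above, not anything Git implements; `identity_digest`, `verify_authorship` and the record layout are hypothetical.

```python
import hashlib
import hmac


def identity_digest(name, email):
    """Hash a name/email pair into an opaque hex digest."""
    material = f"{name}\n{email}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def verify_authorship(stored_digest, claimed_name, claimed_email):
    """Check a claimed identity against the digest stored with a commit."""
    candidate = identity_digest(claimed_name, claimed_email)
    return hmac.compare_digest(stored_digest, candidate)


# The commit record holds only the digest, not the name or e-mail.
commit = {
    "message": "fix typo",
    "author_digest": identity_digest("A. Author", "author@example.com"),
}

print(verify_authorship(commit["author_digest"], "A. Author", "author@example.com"))  # True
print(verify_authorship(commit["author_digest"], "Someone Else", "else@example.com"))  # False
```

Note that a plain hash of a guessable name can be confirmed by anyone who guesses it, so a real design would need more than this sketch shows.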

(9) By Peacememories (peacememories) on 2023-10-10 13:28:11 in reply to 6.3 [link] [source]

Hi again, this time in a less anonymous manner.

Thank you for revisiting this. I, too, had a feeling this was going off the rails a bit which, to be honest, kind of turned me off the project until I saw your follow-up post. I do understand the reactions though; I think I unwittingly hit a nerve with something that was already extensively discussed in this project's history. Sorry about that.

Also, this post feels kind of rambling, since I am trying to fit a lot of context and personal experience in here, so to answer your question directly:

Yes, your explanation helped clarify a lot of things, and I am of the opinion that Fossil does a fine job of allowing people to update their identity as it changes - at least better than what Git does.


Before I go on, I want to re-emphasize that the motivation behind my comment was not to rewrite or erase history, or erase someone's hard work on a project. I was specifically talking about situations in which people are either personally hurt or endangered through affiliation with their own name.

Since it came up: I also don't necessarily advocate for anonymity, I think it's fine to have a history of who did what, I just feel it is sometimes necessary to hide what they were called at the time, to mitigate very specific kinds of harassment.


For the first issue here Fossil already has an easy fix. As far as I can tell (remember, I'm new to the system), I can just fossil amend a commit, and while the information about the original author's name is not gone, the user would not have to look at it anymore. I also like that Fossil does not require one to provide an email for a commit. This has always struck me as a weird artifact of Git - though I guess if your main collaboration protocol is email, this makes sense.

For the second issue, I think no SCM handles that well, apart from maybe Pijul, though I'm not yet sure what I think about its other properties, so that might be a moot point.

The solution you mentioned - using hashes as user names - should work for keeping name changes simple and without direct traceability (remember, I do not care about hiding who did what, just what they were called at the time). Of course this would require everyone to use this scheme from the start - most of the situations I know were with people who were happily using their username (which was sometimes even their full name) until something happened which required them to distance themselves from that identity. Obviously, with a public repository even rewriting the history there wouldn't help much, since it's bound to be downloaded somewhere a malicious actor could start digging.

For actions like these I am mostly concerned with private repositories with few contributors where a coordinated rewrite would actually work, but there the danger of abuse is also much lower, so I'm not really sure what a good overall solution would look like there.

So once again, thanks for following up and I hope this topic didn't leave too bad a taste in your mouth. I'm just trying to get to know the technology I find, and make sure that if I use it I don't accidentally hurt my friends and colleagues :)

(10.1) By Warren Young (wyoung) on 2023-10-10 16:31:45 edited from 10.0 in reply to 9 [source]

I think I unwittingly hit a nerve with something that was already extensively discussed in this project's history.

There are several separate aspects here we should tease apart:

  1. Full anonymity: This is what I was referring to with my "go search the archives" comment. Multiple times, this forum has been invaded by the crypto-bros, the sort who believe we can solve every current social ill by converting all information technology into distributed anonymous irrepudiable blockchain smart-contract ledger tokens. 🙄

  2. Pseudonymity: I believe this is your most fruitful path. This category of solutions includes ideas like opaque user names, a future option to make the call to display_name_from_login() in src/forum.c conditional so that the forum admin can choose whether it suppresses display of the person's name from their supplied contact info on the forum, etc.

  3. Personal accountability: This is a key element of Fossil's original design, and it actively and purposefully fights against some of your wishes. Fossil's original designed purpose is to serve the needs of the SQLite software project, where there is a positive requirement to record who did what to the source code, when, and to do it in a durable manner. Some history editing facilities do exist, as we have discussed, but they're meant to be used sparingly and to operate without burning down the whole house merely to kill the bedbugs.

Before we go on, it's important that you know I was one of the strongest proponents of the "anonymous" login option here for the forums. I believe the option to speak without identifying yourself to be a fundamental human right. The path to tyranny includes forcing absolutely every utterance and writing to be attributable to a real person.

At the same time, I'm a moderator here, and I apply stronger criteria before approving anonymous posts. Furthermore, I tend not to grant pseudonymous users the right to post without going through moderation — WrTForum — until they've made enough posts to build a proper reputation. Even then, I've come to later regret letting that dog off the chain more than once.

Because of this, I also believe the act of signing one's public speech with a real name to be a reliable sign of its social acceptability. If you're willing to put your name on your posts — as I do — I believe you should earn your WrTForum bit earlier than otherwise. I say this not so much because it then allows accountability but because it acts as a social brake on what a reasonable person is willing to say. I believe a large part of the social ills associated with Internet communication stem from the ability to hide behind a pseudonym, on the other side of the planet, far out of range of justified social retribution.

And atop all that, I support the need to seek and get forgiveness; I abhor cancel culture and gotcha politics. Utterly destroying people for this one thing they said once upon a time is a path to unthinking dogmas, not to enlightenment. The world is stupendously complicated, and most of that is because people are complicated. Multiply by eight billion, and there you are. Dichotomous thinking simply doesn't cut it.

situations in which people are either personally hurt or endangered through affiliation with their own name.

You continue to be quite vague about these harms. It's time you started trotting out the specifics.

I don't need chapter-and-verse on real incidents, but distilling them to true yet generic stories we can learn from would be much more helpful in getting to actionable solutions.

it is sometimes necessary to hide what they were called at the time, to mitigate very specific kinds of harassment.

"Very specific" yet unstated. Hmmm. 😉

I'm willing to speculate ahead of your response that you're worried about Internet mobs of one sort or another. I'm not sure technology has much of a role other than what you currently see in the Fossil project: a reasonable effort to protect committers' PII, pulling the sharpest of the mob's teeth from "go."

Beyond that, I believe that if you want durable and workable solutions, they're better done at the culture and legal levels. Attempts to solve social problems with technology have a long history of failure.

Fossil does not require one to provide an email for a commit

While that is true, a given software project administrator may require this and more.

Take the SQLite project again. drh not only has his fellow committers' email addresses, he has signed legal contracts from them in support of his licensing requirements, driven by the needs of his paying users. I bring it up because any push you might make to drive Fossil toward a world where committers can be unaccountable isn't likely to be implemented, if only because drh doesn't want it or need it.

More than once, features have been added to Fossil that the SQLite project doesn't use, but they're generally driven by people committing code and convincing drh to merge them despite having no personal interest in them, not by people pleading loudly on the forums.

…provide an email for a commit. This has always struck me as a weird artifact of Git - though I guess if your main collaboration protocol is email, this makes sense.

There are several aspects of Git that fall out of its original intent, to provide patch postings to the LKML. I can't remember which git subcommand does this, but I do remember being struck by its output format: RFC822 mail, suitable for piping to mutt or similar.

I was surprised to learn that nearly 8% of commits to a recent Linux kernel release came from "unknown" employers, yet I doubt much of this is fully-anonymous or even pseudonymous. I expect that most of it is due to sending patches through GMail or similar, and that the contributors and their affiliations could be tracked down eventually, if one were sufficiently motivated.

using hashes as user names…would require everyone to use this scheme from the start

I'm not convinced that this is both a problem to be solved and an inevitability.

We're all still learning how to cope with a world where Joe Schmoe can say something and get the attention of a large fraction of the world's population. That simply was not possible until quite recently, on the scale of humanity's development. We are social animals, but it was rare for a single person to be able to address mere thousands of others for nearly all of human history.

We've had roughly a hundred years where extraordinarily popular and lucky people could get on a radio mic or in front of a television camera and pull this feat off, but it wasn't until the Internet's population went up the hockey stick that everyone could do this.

Human societies are only now learning how to cope with the resulting effects. There must be zillions of quiet corners on your favorite global-scale social network that feel like the privacy you get in a public booth in your local hangout spot, but it's an illusion. By default, posts are visible for all time to everyone who cares to go look.

One of the things I think is going to fall out of this is a change in behaviors, then in the laws passed by the people who developed these behaviors from childhood once they get into power. These are the societal and legal solutions I brought up above. It will take a generation or two, but once these hasty solutions are refined by enough trips through the court system, it'll start to settle on a far more stable system than you could hope to achieve with technological solutions.

a coordinated rewrite would actually work, but there the danger of abuse is also much lower

I'm speculating in advance of you putting these unstated harms out as specifics, but I expect we'll find that the key driver behind them all is global-scale volumes of data, compute power, and attention.

Small groups of humans have always managed to develop social problems, but the thing is, we've been doing it since prehistory, and out of pure necessity we developed a wide range of solutions to these problems over that span. As long as you keep the group size under the Dunbar number, technological solutions to this class of problem are likely to be surplus to requirements.

I'm just trying to get to know the technology I find, and make sure that if I use it I don't accidentally hurt my friends and colleagues :)

That's a noble sentiment, but you can't stop all harms from the start short of bubblewrap solutions. Pain teaches the lessons we need in order to grow.

(11) By Konstantin Khomutov (kostix) on 2023-10-10 18:40:01 in reply to 10.1 [link] [source]

I can't remember which git subcommand does this, but I do remember being struck by its output format: RFC822 mail, suitable for piping to mutt or similar.

A fun fact: Git even has a git send-email script which does what its name says: sends an e-mail message (with your patch series) ;-)

People routinely use it to send their series to the main Git list.

(12) By Marcelo Huerta (richieadler) on 2023-10-10 21:34:38 in reply to 10.1 [link] [source]

"Very specific" yet unstated. Hmmm.

Not the OP, but I have to disagree most vehemently with you. As far as I can see the very specific harassment in question is stated very clearly in the title of the thread.

You may not be aware of this, but deadname is the way trans people refer to the name they were assigned at birth, and which they reject as part of the past identity they want to leave behind.

A very common form of harassment by transphobes is to misgender trans people and use their deadnames instead of their new chosen names. Transphobes are particularly nasty about this, following people through their various online activities and doxxing them, and AFAIK it's usually painful for a trans person to be linked with their deadnames for whatever reason. This can have serious repercussions at their work and in their communities.

(On another note, optimism regarding the social change needed to avoid this specific kind of harassment is, I think, somewhat naïve, given that the main source of ideas causing the harassment is not going to disappear any time soon. In the mean --probably too long-- time until some significant progress is made, any technological assistance that can be rendered is, I'd say, useful.)

(13) By Warren Young (wyoung) on 2023-10-10 22:00:11 in reply to 12 [link] [source]

they reject as part of the past identity they want to leave behind.

Then they can create a new login on the repo and begin committing under the new name.

I don't see that a desire to abandon a past identity requires Fossil to support counterfactual history rewriting.

following people through their various online activities and doxxing them

If that's illegal where you are and you have reasonable recourse to the law, that takes it out of the sphere of tech solutions, and if not, fix that instead.

optimism regarding the social change needed to avoid this specific kind of harassment is, I think, somewhat naïve,

I did say I expect it to take generations, and you only have to look back over the past few generations to see quite a lot of it occurring.

I get that it's tempting to wish for instant total global social change by twiddling some knobs in the tech, but that doesn't change the people. Only time and societal will does that.

any technological assistance that can be rendered is, I'd say, useful.)

Be specific.

We've explained how "fixing" this in the Git context requires invalidating the entire repo, and how it doesn't in Fossil. What more do you want or expect?

That's not a brush-off, it's an invitation to get specific. What would be necessary to please you? Why do you believe that would work without creating worse problems?

(14) By spindrift on 2023-10-11 19:40:51 in reply to 10.1 [link] [source]

Interestingly, on an off-topic issue, while I see that your reply below has closed the thread, it appears that replies are still possible.

Is this expected, or a separate fossil forum bug?

(15.1) By Stephan Beal (stephan) on 2023-10-11 20:45:47 edited from 15.0 in reply to 14 [link] [source]

Is this expected, or a separate fossil forum bug?

Closing applies to a given post and all responses, so it's possible to close off individual branches of a thread. In order to close a whole thread, the top-most post has to be closed. That was the intent of this particular closure and it will be corrected momentarily.

PS: additionally, forum admins can always respond to closed posts.