Fossil Forum

Bug with ADDED_BY_MERGE files?

Bug with ADDED_BY_MERGE files?

Bug with ADDED_BY_MERGE files?

(1) By Florian Balmer (florian.balmer) on 2021-09-16 06:28:20 [source]

I'm frequently transferring series of commit diffs, or partial or full commits to other branches or other repositories, mostly in a semi-automated fashion with some home-made scripts.

During one such undertaking, I encountered the following warning:

ERROR: [FILE] is different on disk compared to the repository
NOTICE: Repository version of [FILE] stored in [FILE-0123456789abcdef]
working checkout does not match what would have ended up in the repository: 00112233445566778899aabbccddeeff versus ffeeddccbbaa99887766554433221100

The message scared me and prompted me to deeply check all my disk drives, and to verify the integrity of all my backup media and recent backups, amounting to quite some work. That's why I'm reporting the issue with a minimal reproduction scenario.

The important point is that an ADDED_BY_MERGE file is replaced with another file of the same size (and optionally a time stamp from the past, i.e. extracted from a Fossil-generated tarball of a past check-in, but this is not required) before commit.

rem Copy/paste to command shell.
rem Skip the `rem' lines on *nix.
fossil init sample.fossil
fossil open sample.fossil
echo ABCDEF[6] > sample.txt
fossil add sample.txt
fossil ci -m t0 --no-warnings
fossil up prev
fossil merge tip
echo GHIJKL[6] > sample.txt
rem Set time stamp to 1 hour ago (not necessary for this test).
rem powershell -nologo -c "(Get-Item 'sample.txt').LastWriteTime=(Get-Date).AddHours(-1)"
fossil sql ".headers on" ".mode box" "SELECT * FROM vfile;"
fossil ci -m b0 --branch b0 --no-warnings
echo GHIJKLM[7] > sample.txt
fossil ci -m b0 --branch b0 --no-warnings

The commit only succeeds after changing sample.txt a second time so that it has a different file size.

Is this a bug?

And if yes, is this a fix?

Index: src/checkin.c
--- src/checkin.c
+++ src/checkin.c
@@ -2572,11 +2572,11 @@
   ** table. If there were arguments passed to this command, only
   ** the identified files are inserted (if they have been modified).
     "SELECT id, %Q || pathname, mrid, %s, %s, %s FROM vfile "
-    "WHERE chnged IN (1, 7, 9) AND NOT deleted AND is_selected(id)",
+    "WHERE chnged IN (1, 2, 3, 4, 5, 7, 9) AND NOT deleted AND is_selected(id)",
     glob_expr("pathname", db_get("crlf-glob",db_get("crnl-glob",""))),
     glob_expr("pathname", db_get("binary-glob","")),
     glob_expr("pathname", db_get("encoding-glob",""))

With this patch, the commit routine performs some extra work to detect changes in files already added to the VFILE table by merge operation (see comment on the VFILE.CHNGED field).

Maybe that my use case is not quite regular, but in the end it's just scripted replaying of merges and commits, with on-the-fly modifications to individual files, so nothing too fancy. You know, poor man's rebasing ;-)

Tests and patches based on Fossil [88e5336007].

(2.1) By Florian Balmer (florian.balmer) on 2021-09-16 10:30:53 edited from 2.0 in reply to 1 [link] [source]

P.S. Maybe I should elaborate why I think this might be a bug:

In Fossil, it's completely legal to modify an UPDATED_BY_MERGE file prior to committing -- in fact, this is even required to resolve merge conflicts, for example. So my feeling is that the same should be allowed with ADDED_BY_MERGE files.

P.P.S. I'd like to add it's already possible to edit ADDED_BY_MERGE files -- just not if the edited version happens to have the same size as the original.

(3) By Larry Brasfield (larrybr) on 2021-09-16 13:45:55 in reply to 2.1 [link] [source]

That size-preserving change clearly should be handled like ones where the size did change. So not Bug? but Bug!

(4) By Florian Balmer (florian.balmer) on 2021-09-17 11:14:50 in reply to 3 [link] [source]

Problem is: I'm not sure if my suggested fix is correct. I've tried several approaches, and the one posted above seems the least invasive and doesn't introduce new failures with the Fossil test suite.

The commit also succeeds after fossil changes is run, and this seems to fill in the VFILE.MTIME field, which is left empty by the merge operation. That's why I've added the fossil sql command to dump the VFILE table, hoping this is one more hint to someone with insight to verify or falsify my suggestion.

(5) By jamsek on 2021-09-17 11:52:41 in reply to 2.1 [link] [source]

P.P.S. I'd like to add it's already possible to edit ADDED_BY_MERGE files -- just not if the edited version happens to have the same size as the original.

If this wasn't the case, I'd likely argue that it is not a bug.

However, given the quote, I agree that it should be fixed.

(6) By Florian Balmer (florian.balmer) on 2021-09-17 12:16:33 in reply to 5 [link] [source]

Yes, the problem kind of touches the Fossil philosophy to do multiple checks before storing anything in the repository, that's the reason why it seems important to have an expert's opinion.

But consistent behavior also seems important -- which is your opinion too, according to your answer.

(7) By jamsek on 2021-09-17 12:22:58 in reply to 6 [link] [source]

Are you in the chat? I'd pose the question there, tbh.

Your change looks ok to me, and, yes, I think consistency is important. Your
diff is minimal, too, but there may well be some reason it is the way it is so
it can't hurt to get Richard or Stephan to review.

If you're not in the chat, I'll link your thread in there for more input.

(8) By Florian Balmer (florian.balmer) on 2021-09-17 13:33:43 in reply to 7 [link] [source]

Thanks for your interest.

I'm currently online sporadically -- so not in chat. But my experience is that forum posts also get attention, and there's no hurry with this issue. But feel free to ask this in chat.

(9) By Richard Hipp (drh) on 2021-09-17 15:02:53 in reply to 8 [link] [source]

I posted a patch (in the fossil patch format) on the chatroom. Please go get it and try it out.

Aside: I suppose we need a way to add small attachments like this to forum posts...

(10) By Larry Brasfield (larrybr) on 2021-09-17 15:07:32 in reply to 5 [link] [source]

It would have also been a bug if (upon commit) Fossil ignored edits to files added by a merge. If reconciling the merge's changes, whatever they are, whether to files long in the project or new, requires changes to new files, then those changes should be honored at the next commit. It would be weird for Fossil to require a pair of commits for this special case.1

Once the above point is granted, it is then clear that for Fossil to reject edits to newly added files just because they happen to retain their size is also a bug, (and an insidious one.)

  1. Imagine the commit comments:
    (1st commit) Merge trunk changes, almost. Do not build or test because (see next commit.)
    (2nd commit) Merge reconciliation required changes to this added file.

(11) By Florian Balmer (florian.balmer) on 2021-09-18 08:52:28 in reply to 9 [link] [source]

I too can confirm your patch fixes the problem. Thank you!

I tested commit-patch-1.patch (1536 bytes) attached to chat message #8397.

(12) By Florian Balmer (florian.balmer) on 2021-09-18 08:53:11 in reply to 10 [link] [source]

I see it the same way.

The "insidious" part for me was that I was lost in thought watching my script running, and suspected a pre-/post-commit checksum mismatch due to a hardware failure! It took me some time to reproduce the problem, but my relief was worth it ;-)