File History

History of tools/cvs2fossil/lib/c2f_prev.tcl

Let's bring down the number of eol-spaces somewhat more. No functional changes. file: [20798784] check-in: [e757cd3d] user: jan.nijtmans branch: trunk, size: 61718
Extended main import method (pushto) to handle all types of changesets, not only revisions. Tag changesets lead to tagging of imported revisions, branch changesets reflect the proper location where branches start, and make it possible to handle tagging of branches without revisions as well. Modified code returning changesets for a project to return all, not only revisions, in sync with the previous. Changed the code determining tag/branch lod's to use table 'preferedparent'. file: [b2c73145] check-in: [983090a3] user: aku branch: trunk, size: 61723
Updated method 'drop' of changesets, the in-memory parts have migrated to 'destroy' as part of the work on pass InitCSets. file: [2f48609c] check-in: [8dd5afbc] user: aku branch: trunk, size: 57982
Updated my notes regarding memory usage. Converted more locations to incremental query processing via 'state foreachrow', now throughout the importer. file: [dfc591f7] check-in: [f637d422] user: aku branch: trunk, size: 58323
New command 'state foreachrow' for incremental result processing, using less memory. Converted a number of places in pass InitCSet to this command, and marked a number of other places for possible future use. file: [f8ed1ba0] check-in: [6559f323] user: aku branch: trunk, size: 58292
Plugged memory leak in changeset destructor. Updated commentary. Reformatting of a few integrity checks for readability. file: [53853a44] check-in: [4b0f43fb] user: aku branch: trunk, size: 57968
Changed the encoding of the values stored in DEPC. Keep only start/end of the range, not the list of all positions in it. That caused the memory-blowup. file: [1e64f84c] check-in: [59b54efa] user: aku branch: trunk, size: 58106
Split internals of breakinternaldependencies into more manageable pieces in prep for upcoming work on the handling of pseudo-dependencies. file: [4afc7209] check-in: [530168ec] user: aku branch: trunk, size: 57701
Merged bugfix [b3d61d7829] into the main branch for optimization of memory usage. file: [e87b9bea] check-in: [efec424a] user: aku branch: trunk, size: 56920
Merged bugfix [b3d61d7829] into this semi-abandoned branch just in case we will work on it again. Do it now instead of forgetting it later. file: [20a9a93e] check-in: [383c10f0] user: aku branch: trunk, size: 56127
Fixed bug made in [f46458d5bd] which prevented the saving of the changesets generated by the breaking of the internal dependencies. file: [85ce3459] check-in: [b3d61d78] user: aku branch: trunk, size: 55719
Added high-level logging for memory tracing to the code breaking the preliminary changesets. First runs indicate that the DEPC array becomes very large, caused by a high number of indirect dependencies (several hundred). file: [c3cfc3bb] check-in: [c2ad73ed] user: aku branch: trunk, size: 56866
Modified the changeset class to move handling of the changeset lists to fully after their creation and storage. This is item (3) in cvsfossil.txt. The results do not satisfy however. During the creation of each changeset memory usage is (fractionally) lower, however at the end, after all changesets have been loaded memory usage is consistently higher. The reason for that is not known. I am saving this for possible future evolution and usage, but will not pursue this further right now. The gains seem to be too small compared to the overall loss. InitializeBreakstate is likely a better target, despite its complexity. file: [604b326c] check-in: [faf57d74] user: aku branch: trunk, size: 56073
Reworked the basic structure of pass InitCSets to keep memory consumption down. Now incrementally creates, breaks, saves, and releases changesets, instead of piling them on before saving all at the end. Memory tracking confirms that this changes the accumulating mountain into a near-constant usage, with the expected spikes from the breaking. file: [cc210d25] check-in: [f46458d5] user: aku branch: trunk, size: 55665
Extended pass InitCsets and underlying code with more log output geared towards memory introspection, and added markers for special locations. Extended my notes with general observations from the first test runs over my example CVS repositories. file: [433b16c0] check-in: [27ed4f7d] user: aku branch: trunk, size: 55153
Tuned the handling of the vendor branch in case we have multiple different symbols representing it. The import pass now effectively merges these symbols into a single line of development. file: [d233796d] check-in: [6d5de5f1] user: aku branch: trunk, size: 54833
Properly initialize the array containing the changesets split by type. file: [c1187b45] check-in: [21d9664f] user: aku branch: trunk, size: 54354
Updated the copyright information of all files touched in the new year. file: [90fc37a9] check-in: [66235f24] user: aku branch: trunk, size: 54302
Get the line of development for changesets directly from the items and their lod references. The in-memory data from the meta table is out-of-date since the adjustment of parents in pass 'FilterSymbols'. Print the LOD information when sorting the changesets. file: [3b42fd43] check-in: [0d13da30] user: aku branch: trunk, size: 54297
Added tracking of file removal in changesets. file: [a49a0c12] check-in: [c9270189] user: aku branch: trunk, size: 52673
Accept a last trunk-changeset on a vendor branch with the :trunk: already defined, and warn. Force changeset to be vendor-only, out of trunk. file: [f526a431] check-in: [a1bbf19d] user: aku branch: trunk, size: 52658
Reworked the revision import to use the new state tracking system instead of the simple array. Moved some log outputs. Added a file listing the known problems to prevent me from forgetting stuff as it piles up :/ file: [c36a0d49] check-in: [e1dbf318] user: aku branch: trunk, size: 52319
Changeset handling, extended logging of how parent is determined. Fossil access, fixed importrev call to use correct workspace/repository. Fixed handling of output, stripping unwanted text, checking of output syntax. Extended logging. Added final 'rebuild'. NOTE: formation of the changesets/manifests is buggy, is not tracking unchanged files across changesets. Further not yet tracking when files have been removed. file: [b5624625] check-in: [9214c118] user: aku branch: trunk, size: 52859
Fix use (scoping) of revision items when looking for vendor branch data. file: [fd889684] check-in: [b405f4fc] user: aku branch: trunk, size: 52597
Reworked the code determining the parent of the currently committed changeset. It should now handle the transition from vendor branch to trunk correctly. file: [301208cc] check-in: [e8efbc31] user: aku branch: trunk, size: 52584
Tinkered with the revision information transferred from a changeset to push, to the fossil accessor code, modified the logging as well. file: [3d8ecf2b] check-in: [7c43583d] user: aku branch: trunk, size: 51255
Moved the most complex parts of pushto into their own commands. file: [696f01d0] check-in: [3cd599ca] user: aku branch: trunk, size: 51233
Added basic import of changesets. Note that this code is incomplete with regard to handling the various possible interactions between a vendor-branch and trunk. file: [5b323107] check-in: [348e45b0] user: aku branch: trunk, size: 50885
Broke package dependency cycle introduced when moving the cset load code from the InitCsets pass to the cset class. file: [079c4b00] check-in: [9e1b461b] user: aku branch: trunk, size: 47329
Moved the code loading changesets from state to its proper class. file: [6bd4e144] check-in: [49dd66f6] user: aku branch: trunk, size: 47455
More comments on sql statements. file: [08cfcb07] check-in: [6809145e] user: aku branch: trunk, size: 46843
Went to explicit var-substitution for the dynamic sql queries, makes formatting easier. file: [28af7570] check-in: [0ee9711e] user: aku branch: trunk, size: 43845
Removed lots of now dead code. Added a note to the last remaining user of the changeset method 'nextmap'. file: [9958d4b1] check-in: [3c0ef2c3] user: aku branch: trunk, size: 43355
Reworked ComputeLimits in the last breaker pass. Moved the heavy computation of the max predecessor / min successor data down to the sql in the changeset class. file: [f2b05566] check-in: [711e0002] user: aku branch: trunk, size: 49838
The performance was still not satisfying, even with faster recomputing of successors. Doing it multiple times (Building the graph in each breaker and sort passes) eats time. Caching in memory blows the memory. Chosen solution: Cache this information in the database.    Created a new pass 'CsetDeps' which is run between 'InitCsets' and 'BreakRevCsetCycles' (i.e. changeset creation and first breaker pass). It computes the changeset dependencies from the file-level dependencies once and saves the result in the state, in the new table 'cssuccessor'. Now the breaker and sort passes can get the information quickly, with virtually no effort. The dependencies are recomputed incrementally when a changeset is split by one of the breaker passes, for its fragments and its predecessors.    The loop check is now trivial, and integrated into the successor computation, with the heavy lifting for the detailed analysis and reporting moved down into the type-dependent SQL queries. The relevant new method is 'loops'. Now that the loop check is incremental the pass based checks have been removed from the integrity module, and the option '--loopcheck' has been eliminated. For paranoia the graph setup and modification code got its loop check reinstated as an assert, reusing the changeset report code.    Renumbered the breaker and sort passes. A number of places, like graph setup and traversal, loading of changesets, etc. got feedback indicators to show their progress.    The selection of revision and symbol changesets for the associated breaker passes was a bit on the slow side. We now keep changeset lists sorted by type (during loading or general construction) and access them directly. file: [bf483d02] check-in: [00bf8c19] user: aku branch: trunk, size: 48309
Bugfix. Typo. file: [c123dc0a] check-in: [c7847514] user: aku branch: trunk, size: 44539
Fix table linkage in query, and duplicated conditions :( file: [9c2b1c44] check-in: [f7cca3f0] user: aku branch: trunk, size: 44535
Performance bugfix. nextmap/premap can still be performance killers and memory hogs. Moved the computation of successor changesets down to the type-dependent code (new methods) and the SQL database, i.e. the C level. In the current setup it was possible that the DB would deliver us millions of file-level dependency pairs which the Tcl level would then reduce to tens of actual changeset dependencies. Tcl did not cope well with that amount of data. Now the reduction happens in the query itself. A concrete example was a branch in the Tcl CVS generating nearly 9 million pairs, which reduced to roughly 200 changeset dependencies. This blew the memory out of the water and the converter ground to a halt, busily swapping. Ok, causes behind us, also added another index on 'csitem(iid)' to speed the search for changesets from the revisions, tags, and branches. file: [a4cf3579] check-in: [9c570550] user: aku branch: trunk, size: 44627
Bugfix. Have the symbol dependency retrieval commands actually return something. file: [e427524c] check-in: [71201058] user: aku branch: trunk, size: 40774
Deactivated caching of the nextmap/premap data, with the indices the retrieval seems to be fast enough to allow us to reduce mem consumption. Tweaked log output, and sql formatting. file: [9054f716] check-in: [ac026148] user: aku branch: trunk, size: 40696
Bugfix in ValidateFragments, tweaked comment a bit, bugfix in SQL, reordered tables in the successor/predecessor queries a bit to show the actual progression of their use. file: [0103dd59] check-in: [fbfb5318] user: aku branch: trunk, size: 40576
Easier name for self-referential changesets, loopcheck. Made conditional on option --loopcheck, default off, and avoided if the general checks on changesets report trouble. Reinstated the loop check in the cycle breaker core in simpler form, reusing the new command in the changeset class. file: [6a7202a9] check-in: [0af7a3c8] user: aku branch: trunk, size: 40511
Moved the integrity checks for split fragments into separate command. Reworked breaking of internal dependencies to constrain the length of the pending list. That part of the system is still a memory hog, especially for large changesets. Added notes about this and the successor retrieval being a bottleneck. file: [b900e055] check-in: [c14e8f84] user: aku branch: trunk, size: 40523
Fixed bug in new changeset code, tagged and untagged item lists went out of sync. file: [e7ab6ef5] check-in: [facb4a87] user: aku branch: trunk, size: 38913
Replaced the checks for self-referential changesets in the cycle breaker with a scheme in the changeset class doing checks when splitting a changeset, which is also called by the general changeset integrity code, after each pass. Extended log output at high verbosity levels. Thorough checking of the fragments a changeset is to be split into. file: [6ad5e037] check-in: [b42cff97] user: aku branch: trunk, size: 38868
Renamed state table 'csrevision' to 'csitem' to reflect the new internals of changesets. Updated all places where it is used. file: [1c3ee2d5] check-in: [80b1e893] user: aku branch: trunk, size: 35513
Renamed changeset method to describe modified results, and updated the one invoker. file: [f84e4346] check-in: [61829b07] user: aku branch: trunk, size: 35523
Renamed changeset method to describe modified results, and updated the one invoker. Modified the sorting of time ranges. Now by max, min as tiebreaker, and object name as last tiebreaker. file: [aa2d0f35] check-in: [04d76a9e] user: aku branch: trunk, size: 35535
Simplified some code dealing with the item -> changeset map, using the changed semantics (1:n -> 1:1). file: [ee1979da] check-in: [39e19c0c] user: aku branch: trunk, size: 35532
Brought the variable names into alignment with the semantics, now again naming what is stored in them. file: [01c2b1ab] check-in: [deab4d03] user: aku branch: trunk, size: 35765
Reworked the in-memory databases of changesets. Objects now hold items, not only revisions. Tags, and branches are new possibilities. Lists of ids go to the type-dependent retrieval command. List of tagged items (type/id pairs) come back, and are in the API. The 1:n map revisions to changesets is now an 1:1-map tagged items to changeset. file: [2ef585bd] check-in: [0fcfbf78] user: aku branch: trunk, size: 35808
Implemented time ranges and dependency retrieval for the tag and branch based changesets. file: [ebc1cdf1] check-in: [b1666f8f] user: aku branch: trunk, size: 35323
Moved the existing successor/predecessor code from main class to the proper singleton. Fixed config of main class, isn't simple dispatch any longer. Simplified calculation of the readable representation of changesets and removed code which has become superfluous. file: [15d330b0] check-in: [70d22835] user: aku branch: trunk, size: 31573
Integrate the new singletons with the main class, route the relevant places to them. file: [225212ee] check-in: [c74fe3de] user: aku branch: trunk, size: 31099
This commit starts a series of incremental changes not quite completely overhauling the handling of changesets, i.e. of project-level revisions. Actually this series of changes already started with [8ce7ffff21] as the bug it fixes was found when the first parts of the overhaul tripped the new integrity conditions for the modified changesets.    Background: In the last few days I repeatedly ran against the wall of an assertion in pass 9, last of the cycle breakers, with the revision changesets coming in out of order when the symbols were added to the dependency graph.    While walking to the office, and later re-reading the relevant parts of cvs2svn again I had several insights. Trigger was the realization that giving the tag changesets successor dependencies was wrong. Tags describe a state, they have no successors. This caused the re-read, and I recognized that my handling of the symbol changesets was completely wrong, in that it used revisions as their data. It should have been the tags and branches. From there their actual dependencies (versus my reuse of revision dependencies) fell out naturally.    I have decided to commit my rewrite of the internals incrementally to make it easier to follow them, despite leaving the sourcebase in an unusable state during the series. One big commit would be much more difficult to understand.    The central change is to the changeset class, which becomes a trinity, holding either revisions, tags, or branches, with type-dependent code to retrieve their dependencies. The remainder of the changes are 'just' adaptions of the users to the changed API.    First change: Add outline of the helper classes, singletons actually, to hold the type-dependent functionality. file: [dd4e59c4] check-in: [27b15b70] user: aku branch: trunk, size: 30503
Added convenience method for assertions and used it in place of the existing if/trouble internal constructions. Changed API of 'log write' so that we can defer substitution of the message to when the write actually happens, and converted all places which would be hit by double-substitution. The remaining 'log write' calls will be converted incrementally. file: [836578d1] check-in: [47d52d1e] user: aku branch: trunk, size: 27658
Changesets, extended human readable representation, and tweaking of log output. file: [21310ece] check-in: [911d56a8] user: aku branch: trunk, size: 27662
Bugfixes when generating revision changesets. (1) The dependencies for a revision are a list, not single. (2) Use pseudo-dependencies to separate revisions of the same file from each other if they have no direct dependencies in the state. file: [c2418271] check-in: [67876506] user: aku branch: trunk, size: 27339
Code cleanup. Removed trailing whitespace across the board. file: [92ffd4d5] check-in: [b679ca33] user: aku branch: trunk, size: 25798
Changesets: Added accessor to retrieve number of changesets known, and dropped the "trunk root -> NTDB root" dependency, as it is problematic. file: [e10db583] check-in: [96167b2a] user: aku branch: trunk, size: 25806
Tweaked human readable representation of changesets to include their type. file: [41cc33d8] check-in: [0868adf9] user: aku branch: trunk, size: 25844
Created convenience methods to create the human readable representation of a changeset and lists of such, and made liberal use of them. file: [82795e9f] check-in: [87cf6090] user: aku branch: trunk, size: 25836
Bugfix in changeset class. Documented and fixed the SQL statements pulling the successor and predecessor information out of the state. It mishandled the Trunk <-> NTDB transitions. file: [d8c97279] check-in: [184c5632] user: aku branch: trunk, size: 25649
Bugfix in changeset class. Forgot to update the map from revisions to containing changesets when breaking the internal dependencies of the initial changesets. This affected only the first fragment as all the revisions put into separate fragments were still pointing to the original changeset. This led to bogus links at the level of changesets, the changeset was seemingly still referencing itself. file: [13709538] check-in: [17ec2d68] user: aku branch: trunk, size: 22066
Bugfix in the changeset class. The index from revisions to containing changesets is not 1:1, but 1:n. While only one revision changeset is possible there can also be zero or more symbol changesets. file: [d3b49cf4] check-in: [8c9030e3] user: aku branch: trunk, size: 21475
Added a number of assertions and must-not-happens with associated log output. Plus some small tweaks, and notes. file: [3c9d1e00] check-in: [eabaea87] user: aku branch: trunk, size: 20707
Fixed bug in the initialization of mybranchcode for changesets. file: [6a863159] check-in: [47e271a4] user: aku branch: trunk, size: 20558
Continued work on pass 8. Completed the handling of backward branches, file level analysis and splitting them. Extended changesets with the necessary methods to the predecessor data and proper per-revision maps. file: [e0ef5599] check-in: [e50f9ed5] user: aku branch: trunk, size: 20588
Continued work on pass 8, added outline for handling of retrograde branches, extended changesets with predicate allowing us to find the branch changesets. file: [75f13681] check-in: [4866889e] user: aku branch: trunk, size: 18881
Extended changeset class with in-memory database mapping from changeset ids to the proper object, and extended the objects with position information and associated accessors. Extended pass 8 to load the commit order computed in pass 6, this is stored in the new position slot of changesets, and an inverted index mapping from position to changeset at that position. file: [1769e1da] check-in: [de4cff41] user: aku branch: trunk, size: 18384
Modified the API for the construction of changesets a bit, now allowing their construction with the correct id, instead of correcting it later. Updated pass 5 to use this, and fixed bug where the id counter for changesets was left uninitialized, allowing the improper generation of duplicate ids. file: [04c278bb] check-in: [65be27aa] user: aku branch: trunk, size: 17991
Moved the functionality for splitting a changeset based on the sets of revisions for the fragments to be into a separate command, and into the changeset class, for use outside of changeset links. file: [da85eb5f] check-in: [59207428] user: aku branch: trunk, size: 17976
Added convenience command to the state package when the sql returns a single row. Added more statistics about revisions, tags, branches, symbols, changesets to various passes. file: [458c8956] check-in: [96b7bfb8] user: aku branch: trunk, size: 17413
Completed pass 7, breaking dependency cycles over symbol changesets. Moved the bulk of the cycle breaker code into its own class as it was common to the passes 6 and 7, and updated the two passes accordingly. Added code to load the changeset counter from the state to start properly. file: [bd4777dd] check-in: [770a9b57] user: aku branch: trunk, size: 17427
Completed pass 6, wrote the code performing the breaking of cycles. Done by analysing each triple of changesets in the cycle at the file dependency level to see which revisions can be sorted apart. Added some additional utility routines. Extended the changeset class with the accessors required by the cycle breaker. file: [b05edeeb] check-in: [94c39d63] user: aku branch: trunk, size: 17246
Continued work on pass 6. Completed creation of changeset graph (nodes, dependencies), started on topological iteration and breaking cycles. Basic iteration is complete, finding a cycle ditto. Not yet done is to actually break a found cycle. Extended the changeset class with the necessary accessor methods (getting cset type, successors, time range). Note: Looking at my code it may be that my decision to save the cset order caused this pass to subsume the RevisionTopologicalSortPass of cvs2svn. Check again when I am done. Note 2: The test case (tcl repository, tcl project) had no cycles. file: [27b0babb] check-in: [85bd219d] user: aku branch: trunk, size: 15627
Reworked the in-memory storage of changesets in pass 5 and supporting classes, and added loading of changesets from the persistent state for when the pass is skipped. file: [971e3d4d] check-in: [24c0b662] user: aku branch: trunk, size: 14801
Rewrote the algorithm for breaking internal dependencies to my liking. The complex part handling multiple splits has moved from the pass code to the changeset class itself, reusing the state computed for the first split. The state is a bit more complex to allow for its incremental update after a break has been done. Factored major pieces into separate procedures to keep the high-level code readable. Added lots of official log output to help debugging in case of trouble. file: [987e3e6c] check-in: [08ebab80] user: aku branch: trunk, size: 14582
Oops. pass 5 is not complete. Missed the breaking of internal dependencies, this is done in this pass already. Extended pass _2_ and file revisions with code to save the branchchildren (possible dependencies), and pass 5 and changesets with the proper algorithm. From cvs2svn, works, do not truly like it, as it throws away and recomputes a lot of state after each split of a cset. Could update and reuse the state to perform all splits in one go. Will try that next, for now we have a working form in the code base. file: [d69fb888] check-in: [95af789e] user: aku branch: trunk, size: 8298
Completed pass 5, computing the initial set of changesets. Defined persistent structure and filled out the long-existing placeholder class (project::rev). file: [855ccb92] check-in: [5f7acef8] user: aku branch: trunk, size: 3135
Added a lot of skeleton files for the revision and symbol data structures, for both project and file level. file: [72f0105a] check-in: [84de38d7] user: aku branch: trunk, size: 1676 Added