Fossil

Check-in [9c570550]

Overview
Comment: Performance bugfix. nextmap/premap can still be performance killers and memory hogs. Moved the computation of successor changesets down to the type-dependent code (new methods) and the SQL database, i.e. the C level. In the setup so far the DB could deliver millions of file-level dependency pairs, which the Tcl level would then reduce to tens of actual changeset dependencies. Tcl did not cope well with that amount of data. Now the reduction happens in the query itself. A concrete example was a branch in the Tcl CVS generating nearly 9 million pairs, which reduced to roughly 200 changeset dependencies. This blew out the memory and the converter ground to a halt, busily swapping. With the causes behind us, also added another index on 'csitem(iid)' to speed up the search for changesets from the revisions, tags, and branches.
SHA1: 9c5705502507e993fa2486d31093571139db2128
User & Date: aku 2007-12-02 05:49:00
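
The comment above describes moving the reduction of file-level dependency pairs from Tcl into the query. The following is a minimal sketch of that idea, not the converter's code: it assumes the `sqlite3` Tcl package and Tcl 8.5 or later, and uses a deliberately simplified, hypothetical schema (`revision(rid, child)` and `csitem(cid, iid)`) with made-up data. One changeset holds 1000 revisions and a second holds their children; both variants arrive at the same single successor changeset, but only the Tcl variant has to pull all the pairs out of the database first.

```tcl
package require sqlite3

sqlite3 db :memory:

# Simplified, hypothetical schema: only the columns this sketch needs.
db eval {
    CREATE TABLE revision (rid INTEGER PRIMARY KEY, child INTEGER);
    CREATE TABLE csitem   (cid INTEGER NOT NULL, iid INTEGER NOT NULL);
    CREATE INDEX csitem_iid ON csitem (iid);
}

# Changeset 1 holds revisions 1..1000, changeset 2 holds their
# children 1001..2000 (made-up data).
db transaction {
    for {set i 1} {$i <= 1000} {incr i} {
        set child [expr {$i + 1000}]
        db eval {
            INSERT INTO revision VALUES ($i, $child);
            INSERT INTO revision VALUES ($child, NULL);
            INSERT INTO csitem   VALUES (1, $i);
            INSERT INTO csitem   VALUES (2, $child);
        }
    }
}

# Variant 1: fetch every (revision, child) pair, reduce in Tcl.
set itemmap [dict create]   ;# item -> changeset
db eval {SELECT iid, cid FROM csitem} { dict set itemmap $iid $cid }

set pairs [db eval {
    SELECT R.rid, R.child
    FROM   revision R, csitem CI
    WHERE  CI.cid = 1
    AND    CI.iid = R.rid
    AND    R.child IS NOT NULL
}]
set csets {}
foreach {rid child} $pairs {
    lappend csets [dict get $itemmap $child]
}
set viaTcl [lsort -unique $csets]

# Variant 2: let the query map children to changesets and
# remove the duplicates itself.
set viaSql [db eval {
    SELECT DISTINCT CX.cid
    FROM   revision R, csitem CI, csitem CX
    WHERE  CI.cid = 1
    AND    CI.iid = R.rid
    AND    R.child IS NOT NULL
    AND    CX.iid = R.child
}]

puts "pairs fetched by the Tcl variant: [expr {[llength $pairs] / 2}]"
puts "successors via Tcl: $viaTcl"
puts "successors via SQL: $viaSql"
db close
```

With this data the Tcl variant moves 1000 pairs across the database boundary to produce a one-element result, while the SQL variant returns that one element directly.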
Context
2007-12-02
06:17  Fix table linkage in query, and duplicated conditions :( check-in: f7cca3f0 user: aku tags: trunk
05:49  Performance bugfix. Moved the computation of successor changesets down to the type-dependent code and the SQL database; added an index on 'csitem(iid)'. Full comment under Overview. check-in: 9c570550 user: aku tags: trunk
04:55  Bugfix. Have the symbol dependency retrieval commands actually return something. check-in: 71201058 user: aku tags: trunk
Changes

Changes to tools/cvs2fossil/lib/c2f_pinitcsets.tcl.

Old version:

	state writing csitem {
	    cid  INTEGER  NOT NULL  REFERENCES changeset,
	    pos  INTEGER  NOT NULL,
	    iid  INTEGER  NOT NULL, -- REFERENCES revision|tag|branch
	    UNIQUE (cid, pos),
	    UNIQUE (cid, iid)
	}


	project::rev getcstypes
	return
    }

    typemethod load {} {
	# Pass manager interface. Executed to load data computed by
New version:

	state writing csitem {
	    cid  INTEGER  NOT NULL  REFERENCES changeset,
	    pos  INTEGER  NOT NULL,
	    iid  INTEGER  NOT NULL, -- REFERENCES revision|tag|branch
	    UNIQUE (cid, pos),
	    UNIQUE (cid, iid)
	} { iid }
	# Index on: iid (successor/predecessor retrieval)

	project::rev getcstypes
	return
    }

    typemethod load {} {
	# Pass manager interface. Executed to load data computed by
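
The new `{ iid }` argument and its comment indicate an additional index on `csitem(iid)`. As a rough illustration only, the sketch below shows what that amounts to in plain SQLite, assuming the `sqlite3` Tcl package. The index name `csitem_iid` is hypothetical (the actual name is chosen by `state writing`), and the `REFERENCES` clause is left out to keep the snippet self-contained.

```tcl
package require sqlite3

sqlite3 db :memory:
db eval {
    CREATE TABLE csitem (
        cid  INTEGER  NOT NULL,
        pos  INTEGER  NOT NULL,
        iid  INTEGER  NOT NULL,
        UNIQUE (cid, pos),
        UNIQUE (cid, iid)
    );
    -- Hypothetical index name, for illustration only.
    CREATE INDEX csitem_iid ON csitem (iid);
}

# List all indices on the table, including the automatic ones
# SQLite creates for the UNIQUE constraints.
set indices [db eval {
    SELECT name FROM sqlite_master
    WHERE  type = 'index' AND tbl_name = 'csitem'
    ORDER BY name
}]
puts "indices on csitem: $indices"

# Show how SQLite plans a lookup of changesets by item id.
db eval {EXPLAIN QUERY PLAN SELECT cid FROM csitem WHERE iid = 42} row {
    puts $row(detail)
}
db close
```

The two UNIQUE constraints both lead with `cid`, so a lookup by `iid` alone has no index to search without the extra one; the exact plan text printed depends on the SQLite version.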

Changes to tools/cvs2fossil/lib/c2f_prev.tcl.

Old version:

    delegate method bysymbol   to mytypeobj
    delegate method byrevision to mytypeobj
    delegate method isbranch   to mytypeobj
    delegate method istag      to mytypeobj

    method setpos {p} { set mypos $p ; return }
    method pos    {}  { return $mypos }

    # result = dict (item -> list (changeset))
    method successormap {} {
	# NOTE / FUTURE: Possible bottleneck.
	array set tmp {}
	foreach {rev children} [$self nextmap] {
	    foreach child $children {
		lappend tmp($rev) $myitemmap($child)
	    }
	    set tmp($rev) [lsort -unique $tmp($rev)]
	}
	return [array get tmp]
    }

    # result = list (changeset)
    method successors {} {
	# NOTE / FUTURE: Possible bottleneck.
	set csets {}

	foreach {_ children} [$self nextmap] {
	    foreach child $children {
		lappend csets $myitemmap($child)
	    }
	}
	return [lsort -unique $csets]
    }

    # result = dict (item -> list (changeset))
    method predecessormap {} {
	# NOTE / FUTURE: Possible bottleneck.
	array set tmp {}
	foreach {rev children} [$self premap] {
	    foreach child $children {
		lappend tmp($rev) $myitemmap($child)
	    }
	    set tmp($rev) [lsort -unique $tmp($rev)]
	}
................................................................................
	    WHERE  R.rid IN $theset
	    AND    B.first = R.rid
	"] {
	    lappend dependencies([list rev $rid]) [list sym::branch $parent]
	}
	return
    }
}

# # ## ### ##### ######## ############# #####################
## Helper singleton. Commands for tag symbol changesets.

snit::type ::vc::fossil::import::cvs::project::rev::sym::tag {
    typemethod byrevision {} { return 0 }
................................................................................
	    AND    T.sid = P.sid
	    AND    P.pid = TX.sid
	"] {
	    lappend dependencies([list sym::tag $tid]) [list sym::tag $parent]
	}
	return
    }
}

# # ## ### ##### ######## ############# #####################
## Helper singleton. Commands for branch symbol changesets.

snit::type ::vc::fossil::import::cvs::project::rev::sym::branch {
    typemethod byrevision {} { return 0 }
................................................................................
	    FROM   branch B, preferedparent P, tag T
	    WHERE  B.bid IN $theset
	    AND    B.sid = P.sid
	    AND    P.pid = T.sid
	"] {
	    lappend dependencies([list sym::branch $bid]) [list sym::tag $parent]
	}
	return
    }

    # # ## ### ##### ######## #############
    ## Configuration

    pragma -hasinstances   no ; # singleton

New version:

    delegate method bysymbol   to mytypeobj
    delegate method byrevision to mytypeobj
    delegate method isbranch   to mytypeobj
    delegate method istag      to mytypeobj

    method setpos {p} { set mypos $p ; return }
    method pos    {}  { return $mypos }

    # result = list (changeset)
    method successors {} {
	return [struct::list map \
		    [$mytypeobj cs_successors $myitems] \
		    [mytypemethod of]]
    }

    # result = dict (item -> list (changeset))
    method successormap {} {
	# NOTE / FUTURE: Definitive bottleneck (can be millions of pairs).
	#
	# Only user is pass 9, computing the limits of backward
	# branches per branch in the changeset. TODO: Fold that into
	# the SQL query, i.e. move the crunching from Tcl to C.

	array set tmp {}
	foreach {rev children} [$self nextmap] {
	    foreach child $children {
		lappend tmp($rev) $myitemmap($child)
	    }
	    set tmp($rev) [lsort -unique $tmp($rev)]
	}
	return [array get tmp]
    }

    # result = dict (item -> list (changeset))
    method predecessormap {} {
	# NOTE / FUTURE: Definitive bottleneck (can be millions of pairs).
	#
	# Only user is pass 9, computing the limits of backward
	# branches per branch in the changeset. TODO: Fold that into
	# the SQL query, i.e. move the crunching from Tcl to C.

	array set tmp {}
	foreach {rev children} [$self premap] {
	    foreach child $children {
		lappend tmp($rev) $myitemmap($child)
	    }
	    set tmp($rev) [lsort -unique $tmp($rev)]
	}
................................................................................
	    WHERE  R.rid IN $theset
	    AND    B.first = R.rid
	"] {
	    lappend dependencies([list rev $rid]) [list sym::branch $parent]
	}
	return
    }

    # result = list (changeset-id)
    typemethod cs_successors {revisions} {
        # This is a variant of 'successors' which maps the low-level
        # data directly to the associated changesets. I.e. instead of
        # millions of dependency pairs (in extreme cases; example: Tcl
        # CVS) we return a very short and much more manageable list
        # of changesets.

	set theset ('[join $revisions {','}]')
	return [state run "
	    SELECT C.cid
	    FROM   revision R, csitem CI, changeset C
	    WHERE  R.rid   IN $theset     -- Restrict to revisions of interest
	    AND    R.child IS NOT NULL    -- Has primary child
            AND    CI.iid = R.rid
            AND    C.cid = CI.cid
            AND    C.type = 0
    UNION
	    SELECT C.cid
	    FROM   revision R, revisionbranchchildren B, csitem CI, changeset C
	    WHERE  R.rid   IN $theset     -- Restrict to revisions of interest
	    AND    R.rid = B.rid          -- Select subset of branch children
            AND    CI.iid = R.rid
            AND    C.cid = CI.cid
            AND    C.type = 0
    UNION
	    SELECT C.cid
	    FROM   revision R, revision RA, csitem CI, changeset C
	    WHERE  R.rid   IN $theset      -- Restrict to revisions of interest
	    AND    R.isdefault             -- Restrict to NTDB
	    AND    R.dbchild IS NOT NULL   -- and last NTDB belonging to trunk
	    AND    RA.rid = R.dbchild      -- Go directly to trunk root
	    AND    RA.child IS NOT NULL    -- Has primary child.
            AND    CI.iid = R.rid
            AND    C.cid = CI.cid
            AND    C.type = 0
            AND    CI.iid = R.rid
            AND    C.cid = CI.cid
            AND    C.type = 0
    UNION
	    SELECT C.cid
	    FROM   revision R, tag T, csitem CI, changeset C
	    WHERE  R.rid in $theset
	    AND    T.rev = R.rid
            AND    CI.iid = T.tid
            AND    C.cid = CI.cid
            AND    C.type = 1
    UNION
	    SELECT C.cid
	    FROM   revision R, branch B, csitem CI, changeset C
	    WHERE  R.rid in $theset
	    AND    B.root = R.rid
            AND    CI.iid = B.bid
            AND    C.cid = CI.cid
            AND    C.type = 2
	"]
    }
}

# # ## ### ##### ######## ############# #####################
## Helper singleton. Commands for tag symbol changesets.

snit::type ::vc::fossil::import::cvs::project::rev::sym::tag {
    typemethod byrevision {} { return 0 }
................................................................................
	    AND    T.sid = P.sid
	    AND    P.pid = TX.sid
	"] {
	    lappend dependencies([list sym::tag $tid]) [list sym::tag $parent]
	}
	return
    }

    # result = list (changeset-id)
    typemethod cs_successors {tags} {
	# Tags have no successors.
	return
    }
}

# # ## ### ##### ######## ############# #####################
## Helper singleton. Commands for branch symbol changesets.

snit::type ::vc::fossil::import::cvs::project::rev::sym::branch {
    typemethod byrevision {} { return 0 }
................................................................................
	    FROM   branch B, preferedparent P, tag T
	    WHERE  B.bid IN $theset
	    AND    B.sid = P.sid
	    AND    P.pid = T.sid
	"] {
	    lappend dependencies([list sym::branch $bid]) [list sym::tag $parent]
	}
	return
    }

    # result = list (changeset-id)
    typemethod cs_successors {branches} {
        # This is a variant of 'successors' which maps the low-level
        # data directly to the associated changesets. I.e. instead of
        # millions of dependency pairs (in extreme cases; example: Tcl
        # CVS) we return a very short and much more manageable list
        # of changesets.

	set theset ('[join $branches {','}]')
        return [state run "
	    SELECT C.cid
	    FROM   branch B, revision R, csitem CI, changeset C
	    WHERE  B.bid IN $theset
	    AND    B.first = R.rid
            AND    CI.iid = R.rid
            AND    C.cid = CI.cid
            AND    C.type = 0
    UNION
	    SELECT C.cid
	    FROM   branch B, preferedparent P, branch BX, csitem CI, changeset C
	    WHERE  B.bid IN $theset
	    AND    B.sid = P.pid
	    AND    BX.sid = P.sid
            AND    CI.iid = BX.bid
            AND    C.cid = CI.cid
            AND    C.type = 2
    UNION
	    SELECT C.cid
	    FROM   branch B, preferedparent P, tag T, csitem CI, changeset C
	    WHERE  B.bid IN $theset
	    AND    B.sid = P.pid
	    AND    T.sid = P.sid
            AND    CI.iid = T.tid
            AND    C.cid = CI.cid
            AND    C.type = 1
	"]
	return
    }

    # # ## ### ##### ######## #############
    ## Configuration

    pragma -hasinstances   no ; # singleton