Logged In: YES
user_id=153576
This may become possible once we capture branch/tag
information properly.
Logged In: YES
user_id=975859
Related to this, better support for branches would be nice.
* Filter reports to a particular branch
* Indicate which branch a ChangeSet was committed to
Logged In: YES
user_id=385376
Adam, I have read the code. It is always hard to work with
someone else's code. So if you have any write-up to help me
along, please send it to me. Thanks.
Logged In: YES
user_id=153576
The code relating to CVS itself ( the objects, much of the
log parsing code and the math code ) is contained in the
CVSMonitor::MetaData::* namespace.
From the top down...
CVS Objects
------------
::Repository
Represents a repository.
Contains a number of ::Module's
::Module
A single "module" in the repository ( for CVS's very vague
definition of "module" ).
Contains ::File's and ::ChangeSet's.
::File
Represents a single CVS/RCS file.
Contains a number of ::Version's
::Version
Represents a single version/revision of a single file.
::ChangeSet
A virtual object derived from analysing all the ::Version's
in a ::Module. The math is done just after the module log
file is parsed into ::File's and ::Version's.
Links to a number of ::Version's directly, across a range of
::File's.
Also contains the "Karma" math.
::Delta
A ::Delta is an abstract parent module/interface used to
represent any "change" to a project. In our case, this
covers both a single ::Version and the virtual ::ChangeSet,
both of which represent changes to the code.
::Source
Because all the math, range and charting code is based
on ::ChangeSets, we define the Interface ( logically like a
Java Interface, but without the enforcement stuff ) ::Source.
Anything that is a ::Source acts as a "source" of ChangeSet
objects, from which other code can do various math. Anything
that acts as a consumer of ChangeSet data generally takes it
from ::Source objects. All a child of ::Source is required
to do is to implement the "getChangeSets" method.
::Author
This is a virtual object representing an Author. From memory,
it is derived at runtime as needed, by finding all the
distinct "author" values in all the ::Version's. ( It might
be done at update-time, but I can't recall exactly ).
::Activity
Activity is a set of changes. An ::Activity object is
essentially a logical Collection of ::ChangeSet objects that
can be modified.
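As a rough illustration of the ::Source idea described above, here is a minimal sketch. The Sketch::* packages, their constructors and count_changesets are hypothetical stand-ins, not the real CVSMonitor::MetaData classes. Only the getChangeSets method name comes from the description.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, NOT the real CVSMonitor::MetaData classes.
package Sketch::Source;

# Children must override this; the base version only complains.
sub getChangeSets {
    my ($self) = @_;
    die ref($self) . " must implement getChangeSets\n";
}

package Sketch::Module;
our @ISA = ('Sketch::Source');

sub new {
    my ($class, @changesets) = @_;
    return bless { changesets => [@changesets] }, $class;
}

sub getChangeSets {
    my ($self) = @_;
    return @{ $self->{changesets} };
}

package Sketch::Repository;
our @ISA = ('Sketch::Source');

sub new {
    my ($class, @modules) = @_;
    return bless { modules => [@modules] }, $class;
}

# A repository is a source too: it hands back every module's ChangeSets.
sub getChangeSets {
    my ($self) = @_;
    return map { $_->getChangeSets } @{ $self->{modules} };
}

package main;

# A consumer only needs the getChangeSets method, whatever the object is.
sub count_changesets {
    my ($source) = @_;
    my @sets = $source->getChangeSets;
    return scalar @sets;
}

my $repo = Sketch::Repository->new(
    Sketch::Module->new('cs1', 'cs2'),
    Sketch::Module->new('cs3'),
);
print count_changesets($repo), "\n";
```

The consumer never checks what kind of object it was given, which is the point of treating ::Source as an interface.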
How we do Search
------------------
A Search is primarily oriented around a single ::Activity
object.
For a ChangeSet search, the flow is as follows:
1. cvsmonitor.pl:findSearchSpace - Create an empty
::Activity ( )
2. cvsmonitor.pl:findSearchSpace - Populate it with the
contents of one or more ::Source's ( Repositories or Modules )
3. cvsmonitor.pl:cmdSearchChangeSet - Apply a series of
filters to the ::Activity via $Activity->filter ( see
CVSMonitor/MetaData/Activity.pm:filter )
4. cvsmonitor.pl:cmdSearchChangeSet - Sort it using
$Activity->sort
5. Hand it off to viewSearchChangeSetResults( $Activity ) to
display the results.
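The five steps above can be sketched roughly as follows. Sketch::Activity and its add, sort_by and list methods are hypothetical, as is the callback-style filter; the real $Activity->filter and $Activity->sort signatures are not shown in this thread and may well differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ::Activity; the real method signatures may differ.
package Sketch::Activity;

sub new {
    my ($class) = @_;
    return bless { changesets => [] }, $class;
}

sub add {
    my ($self, @changesets) = @_;
    push @{ $self->{changesets} }, @changesets;
    return $self;
}

# Keep only the ChangeSets for which the callback returns true.
sub filter {
    my ($self, $keep) = @_;
    $self->{changesets} = [ grep { $keep->($_) } @{ $self->{changesets} } ];
    return $self;
}

# Order by a numeric field (named sort_by here to avoid the builtin's name).
sub sort_by {
    my ($self, $field) = @_;
    $self->{changesets} =
        [ sort { $a->{$field} <=> $b->{$field} } @{ $self->{changesets} } ];
    return $self;
}

sub list {
    my ($self) = @_;
    return @{ $self->{changesets} };
}

package main;

# Two made-up sources of ChangeSets (plain hashes for brevity).
my @module_a = (
    { id => 1, author => 'adam', time => 300 },
    { id => 2, author => 'bob',  time => 100 },
);
my @module_b = ( { id => 3, author => 'adam', time => 200 } );

my $activity = Sketch::Activity->new;                 # 1. empty Activity
$activity->add(@module_a, @module_b);                 # 2. populate from sources
$activity->filter(sub { $_[0]{author} eq 'adam' });   # 3. apply a filter
$activity->sort_by('time');                           # 4. sort
my @ids = map { $_->{id} } $activity->list;           # 5. hand off for display
print "@ids\n";
```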
So to add "Search by branch", you'd probably need to do a
couple of things.
1. Add another named filter to sub
CVSMonitor::MetaData::Activity::filter, to implement
whatever concept of branch/tag search we want ( substring,
regex, dropdown list of the tags available? ).
2. Add the HTML for the textbox or whatever to page
"Search/ChangeSet"
3. Add anything to cvsmonitor.pl:viewSearchChangeSet as
needed ( perhaps to re-enter the value if nothing is found )
4. Add the code to cvsmonitor.pl:cmdSearchChangeSet to see
if they put in a value, and to filter the ::Activity object.
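A minimal sketch of what a named "branch" filter might look like, assuming a substring match. The %FILTERS table, apply_filter and the branches field on each ChangeSet are all assumptions made for illustration; the real filter sub may be organised quite differently.

```perl
use strict;
use warnings;

# Hypothetical table of named filters.
my %FILTERS = (
    author => sub {
        my ($changeset, $value) = @_;
        return $changeset->{author} eq $value;
    },
    # The new filter: a plain substring match against the branch names.
    branch => sub {
        my ($changeset, $value) = @_;
        my @hits = grep { index($_, $value) >= 0 }
                   @{ $changeset->{branches} || [] };
        return scalar @hits;
    },
);

sub apply_filter {
    my ($changesets, $name, $value) = @_;
    my $test = $FILTERS{$name} or die "Unknown filter '$name'\n";
    return [ grep { $test->($_, $value) } @$changesets ];
}

my @changesets = (
    { id => 1, author => 'adam', branches => ['HEAD'] },
    { id => 2, author => 'bob',  branches => ['oracle-dev'] },
    { id => 3, author => 'adam', branches => ['oracle-subdev'] },
);

# An empty search box means "no branch filter", as in step 4 above.
my $wanted = 'oracle';
my $result = length($wanted)
    ? apply_filter(\@changesets, 'branch', $wanted)
    : \@changesets;
print join(',', map { $_->{id} } @$result), "\n";
```

Swapping the index() call for a regex match, or for an exact comparison against a dropdown value, would cover the other search styles mentioned in step 1.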
-------------------------------------------------------------
On the separate issue of showing the branch/tag that a
ChangeSet is on... maybe you'll need either to logically
AND or OR ( there's different implications for each ) the
values for all the ::Version's within the ::ChangeSet... I'm
not sure exactly what the rule for the branch/tag of an
entire ::ChangeSet would be in our case...
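To make the AND versus OR choice concrete, here is a small sketch. It assumes each ::Version can already report a list of branch names, which is not yet the case; changeset_branches is a hypothetical helper.

```perl
use strict;
use warnings;

# Each argument after the mode is one ::Version's branch list (plain arrays).
# 'AND' keeps branches shared by every Version, 'OR' keeps any branch seen.
sub changeset_branches {
    my ($mode, @versions) = @_;
    my %count;
    for my $branches (@versions) {
        my %seen = map { $_ => 1 } @$branches;   # de-duplicate per Version
        $count{$_}++ for keys %seen;
    }
    my $needed = $mode eq 'AND' ? scalar(@versions) : 1;
    my @names  = grep { $count{$_} >= $needed } keys %count;
    my @sorted = sort { $a cmp $b } @names;
    return @sorted;
}

my @versions = (
    [ 'HEAD', 'oracle-dev' ],
    [ 'oracle-dev' ],
    [ 'oracle-dev', 'crackle' ],
);
my @shared = changeset_branches('AND', @versions);
my @any    = changeset_branches('OR',  @versions);
print "AND: @shared\n";
print "OR:  @any\n";
```

AND gives a strict answer that can come back empty for a mixed ChangeSet, while OR never loses a branch but may claim more than is useful.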
Logged In: YES
user_id=385376
Actually we will need another object for a file.
It will need to track the branch mappings.
Something I noticed is that a branch declaration always has
the suffix x.0.2
For example, branch_new: 1.5.0.2.
We need to store this. When looking for a branch, people will
request information on branch_new and we need to map back
to the head revision for that branch for that file.
Are we having fun yet?
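A hedged sketch of storing that mapping. It relies on CVS's documented "magic branch number" layout, where a 0 sits in the second position from the right, so 1.5.0.2 names a branch sprouting from 1.5 whose revisions are numbered 1.5.2.x. The trailing number is not always 2: further branches from the same revision get 4, 6 and so on. parse_branch_number and %branch_map are hypothetical names, and the input lines are made up.

```perl
use strict;
use warnings;

# Split a magic branch number such as 1.5.0.2 into the revision the branch
# sprouts from (1.5) and the prefix its own revisions carry (1.5.2).
# Returns nothing for ordinary revision numbers.
sub parse_branch_number {
    my ($number) = @_;
    my @parts = split /\./, $number;
    return unless @parts >= 4 && @parts % 2 == 0 && $parts[-2] eq '0';
    my $branch_id = pop @parts;   # the trailing even number, e.g. 2
    pop @parts;                   # discard the magic 0
    return {
        root   => join('.', @parts),
        prefix => join('.', @parts, $branch_id),
    };
}

# Hypothetical per-file mapping of branch name => parsed branch info.
my %branch_map;
for my $line ('branch_new: 1.5.0.2', 'release_1: 1.4') {
    my ($name, $number) = $line =~ /^\s*([^:\s]+):\s*([\d.]+)\s*$/ or next;
    my $info = parse_branch_number($number) or next;   # plain tags are skipped
    $branch_map{$name} = $info;
}
print "$_ => root $branch_map{$_}{root}, prefix $branch_map{$_}{prefix}\n"
    for sort keys %branch_map;
```

This only records where a branch is declared. Finding the head revision for that branch is a separate step.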
Logged In: YES
user_id=153576
I'm thinking we might have to move much of this out to
another class... I can see the math we need, but it might be
computationally expensive, so in the long run we would
probably want to freeze the branch data out to the scratch
like we do for ChangeSets...
Another thing, and a problem I've already encountered and
fixed, is that you can't trust version numbers. They aren't
reliable and they jump around at the whim of the
administrator of the CVS repository.
However, it is possible to determine, for any changeset, what
the previous changeset was. The algorithm is in there somewhere...
The net result is we have a (very fast) ->previous method to
get the previous ::Version.
To identify all the branches for a given Version, we should
be able to just iterate up the ->previous links and
accumulate any branch tags we find.
That assumes a "branch" goes from the branch head all the
way to the initial addition of the file. To work out the
values for a "sub-branch" ( the point AT WHICH WE SPLIT
through to the head ) just take all of a Version's branches and
subtract all those of its ->previous Version, if it exists.
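Read literally, the two operations above are a walk plus a set difference. A minimal sketch, assuming branch tags hang directly off individual Versions: Sketch::Version and its tags method are hypothetical, and only ->previous is taken from the description.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ::Version: a ->previous link and some branch tags.
package Sketch::Version;

sub new {
    my ($class, %args) = @_;
    return bless {
        previous => $args{previous},
        tags     => $args{tags} || [],
    }, $class;
}
sub previous { return $_[0]{previous} }
sub tags     { return @{ $_[0]{tags} } }

package main;

# Walk ->previous back to the first revision, collecting every tag seen.
sub all_branches {
    my ($version) = @_;
    my %found;
    for (my $v = $version; $v; $v = $v->previous) {
        $found{$_} = 1 for $v->tags;
    }
    return \%found;
}

# Branches this Version has that its ->previous Version does not.
sub sub_branches {
    my ($version) = @_;
    my $mine   = all_branches($version);
    my $prev   = $version->previous;
    my $theirs = $prev ? all_branches($prev) : {};
    my @only_mine = grep { !$theirs->{$_} } keys %$mine;
    my @sorted    = sort { $a cmp $b } @only_mine;
    return @sorted;
}

my $v1 = Sketch::Version->new(tags => ['HEAD']);
my $v2 = Sketch::Version->new(previous => $v1, tags => ['oracle-dev']);
my $v3 = Sketch::Version->new(previous => $v2);

my @all   = sort keys %{ all_branches($v3) };
my @added = sub_branches($v2);
print "all:   @all\n";
print "added: @added\n";
```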
To find the head of each branch, you want an algorithm that
goes over each version from oldest to newest, and sets the
value in a hash of { branch => revision } to the current
Version for each branch of the version. At the end, the last
values left in these should be heads of branches... or
alternatively, I think you could go the reverse direction,
and do a first-in, instead of last-in...
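Both directions can be sketched in a few lines. This assumes each version can already list the branches it belongs to, and that the versions arrive in age order; the hashes below are made-up data, and the revision strings are labels only.

```perl
use strict;
use warnings;

# Versions are plain hashes here, listed oldest first. Nothing below parses
# the 'revision' strings.
my @versions = (
    { revision => 'r1', branches => ['HEAD'] },
    { revision => 'r2', branches => [ 'HEAD', 'oracle-dev' ] },
    { revision => 'r3', branches => ['oracle-dev'] },
    { revision => 'r4', branches => ['HEAD'] },
);

# Oldest to newest, last write wins.
sub heads_forward {
    my (@ordered) = @_;
    my %head;
    for my $version (@ordered) {
        $head{$_} = $version->{revision} for @{ $version->{branches} };
    }
    return \%head;
}

# Newest to oldest, first write wins.
sub heads_backward {
    my (@ordered) = @_;
    my %head;
    for my $version (reverse @ordered) {
        $head{$_} //= $version->{revision} for @{ $version->{branches} };
    }
    return \%head;
}

my $forward  = heads_forward(@versions);
my $backward = heads_backward(@versions);
print "$_: $forward->{$_} / $backward->{$_}\n" for sort keys %$forward;
```

The two passes agree on this data. The backward pass could stop early once every known branch has a head, which the forward pass cannot do.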
Which means for a single file you should be able to derive:
    $File->{branches} = {
        HEAD            => [ '0.1', '0.353' ],
        'oracle-dev'    => [ '0.8', '0.8.0.42' ],
        crackle         => [ '0.8', '0.8.1.21' ],
        'oracle-subdev' => [ '0.8.0.21', '0.8.0.21.0.41' ],
    };
By finding and storing THIS data struct, you should be
able to cheaply determine:
1. Entire branch, ->previous from branch-head to file root
2. Sub-branch, ->previous from branch-head to branch-root
3. "Is a version in a branch", ->previous until we hit
either the branch-root or the file-root
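The three lookups might look like this. It is a sketch under assumptions: %previous stands in for the ->previous links, %branches for the { branch => [ root, head ] } struct, and the membership test is implemented as "does the revision appear on the walk from the branch head back to the branch root", which is one reading of item 3.

```perl
use strict;
use warnings;

# Made-up file history. The revision strings are opaque labels and are
# never parsed.
my %previous = (
    '0.2'     => '0.1',
    '0.3'     => '0.2',
    '0.2.2.1' => '0.2',
    '0.2.2.2' => '0.2.2.1',
);
my %branches = (
    HEAD => [ '0.1', '0.3' ],
    dev  => [ '0.2', '0.2.2.2' ],
);

# Follow ->previous from $from until $stop (inclusive) or the file root.
sub walk_back {
    my ($from, $stop) = @_;
    my @path;
    for (my $rev = $from; defined $rev; $rev = $previous{$rev}) {
        push @path, $rev;
        last if defined($stop) && $rev eq $stop;
    }
    return @path;
}

# 1. Entire branch: head back to the file root.
sub entire_branch {
    my ($name) = @_;
    return walk_back($branches{$name}[1]);
}

# 2. Sub-branch: head back to the branch root.
sub sub_branch {
    my ($name) = @_;
    my ($root, $head) = @{ $branches{$name} };
    return walk_back($head, $root);
}

# 3. Membership: is the revision on the head-to-root walk?
sub in_branch {
    my ($rev, $name) = @_;
    my @hits = grep { $_ eq $rev } sub_branch($name);
    return scalar @hits;
}

print join(' ', entire_branch('dev')), "\n";
print join(' ', sub_branch('dev')), "\n";
print in_branch('0.3', 'dev') ? "yes\n" : "no\n";
```

Each lookup costs at most one walk along the chain, so none of them needs to revisit the log once the struct is stored.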
I hope this gives you some algorithm ideas...
I certainly think there's enough code here to warrant its
own class though :)
Possibly ::File::Branch as a hash implementing the struct
above, and ::Module::Branch implementing some sort of
aggregate branch logic for all of the files in a
module/repository?
For a ChangeSet to be on a branch, EVERY version in the
ChangeSet would have to be within that version range,
although NOT on the root of it. I guess that means then
"each version is either the head or can walk ->previous back to the
root without hitting the root of another branch" and "the
previous of each is either the root or can walk back to the
root without hitting the root of another branch"... if you
get what I mean...
Thoughts?
Logged In: YES
user_id=153576
That should read
For any ::Version it's possible to tell what the previous
::Version was
Logged In: YES
user_id=153576
And just to double clarify :)
If you think of all the revisions in a file as a set of
linked nodes in a tree shape, any algorithm we implement
must be implemented _only_ by walking the links between nodes.
We can't use or trust version numbers for anything real
other than display purposes.