A Python tool that produces lifespan sequences from an edit history.
The tool was first developed for the Wikipedia edit history but can easily be adapted to other applications. From a database containing, for each article, its list of revisions, it produces one CSV file per article containing the authored sequences and their lifespans.
Output format:
i,j,lifespan,author
with
- i : beginning of the character sequence
- j : end of the character sequence
- lifespan : number of edits the sequence has survived up to the latest revision
- author : author id of the sequence.
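
As an illustration of how the output can be consumed, here is a minimal sketch that reads rows in the format above. The names `Sequence` and `read_sequences` are hypothetical and not part of the tool. The sketch assumes that i, j and lifespan are integers, keeps the author id as a string, and skips a header row if one is present.

```python
import csv
import io
from typing import NamedTuple


class Sequence(NamedTuple):
    """One output row (hypothetical helper type, not part of the tool)."""
    i: int         # beginning of the character sequence
    j: int         # end of the character sequence
    lifespan: int  # number of edits the sequence has survived
    author: str    # author id, kept as a string (assumption)


def read_sequences(handle):
    """Parse rows of the form i,j,lifespan,author from a file-like object."""
    rows = []
    for record in csv.reader(handle):
        # Skip blank lines and an optional "i,j,lifespan,author" header row.
        if not record or record[0].strip() == "i":
            continue
        i, j, lifespan, author = (field.strip() for field in record)
        rows.append(Sequence(int(i), int(j), int(lifespan), author))
    return rows


# Made-up sample data standing in for one per-article csv file.
sample = io.StringIO("i,j,lifespan,author\n0,12,5,42\n13,40,1,7\n")
sequences = read_sequences(sample)

# Example query: the sequence that survived the most edits.
longest = max(sequences, key=lambda s: s.lifespan)
print(longest)
```

To read a real file, pass an open file object instead of the in-memory sample, for example `open(path, newline="")`.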