Name | Modified | Size | Downloads / Week |
---|---|---|---|
licenses | 2019-02-27 | | |
src | 2019-02-27 | | |
README.md | 2019-02-27 | 3.9 kB | |
Totals: 3 Items | | 3.9 kB | 0 |
Contact information
Any feedback is appreciated. You can email us at lihook48931@gmail.com.
Background
The open source software (OSS) movement has gained momentum in recent years, and a large number of OSS projects are freely available for public download. OSS improves software reusability and reduces cost, since developers can learn from the source code of existing OSS projects. However, software license analysis is a precondition for legally reusing components or modifying them for a specific purpose. The effects of having only limited rights to reuse and modify software components need to be taken into account, and for large systems this is a nontrivial process. It is therefore helpful to have automated tools that report which licenses are in use and provide other general views of the software.
When deciding whether to reuse software components (and modules), it is necessary to take licensing into account. A reuse-support tool or environment would provide a versatile profile of the candidate modules with regard to their reusability.
Introduction
We collected the texts of 92 standard licenses, and the collection is updated continuously. We propose an approach to license identification based on analyzing each sentence in the license statement of a source code file: each sentence in the license comments is marked as a sentence-token. We use the term sentence-token to refer to a sentence of a known license. A license (whether by-inclusion or by-reference) is a sequence of sentence-tokens. Sentence-tokens are generalized using one or more regular expressions.
We refer to the pair <sentence-token, regular expression> as a sentence-token expression. The objective of the set of sentence-token expressions is to translate each sentence found in a licensing statement into a sentence of a known license (a sentence-token).
We have identified 417 sentence-token expressions. For example, two of the sentence-token expressions matching variants of the sentence-token “GPLGen” (the license is inside a given file name) are:
GPLGen:00:1:^([^,;]+) is <licensed> under the terms of the GPL,? <version>$:2
GPLGen:01:1:^([^,;]+) is <LICENSED> under the GPL, <VERSION>$:
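As a rough illustration of how such expressions could be applied, the following Python sketch parses the two lines above and maps a sentence to a sentence-token. It is not the project's implementation. The meaning of the colon-separated fields, the `PLACEHOLDERS` expansions for `<licensed>` and `<version>`, the case-insensitive matching, and the function names `parse_expression` and `to_sentence_token` are all assumptions made for this example.

```python
import re

# Hypothetical expansions for the <licensed> and <version> placeholders.
# The real definitions are not given in this README.
PLACEHOLDERS = {
    "licensed": r"(?:licensed|released|distributed)",
    "version": r"(?:version|v\.?) ?(\d+(?:\.\d+)?)",
}


def parse_expression(line):
    """Split 'token:id:n:regex:tail' into parts (field meanings are assumed)."""
    token, variant, _count, rest = line.split(":", 3)
    # The regular expression is everything before the last colon.
    pattern, _, _tail = rest.rpartition(":")
    # Replace <name> placeholders, ignoring their letter case.
    expanded = re.sub(
        r"<(\w+)>", lambda m: PLACEHOLDERS[m.group(1).lower()], pattern
    )
    return token, variant, re.compile(expanded, re.IGNORECASE)


EXPRESSIONS = [
    parse_expression(line)
    for line in (
        "GPLGen:00:1:^([^,;]+) is <licensed> under the terms of the GPL,? <version>$:2",
        "GPLGen:01:1:^([^,;]+) is <LICENSED> under the GPL, <VERSION>$:",
    )
]


def to_sentence_token(sentence):
    """Return (sentence-token, variant id) for the first matching expression."""
    for token, variant, regex in EXPRESSIONS:
        if regex.match(sentence.strip()):
            return token, variant
    return None
```

Under these assumptions, `to_sentence_token("foo.c is released under the GPL, version 3")` yields the `GPLGen` sentence-token via the second expression, while a sentence that matches no expression yields `None`.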
Each license corresponds to a sequence of one or more sentence-tokens (which we call a license rule) plus a set of non-critical sentence-tokens (which we call its associate sentence-tokens). Most by-inclusion licenses require matching two or more consecutive sentence-tokens. For example, the BSD2 license rule is the sequence of 5 sentence-tokens <BSDpre, BSDcondSource, BSDcondBinary, BSDasIs, BSDWarr>.
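Matching a license rule against the sentence-tokens of a file can be sketched as a search for a consecutive run of tokens. The Python example below is a minimal illustration under assumptions: only the BSD2 rule is taken from the text above, the names `LICENSE_RULES` and `identify_licenses` are hypothetical, and associate sentence-tokens are not handled.

```python
# BSD2 rule as described above; the dictionary layout is an assumption
# and does not reflect the actual format of rules.dict.
LICENSE_RULES = {
    "BSD2": ["BSDpre", "BSDcondSource", "BSDcondBinary", "BSDasIs", "BSDWarr"],
}


def identify_licenses(tokens, rules=LICENSE_RULES):
    """Return names of licenses whose rule occurs as a consecutive run in tokens."""
    found = []
    for name, rule in rules.items():
        n = len(rule)
        # Slide a window of the rule's length over the token sequence.
        if any(tokens[i:i + n] == rule for i in range(len(tokens) - n + 1)):
            found.append(name)
    return found
```

For instance, a file whose sentences translate to the five BSD2 sentence-tokens in order would be reported as BSD2, while a file missing one of them, or containing them with another token in between, would not.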
License
Copyright (C) 2009-2014 likooh.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Annex
licensetoken.dict: the 417 sentence-token expressions.
rules.dict: the license rules.