On Stackoverflow.com a programmer asked for a full-text-search code completion engine that can match proposals which only partially overlap with the prefix entered by the user.
For example, the user may enter 'cod' and the engine should offer 'hashCode' as one of the matching proposals.
The request is described in more detail here: http://code-recommenders.blogspot.com/2011/05/subword-matching-completion-engine-for.html and http://www.eclipse.org/forums/index.php/t/209269/
I spent a few minutes implementing a naive completion engine in Eclipse that uses plain regular expressions to find all matching proposals. This works quite well; however, it does not rank the proposals in a meaningful way. For instance, we may want proposals that share a common prefix or some subwords with the input to be ranked higher than those that merely match the regex.
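To make the regex approach concrete, here is a minimal sketch of the kind of filter I mean. It is not the actual Eclipse code: the class and method names are made up for illustration, and the assumption is that the input 'cod' is expanded to the case-insensitive pattern `.*c.*o.*d.*`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Eclipse or any library.
public class SubwordMatcher {

    // Expands "cod" to .*c.*o.*d.* with every character quoted,
    // so that regex metacharacters in the input are taken literally.
    static Pattern toPattern(String token) {
        StringBuilder sb = new StringBuilder(".*");
        for (char c : token.toCharArray()) {
            sb.append(Pattern.quote(String.valueOf(c))).append(".*");
        }
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Keeps the candidates that match the pattern, in their original order.
    static List<String> filter(String token, List<String> candidates) {
        Pattern pattern = toPattern(token);
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (pattern.matcher(candidate).matches()) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

With the candidates 'hashCode', 'toString', 'encode' and 'clone', the input 'cod' keeps 'hashCode' and 'encode'. The result order is simply the input order, which is exactly the missing ranking described above.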
While doing a little research on this topic I came across your project and found the hint in the code to ask here in the forum for guidance on which metric might fit my scenario best. Do you have any ideas which metrics might be most promising as the matching and ranking algorithm inside a code completion engine?
I would start with Jaro or JaroWinkler as a first guess (JaroWinkler favours a common prefix).
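As a rough illustration of that suggestion, here is a self-contained Java sketch that ranks candidates by a hand-written Jaro-Winkler similarity. It does not use this project's API; the class and method names are hypothetical, and the prefix scaling factor of 0.1 and the prefix cap of 4 characters are the commonly used Winkler defaults, taken here as assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical ranking helper; a sketch, not a library class.
public class JaroWinklerRanker {

    // Jaro similarity in [0, 1].
    static double jaro(String s, String t) {
        if (s.isEmpty() && t.isEmpty()) return 1.0;
        if (s.isEmpty() || t.isEmpty()) return 0.0;
        // Characters only count as matching within this distance.
        int window = Math.max(0, Math.max(s.length(), t.length()) / 2 - 1);
        boolean[] sMatched = new boolean[s.length()];
        boolean[] tMatched = new boolean[t.length()];
        int matches = 0;
        for (int i = 0; i < s.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(t.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!tMatched[j] && s.charAt(i) == t.charAt(j)) {
                    sMatched[i] = true;
                    tMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order.
        int k = 0;
        int outOfOrder = 0;
        for (int i = 0; i < s.length(); i++) {
            if (!sMatched[i]) continue;
            while (!tMatched[k]) k++;
            if (s.charAt(i) != t.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / s.length() + m / t.length() + (m - transpositions) / m) / 3.0;
    }

    // Winkler adjustment: boosts pairs that share a prefix of up to 4 characters.
    static double jaroWinkler(String s, String t) {
        double j = jaro(s, t);
        int max = Math.min(4, Math.min(s.length(), t.length()));
        int prefix = 0;
        while (prefix < max && s.charAt(prefix) == t.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Sorts candidates by descending similarity to the token, ignoring case.
    static List<String> rank(String token, List<String> candidates) {
        String query = token.toLowerCase();
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(
                (String c) -> -jaroWinkler(query, c.toLowerCase())));
        return sorted;
    }
}
```

For the input 'cod', this ranks 'code' ahead of 'encode' because of the shared prefix. One caveat with this sketch: 'hashCode' scores 0 for 'cod', since none of the three characters falls inside Jaro's match window at that offset. So the metric is probably better used to order proposals that a subword filter has already accepted than as the matcher itself.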
I'll give it a try. Thanks.