[plsql] OOM/Endless loop while parsing (PL)SQL
When parsing PL/SQL, some constructs cause an infinite loop in the PL/SQL parser. The loop keeps creating Token instances until the JVM runs out of memory (OOM); how long that takes depends on the available heap.
I have the fix for this; I will add a pull request for it.
A second, related issue concerns the language filter that is used to select which files in a directory should be scanned. This filter is constructed from the selected rules. However, it is not applied to files that are named explicitly on the command line.
See https://github.com/pmd/pmd/pull/119
Regarding the second issue: Are you talking about CPD or PMD? At least for CPD, this is correct behavior, as it is assumed that you know what you are doing if you explicitly name the files on the command line :) See also https://github.com/adangel/pmd/pull/30 . For PMD, I would expect the same behavior...
An example file that causes the loop is attached. It is trivial:
=== start
create or replace force view oxa.o_xa_function_role_types as
select "CFT_ID","CFR_ID","CFT_NAME","TCN","LOG_MODULE","LOG_USER","LOG_DATE","LOG_TIME" from crm_function_role_types
/
=== end
If you change the following in PLSQLParser:
public static void main(String[] args)
throws Exception {
this will exhibit the issue.
For the second issue:
CPD for me had so many problems that I did a very big rewrite for it. Core problems were:
. bad performance; it took way too long to scan all files in our code
. incorrect/incomplete character encoding handling (we have multiple modules having different encoding)
. inconsistent report of duplications: changing the order of input files provides wildly different results(!) even though files have not changed at all!
. duplicate reporting of overlapping regions
Since the rewrite is quite big and was done quite a while ago, I am not sure whether you would want it as a change.
So, the issue is with PMD. And I think it is easy to prove that the behavior is wrong. If I start PMD with a rule set that does not contain any PLSQL rules, then parsing of these files is useless because the only result they can possibly give is parsing errors - no rule violations. So even if I do pass them on the command line - the result is just a longer run time (and a loop if you're unlucky ;)
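The filtering argued for above can be sketched roughly as follows. This is only an illustration in plain Java (9 or later), not PMD's actual code: the class name, the extension table, and the file names are all hypothetical. The idea is simply that a file is worth parsing only if at least one selected rule targets its language, regardless of whether the file came from a directory scan or was named explicitly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class LanguageFilterSketch {

    // Hypothetical extension-to-language table; NOT PMD's real language registry.
    static final Map<String, String> LANGUAGE_BY_EXTENSION = Map.of(
            "java", "java",
            "sql", "plsql",
            "pks", "plsql",
            "pkb", "plsql");

    /** Returns the language for a file name, or null if the extension is unknown. */
    static String languageOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return LANGUAGE_BY_EXTENSION.get(extension);
    }

    /** Keeps only the files whose language is targeted by at least one selected rule. */
    static List<String> filter(List<String> files, Set<String> ruleLanguages) {
        List<String> kept = new ArrayList<>();
        for (String file : files) {
            String language = languageOf(file);
            if (language != null && ruleLanguages.contains(language)) {
                kept.add(file);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumption: the selected rule set contains Java rules only.
        Set<String> ruleLanguages = Set.of("java");
        // Files named explicitly on the command line (hypothetical names).
        List<String> explicitFiles = List.of("Foo.java", "o_xa_function_role_types.sql");
        // The .sql file is dropped because no PL/SQL rule could ever report on it.
        System.out.println(filter(explicitFiles, ruleLanguages));
    }
}
```

With a Java-only rule set, the PL/SQL file is skipped before it ever reaches the parser, so it can neither add run time nor trigger the loop.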
Hope I explained it well. By itself this issue is not a real problem; it's just inconsistent in my opinion.
Last edit: Frits Jalvingh 2016-10-19
Hi Frits, thanks for the sample code and the explanation. I created [#1536] for your second issue. I'm not sure yet what the correct, or more consistent, behavior would be. How did you execute PMD? I guess through the command line (as you mentioned -d ...). How did you create the command line? Via a shell script or something similar?
About your CPD improvements: yes, I would be very interested in your changes! Maybe we could backport those into the main CPD code, so that everybody can benefit :) Do you have the fork available somewhere online?
Regards,
Andreas
Related Issues: #1536
This bug will be fixed with PMD 5.3.8, 5.4.3, 5.5.2 and later.
Commit: https://github.com/pmd/pmd/commit/4310b3634474c267bc8842461d226ecc223b7546 (cherry-picked from https://github.com/pmd/pmd/commit/42d5a402c704f8893e04fb76df05fd3219cd2fbe)