Using EPIC on file that includes Archive::Extract throws erroneous errors

Help
Scott
2016-03-31
2016-04-01
  • Scott

    Scott - 2016-03-31

    I wanted to see if anyone has had issues with this. Archive::Extract has a bug, but the code is valid. I have tried this with the latest release of Eclipse and the latest release of EPIC on a fresh install and still see the errors. I get erroneous syntax errors and unclosed parenthesis errors.

    Following is an e-mail I also sent to the Archive::Extract developer that has example code which reproduces the problem.

    Hello,

    I've been enjoying using Archive::Extract, but ran into a problem when using Eclipse/EPIC on Ubuntu Linux 14.04 LTS.

    The problem I see is that when I include Archive::Extract in most of my programs, my programs start showing spurious errors.

    If I comment out "use Archive::Extract" I no longer get these errors.

    The problem seems to be on line 146 of Extract.pm. If I comment it out, I no longer have any issues.

    The line is as follows:

    ($PROGRAMS->{$pgm}) = grep { scalar run(command=> "$_ $opt -1 ") } can_run($pgm);

    In looking at the IPC::Cmd man page it shows an example like:

    my $cmd = [$full_path, '-b', 'theregister.co.uk'];

    if( scalar run( command => $cmd,
    verbose => 0,
    buffer => \$buffer,
    timeout => 20 )
    ) {
    print "fetched webpage successfully: $buffer\n";
    }

    Notice that the $cmd assignment uses $full_path.

    In the loop in Extract.pm, only "unzip" is being passed, not the full path.

    Offhand, I would say there is a bug in the EPIC parser. But I believe there is also at least one bug in the Archive::Extract code.

    1. IPC::Cmd run needs a full path, and a full path is not being passed.

    2. From the IPC::Cmd man page.

    "run will return a simple true or false when called in scalar context."

    In the call above you seem to be grepping for a true/false value over the result of the can_run statement, which returns the full path. So it appears that the above line is not doing what you intend. I'm thinking that it should be something more like the line for unzip on FreeBSD, something like:

    if ( $pgm eq 'unzip' and ON_FREEBSD and my $unzip = can_run('info-unzip') ) {
    $PROGRAMS->{$pgm} = $unzip;
    next CMD;
    }

    If you want to dig a bit deeper, here is a copy of the program I have that includes Archive::Extract and throws errors in Eclipse/EPIC.

    I tried this with the latest, freshly downloaded releases of both and it still reproduces.

    Regards,
    Scott
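A side note on point 2 for readers following along: in Perl, the block passed to grep is only a test, and grep returns the matching list elements themselves rather than the block's true/false value. The sketch below illustrates that with hypothetical stand-ins (fake_can_run and fake_run are made up for this example and only mimic the return shapes described in the e-mail; they are not the real IPC::Cmd functions).

```perl
use strict;
use warnings;

# Hypothetical stand-ins for IPC::Cmd's can_run() and run(). They only mimic
# the return shapes discussed above: a full path, and a true/false value.
sub fake_can_run { my ($pgm)  = @_; return "/usr/bin/$pgm" }
sub fake_run     { my (%args) = @_; return $args{command} =~ m{^/usr/bin/unzip\b} ? 1 : 0 }

my %programs;
for my $pgm (qw(unzip gzip)) {
    # grep evaluates the block as a test for each element and returns the
    # elements that pass (here: the path), not the block's boolean result.
    ($programs{$pgm}) = grep { scalar fake_run(command => "$_ -1") } fake_can_run($pgm);
}

printf "%s => %s\n", $_, $programs{$_} // 'undef' for sort keys %programs;
```

Under these assumptions the hash ends up holding the path for the program whose test run succeeded and undef for the other, which suggests the original line stores the can_run path whenever the trial run returns true.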

    EDIT: Removed big gnarly code example, replaced with smaller one below

     

    Last edit: Scott 2016-04-01
  • Jan Ploski

    Jan Ploski - 2016-03-31

    FWIW, I tried opening your included script in EPIC 0.7.0 on Debian oldstable (perl 5.14.2, Archive::Extract 0.48), and it does not show any syntax errors except for complaining about bareword HANDLE used with strict subs (ok, also missing modules String::ShellQuote, File::BOM, but these do not seem relevant). (BTW, maybe you can minimize your test code to demonstrate the problem.)

     
  • Scott

    Scott - 2016-04-01

    Firstly, thanks for the quick response. I've been loving EPIC for quite some time, it's an awesome piece of software.

    I'm on Perl 5.18.2 and Archive::Extract 0.76

    I chopped out everything but the one big function, convert_data_to_tsv. The problem seems to be some interaction between the code in that function (or the size of that function) and this line of code in Archive::Extract:

     ($PROGRAMS->{$pgm}) = grep { scalar run(command=> [ $_, $opt, '-1' ]) } can_run($pgm);
    

    I started removing code from that function, but once I get that function down to a certain size, the error seems to go away.

    Also running perl -c on any of the files returns no errors. Is there something from the command line I can run to emulate what EPIC is doing inside of Eclipse?

    If I remove any of the elsif blocks in the function, the problem seems to disappear. As I started removing different lines the problem would also go away, but the lines that made a difference had nothing consistent in common. When I added a simple line back, something like:

    my $foo = "bar";
    

    the errors would come back.

    It seems that there might be something memory/buffer related going on, as the errors do not seem to depend on removing any one line of code or any specific group of lines.

    Getting that large function down to a better size seems to fix the issue. BTW, I didn't write that big monolithic function :(

    If I open Archive::Extract by itself, it does not show any errors.

    The errors I get when I include Archive::Extract are:

    Description Resource Path Location Type
    Global symbol "$firstcol" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 12 Perl Problem
    Global symbol "$row" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 12 Perl Problem
    Global symbol "$row" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 11 Perl Problem
    syntax error Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 11 Perl Problem
    Global symbol "$out" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 14 Perl Problem
    Global symbol "$tsvout" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 14 Perl Problem
    syntax error Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 13 Perl Problem
    Unmatched right curly bracket Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 13 Perl Problem
    Unmatched right curly bracket Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 16 Perl Problem
    Global symbol "$retObj" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 15 Perl Problem
    Global symbol "$row" requires explicit package name Convert2TSV.pm /appserver/pub2mobile/site_perl/Tropare line 14 Perl Problem

    Here's a copy that's pared down a bit that still reproduces the problem.

    Something else that's interesting: I did a "Source -> Format" on the code that was throwing the error and the error went away, so in order to reproduce this, the following code may need to be copied and pasted verbatim.

    package Tropare::Convert2TSV;
    
    use strict;
    use warnings;
    use JSON qw(encode_json decode_json);
    use Exporter;
    use File::Basename qw(fileparse);
    use String::ShellQuote qw( shell_quote );
    use Archive::Extract;
    use Text::ParseWords;
    use File::BOM qw( :all );
    use Text::CSV_XS;
    use Spreadsheet::ParseExcel;
    use Spreadsheet::XLSX;
    use Data::Dumper;
    use vars qw($VERSION @ISA @EXPORT_OK %EXPORT_TAGS);
    
    our $VERSION    = 1.00;
    @ISA        = qw(Exporter);
    # @EXPORT       = (SlapConf);
    @EXPORT_OK  = qw();
    %EXPORT_TAGS    = ( DEFAULT => [qw()]);
    
    sub convert_data_to_tsv {
        my $fileInfo = shift;
        my $outFile = shift;
            my $spill = '';
        my $retObj = {status=>'failure', rows=>0, args=>{fileInfo=>$fileInfo,outFile=>$outFile}};
    
            my %textformats = map { $_ => 1 } qw(txt tsv csv psv);
            if ($textformats{$fileInfo->{type}}) {
    
                    if (!open(HANDLE, '<', $fileInfo->{path})){
                            die "can't open $fileInfo->{path}, $!";
                    } else {
                            ($fileInfo->{encoding},$spill) = defuse(HANDLE);
                            if (!$fileInfo->{encoding}) {
    
                                    my $file = shell_quote($fileInfo->{path});
                                    if (!system("/usr/bin/iconv -f utf-8 -t utf-8 $file >/dev/null")) {
                                            $fileInfo->{encoding} = 'UTF-8';
                                    } else {
                                            $fileInfo->{encoding} = 'windows-1252';
                                            binmode(HANDLE, "encoding($fileInfo->{encoding})");
                                    }
                            }
    
                            my $firstLine = $spill.<HANDLE>;
                            if ($fileInfo->{encoding} eq 'windows-1252') {
    
                            }
                            close HANDLE;
                            if ($fileInfo->{type} eq 'txt') {
                                    $fileInfo->{comma_cols} = scalar parse_line(',',0,$firstLine);
                                    $fileInfo->{tab_cols} = scalar parse_line("\t",0,$firstLine);
                                    $fileInfo->{pipe_cols} = scalar parse_line('\|',0,$firstLine);
                                    if (($fileInfo->{comma_cols} >= $fileInfo->{tab_cols}) && ($fileInfo->{comma_cols} >= $fileInfo->{pipe_cols})) {
                                            $fileInfo->{type} = 'csv';
                                            $fileInfo->{sep_char} = ',';
                                    } elsif ($fileInfo->{tab_cols} >= $fileInfo->{pipe_cols}) {
                                            $fileInfo->{type} = 'tsv';
                                            $fileInfo->{sep_char} = "\t";
                                    } else {
                                            $fileInfo->{type} = 'psv';
                                            $fileInfo->{sep_char} = "|";
                                    }
                            }
                    }
    }
    
        my $tsvout = Text::CSV_XS->new ({ binary => 1, sep_char => "\t", eol => "\n", quote_char=>undef, escape_char=>undef});
        my $out;
        my $infile;
        if (defined $outFile && length($outFile) && ($outFile ne '-')) {
            open $out, '>', $outFile or die "can't open $outFile for writing: $!";
        } else {
            open $out, '>&', STDOUT or die "can't open SDTOUT for writing: $!";
        }
            binmode ($out,"encoding(UTF-8)"); 
        if ($fileInfo->{type} eq 'csv') {
            my $csv = Text::CSV_XS->new ({ binary => 1 });
            if ($fileInfo->{encoding} eq 'windows-1252') {
                open($infile, "<", $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
                binmode($infile, "encoding($fileInfo->{encoding})");
            } else {
                open($infile, '<:via(File::BOM)', $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
            }
            while (my $row = $csv->getline ($infile)) {
                if ($retObj->{rows} == 0) {
    # assume the first one is the header and strip any unacceptable characters
                    for (@$row) {
                        $_//='';
                        s/\\/\//g;
                        s/\t/ /g;
                        s/\n/ /g;
                        s/\r/ /g;
                    }
                }
                for (@$row) {
                    s/\\/\\\\/g;
                    s/\t/\\t/g;
                    s/\n/\\n/g;
                    s/\r/\\r/g;
                }
                if (exists $fileInfo->{makelist} && $fileInfo->{makelist}){
    # save first col and remove all the others
                    my $firstcol = shift @{$row};
                    undef @{$row};
                    push @{$row},$firstcol;
                }
                    $tsvout->print ($out , $row);
                $retObj->{rows}++;
            }
            close $infile;
            $retObj->{status}='success';
        } elsif ($fileInfo->{type} eq 'tsv') {
            my $tsv = Text::CSV_XS->new ({ binary => 1, sep_char => "\t", eol => "\n", quote_char=>undef, escape_char => undef });
            if ($fileInfo->{encoding} eq 'windows-1252') {
                open($infile, "<", $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
                binmode($infile, "encoding($fileInfo->{encoding})");
            } else {
                open($infile, '<:via(File::BOM)', $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
            }
            while (my $row = $tsv->getline ($infile)) {
                if ($retObj->{rows} == 0) {
    # assume the first one is the header and strip any unacceptable characters
                    for (@$row) {
                        $_//='';
                        s/\\/\//g;
                        s/\t/ /g;
                        s/\n/ /g;
                        s/\r/ /g;
                    }
                }
                for (@$row) {
                    s/\\/\\\\/g;
                    s/\t/\\t/g;
                    s/\n/\\n/g;
                    s/\r/\\r/g;
                }
                if (exists $fileInfo->{makelist} && $fileInfo->{makelist}){
    # save first col and remove all the others
                    my $firstcol = shift @{$row};
                    undef @{$row};
                    push @{$row},$firstcol;
                }
                $tsvout->print ($out , $row);
                $retObj->{rows}++;
            }
            close $infile;
            $retObj->{status}='success';
        } elsif ($fileInfo->{type} eq 'psv') {
            my $tsv = Text::CSV_XS->new ({ binary => 1, sep_char => '|', eol => "\n", quote_char=>undef, escape_char => undef });
            if ($fileInfo->{encoding} eq 'windows-1252') {
                open($infile, "<", $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
                binmode($infile, "encoding($fileInfo->{encoding})");
            } else {
                open($infile, '<:via(File::BOM)', $fileInfo->{path}) or die "can't open $fileInfo->{path}, $!";
            }
            while (my $row = $tsv->getline ($infile)) {
                if ($retObj->{rows} == 0) {
    # assume the first one is the header and strip any unacceptable characters
                    for (@$row) {
                        $_//='';
                        s/\\/\//g;
                        s/\t/ /g;
                        s/\n/ /g;
                        s/\r/ /g;
                    }
                }
                for (@$row) {
                    s/\\/\\\\/g;
                    s/\t/\\t/g;
                    s/\n/\\n/g;
                    s/\r/\\r/g;
                }
                if (exists $fileInfo->{makelist} && $fileInfo->{makelist}){
    # save first col and remove all the others
                    my $firstcol = shift @{$row};
                    undef @{$row};
                    push @{$row},$firstcol;
                }
                $tsvout->print ($out , $row);
                $retObj->{rows}++;
            }
            close $infile;
            $retObj->{status}='success';
        } elsif ($fileInfo->{type} eq 'xls') {
            my $sheetData = ParseOldExcel({file=>$fileInfo->{path}});
    #print Dumper($sheetData) if -t;
            if (scalar @{$sheetData->{sheets}[0]{rows}}) {
            # if there are any rows then assume the first one is the header and strip any unacceptable characters
                for (@{@{$sheetData->{sheets}[0]{rows}}[0]}) {
                    $_//='';
                    s/\\/\//g;
                    s/\t/ /g;
                    s/\n/ /g;
                    s/\r/ /g;
                }
            }
            foreach my $row (@{$sheetData->{sheets}[0]{rows}}) {
                for (@$row) {
                    $_//='';
                    s/\\/\\\\/g;
                    s/\t/\\t/g;
                    s/\n/\\n/g;
                    s/\r/\\r/g;
                }
                if (exists $fileInfo->{makelist} && $fileInfo->{makelist}){
    # save first col and remove all the others
                    my $firstcol = shift @{$row};
                    undef @{$row};
                    push @{$row},$firstcol;
                }
                $tsvout->print ($out , $row);
                $retObj->{rows}++;
            }
            $retObj->{status}='success';
            $retObj->{extra_name}=$sheetData->{sheets}[0]{name};
        } elsif ($fileInfo->{type} eq 'xlsx') {
            my $sheetData = ParseNewExcel({file=>$fileInfo->{path}});
    #print Dumper($sheetData) if -t;
            if (scalar @{$sheetData->{sheets}[0]{rows}}) {
            # if there are any rows then assume the first one is the header and strip any unacceptable characters
                for (@{@{$sheetData->{sheets}[0]{rows}}[0]}) {
                    $_//='';
                    s/\\/\//g;
                    s/\t/ /g;
                    s/\n/ /g;
                    s/\r/ /g;
                }
            }
            foreach my $row (@{$sheetData->{sheets}[0]{rows}}) {
                for (@$row) {
                    $_//='';
                    s/\\/\\\\/g;
                    s/\t/\\t/g;
                    s/\n/\\n/g;
                    s/\r/\\r/g;
                }
                if (exists $fileInfo->{makelist} && $fileInfo->{makelist}){
    # save first col and remove all the others
                    my $firstcol = shift @{$row};
                    undef @{$row};
                    push @{$row},$firstcol;
                }
                $tsvout->print ($out , $row);
                $retObj->{rows}++;
            }
            $retObj->{status}='success';
            $retObj->{extra_name}=$sheetData->{sheets}[0]{name};
        } 
        close $out;
        return $retObj;
    }
    
    1;
    
     
  • Scott

    Scott - 2016-04-01

    As an added note, I did the "Source->Format" on the original file, without things chopped out and it still throws the errors. The errors seem to change/move around based on how much I chop out.

     
  • Scott

    Scott - 2016-04-01

    I forgot to mention that I am also on EPIC 7.0.

    Here is the attachment with the unmodified, pared down source that still reproduces the errors. With this file I am getting the following:

    Description Resource Path Location Type
    Unmatched right curly bracket Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 10 Perl Problem
    syntax error Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 16 Perl Problem
    syntax error Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 10 Perl Problem
    Global symbol "$fileInfo" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 10 Perl Problem
    Global symbol "$out" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 17 Perl Problem
    Global symbol "$tsvout" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 17 Perl Problem
    Global symbol "$retObj" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 18 Perl Problem
    Unmatched right curly bracket Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 19 Perl Problem
    syntax error Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 19 Perl Problem
    Global symbol "$retObj" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 20 Perl Problem
    Global symbol "$retObj" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 21 Perl Problem
    Global symbol "$sheetData" requires explicit package name Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 21 Perl Problem
    Unmatched right curly bracket Convert2TSV2.pm /appserver/pub2mobile/site_perl/Tropare line 22 Perl Problem

     
  • Jan Ploski

    Jan Ploski - 2016-04-01

    Your source code doesn't matter (much), you can comment out pretty much everything and still get those syntax errors.

    To reproduce them from shell (without EPIC's involvement), you can use the following invocation "cat yourcode.pl | perl -c" instead of just "perl -c yourcode.pl". This simulates how EPIC passes the source code to the compiler.
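For anyone who wants to script that check, here is a minimal sketch in Perl that feeds source text to "perl -c" on standard input and collects the compiler's messages. The helper name check_via_stdin is made up for this example, and this only approximates the shell pipeline above; it is not a claim about EPIC's actual implementation.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);

# Hypothetical helper: compile-check Perl source by piping it to "perl -c -"
# (program read from standard input) and return everything perl reports.
sub check_via_stdin {
    my ($source) = @_;

    # The child may exit before reading all input; avoid dying on SIGPIPE.
    local $SIG{PIPE} = 'IGNORE';

    # Passing undef as the third handle merges the child's STDERR into STDOUT.
    my $pid = open3(my $in, my $out, undef, $^X, '-c', '-');

    print {$in} $source;
    close $in;

    my $messages = do { local $/; <$out> };
    waitpid $pid, 0;

    return $messages // '';
}

my $report = check_via_stdin(qq{use strict;\nmy \$x = 1;\n});
print $report;
```

To compare against the plain "perl -c yourcode.pl" case, read the file's contents into a string and pass that string to the helper; any difference in the reported messages points at something touching STDIN during compilation.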

    The reason for those errors seems to be that IPC::Cmd does funny things to STDIN (in the dup_fds and reopen_fds functions - if you comment them out, the syntax errors disappear).

    Anyway, it's not a good practice to run external programs as a side effect of mere compilation, so that part might need to be fixed by Archive::Extract. BTW, in my environment full path of the program (/usr/bin/unzip) is passed to the "run" function, unlike what you mentioned earlier.
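One way a module could avoid doing such work at compile time is to defer the lookup until the program is first needed. The following is only a sketch of that idea: find_program and %program_cache are illustrative names, not part of Archive::Extract or IPC::Cmd, and the sketch only locates an executable on PATH rather than test-running it.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical cache; nothing is probed while this file is being compiled.
my %program_cache;

# Look up an executable on PATH the first time it is requested and remember
# the answer (including "not found", stored as undef).
sub find_program {
    my ($name) = @_;
    return $program_cache{$name} if exists $program_cache{$name};

    my ($found) = grep { -f $_ && -x _ }
                  map  { File::Spec->catfile($_, $name) }
                  File::Spec->path;

    return $program_cache{$name} = $found;
}

# The search happens here, at run time, not as a side effect of "use".
my $unzip = find_program('unzip');
print defined $unzip ? "unzip found at $unzip\n" : "unzip not found on PATH\n";
```

With this arrangement, "perl -c" (and therefore a syntax checker that pipes source to the compiler) never triggers the lookup, because no top-level code runs an external command or touches STDIN during compilation.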

     
  • Scott

    Scott - 2016-04-01

    That explains it. I'll follow up with the Archive::Extract folks, as I agree that running anything as a side effect of compilation is poor practice at best and a security issue at worst. Once again, EPIC is, well... EPIC. Thanks for your help in solving this.

     
