Hi!
I'm French, so please excuse my poor English! ;)
I would like to delete a semicolon at a specific position in a huge CSV file (180,000 lines).
How can I do this in Notepad++ (Excel 2003 of course can't open my file, which has more than 65,000 lines)?
I think I could record a macro that uses a regular expression (regexp) to find the semicolon at the 71st position of each line and delete it. Then I would run this macro in a loop.
But ... I don't know how to write this regexp!
Could you help me please?
Thanks.
Gôm
I made a mistake: this semicolon is not "at the 71st position". I must delete the 71st semicolon, which could be at the 71st position, the 90th, the 166th, etc.
You can't do this in Notepad++ (its regex engine sucks), but here is the regex to do it if you have access to a better tool that uses Perl regular expressions:
s/^((?:[^;]*;){70}[^;]*);/\1/
# Example -- this removes the 8th ';'
perl -e "my $t=q{;d;fd;d;a;sd;df;fd;a;as;sd;f;d;aadsf;;fea;fs;;;;aef;se;}; $t =~ s/^((?:[^;]*;){7}[^;]*);/\1/; print $t"
;d;fd;d;a;sd;df;fd;a;as;sd;f;d;aadsf;;fea;fs;;;;aef;se;
;d;fd;d;a;sd;df;fda;as;sd;f;d;aadsf;;fea;fs;;;;aef;se;
Forgot to mention... this command will do what you want (it uses {7} as in the example above; change it to {70} to remove the 71st semicolon):
perl -pe "s/^((?:[^;]*;){7}[^;]*);/\1/;" {your filename goes here}
You can get Perl for Windows here (http://www.activestate.com/activeperl/)
I'd have suggested sed or awk for the job, but the idea is the same: a scripting language will do it.
However, using the NppExec plugin, you can launch the script from Notepad++, and even edit it at the same time with Notepad++ too. I haven't played with this much, but it could be a powerful combo.
CChris
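To make the sed/awk suggestion concrete, here is a rough sketch of what the two could look like. It assumes a POSIX shell with sed and awk available (on Windows that would mean something like Cygwin, which is an assumption, not something mentioned in the thread). The sample line is the one from the Perl example, and the variable names are invented for the illustration.

```shell
line=';d;fd;d;a;sd;df;fd;a;as;sd;f;d;aadsf;;fea;fs;;;;aef;se;'

# sed: a number after s/// makes it act only on that occurrence in each line.
# This deletes the 8th semicolon; write 71 instead of 8 for the 71st.
sed_result=$(printf '%s\n' "$line" | sed 's/;//8')

# awk: split each line on ';' and glue the fields back together,
# leaving out the separator that sits in front of field n+1.
awk_result=$(printf '%s\n' "$line" | awk -v n=8 -F';' '{
    out = $1
    for (i = 2; i <= NF; i++)
        out = out (i == n + 1 ? "" : ";") $i
    print out
}')

printf '%s\n' "$sed_result" "$awk_result"
```

Both leave lines with fewer than n semicolons unchanged.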
@lespea: Thank you very much!!!
I tried: perl -pe "s/^((?:[^;]*;){7}[^;]*);/\1/;" {your filename goes here}, but it doesn't work.
I think the right syntax is: perl -pe "s/^((?:[^;]*;){7}[^;]*);/\1/; {your filename goes here}" ... but I'm trying it and getting no result for the moment!?
Note: I installed Perl and executed your script like this:
C:\Perl\bin>perl -pe "s/^((?:[^;]*;){7}[^;]*);/\1/; {my_file.csv}"
"my_file.csv" is of course in "C:\Perl\bin" ... at the beginning I thought the problem was the path "C:\Temp\my_file.csv", so I put the file directly in the same directory as "perl.exe".
@cchris: Thanks too! ;)
My mistake ... the correct syntax is:
perl -pe "s/^((?:[^;]*;){7}[^;]*);/\1/;" my_file.csv
I'm ashamed!
Another problem ... I see the correct result in DOS, but my original file is never updated!?
Where did I go wrong this time? :(
Just stick
> newfile.txt
on the end of your command, and it will dump the output to newfile.txt.
C:\>perl -pe "s/^((?:[^;]*;){70}[^;]*);/\1/;" C:\Temp\my_file.csv > C:\Temp\my_file_new.csv
Hurraaaaaaaaaah!
Thank you everybody: lespea, cchris and davegb3!
Long live Notepad++ and Perl! ;)