From: Soren A <sor...@fa...> - 2002-12-19 13:57:39
Hello,

Manu and others here like shell cleverness, and I am very into writing bash scripts too. I came up with a new one yesterday that I like quite a lot.

IMHO the *nix program `basename' leaves a lot to be desired. It is of very limited utility because it has never been taught to think about globbing of file extensions. Thus, in order to have `basename' return a root filename stripped of an extension, the programmer has to know beforehand precisely what the extension is, and cannot instead pass a class (a list -- not "class" in the OO sense) of potential extensions for `basename' to look for.

    $ basename /usr/local/lib/librle.a        # gives "librle.a"
    $ basename /usr/local/lib/librle.a '.a'   # gives "librle"

In a project I was working on I had a shell script ("exobuild.sh", a piece of "ExoBuild", my Perl extension for building modules outside of the directory where the module source is located) in which I needed something different. I figured out how to do it and then generalized it a little into a shell script that I named `ebasename'. As `egrep' is to plain `grep', so `ebasename' is to `basename': the same thing, but enhanced.

    $ ebasename /usr/local/lib/librle.a '.a' '.lib' '.dll.a'   # gives "librle"

I think the reader can readily grasp the interface from the above example. The first argument is always the filename spec to be processed, and any subsequent args are elements of a list of possible extensions. At present the perhaps obvious feature of allowing one or more of the elements to be a true regex or a glob-type pattern is *not* implemented, however (and I am about to discuss the implementation).

`ebasename' is implemented using bash and perl in combination. The initial argument parsing is done by the bash shell, and a bash array variable is formed from args [2..n]. Then a perl '-e' expression ("one-liner") is invoked, which calls upon perl to use the core module File::Basename for its fileparse() functionality.
This wonderful fileparse() subroutine takes as its parameters the name of a file (a filename spec) and a list of patterns (regexen) to be matched against the end of the first arg. But I call perl's `quotemeta()' function on the elements of the second parameter (the array) in order to prevent the spurious matches of the dot "." with any character that would otherwise result.

Here's the code:

---------------8<------------------------------------------------------
#! /bin/bash
# ^-- msys users need to edit that shebang line.
# (C)2002 Soren Andersen. LGPL license. See www.fsf.org.

# First arg is the filename spec; everything after it is a suffix.
filename=$1 ; shift

# Copy the remaining positional parameters into an array,
# using indirect expansion (${!i} is the value of parameter number $i).
declare -i i=$#
declare -a suffixen
while [[ $i -gt 0 ]] ; do
    suffixen[$i]="${!i}"
    ((i--))
done

# example suffixen: '.tar.gz' '.tar.bz2' '.zip' '.tgz' '.tbz'
# The filename is handed to perl as an argument rather than pasted into
# the perl source, so odd characters in it cannot break the one-liner.
/bin/perl -M'File::Basename' \
  -e 'my $fin = shift @ARGV;' \
  -e 'print q[],(fileparse($fin,map(quotemeta,@ARGV)))[0],qq(\n);' \
  -- "$filename" "${suffixen[@]}"
---------------8<------------------------------------------------------

There are several things going on here. One is that this is for bash 2.x (no problem, that's what we with MSYS have); also note that in bash, the funny-looking variable ${somefoo[@]} is a way of creating a list of every element of the array variable $somefoo (which we populated before invoking perl). Putting it inside double quotes keeps each element as a single word, even if it contains whitespace.

I can think of several things to ask for feedback on, but of greatest interest to me right now is to solicit info about making this a "C"-only program, for speed and portability. Not that invoking the perl interpreter is such a hang-up -- it isn't. It is barely noticeable even when you are looking for it. And I am developing on a Cyrix "ix486+" machine!

Anyway, the first thing that comes to mind for me is "pcre" -- the open-source package for Perl-compatible regular expressions. Has anyone written any projects that used pcre? Any advice on that, or other suggestions or opinions?

Best,
Soren A
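
For comparison on the speed question, the same interface can also be sketched in pure bash, with no perl at all. This is only an illustration under stated assumptions, not the script above: the function name `ebasename_sh' is made up for the example, it strips the directory part with ${var##*/} (so trailing slashes are not handled the way basename(1) handles them), and it stops at the first suffix that matches, which may not be how fileparse() treats a list of several patterns. The suffix is quoted inside the pattern, so it is matched literally; that is the job quotemeta does in the perl version.

```shell
#!/bin/bash
# Hypothetical pure-bash sketch of the ebasename interface.
# Usage: ebasename_sh FILE [SUFFIX...]
ebasename_sh() {
    local name=${1##*/}   # drop everything up to the last "/"
    shift
    local suffix
    for suffix in "$@"; do
        # Quoting "$suffix" makes the match literal: "." matches only a dot.
        # Like basename(1), never strip a suffix equal to the whole name.
        if [[ -n $suffix && $name == *"$suffix" && $name != "$suffix" ]]; then
            printf '%s\n' "${name%"$suffix"}"
            return 0
        fi
    done
    # No suffix matched: print the bare filename unchanged.
    printf '%s\n' "$name"
}

ebasename_sh /usr/local/lib/librle.a '.a' '.lib' '.dll.a'   # prints "librle"
```

Because the first match wins in this sketch, longer suffixes such as '.tar.gz' should be listed before shorter ones such as '.gz'.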