Project: secp
Outline: secp is a generic utility that downloads URLs to a sink directory.
The supported URL schemes are ftp, scp, http, https, gridftp, file and nfs.
Files or directories that are available in compressed form or packaged in an archive
are automatically decompressed/unpacked to the sink name.
The following archive formats are recognised:
.gz, .tgz, .zip
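As a rough illustration of recognising these formats by file suffix, here is a minimal bash sketch.
It is not taken from the secp source: the function name archive_kind and the labels it prints
are hypothetical, chosen only for this example.

```shell
#!/bin/bash
# Hypothetical helper (not part of secp): print the kind of unpacking
# that a file name's suffix implies, or "none" if it is not recognised.
archive_kind() {
    case "$1" in
        *.tgz) echo "tar+gzip" ;;   # gzip-compressed tar archive
        *.gz)  echo "gzip" ;;       # single gzip-compressed file
        *.zip) echo "zip" ;;        # zip archive
        *)     echo "none" ;;       # leave the file as it is
    esac
}
```

For example, archive_kind product.tgz prints tar+gzip, while archive_kind notes.txt prints none.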
By default, the software will follow:
* .uar text files containing a list of URLs (one per line) to download.
* HTML meta-refresh and href tags to interactively download files from the web.
* RDF online resources (<dclite4g:onlineResource> tag)
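The .uar format described above (one URL per line) can be read with a short bash loop.
The sketch below is an illustration only, not secp's implementation: the function name
list_uar_urls is hypothetical, and skipping blank lines and stripping trailing carriage
returns are assumptions about how such a file might reasonably be treated.

```shell
#!/bin/bash
# Hypothetical helper (not part of secp): print every URL listed in a
# .uar file, one per line.
# Assumptions: blank lines are ignored and a trailing CR is removed.
list_uar_urls() {
    local line
    # "|| [ -n "$line" ]" keeps a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%$'\r'}
        if [ -n "$line" ]; then
            printf '%s\n' "$line"
        fi
    done < "$1"
    return 0
}
```

Each printed URL could then be handed to a downloader in turn, for instance inside a
"list_uar_urls file.uar | while read -r url; do ...; done" loop.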
Dependencies: curl, scp, gawk, sed, bash, globus-url-copy [optional]
Change log:
version 1.0
* first release
version 1.1
* add the -f option to force a local copy of the file when the nfs driver is used
* add handling of the new gridftp server return message "No such file or directory"
version 1.2
* corrected mkdir with -p flag when creating output directory
version 1.3
* inverted the semantics of the -R flag: retry on timeout is now the default
version 2.0 2008-05-17
* added support for scp and for http, https and ftp via wget
* improved error handling in particular for nfs and gridftp drivers
* improved message logging to stderr
* use of logApp function to log messages if available, unless LOG_FUNCTION variable is defined
* add -q (quiet) option to suppress echo of local filenames to stdout
* add -O option to force overwriting of files or directories in the output directory (default is to not overwrite)
* add -w (work directory) option to specify working directory (defaulting to /tmp)
* add watchdog for secp hangs in particular for gridftp or wget transfers (-t and -R options)
* add unpacking support for .tar, .tgz, .tar.gz, .Z, tar.Z, .bz, .bz2, .tbz, .tar.bz, .tar.bz2, .zip
* add support for uar (url archive) type unpacking
* add -c -p and -b options
* add -z option
version 2.1 2008-07-16 by manu
* removed -fast option
version 2.2
* corrected mkdir with -p flag when creating the tmp input file directory
* added the gsiftp driver (same as gridftp)
* added the cache driver
version 2.3
* added https driver with curl (for gridsite support)
* removed http driver with GET (outdated)
* modified usage
* removed ams driver (outdated)
version 2.3.1
* fixed error parsing for https driver
version 2.4
* added automatic uncompression of .gz files. To disable it, use the -z option
* added s3 driver
* added -s (skip) option
version 3.0
* rewritten for performance (removed external log function support); more than 10 times faster
* removed dependency on bash_debug.sh watchdog and log function
* added long option support. Combining multiple options after a single - is no longer supported (e.g. -co must be written as -c -o)
* removed the -b, -Z and -w options (no longer used)
* -D option is deprecated. Debugging can now be performed using standard bash debugging tools (sh -x)
* timeout (-t option) is now expressed in seconds
* secp now uncompresses files even if .gz or .tgz is written in the URI. The new catalog
contains URIs with the .gz and .tgz suffixes, so this is needed for backward compatibility of the services.
This can be disabled with the new -U option. Moreover, for performance reasons, if the file does not exist
secp will try adding only the .gz extension, and not all the others.
* added the ability to follow RDF and HTML auto-refresh meta tags and HREF links, to support the new cache ws protocol
and direct download from the G-POD catalogue. This can be disabled using the new -H option.
* removed support for un-compression of tar.gz, .Z, tar.Z, .bz, .bz2, .tbz, .tar.bz, .tar.bz2 files (not used anymore)
* retries (-r) and timeouts (-t) are now handled by the drivers (for performance reasons)
* added -rt option to set the delay between retries
* removed wget dependency, using curl instead
version 3.0.1
* added -w option (set-up tmp directory base for drivers)
* added -co, -qo and -qco for backward compatibility.
version 3.0.2
* unzip support for multiple files in the zip
version 3.1
* merged with ciop-tool version 3.0.0
* added FILE driver support for copying directories (from ciop-tool, with -x switch to exclude files from the copy)
* added HDFS driver (from ciop-tool)
* added support for EO-SSO login follow-up and HTTP basic authentication
* added support for storing credentials in the user home
* added support for session cookies (for cURL driver)
* fixed https proxy authentication for SL6
version 3.1.1
* fixed minor bugs
* added -F for URL load from file list
version 3.1.2
* fixed unzip folder detection plus other minor fixes
License:
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.