
Developer's Guide

Salvatore Pinto

Developer manual

This manual is a short explanation of the server/client system from a developer's point of view.

Server/client communication protocol

The grid cache client/server protocol is HTTP over SSL.

X509 certificates are used for user authentication, using the SSL protocol implemented via the openssl routines. The openssl routines load a custom verify callback script in order to support X509 proxy certificates, as in the GSI.

HTTP requests are handled using the mongoose open source libraries, modified to allow access to the internal calls and to support a separate server certificate and certificate key in PEM format (as specified in the GSI).

The client can perform the following HTTP requests, which correspond to the following access types:

  • GET / : PING operation, checks that the server is up.
  • GET /REPOSITORY/FILE : request to download FILE from REPOSITORY. If the repository is local, or the repository is remote but the file is present in the cache, the file is returned to the user. Otherwise, a download script is started to download the file from the remote repository and place it in the cache.
  • GET /REPOSITORY/DIR : request to list DIR in REPOSITORY. The mongoose directory list libraries or the list script are used, according to what is specified in the local_repositories.list file. The output is returned to the user and contains the listing of the local or remote repository contents for the directory.
  • PUT /REPOSITORY/FILE : request to upload FILE to REPOSITORY. The file is provided in the body of the request. The put_file_as function (a modified version of the put_file mongoose function that supports user impersonation according to the GSI) or the put script is used, according to what is specified in the local_repositories.list file. The content of the PUT message is written directly to the file, without encoding support.
  • PUT /REPOSITORY/DIR : request to create DIR in REPOSITORY. The directory is created via the mkdir_as function, which handles user impersonation via the GSI, or via the put script, according to what is specified in the local_repositories.list file.
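The request types listed above can be exercised from a small client. The following is a minimal Python sketch, not the project's own client: the operation labels, the function names and the use of a plain X509 certificate/key pair (rather than a proxy certificate) are assumptions made for illustration. How the server distinguishes a FILE from a DIR in a PUT request is not described here, so both operations produce the same URI form.

```python
import http.client
import ssl
from urllib.parse import quote

# Operation names are labels chosen for this sketch, not protocol keywords.
METHODS = {"ping": "GET", "download": "GET", "list": "GET",
           "upload": "PUT", "mkdir": "PUT"}


def build_request(operation, repository="", path=""):
    """Return the (method, request URI) pair for one of the five operations."""
    method = METHODS[operation]
    if operation == "ping":
        return method, "/"
    return method, "/" + quote(f"{repository}/{path}")


def send(host, port, cert_file, key_file, ca_file,
         operation, repository="", path="", body=None):
    """Send one request, authenticating with an X509 client certificate."""
    context = ssl.create_default_context(cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    connection = http.client.HTTPSConnection(host, port, context=context)
    try:
        method, uri = build_request(operation, repository, path)
        connection.request(method, uri, body=body)
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()
```

For example, build_request("download", "repo", "dir/file.dat") yields the pair ("GET", "/repo/dir/file.dat"), which send() then issues over the SSL connection.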

External data transfer protocols support

By default, if the file repository is local or the file is in the local cache, the server will use the internal HTTP support (via mongoose) to provide the file to the user.

It is also possible to use other, external data transfer protocols. In this case, according to the ext_mapfile contents, the IP of the request and the request options (passed via the query (?) field of the URI), the user is provided with a list of URIs that refer to the available external data transfer protocols.

This is implemented via a generic sendReplyDownURI function, which generates the URIs accordingly.
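The selection logic can be illustrated with a short Python sketch. The format of ext_mapfile is not described in this manual, so the rule structure below (network, protocol, base URI), the example hosts and the function name are all hypothetical and only show the idea of filtering by request IP and requested protocols.

```python
import ipaddress

# Hypothetical in-memory rules; the real ext_mapfile format is not shown here.
RULES = [
    {"network": "10.0.0.0/8", "protocol": "gsiftp",
     "base": "gsiftp://cache.example.org/data"},
    {"network": "0.0.0.0/0", "protocol": "https",
     "base": "https://cache.example.org:8443"},
]


def reply_down_uris(rules, client_ip, repository, file_path, wanted=None):
    """List the URIs a client at client_ip may use to fetch the file."""
    address = ipaddress.ip_address(client_ip)
    uris = []
    for rule in rules:
        if address not in ipaddress.ip_network(rule["network"]):
            continue  # rule does not apply to this client network
        if wanted and rule["protocol"] not in wanted:
            continue  # client asked for other protocols in the query field
        uris.append(f"{rule['base']}/{repository}/{file_path}")
    return uris
```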

Thread system

The cache-server and the mongoose libraries are multi-threaded.

Mongoose uses a fixed number of threads (selectable by the user) to manage the HTTP requests. These threads are named worker threads and call the mg_request_handler cache-server function to serve each request. The number of worker threads can be selected only at start time. An additional thread, named the mongoose master thread, is created to listen on the HTTP port and dispatch the requests to the worker threads.

Each call to the download script (executed when a file from a remote repository needs to be cached) runs in a separate thread, created on the fly by the application. The reason is that, if the download has not finished when the HTTP request times out, the HTTP request is released, the user receives a "Please wait" message and the download continues in the background in its own thread. Multiple requests for the same file do not affect an already running download script (the download script is started once for each file).
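The behaviour described above (one background download per file, with waiting requests released after a timeout) can be sketched in Python as follows. This is an illustration and not the server's C implementation: the class name, the "ready" return value and the run_download_script callable are assumptions, and error handling is left out.

```python
import threading


class DownloadManager:
    """Start at most one background download per file key."""

    def __init__(self, run_download_script):
        self._run = run_download_script   # callable(key); blocks until done
        self._lock = threading.Lock()
        self._running = {}                # key -> Event set on completion

    def request(self, key, timeout):
        with self._lock:
            done = self._running.get(key)
            if done is None:              # first request: start the download
                done = threading.Event()
                self._running[key] = done
                threading.Thread(target=self._worker, args=(key, done),
                                 daemon=True).start()
        # Later requests for the same key only wait on the existing download.
        return "ready" if done.wait(timeout) else "Please wait"

    def _worker(self, key, done):
        try:
            self._run(key)
        finally:
            with self._lock:
                self._running.pop(key, None)
            done.set()
```

A caller is expected to check the cache before calling request(), since a finished download is forgotten by the manager.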

Two additional threads are started by the application. One is the Cache Manager thread, which manages the local caches and reloads the grid-mapfile and the host certificate when they change. The other is the User Impersonation thread, which impersonates the user for PUT operations (so it is started only if PUT operations are enabled).

For security, on user request, all the threads can drop all privileges and run as a normal user (by default, the globus user), except the User Impersonation thread, which keeps the privilege to impersonate other users. In order to minimize the risk of exploits, the User Impersonation thread is started in a separate memory space (using the fork call) and communicates with the worker threads only via a dedicated two-way pipe.
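The separation between the privileged helper and the rest of the server can be illustrated with a small Python sketch for POSIX systems. It only shows the fork plus dedicated-channel pattern: the message format, the function and class names are assumptions, and the helper here merely echoes what it would do instead of switching user identity.

```python
import os
import socket


def serve(channel):
    """Helper loop: one 'USER PATH' line in, one reply line out."""
    reader = channel.makefile("r", encoding="utf-8")
    writer = channel.makefile("w", encoding="utf-8")
    for line in reader:
        user, _, path = line.rstrip("\n").partition(" ")
        # A real helper would impersonate `user` here before writing.
        writer.write(f"would write {path} as {user}\n")
        writer.flush()


class HelperClient:
    """Side of the channel used by the unprivileged worker threads."""

    def __init__(self, channel):
        self._channel = channel
        self._reader = channel.makefile("r", encoding="utf-8")

    def ask(self, user, path):
        self._channel.sendall(f"{user} {path}\n".encode("utf-8"))
        return self._reader.readline().rstrip("\n")

    def close(self):
        self._reader.close()
        self._channel.close()


def start_helper():
    """Fork the helper into its own memory space; return (pid, client)."""
    parent_end, child_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:                      # child: the helper process
        parent_end.close()
        try:
            serve(child_end)
        finally:
            os._exit(0)               # never fall back into the parent code
    child_end.close()
    return pid, HelperClient(parent_end)
```

Closing the client makes the helper see end of file on its side of the channel and exit.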

Cache Manager Thread

The cache manager thread is the one in charge of the deletion of the files from the local cache.

A file is deleted from the cache when:

  • The file is expired. This happens when the last access to the file is older than the current time minus a configurable threshold, named expire_time. For example, if expire_time is 3 days, every file that has not been accessed (downloaded by a client) in the last 3 days is deleted.
  • The size of the cache folder exceeds the maximum allowable size for the cache. In this case the oldest files are deleted first.

The list of files to delete is obtained as follows. Every sleep_time (a customizable server variable, 1 minute by default), the thread scans the cache folders of all the repositories. If it finds an expired file, it simply deletes it. After scanning the entire cache folder (and its sub-folders), the thread knows the size of the cache and the oldest files. The number of files kept in memory by the thread is bounded by the customizable <maximum_files_to_delete_at_once> variable (default = 15). If the size of the directory is greater than the maximum allowed size for the cache, the thread deletes the oldest files until the cache is back within the maximum size. If all the files kept in memory have been deleted but the cache folder is still larger than the maximum, the thread scans the directory again.
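One scan pass of this policy can be sketched in Python. The sketch works on an already collected list of file records rather than on the file system, and the names CachedFile and plan_deletions are chosen for illustration; it assumes that "oldest" means the oldest last access time.

```python
import heapq
from typing import NamedTuple


class CachedFile(NamedTuple):
    path: str
    size: int           # bytes
    last_access: float  # seconds since the epoch


def plan_deletions(files, now, expire_time, max_cache_size,
                   max_files_to_delete_at_once=15):
    """Return (paths to delete, whether another scan is needed)."""
    limit = now - expire_time
    expired = [f for f in files if f.last_access < limit]
    alive = [f for f in files if f.last_access >= limit]
    cache_size = sum(f.size for f in alive)
    # Only a bounded number of the oldest files is kept in memory.
    oldest = heapq.nsmallest(max_files_to_delete_at_once, alive,
                             key=lambda f: f.last_access)
    evicted = []
    for f in oldest:
        if cache_size <= max_cache_size:
            break
        evicted.append(f)
        cache_size -= f.size
    rescan = cache_size > max_cache_size
    return [f.path for f in expired + evicted], rescan
```

When the second value is true, the remembered files were not enough to bring the cache under the limit, which corresponds to the thread scanning the directory again.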

To be more precise, the thread does not delete the file itself, but calls a delete script. For more information, see the Scripting system section.

If a client is downloading a file, the thread does not skip its deletion. You can use the delete script to check whether the file is currently in use. In any case, for local caches whose maximum size and expire time are not too small, it is very unlikely that the thread will remove a file while someone is downloading it, because when the file transfer server accesses the file it updates the last access time to the current time.

Errors management and logging

Logging is performed using shared logging routines, which separate log messages into different levels: ERROR, WARNING, INFO and DEBUG. The standard level displayed is WARNING, but the user can increase the logging level using the -d switch on the command line. Logging output goes to stderr and stdout by default, but can be redirected to a local file on request.

Errors raised while running the scripts are handled as follows. For the list and put scripts, the error is forwarded directly to the user, while for the download script the error is cached in an internal list, which is cleared after a defined amount of time. This protects the remote repository from floods of requests that generate errors (note that if a request is successful, subsequent requests access the cache directly, so the remote repository is protected in any case).
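The download error list can be pictured as a small time-limited cache. The Python sketch below is an assumption about its shape (the class name, the per-entry time to live and the injectable clock are not taken from the server sources); it shows how a cached error can be returned without contacting the remote repository again.

```python
import time


class DownloadErrorCache:
    """Remember download errors for ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._errors = {}  # file key -> (expiry time, error message)

    def record(self, key, message):
        self._errors[key] = (self._clock() + self._ttl, message)

    def lookup(self, key):
        """Return the cached error for key, or None if absent or expired."""
        entry = self._errors.get(key)
        if entry is None:
            return None
        expires, message = entry
        if self._clock() >= expires:
            del self._errors[key]  # entry is too old: allow a new attempt
            return None
        return message
```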

Scripting system

The grid-cache-server does not delete or download a file directly; it always calls a delete script or a download script to perform the operation.

This makes it easy to extend the server, for example to support downloads from repositories that use a particular protocol, or to manage logging and registration of the presence or absence of files in the cache. For instance, it is possible to register each file present in the local cache in an external catalogue (by putting a home-made registration routine in the download script and an unregistration routine in the delete script). In this case, the grid system is aware of every file present in the cache and can send jobs to where their input files are already cached, reducing the amount of resources used.

For put and list operations, the server may use the internal mongoose libraries or custom scripts. File ACLs may also be evaluated by an external script.

In the following paragraphs we will describe the format of the cache server scripts.

Download script

Every time the grid-cache-server needs to download a file to the local cache, it calls the download script with the following command line:

download_script <REMOTE_PATH_URL> <LOCAL_REPO_PATH> <REMOTE_FILE_NAME> <LOCAL_FILE_NAME>

Where:

  • download_script is the script specified in the repository list for the given repository; it can contain additional command line parameters.
  • <REMOTE_PATH_URL> is the URL of the remote path, as specified in the last column of the repository list.
  • <LOCAL_REPO_PATH> is the full path of the local repository, as specified in the first column of the repository list.
  • <REMOTE_FILE_NAME> is the relative path plus file name of the remote file which needs to be downloaded.
  • <LOCAL_FILE_NAME> is the local relative path and file name of the file to be downloaded.

To obtain the full remote file URI and the full local file path, simply concatenate REMOTE_PATH_URL with REMOTE_FILE_NAME and LOCAL_REPO_PATH with LOCAL_FILE_NAME, respectively.

In simple words, the download script must copy $REMOTE_PATH_URL$REMOTE_FILE_NAME into the local file $LOCAL_REPO_PATH$LOCAL_FILE_NAME, returning 0 when the download completes successfully and an error code otherwise. Also, no files must be left in the LOCAL_REPO_PATH if the download is unsuccessful, and no temporary files at all must be created in the LOCAL_REPO_PATH.

NOTE: No log of the downloaded files in the cache is maintained by the caching server. Therefore, if you want a log, you need to manage it from the script.

When the server starts, it calls the download script with the command line:

download_script <REMOTE_PATH_URL> <LOCAL_REPO_PATH> --check

This is used to perform a check that every external software or configuration file needed by the script is present and working. The script must return 0 if the check resulted in no errors or a number different from 0 if some error occurred.

An example download script is provided in the samples folder of the source package.
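Independently of the packaged sample, the contract above can be sketched in Python. This is a minimal illustration under stated assumptions: the remote repository is reachable through a URL scheme that urllib can open, the system temporary directory lies outside LOCAL_REPO_PATH, and the --check branch only verifies that the local repository directory exists.

```python
import os
import shutil
import sys
import tempfile
import urllib.request


def main(argv):
    """argv: REMOTE_PATH_URL LOCAL_REPO_PATH REMOTE_FILE_NAME LOCAL_FILE_NAME"""
    if len(argv) == 3 and argv[2] == "--check":
        return 0 if os.path.isdir(argv[1]) else 1
    if len(argv) != 4:
        return 2
    remote_path_url, local_repo_path, remote_file_name, local_file_name = argv
    source = remote_path_url + remote_file_name   # plain concatenation
    target = local_repo_path + local_file_name
    # Scratch space outside LOCAL_REPO_PATH (assumed for the system tmp dir).
    scratch = tempfile.mkdtemp(prefix="download-")
    try:
        partial = os.path.join(scratch, "partial")
        with urllib.request.urlopen(source) as response, \
                open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        shutil.move(partial, target)  # file appears only after a full download
        return 0
    except (OSError, ValueError) as exc:
        print(f"download of {source} failed: {exc}", file=sys.stderr)
        return 1
    finally:
        shutil.rmtree(scratch, ignore_errors=True)
```

A real script would end by calling sys.exit(main(sys.argv[1:])). Note that shutil.move is a rename only when the scratch directory and the cache are on the same file system; otherwise it copies, so a production script would want to take care of that case.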

Delete script

The delete script works like the download script, with the only difference that it can be called with multiple local file names (and there is no remote file name):

delete_script <REMOTE_PATH_URL> <LOCAL_REPO_PATH> <LOCAL_FILE_NAME_1> … <LOCAL_FILE_NAME_N>

This script must remove the $LOCAL_REPO_PATH$LOCAL_FILE_NAME_i files (with i from 1 to N). N can be any number between 1 and the <maximum_files_to_delete_at_once> variable (see the Cache Manager Thread section).

Like the download script, the delete script must respond to the --check switch during server startup, and it must maintain its own log of the deleted files (if desired).

An example delete script is provided in the samples folder of the source package.
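A delete script following this calling convention could look like the Python sketch below. It is an illustration only: treating an already missing file as success and the minimal --check test are assumptions, not behaviour documented for the packaged sample.

```python
import os
import sys


def main(argv):
    """argv: REMOTE_PATH_URL LOCAL_REPO_PATH LOCAL_FILE_NAME_1 ... _N"""
    if len(argv) == 3 and argv[2] == "--check":
        return 0 if os.path.isdir(argv[1]) else 1
    if len(argv) < 3:
        return 2
    local_repo_path, names = argv[1], argv[2:]
    status = 0
    for name in names:
        path = local_repo_path + name
        try:
            os.remove(path)
        except FileNotFoundError:
            pass                      # already gone: nothing to do
        except OSError as exc:
            print(f"cannot delete {path}: {exc}", file=sys.stderr)
            status = 1
    return status
```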

Put script

The put script handles writing a file to a remote repository. It is called via:

put_script <REMOTE_PATH_URL> <LOCAL_REPO_PATH> <LOCAL_FILE_PATH> < FILEDATA

This script must create, in the remote repository (mapped via $REMOTE_PATH_URL), the file that corresponds to $LOCAL_REPO_PATH$LOCAL_FILE_PATH, and write to it the file contents that the script receives on stdin.

An example put script is provided in the samples folder of the source package.
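As a rough illustration of this contract, the Python sketch below copies stdin to the target file. It assumes, purely for the example, that the remote repository is reachable as a mounted directory so that REMOTE_PATH_URL can be used as a plain path prefix; a real script would use whatever protocol the repository requires. The optional data argument exists only to make the function easy to try out.

```python
import os
import shutil
import sys


def main(argv, data=None):
    """argv: REMOTE_PATH_URL LOCAL_REPO_PATH LOCAL_FILE_PATH; data on stdin."""
    if len(argv) != 3:
        return 2
    remote_path, _local_repo_path, local_file_path = argv
    source = data if data is not None else sys.stdin.buffer
    # Assumption: the remote repository is a mounted directory.
    target = remote_path + local_file_path
    try:
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        with open(target, "wb") as out:
            shutil.copyfileobj(source, out)  # raw bytes, no encoding applied
    except OSError as exc:
        print(f"put to {target} failed: {exc}", file=sys.stderr)
        return 1
    return 0
```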

List script

The list script works like the download script, with the only difference that its output is sent directly to the user. Since the user will probably use a web browser to read the output, it is recommended to produce HTML-compliant output. The list script is called via:

list_script <REMOTE_PATH_URL> <LOCAL_REPO_PATH> <LOCAL_DIRECTORY>

This script must list the contents of the $LOCAL_DIRECTORY folder in the $REMOTE_PATH_URL repository, which is mapped to the $LOCAL_REPO_PATH path.

Like the other scripts, the list script must respond to the --check switch during server startup.

An example list script is provided in the samples folder of the source package.
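A list script that emits HTML could be sketched in Python as follows. As with the other sketches, the remote repository is assumed to be reachable as a mounted directory (so REMOTE_PATH_URL is used as a path prefix), and the page layout and function names are illustrative choices.

```python
import html
import os
from urllib.parse import quote


def render_listing(directory_path, title):
    """Return a small HTML page listing the entries of directory_path."""
    items = []
    for name in sorted(os.listdir(directory_path)):
        is_dir = os.path.isdir(os.path.join(directory_path, name))
        label = name + "/" if is_dir else name
        items.append(f'<li><a href="{html.escape(quote(label))}">'
                     f"{html.escape(label)}</a></li>")
    heading = html.escape(title)
    return (f"<html><head><title>{heading}</title></head><body>"
            f"<h1>{heading}</h1><ul>{''.join(items)}</ul></body></html>")


def main(argv):
    """argv: REMOTE_PATH_URL LOCAL_REPO_PATH LOCAL_DIRECTORY"""
    if len(argv) != 3:
        return 2
    remote_path, _local_repo_path, local_directory = argv
    if local_directory == "--check":
        return 0 if os.path.isdir(remote_path) else 1
    try:
        print(render_listing(remote_path + local_directory, local_directory))
    except OSError:
        return 1
    return 0
```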

File ACL auth script

The ACL auth script will give authorization for a given request. The script is called via

auth_script <REQUEST_IP> <REQUEST_SOURCE_PORT> <USER_CN> <REQUEST_URI>

and needs to return 0 if the request for the given $REQUEST_URI is authorized for the given $REQUEST_IP, $REQUEST_SOURCE_PORT and $USER_CN (user certificate common name).
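A very small auth script could be sketched in Python like this. The allow-list, the example common name and the choice of returning 1 for a denied request are assumptions made for illustration; only the "return 0 when authorized" rule comes from the description above.

```python
# Hypothetical policy: URI prefixes each certificate common name may access.
ALLOWED_PREFIXES = {
    "Alice Example": ["/public/", "/alice/"],
}


def main(argv):
    """argv: REQUEST_IP REQUEST_SOURCE_PORT USER_CN REQUEST_URI"""
    if len(argv) != 4:
        return 2
    _request_ip, _source_port, user_cn, request_uri = argv
    for prefix in ALLOWED_PREFIXES.get(user_cn, []):
        if request_uri.startswith(prefix):
            return 0   # authorized
    return 1           # not authorized (non-zero value is an assumption)
```

A policy based on $REQUEST_IP or $REQUEST_SOURCE_PORT would use the first two arguments in the same way.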


