V:4.1.3/Karajan:Task Library

From Java CoG Kit

Files: task.k, task.xml

The task library interfaces with the Java CoG Kit abstraction classes, allowing the use of services for job submission and file operations. The tasks in this library can function in two modes: scheduled or unscheduled. When scheduled, remote tasks are not executed directly. Instead, they are passed to a scheduler, which can handle issues such as throttling, resource allocation, and task-to-resource mapping.

Task Elements


task:scheduler(type, resources, handlers, *properties)

Defines the scheduler to be used. The scope of the scheduler is the scope of execution of the parent of the scheduler element. The type argument describes the particular type of scheduler desired. The resources available to the scheduler are passed in the resources argument and can be defined using the resources element. Each scheduler also requires a list of task handlers, specified in the handlers argument with the help of the handler element. Each type of scheduler may support an optional set of properties. The *properties argument, if specified, must be a map containing string keys and values.

The following schedulers are currently available:

  1. Default

The “default” scheduler uses a round-robin scheduling policy. However, it also performs lookahead matching. This means that if a certain host has reached its maximum allowable number of tasks, it will be skipped. Also, if a suitable host is not found for the next task in the queue, other tasks may be scheduled. The scheduler will use the handlers in the order they were specified in the handlers list, with the first handler having the highest priority.

The default scheduler supports the following properties:

jobsPerCpu
Sets the maximum number of tasks that the scheduler will allocate for one CPU.
maxSimultaneousJobs
Sets the total maximum number of remote tasks that the scheduler will allow at any given time.
showTaskList
If set to true, the scheduler will pop up a window providing a list of the tasks that are being executed, along with additional task and memory statistics.
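
The round-robin policy with lookahead matching can be modeled roughly as follows. This is a minimal Python sketch of the behavior described above, not the Java CoG Kit implementation: the function name, the host dictionaries, and the assumption that a host's limit is its CPU count multiplied by jobsPerCpu are all illustrative.

```python
from collections import deque

def schedule_round(queue, hosts, running, jobs_per_cpu=1, max_simultaneous=None):
    """Assign queued tasks to hosts; return a list of (task, host_name) pairs.

    queue:   deque of (task_name, required_service) tuples
    hosts:   list of dicts with keys "name", "cpus", "services"
    running: dict mapping host name -> number of tasks currently running
    All of these structures are hypothetical stand-ins for the real scheduler state.
    """
    assignments = []
    skipped = deque()
    cursor = 0  # round-robin position in the host list
    while queue:
        # Assumed meaning of maxSimultaneousJobs: a global cap on running tasks.
        if max_simultaneous is not None and sum(running.values()) >= max_simultaneous:
            break
        task, service = queue.popleft()
        chosen = None
        # Try each host once, starting from the round-robin cursor.
        for step in range(len(hosts)):
            host = hosts[(cursor + step) % len(hosts)]
            full = running.get(host["name"], 0) >= host["cpus"] * jobs_per_cpu
            if not full and service in host["services"]:
                chosen = host
                cursor = (cursor + step + 1) % len(hosts)
                break
        if chosen is None:
            # Lookahead: hold this task back and keep trying later tasks.
            skipped.append((task, service))
            continue
        running[chosen["name"]] = running.get(chosen["name"], 0) + 1
        assignments.append((task, chosen["name"]))
    # Put held-back tasks at the front of the queue in their original order.
    queue.extendleft(reversed(skipped))
    return assignments
```

In this model, if the only host offering a file service is already full, a queued file task is held back while a later execution task can still be assigned to a free host.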

  2. Weighted

The weighted scheduler is an experimental adaptive scheduler that maintains a "performance" history of all the hosts that it manages. Each host starts with a score of 1. If a task fails on a host, the host's score is decreased by a certain factor; if a task succeeds, the score is increased by a certain factor, and so on. Scores are also temporarily decreased with each job running on a host. Periodically, the scores are normalized: each score is multiplied by the same factor, chosen such that the geometric average of the scores after the normalization is 1.

The weighted scheduler has two host selection strategies: "random" and "best". With the "best" strategy, the host with the highest score at the time a task is submitted is chosen for that task. By contrast, the "random" strategy uses a weighted random choice, which gives a host with a higher score a higher chance of being selected. With the weighted random policy, assuming that scoreLowCap > 0 and given a sufficiently large number of tasks, every host will eventually get a chance to be used (and thus possibly increase its score).
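
The normalization step and the two selection strategies can be illustrated with a short Python sketch. The function names and the dictionary of scores are assumptions made for this illustration and are not taken from the scheduler's source. The sketch relies on all scores being strictly positive, which corresponds to scoreLowCap > 0.

```python
import math
import random

def normalize(scores):
    """Scale all scores by one common factor so their geometric mean becomes 1."""
    log_mean = sum(math.log(s) for s in scores.values()) / len(scores)
    factor = math.exp(-log_mean)  # 1 / (geometric mean of the current scores)
    return {host: s * factor for host, s in scores.items()}

def pick_host(scores, policy="random", rng=random):
    """Select a host name using the "best" or the weighted "random" policy."""
    if policy == "best":
        # Highest score at submission time wins.
        return max(scores, key=scores.get)
    hosts = list(scores)
    weights = [scores[h] for h in hosts]
    # Weighted random choice: selection probability is proportional to the score.
    return rng.choices(hosts, weights=weights, k=1)[0]
```

Because every weight is positive, a low-scoring host keeps a small but non-zero chance of being picked under the "random" policy, which is what allows it to recover its score over time.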

The following properties are available for the weighted scheduler (default values are listed in parentheses, before each description; factors are values by which the score for a host is multiplied when a certain event occurs):

connectionRefusedFactor
(0.1) factor for connection refused exceptions while submitting to a host
connectionTimeoutFactor
(0.05) factor for connection timeout exceptions while submitting to a host
jobSubmissionTaskLoadFactor
(0.9) used when a job is submitted successfully; reversed when the job completes (either successfully or in failure)
transferTaskLoadFactor
(0.9) used when a task transfer is started; reversed when the transfer completes
fileOperationTaskLoadFactor
(0.95) used when a file operation is started; reversed when the operation completes
successFactor
(1.2) factor used upon successful completion of a task
failureFactor
(0.9) used when a task fails
scoreHighCap
(100) maximum value for a score
scoreLowCap
(0.001) minimum value for a score
renormalizationDelay
(100) number of tasks submitted after which a normalization occurs
policy
("random") use either a weighted "random" host selection policy or a "best" score host selection policy.
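
To show how these factors might combine over the lifetime of one job, here is a hedged Python sketch using the default values listed above. It assumes that "reversed" means multiplying by the reciprocal of the load factor and that the caps are applied after every update. Neither detail is confirmed by this page, and the function names are hypothetical.

```python
# Default values copied from the property list above.
DEFAULTS = {
    "jobSubmissionTaskLoadFactor": 0.9,
    "successFactor": 1.2,
    "failureFactor": 0.9,
    "scoreHighCap": 100.0,
    "scoreLowCap": 0.001,
}

def apply_factor(score, factor, props=DEFAULTS):
    """Multiply a score by a factor and clamp the result to the configured caps."""
    return min(props["scoreHighCap"], max(props["scoreLowCap"], score * factor))

def run_job(score, succeeded, props=DEFAULTS):
    """Trace the effect of one job on a host score."""
    load = props["jobSubmissionTaskLoadFactor"]
    score = apply_factor(score, load, props)        # submitted: temporary load penalty
    score = apply_factor(score, 1.0 / load, props)  # completed: penalty reversed (assumed reciprocal)
    outcome = props["successFactor"] if succeeded else props["failureFactor"]
    return apply_factor(score, outcome, props)      # lasting effect of success or failure
```

Under these assumptions, a host starting at a score of 1 ends at roughly 1.2 after a successful job and roughly 0.9 after a failed one, while the caps keep repeated successes or failures from moving the score outside the range between scoreLowCap and scoreHighCap.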

task:handler(type, provider)

A handler specifies a Java CoG Kit abstraction handler. A handler is used to submit tasks. The type argument indicates the type of handler. The type is a string and can have one of the following values: “execution”, “file”, and “file-transfer”.

Execution handlers are used for submitting jobs. File handlers are used for file operations (such as renaming, deleting, and listing of files). File transfer handlers are used only for transferring files. It is possible to transfer files using file handlers, but it is not possible to delete a file using a file transfer handler.
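
The relationship between handler types and operations, combined with the priority order of the handlers list, can be sketched in Python as follows. The capability table is an illustration derived from the description above, and the operation names and provider names in the usage note are hypothetical placeholders rather than identifiers from the library.

```python
# Operations each handler type can perform, as described above (illustrative names).
CAPABILITIES = {
    "execution": {"execute"},
    "file": {"rename", "remove", "list", "transfer"},
    "file-transfer": {"transfer"},
}

def select_handler(handlers, operation):
    """Return the first (type, provider) handler able to perform the operation.

    handlers is a list of (type, provider) pairs in priority order,
    with the first entry having the highest priority.
    """
    for handler_type, provider in handlers:
        if operation in CAPABILITIES[handler_type]:
            return handler_type, provider
    return None  # no handler in the list supports this operation
```

With a list such as `[("file-transfer", "providerA"), ("file", "providerB")]`, a transfer would go to the first entry, while a remove operation would fall through to the file handler.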

The provider argument indicates the provider to be used for the handler. For a list of currently supported providers please see the abstractions guide.


task:resources(...)

Encapsulates a set of hosts which can be specified using host.


task:host(name, *cpus, ...)

Returns a host definition. The name argument indicates the host name or IP address. The number of CPUs of the host can be specified using the *cpus argument. A set of services can also be specified on the default channel.


task:service(type, provider, *uri, *project, *jobManager, *securityContext)

Returns a service definition. The type of the service can be one of “execution”, “file”, or “file-transfer”. The provider argument indicates the Java CoG Kit abstraction provider for the service. For a list of currently supported providers, please see the abstractions guide.

The *uri argument can be used to specify a URI for the service. If it is missing, the host name of the host containing the service will be used.

The *project argument can be used to automatically bind a queuing system project to the service, removing the need to specify it with the execute element.

The *jobManager argument can be used to specify a job manager different from the default. Examples of job managers include Fork, PBS, and Condor.

A non-default security context can be specified using the *securityContext argument.


task:securityContext(provider, credentials)

Returns a Java CoG Kit abstraction security context. The returned context will be instantiated for the specified provider. The credentials argument can be used to pass a specific set of credentials to the security context.


task:allocateHost(name)

Allows tasks to be grouped on one host. By default, the scheduler assigns a different host to each task. AllocateHost can be used to reserve a host from the scheduler until it completes. The name indicates the name of the variable to be set with the allocated host, and is automatically quoted.

//Define a scheduler    
scheduler(
  ...
)       

allocateHost(host1
  execute("/bin/date", stdout="date", host=host1)    
  transfer(srcfile="date", srchost=host1, desthost="localhost")
)

Or, in XML:

<scheduler>
  ...
</scheduler>

<allocateHost name="host1">    
  <execute executable="/bin/date" stdout="date" host="{host1}"/>    
  <transfer srcfile="date" srchost="{host1}" desthost="localhost"/>    
</allocateHost>

The default scheduler uses a late binding mechanism with allocateHost. It generates a virtual host that is only bound to an actual host when the first task using it is submitted to the scheduler. This removes the limitation on the number of parallel allocateHost elements that can be running, and allows contained jobs to be submitted to the scheduler, which then handles throttling.
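
The late binding idea can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the class names are hypothetical, and real hosts are picked in plain round-robin order without any throttling, which the actual scheduler would add.

```python
class VirtualHost:
    """Placeholder host that is bound to a real host on first use."""

    def __init__(self, label):
        self.label = label
        self.bound_to = None  # real host name, set by the first submission


class LateBindingScheduler:
    """Toy scheduler that binds virtual hosts to real hosts lazily."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.next_index = 0
        self.submitted = []  # (task, real host name) pairs, in submission order

    def allocate_host(self, label):
        # No real host is reserved here, so any number of allocations can coexist.
        return VirtualHost(label)

    def submit(self, task, virtual_host):
        if virtual_host.bound_to is None:
            # First task for this virtual host: pick a real host now (round-robin).
            virtual_host.bound_to = self.hosts[self.next_index % len(self.hosts)]
            self.next_index += 1
        # Every later task for the same virtual host reuses the same real host.
        self.submitted.append((task, virtual_host.bound_to))
        return virtual_host.bound_to
```

In this model, allocating a host costs nothing until a task is actually submitted, and all tasks that share one virtual host end up grouped on the same real host.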

Multiple allocateHost elements can be nested, allowing the grouping of tasks on multiple dependent hosts.


task:host(host, type, provider)

Checks if a host, specified with the host element, contains a service of the specified type and with the specified provider. Returns true if such a service exists, and false otherwise.


task:execute(executable, *arguments, *directory, *stdout, *stderr, *stdin, *redirect, *provider, *host, *count, *jobtype, *maxtime, *maxwalltime, *maxcputime, *environment, *queue, *project, *minmemory, *maxmemory)

Executes a remote job. The executable argument indicates the executable to be run. Arguments can be passed to the executable using *arguments. If present, the *directory argument specifies the remote directory in which the job will be executed. *Stdout and *stderr allow the redirection of the output and error streams to a remote file. *Stdin allows the redirection of the standard input from a remote file. If *redirect is set to true, the standard output and standard error of the remote job are redirected to the local console. The *host argument allows the job to be executed on a specific host, and the *provider argument allows the job to be executed using a specific provider.

The rest of the arguments are passed to the underlying provider.


task:transfer(*srcfile, *srcdir, *srchost, *destfile, *destdir, *desthost, *provider)

Transfers a file. The file can be transferred between the local machine and a remote machine, or between two remote machines. The name of the source file is specified by the *srcfile argument. If present, *destfile specifies the name of the target file; otherwise the source file name is used.

The *srcdir argument indicates the directory on the source machine where the source file can be found. If the *srcdir argument is missing, the default directory will be assumed (provider dependent).

The *destdir argument indicates the directory on the target machine where the file will be copied. If the *destdir argument is missing, the default directory will be assumed (provider dependent).
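
The defaulting rules for file names and directories can be summarized in a small Python sketch. The function name is hypothetical, and the use of POSIX-style path joining is an assumption. A path returned without a directory stands for "relative to the provider's default directory", which this sketch does not try to resolve.

```python
import posixpath

def resolve_transfer(srcfile, srcdir=None, destfile=None, destdir=None):
    """Return (source_path, destination_path) for a transfer request."""
    # The target keeps the source file name unless destfile is given.
    target_name = destfile if destfile is not None else srcfile
    # Without a directory, the name is left relative to the provider default.
    source = posixpath.join(srcdir, srcfile) if srcdir else srcfile
    destination = posixpath.join(destdir, target_name) if destdir else target_name
    return source, destination
```

For instance, calling `resolve_transfer("date")` leaves both paths as the bare name "date", mirroring a transfer that relies entirely on default directories and the source file name.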

*Srchost and *desthost indicate the source and the target machines respectively. The *provider argument can be used to force the scheduler to use a specific provider, or to select a provider when no scheduler is used.


task:file:list(dir, *host, *provider)

Returns a list of files in a directory specified by dir, on the *host machine. The *provider argument can be used to select a specific provider for the operation. *Provider defaults to the local provider.


task:file:remove(name, *host, *provider)

Removes a file specified by name, on the *host machine. The *provider argument can be used to select a specific provider for the operation. *Provider defaults to the local provider.


task:file:exists(name, *host, *provider)

Returns true if the file specified by name exists on the *host machine. The *provider argument can be used to select a specific provider for the operation. *Provider defaults to the local provider.


task:dir:make(name, *host, *provider)

Creates a directory specified by name, on the *host machine. The *provider argument can be used to select a specific provider for the operation. *Provider defaults to the local provider.


task:dir:remove(name, *host, *provider)

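Removes an empty directory specified by name, on the *host machine. The *provider argument can be used to select a specific provider for the operation.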


task:file:isDirectory(name, *host, *provider)

Returns true if the file specified by name exists on the *host machine and is a directory. The *provider argument can be used to select a specific provider for the operation. *Provider defaults to the local provider.


task:file:chmod(name, mode, *host, *provider)

Changes the permissions on the file specified with the name argument to the mode string indicated by the mode argument. If *host and *provider are present, the operation is done remotely using the respective provider.


task:file:rename(from, to, *host, *provider)

Renames a file. The source and target name are specified using the from and to arguments. If *host and *provider are present, the operation is done remotely.


task:SSHSecurityContext(credentials)

Instantiates an SSH security context. This is simply a convenience function for securityContext("ssh", credentials).


task:InteractiveSSHSecurityContext(*username, *privateKey, *nogui)

Instantiates an SSH security context which will lazily display a dialog window allowing the user to input a user-name/password pair or a user-name/private key/passphrase set. The dialog will only be displayed once per instance of an interactive SSH security context.

If *username and/or *privateKey are specified, the values are used to pre-fill the corresponding dialog fields.

The InteractiveSSHSecurityContext makes use of a class present in the SSH provider of the Java CoG Kit. This class will try to determine whether a GUI can be displayed or not (by checking GraphicsEnvironment.isHeadless()). If a Swing dialog cannot be displayed, a text-mode interface is used instead. The *nogui argument can be used to force the use of the text-mode interface.


task:passwordAuthentication(username, password)

Returns a username/password pair suitable to be used as a credential for a securityContext.


task:publicKeyAuthentication(username, privatekey, passphrase)

Returns a username/privatekey/passphrase set which can be used as credentials for securityContext. The privatekey argument must point to a file containing the private key.
