Java CoG Kit Abstraction Guide

Gregor von Laszewski and Kaizar Amin

Version: 4.1.3

Last update: January 17, 2006

This document includes basic information about the different Java CoG Kit abstractions and their intended usage.

Introduction

The Java CoG Kit abstractions offer a programming model that supports elementary Grid patterns such as job execution, file transfer, and file operations. They also provide advanced patterns such as execution (control) flows in the form of directed acyclic graphs (DAG).

The abstractions module is based on a Grid abstraction model, de-coupling the Grid patterns from their implementation. The applications developed with the offered Grid abstractions are independent of the low-level Grid implementations. Hence, applications are shielded from changes in the low-level Grid protocols and interfaces.

The abstractions module offers the following benefits to Grid application programmers:

  • develop client applications that will be interoperable across multiple Grid backend implementations;
  • provide reusable code to support rapid prototyping of basic Grid access patterns;
  • provide an open-source and extensible architecture that can be built collectively and incrementally based on community feedback; and
  • access the same set of interfaces implemented in disparate technologies.

The job execution pattern in cog-abstractions currently supports GT v2.4.3, GT v3.0.2, GT v3.2.0, GT v3.2.1, GT v4.0.0, Condor, and SSH implementations. Other platforms (such as Unicore) will be supported in future releases upon request by the community. We invite the community to participate in these activities.

The file transfer pattern in cog-abstractions allows file transfers between GridFTP, FTP, WebDAV, and SSH resources.

The file operation pattern in cog-abstractions permits remote file access operations (such as ls, chmod, cd, get, put) on files hosted on GridFTP, FTP, and WebDAV resources.

Installation

The cog-abstractions source is available with the Java CoG Kit v4. Instructions for downloading the Java CoG Kit are available in the Java CoG Kit Installation Guide at Url: \MANUALBASE/install.pdf.

We note that cog-abstractions is explicitly a client-side library. The current version of cog-abstractions provides support for GT v2.4.3, GT v3.0.2, GT v3.2.0, GT v3.2.1, GT v4.0.0, Condor, SSH, FTP, and WebDAV.

Hence, in order to execute job submission tasks with our abstractions, the reader is directed to install GT v2.4.3, GT v3.0.2, GT v3.2.0, GT v3.2.1, GT v4.0.0, Condor, or an SSH job submission server. Likewise, in order to execute file transfer and file operation tasks, the reader is directed to install a GridFTP, FTP, or WebDAV file server.

For details on installing the Globus Toolkit, please visit the Globus Alliance web page at http://www.globus.org

Examples

Several examples that demonstrate the ease of use and functionality of the Java CoG Kit abstractions are provided with the Java CoG Kit distribution. These examples are well documented in the Java CoG Kit Example Guide at Url: \MANUALBASE/examples.pdf.

The examples are further divided into the following packages:

  • execution: examples demonstrating the execution of a job on a remote job execution service (GT v2.4.3, GT v3.0.2, GT v3.2.0, GT v3.2.1, GT v4.0.0, Condor, and SSH).
  • transfer: examples demonstrating the transfer of files and directories between two file servers (GridFTP, FTP, and WebDAV).
  • file: examples demonstrating the operations on files hosted on remote file servers (GridFTP, FTP, and WebDAV).
  • taskgraph: examples demonstrating the execution of hierarchical task graphs and control dependencies between Grid tasks.
  • queue and set: examples demonstrating the execution of hierarchical queues and sets of tasks respectively.
  • xml: examples demonstrating the ability to checkpoint partially executed task graphs to an XML document and construct task graphs from checkpointed XML documents.
  • invocation: examples demonstrating the ability to integrate with Web Services.

After successfully compiling the Java CoG Kit, these examples can be executed from the launcher scripts available in the cog/dist/cog-4_1_5/bin/examples directory.

Design

The abstractions framework facilitates access to a wide range of Grid functionality using the Task-Provider model. This model allows normalized access to a variety of Grid technologies without having to adapt to the semantics of each of them.

Every Grid job (remote execution, file transfer, file operation) is represented as a Grid task. All job-specific details are represented as a task specification. Further, the remote execution and file servers are locally represented as service objects. Every service has a provider attribute associated with it that signifies the technology in which the service is implemented. In order to execute the Grid job, a user needs to create an abstract Grid task and associate a specification and service to it. The task is then submitted to a task-handler. The task-handler extracts the specification details and depending on the provider attribute of the service translates them into the protocol specific constructs expected by the backend service. The task is submitted and executed by the handler in an asynchronous mode. Hence, the client need not wait (block) for the task to be completed by the remote service. Instead, once the task is submitted to the task-handler, the client is free to continue with other activities and gets asynchronously notified once the task is done (completed or failed).
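The flow described above can be sketched with a few self-contained stand-ins. This is only an illustration under stated assumptions: MiniTask, TRANSLATORS, the provider names "gt2" and "ssh", and the request formats are all hypothetical and are not part of the CoG Kit API. The sketch shows a handler choosing a translation from the provider attribute and returning control to the client before the work is done.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical stand-ins, not CoG Kit classes: they only mirror the
// task / specification / service / handler roles described above.
class TaskProviderSketch {

    // A task couples a specification (what to run) with a service
    // (where to run it, and through which provider).
    static final class MiniTask {
        final String executable; // stands in for the specification
        final String provider;   // technology of the backend service
        final String contact;    // stands in for the service contact

        MiniTask(String executable, String provider, String contact) {
            this.executable = executable;
            this.provider = provider;
            this.contact = contact;
        }
    }

    // One translator per provider; the request formats are invented.
    static final Map<String, Function<MiniTask, String>> TRANSLATORS = new HashMap<>();
    static {
        TRANSLATORS.put("gt2", t -> "&(executable=" + t.executable + ") @ " + t.contact);
        TRANSLATORS.put("ssh", t -> "ssh " + t.contact + " " + t.executable);
    }

    // Chooses the abstract-to-protocol translation from the provider attribute.
    static String translate(MiniTask task) {
        Function<MiniTask, String> translator = TRANSLATORS.get(task.provider);
        if (translator == null) {
            throw new IllegalArgumentException("no handler for provider " + task.provider);
        }
        return translator.apply(task);
    }

    // Returns immediately; the future completes once the simulated
    // submission has been processed on another thread.
    static CompletableFuture<String> submit(MiniTask task) {
        return CompletableFuture.supplyAsync(() -> translate(task));
    }

    public static void main(String[] args) {
        MiniTask task = new MiniTask("/bin/date", "ssh", "host.example.org");
        CompletableFuture<String> pending = submit(task);
        // The client could continue with other work here.
        pending.thenAccept(r -> System.out.println("done: " + r)).join();
    }
}
```

In the real framework the notification arrives through status listeners rather than a future; the future is used here only to keep the sketch short.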

One of the most elementary usage patterns in Grid computing is the execution of a Grid task (job submission, file transfer, file operation, or information query). An extension to this basic Grid pattern is a Grid workflow pattern that enables the user to submit a set of Grid tasks along with an execution dependency. Therefore, the initial design of the abstractions framework concentrates on providing the elements required to support these important usage patterns. Other Grid patterns can be supported by extending the flexible abstractions design based on community feedback.

In the rest of this section, we describe the important entities and outline their semantics within the Java CoG Kit to implement the discussed abstractions framework.

ExecutableObject

An ExecutableObject provides a high-level abstraction for artifacts that can be executed on the Grid. It can be specialized as a Grid Task or a TaskGraph. An ExecutableObject in the Java CoG Kit has a unique identity, name, and an execution status.


 public interface ExecutableObject {
 
     public static final int TASK = 1;
     public static final int TASKGRAPH = 2;
 
     public int getObjectType();
 
     public void setName(String name);
     public String getName();
 
     public void setIdentity(Identity id);
     public Identity getIdentity();
 
     public void setStatus(Status status);
     public void setStatus(int status);
     public Status getStatus();
 
     public void addStatusListener(StatusListener listener);
     public void removeStatusListener(StatusListener listener);
 
 }
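The listener methods in the interface above suggest an observer pattern: interested parties register once and are called back on every status change. The following is a minimal sketch of that idea. MiniExecutable and MiniStatusListener are hypothetical, and the callback signature is an assumption, since this guide does not show the StatusListener interface itself.

```java
import java.util.ArrayList;
import java.util.List;

class StatusListenerSketch {

    // Assumed callback shape; not the CoG Kit StatusListener.
    interface MiniStatusListener {
        void statusChanged(String objectName, int oldStatus, int newStatus);
    }

    // Hypothetical stand-in for an ExecutableObject with a name and a status.
    static final class MiniExecutable {
        private final String name;
        private int status = 0; // 0 stands for "unsubmitted" in this sketch
        private final List<MiniStatusListener> listeners = new ArrayList<>();

        MiniExecutable(String name) {
            this.name = name;
        }

        void addStatusListener(MiniStatusListener listener) {
            listeners.add(listener);
        }

        void removeStatusListener(MiniStatusListener listener) {
            listeners.remove(listener);
        }

        int getStatus() {
            return status;
        }

        void setStatus(int newStatus) {
            int oldStatus = status;
            status = newStatus;
            // Iterate over a copy so that a listener may remove itself safely.
            for (MiniStatusListener listener : new ArrayList<>(listeners)) {
                listener.statusChanged(name, oldStatus, newStatus);
            }
        }
    }

    public static void main(String[] args) {
        MiniExecutable job = new MiniExecutable("demo-job");
        job.addStatusListener((n, o, s) -> System.out.println(n + ": " + o + " -> " + s));
        job.setStatus(1); // submitted
        job.setStatus(2); // active
    }
}
```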

Task

A Task is an atomic unit of execution in cog-abstractions. It represents a generic Grid functionality including remote job execution, file transfer request, file operation, or information query. It has a unique identity, name, execution status, specification, and set of services (for remote execution).

The task identity helps in uniquely representing the task across the Grid. The task specification represents the actual attributes or parameters required for the execution of the Grid-centric task. The generalized specification can be extended for common Grid tasks such as remote job execution, file transfer, and information query. A Grid task also contains a set of Grid services that enable the remote execution of the task. For example, a job submission task contains a remote job execution service that will ultimately enable the execution of the task. Likewise, the file transfer task contains a source file server and a destination file server as its supporting services.

A Grid task contains the abstract elements required for any Grid functionality. However, the actual execution of a Grid task requires a specific abstract-to-protocol mapping that translates all the abstract components into backend protocol-specific entities. This translation is done by the Handlers (see the Handlers section). A task of a given type (execution, file transfer, or file operation) is submitted to an appropriate task handler (execution task handler, file transfer task handler, or file operation task handler respectively). Based on the provider attribute of the task, the handler provides the appropriate abstract-to-protocol transformation.


 public interface Task extends ExecutableObject {
     public static final int JOB_SUBMISSION = 1;
     public static final int FILE_TRANSFER = 2;
     public static final int INFORMATION_QUERY = 3;
     public static final int FILE_OPERATION = 4;
 
     public void setType(int type);
     public int getType();
 
     public void setProvider(String provider);
     public String getProvider();
 
     public void setService(int index, Service service);
     public void addService(Service service);
     public Service removeService(int index);
     public Collection removeAllServices();
     public void removeService(Collection collection);
     public Service getService(int index);
     public Collection getAllServices();
 
     public void setRequiredService(int value);
     public int getRequiredServices();
 
     public void setSpecification(Specification specification);
     public Specification getSpecification();
 
     public void setStdOutput(String output);
     public String getStdOutput();
 
     public void setStdError(String error);
     public String getStdError();
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 
     public void addOutputListener(OutputListener listener);
     public void removeOutputListener(OutputListener listener);
 
     public void toXML(File file) throws MarshalException;
     public String toString();
 
     public boolean isUnsubmitted();
     public boolean isActive();
     public boolean isCompleted();
     public boolean isSuspended();
     public boolean isFailed();
     public boolean isCanceled();
 
     public Calendar getSubmittedTime();
     public Calendar getCompletedTime();
 
 }

Specification

Every Grid Task has an associated Specification that dictates the objective of the task and the environment required to achieve the objective. The TaskHandler manages the tasks based on the parameters specified in the task specification.


 public interface Specification {
 
     public static final int JOB_SUBMISSION = 1;
     public static final int FILE_TRANSFER = 2;
     public static final int INFORMATION_QUERY = 3;
 
     public void setType(int type);
     public int getType();
 
     public void setSpecification(String specification);
     public String getSpecification();
 }

A task specification is a generalized concept and can be further extended as JobSpecification, FileSpecification, and InformationSpecification. We note that the specific parameters required in a task specification depend on the underlying Grid implementation used for the execution of the Task. For example, GT4 has several required parameters that are not supported by GT2 (and vice versa). However, the specification classes in cog-abstractions offer some commonly used attributes that can be extended or omitted based on the requirements of the task and specific Grid implementation.

The JobSpecification defines the common parameters needed for remote job execution independent of the low-level implementation. Implementation-specific parameters can be added to the specification as additional attributes.

 public interface JobSpecification extends Specification {
 
     public void setExecutable(String executable);
     public String getExecutable();
 
     public void setDirectory(String directory);
     public String getDirectory();
 
     public String setArgument(String argument);
     public String setArgument(Vector argument);
     public String getArgument(int index);
     public String getArgumentsAsString();
     public Vector getArgumentsAsVector();
     public void addArgument(String argument);
     public void addArgument(int index, String argument);
     // Return types are missing in the source listing; void is assumed here.
     public void removeArgument(int index);
     public void removeArgument(String argument);
 
     public void setStdOutput(String output);
     public String getStdOutput();
 
     public void setStdInput(String input);
     public String getStdInput();
 
     public void setStdError(String error);
     public String getStdError();
 
     public void setBatchJob(boolean bool);
     public boolean isBatchJob();
 
     public void setRedirected(boolean bool);
     public boolean isRedirected();
 
     public void setLocalExecutable(boolean bool);
     public boolean isLocalExecutable();
 
     public void setLocalInput(boolean bool);
     public boolean isLocalInput();
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 }
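To make the split between common parameters and implementation-specific attributes concrete, here is a minimal sketch. MiniJobSpec is a hypothetical stand-in that borrows a few method names from the listing above; it is not the CoG Kit implementation, and the attribute name "queue" is invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JobSpecSketch {

    // Hypothetical stand-in holding an executable, ordered arguments,
    // and free-form attributes for implementation-specific settings.
    static final class MiniJobSpec {
        private String executable;
        private final List<String> arguments = new ArrayList<>();
        private final Map<String, Object> attributes = new HashMap<>();

        void setExecutable(String executable) {
            this.executable = executable;
        }

        String getExecutable() {
            return executable;
        }

        void addArgument(String argument) {
            arguments.add(argument);
        }

        void addArgument(int index, String argument) {
            arguments.add(index, argument);
        }

        String getArgumentsAsString() {
            return String.join(" ", arguments);
        }

        void setAttribute(String name, Object value) {
            attributes.put(name, value);
        }

        Object getAttribute(String name) {
            return attributes.get(name);
        }
    }

    // Joins the executable and its arguments into one command line.
    static String commandLine(MiniJobSpec spec) {
        String args = spec.getArgumentsAsString();
        return args.isEmpty() ? spec.getExecutable() : spec.getExecutable() + " " + args;
    }

    public static void main(String[] args) {
        MiniJobSpec spec = new MiniJobSpec();
        spec.setExecutable("/bin/ls");
        spec.addArgument("-l");
        spec.addArgument("/tmp");
        spec.addArgument(1, "-a");            // insert between the two
        spec.setAttribute("queue", "batch");  // hypothetical backend-specific attribute
        System.out.println(commandLine(spec));
    }
}
```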

The FileTransferSpecification provides the commonly used attributes for file transfers between Grid resources. We note once again that not all attributes are supported by every Grid implementation.

 public interface FileTransferSpecification extends Specification {
 
     public void setSourceDirectory(String directory);
     public String getSourceDirectory();
 
     public void setDestinationDirectory(String directory);
     public String getDestinationDirectory();
 
     public void setSourceFile(String file);
     public String getSourceFile();
 
     public void setDestinationFile(String file);
     public String getDestinationFile();
 
     public void setSource(String source);
     public String getSource();
 
     public void setDestination(String destination);
     public String getDestination();
 
     public void setThirdParty(boolean bool);
     public boolean isThirdParty();
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 }

The FileOperationSpecification offers the functionality to invoke important operations on files hosted on remote Grid resources.

A list of all the supported operations and their corresponding arguments is presented in Table 1.


Table 1: Valid command names for FileOperationSpecifications.

Please note that the following variables represent Strings: directoryName, fileName, localFileName, remoteFileName, localDirectoryName, oldFileName, and newFileName. The mode argument is an Integer and force is a Boolean.

  • start
    • Initializes the connection to the file server
  • stop
    • Terminates the connection to the file server
  • pwd
    • Returns the current working directory of the file server as a String
  • cd <directoryName>
    • Sets the working directory of the file server
  • ls
    • Returns all files in the current directory of the file server as a Collection of GridFiles.
  • ls <directoryName>
    • Returns all files in the specified directory of the file server as a Collection of GridFiles
  • mkdir <directoryName>
    • Creates a directory on the file server with the specified name
  • rmdir <directoryName>
    • Deletes the given directory on the file server, only if it is empty.
  • rmdir <directoryName> <force>
    • Deletes the specified directory on the file server either if it is empty or if force==true
  • rmfile <fileName>
    • Deletes the given file from the file server
  • exists <fileName>
    • Returns a Boolean specifying if the file exists on the file server
  • isDirectory <directoryName>
    • Returns a Boolean specifying if the given directoryName is a valid directory on the file server
  • getfile <remoteFileName> <localFileName>
    • Transfers the remote file on the file server to the local file on the client machine
  • putfile <localFileName> <remoteFileName>
    • Transfers the local file on the client machine to the remote file on the file server
  • getdir <remoteDirectoryName> <localDirectoryName>
    • Transfers the remote directory on the file server to the local directory on the client machine
  • putdir <localDirectoryName> <remoteDirectoryName>
    • Transfers the local directory on the client machine to the remote directory on the file server
  • rename <oldFileName> <newFileName>
    • Changes the name of the file from the old name to the given new name
  • chmod <fileName> <mode>
    • Changes permissions of file on the file server; the mode argument is as per the Unix chmod command.


 public interface FileOperationSpecification extends Specification {
 
     public static String START = "start";
     public static String STOP = "stop";
     public static String PWD = "pwd";
     public static String CD = "cd";
     public static String LS = "ls";
     public static String MKDIR = "mkdir";
     public static String RMDIR = "rmdir";
     public static String RMFILE = "rmfile";
     public static String GETFILE = "getfile";
     public static String PUTFILE = "putfile";
     public static String GETDIR = "getdir";
     public static String PUTDIR = "putdir";
     public static String MGET = "mget";
     public static String MPUT = "mput";
     public static String RENAME = "rename";
     public static String CHMOD = "chmod";
     public static String EXISTS = "exists";
     public static String ISDIRECTORY = "isDirectory";
 
     public void setOperation(String operation);
     public String getOperation();
 
     public void setArgument(String arguments, int index);
     public int addArgument(String argument);
     public Collection getArguments();
     public String getArgument(int n);
     public int getArgumentSize();
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 }
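Because each command in Table 1 takes a fixed number of arguments, a client can check an operation before submitting it. The following sketch encodes the argument counts from Table 1 and validates a hypothetical MiniFileOperation against them. The class names and the isValid helper are assumptions made for this illustration and do not exist in cog-abstractions; mget and mput are left out because Table 1 does not list them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FileOperationSketch {

    // Operation name -> argument counts accepted according to Table 1.
    static final Map<String, List<Integer>> ARITY = new HashMap<>();
    static {
        for (String op : new String[] {"start", "stop", "pwd"}) {
            ARITY.put(op, Arrays.asList(0));
        }
        for (String op : new String[] {"cd", "mkdir", "rmfile", "exists", "isDirectory"}) {
            ARITY.put(op, Arrays.asList(1));
        }
        for (String op : new String[] {"getfile", "putfile", "getdir", "putdir", "rename", "chmod"}) {
            ARITY.put(op, Arrays.asList(2));
        }
        ARITY.put("ls", Arrays.asList(0, 1));    // with or without a directory
        ARITY.put("rmdir", Arrays.asList(1, 2)); // optional force flag
    }

    // Hypothetical stand-in for a file operation specification.
    static final class MiniFileOperation {
        final String operation;
        final List<String> arguments;

        MiniFileOperation(String operation, String... arguments) {
            this.operation = operation;
            this.arguments = Arrays.asList(arguments);
        }
    }

    // True if the operation is listed in Table 1 with this many arguments.
    static boolean isValid(MiniFileOperation op) {
        List<Integer> allowed = ARITY.get(op.operation);
        return allowed != null && allowed.contains(op.arguments.size());
    }

    public static void main(String[] args) {
        System.out.println(isValid(new MiniFileOperation("ls")));
        System.out.println(isValid(new MiniFileOperation("rmdir", "scratch", "true")));
        System.out.println(isValid(new MiniFileOperation("chmod", "data.txt")));
    }
}
```

The check covers only the number of arguments; whether a mode or a force flag is well formed would still have to be verified separately.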

Service

Every Grid Task has a set of remote Services that support the actual execution of the task. The service interface is a local representation of the remote Grid service. Every service has a provider attribute that specifies the technology and provider supported by that service. It also has a service contact and a security context specific to that provider.


 public interface Service {
     public static final int JOB_SUBMISSION = 1;
     public static final int FILE_TRANSFER = 2;
     public static final int INFORMATION_QUERY = 3;
     public static final int FILE_OPERATION = 4;
 
     public static final int JOB_SUBMISSION_SERVICE = 0;
     public static final int DEFAULT_SERVICE = 0;
     public static final int FILE_TRANSFER_SOURCE_SERVICE = 0;
     public static final int FILE_TRANSFER_DESTINATION_SERVICE = 1;
 
     public void setIdentity(Identity identity);
     public Identity getIdentity();
 
     public void setName(String name);
     public String getName();
 
     public void setProvider(String provider);
     public String getProvider();
 
     public void setType(int type);
     public int getType();
 
     public void setServiceContact(ServiceContact serviceContact);
     public ServiceContact getServiceContact();
 
     public void setSecurityContext(SecurityContext securityContext);
     public SecurityContext getSecurityContext();
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 }

Figure 1: A TaskGraph can represent multiple levels of a hierarchical DAG.


TaskGraph

A TaskGraph provides a building block for expressing complex dependencies between tasks. Advanced applications require mechanisms to execute client-side workflows that process the tasks based on user-defined control dependencies. Hence, the data structure representing the TaskGraph aggregates a set of ExecutableObjects (Tasks and TaskGraphs) and allows the user to define dependencies between these tasks. In graph theoretical terms, a TaskGraph provides the elements to express workflows as a hierarchical directed acyclic graph (see Figure 1). A TaskGraph can theoretically contain infinite levels of hierarchy. In practice, however, it is constrained by the availability of resources (memory) on a particular system.


 public interface TaskGraph extends ExecutableObject {
 
     public void add(ExecutableObject graphNode) throws Exception;
     public ExecutableObject remove(Identity id) throws Exception;
     public ExecutableObject get(Identity id);
 
     public ExecutableObject[] toArray();
     public Enumeration elements();
 
     public void setDependency(Dependency dependency);
     public Dependency getDependency();
     public void addDependency(Identity from, Identity to);
     public boolean removeDependency(Identity from, Identity to);
 
     public void setAttribute(String name, Object value);
     public Object getAttribute(String name);
     public Enumeration getAllAttributes();
 
     public int getSize();
     public boolean isEmpty();
     public boolean contains(Identity id);
 
     public int getUnsubmittedCount();
     public int getSubmittedCount();
     public int getActiveCount();
     public int getCompletedCount();
     public int getSuspendedCount();
     public int getResumedCount();
     public int getFailedCount();
     public int getCanceledCount();
 
     public void toXML(File file) throws MarshalException;
 
     public Calendar getSubmittedTime();
     public Calendar getCompletedTime();
 
     public void addChangeListener(ChangeListener listener);
     public void removeChangeListener(ChangeListener listener);
 }

Cog-abstractions provides two additional utility classes that specialize the functionality of the TaskGraph. The Set is a special type of TaskGraph with no dependencies. Intuitively, it represents a bag of tasks that can be executed in parallel. The Queue is another specialized TaskGraph that represents a first-in-first-out (FIFO) queue. The dependencies in a Queue are not set explicitly but are maintained implicitly based on the addition of a Task to the Queue.
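The difference between explicit dependencies in a TaskGraph and implicit ones in a Queue can be sketched as follows. MiniGraph and MiniQueue are hypothetical stand-ins keyed by plain string identifiers rather than Identity objects, and executionOrder uses a standard topological sort to derive one order that respects the dependencies. None of this is taken from the CoG Kit sources.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TaskGraphSketch {

    // Hypothetical graph: dependencies are declared explicitly.
    static class MiniGraph {
        // node id -> ids of the nodes that must complete first
        final Map<String, Set<String>> predecessors = new LinkedHashMap<>();

        final Set<String> node(String id) {
            return predecessors.computeIfAbsent(id, k -> new LinkedHashSet<>());
        }

        void add(String id) {
            node(id);
        }

        void addDependency(String from, String to) {
            node(from);
            node(to).add(from);
        }

        // Topological order (Kahn's algorithm); fails if there is a cycle.
        List<String> executionOrder() {
            Map<String, Integer> waiting = new LinkedHashMap<>();
            Deque<String> ready = new ArrayDeque<>();
            for (Map.Entry<String, Set<String>> e : predecessors.entrySet()) {
                waiting.put(e.getKey(), e.getValue().size());
                if (e.getValue().isEmpty()) {
                    ready.add(e.getKey());
                }
            }
            List<String> order = new ArrayList<>();
            while (!ready.isEmpty()) {
                String done = ready.remove();
                order.add(done);
                for (Map.Entry<String, Set<String>> e : predecessors.entrySet()) {
                    if (e.getValue().contains(done)
                            && waiting.merge(e.getKey(), -1, Integer::sum) == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
            if (order.size() != predecessors.size()) {
                throw new IllegalStateException("dependencies contain a cycle");
            }
            return order;
        }
    }

    // Hypothetical queue: each newly added node depends on the previous one.
    static class MiniQueue extends MiniGraph {
        private String last;

        @Override
        void add(String id) {
            boolean isNew = !predecessors.containsKey(id);
            Set<String> before = node(id);
            if (isNew) {
                if (last != null) {
                    before.add(last);
                }
                last = id;
            }
        }
    }

    public static void main(String[] args) {
        MiniGraph graph = new MiniGraph();
        graph.addDependency("stage-in", "compute");
        graph.addDependency("compute", "stage-out");
        System.out.println(graph.executionOrder());

        MiniQueue queue = new MiniQueue();
        queue.add("first");
        queue.add("second");
        System.out.println(queue.executionOrder());
    }
}
```

A Set in the sense described above corresponds to a MiniGraph on which only add is called: every node is ready immediately, so all of them could run in parallel.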

Status

Every ExecutableObject (Task or TaskGraph) has an associated execution status. An ExecutableObject can be in one of the following statuses: unsubmitted, submitted, active, suspended, resumed, failed, canceled, or completed. We note that not every status is supported by every Grid implementation. In other words, for some Grid implementations it may not be possible to suspend and resume remote execution.

It is easy to associate a simple Task with one of the above-mentioned statuses. For example, initially the task is unsubmitted; its status changes to submitted when it is handled by a handler; its status changes to active when it is being executed remotely; and so on. However, it is not apparent how a TaskGraph is mapped to one of the supported statuses. Cog-abstractions uses the following logic to map a TaskGraph to its appropriate status.


 public interface Status {
 
     public static final int UNSUBMITTED = 0;
     public static final int SUBMITTED = 1;
     public static final int ACTIVE = 2;
     public static final int SUSPENDED = 3;
     public static final int RESUMED = 4;
     public static final int FAILED = 5;
     public static final int CANCELED = 6;
     public static final int COMPLETED = 7;
     public static final int UNKNOWN = 9999;
 
     public abstract void setStatusCode(int status);
     public abstract int getStatusCode();
     public abstract void setPrevStatusCode(int status);
     public abstract int getPrevStatusCode();
     public abstract void setException(Exception exception);
     public abstract Exception getException();
     public abstract void setMessage(String message);
     public abstract String getMessage();
     public void setTime(Calendar time);
     public Calendar getTime();
 }
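The status codes and the previous-status accessor in the interface above can be exercised with a small sketch. MiniStatus, its transition helper, and the isTerminal check are hypothetical; in particular, treating canceled as a final state alongside failed and completed is an assumption of this sketch, not something this guide states.

```java
import java.util.Calendar;

class StatusSketch {

    // Codes copied from the Status interface above.
    static final int UNSUBMITTED = 0, SUBMITTED = 1, ACTIVE = 2, SUSPENDED = 3,
            RESUMED = 4, FAILED = 5, CANCELED = 6, COMPLETED = 7, UNKNOWN = 9999;

    // Human-readable name for a status code.
    static String name(int code) {
        switch (code) {
            case UNSUBMITTED: return "Unsubmitted";
            case SUBMITTED:   return "Submitted";
            case ACTIVE:      return "Active";
            case SUSPENDED:   return "Suspended";
            case RESUMED:     return "Resumed";
            case FAILED:      return "Failed";
            case CANCELED:    return "Canceled";
            case COMPLETED:   return "Completed";
            default:          return "Unknown";
        }
    }

    // Hypothetical status holder that remembers the previous code.
    static final class MiniStatus {
        private int code = UNSUBMITTED;
        private int prevCode = UNKNOWN;
        private Calendar time = Calendar.getInstance();

        // Records the old code, then switches to the new one.
        void transition(int newCode) {
            prevCode = code;
            code = newCode;
            time = Calendar.getInstance();
        }

        int getStatusCode() {
            return code;
        }

        int getPrevStatusCode() {
            return prevCode;
        }

        Calendar getTime() {
            return time;
        }

        // Assumption of this sketch: these three codes are final states.
        boolean isTerminal() {
            return code == FAILED || code == CANCELED || code == COMPLETED;
        }
    }

    public static void main(String[] args) {
        MiniStatus status = new MiniStatus();
        for (int next : new int[] {SUBMITTED, ACTIVE, COMPLETED}) {
            status.transition(next);
            System.out.println(name(status.getPrevStatusCode()) + " -> "
                    + name(status.getStatusCode()) + ", terminal=" + status.isTerminal());
        }
    }
}
```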

Handlers

Cog-abstractions contains the TaskHandler and the TaskGraphHandler to process a Task and a TaskGraph respectively. Once a Task or a TaskGraph is submitted to the appropriate handler, the handler interacts with the desired Grid implementation and accomplishes the necessary tasks. The handlers in cog-abstractions can be viewed as adaptors that translate the abstract definitions of a Task and TaskGraph into implementation-specific constructs understood by the backend Grid services. For example, a GT3 execution TaskHandler will extract the appropriate attributes from the execution Task and make the necessary calls to the remote Grid service factory, retrieve the Grid service handle, and interact with the newly created service instance. Symmetric translations would be done for other Grid implementations. Intuitively, a handler is specific to the backend implementation and is the only part of cog-abstractions that needs to be extended for supporting additional Grid implementations.


 public interface TaskHandler {
     public void setType(int type);
     public int getType();
 
     public void submit(Task task)
         throws
             IllegalSpecException,
             InvalidSecurityContextException,
             InvalidServiceContactException,
             TaskSubmissionException;
 
     public void suspend(Task task)
         throws InvalidSecurityContextException,
         TaskSubmissionException;
 
     public void resume(Task task)
         throws InvalidSecurityContextException,
         TaskSubmissionException;
 
     public void cancel(Task task)
         throws InvalidSecurityContextException,
         TaskSubmissionException;
 
     public void remove(Task task)
         throws ActiveTaskException;
     public Task[] getAllTasks();
     public Collection getActiveTasks();
     public Collection getFailedTasks();
     public Collection getCompletedTasks();
     public Collection getSuspendedTasks();
     public Collection getResumedTasks();
     public Collection getCanceledTasks();
 }

The TaskGraphHandler provides functionality similar to that of the task handler interface. However, it has the additional responsibility of enforcing the dependencies in the graph-like task sets submitted to it. It can be implemented as an advanced workflow engine that coordinates the execution of tasks on the corresponding Grid resources while honoring the user-defined dependencies.


 public interface TaskGraphHandler {
 
     public void submit(TaskGraph taskgraph)
        throws
        IllegalSpecException,
        InvalidSecurityContextException,
        InvalidServiceContactException,
        TaskSubmissionException;
 
     public void suspend()
        throws InvalidSecurityContextException,
        TaskSubmissionException;
 
     public void resume()
        throws InvalidSecurityContextException,
        TaskSubmissionException;
 
     public void cancel()
        throws InvalidSecurityContextException,
        TaskSubmissionException;
 
     public Task[] getAllTask();
     public Collection getActiveTasks();
     public Collection getFailedTasks();
     public Collection getCompletedTasks();
     public Collection getSuspendedTasks();
     public Collection getResumedTasks();
     public Collection getCanceledTasks();
 }
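One way to picture the dependency enforcement performed by a graph handler is to chain asynchronous work so that a task starts only after all of its predecessors have finished. The sketch below does this with java.util.concurrent.CompletableFuture. The submit helper, its two map parameters, and the task names are hypothetical, and the sketch assumes that tasks are listed with predecessors before their dependents. It is not how the CoG Kit handlers are implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class GraphHandlerSketch {

    // tasks: id -> work to run; predecessors: id -> ids that must finish first.
    // Assumes tasks are iterated with predecessors before their dependents.
    static CompletableFuture<Void> submit(Map<String, Runnable> tasks,
                                          Map<String, List<String>> predecessors) {
        Map<String, CompletableFuture<Void>> started = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> e : tasks.entrySet()) {
            List<CompletableFuture<Void>> waits = new ArrayList<>();
            List<String> before =
                    predecessors.getOrDefault(e.getKey(), Collections.<String>emptyList());
            for (String p : before) {
                CompletableFuture<Void> f = started.get(p);
                if (f == null) {
                    throw new IllegalArgumentException("predecessor not yet added: " + p);
                }
                waits.add(f);
            }
            // Completes when every predecessor has completed (at once if none).
            CompletableFuture<Void> ready =
                    CompletableFuture.allOf(waits.toArray(new CompletableFuture[0]));
            // If a predecessor fails, the dependent task is never run.
            started.put(e.getKey(), ready.thenRunAsync(e.getValue()));
        }
        return CompletableFuture.allOf(started.values().toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        Map<String, Runnable> tasks = new LinkedHashMap<>();
        tasks.put("stage-in", () -> log.add("stage-in"));
        tasks.put("compute", () -> log.add("compute"));
        tasks.put("stage-out", () -> log.add("stage-out"));

        Map<String, List<String>> predecessors = new HashMap<>();
        predecessors.put("compute", Arrays.asList("stage-in"));
        predecessors.put("stage-out", Arrays.asList("compute"));

        submit(tasks, predecessors).join(); // wait for the whole graph
        System.out.println(log);
    }
}
```

Tasks without a dependency between them may run concurrently, which matches the Set behavior described earlier; a strict chain such as the one in main behaves like a Queue.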

FileResource

An alternate model of abstraction for file operations, bypassing the task model, is also available for applications that desire direct interactions with the file hosting Grid servers. A FileResource provides all the necessary functionality to directly invoke remote file operations. Methods of a FileResource bear close correspondence with the commands outlined in Table 1.

 public interface FileResource{
 
     public static final String FTP = "ftp";
     public static final String GridFTP = "gridftp";
     public static final String WebDAV = "webdav";
     public static final String Local = "local";
 
     public String getProtocol();
 
     public void setServiceContact(ServiceContact serviceContact);
     public ServiceContact getServiceContact();
 
     public void setSecurityContext(SecurityContext securityContext);
     public SecurityContext getSecurityContext();
 
     public void start() throws IllegalHostException,
             InvalidSecurityContextException, GeneralException;
     public void stop() throws GeneralException;
     public boolean isStarted();
 
     public void setCurrentDirectory(String directoryName)
             throws DirectoryNotFoundException, GeneralException;
     public String getCurrentDirectory() throws GeneralException;
 
     public Collection list() throws GeneralException;
     public Collection list(String directoryName)
             throws DirectoryNotFoundException, GeneralException;
 
     public void createDirectory(String directoryName) throws GeneralException;
     public void deleteDirectory(String directoryName, boolean force)
             throws DirectoryNotFoundException, GeneralException;
     public void deleteFile(String fileName)
             throws FileNotFoundException, GeneralException;
 
     public void getFile(String remoteFileName, String localFileName)
             throws FileNotFoundException, GeneralException;
     public void putFile(String localFileName, String remoteFileName)
             throws FileNotFoundException, GeneralException;
 
     public void getDirectory(String remoteDirectoryName, String localDirectoryName)
             throws DirectoryNotFoundException, GeneralException;
     public void putDirectory(String localDirectoryName, String remoteDirectoryName)
             throws DirectoryNotFoundException, GeneralException;
 
     public void getMultipleFiles(String[] remoteFileNames, String[] localFileNames)
             throws FileNotFoundException, GeneralException;
     public void getMultipleFiles(String[] remoteFileNames, String localDirectoryName)
             throws FileNotFoundException, DirectoryNotFoundException,
             GeneralException;
     public void putMultipleFiles(String[] localFileNames, String[] remoteFileNames)
             throws FileNotFoundException, GeneralException;
     public void putMultipleFiles(String[] localFileNames, String remoteDirectoryName)
             throws FileNotFoundException, DirectoryNotFoundException,
             GeneralException;
 
     public void rename(String oldFileName, String newFileName)
             throws FileNotFoundException, GeneralException;
 
     public void changeMode(String fileName, int mode)
             throws FileNotFoundException, GeneralException;
     public void changeMode(GridFile gridFile)
             throws FileNotFoundException, GeneralException;
 
     public GridFile getGridFile(String fileName)
             throws FileNotFoundException, GeneralException;
 
     public boolean exists(String fileName)
             throws FileNotFoundException, GeneralException;
 
     public boolean isDirectory(String directoryName)
             throws GeneralException;
 
     public void submit(ExecutableObject commandWorkflow)
             throws IllegalSpecException, TaskSubmissionException;
 
     public void setAttribute(String name, Object value);
     public Enumeration getAllAttributes();
     public Object getAttribute(String name);
 }
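The start/stop life cycle and the force flag of deleteDirectory can be illustrated without a real server. InMemoryResource below is a hypothetical, in-memory stand-in that borrows a few method names from the interface above. It throws standard Java exceptions instead of the CoG Kit exception types, and its putFile takes a directory and a file name rather than local and remote paths.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FileResourceSketch {

    // Hypothetical in-memory "file server": directory name -> file names.
    static final class InMemoryResource {
        private final Map<String, Set<String>> directories = new HashMap<>();
        private boolean started;

        void start() {
            started = true;
        }

        void stop() {
            started = false;
        }

        boolean isStarted() {
            return started;
        }

        void createDirectory(String directoryName) {
            requireStarted();
            directories.putIfAbsent(directoryName, new TreeSet<String>());
        }

        void putFile(String directoryName, String fileName) {
            requireStarted();
            lookup(directoryName).add(fileName);
        }

        // Mirrors "rmdir <directoryName> <force>" from Table 1:
        // deletes if the directory is empty or if force is true.
        void deleteDirectory(String directoryName, boolean force) {
            requireStarted();
            Set<String> files = lookup(directoryName);
            if (!files.isEmpty() && !force) {
                throw new IllegalStateException("directory not empty: " + directoryName);
            }
            directories.remove(directoryName);
        }

        boolean isDirectory(String directoryName) {
            requireStarted();
            return directories.containsKey(directoryName);
        }

        private Set<String> lookup(String directoryName) {
            Set<String> files = directories.get(directoryName);
            if (files == null) {
                throw new IllegalArgumentException("no such directory: " + directoryName);
            }
            return files;
        }

        private void requireStarted() {
            if (!started) {
                throw new IllegalStateException("call start() first");
            }
        }
    }

    public static void main(String[] args) {
        InMemoryResource resource = new InMemoryResource();
        resource.start();
        resource.createDirectory("results");
        resource.putFile("results", "run1.out");
        resource.deleteDirectory("results", true); // forced, although not empty
        System.out.println(resource.isDirectory("results"));
        resource.stop();
    }
}
```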

GridFile

A GridFile is an abstract representation of a remote file or directory. It represents the basic properties of a file such as size, name, modification date, access permissions etc. This abstraction is a passive information carrier and cannot be used to modify the properties or contents of a remote resource directly. It can be used as an input to the file resource abstraction to change access permissions of a remote resource.


 public interface GridFile {
 
     public static final byte UNKNOWN_TYPE = 0;
     public static final byte FILE_TYPE = 1;
     public static final byte DIRECTORY_TYPE = 2;
     public static final byte SOFTLINK_TYPE = 3;
     public static final byte DEVICE_TYPE = 4;
 
     public void setSize(long size);
     public long getSize();
 
     public void setName(String name);
     public String getName();
 
     public void setAbsolutePathName(String name);
     public String getAbsolutePathName();
 
     public void setLastModified(String date);
     public String getLastModified();
 
     public void setFileType(byte type);
     public byte getFileType();
 
     public boolean isFile();
     public boolean isDirectory();
     public boolean isSoftLink();
     public boolean isDevice();
 
     public void setMode(String mode);
     public String getMode();
 
     public void setUserPermissions(Permissions userPermissions);
     public Permissions getUserPermissions();
 
     public void setGroupPermissions(Permissions groupPermissions);
     public Permissions getGroupPermissions();
 
     public void setAllPermissions(Permissions allPermissions);
     public Permissions getAllPermissions();
 
     public boolean userCanRead();
     public boolean userCanWrite();
     public boolean userCanExecute();
 
     public boolean groupCanRead();
     public boolean groupCanWrite();
     public boolean groupCanExecute();
 
     public boolean allCanRead();
     public boolean allCanWrite();
     public boolean allCanExecute();
 }
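GridFile exposes the mode as a String and the permissions as per-class boolean flags. The following is a minimal sketch of how the two views could relate, assuming the mode is a three-digit octal string such as "754" (the guide does not specify the format). ModeStringSketch and its methods are hypothetical helpers and not part of the Java CoG Kit.

```java
// Hypothetical helper, not part of the CoG Kit API.
class ModeStringSketch {

    // Render one octal digit (0-7) as an "rwx" triplet.
    static String triplet(int digit) {
        return ((digit & 4) != 0 ? "r" : "-")
             + ((digit & 2) != 0 ? "w" : "-")
             + ((digit & 1) != 0 ? "x" : "-");
    }

    // Render a three-digit octal mode (user, group, all) as nine flags.
    static String render(String mode) {
        if (mode == null || !mode.matches("[0-7]{3}")) {
            throw new IllegalArgumentException("expected three octal digits: " + mode);
        }
        StringBuilder flags = new StringBuilder();
        for (char c : mode.toCharArray()) {
            flags.append(triplet(c - '0'));
        }
        return flags.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("754")); // prints rwxr-xr--
    }
}
```

Under this assumption, the first triplet corresponds to userCanRead(), userCanWrite(), and userCanExecute(), the second to the group methods, and the third to the all methods.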

Permissions

The Permissions interface provides a means to get and set the read, write, and execute permissions for one class of users (user, group, or all). It is used together with the GridFile and FileResource abstractions.

 public interface Permissions{
 
     public void setRead(boolean canRead);
     public boolean getRead();
 
     public void setWrite(boolean canWrite);
     public boolean getWrite();
 
     public void setExecute(boolean canExecute);
     public boolean getExecute();
 
     public String toString();
 }
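As an illustration of the interface above, here is a minimal sketch of one possible implementation. The interface is repeated locally so that the example compiles on its own. The class name SimplePermissions and the "rwx" rendering in toString() are assumptions made for this example; the implementation shipped with the Java CoG Kit may differ.

```java
// Local copy of the Permissions interface shown above.
interface Permissions {
    void setRead(boolean canRead);
    boolean getRead();
    void setWrite(boolean canWrite);
    boolean getWrite();
    void setExecute(boolean canExecute);
    boolean getExecute();
    String toString();
}

// Hypothetical implementation; name and toString() format are assumptions.
class SimplePermissions implements Permissions {
    private boolean read;
    private boolean write;
    private boolean execute;

    public void setRead(boolean canRead) { read = canRead; }
    public boolean getRead() { return read; }

    public void setWrite(boolean canWrite) { write = canWrite; }
    public boolean getWrite() { return write; }

    public void setExecute(boolean canExecute) { execute = canExecute; }
    public boolean getExecute() { return execute; }

    // Render the three flags in the familiar "rwx" style.
    @Override
    public String toString() {
        return (read ? "r" : "-") + (write ? "w" : "-") + (execute ? "x" : "-");
    }
}

class PermissionsDemo {
    public static void main(String[] args) {
        Permissions user = new SimplePermissions();
        user.setRead(true);
        user.setExecute(true);
        System.out.println(user); // prints r-x
    }
}
```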

Programmer’s Guide

  1. How to execute a remote job execution task (6.1)
  2. How to execute a file transfer task (6.2)
  3. How to execute a file operation task (6.3)
  4. How to execute a simple task graph (DAG) (6.4)
  5. How to execute a hierarchical task graph (6.5)
  6. How to checkpoint a task (6.6)
  7. How to execute a checkpointed task (6.7)
  8. How to checkpoint a task graph (6.8)
  9. How to resume a checkpointed task graph (6.9)
 10. How to invoke remote file operations without adhering to the task model (6.10)
 11. How to run a native Condor job with the CoG Kit abstractions (6.11)
 12. How to integrate a new provider module in the Java CoG Kit (6.12)



How to execute a remote job execution task

Executing a remote job becomes extremely simple with cog-abstractions. To begin with, create a Task with the appropriate attributes.

 Task task = new TaskImpl("myTestTask", Task.JOB_SUBMISSION);
 
 // Set the desired provider. Default options are
 // GT2, GT3.0.2, GT3.2.0, GT3.2.1, GT4.0.0, Condor, or SSH
 //
 task.setProvider("GT4.0.0");

Note: the provider GT4.0.0 works best with GT4.0.1. Due to changes in several core Globus services, we do not recommend that you use GT4.0.0 itself. However, the provider name is still GT4.0.0.

Then, create a JobSpecification for the task and set the appropriate attributes as per the task requirements.

 JobSpecification spec = new JobSpecificationImpl();
 
 // Set the location and name of the executable.
 // If the executable is a local executable, then
 // spec.setLocalExecutable(true)
 //
 spec.setExecutable("/bin/ls");
 
 /* Set the arguments (if any)
    for the executable
 */
 spec.addArguments("-l");
 spec.addArguments("-a");
 
 /* Set the name of the file which serves
    as the input to the executable
 
    If the input file needs to be redirected
    from the local machine, then
    spec.setLocalInput(true)
 */
 spec.setStdInput("abstractions-testInput");
 
 /* Set the name of the file to which the remote
    output must be stored in.
 
    If the remote output needs to be redirected
    to the local machine, then
    spec.setRedirected(true)
 
    If the remote output needs to be manipulated at
    the local machine rather than storing it in a
    file, then
    spec.setRedirected(true);
    spec.setStdOutput(null);
    The output is now available from
    task.getStdOutput(); and can be used
    or displayed as desired.
 */
 spec.setStdOutput("abstractions-testOutput");
 
 /* Set the execution mode of the job */
 spec.setBatchJob(true);
 
 /* Add additional attributes that are not
    provided by default. These add on
    attributes will be considered by the
    handler only if it supports it.
 */
 spec.setAttribute("count","546");
 
 /* Assign this specification to the task */
 task.setSpecification(spec);

Next, assign an appropriate service to the task with a provider, service contact, and security context. This step assumes you have a valid user certificate obtained from an appropriate certificate authority.


 Service service = new ServiceImpl(Service.JOB_SUBMISSION);
 
 /* Assign a provider to this service.
    The provider assigned to the service
    must be the same as that assigned to the task.
 */
 service.setProvider("GT4.0.0");
 
 /* Retrieve the appropriate security context
    for this provider from the AbstractionFactory class
 */
 SecurityContext securityContext =
   AbstractionFactory.newSecurityContext("GT4.0.0");
 
 /* Assign the default credentials
    available as a valid proxy certificate
    whose location is specified in the
    cog.properties file present in the
    HOME/.globus directory
 
    To assign non-default credentials
    create a GSSCredential and pass
    this GSSCredential as the argument
    instead of null
 */
 securityContext.setCredentials(null);
 
 /* Assign this security credential to the service */
 service.setSecurityContext(securityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address
 */
 ServiceContact serviceContact =
     new ServiceContactImpl(
     "https://127.0.0.1:8443/wsrf/services/ManagedJobFactoryService");
 
 /* Assign this service contact with the service */
 service.setServiceContact(serviceContact);
 
 /* Assign the service to the task */
 task.setService(Service.JOB_SUBMISSION_SERVICE, service);

Next, create a TaskHandler and submit the task for execution.


     try {
       handler.submit(task);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

To monitor the status of the task (desirable for most interactive tasks), the monitoring class must implement the StatusListener interface and, before the task is submitted to a handler, subscribe to the task's status changes.


 task.addStatusListener(this);

If registered to listen to the status notification of the task, implement the statusChanged() function.


 public void statusChanged(StatusEvent event)
 {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
         + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED) {
       /* Makes sense if
          spec.setRedirected(true);
          spec.setStdOutput(null);
       */
       logger.debug("Output = "
           + task.getStdOutput());
       System.exit(0);
     }
 }
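The reason for subscribing before submission can be sketched with a small stand-in for the listener mechanism. The types below (Listener, FakeTask) and the status names are hypothetical and only imitate the pattern; they are not the CoG Kit classes. A listener that subscribes after submission misses every change that happened earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are not the CoG Kit classes.
class ListenerSketch {

    interface Listener {
        void statusChanged(String newStatus);
    }

    static class FakeTask {
        private final List<Listener> listeners = new ArrayList<>();

        void addStatusListener(Listener listener) {
            listeners.add(listener);
        }

        // Notify every listener that is registered at the time of the change.
        void setStatus(String newStatus) {
            for (Listener listener : listeners) {
                listener.statusChanged(newStatus);
            }
        }
    }

    // Returns the status changes seen by a listener that subscribes
    // either before or after the (simulated) submission.
    static List<String> observe(boolean subscribeBeforeSubmit) {
        List<String> seen = new ArrayList<>();
        FakeTask task = new FakeTask();
        if (subscribeBeforeSubmit) {
            task.addStatusListener(seen::add);
        }
        task.setStatus("SUBMITTED"); // simulated submission
        if (!subscribeBeforeSubmit) {
            task.addStatusListener(seen::add);
        }
        task.setStatus("ACTIVE");
        task.setStatus("COMPLETED");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe(true));
        System.out.println(observe(false));
    }
}
```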

How to execute a file transfer task

Executing a file transfer is extremely simple with cog-abstractions. To begin with, create a Task with the appropriate attributes.


 Task task = new TaskImpl("myTestTask", Task.FILE_TRANSFER);
 
 /* Since a file transfer task has two providers:
    the source provider and the destination provider,
    we do not set the provider attribute of the task.
 */

Then, create a FileTransferSpecification for the task and set the appropriate attributes as per the task requirements.


 FileTransferSpecification spec = new FileTransferSpecificationImpl();
 
 /* Set the source and destination files */
 spec.setSourceDirectory("/home/username");
 spec.setSourceFile("sourceFile");
 
 spec.setDestinationDirectory("/home/username");
 spec.setDestinationFile("destinationFile");
 
 /* Assign this specification to the task */
 task.setSpecification(spec);

Next, assign the appropriate source and destination services to the task.


 Service sourceService = new ServiceImpl(Service.FILE_TRANSFER);
 
 /* Assign a provider to this service.
 */
 sourceService.setProvider("GridFTP");
 
 /* Retrieve the appropriate security context
    for this provider from the AbstractionFactory class
 */
 SecurityContext sourceSecurityContext =
   AbstractionFactory.newSecurityContext("GridFTP");
 
 /* Assign the default credentials
    available as a valid proxy certificate
    whose location is specified in the
    cog.properties file present in the
    HOME/.globus directory
 
    To assign non-default credentials
    create a GSSCredential and pass
    this GSSCredential as the argument
    instead of null
 */
 sourceSecurityContext.setCredentials(null);
 
 /* Assign this security credential to the source service */
 sourceService.setSecurityContext(sourceSecurityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address
 */
 ServiceContact sourceServiceContact =
     new ServiceContactImpl();
 
 sourceServiceContact.setHost("hot.anl.gov");
 sourceServiceContact.setPort(2811);
 
 /* Assign this service contact with the service */
 sourceService.setServiceContact(sourceServiceContact);
 
 /* Assign the service to the task */
 task.setService(Service.FILE_TRANSFER_SOURCE_SERVICE, sourceService);
 
 /* Create a destination file transfer service */
 Service destinationService = new ServiceImpl(Service.FILE_TRANSFER);
 
 /* Assign a provider to this service.
    The destination provider may differ from
    the source provider.
 */
 destinationService.setProvider("GridFTP");
 
 /* Retrieve the appropriate security context
    for this provider from the AbstractionFactory class
 */
 SecurityContext destinationSecurityContext =
   AbstractionFactory.newSecurityContext("GridFTP");
 
 /* Assign the default credentials
    available as a valid proxy certificate
    whose location is specified in the
    cog.properties file present in the
    HOME/.globus directory
 
    To assign non-default credentials
    create a GSSCredential and pass
    this GSSCredential as the argument
    instead of null
 */
 destinationSecurityContext.setCredentials(null);
 
 /* Assign this security credential to the destination service */
 destinationService.setSecurityContext(destinationSecurityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address
 */
 ServiceContact destinationServiceContact =
     new ServiceContactImpl();
 
 destinationServiceContact.setHost("cold.anl.gov");
 destinationServiceContact.setPort(2811);
 
 /* Assign this service contact with the service */
 destinationService.setServiceContact(destinationServiceContact);
 
 /* Assign the service to the task */
 task.setService(Service.FILE_TRANSFER_DESTINATION_SERVICE, destinationService);

Next, create a TaskHandler and submit the task for execution.


     try {
       handler.submit(task);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

To monitor the status of the task (desirable for most interactive tasks), the monitoring class must implement the StatusListener interface and, before the task is submitted to a handler, subscribe to the task's status changes.


 task.addStatusListener(this);

If registered to listen to the status notification of the task, implement the statusChanged() function.


 public void statusChanged(StatusEvent event) {
 
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
        + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED ||
         status.getStatusCode() == Status.FAILED) {
       logger.info("Task Done");
       System.exit(1);
     }
 }

How to execute a file operation task

File operation tasks are executed slightly differently from other tasks. Job execution and file transfer tasks are submitted independently of other tasks and therefore exhibit connectionless behavior between the client machine and the remote server. File operation tasks, on the other hand, depend on each other and exhibit connection-oriented behavior between the client machine and the remote server. For example, let's assume the client executes a file operation task to “start” a remote connection. The client then executes a task to “list” all the files in the current directory. The two tasks are not independent: the second task must be executed in the context of the connection established by the first task.

Therefore, to execute connection-oriented file operation tasks, we adopt a notion of “sessionID”. To begin with, we create a Task with appropriate attributes.


 Task task = new TaskImpl("myFirstTask", Task.FILE_OPERATION);
 
 /* Set the desired provider. Default options are
    GridFTP, FTP, or WebDAV
 */
 task.setProvider("gridftp");

Then, create a FileOperationSpecification for the task and set the appropriate attributes as per the task requirements.


 FileOperationSpecification spec = new FileOperationSpecificationImpl();
 
 /* Set the operation and the corresponding arguments.
    In our example, the operation is "start" since we want to
    establish a new connection with the file server. We note that
    "start" is ALWAYS the first operation to be executed in any
    sequence of related file operations.
 */
 spec.setOperation(FileOperationSpecification.START);
 
 /* The "start" operation has no arguments */
 
 /* Assign this specification to the task */
 task.setSpecification(spec);

Next, assign an appropriate service to the task with a provider, service contact, and security context.


 Service service = new ServiceImpl(Service.FILE_OPERATION);
 
 /* Assign a provider to this service.
    The provider assigned to the service
    must be the same as that assigned to the task.
 */
 service.setProvider("gridftp");
 
 /* Retrieve the appropriate security context
    for this provider from the AbstractionFactory class
 */
 SecurityContext securityContext =
   AbstractionFactory.newSecurityContext("gridftp");
 
 /* Assign the default credentials
    available as a valid proxy certificate
    whose location is specified in the
    cog.properties file present in the
    HOME/.globus directory
 
    To assign non-default credentials
    create a GSSCredential and pass
    this GSSCredential as the argument
    instead of null
 */
 securityContext.setCredentials(null);
 
 /* Assign this security credential to the service */
 service.setSecurityContext(securityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address
 */
 ServiceContact serviceContact =
     new ServiceContactImpl("hot.anl.gov:2119");
 
 /* Assign this service contact with the service */
 service.setServiceContact(serviceContact);
 
 /* Assign the service to the task */
 task.setService(Service.DEFAULT_SERVICE, service);

Next, create a TaskHandler and submit the task for execution.


     try {
       handler.submit(task);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

Since file operation tasks are always interactive tasks, we monitor the status of the task. The monitoring class must implement the StatusListener interface and, before the task is submitted to a handler, subscribe to the task's status changes.


 task.addStatusListener(this);

The monitoring class listens to the status notifications of the task by implementing the statusChanged() method.


 public void statusChanged(StatusEvent event)
 {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
         + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED) {
       /* The output of the "start" operation is
          a sessionID. This sessionID can be used
          for all successive file operation tasks
          to invoke operations with context to this
          newly established connection.
       */
 
       logger.debug("SessionID = "
           + task.getAttribute("output"));
     }
 }

We note that rather than extracting the output from task.getStdOutput() (which returns a String), we retrieve the output as task.getAttribute("output") (which returns an Object). The output Object can then be cast to the target class as specified in Table 1.

Once we have obtained the connection sessionID, we can submit any file operation specified in Table 1. For example, let's execute the “list directoryName” operation. We create a Task with appropriate attributes.


 Task task = new TaskImpl("mySecondTask", Task.FILE_OPERATION);
 
 /* Set the desired provider. Default options are
    GridFTP, FTP, or WebDAV
 */
 task.setProvider("gridftp");

Then, create a FileOperationSpecification for the task and set the appropriate attributes as per the task requirements.


 FileOperationSpecification spec = new FileOperationSpecificationImpl();
 
 /* Set the operation and the corresponding arguments.
    In our example, the operation is "list directoryName".
 */
 spec.setOperation(FileOperationSpecification.LS);
 
 /* We then set the arguments for this operation */
 spec.addArgument("/home/amin/test");
 
 /* Assign this specification to the task */
 task.setSpecification(spec);

Since we have already retrieved a sessionID for this connection, we do not need to create a new Service for this task. Instead, we simply specify the sessionID.


 /* Specify the sessionID that was obtained after
    executing the "start" operation. We note here the
    importance of the initial "start" operation. Without
    that operation, we will not get any sessionID.
 */
 task.setAttribute("sessionID", sessionID);
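The role of the sessionID can be sketched with a small stand-alone registry. This is only an illustration of the connection-oriented idea; SessionSketch, its methods, and the "session-N" identifier format are hypothetical and do not reflect how the Java CoG Kit manages connections internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session registry; not the CoG Kit implementation.
class SessionSketch {
    private final Map<String, String> hostBySession = new HashMap<>();
    private int nextId = 1;

    // "start": open a connection and hand back its sessionID.
    String start(String host) {
        String sessionID = "session-" + nextId++;
        hostBySession.put(sessionID, host);
        return sessionID;
    }

    // "list": only valid within a session created by start().
    String list(String sessionID, String directory) {
        String host = hostBySession.get(sessionID);
        if (host == null) {
            throw new IllegalStateException("no such session: " + sessionID);
        }
        return "ls " + directory + " on " + host;
    }

    // Close the connection; the sessionID becomes invalid afterwards.
    void stop(String sessionID) {
        hostBySession.remove(sessionID);
    }

    public static void main(String[] args) {
        SessionSketch server = new SessionSketch();
        String id = server.start("hot.anl.gov");
        System.out.println(server.list(id, "/home/amin/test"));
        server.stop(id);
    }
}
```

Every operation after "start" carries the sessionID, so the server side can find the connection it belongs to, and an operation without a valid sessionID is rejected.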

Next, create a TaskHandler and submit the task for execution.


     try {
       handler.submit(task);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

Once again we monitor the status of the task. The monitoring class must implement the StatusListener interface and, before the task is submitted to a handler, subscribe to the task's status changes.


 task.addStatusListener(this);

The monitoring class listens to the status notifications of the task by implementing the statusChanged() method.


 public void statusChanged(StatusEvent event)
 {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
         + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED) {
       /* The output of the "list" operation is
          a Collection of GridFile objects. This Collection is
          manipulated as desired.
       */
     }
 }

Table 1 lists all valid operations, their input arguments, and their corresponding outputs.

How to execute a simple task graph (DAG)

To create a TaskGraph, we assume that we have created three tasks: task1, task2, and task3. Instructions for creating job submission and file transfer tasks are available in Sections 6.1 and 6.2 respectively. We then create a TaskGraph and add a dependency between these tasks.


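 /* Create the TaskGraph. The implementation class TaskGraphImpl
    is assumed here, following the naming of TaskImpl and ServiceImpl.
 */
 TaskGraph tg = new TaskGraphImpl();
 
 /* Give a convenient name to the TaskGraph */
 tg.setName("testGraph");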
 
 /* Add the tasks to the TaskGraph */
 tg.add(task1);
 tg.add(task2);
 tg.add(task3);
 
 /* Add dependencies between these tasks.
 
    Dependency is added as
    task1 --> task2 --> task3.
 
    This implies task1 is executed before task2
    and task2 is executed before task3.
 */
 tg.addDependency(task1,task2);
 tg.addDependency(task2,task3);
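What the dependencies imply for the execution order can be sketched with a standard topological sort (Kahn's algorithm). This is an illustration of the semantics only, under the assumption that a node becomes runnable once all of its predecessors have finished; it is not the scheduling code of the TaskGraphHandler, and DagOrderSketch is a hypothetical class that uses plain strings in place of tasks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; tasks are represented by their names.
class DagOrderSketch {
    private final Map<String, List<String>> successors = new LinkedHashMap<>();
    private final Map<String, Integer> predecessorCount = new LinkedHashMap<>();

    void add(String node) {
        successors.putIfAbsent(node, new ArrayList<>());
        predecessorCount.putIfAbsent(node, 0);
    }

    // "from" must complete before "to" may start.
    void addDependency(String from, String to) {
        add(from);
        add(to);
        successors.get(from).add(to);
        predecessorCount.merge(to, 1, Integer::sum);
    }

    // Kahn's algorithm: repeatedly run nodes with no unfinished predecessors.
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(predecessorCount);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String next : successors.get(node)) {
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != remaining.size()) {
            throw new IllegalStateException("dependencies contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        DagOrderSketch tg = new DagOrderSketch();
        tg.addDependency("task1", "task2");
        tg.addDependency("task2", "task3");
        System.out.println(tg.executionOrder()); // prints [task1, task2, task3]
    }
}
```

The order in which nodes are added does not matter; only the dependencies determine the result. A cyclic set of dependencies has no valid order, which is why the graph must be acyclic.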

Next, create a TaskGraphHandler and submit the task for execution.


     try {
       handler.submit(tg);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

To monitor the status of the task graph (desirable for most interactive task graphs), the monitoring class must implement the StatusListener interface and, before the task graph is submitted to a handler, subscribe to the task graph's status changes.


 tg.addStatusListener(this);

If registered to listen to the status notification of the task graph, implement the statusChanged() function.


 public void statusChanged(StatusEvent event) {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
        + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED ||
         status.getStatusCode() == Status.FAILED) {
       logger.info("Task Graph Done");
       System.exit(1);
     }
 }

How to execute a hierarchical task graph

To create a hierarchical TaskGraph, we assume that we have created three tasks and one TaskGraph: task1, task2, task3, and tg. Instructions for creating job submission and file transfer tasks and simple TaskGraphs are available in Sections 6.1, 6.2, and 6.4 respectively. We then create a TaskGraph and add a dependency between these ExecutableObjects.


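 /* Create the hierarchical TaskGraph. The implementation class
    TaskGraphImpl is assumed here, following the naming of TaskImpl
    and ServiceImpl.
 */
 TaskGraph htg = new TaskGraphImpl();
 
 /* Give a convenient name to the TaskGraph */
 htg.setName("testGraph");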
 
 /* Add the ExecutableObjects to the TaskGraph */
 htg.add(task1);
 htg.add(task2);
 htg.add(task3);
 htg.add(tg);
 
 /* Add dependencies between these ExecutableObjects.
 
    Dependency is added as
    task1 --> task2 --> task3 --> tg.
 
    This implies task1 is executed before task2
    and task2 is executed before task3, and
    task3 is executed before TaskGraph tg.
 */
 htg.addDependency(task1,task2);
 htg.addDependency(task2,task3);
 htg.addDependency(task3,tg);
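The hierarchy works because a task graph is itself an executable object and can therefore be a node of another graph. A minimal sketch of this composite idea follows. The types Executable, Leaf, and Graph are hypothetical stand-ins, and for brevity the sketch assumes a linear chain in which members run in the order they were added.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types; the names are hypothetical and not the CoG Kit classes.
class HierarchySketch {

    interface Executable {
        void run(List<String> log);
    }

    // A single task: running it records its name.
    static class Leaf implements Executable {
        private final String name;

        Leaf(String name) {
            this.name = name;
        }

        public void run(List<String> log) {
            log.add(name);
        }
    }

    // A graph is itself an Executable, so graphs can contain graphs.
    static class Graph implements Executable {
        private final List<Executable> members = new ArrayList<>();

        Graph add(Executable member) {
            members.add(member);
            return this;
        }

        // Assumes members were added in dependency order (a linear chain).
        public void run(List<String> log) {
            for (Executable member : members) {
                member.run(log);
            }
        }
    }

    static List<String> demo() {
        Graph tg = new Graph().add(new Leaf("inner1")).add(new Leaf("inner2"));
        Graph htg = new Graph()
            .add(new Leaf("task1"))
            .add(new Leaf("task2"))
            .add(new Leaf("task3"))
            .add(tg);
        List<String> log = new ArrayList<>();
        htg.run(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [task1, task2, task3, inner1, inner2]
    }
}
```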

Next, create a TaskGraphHandler and submit the task for execution.


     try {
       handler.submit(htg);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception",ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception",tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception",ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception",isce);
       System.exit(1);
     }

To monitor the status of the task graph (desirable for most interactive task graphs), the monitoring class must implement the StatusListener interface and, before the task graph is submitted to a handler, subscribe to the task graph's status changes.


 htg.addStatusListener(this);

If registered to listen to the status notification of the task graph, implement the statusChanged() function.


 public void statusChanged(StatusEvent event) {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
        + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED ||
         status.getStatusCode() == Status.FAILED) {
       logger.info("Task Graph Done");
       System.exit(1);
     }
 }

How to checkpoint a task

We assume you have created a Grid task as described in Sections 6.1 and 6.2. To checkpoint this task, invoke the “marshal” method of the TaskMarshaller class, supplying it with the task and the checkpoint file.


 try {
     File xmlFile = new File("Task.xml");
     xmlFile.createNewFile();
     TaskMarshaller.marshal(task, xmlFile);
 } catch (Exception e) {
    logger.error("Cannot marshal the task", e);
 }
 

How to execute a checkpointed task

We assume the Grid task is checkpointed in an XML file called “Task.xml”. First, unmarshal the checkpointed file into a task object.


 try {
     File xmlFile = new File("Task.xml");
     task = TaskUnmarshaller.unmarshal(xmlFile);
 } catch (Exception e) {
   logger.error("Cannot unmarshal task", e);
 }
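The marshal and unmarshal steps form a round trip: state is written to an XML file and later read back into an equivalent object. The sketch below illustrates that round trip with java.util.Properties from the Java standard library. It is not the XML format used by TaskMarshaller, and the property names "name" and "status" are assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Illustration of an XML checkpoint round trip; not the TaskMarshaller format.
class CheckpointSketch {

    // Write the state to an XML file (the "marshal" step).
    static void save(Properties state, Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            state.storeToXML(out, "task checkpoint");
        }
    }

    // Read the state back from the XML file (the "unmarshal" step).
    static Properties load(Path file) throws IOException {
        Properties state = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            state.loadFromXML(in);
        }
        return state;
    }

    // Save to a temporary file, load again, and clean up.
    static Properties roundTrip(Properties state) {
        try {
            Path file = Files.createTempFile("task", ".xml");
            try {
                save(state, file);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties state = new Properties();
        state.setProperty("name", "myTestTask");
        state.setProperty("status", "ACTIVE");
        System.out.println(roundTrip(state).getProperty("status")); // prints ACTIVE
    }
}
```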

Next, create a TaskHandler and submit the task for execution.


     try {
       handler.submit(task);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

To monitor the status of the task (desirable for most interactive tasks), the monitoring class must implement the StatusListener interface and, before the task is submitted to a handler, subscribe to the task's status changes.


 task.addStatusListener(this);

If registered to listen to the status notification of the task, implement the statusChanged() function.


 public void statusChanged(StatusEvent event) {
 
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
        + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED ||
         status.getStatusCode() == Status.FAILED) {
       logger.info("Task Done");
       System.exit(1);
     }
 }

How to checkpoint a task graph

We assume you have created a task graph as described in Sections 6.4 and 6.5. To checkpoint this task graph, invoke the "marshal" method of the TaskGraphMarshaller class, supplying it with the task graph and the checkpoint file.


 try {
     File xmlFile = new File("TaskGraph.xml");
     xmlFile.createNewFile();
     TaskGraphMarshaller.marshal(taskGraph, xmlFile);
 } catch (Exception e) {
    logger.error("Cannot marshal the task graph", e);
 }

How to resume a checkpointed task graph

We assume the task graph is checkpointed in an XML file called “TaskGraph.xml”. First, unmarshal the checkpointed file into a task graph object.


 try {
     File xmlFile = new File("TaskGraph.xml");
     taskGraph = TaskGraphUnmarshaller.unmarshal(xmlFile);
 } catch (Exception e) {
   logger.error("Cannot unmarshal task graph", e);
 }

Next, create a TaskGraphHandler and submit the task graph for execution.


     try {
       handler.submit(taskGraph);
     } catch (InvalidSecurityContextException ise) {
       logger.error("Security Exception", ise);
       System.exit(1);
     } catch (TaskSubmissionException tse) {
       logger.error("TaskSubmission Exception", tse);
       System.exit(1);
     } catch (IllegalSpecException ispe) {
       logger.error("Specification Exception", ispe);
       System.exit(1);
     } catch (InvalidServiceContactException isce) {
       logger.error("Service Contact Exception", isce);
       System.exit(1);
     }

This will initiate the re-execution of all previously uncompleted elements of the task graph. To monitor the status of the task graph, the monitoring class must implement the StatusListener interface and, before the task graph is submitted to a handler, subscribe to the task graph's status changes.


 taskGraph.addStatusListener(this);

If registered to listen to the status notification of the task graph, implement the statusChanged() function.


 public void statusChanged(StatusEvent event) {
 
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
        + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED ||
         status.getStatusCode() == Status.FAILED) {
       logger.info("Task graph Done");
       System.exit(1);
     }
 }
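Resuming a checkpointed task graph re-executes only the elements that had not completed. The selection step can be sketched as a simple filter over the recorded status of each element. ResumeSketch and the status strings are hypothetical; the sketch assumes that every status other than COMPLETED means the element must run again.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of selecting the elements to re-execute.
class ResumeSketch {

    // Returns the names of all elements whose recorded status is not COMPLETED.
    static List<String> toRerun(Map<String, String> statusByElement) {
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, String> e : statusByElement.entrySet()) {
            if (!"COMPLETED".equals(e.getValue())) {
                pending.add(e.getKey());
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        Map<String, String> checkpoint = new LinkedHashMap<>();
        checkpoint.put("task1", "COMPLETED");
        checkpoint.put("task2", "FAILED");
        checkpoint.put("task3", "UNSUBMITTED");
        System.out.println(toRerun(checkpoint)); // prints [task2, task3]
    }
}
```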

How to invoke remote file operations without adhering to the task model

File operations can be invoked on any file server without adhering to the traditional task model. Create a FileResource object and use it as follows:


 FileResource resource = AbstractionFactory.newFileResource(provider);
 
 /* Set an appropriate name for this resource */
 resource.setName("MyFileResource");
 
 /* Create appropriate security context for this resource */
 SecurityContext securityContext = AbstractionFactory.newSecurityContext("local");
 
 /* Set the security context for this resource */
 resource.setSecurityContext(securityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address
 */
 ServiceContact serviceContact =
     new ServiceContactImpl("hot.anl.gov:2119");
 
 /* Set the service contact for this resource */
 resource.setServiceContact(serviceContact);
 
 /* Initialize the file resource connection  */
 resource.start();
 
 /* Invoke any method available on the file
    resource object and process its input
    as required
 */
 
 /* When done with this file resource,
    terminate the connection
 */
 resource.stop();
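Because the connection stays open between start() and stop(), it is worth making sure that stop() is reached even when an operation fails. A minimal sketch of this using try/finally follows. FakeResource is a hypothetical stand-in with an in-memory file list, not the CoG Kit FileResource.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ResourceLifecycleSketch {

    // Stand-in for a connection-oriented resource; not the CoG Kit FileResource.
    static class FakeResource {
        private final Set<String> files =
            new HashSet<>(Arrays.asList("/home/amin/test"));
        private boolean started;

        void start() { started = true; }
        void stop() { started = false; }
        boolean isStarted() { return started; }

        boolean exists(String fileName) {
            if (!started) {
                throw new IllegalStateException("resource not started");
            }
            if (fileName == null) {
                throw new IllegalArgumentException("fileName is null");
            }
            return files.contains(fileName);
        }
    }

    // Runs one operation and always terminates the connection afterwards.
    static boolean existsOnce(FakeResource resource, String fileName) {
        resource.start();
        try {
            return resource.exists(fileName);
        } finally {
            resource.stop();
        }
    }

    public static void main(String[] args) {
        FakeResource resource = new FakeResource();
        System.out.println(existsOnce(resource, "/home/amin/test")); // prints true
        System.out.println(resource.isStarted());                    // prints false
    }
}
```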

How to run a native Condor job using the CoG Kit abstractions

Native Condor jobs are submitted to the Condor pool using job description files. This is in contrast to the interface- and API-based approach adopted in the Java CoG Kit. Nevertheless, the abstractions offered by the CoG Kit can be used to submit native Condor jobs without explicitly creating description files. Since the Condor job submission paradigm requires all job submissions to be done locally from a Condor submit machine, throughout this example we assume that the job will be executed locally from the Condor submit machine.

To begin with, create a Task with the appropriate attributes.


 Task task = new TaskImpl("myCondorTask", Task.JOB_SUBMISSION);
 
 /* Set the provider as Condor.
 */
 task.setProvider("Condor");

Then, create a JobSpecification for the task and set the appropriate attributes as per the task requirements.


 JobSpecification spec = new JobSpecificationImpl();
 
 /* Set the location and name of the executable.
 */
 spec.setExecutable("/bin/ls");
 
 /* Set the arguments (if any)
    for the executable
 */
 spec.addArguments("-l");
 spec.addArguments("-a");
 
 /* Set the path name of the initial
    directory. This method is equivalent
    to the Initialdir attribute of the
    Condor submit description file.
 */
 spec.setDirectory("/home/amin");
 
 /* Set the name of the file which serves
    as the input to the executable
 */
 spec.setStdInput("condor-input");
 
 /* Set the names of the files in which the remote
    output and remote error must be stored.
    If left blank or null, the output and error
    will be redirected to the /dev/null of the
    remote machine.
 */
 spec.setStdOutput("condor-output");
 spec.setStdError("condor-error");
 
 /* Add additional attributes that are not
    provided by the JobSpecification.
    All the attributes that are acceptable in a
    Condor job submit file can be specified
    as a (name,value) pair.
    For example, to specify the LOG file
    in a job, the user can set the value as:
 */
 spec.setAttribute("log", "condor-log");
 
 /* Assign this specification to the task */
 task.setSpecification(spec);

Please note that input file staging and output file redirection are handled differently in Condor than in the paradigm supported by the Java CoG Kit. For this reason, the methods setLocalExecutable(), setLocalInput(), and setRedirected() of JobSpecification are invalid for the Condor provider. Instead, the user is directed to use the attributes should_transfer_files and when_to_transfer_output with the semantics explained in the Condor v6.6.7 manual.


 spec.setAttribute("when_to_transfer_output","ON_EXIT");

The Java CoG Kit transparently translates the JobSpecification into a submit description file accepted by Condor (condor_submit). This description file is available in the /tmp directory of the Condor submit machine. Unless specified otherwise, all Condor jobs are executed in the Vanilla universe.
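
To make the translation step more concrete, the following is a minimal sketch of how a set of (name,value) pairs could be rendered as a Condor submit description. It is not the CoG Kit's translation code: the class SubmitDescriptionSketch and its methods are hypothetical, and the mapping from JobSpecification methods to the submit commands executable, arguments, input, output, and error is an assumption (only the Initialdir equivalence is stated above). The file actually written to /tmp may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: renders name/value pairs as a Condor submit description. */
class SubmitDescriptionSketch {

    /** Emits one "name = value" line per entry, followed by a single queue command. */
    static String render(Map<String, String> commands) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : commands.entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        // A single "queue" command asks Condor to run the job once.
        sb.append("queue\n");
        return sb.toString();
    }

    /** Hypothetical values mirroring the JobSpecification calls in this section. */
    static Map<String, String> example() {
        Map<String, String> c = new LinkedHashMap<>();
        c.put("universe", "vanilla");        // assumed default, see text above
        c.put("executable", "/bin/ls");
        c.put("arguments", "-l -a");
        c.put("initialdir", "/home/amin");
        c.put("input", "condor-input");
        c.put("output", "condor-output");
        c.put("error", "condor-error");
        c.put("log", "condor-log");
        c.put("when_to_transfer_output", "ON_EXIT");
        return c;
    }

    public static void main(String[] args) {
        System.out.print(render(example()));
    }
}
```

A LinkedHashMap is used so that the commands appear in insertion order, which keeps the generated text easy to compare with a hand-written description file.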

In order to be compatible with the standard job submission paradigm adopted by the Java CoG Kit, all Condor jobs specified through the JobSpecification (and thereby automatically translated to Condor submit files) are executed only once. For multiple executions of the job, with different input, output, and initial directories, the user should specify a custom submit description file in the JobSpecification. If a customized description file is specified, then all the other methods of the JobSpecification (including setExecutable, setArgument, etc.) are ignored by the Java CoG Kit. Also note that status monitoring and automatic status update notification are not possible for jobs submitted through customized description files. For such jobs, the user is directed to monitor the status of the job(s) externally via log files.


 spec.setAttribute("descriptionFile","condor-description.txt");

Next, assign an appropriate service to the task with a provider, service contact, and security context.


 /* Create a new job submission service */
 Service service = new ServiceImpl(Service.JOB_SUBMISSION);
 
 /* Assign a provider to this service.
    The provider assigned to the service
    must be the same as that assigned to the task.
 */
 service.setProvider("Condor");
 
 /* Retrieve the appropriate security context
    for this provider from the AbstractionFactory class
 */
 SecurityContext securityContext =
   AbstractionFactory.newSecurityContext("Condor");
 
 /* Assign the default credentials. */
 securityContext.setCredentials(null);
 
 /* Assign this security context to the service */
 service.setSecurityContext(securityContext);
 
 /* Create a new service contact with the
    appropriate service endpoint address.
    Since this task is submitted locally on
    the Condor submit machine, we specify
    localhost
 */
 ServiceContact serviceContact =
     new ServiceContactImpl("localhost");
 
 /* Assign this service contact to the service */
 service.setServiceContact(serviceContact);
 
 /* Assign the service to the task */
 task.setService(Service.JOB_SUBMISSION_SERVICE, service);

Next, create a TaskHandler and submit the task for execution.


 TaskHandler handler = new GenericTaskHandlerImpl();
 try {
     handler.submit(task);
 } catch (InvalidSecurityContextException ise) {
     logger.error("Security Exception", ise);
     System.exit(1);
 } catch (TaskSubmissionException tse) {
     logger.error("TaskSubmission Exception", tse);
     System.exit(1);
 } catch (IllegalSpecException ispe) {
     logger.error("Specification Exception", ispe);
     System.exit(1);
 } catch (InvalidServiceContactException isce) {
     logger.error("Service Contact Exception", isce);
     System.exit(1);
 }

If the status of the task must be monitored (as is desired for most interactive tasks), the monitoring class must implement the StatusListener interface and subscribe to the task's status changes before the task is submitted to a handler. Status monitoring is ignored for tasks with a customized Condor description file. For such jobs, the user is directed to monitor the status of the job(s) externally via log files.


 task.addStatusListener(this);

If registered to listen to the status notifications of the task, implement the statusChanged() method.


 public void statusChanged(StatusEvent event)
 {
     Status status = event.getStatus();
 
     logger.debug("Status changed to "
         + status.getStatusString());
 
     if (status.getStatusCode() == Status.COMPLETED) {
         /* task.getStdOutput() returns the output
            of the condor_submit command. It does
            not return the output of the executable.
            The output of the executable is available
            in the "file" specified as:
            spec.setStdOutput("file");
         */
         logger.debug("Output = "
             + task.getStdOutput());
         /* Exit normally once the task has completed */
         System.exit(0);
     }
 }
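
As an alternative to calling System.exit() inside the callback, the submitting thread can block until a terminal status arrives. The sketch below illustrates that pattern with a CountDownLatch. It deliberately does not use the CoG Kit types: the CompletionListener interface, the status strings, and the simulated notifier thread are hypothetical stand-ins for StatusListener, Status, and the task handler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Illustrative only: wait for a completion callback instead of exiting in it. */
class StatusWaitSketch {

    /** Hypothetical stand-in for a status callback; not the CoG Kit StatusListener. */
    interface CompletionListener {
        void statusChanged(String status);
    }

    /** Releases the latch once a terminal status is reported. */
    static class LatchListener implements CompletionListener {
        final CountDownLatch done = new CountDownLatch(1);
        volatile String finalStatus;

        public void statusChanged(String status) {
            // The status names are assumptions made for this sketch.
            if ("Completed".equals(status) || "Failed".equals(status)) {
                finalStatus = status;
                done.countDown();
            }
        }
    }

    /** Simulates a handler reporting statuses from another thread, then waits. */
    static String runAndWait(String... statuses) {
        LatchListener listener = new LatchListener();
        Thread notifier = new Thread(() -> {
            for (String s : statuses) {
                listener.statusChanged(s);
            }
        });
        notifier.start();
        try {
            // Bounded wait so that a lost notification cannot block forever.
            boolean finished = listener.done.await(5, TimeUnit.SECONDS);
            return finished ? listener.finalStatus : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndWait("Submitted", "Active", "Completed"));
    }
}
```

With this arrangement the code that submitted the task decides what to do after completion, for example reading the file named in spec.setStdOutput().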

How to integrate a new provider module in the Java CoG Kit?

Let's assume we need to customize task handlers for a provider “MyProvider”. To integrate these customized handlers into our framework, we need to create a new module for MyProvider. In the cog/modules directory, we create a new directory for our module, say abstraction-provider-MyProvider. In this newly created module directory, we replicate the directory structure adopted for all Java CoG Kit modules. The abstraction-provider-MyProvider module should contain the following files and directories:

  • The build.xml file is the module build file required by Apache Ant to compile this module. It is standard across modules and can simply be copied from any Java CoG Kit module without modification.
  • The project.properties file contains elementary information about the module, such as the module name, version, and dependent libraries. The lib.deps property is a comma-separated list of the libraries used by this module, which reside in the abstraction-provider-MyProvider/lib directory. The listing below shows a sample project.properties file for the MyProvider module.
       module.name             = abstraction-provider-MyProvider
       long.name               = My customized provider
       version                 = 1.0
       project                 = Java CoG Kit
       #lib.deps               = -
       lib.deps                = lib1.jar, lib2.jar
       debug                   = true
  • The dependency.xml file lists the other Java CoG Kit modules that are required for the functioning of this module. For example, suppose MyProvider depends on or extends the functionality of the abstraction-provider-gt2 module. Instead of duplicating all the libraries in the abstraction-provider-gt2/lib directory, we can simply declare a module dependency in the dependency.xml file to take care of compilation issues.
  • An XML file that allows you to create command-line launchers for Java classes.
  • The lib directory contains all the jar files required for this module. To streamline the compilation process, the names of these jar files must also be listed explicitly in the project.properties file of the module.
  • The etc directory contains the manifest-specific files for this module. These files are standard across modules and can simply be copied from other modules. In addition, the etc directory contains a log4j.properties.module file that specifies the logging rules for classes specific to this module. The listing below shows a sample log4j.properties.module file.
       log4j.logger.org.cog.execution.MyProvider=DEBUG
       log4j.logger.org.cog.file.MyProvider=ERROR
  • The resources directory contains the cog-provider.properties file for MyProvider. Every provider in the Java CoG Kit may have its own implementation of the execution task handler, file operation task handler, security context, and file resource interfaces. The abstraction factory searches the class path for all files named cog-provider.properties and attempts to load provider information from them. Unless the cog-provider.properties file for a provider is present in the Java Virtual Machine class path, the Java CoG Kit will have no information about that provider. The provider properties file can, however, safely be packaged in a jar file, provided that it is kept in the default package. The cog-provider.properties file specifies the fully qualified names of the classes that provide the respective implementations. The listing below shows a sample cog-provider.properties file.
       provider=MyProvider
       sandbox=false
       executionTaskHandler=org.cog.execution.MyProvider.TaskHandlerImpl
       fileOperationTaskHandler=org.cog.file.MyProvider.TaskHandlerImpl
       securityContext=org.cog.SecurityContextImpl
       fileResource=org.cog.file.MyProvider.FileResourceImpl
  • The src directory contains the source files for this module. The sources for a provider module typically contain the implementation classes for the execution task handler, the file operation task handler, the security context, and the file resource interfaces.
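
The class path search described for cog-provider.properties can be approximated with the standard ClassLoader.getResources() call, which returns every resource with a given name that is visible to a class loader. The following is a minimal sketch under that assumption; the class ProviderDiscoverySketch and its methods are hypothetical, and the abstraction factory's actual lookup code may work differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Properties;

/** Illustrative only: finds and parses provider property files. */
class ProviderDiscoverySketch {

    /** Loads every cog-provider.properties visible to the given class loader. */
    static List<Properties> discover(ClassLoader loader) {
        List<Properties> found = new ArrayList<>();
        try {
            // A name without a package prefix refers to the default package,
            // that is, the root of a class path directory or jar file.
            Enumeration<URL> urls = loader.getResources("cog-provider.properties");
            while (urls.hasMoreElements()) {
                try (InputStream in = urls.nextElement().openStream()) {
                    Properties p = new Properties();
                    p.load(in);
                    found.add(p);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    /** Parses the text of a provider properties file. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = parse("provider=MyProvider\nsandbox=false\n"
                + "executionTaskHandler=org.cog.execution.MyProvider.TaskHandlerImpl\n");
        System.out.println(p.getProperty("provider") + " -> "
                + p.getProperty("executionTaskHandler"));
        int count = discover(ClassLoader.getSystemClassLoader()).size();
        System.out.println(count + " provider file(s) found on the class path");
    }
}
```

Because the lookup is by resource name, a provider whose properties file sits inside a package directory would not be found by this sketch, which matches the default-package requirement stated above.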

Notes About Abstractions Class Loaders

In certain circumstances, such as when the libraries used by different providers are incompatible with each other, the providers must exist under separate class loader instances, which provide isolated environments, in order to run simultaneously. These environments are called provider sandboxes.

Sandboxes are threads with a particular context class loader. Sandboxing is enabled for a provider when the sandbox=true property is used in the cog-provider.properties file.

The following example shows a possible cog-provider.properties file that describes a provider using sandboxing:


 classloader.name=gt3.2.1
 classloader.usesystem=false
 classloader.properties=classloader-gt3_2_1.properties
 provider=gt3.2.1
 sandbox=true
 sandbox.boot=org.globus.cog.abstraction.impl.execution.gt3_2_1.Boot
 ...

The classloader.name property indicates the name of the class loader. The classloader.usesystem property tells the CoG Kit whether the system class loader should be used or not. In the sandboxing case, using the system class loader would serve no purpose. If the system class loader is not used, as in the example above, a set of abstraction class loader properties can be specified using the classloader.properties property. The following properties can be used for configuring a CoG class loader:

  • relative=<jar>
    • Example: relative=axis.jar
    • Provides a reference point in the class loader URLs, to be used with the rjar properties. The indicated jar must be present on the class path before the class loader is instantiated.
  • rjar=<jar>
    • Example: rjar=../jglobus.jar
    • Indicates a jar file that should be managed by this class loader. The location of the jar file is relative to the jar file specified using the relative property. This allows for mutable directory trees of jar files. It works as long as the relative location of the jar files with respect to each other is not changed. It is also intended to accommodate dynamic loading of jar files through Java Web Start.
  • jar=<jar>
    • Example: jar=ogsa.jar
    • Indicates that the class loader should attempt to manage the given jar file. The file must already be present in the class path.
  • package=<package>
    • Example: package=org.globus.cog.abstractions.myprovider
    • Tells the class loader to manage all classes that are inside the indicated package, or inside any descendants of the indicated package.
  • exclude=<package>
    • Example: exclude=org.globus.cog.sensitive
    • Allows the exclusion of certain packages, even though they are present in one of the managed jar files. This may be necessary in order to allow objects that need to be passed between different class loaders to function properly. It is not necessary to exclude system packages, since the class loader will automatically delegate the loading of system classes to the system class loader.
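
Two of the rules above lend themselves to a short illustration: resolving an rjar entry against the location of the relative jar, and deciding whether a class falls under a package or exclude entry. The sketch below is one possible reading of those rules and not the CoG Kit class loader itself. The class ClassLoaderRulesSketch, the /opt/cog/lib paths, and the choice to let exclude take precedence over package are assumptions; jar-based management is not modeled.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: one interpretation of the rjar, package, and exclude rules. */
class ClassLoaderRulesSketch {

    /** Resolves an rjar entry against the location of the "relative" jar. */
    static URI resolveRjar(URI relativeJar, String rjar) {
        return relativeJar.resolve(rjar);
    }

    /** True if className lies in pkg or in any descendant package of pkg. */
    static boolean inPackage(String className, String pkg) {
        return className.startsWith(pkg + ".");
    }

    /** Managed if a package entry matches and no exclude entry does (assumed precedence). */
    static boolean isManaged(String className, List<String> packages, List<String> excludes) {
        for (String ex : excludes) {
            if (inPackage(className, ex)) {
                return false;
            }
        }
        for (String pkg : packages) {
            if (inPackage(className, pkg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical location of the jar named by relative=axis.jar.
        URI axis = URI.create("file:/opt/cog/lib/axis/axis.jar");
        System.out.println(resolveRjar(axis, "../jglobus.jar"));

        List<String> packages = Arrays.asList("org.globus.cog.abstractions.myprovider");
        List<String> excludes = Arrays.asList("org.globus.cog.sensitive");
        System.out.println(isManaged(
                "org.globus.cog.abstractions.myprovider.impl.Handler", packages, excludes));
        System.out.println(isManaged("org.globus.cog.sensitive.Key", packages, excludes));
    }
}
```

Resolving against the location of a known jar is what keeps the configuration valid when the whole directory tree is moved, since only the positions of the jars relative to each other matter.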

The role of the sandbox.boot property is to provide a way to configure various aspects of the classes managed by a class loader. The boot class, which must have a nullary public static boot() method, is executed after the class loader is initialized, and within the class loader environment, but before any method is invoked on classes which the class loader manages.
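
The ordering described above (set up the class loader, run boot() inside the sandbox, then use the managed classes) can be sketched with standard Java facilities: a thread whose context class loader is the sandbox loader, and a reflective call to the nullary static boot() method. This is an illustration of the idea only. The class SandboxBootSketch and its nested Boot class are hypothetical, and the empty URLClassLoader used here delegates all loading to its parent, so it provides no real isolation.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

/** Illustrative only: run a boot class on a thread with its own context class loader. */
class SandboxBootSketch {

    /** Stand-in boot class; a real one would configure the provider's libraries. */
    public static class Boot {
        static volatile ClassLoader seenContextLoader;

        public static void boot() {
            // Record which context class loader was active during boot.
            seenContextLoader = Thread.currentThread().getContextClassLoader();
        }
    }

    /** Invokes bootClassName.boot() on a thread whose context loader is the sandbox. */
    static void runInSandbox(ClassLoader sandbox, String bootClassName) {
        Throwable[] failure = new Throwable[1];
        Thread t = new Thread(() -> {
            try {
                Class<?> bootClass = Class.forName(bootClassName, true, sandbox);
                Method boot = bootClass.getMethod("boot");
                boot.invoke(null); // static method, so no receiver object
            } catch (Throwable e) {
                failure[0] = e;
            }
        }, "sandbox");
        t.setContextClassLoader(sandbox);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while booting sandbox", e);
        }
        if (failure[0] != null) {
            throw new IllegalStateException("boot failed", failure[0]);
        }
    }

    public static void main(String[] args) {
        ClassLoader sandbox =
                new URLClassLoader(new URL[0], SandboxBootSketch.class.getClassLoader());
        runInSandbox(sandbox, Boot.class.getName());
        System.out.println(Boot.seenContextLoader == sandbox);
    }
}
```

In a real sandbox the loader would be given the provider's jar files, so that the boot class and the provider classes are defined by the sandbox loader rather than by its parent.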

References

[1] A Java Commodity Grid Kit [las01cogconcurency]

[2] “Java CoG Kit Wiki,” 2004. http://wiki.cogkit.org

[3] “Java CoG Kit Registration,” 2004. http://www.cogkit.org/register

[4] Additional publications about the Java CoG Kit HTML.

If you need to cite the Java CoG Kit, please use [1].
