A Grid Usage Sensor Service
From Java CoG Kit
Jonathan DiCarlo, University of Chicago (Alumni); Bill Allcock, Argonne National Laboratory; Gregor von Laszewski, Argonne National Laboratory, University of Chicago
This page is under construction
Links
The new public demo Portal
The primary GUSS documentation is on the CoGKit Wiki under GridUsageSensorServices
GUSS source code can be checked out from the CoGKit CVS repository:
cvs -d:pserver:anonymous@cvs.cogkit.org:/cvs/cogkit co src/cog/modules/monitor
This CVS repository can also be browsed through a web interface. You may be interested in seeing the latest CHANGES.txt or README.txt files.
The org.globus.usage Java package is also highly relevant to GUSS, as it provides the UDP usage data that GUSS uses. It can be checked out from:
cvs -d:pserver:anonymous@cvs.globus.org:/home/globdev/CVS/globus-packages co usage/java
This CVS repository can also be viewed through a web interface [1].
The JavaDocs for all classes in GUSS are also available.
Download a Microsoft PowerPoint presentation about GUSS:
Full version: GUSS_for_supercomputing.ppt
Simplified version: GUSS_presentation_SC.ppt (http://www.cogkit.org/guss/GUSS_presentation_SC.ppt)
Screenshots-only version: GUSS_Screenshots.ppt (http://www.cogkit.org/guss/GUSS_Screenshots.ppt)
Here is the original proposal which started the GUSS project.
The Globus Toolkit has a page about the data collection policy: http://www-unix.globus.org/toolkit/docs/development/3.9.5/Usage_Stats.html
Here is a short description of GUSS written for NMI's web page
The GUSS development portal is at http://usage-stats.globus.org:8080/uPortal. It is only accessible from within the Argonne firewall. You can log in with username="developer" and password="developer" to see the visualization portlet in action.
MORE CONTENTS FROM OTHER PAGE
Links
http://viewcvs.globus.org/viewcvs.cgi/usage/java/
Sample
http://www.cogkit.org/guss/data.html
GUSS: Grid Usage Sensor Services
The Grid Usage Sensor Service (GUSS) is a system to monitor the volume of file transfers taking place on the Grid. Usage information logged by GridFTP servers will be collected, processed, and distributed by a Web Service: the back-end of GUSS. The front-end will be a portlet which uses this data to display graphs of whatever derived quantities the user specifies. Our hope is that GUSS will become a useful diagnostic tool for all who work on the Grid.
Online Resources
The current draft of the GUSS paper can be checked out from CVS: {{{cvs -d:pserver:anonymous@cvs.cogkit.org:/cvs/cogkit co papers/src/guss-paper}}}
See especially the [wiki:Projects/ANL/GridUsageSensorServices/ToDoList ToDoList] page for GUSS.
The !JavaDocs for all classes in GUSS are online at [2].
Changes to the project, with short descriptions of the latest modifications, are listed in the CHANGES.txt file in CVS.
GUSS depends on the [wiki:Projects/ANL/GridUsageSensorServices/UsagePacketReceiver UsagePacketReceiver] to monitor usage in real-time. The code for this system is in the package org.globus.usage. It can be viewed through web-CVS at [3]. There is also a README.txt.
Next, we describe the goals, methods, and architecture of GUSS, as well as a timeline and a list of outstanding issues. Your commentary on any of these is welcome.
Basic Functionality
The most basic functionality is to provide answers to the following questions:
- How many GridFTP servers are active?
- How many transfers occurred during a specified time period?
- How many people are using GridFTP?
- How many bytes were transferred in total during a specified time period?
- How many bytes were transferred between two particular hosts?
This data should be accessible through a Web interface and should be displayable either as a graph or as a numerical table according to user preference. This and all other user preferences should be maintained between sessions.
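Several of the questions above reduce to simple aggregations over a list of transfer records. The following is a minimal sketch in Java, not the GUSS implementation: the TransferRecord and BasicStats classes, their field names, and the use of epoch-millisecond timestamps are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical record of one GridFTP transfer; the field names are assumptions.
class TransferRecord {
    final String sourceHost;
    final String destHost;
    final long startMillis; // when the transfer began, in epoch milliseconds
    final long bytes;       // number of bytes transferred

    TransferRecord(String sourceHost, String destHost, long startMillis, long bytes) {
        this.sourceHost = sourceHost;
        this.destHost = destHost;
        this.startMillis = startMillis;
        this.bytes = bytes;
    }
}

class BasicStats {
    // True if the transfer started inside the half-open period [from, to).
    private static boolean inPeriod(TransferRecord r, long from, long to) {
        return r.startMillis >= from && r.startMillis < to;
    }

    // How many transfers started during the period?
    static int countTransfers(List<TransferRecord> records, long from, long to) {
        int count = 0;
        for (TransferRecord r : records) {
            if (inPeriod(r, from, to)) count++;
        }
        return count;
    }

    // How many bytes were transferred in total during the period?
    static long totalBytes(List<TransferRecord> records, long from, long to) {
        long total = 0;
        for (TransferRecord r : records) {
            if (inPeriod(r, from, to)) total += r.bytes;
        }
        return total;
    }

    // How many bytes moved between two hosts, counting both directions?
    static long bytesBetween(List<TransferRecord> records, String hostA, String hostB) {
        long total = 0;
        for (TransferRecord r : records) {
            boolean aToB = r.sourceHost.equals(hostA) && r.destHost.equals(hostB);
            boolean bToA = r.sourceHost.equals(hostB) && r.destHost.equals(hostA);
            if (aToB || bToA) total += r.bytes;
        }
        return total;
    }

    // How many distinct hosts appear as sender or receiver in any record?
    static int activeHosts(List<TransferRecord> records) {
        Set<String> hosts = new HashSet<String>();
        for (TransferRecord r : records) {
            hosts.add(r.sourceHost);
            hosts.add(r.destHost);
        }
        return hosts.size();
    }
}
```

Counting a host as "active" whenever it appears in a record is only one possible definition; a real deployment might instead count the servers whose logfiles were read.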
Advanced Functionality
We should be able to support any type of graph the user could ask for. There are many possibilities, but they can be subdivided along four criteria:
Quantity to be Graphed:: For example, "number of users", "number of file transfers", "number of active hosts", or "volume of file transfers".
Histogram vs. Category chart (e.g. bar chart):: The former for seeing how a quantity varies over time; the latter for comparing the quantity totals on one host with each other host.
Lumping of Data:: The default is for transfers to be lumped together by host, but the user might wish to see transfers lumped by file size (to see the breakdown of large files vs. small files) or by user (but see Anonymization below), or even to lump all transfers together to summarize the state of the whole Grid.
Filters:: We should be able to define the time period we're interested in by specifying starting and ending dates. We should be able to block out data from certain hosts we're not interested in. We should be able to exclude transfers of small files and focus on the big ones. We should be able to look at just In fact, we should be able to specify an arbitrary number of arbitrary filters to specify exactly what we need, like a file search or a database query.
Here is a "wish-list" of advanced features to consider. These will not be included in the prototype, but after Supercomputing we will begin working on these.
- Stateful resources: Needed to support dynamic updates and cacheing.
- Dynamic updates: When a user somewhere makes a GridFTP transfer, we want our graph to reflect it right away.
- Efficient cacheing: The work of creating a graph should not be repeated if multiple users request the same data.
- Archiving of data indefinitely: So that years from now we can still request to see how many transfers happened on specific dates.
- Anonymity: The fact that a certain user took a certain file from a certain host should never be available unless the user chooses to make it so.
- Dynamic registry: Whenever a new GridFTP server comes online, GUSS should automatically begin monitoring it.
- Attaching "labels" to transfers: So that we can define arbitrary categories of transfers to include or exclude.
- Swing client launched by !WebStart: A richer alternative to the portlet interface.
- Clickable graphs: Click on certain features to "drill down" to more focused data.
High-level Overview of Architecture
GUSS has three main components. They need to be able to talk to each other even if each one is running on a different server, which is quite likely in a Grid environment.
The GUSS Client is what the user sees. It presents a user-interface allowing the user to specify the desired type of graphs, supporting all the options listed under Advanced Functionality. It submits the user's requests to the GUSS Service and displays the resulting graphs. Additionally, the GUSS Client must have a way of preserving the user's choices between sessions.
The GUSS Service interprets requests from the GUSS Client, retrieves data from the Data Source, parses it according to the request, creates a graph of the relevant values, and sends this graph back to the GUSS Client. The GUSS Service is designed to be able to interoperate seamlessly with various Data Sources and various GUSS Client implementations. Within the GUSS Service, image generation is handled by an open-source (LGPL) Java package called ["JFreeChart"]. ["JFreeChart"] can work as a Swing JPanel subclass, and be integrated into a Java application to run in real-time. It can also easily export a chart as a .png or .jpg image file.
The Data Source is an abstract way to refer to the system which records GridFTP transfers. Specifically, usage must be logged by the GridFTP server or client. The most logical choice to do the logging is the computer which receives the file. For now the GridFTP server can simply log basic data into a file, and the GUSS Service can fetch the logfile from each active server. In the future we want to make a more active data source which can provide live updates to the GUSS Service as transfers happen. The abstraction of the Data Source will allow us to easily plug in this and other new data sources.
The GUSS Service
The back-end of GUSS is a service which collects data from logfiles, processes it, and sends it out in response to requests. The primary intended client will be the GUSS portlet, but we can make the service even more useful if we make it general enough to be used by arbitrary clients. It should have a simple, flexible request interface that allows a client to ask for and receive whatever subset of the data it needs.
The prototype GUSS Service has been implemented as a Web Service using Apache Axis 1.2. This suits our needs for the moment, but has the drawback of being inherently stateless. It would be useful to be able to maintain records of previous user requests, for cacheing purposes (See Cacheing, below). So in order to maintain state, we may eventually need to move to either a "Grid Service" or a Web Service with associated WS-Resource.
The function of the GUSS web service can be divided into subtasks as follows:
- 1. Get and parse request coming from GUSSClient, including graph scope, graph quantity, and filter list.
- 2. Compare new request to recently served requests to see whether an existing image file can be reused.
- 3. Get data from (abstractly) a Data Source. (See Data Sources).
- 4. Parse incoming data to make a list of GridFTPRecords that meet all filter criteria.
- 5. Use ["JFreeChart"] to make a chart of the user's chosen quantity using the data in the GridFTPRecords.
- 6. Return this chart to the client.
Request from the GUSSClient:
The design of the interface to the web service depends on what types of graphs users might want to see. We must support the three types of specifications described under Advanced Functionality, above. At the most basic, we must allow codes for which variable to graph and the scope of the graph (these are ints interpreted according to class constants), and a list of arbitrary filters.
There are many possible ways to graph any combination of data: Besides basic histograms and bar charts, how about a two-dimensional matrix with source-host on one axis, destination-host on the other, and color codes indicating transfer volumes? ["JFreeChart"] is flexible enough to support all of these, and easily extensible. If at some point in the future we think of a better way to do any particular graph, we need only change the internals of the GUSSService: we leave the interfaces the same, so the existing data sources and clients will work with no change.
One web service can expose multiple public methods. The primary method of the GUSS Service is called makeGraphImage, which returns a URL (see "Returning the graph to the client", below). As arguments, makeGraphImage takes three integers (interpreted according to class constants) defining the user's choice of Graph Quantity, Graph Type, and Data Lumping. It also takes an array of filter objects.
These filter objects all implement a common interface, so new ones can be created and plugged in without changing existing code. The filters are instantiated by the GUSS Client and passed with the request to the GUSS Service. Because the filters must be serialized into XML to be included with the SOAP request, filter objects may have only primitive data types as fields, but this restriction will not be a problem. When the GUSS Service parses the logfiles, it ignores any record that fails one or more of the filters.
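As a rough illustration of this pluggable filter design, here is a minimal Java sketch. The interface name RecordFilter, the two example filters, the FilterChain helper, and the fields of the simplified GridFTPRecord are assumptions for illustration; the actual GUSS classes may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the GridFTPRecord helper class; the fields are assumptions.
class GridFTPRecord {
    final String host;
    final long startMillis;
    final long bytes;

    GridFTPRecord(String host, long startMillis, long bytes) {
        this.host = host;
        this.startMillis = startMillis;
        this.bytes = bytes;
    }
}

// Common interface implemented by every filter (hypothetical name).
interface RecordFilter {
    boolean accepts(GridFTPRecord record);
}

// Accepts transfers that started within [fromMillis, toMillis).
class DateRangeFilter implements RecordFilter {
    long fromMillis; // primitive fields only, to keep serialization simple
    long toMillis;

    DateRangeFilter(long fromMillis, long toMillis) {
        this.fromMillis = fromMillis;
        this.toMillis = toMillis;
    }

    public boolean accepts(GridFTPRecord r) {
        return r.startMillis >= fromMillis && r.startMillis < toMillis;
    }
}

// Accepts transfers of at least minBytes, to focus on large files.
class MinSizeFilter implements RecordFilter {
    long minBytes;

    MinSizeFilter(long minBytes) {
        this.minBytes = minBytes;
    }

    public boolean accepts(GridFTPRecord r) {
        return r.bytes >= minBytes;
    }
}

class FilterChain {
    // Keeps a record only if every filter accepts it.
    static List<GridFTPRecord> apply(List<GridFTPRecord> records, RecordFilter[] filters) {
        List<GridFTPRecord> kept = new ArrayList<GridFTPRecord>();
        for (GridFTPRecord r : records) {
            boolean passesAll = true;
            for (RecordFilter f : filters) {
                if (!f.accepts(r)) {
                    passesAll = false;
                    break;
                }
            }
            if (passesAll) kept.add(r);
        }
        return kept;
    }
}
```

A new kind of filter (for example, one that excludes a particular host) would only need to implement the same interface; the loop in FilterChain would not change.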
A secondary public method of the GUSS Service is makeTable. It takes the same arguments, but instead of creating a graph, it creates a formatted table of numerical data.
Cacheing
Once we get beyond the prototype phase, we will need to think about improving efficiency. Rendering the graph is an expensive operation. It is a waste of resources to create an identical graph from scratch for each user who makes the same request, or to make identical graphs from scratch repeatedly for the same user if no data is changing. Instead, we will need to think about caching, which means we need to think about associating stateful resources with the web service. Specifically, the service will have to remember the requests it has recently served, the time at which each request was processed, and the image it created for each one. If a new request comes in that matches one of the recently-served requests, the service will compare the time-stamp on the previously created image to the timestamps in the Data Source. If new GridFTP transfers have been logged since the image was created, the image is invalid and should be recreated. But if no new transfers have come in yet, the client can be instructed to re-use the old image.
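The timestamp comparison described here could be sketched as follows. This is a hypothetical illustration, not existing GUSS code: the GraphCache class, the idea of encoding a request as a string key, and the millisecond timestamps are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class GraphCache {
    // One cached result: where the image lives and when it was rendered.
    private static class Entry {
        final String imageUrl;
        final long createdMillis;

        Entry(String imageUrl, long createdMillis) {
            this.imageUrl = imageUrl;
            this.createdMillis = createdMillis;
        }
    }

    // Key: a string encoding of the request (quantity, type, lumping, filters).
    private final Map<String, Entry> entries = new HashMap<String, Entry>();

    // Remembers the image created for a request.
    void store(String requestKey, String imageUrl, long createdMillis) {
        entries.put(requestKey, new Entry(imageUrl, createdMillis));
    }

    // Returns the cached image URL if it is still valid, or null if the graph
    // must be re-rendered. An entry is invalid once the Data Source holds a
    // record that is newer than the image.
    String lookup(String requestKey, long newestRecordMillis) {
        Entry e = entries.get(requestKey);
        if (e == null) {
            return null; // this request has not been served recently
        }
        if (newestRecordMillis > e.createdMillis) {
            entries.remove(requestKey); // stale: new transfers have been logged
            return null;
        }
        return e.imageUrl;
    }
}
```

A production version would also need to bound the size of the map and delete stale image files, which this sketch leaves out.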
Returning the graph to the client:
The GUSS Service creates the graph as an image and stores it as a .png file on the server with an auto-generated unique name. It creates a URL referencing this file and gives that URL to the GUSS Client as the return value of makeGraphImage.
This is only one possible way of getting the graph to the client. It is not optimally efficient, because after the web service has returned the client still needs to make another connection to the server to retrieve the image file. However, this method has three key advantages:
- 1. The client does not need to do any image processing, so even a "dumb" client (i.e. web-browser based) can easily use the GUSS service.
- 2. The image is reliably transferred through HTTP, so there is no need for the difficult and error-prone task of enveloping the image in XML to be sent as part of a SOAP message.
- 3. The same image, once generated, can be viewed by many users, reducing redundant work. (See Cacheing).
Two other possible ways to return the image to the client, which have not yet been implemented, are:
- 1. Send the image file back with the SOAP reply, either by serializing it, making it an attachment, or some other method;
- 2. Send back raw numerical data from the web service, and have the client use ["JFreeChart"] to do the graphing. This would only be appropriate for a "smart" client like a standalone Swing application, but the advantage is that the smart client could easily animate the graph, let the user zoom in and out, etc.
If we decide in the future to implement one or both of these methods, we can add new public methods to the GUSSService to support them. The interface of the original makeGraphImage() method will not need to change, so backwards compatibility will be preserved.
In the case of the makeTable method, returning the data is much simpler. The return value of makeTable is simply a string containing an HTML description of the table. This lets the GUSS Service control the formatting and contents of the table, which the GUSS Client can easily include in a portlet or any other page.
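To illustrate the idea of returning a table as an HTML string, here is a small Java sketch. The renderTable helper and its arguments are hypothetical; the real makeTable method takes the same request arguments as makeGraphImage, and this sketch only shows the final formatting step applied to already-aggregated totals.

```java
import java.util.Map;

class TableSketch {
    // Formats per-host totals as an HTML table; iteration order follows the map.
    static String renderTable(String quantityLabel, Map<String, Long> totalsByHost) {
        StringBuilder html = new StringBuilder();
        html.append("<table>\n");
        html.append("<tr><th>Host</th><th>")
            .append(escape(quantityLabel))
            .append("</th></tr>\n");
        for (Map.Entry<String, Long> e : totalsByHost.entrySet()) {
            html.append("<tr><td>")
                .append(escape(e.getKey()))
                .append("</td><td>")
                .append(e.getValue())
                .append("</td></tr>\n");
        }
        html.append("</table>");
        return html.toString();
    }

    // Escapes the characters that would otherwise be read as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because the service returns finished HTML, the client can embed the string in a page without knowing anything about the underlying records.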
The Data Sources:
The primary data source will be log files kept by the GridFTP servers. A secondary data source will be UDP packets sent out in response to changes in these log files, used for dynamic updates. Because the GUSS Service can read data from more than one source we will use an abstraction layer to isolate its number-crunching core from the data readers. This way, if we come up with a third possible source of data, we can easily plug it in.
The GUSS Service has a helper class called GridFTPRecord and maintains a linked-list of these. The abstract Data Source class is simply a generator of GridFTPRecords to be collected by the GUSS Service, which will process or ignore them based on whether they fit the filter criteria. The two subclasses of Data Source are a Logfile Reader and an Update Listener. It is important that the GUSS Service be able to easily combine GridFTPRecords coming from more than one Data Source. This way multiple logfiles can be read and all the records combined, and any new records coming in from an !UpdateListener can be added to this list.
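The abstraction could look roughly like the Java sketch below. Only the names GridFTPRecord, Data Source, Logfile Reader, and Update Listener come from the description above; the method names, the fields, and the RecordCollector class are assumptions, and the logfile reader here wraps already-parsed records instead of reading a real file.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Simplified stand-in for the GridFTPRecord helper class; the fields are assumptions.
class GridFTPRecord {
    final String host;
    final long bytes;

    GridFTPRecord(String host, long bytes) {
        this.host = host;
        this.bytes = bytes;
    }
}

// A Data Source is simply a generator of GridFTPRecords.
abstract class DataSource {
    // Returns the records that became available since the previous call.
    abstract List<GridFTPRecord> fetchRecords();
}

// Static source: hands over the contents of one logfile exactly once.
class LogfileReader extends DataSource {
    private final List<GridFTPRecord> parsed;
    private boolean consumed = false;

    LogfileReader(List<GridFTPRecord> parsed) {
        this.parsed = parsed;
    }

    @Override
    List<GridFTPRecord> fetchRecords() {
        if (consumed) return new ArrayList<GridFTPRecord>();
        consumed = true;
        return new ArrayList<GridFTPRecord>(parsed);
    }
}

// Live source: buffers records pushed in by dynamic updates.
class UpdateListener extends DataSource {
    private final List<GridFTPRecord> pending = new ArrayList<GridFTPRecord>();

    synchronized void onUpdate(GridFTPRecord record) {
        pending.add(record);
    }

    @Override
    synchronized List<GridFTPRecord> fetchRecords() {
        List<GridFTPRecord> copy = new ArrayList<GridFTPRecord>(pending);
        pending.clear();
        return copy;
    }
}

// Combines the records of any number of sources into one linked list.
class RecordCollector {
    private final LinkedList<GridFTPRecord> records = new LinkedList<GridFTPRecord>();

    void collectFrom(DataSource... sources) {
        for (DataSource source : sources) {
            records.addAll(source.fetchRecords());
        }
    }

    int size() {
        return records.size();
    }

    long totalBytes() {
        long total = 0;
        for (GridFTPRecord r : records) total += r.bytes;
        return total;
    }
}
```

Since the collector only sees the abstract type, a third kind of source could be added without touching the code that processes the records.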
GridFTP Server Logs
The GUSS Service will read one logfile from each server and combine the records from all of them. Each transfer should be recorded in one and only one logfile. (If two records were made of the same transfer, they would likely have different timestamps, and it would therefore be extremely hard to avoid counting the transfer twice.)
A GridFTP transfer can involve up to three hosts: the sender, the receiver, and the host which commanded the transfer, which may be distinct from the other two. Of these three, it seems most reasonable for the receiver of the file to do the logging. It will write into a logfile the time it began receiving the file, the time when the transfer completed, and the number of bytes in the file (to name three of the most basic data). As long as the GUSS service gets this basic data, it can calculate many useful quantities related to total volume of transfer, volume per host, bandwidth between each pair of hosts, etc.
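As a worked example of deriving bandwidth from these three basic values, the Java sketch below divides bytes by elapsed time and aggregates the result per pair of hosts. The Transfer and Throughput classes and their fields are hypothetical, and the figure computed is the average rate over the time spent transferring, which is only one of several reasonable definitions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical receiver-side log entry; the field names are assumptions.
class Transfer {
    final String sender;
    final String receiver;
    final long startMillis; // time the receiver began receiving the file
    final long endMillis;   // time the transfer completed
    final long bytes;       // number of bytes in the file

    Transfer(String sender, String receiver, long startMillis, long endMillis, long bytes) {
        this.sender = sender;
        this.receiver = receiver;
        this.startMillis = startMillis;
        this.endMillis = endMillis;
        this.bytes = bytes;
    }
}

class Throughput {
    // Average rate in bytes per second; 0.0 if the duration is not positive.
    static double bytesPerSecond(long bytes, long durationMillis) {
        if (durationMillis <= 0) return 0.0;
        return bytes * 1000.0 / durationMillis;
    }

    // Average rate for each "sender -> receiver" pair, over all its transfers.
    static Map<String, Double> perHostPair(Iterable<Transfer> transfers) {
        Map<String, long[]> sums = new HashMap<String, long[]>(); // {bytes, millis}
        for (Transfer t : transfers) {
            String key = t.sender + " -> " + t.receiver;
            long[] sum = sums.get(key);
            if (sum == null) {
                sum = new long[2];
                sums.put(key, sum);
            }
            sum[0] += t.bytes;
            sum[1] += t.endMillis - t.startMillis;
        }
        Map<String, Double> rates = new HashMap<String, Double>();
        for (Map.Entry<String, long[]> e : sums.entrySet()) {
            rates.put(e.getKey(), bytesPerSecond(e.getValue()[0], e.getValue()[1]));
        }
        return rates;
    }
}
```

For instance, two transfers from one host to another totalling 1500 bytes over 3000 milliseconds give an average of 500 bytes per second for that pair.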
Obviously, the existing GridFTP servers will need to be modified to add the logging feature. This will happen as part of the Globus Toolkit efforts to enhance the GridFTP server code, which include adding and enhancing its logging functionality. We will work with the GridFTP server team to finalize the contents, format, and location of these logfiles.
Update: The Globus GridFTP server version 3.2.0, when run with the -Z flag, produces logfiles which contain everything that GUSS needs.
The usefulness of GUSS depends on the wide-scale adoption of the logging feature in the new version of the GridFTP server. To encourage adoption, we want to make the upgrade to the new version, and the activation of the logging, as effortless as possible for site administrators.
Dynamic Updates
We would like to be able, eventually, to display usage data in real-time, updating the graph dynamically as transfers are made. We imagine a daemon that watches the server's logfile and broadcasts UDP packets whenever a change occurs. The GUSS grid service could catch these packets and update its graphs accordingly. Since UDP is unreliable, we must fall back on reading each logfile as the basic method of getting static, but reliable, data.
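The receiving side of such a scheme might be sketched in Java as follows, using the standard DatagramSocket API. The payload format used here, a UTF-8 string of the form "host,bytes", is invented for illustration and is not the wire format of the org.globus.usage package; the UpdatePacket and UdpUpdateReceiver classes are likewise hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.nio.charset.StandardCharsets;

// One decoded update; the "host,bytes" payload layout is an assumption.
class UpdatePacket {
    final String host;
    final long bytes;

    UpdatePacket(String host, long bytes) {
        this.host = host;
        this.bytes = bytes;
    }

    // Decodes the first `length` bytes of `data` as a UTF-8 "host,bytes" string.
    static UpdatePacket decode(byte[] data, int length) {
        String text = new String(data, 0, length, StandardCharsets.UTF_8).trim();
        int comma = text.lastIndexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("malformed update: " + text);
        }
        String host = text.substring(0, comma);
        long bytes = Long.parseLong(text.substring(comma + 1));
        return new UpdatePacket(host, bytes);
    }
}

class UdpUpdateReceiver {
    // Blocks until one datagram arrives on the socket, then decodes it.
    static UpdatePacket receiveOne(DatagramSocket socket) throws IOException {
        byte[] buffer = new byte[512];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return UpdatePacket.decode(packet.getData(), packet.getLength());
    }
}
```

A listener thread would call receiveOne in a loop on a socket bound to an agreed port. Because lost datagrams are never retransmitted, the records gathered this way should still be reconciled against the logfiles from time to time.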
Besides packet broadcast, another possibility to consider is to give the GUSS Service a public method to receive a record of a new GridFTP transfer. For this method to work, the GridFTP Servers would have to be able to look up the GUSS service through UDDI or its equivalent, and send the data to all running GUSS services. If there are many GUSS services running at once, a broadcast packet may be a better idea after all.
No matter how they happen, dynamic updates are no use to a stateless web service. As long as the GUSS Service is a basic stateless web service, it has no place to store a list of GridFTPRecords, and so it must re-read the logfiles in response to every request. If it has state, it can read the logfiles once when it comes online, maintain the list of all GridFTPRecords locally, and update this list as dynamic updates come in. This is much more efficient than re-reading the logfiles each time, too, so statefulness is clearly a worthy goal, although tricky to implement at present.
The GUSS Client
The first GUSS client in development is a Portlet designed to work with OGCE's portal. See [wiki:Projects/ANL/GridUsageSensorServices/JsrPortletsAnduPortal JsrPortletsAnduPortal]. Other clients are certainly possible, and the GUSS web service does not need to know the nature of the program making requests. In particular, at some point we may want to create a Swing application (perhaps launched using Java !WebStart) to provide types of interactivity that a Web application cannot.
Each user of the GUSS Client should be able to decide exactly what data is shown in the portlet and how that data is graphed; this way users can choose what will be most useful for their own projects. When the user comes back to the client again, he or she of course expects to see new data displayed in exactly the same format he or she previously specified, and that is what we must deliver.
For a Swing application this sort of persistence is easy, but for a portlet it is more of a challenge. Luckily, the portal itself provides a mechanism by which user preferences and other state can be saved; to support login and customization, all we need to do is interface with the portal's features.
The Front-end Portlet
The portlet is currently being implemented using Apache Jetspeed/Turbine, for compatibility with the OGCE portal (NMI). Specifically, it is a "JSP Portlet", an instance of an existing portlet subclass which simply includes a Java Server Page inside a portal. There is no need to write custom portal code when using this method; only a JSP which provides an HTML form-based interface and which makes calls to the GUSSClient class.
In the future it may be desirable to port GUSS to other portals (uPortal, !WebSphere, etc). Portlet standards are still in a state of flux. Eventually the JSR 168 standard may make portal-independence easy. For the moment, doing most of the work in JSPs seems safe, since JSPs are a well-established standard that can be trivially converted into portal-specific portlets.
A portlet, per se, is not well-suited to presenting dynamically changing data, because the view is only redrawn when the whole HTML page is reloaded. We can set an auto-refresh tag in the portlet's HTML to force it to reload periodically. The problem is that each reload will be a new call to the GUSS web service, which will have to generate a new graph image, which the browser will then have to load. If the data has not actually changed since the last refresh, these steps are a waste of resources. The ideal solution would be to have changes propagate from the data source towards the user -- i.e. when an update was made to the server log, the server log would update the GUSS service, the GUSS service would update the portal, and the portal would update the user's screen. However, this is not possible with the current infrastructure of the web.
Instead, to avoid recreating identical images, the GUSS web service must be able to recognize a new request as identical to an old request. (See the Cacheing section, above). In such case it will return a special "no change" string to indicate that the portlet should continue displaying the old image.
The User-Interface Design
The client must provide a user interface which allows the user to specify any of the types of graph described under Advanced Functionality, above. All of the possible choices map naturally into standard GUI elements (pop-up menus etc) which are easily created with HTML forms for the portlet client, or with Swing components for the standalone client. The user can request the latest data at any time by clicking the "Refresh" button, or he can set the time between automatic updates.
Although graphical data is good for getting an overview of a system, there are times when precise, explicit numerical data is preferred. A pair of radio buttons lets the user choose which of the two to receive.
An "Add Filter" button takes the user to a sub-screen where he can use more menus and text-entry boxes to define an arbitrary filter -- based on dates, file sizes, hosts, or users -- to apply to the data. A list of these filters is stored in user preferences and, when the Refresh button is clicked, all of the filters are converted into a suitable form to be sent as part of the SOAP message to the GUSS Server.
An advanced UI feature to consider for future development is to let the user get more focused information by clicking on parts of the graph itself. For example, a comparison of bandwidth between pairs of hosts might be displayed as a network diagram with values plotted on the links between nodes. A click on one of these links could bring up a detailed histogram of the transfers between those two nodes. This type of thing is much easier to do in a Swing client, but could be done on the Web either as an applet, or as an HTML image-map if appropriate data were returned from the service along with the image.
A Few More Issues
Anonymization
We have no need to know which particular user is transferring which particular file, or to where. In case the user prefers not to share this with the world, we should respect the user's privacy. Therefore we will retain no usernames or filenames. We care only about number of users and number of bytes. The issue of IP addresses is trickier. We need some way of distinguishing the hosts involved in each transfer, but on the other hand, IP addresses are easily traced, and so many users would likely prefer to keep theirs private. One possibility we are considering is to have a central computer at Argonne assign a GUID (Globally Unique ID) to each GridFTP host. When a new host comes online, it would query Argonne and receive a GUID which would thereafter be used to identify it in all transactions. The only party who would know which IP address a GUID represents would be the holder of that IP address, and the holder can choose whether to make this information public or not.
Locating the GUSSService
We will eventually need to use a UDDI registry or the equivalent to let the GUSS Client find the service automatically. For the prototype version, the user needs to figure out the hostname and port number of a running GUSS Service and enter them into the GUSS Client. (This can be done with a configuration file.)
A related question is how the GUSS Service finds the GridFTP server logs. For the prototype version, the GUSS Service has a configuration file containing a contact list of URLs of server logs to read. This is a poor solution, because it requires modification of the contact list whenever a new GFTP host is brought online. What we really want to do, eventually, is to have each running GFTP host enter itself in a registry, and to have the GUSS Service periodically scan the registry to update its contact list.
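For the prototype's configuration-file approach, reading the contact list could look like the Java sketch below, which uses the standard java.util.Properties format. The property name guss.logfiles, the comma-separated layout, and the ContactList class are assumptions for illustration, not the actual GUSS configuration format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class ContactList {
    // Parses the hypothetical property "guss.logfiles", a comma-separated
    // list of logfile URLs, from the text of a properties file.
    static List<String> parse(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new IllegalStateException("could not read configuration", e);
        }
        List<String> urls = new ArrayList<String>();
        String value = props.getProperty("guss.logfiles", "");
        for (String part : value.split(",")) {
            String url = part.trim();
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }
}
```

A registry-based design would replace only this lookup step: the service would rebuild the same list from the registry on a schedule instead of from a static file.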
Timeline
By the end of August:
- (Done) Learn how to use the plotting package ["JFreeChart"] and verify that it can do what we want it to do.
- (Done) Use GridFTP to pull logfiles off of a server.
- (Done) Make a "Hello World" webservice, to make sure I understand how that part will work.
- (Done) Make an "About the Cog" portlet that uses the 'about' bean, and add it to OGCE. This is something we want to have anyway, but it's also a "Hello, Portlet World" for me, and will teach me what I need to know to make the GUSS portlet.
By October
- (Done) Put plotting system together with portlet, make it interface with webservice.
- (Done) Put GridFTP file-fetching code together with webservice and make it parse the logfiles.
- (Done) Finalize the contents and format of the GridFTP servers' logfiles, verify that they are being written.
By Supercomputing (beginning of November)
- (Done) Have a JSP with a presentable user interface.
- (Done) Have a demo setup running and accessible over the web.
- (Done) Have all the features described in Basic Functionality, above, fully debugged and working.
- (Done) Add more display options as time permits.
December 2004
- Turn the JSP into a JSR168 portlet and make it run in uPortal. See [wiki:Projects/ANL/GridUsageSensorServices/JsrPortletsAnduPortal JsrPortletsAnduPortal].
- Make the portal save the user's preferences.
- Use WSRF to associate web service with database
- Improve scalability
- Add statistical graphing modes
By January 2005
- Work on advanced functionality such as real-time updating and efficient cacheing.
- Write thesis!
Feb-July 2005
Add and debug the rest of the advanced functionality. Write documentation sufficient for someone else to be able to take over from me. Add chapters to the manual, for example a "How to Build a Portlet" tutorial, and a chapter on Portlet Security.
Related Work
A subset of related work includes (please add if you know about more):
- Netlogger: A logging methodology and a toolkit for doing it. C-based. Used by parts of the globus project already, it would seem. Includes graphing capability, aimed at the level of individual TCP segments.
- NetloggerLite: [4] Very simple, has java API, lets us put in arbitrary name/value pairs.
- log4j: [5] An easily-included package for debugging java, meant to easily include/exclude hierarchical levels of debugging output.
- WSRP4j: Web Services for Remote Portlets for Java. [6] A project in "incubation" status at Apache, i.e. in very early stages. Yet another web services "standard". The idea is to provide a standard way to integrate web-service-delivered content into a portal without having to program a custom portlet. An intriguing idea, but the requirements of GUSS mean that writing a custom portlet will probably be necessary anyway.
- Network Weather Service: [7] Monitors and attempts to predict quality of service. Implementation exists for Globus and GIS. Senses bandwidth, latency, available CPU, and available memory. Does some really cool stuff with adaptive forecasting (identifies method with lowest cumulative error and forecasts based on that).
- iperf [8] a tool (in C) for profiling throughput and bandwidth of a TCP or UDP connection; for UDP it also shows jitter and datagram loss. Meant for tweaking OS-specific transport-layer settings, but a system for testing Grid performance could conceivably be built around it.
- Grid3: has a really cool page at [9] which shows the current operational status of each machine on their grid. The architecture of their monitoring system is explained at [10]: this system is not built from scratch but is built on top of Ganglia and MonALISA.
- Inca: [11] is a testing framework aimed at verifying interoperability between Virtual Organizations. In an environment with multiple autonomous and heterogeneous VOs, Inca can test and monitor them all to make sure they are keeping up their service agreements. It is being used on the !TeraGrid.
- MonALISA: [12] ("Monitoring Agents with a Large Integrated Services Architecture") is an advanced and very ambitious general framework for monitoring distributed resources. It uses Web Services and JINI Lookup Discovery Services to dynamically discover and register resources. It can dynamically load monitoring modules which can be written to monitor any interesting resource; it can register listeners which will be notified of changes in real-time; it has configurable GUIs; and it can detect new resources and begin monitoring them as they come online. MonALISA will be very impressive if it delivers on everything it promises.
- Condor: [13] The Condor project has a website with maps of "Condor pools" (their term for local Grid installations) around the world. These maps are auto-generated by a program that uses DNS lookup and WHOIS entries to guess at the physical location of an IP address. Although the site makes no claims of geographic accuracy or completeness (e.g., all pools in a foreign country are mapped to that country's capital), it is interesting to see how such geographic patterns can be extracted without human intervention. In order to be included on the map, a Condor pool has to identify itself to the University of Wisconsin-Madison either through e-mail or UDP. (Note to self: look into the UDP protocol used.)
- Grid Monitoring Software: [14], created by the Texas Advanced Computing Center at the University of Texas at Austin, is a Java application that periodically runs a variety of tests on each host. The tests include GRAM availability, job submission, GridFTP, GRIS, and a test for the presence of Network Weather Services data (see above). The results are stored in a web-service-enabled database called the Grid Portals Information Repository.
- Grid X1: [15] A Canadian project. Uses a custom Perl script to periodically run three tests on each Globus Toolkit installation: GRAM Gatekeeper availability, job submission, and GridFTP. The Perl script generates graphs for the website, including a pie chart of the number of CPUs available on each cluster.
- Big Brother: [16] ("The Big Brother Systems and Network Monitor") displays icons representing status conditions for each host on the NEESGrid (Network for Earthquake Engineering Simulation). It monitors connectivity, CPU usage, free disk space, http responsiveness, and other vital information by periodically running simple tests such as ping. This is very similar to the Grid3 site mentioned above.
- RRD Tool: [17] (Round Robin Database Tool)
- TeraGrid's monitoring framework
- Ganglia
- Upshot
- Tau
