Genetic Algorithms on the Grid for Graph Partitioning

From Java CoG Kit

Jump to: navigation, search

Contents

Task Assignments

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac

Project Description

This page is being worked on and reflects an initial state of the project. The project has not yet produced anything at this time.

The task of this project is to design a grid based algorithm to do a genetic algorithm for the graph partitioning problem.

Project Team and Responsibilities

Name E-mail Address Year Department
TBD TBD TBD TBD

Literature

Graph Partitioning

Introduction

In its most general form, the graph partitioning problem asks how best to divide a graph' s vertices into a specified number of subsets such that: (i) the number of vertices per subset is equal and (ii) the number of edges straddling the subsets is minimized. Graph partitioning has several important applications in Computer Science, including VLSI circuit layout, image processing, solving sparse linear systems, computing fill-reducing orderings for sparse matrices, and distributing workloads for parallel computation. Unfortunately, graph partitioning is an NP-hard problem and therefore all known algorithms for generating partitions merely return approximations to the optimal solution. In spite of this theoretical limitation, numerous algorithms for graph partitioning have been developed during the past decade that generate high-quality partitions in very little time.

One of the fundamental problems that must be addressed in every parallel application is that of work-load distribution — the distribution of data and computation across a processor set. Optimal distributions minimize an application' s overall runtime, typically by ensuring that each processor has an equal amount of work while minimizing the parallel overhead induced by the distribution (most notably in the guise of interprocessor communication). Some applications are easy to distribute.


Recursive Bisection

An instance of graph partitioning that deserves special attention is the graph bisection problem. This is simply a variation on graph partitioning in which P must be divided into two subsets. Although bisection seems considerably easier than general H -way partitioning, it is still NP-hard. Most H -way partitioning algorithms utilize a divide-and-conquer approach known as recursive bisection.This technique generates a H -way partition by performing a bisection on the original graph and then recursively considering the resulting subgraphs. It has been shown that even if recursive bisection is performed using an optimal bisection algorithm, it can still result in a suboptimal Q -way partition. In spite of this theoretical limitation, recursive bisection remains the primary graph partitioning strategy due to its simplicity compared to computing Q -way partitions directly.


Weighted graph partitioning problem

The weighted graph partitioning problem allows weights to be associated with the vertices and edges of R . In this problem, a good partition is one in which the total vertex weight of each subset is roughly equal, and the total weight of the cut edges is minimized. In the context of workload graphs, node weights can be used to encode differing computation expenses across the graph (e.g., boundary locations vs. internal locations). Similarly, edge weights can signify the volume of communication required between two nodes.


Grid Computing

Introduction

Grid computing is an emerging computing model that provides the ability to perform higher throughput computing by taking advantage of many networked computers to model a virtual computer architecture that is able to distribute process execution across a parallel infrastructure. Grids use the resources of many separate computers connected by a network (usually the Internet) to solve large-scale computation problems. Grids provide the ability to perform computations on large data sets, by breaking them down into many smaller ones, or provide the ability to perform many more computations at once than would be possible on a single computer, by modeling a parallel division of labour between processes.


Origins

Like the Internet, the Grid concept evolved from the computational needs of "big science". The Internet was developed to meet the need for a common communication medium between large, federally funded computing centers. These communication links led to resource and information sharing between these centers and eventually to provide access to them for additional users. Ad hoc resource sharing 'procedures' among these original groups pointed the way toward standardization of the protocols needed to communicate between any administrative domain. The current Grid technology can be viewed as an extension or application of this framework to create a more generic resource sharing context.


Common features

Grid computing offers a model for solving massive computational problems by making use of the unused resources (CPU cycles and/or disk storage) of large numbers of disparate computers, often desktop computers, treated as a virtual cluster embedded in a distributed telecommunications infrastructure. Grid computing's focus on the ability to support computation across administrative domains sets it apart from traditional computer clusters or traditional distributed computing.

Grids offer a way to solve Grand Challenge problems like protein folding, financial modelling, earthquake simulation and climate/weather modelling. Grids offer a way of using the information technology resources optimally inside an organisation. They also provide a means for offering information technology as a utility bureau for commercial and non-commercial clients, with those clients paying only for what they use, as with electricity or water.

Grid computing has the design goal of solving problems too big for any single supercomputer, whilst retaining the flexibility to work on multiple smaller problems. Thus Grid computing provides a multi-user environment. Its secondary aims are better exploitation of available computing power and catering for the intermittent demands of large computational exercises.

This approach implies the use of secure authorization techniques to allow remote users to control computing resources. Grid computing involves sharing heterogeneous resources (based on different platforms, hardware/software architectures, and computer languages), located in different places belonging to different administrative domains over a network using open standards. In short, it involves virtualizing computing resources.

Grid computing is often confused with cluster computing. The key difference is that a cluster is a single set of nodes sitting in one location, while a Grid is composed of many clusters and other kinds of resources (e.g. networks, storage factilities).


Genetic Algorithms

The following extracts have been taken from: http://cs.felk.cvut.cz/~xobitko/ga/ (c) Marek Obitko, 1998

Introduction

Genetic algorithms are inspired by Darwin's theory of evolution. Solution to a problem solved by genetic algorithms uses an evolutionary process (it is evolved).

Algorithm begins with a set of solutions (represented by chromosomes) called population. Solutions from one population are taken and used to form a new population. This is motivated by a hope, that the new population will be better than the old one. Solutions which are then selected to form new solutions (offspring) are selected according to their fitness - the more suitable they are the more chances they have to reproduce.

This is repeated until some condition (for example number of populations or improvement of the best solution) is satisfied.

A genetic algorithm (GA) is a search technique used in computer science to find approximate solutions to optimization and search problems. Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, natural selection, and recombination (or crossover).


Outline of the Basic Genetic Algorithm


1. [Start] Generate random population of n chromosomes (suitable solutions for the problem)

2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population

3. [New population] Create a new population by repeating following steps until the new population is complete

--3.1. [Selection] Select two parent chromosomes from a population according to their fitness (the better fitness, the bigger chance to be selected)

--3.2. [Crossover] With a crossover probability cross over the parents to form new offspring (children). If no crossover was performed, offspring is the exact copy of parents.

--3.3. [Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome).

--3.4. [Accepting] Place new offspring in the new population

4. [Replace] Use new generated population for a further run of the algorithm

5. [Test] If the end condition is satisfied, stop, and return the best solution in current population

6. [Loop] Go to step 2


Encoding of a Chromosome

A chromosome should in some way contain information about solution that it represents. The most used way of encoding is a binary string. A chromosome then could look like this:

Chromosome 1 1 1 0 1 1 0 0 1 0 0 1 1 0 1 1 0
Chromosome 2 1 0 0 1 0 1 1 0 0 0 0 1 1 1 1 0

Each chromosome is represented by a binary string. Each bit in the string can represent some characteristics of the solution. Another possibility is that the whole string can represent a number. One may encode directly integer or real numbers, sometimes it is useful to encode some permutations and so on.


Crossover

After we have decided what encoding we will use, we can proceed to crossover operation. Crossover operates on selected genes from parent chromosomes and creates new offspring. The simplest way how to do that is to choose randomly some crossover point and copy everything before this point from the first parent and then copy everything after the crossover point from the other parent.

Crossover can be illustrated as follows: ( / is the crossover point):

Chromosome 1 1 1 0 1 1 / 0 0 1 0 0 1 1 0 1 1 0
Chromosome 2 1 0 0 1 0 / 1 1 0 0 0 0 1 1 1 1 0
Offspring 1 1 1 0 1 1 / 1 1 0 0 0 0 1 1 1 1 0
Offspring 2 1 0 0 1 0 / 0 0 1 0 0 1 1 0 1 1 0

There are other ways how to make crossover, for example we can choose more crossover points. Crossover can be quite complicated and depends mainly on the encoding of chromosomes. Specific crossover made for a specific problem can improve performance of the genetic algorithm.


Mutation

After a crossover is performed, mutation takes place. Mutation is intended to prevent falling of all solutions in the population into a local optimum of the solved problem. Mutation operation randomly changes the offspring resulted from crossover. In case of binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation can be then illustrated as follows:

Original offspring 1 1 1 0 1 1 1 1 0 0 0 0 1 1 1 1 0
Original offspring 2 1 0 0 1 0 0 0 1 0 0 1 1 0 1 1 0
Mutated offspring 1 1 1 0 0 1 1 1 0 0 0 0 1 1 1 1 0
Mutated offspring 2 1 0 0 1 0 0 1 1 0 0 1 1 0 1 1 0

The technique of mutation (as well as crossover) depends mainly on the encoding of chromosomes. For example when we are encoding permutations, mutation could be performed as an exchange of two genes.

Code

The Java CoG Kit contains multiple interfaces and framework for Graphs. These codes have not been designed by this group.

  1. The Karajan Graph Editor Prototype http://www.cogkit.org/viewcvs/viewcvs.cgi/src/cog/modules/grapheditor/src/org/globus/cog/gui/. It allows specification of Graphs as listed in the example directory at http://www.cogkit.org/viewcvs/viewcvs.cgi/src/cog/modules/grapheditor/examples/
  2. A simple graph API at http://www.cogkit.org/viewcvs/viewcvs.cgi/src/cog/modules/util/src/org/globus/cog/util/graph/
  3. Some layout algorithms for Graphs at

http://www.cogkit.org/viewcvs/viewcvs.cgi/src/cog/modules/grapheditor/src/org/globus/cog/gui/grapheditor/canvas/views/layouts/

The future code for this project will be located at

  1. http://svn.sourceforge.net/viewcvs.cgi/cogkit/trunk/src/cog/modules/partioning/

References

  • Chapter 1.5.2 The Graph Patition Problem, in The Stony Brook Algorithm Repository, Steven S. Skiena, Department of Computer Science, State University of New York
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK4/NODE177.HTM
  • Lecture notes on Graph Partitioning, Part 1, UC Berkeley, CS267: Lectures 20 and 21, Mar 21, 1996 and Apr 2, 1996
http://www.cs.berkeley.edu/~demmel/cs267/lecture18/lecture18.html
  • B. W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal, pages 291-307, 1970.
  • Graph Partitioning Algorithms for Distributing Workloads of Parallel Computations, Bradford L. Chamberlain
This paper surveys graph partitioning algorithms used for parallel computing, with an emphasis on the problem of distributing workloads for parallel computations. Geometric, structural, and refinement-based algorithms are described and contrasted. In addition, multilevel partitioning techniques and issues related to parallel partitioning are addressed. All algorithms are evaluated qualitatively in terms of their execution speed and ability to generate partitions with small separators. http://www.cs.washington.edu/homes/brad/cv/pubs/degree/generals.pdf
  • Introduction to Genetic Algorithms with Java applets, (c) Marek Obitko, 1998
These pages introduce some of the fundamentals of genetics algorithms. These pages are intended to be used for learning about genetics algorithms without any previous knowledge from this area. Only some knowledge of computer programming is assumed. Several interactive Java applets have been included to demonstrate basic concepts of genetic algorithms. http://cs.felk.cvut.cz/~xobitko/ga/

Progress

Communication Rules

  • All e-mails on this project must have the prefix gridga: otherwise it will be automatically deleted by Prof. Gregor's spam filter and the mail will not be read. If one, for some reason fails to do this himself, the follow up mail must include the string gridga: in the reply.
  • Additionally, signatures/fortunes appended to e-mails (especially the longer signatures) may cause the mail filter to flag the mail as spam. Please excercise due caution while use e-mail.
  • the mails should not be directed to Gregor but to team@cogkit.org

Timeline

  • Start Date : 15 June 2006
  • Weekly updates on saturdays by midnight (Indian Standard Time)
  • The weekly updates are usually checked by Sundays noon CST by Gregor


16th July - 22nd July 2006

  • Things achieved
    • Developed Karajan Examples for the "HTML Elements" and "Logic Elements"
    • Checked In these examples into SVN and updated links on the Java CoG Kit Karajan Almanac Page

9th July - 15th July 2006

  • Things achieved
    • Developed Karajan Examples for the "Form Elements" and "List Manipulations Elements"
    • Checked In these examples into SVN and updated links on the Java CoG Kit Karajan Almanac Page

3rd July - 8th July 2006

  • Things achieved
    • Developed Karajan Examples for "Arithmetic Elements"
    • Updated links on the Java CoG Kit Karajan Almanac page for both commanline and gui versions
  • Things to be done in the coming week
    • Development of more examples for the Karajan Almanac

24th June - 2nd July 2006

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows

Karajan demonstration of successful run of the Karajan GUI desktop

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.

  • Things Achieved
    • Succeeful run of Karajan Examples
    • Creation of Documentation page for setting up work environment (corrections to come by the end of the week)
  • Things not achieved (Tasks for this week)
    • Development of workflows as part of karajan almanac (updates to come in the middle of next week)

17th June - 24th June 2006

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.
[x] below are references made to "instalation steps documentation". I would like that this documentation be checked into the svn and a direct link (GATeam)
[x] The files have been removed by Gregor as the contents is now on the Wiki which is a much better place for it.

Documentation for setting up working environment
  • Things achieved
    • CoG Kit Installation (including installation steps documentation)
    • Karajan Installation (including installation steps documentation)
    • Installation of WinCVS
    • Apache Ant 1.6.5 downloaded
    • Cygwin installed
  • Things not achieved and to be done in the coming week
    • Familiarization with Karajan and study of examples
    • Installation steps documentation of SSHclient and WinCVS

15th June - 17th June 2006

  • Things achieved
    • TortoiseSVN and SSHclient installed
    • 30 % of the SVN book read by all team members. All members have the basic idea about SVN.
    • J2SE 5.0 installed by all members
  • Things not achieved and to be done in the coming week
    • JAVA COG kit and Karajan installation
    • familiarization with karajan and study of examples
    • completion of the SVN book

Update Guidelines

  • After the start date a weekly progress report must be done that answers three simple questions.
    • What have you done this week?
    • What have you not achieved this week?
    • What will you do next week?

Activities and results prior to June 15

Things not achieved:

  • No review of the Graph capabilities in the CoG Kit have been conducted
  • No use of the Karajan engine has been conducted
  • No learning of software engeneering tools that are essential for remote collaboration has been conducted
  • No instalation of JDK 1.5 has been conducted, JDK 1.4 is unsuitable for this project
  • Based on the later the assumption is that the study of generics has not been conducted
  • The pseudocode is not available to the rest of the team, so we can not give proper guidance.
  • The collection of sample programs with links to them on the web page is not completed.

Things achived:

  • (incomplete, you need 1.5) JSDK 1.4 installed by all team members.
  • (incomplete, not available to us) Collected some more sample grpahs.
  • (ok) Some Graph partitioning algorithms in C++ are collected and are currently under study.
  • (incomplete, code not available to us) AddNode, RemoveNode , AddVertex, AddEdge, RemoveEdge functions developed for creation of graphs.
  • (incomplete) CoG Kit still not installed.
  • (incomplete, not available to us) Written pseudocode for a few functions required for graph partitioning algorithm.
  • (incomplete, links and pointers to these codes are missing) Collected some sample programs in Java on graphs in general. Currently developing the functions to suit our requirements.

List of tasks

Due to the dealy of theeffective starting date of the project to June 1st, several tasks that we wanted to assign to the team prior to the main phase were not completed and the assignment was done two weeks ago.

The project has now several phases. Please add new tasks and phases as they become apparent.


Phase 0: Getting Started

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.
[x] below are references made to "instalation steps documentation". I would like that this documentation be checked into the svn and a direct link (GATeam)
[x] The files have been removed by Gregor as the contents is now on the Wiki which is a much better place for it.
[x] Each team member understands how to do wiki tasks. See how i do tasks here (GATeam)
[x] Added postal addresses to this page (GATeam)
[x] Cleaned up the page and made it nice looking. (GATeam)
[x] Inform Gregor about what OS system you work on. (GATeam)
[x] Install cygwin with openssh, svn, make, and cvs checked on (GATeam)
[x] Install JDK 1.5 and ant (GATeam)
[x] Download and install Karajan. This is best done from the commandline and not throug JCreator. (GATeam)
[x] Understand what SVN is.
[x] Find location in SVN where the Karajan Almanac examples should be placed (GATeam)
[x] Prepare a documentation in this page that describes how to set up all the useful tools a developer would want for developing within the CoG Kit project. E.g. svn, cvs, ant, Java 1.5, ..., firefox and google toolbar for editing wiki pages ...

Small updates on documentation to appear soon on the documentation page

Phase I: Learn how to use karajan. Help developing documented examples.

We will be using the karajan workflow engine to coordinate large scale GA algorithms in a Grid. However in orderto achive this the following needs to be conducted:

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.
[x] below are references made to "instalation steps documentation". I would like that this documentation be checked into the svn and a direct link (GATeam)
[x] The files have been removed by Gregor as the contents is now on the Wiki which is a much better place for it.
[x] Each team member understands how to do wiki tasks. See how i do tasks here (GATeam)
[x] Added postal addresses to this page (GATeam)
[x] Cleaned up the page and made it nice looking. (GATeam)
[x] Inform Gregor about what OS system you work on. (GATeam)
[x] Install cygwin with openssh, svn, make, and cvs checked on (GATeam)
[x] Install JDK 1.5 and ant (GATeam)
[x] Download and install Karajan. This is best done from the commandline and not throug JCreator. (GATeam)
[x] Understand what SVN is.
[x] Find location in SVN where the Karajan Almanac examples should be placed (GATeam)
[x] Prepare a documentation in this page that describes how to set up all the useful tools a developer would want for developing within the CoG Kit project. E.g. svn, cvs, ant, Java 1.5, ..., firefox and google toolbar for editing wiki pages ...
[x] Finish Phase 0 only than proceed (GATeam)
[x] Make sure the team members have a working environment in which they can execute karajan (GATeam)
[x] check out CoG Kit from the CVS, compile it and develop simple karajan program to see if it works (GATeam)
[ ] produce a fairly complete set of examples as part of the almanac for Karajan. Make sure you document the examples. (GATeam)

Phase II Algorithms: In this phase you develop Graph algorithms that will be reused by the GA

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.
[x] below are references made to "instalation steps documentation". I would like that this documentation be checked into the svn and a direct link (GATeam)
[x] The files have been removed by Gregor as the contents is now on the Wiki which is a much better place for it.
[x] Each team member understands how to do wiki tasks. See how i do tasks here (GATeam)
[x] Added postal addresses to this page (GATeam)
[x] Cleaned up the page and made it nice looking. (GATeam)
[x] Inform Gregor about what OS system you work on. (GATeam)
[x] Install cygwin with openssh, svn, make, and cvs checked on (GATeam)
[x] Install JDK 1.5 and ant (GATeam)
[x] Download and install Karajan. This is best done from the commandline and not throug JCreator. (GATeam)
[x] Understand what SVN is.
[x] Find location in SVN where the Karajan Almanac examples should be placed (GATeam)
[x] Prepare a documentation in this page that describes how to set up all the useful tools a developer would want for developing within the CoG Kit project. E.g. svn, cvs, ant, Java 1.5, ..., firefox and google toolbar for editing wiki pages ...
[x] Finish Phase 0 only than proceed (GATeam)
[x] Make sure the team members have a working environment in which they can execute karajan (GATeam)
[x] check out CoG Kit from the CVS, compile it and develop simple karajan program to see if it works (GATeam)
[ ] produce a fairly complete set of examples as part of the almanac for Karajan. Make sure you document the examples. (GATeam)
[ ] Understand graph contraction (GATeam)
[ ] Develop a framework for reducing a graph while using a partition vector. Let v be a vector indicating for each node in which partition the node is. Develop a program that delivers a new graph that only contains a graph of size equal to the partitions. Make sure a method is available, to contract and to expand back.
[ ] generate a distance graph (e.g. all nodes have a maximum distance from each other when mapped into a 2d plane) of size 20000 and test the algorithm while producing a) partitions with close by neighbors and reducing the graph)

Phase III: Portions of phase III can be done in paralllel to Phase II.

Clever assignment amongst the team members will allow this.

[ ] This project starts June 16. Project team member will work the first week on building a "Karajan Almanac". After this task is completed members will have sufficient knowledge about Karajan. The beginning of the Almanac can be found at Java CoG Kit Karajan Almanac
[x] The team must demonstrate that they can run succesfully Karajan workflows
[x] The team must demonstrate that they can develop at least ten self developed "sequential" workflows and make them available as part of the Karajan almanac
[ ] A project review will be conducted on the material that this team has checked into the SVN. Note: This review was supposed to be conducted this week, but we are unable to judge your progress as the team is still working on setting up their enviornment. Hence, we will postpone the review based on what will be delivered next week. At this time we are feaing that the project goals may not be achievable by August.
[x] The team must work on a slightly improved version of http://wiki.cogkit.org/index.php/Karajan_demonstration.
[x] below are references made to "instalation steps documentation". I would like that this documentation be checked into the svn and a direct link (GATeam)
[x] The files have been removed by Gregor as the contents is now on the Wiki which is a much better place for it.
[x] Each team member understands how to do wiki tasks. See how i do tasks here (GATeam)
[x] Added postal addresses to this page (GATeam)
[x] Cleaned up the page and made it nice looking. (GATeam)
[x] Inform Gregor about what OS system you work on. (GATeam)
[x] Install cygwin with openssh, svn, make, and cvs checked on (GATeam)
[x] Install JDK 1.5 and ant (GATeam)
[x] Download and install Karajan. This is best done from the commandline and not throug JCreator. (GATeam)
[x] Understand what SVN is.
[x] Find location in SVN where the Karajan Almanac examples should be placed (GATeam)
[x] Prepare a documentation in this page that describes how to set up all the useful tools a developer would want for developing within the CoG Kit project. E.g. svn, cvs, ant, Java 1.5, ..., firefox and google toolbar for editing wiki pages ...
[x] Finish Phase 0 only than proceed (GATeam)
[x] Make sure the team members have a working environment in which they can execute karajan (GATeam)
[x] check out CoG Kit from the CVS, compile it and develop simple karajan program to see if it works (GATeam)
[ ] produce a fairly complete set of examples as part of the almanac for Karajan. Make sure you document the examples. (GATeam)
[ ] Understand graph contraction (GATeam)
[ ] Develop a framework for reducing a graph while using a partition vector. Let v be a vector indicating for each node in which partition the node is. Develop a program that delivers a new graph that only contains a graph of size equal to the partitions. Make sure a method is available, to contract and to expand back.
[ ] generate a distance graph (e.g. all nodes have a maximum distance from each other when mapped into a 2d plane) of size 20000 and test the algorithm while producing a) partitions with close by neighbors and reducing the graph)
[ ] Finish phase II before you start this phase (GATeam)
[ ] Develop java program to do graph partitioning (GATeam)
[ ] Find test graphs for graph partitioning (GATeam)
[ ] Develop program that creates random solution (GATeam)
[ ] Develop kernighan lin algorithm for solution (GATeam)
[ ] Develop graph contraction algorithm (GATeam)
[ ] Combind graph contraction with kernighan lin (GATeam)
[ ] Develop simple GA (GATeam)
[ ] Use karajan workflow to stage multiple parallel solutions (GATeam)
[ ] Fetch results back (GATeam)
[ ] Develop a mechanism to join results (GATeam)

Documentation

This section contains a brief documentation that describes how to set up the various tools a developer would want for developing within the CoG Kit project.

For the Documentation, please refer to Setting Up Working Environment for Java CoG Kit Developers

Random Notes

http://wiki.mcs.anl.gov/gregor/index.php/Gregor_von_Laszewski#las91icga

Phase I: Build a framework based on the Java CoG Kit (or develop an enhancement) that allows to submit a problem to a remote "Calculation server". A calculation is defined by

a) an initial message that defines the calculation f through a signed jar file essentially it gives you a function

(r_1, ..., r_n) = f (i_1, .....,i_m)

lets assume we use vector notation R = f (I)

we would formally define this first.

b) a loop that accepts incoming "sets" of things that need to be calculated. Once calculated the results are stored or send automatically to a datacollector e.g. foreach I=set do R_i = f(I) do

c) integration of new compute resources.

d) a comparison to boinc and how it differs.

e) the implementation of a task to be performed. (this could be as simple as a montecarlo pi calculation, but a GA would be much cooler.

Team Members' System Information

Operating Systems/Platforms :

All team members use either Microsoft® Windows XP® Professional or Home Edition

Contact Addresses

Kamal Agarwal

Residence Address:

Kamal Agarwal
A-1/265 , First Floor,
Safdarjung Enclave, 
New Delhi - 110029,
India
Mobile : +91-9810419518
lotus.ns5@gmail.com

Departmental Address:

Undergraduate 
Electronics And Communication Engineering
Bharati Vidyapeeth's College of Engineering
A-4, Paschim Vihar
New Delhi-110063
India

Ashish Anand

Residence Address:

Ashish Anand
D-5A, Second Floor,
Vijay Nagar, 
New Delhi - 110009,
India
Mobile : +91-9811855873
anand_ashish45@hotmail.com

Departmental Address:

Undergraduate 
Department of Computer Science
Bharati Vidyapeeth's College of Engineering
A-4, Paschim Vihar
New Delhi-110063
India

Amandeep Singh

Residence Address:

Amandeep Singh
C-24 Subhadra Colony,
Sarai Rohilla 
Delhi - 110035,
India
Mobile : +91-9810648229
amandeep_sabharwal@yahoo.com

Departmental Address:

Undergraduate 
Department of Computer Science
Bharati Vidyapeeth's College of Engineering
A-4, Paschim Vihar
New Delhi-110063
India
Personal tools
Collaboration and Jobs