Karajan:Thoughts on automatic parallelization
From Java CoG Kit
Note1: this is somewhat trivial, but may be useful
Note2: automatic parallelization has not been implemented
Note3: this is a draft
Contents |
Types of elements
Elements in Karajan can be separated into 3 distinct categories based on behavior:
- Pure elements (no side-effects) - pureX()
- Elements with local side-effects (
set) - localseX() - Elements with global side-effects (
execute,transfer, etc.) - globalseX()
Elements with global side effects include all elements that do direct I/O (Note:printis not one of them).
User defined elements with at least one global side-effect element inherit the global side-effect property.
element(notpure, [] pure1() globalse1() pure2() )
Here notpure is not pure, regardless of what it is applied to.
The application of a pure element to an element with a global side-effect has a global side-effect.
pure1(pure2())
produces no side-effects, while
pure1(globalse1())
produces a global side effect.
By contrast, local side effects are not inherited, whether in a definition body
element(pure, [] pure1() localse1() pure2() localse2() )
or the arguments
pure( localse1() pure1() localse2() )
In certain cases it may be desirable to force otherwise non-pure elements to be considered pure. It may be the case with, for example, an element that transfers a remote file to the local machine, counts the number of lines in it and returns the result. Such an element is not pure. There can be no guarantee of consistency if two instances of it are executed in parallel with the same argument. However, through the use of programmed safe-guards, such as transfer to a unique temporary file, an acceptable margin of "purity" can be achieved.
Normally, when no explicit parallelism is used (i.e. parallel), elements are executed in lexical sequence. For an arbitrary parse tree, this implies a post-order traversal of the tree. In particular, user defined elements are executed strictly (the body after all the arguments). Let us call this sequential order. Certain exceptions apply:
-
if,choiceand some others are non-strict -
formay execute some sub-elements (nodes) repeatedly -
elementdoes not execute sub-elements at the time of definition
For elements with strict behavior, including user defined elements, safe parallelization can be achieved provided that the following are kept true:
- return values of pure elements are lexically ordered (
paralleldoes it automatically)
- local side-effects are executed in sequential order within their immediate parent scope with respect to: each other AND consecutive sequences of pure elements
- global side-effects are executed in sequential order regardless of any other considerations
Examples:
f( pure1() pure2() ... puren() )
can be safely transformed into:
f(
parallel(
pure1()
pure2()
...
puren()
)
)
f( localse1() pure1() pure2() localse2() pure3() )
can be safely transformed into:
f(
localse1()
parallel(
pure1()
pure2()
)
localse2()
pure3()
)
Parallelization of for
The parallelization of for can be achieved only if no sub-element has a global side-effect, and can be done by replacement with parallelFor
Element parallelization (call-by-future)
If an element is pure then it can safely be defined as a parallelElement. Even if it may be applied to global side-effect producing elements, since no global side-effect elements exist in the body of the definition, the sequential order of execution of the global side-effect elements cannot be affected.
It is however necessary for parallelElement to then return all arguments on channels not handled by the element in sequential order, a feature which is not currently implemented.
Possible improvements
Notes
1. Version 0.33 introduced commutative channels, for which the order of the arguments is inconsequential, and which are not visible at the user level. For examplesum uses a commutative channel. It is not necessary to maintain the lexical argument order of return values on commutative channels. Most parallel elements (such as parallel) are aware of commutative channels and will not attempt to keep the lexical ordering of return values on them.
