Karajan2
From Java CoG Kit
Contents |
Introduction
Karajan2 is a continuation of some of the ideas in Karajan. New features include:
- Karajan2 is compiled to Java bytecode instead of being interpreted
- It has lexical scoping ("variables you see are the variables you get")
- As opposed to Karajan, Karajan2 has a type system with the following characteristics:
- Types are first class entities, but they are dealt with at compile time (so it's a static type system)
- Type inference (type annotations can always be specified in order to ensure correctness, but in many cases it is unnecessary to do so)
- Type safe casting: there is no explicit type casting; when a test for a type is done, such as in an if condition, the compiler can infer the types based on the branch taken: in the positive branch the tested variable will always have the tested type, and in the negative branch it will never have that type
- No nullable types: the "null" value can be used to represent guards; however it has its own type and whenever a null value can be present, it must be declared (or inferred); the compiler will then require explicit testing for null whenever a possibly-null value is accessed.
- Block arguments, which are a way of expressing macro-like features in a type-safe and (arguably) intuitive way
- Tail-recursion optimization
Old features that were kept:
- Lightweight threading
- Concurrency operators, futures, channels
- The task (Globus and such) library
Download
Currently Karajan2 is available in SVN at: https://cogkit.svn.sourceforge.net/svnroot/cogkit/branches/karajan2/
Compiling
If you want to compile Karajan2 with the abstraction library:
cd src/cog/modules/abstraction-karajan ant dist
Otherwise:
cd src/cog/modules/karajan2 ant dist
Status
- There is a working compiler, which probably breaks for certain tricky things
- A few libraries are available (Sys, Task, List, Channel, XML, String)
Features in more detail
Types
Types are expressed in a style similar to Pascal. With type inference they are optional in most cases. The following are equivalent:
n := 1
n: Integer = 1
When defining a function, the inference engine cannot always constrain the type of the arguments based on the function body. In such cases, type annotations are necessary. For example, there are (in the Sys library) 3 different '+' operators, namely:
'+'(Integer, Integer) -> Integer '+'(Real, Real) -> Real '+'(String, String) -> String
where the '->' operator is used to specify the return type of a function.
The following declaration will result in a compile-time error because the inference engine cannot tell which of the three '+' operator the body of the function is referring to:
import(Sys)
f := function(x, y) {
x + y
}
In order to avoid the error, one must specify the types of x and y:
import(Sys)
f := function(x: Integer, y: Integer) {
x + y
}
In general, the return type of a function can be successfully inferred by the inference engine, with the exception of recursive functions:
import(Sys)
fact := function(n) -> Integer {
if (n > 1, n * fact(n - 1), 1)
}
Here, the type of n can be inferred from the comparison operation ('>'(Integer, Integer) -> Integer), but having both the type of n and the return type of fact unspecified leaves the multiplication ambiguous. Arguably the constraint that n be an Integer (due to the comparison) can be extended to the multiplication operation, but there's the question of how much smartness from the system is required to make the system error-prone or hard to understand. At this point such situations require a type declaration.
Parametric Types
Certain things benefit from not being bound to a specific type at compile-time. Such is the case with lists. The type of a list can be specified using the List(Type) constructor:
import(Sys), import(List)
l: List(Integer) = [1, 2, 3]
// or
g := List(Integer)(1, 2, 3)
print("l: ", l, "g: ", g)
Type Disambiguation
Type disambiguation is scheme to avoid type casting when it would be needed. Take a function (arguably a useless one) that can square a number, which can be either an Integer or a Real. One would probably write two functions and use polymorphism to deal with the issue, but for the sake of simplicity of this example, let's believe that this is the right approach:
sqr := function(x: Integer | Real) {
x * x
}
That would result in a compile-time error, because the compiler would not know whether * refers to the Integer multiplication or Real multiplication. The following, however, would work:
sqr := function(x: Integer | Real) {
if (typeOf(x) == Integer) {
x * x //*(Integer, Integer) -> Integer
}
else {
x * x //*(Real, Real) -> Real
}
}
That works because when the type test is provided as part of the conditional expression, the compiler knows that the only way the true branch can be reached is when x is an Integer, and the only way the false branch can be reached is when x is Real (the parameter declaration restricts it to just these two types).
A more realistic situation for type disambiguation is when using guards in variable-sized constructs. Such as a linked list in which the end of the list is marked by a "special" value. The generic special value in Karajan2 is null. Unlike some other languages (C, Pascal, Ada, Python), the null value does not have a universal type (i.e. it can be assigned to objects of (almost) any type), but it's own type named Null. This means that functions whose parameters can be null must explicitly state so in the type declaration of that parameter. Furthermore, this prevents de-reference of null values. Note however that the Null type is not the same as the Nothing type (denoted as ()); the Null type has an arity of 1, whereas Nothing has an arity of 0. So a size function on the linked list may look like this:
size := function(l: LList | Null) -> Integer {
if (l == null) {
0
}
else {
1 + size(l::next)
}
}
Somewhat similar to the sqr example, the compiler knows that the negative branch of the if can only be reached when l is not null and therefore (due to it being able to be only null or a valid LList) the ::next dereference is valid and on a LList value. Without the test, the compiler would throw an error complaining that the ::next dereference does not apply to the (LList | Null) type.
Currying
Your standard currying:
addn := function(n) {
function(m) {
n + m
}
}
addtwo := addn(2)
print(addtwo(3)) // => 5
Block Arguments
This is best done through examples:
ifeq := function(n: Integer, m: Integer, b:Block) {
if (n == m) {
b()
}
}
print(
ifeq(2, 2) {
"2 equals 2"
}
}
The type of the values returned by ifeq depend on the types returned by the block argument. In the above example, the type of the ifeq expression that is an argument to print is String | Nothing is determined by first evaluating the type of the block argument to ifeq, which is String, and then passing that through the body of ifeq where the if can either return the type of the positive branch (String in this case) or the type of the negative branch (Nothing in this case).
