Index of /dev/null

Is LINQ functional?

With its 3.5 extensions, the .NET Framework started to turn into a really cool-looking programming concept, not least due to the syntactic sugar of LINQ. One reason for that is surely its functional look. Well, as LINQ is embedded in an imperative context, it will never be able to guarantee state-free evaluation the way a genuine functional language does. Nevertheless, it's worth discussing and playing around with a few aspects of it in terms of a multi-paradigm programming concept.

Delegating definitions in C# 3.0

First, the concept of first-class functions, i.e. the introduction of a function type, leads to the notion of closures. So, for instance, a constant function such as

Func<int> i = () => 1;

defines something like a readonly variable. You may get its value now, later, or never, but you can always be sure that its value will never be changed anywhere in your code. Hence, this weird piece of code has won you a quantum of control over your program. That's a basic idea of functional programming.
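For contrast, a lambda may also capture a local variable; the resulting closure then always sees the variable's current value. A quick sketch of my own, not part of the original snippet:

```csharp
// using System;
int counter = 0;
Func<int> one = () => 1;           // constant: no captured state at all
Func<int> current = () => counter; // closure: captures counter itself
counter = 42;
Console.WriteLine(one());     // always 1
Console.WriteLine(current()); // 42, since the closure sees the change
```

So the constant function above really is the degenerate case of a closure: one that captures nothing and therefore can never surprise you.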

The concept of function types leads to higher-order functions, i.e. functions mapping functions to other functions. Consider, for instance, currying, a key concept in the theory of functional programming:

curry: (X × Y → Z) → (X → Y → Z)

That is, for any function f(x,y), there is a curried function curry(f)(x) taking x to a function g(y) = f(x,y). This is easily implemented in C# using generic types:

static Func<X, Func<Y, Z>> Curry<X, Y, Z>(Func<X, Y, Z> f)
{
  return x => y => f(x, y);
}

(inspired by this C# abuse of the day). Well, that's of more or less academic interest, since one would hardly ever replace x++ by

x = Curry<int, int, int>((a, b) => a + b)(1)(x); // x++ ;)

A slightly more interesting example is the following:

// using System.Text.RegularExpressions;
var grep = Curry<Regex, IEnumerable<string>, IEnumerable<string>>(
  (regex, list) => from s in list
                   where regex.Match(s).Success
                   select s);
var grepFoo = grep(new Regex("foo"));

Thus, grepFoo will grep all words containing "foo" from a word list. Attention should be paid to the fact that after the statement

var fooList = grepFoo(new string[]{"foo", "bar", "foobar"});

no regex has been applied yet. Indeed, fooList is of type IEnumerable<string> and has not been enumerated at this point. So the evaluation of the expression is deferred until its result is needed by another computation, which smells like lazy evaluation.
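The deferral is easy to make visible: modify the data source after defining the query but before enumerating it. A little experiment of my own, using a hypothetical word list:

```csharp
// using System.Collections.Generic; using System.Linq;
var words = new List<string> { "foo", "bar" };
var query = from s in words where s.Contains("foo") select s;
words.Add("foobar");   // change the source *after* defining the query
int n = query.Count(); // 2: the filter runs only now, on the current list
```

The query captured the list itself, not a snapshot of it, so the late-coming "foobar" is found anyway.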

LINQ is not lazy!

One of the most important paradigms of functional programming is the concept of lazy evaluation. For instance, in a functional language, such as the good old Haskell, an expression such as

length [1, 2, 3/0]

evaluates to 3. That is, the runtime is too lazy to fail on the division by zero, neither at compile time nor at run time, since it doesn't need to know any element inside the list in order to calculate its length. In C# (where an expression such as 1/0 won't even compile), you may write

var q1 = from i in (IEnumerable<int>)new int[] { 1, 2, 3 }
         select 1/(i - 3);

without getting a run-time error. But this has nothing to do with lazy evaluation, since the query expression isn't evaluated at all at this point (in contrast to the array definition inside the query); it is simply treated as a function definition. However, as soon as an aggregation expression such as

int three = q1.Count();

is reached, a DivideByZeroException is thrown. Thus, LINQ evaluates eagerly here, not lazily. On the other hand,

int two = q1.Take(2).Count();

works fine, since the black hole stays unevaluated thanks to the Take operator. But having

var q2 = from i in (IEnumerable<int>)new int[] { 1, 2, 3 }
         select 1/(i - 1);
int two2 = q2.Skip(1).Count();

instead, you will (guess what!) catch the exception again. Thus, in contrast to the Take operator, the Skip operator does iterate through the skipped elements and hence evaluates them. OK, that's no surprise, since these operators use the IEnumerator provided by the corresponding IEnumerable. So, LINQ pretends to be lazy in the sense that

var p = q2.Reverse();

won't be evaluated at this point and thus doesn't fail, whereas

int two3 = p.Take(2).Count();

then throws the exception again, even though the evil element shouldn't be taken here: Reverse has to buffer the whole source sequence before it can yield anything.

A functional approach to force laziness would be to replace value expressions by constant functions, but the compiler won't accept something like this:

// The type of the expression in the select clause is incorrect.
// Type inference failed in the call to 'Select'.
var q1_ = from i in (IEnumerable<int>)new int[] { 1, 2, 3 }
          select () => 1 / (i - 3);
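For the record, the type inference can be satisfied with an explicit cast. This workaround is mine, not the original author's, but it shows that element-wise laziness can be forced by hand:

```csharp
// using System; using System.Linq;
var q1_ = from i in new[] { 1, 2, 3 }
          select (Func<int>)(() => 1 / (i - 3));
int three = q1_.Count();  // 3, without any exception...
int zero = q1_.First()(); // 0, i.e. 1/(1-3) in integer division
// ...but invoking the *last* delegate would still divide by zero
```

Counting the delegates never invokes them, so this behaves just like Haskell's length over the black-hole list above.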

Hence, LINQ isn't lazy, but it has a smart way of making function definitions look like statement expressions.

Diving into recursion

Remember the famous Fibonacci numbers:

fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).

The sequence starts with

fibs = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

where fib(100) is already a number consisting of 21 digits, so the sequence grows quite fast. Although one may calculate Fibonacci numbers in constant time using Binet's formula, the recursive definition leads to interesting comparisons of different recursion strategies.
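Binet's formula, fib(n) = round(φⁿ/√5), translates into a tiny method. Note that double precision keeps the result exact only up to roughly n = 70; the method name is mine:

```csharp
// using System;
static long FibBinet(int n)
{
  double sqrt5 = Math.Sqrt(5.0);
  double phi = (1.0 + sqrt5) / 2.0; // the golden ratio
  return (long)Math.Round(Math.Pow(phi, n) / sqrt5);
}
```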

Well, let's have a

delegate long Fibonacci(int n);

A direct translation of the definition into a recursive lambda looks like this:

Fibonacci fib1 = null; // pre-assigned for use within recursion
fib1 = n => n <= 1 ? n : fib1(n - 1) + fib1(n - 2);

The funny thing about this implementation is that the Fibonacci function itself determines its own run time: it's O(fib(n)), i.e. lower values will be recalculated again and again in order to get a higher one, due to the lack of an aggregating strategy.

Now, in Haskell you may get around this very elegantly by defining an infinite list:

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

The list is initialized with two elements. Then, notionally, the tail function drops the first element from the fibs list, while zipWith (+) creates a new list by adding the elements of fibs and (tail fibs) pairwise. But in practice, Haskell is smart and lazy enough to avoid any needless recalculation of numbers already present in the fibs list. Thus, the algorithm applied here is the same one a human being would spontaneously apply using a pencil and a scrap of paper. So, it's O(n).

To define an infinite list in C#, one could implement the IEnumerable interface in such a way that the corresponding IEnumerator expands the list on demand within its MoveNext() method. Here, it's enough to have a little inline helper, taking a list and an expanding function to a Fibonacci delegate:

Func<
  IEnumerable<long>,
  Func<IEnumerable<long>, IEnumerable<long>>,
  Fibonacci> infList = null;
infList = (list, exp) => n => n < list.Count() ?
  list.Skip(n).First() : infList(exp(list), exp)(n);

Now, C# also provides a Zip operator. So, a straightforward syntactic translation of the Haskell list would look like this:

Func<IEnumerable<long>, IEnumerable<long>> fibZip = fibs =>
  fibs.Take(2).Concat(fibs.Zip(fibs.Skip(1), (x, y) => x + y));

Hm, but this one is even worse than the naive recursion. Indeed, trying

Fibonacci fib2 = infList(new long[] { 0, 1 }, fibZip);

then, you will see that aggregation doesn't work at all this way, since the concept of enumeration is not functional. We may repair fibZip as follows:

Func<IEnumerable<long>, IEnumerable<long>> fibZip2 = fibs =>
  fibs.Concat((IEnumerable<long>)(new long[] {
    fibs.Skip(fibs.Count() - 2).Sum() }));

This one looks a bit weird, since it's not that easy to extend an IEnumerable by one element. Anyway,

Fibonacci fib3 = infList(new long[] { 0, 1 }, fibZip2);

indeed does the job in O(n) expansion steps then (though each step re-enumerates the list), even though the idea of an infinite list has lost its magic this way.
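For comparison, the pencil-and-paper algorithm mentioned above reads imperatively like this. A plain sketch of my own, keeping only the last two numbers instead of a growing list:

```csharp
static long FibIterative(int n)
{
  long a = 0, b = 1; // fib(0) and fib(1)
  for (int k = 0; k < n; k++)
  {
    long next = a + b;
    a = b; // shift the two-element window one step to the right
    b = next;
  }
  return a;
}
```

No laziness, no higher-order functions, but O(n) time and O(1) space, which is exactly what Haskell's lazy fibs achieves behind the scenes.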

Conclusion

As expected, neither C# nor LINQ turns out to implement the paradigms of a functional language. Nonetheless, it's really fancy. 8-)
