c#

Don’t blow your stack: C# trampolining

A couple of weeks ago someone posted two links on twitter suggesting the trampolining superiority of Clojure to C#. (Can’t find that tweet anymore, and yes, that tweet was what finally motivated me to migrate my blog so i could post again.)

Well, the comparison wasn’t really apples to apples. The C# article “Jumping the trampoline in C# – Stack-friendly recursion” is trying to do a lot more than the Clojure article “Understanding the Clojure `trampoline’”. But maybe that’s the point as well: that with C# people end up building much more complex, verbose machinery when the simple style of Clojure is all that is needed. That determination wanders into the subjective, where language feuds are forged, and i’ll avoid that diversion this time around.

What struck me more was that you can do trampolining a lot more simply in C# if we mimic the Clojure example given. Let’s start by reproducing the stack overflow handicapped example first:

Func<int, int> funA, funB = null;
funA = n => n == 0 ? 0 : funB(--n);
funB = n => n == 0 ? 0 : funA(--n);

I’d dare say that the Lambda syntax is more compact than the Clojure example :) Ok, ok, the body is artificially small, allowing me to replace it with a single expression, which isn’t a realistic scenario. Suffice it to say, you can get quite compact with expressions in C#.

Now, if we call funA(4), we’d get the same call sequence as in Clojure, i.e.

funA(4) -> funB(3) -> funA(2) -> funB(1) -> funA(0) -> 0

And if you, instead, call funA(100000), you’ll get a StackOverflowException.

So far so good, but here is where we diverge from Clojure. Clojure is dynamically typed, so it can return a number or an anonymous function that produces that number. We can’t do that (lest we return object, ick), but we can come pretty close.

Bring in the Trampoline

The idea behind the Trampoline is that it unrolls the recursive calls into sequential calls, by having the functions involved return either a value or a continuation. The trampoline simply does a loop that keeps executing returned continuations until it gets a value, at which point it exits with that value.

What we need for C# then is a return value that can hold either a value or a continuation and with generics, we can create one class to cover this use case universally.

public class TrampolineValue<T> {
    public readonly T Value;
    public readonly Func<TrampolineValue<T>> Continuation;

    public TrampolineValue(T v) { Value = v; }
    public TrampolineValue(Func<TrampolineValue<T>> fn) { Continuation = fn; }
}

Basically it’s a container for either a value T or a func that produces a new value container. Now we can build our Trampoline:

public static class Trampoline {
    public static T Invoke<T>(TrampolineValue<T> value) {
        while(value.Continuation != null) {
            value = value.Continuation();
        }
        return value.Value;
    }
}

Let’s revisit our original example of two functions calling each other:

Func<int, TrampolineValue<int>> funA, funB = null;
funA = (n) => n == 0
  ? new TrampolineValue<int>(0)
  : new TrampolineValue<int>(() => funB(--n));
funB = (n) => n == 0
  ? new TrampolineValue<int>(0)
  : new TrampolineValue<int>(() => funA(--n));

Instead of returning an int, we simply return a TrampolineResult<int> instead. Now we can invoke funA without worrying about stack overflows like this:

Trampoline.Invoke(funA(100000));

Voila, the stack problem is gone. It may require a bit more plumbing than a dynamic solution, but not a lot more than adding type declarations, which will always be the syntactic differences between statically and dynamically typed. But with lambdas and inference, it doesn’t have to be much more verbose.

But wait, there is more!

Using Trampoline.Invoke with TrampolineValue<T> is a fairly faithful translation of the Clojure example, but it doesn’t feel natural for C# and actually introduces needless verbosity. It’s functional rather than object-oriented, which C# can handle but it’s not its best face.

What TrampolineValue<T> and its invocation really represent are a lazily evaluated value. We really don’t care about the intermediaries, nor the plumbing required to handle it.

What we want is for funA to return a value. Whether that is the final value or lazily executes into the final value on examination is secondary. Whether or not TrampolineValue<T> contains a value or a continuation shouldn’t be our concern, neither should passing it to the plumbing that knows what to do about it.

So let’s internalize all this into a new return type, Lazy<T>:

public class Lazy<T> {
    private readonly Func<Lazy<T>> _continuation;
    private readonly T _value;

    public Lazy(T value) { _value = value; }
    public Lazy(Func<Lazy<T>> continuation) { _continuation = continuation; }

    public T Value {
        get {
            var lazy = this;
            while(lazy._continuation != null) {
                lazy = lazy._continuation();
            }
            return lazy._value;
        }
    }
}

The code for funA and funB is almost identical, simply replacing TrampolineValue with Lazy:

Func<int, Lazy<int>> funA, funB = null;
funA = (n) => n == 0
  ? new Lazy<int>(0)
  : new Lazy<int>(() => funB(--n));
funB = (n) => n == 0
  ? new Lazy<int>(0)
  : new Lazy<int>(() => funA(--n));

And since the stackless chaining of continuations is encapsulated by Lazy, we can simply invoke it with:

var result = funA(100000).Value;

This completely hides the difference between a Lazy<T> that needs to have its continuation triggered and one that already has a value. Now that’s concise and easy to understand.

By arne on | .net, geek | 6 comments
Tags: , ,

Duckpond: Lightweight duck-typing for C#

Edit: Changed As to AsImplementationOf since it’s an extension method on object and too likely to collide.

A while back I was talking about Interface Segregation and proposed using either LinFu’s DynamicObject or delegate injection. While I played a bit more with delegate injection, in practical use delegate injection turned out to be rather ugly and not really improve readability.

So, I’ve come back to wanting to cast an object to an interface regardless of what interfaces that object implemented. I wanted this to be as simple and lightweight as possible, so rather than using a dynamic proxy framework, i simply rolled my own IL and wrote a pure proxy that does nothing but call the identical method on the class it wraps.

Introducing DuckPond

DuckPond is a very simple and focused library. It currently adds only a single extension method: object.AsImplementationOf<Interface>

Given a class Duck that implements a number of methods, including Quack:

public class Duck {
  public void Quack(double decibels) {
    ...
  }

  //... various other methods ...
}

we can easily cast Duck to a more limited interface that the class doesn’t implement such as:

public interface IQuacker {
  void Quack(double decibels);
}

using the object.AsImplementationOf<T> extension method:

using Droog.DuckPond;

...

var quacker = new Duck().AsImplementationOf<IQuacker>();

That’s all there is to it.

But is it fast?

Honestly, i don’t know yet. I have not benchmarked the generated classes against virtual method dispatches or LinFu’s and Castle’s dynamic proxy. I assume it is, since unlike with dyanmic proxy, DuckPond doesn’t use an interceptor. Instead it emits Intermediate Language for each call in the interface, dispatching the call against the wrapped instance’s counterpart.

Try it, fork it, let me know what you think

The code is available now at GitHub: http://github.com/sdether/duckpond

By arne on | .net, geek | 2 comments
Tags: , , ,

Linq2MongoDB: Building a Linq Provider for MongDB

This weekend has been a hack-a-thon, trying to build a simple linq provider for MongoDB. I’m using Sam Corder, et al.’s excellent C# MongoDB Driver as the query pipeline, so my provider really is just a translator from Linq syntax to Mongo Document Query syntax. I call it a hack-a-thon, because it’s my first linq provider attempt and, boy, is that query translator state machine ugly already. However, I am covering every bit of syntax with tests, so that once i understand it all better, i can rewrite the translator in a cleaner fashion.

My goals for this provider is to replace a document storage layer i’ve built for a new notify.me project using NHibernate against mysql. This is in no way a judgment against NHibernate. It just happens that for this project, my schema is a heavily denormalized json document database. While fluent NHibernate made it a breeze to let me map it into mysql, it’s really an abuse of an RDBMS. It was a case of prototyping with what you know, but now it’s time to evaluate whether a document database is the way to go.

Replacing existing NHibernate code does mean that, eventually, i want the provider to work with POCO entities and use a fully strong-typed query syntax. But that layer will be built on top of the string-key based version i’m building right now. The string-key based version will be the primary layer, so that you never loose any of the schema-less flexibility of MongoDB, unless you choose to.

Basic MongoDB queries

So, lacking an entity with named properties to map against, what does the syntax look like right now? First thing we need is an IQueryable<Document> which is created like this:

var mongo = new Mongo();
var queryable = mongo["db"]["collection"].AsQueryable();

Given the queryable, the queries can be built using the Document indexer like this:

var q = from d in queryable where (string)d["foo"] == "bar" select d;

The Document returns an object, which means a cast is unfortunately required on one side of the conditional. Alternatively, Equals, either the static or instance version, also works, alleviating the need for a cast:

var q = from d in queryable where Equals(d["foo"], "bar") select d;
// OR
var q = from d in queryable where d["foo"].Equals("bar") select d;

Better, but it’s not as nice as operator syntax would be, if we could get rid of the casts..

As it turns out there is a number of query operators in MongoDB that don’t have an equivalent syntax in Linq, so a helper class to generate query expression was already needed. The helper is instantiated via the Document extension method .Key(key), giving us the opportunity to overload operators for the various types recognized by MongoDB’s BSON. This allows for the following conditional syntax:

var q = from d in queryable
        where d.Key("type") == "customer" &&
              d.Key("created") >= DateTime.Parse("2009/09/27")
              d.Key("status") != "inactive"
        select d;

IN and NOT IN

In addition to normal conditional operators, the query expression helper class also defines IN and NOT IN syntax:

var in = from d in queryable where d.Key("foo").In("bar", "baz") select d;

var notIn = from d in queryable where d.Key("foo").NotIn("bar", "baz") select d;

The helper will be the point of extension to support more of MongoDB’s syntax, so that most query definitions will use the d.Key(key) syntax.

findOne, limit and skip

Linq has matching counter parts of MongoDB’s findOne(), limit() and skip(), in First or FirstOrDefault, Take and Skip respectively, and the current version of Linq provider already supports them.

What’s missing?

There is a lot in Linq that will likely never be supported, since MongoDB is not a relational DB. That means joins, sub-queries, etc. will not covered by the provider. Anything that does map to MongoDB’s capabilities, though, will be added over time. The low hanging fruit are Count() and order by, with group by following thereafter.

Surprisingly, || (or conditionals) are not going to happen as fast, since aside from or type queries using the .In syntax, it is not directly supported by MongoDB. In order to perform || queries, the query has to be written as a javascript function, which would basically mean that as soon as a single || shows up in the where clause the query translato would have to rewrite all other conditions in javascript as well. So, that’s a bit more on the nice to have end of the spectrum of priorities.

Ready to go!

I will most likely concentrate on the low hanging fruit and then work on the POCO query layer next, since my goal is to be able to try out MongoDB as an alternative to my NHibernate code.

All that said, the code described above works now and is ready for some test driving. It’s currently only in my branch on github, but I hope it will make it into the master soon.

About Concurrent Podcast #3: Coroutines

Posted a new episode of the Concurrent Podcast over on the MindTouch developer blog. This time Steve and I delve into Coroutines, a programming pattern we use extensively in MindTouch 2009 and one that i’m also trying out as an alternative to my actor based Xmpp code in Notify.me.

Since there isn’t a native coroutine framework in C#, we’re using the one provided by MindTouch Dream. It’s built on top of the .NET iterator pattern (i.e. IEnumerable and yield) and makes the assumption that all Coroutines are asynchronous methods using Dream’s Result<T> object for coordinating the producer and consumer of a return values. Steve’s previously blogged about Result. Since those posts there’s also been a lot of performance improvements and capability improvements to Result committed to trunk, primarily providing robust cancellation with resource cleanup callbacks. For background on coroutines, you can also check out previous posts I’vee written.

The cool thing about asynchronous coroutines compared to an actor model is that call/response based actions can be written as a single linear block of code, rather than separate message handlers whose contiguous flow can only be determined by examining the message dispatcher. With a message dispatcher that can correlate message responses with suspended coroutines, sending and waiting for a message in a coroutine can be made to look like a method call without blocking the thread, which, especially with message passing concurrency, is vital, since a response isnn’t in any way guaranteed to happen.

I’m due to write another post on how to use Dream’s coroutine framework, but in the meantime i highly recommend checking out Dream from mindtouch’s svn. Lot’s of cool concurrency stuff in there. trunkis under heavy development, as we work towards Dream profile 2.0, but 1.7.0 is stable and production proven.

Composing remote and local linq queries

One of the cool things with Linq is that queries are composable, i.e. you can add further query constraints by selecting from an existing query. Nothing is executed until you try to read from the query. This allows IQueryable to compose all the added constraints and transform it into the underlying query structure, most commonly SQL.

However it does come with the pitfall that there are a lot of things legal in Linq expressions that will die at run time. This happens because an expression may not have an equivalent syntax in the transformed language, like calling a method as part of a where clause.

This does not mean that you can’t use linq for the portion of the query that is not executable by the provider. As long as you know what expression is affected, you can use query composition to build a query that executes some part remotely and some part against the object model in memory.

Let’s suppose we wish to execute the following query:

var q = from e in session.Queryable<Entry>()
    where e.Created > DateTime.Parse("2009/9/1")
        &&  e.Created < DateTime.Now
        && e.Tags.Contains('foo')
    select e;

But our query provider doesn’t understand the extension method that allows us to check the list of Tags. In order for this query to work, that portion must be executed against the result set of the date range query. We could coerce the first portion to a list or array and then query that portion, but that would just force the date query to be materialized before we could prune the set. Instead we want to feed the stream of matching entries into a second query, composing a new query that contains both portions as a single query and won’t access the database until we iterate over it.

To accomplish this I created an extension method that coerces the query into a sequence that yields each item as it is returned by the database query:

public static class LinqAdapter {
    public static IEnumerable<T> AsSequence<T>(this IEnumerable<T> enumerable) {
        foreach(var item in enumerable) {
            yield return item;
        }
    }
}

UPDATE: As Scott points out in the comments, my AsSequence just re-implements what is already available in Ling as AsEnumerable. So the above really just serves to explain how AsEnumerable defers execution to enumeration rather than query definition.

Anyway, AsSequence or AsEnumerable allows me to compose the query from server and local expressions like this:

var q = from e in session.Queryable<Entry>()
    where e.Created > DateTime.Parse("2009/9/1")
        &&  e.Created < DateTime.Now
    select e;
q = from e in q.AsSequence() where e.Tags.Contains('foo') select e;

When q is enumerated, the first expression is converted to SQL and executes against the database. Each item returned from the database is then fed into the second query, which checks its expression and yields the item to the caller, should the expression match. Since q.AsSequence() is used as part of query composition, it does not force the first expression to execute at the time of query definition as q.ToList() would. The additional benefit is that even when q.AsSequence() is executed, it never builds the entire result set in memory as a list to iterate over, but rather just streams each database query result item through its own expression evaluation.

Of course, this still have the performance implications of sending data across the wire and filtering it locally. However, this is not an uncommon problem when SQL alone cannot provide all the filtering. The benefit of this approach is reduced memory pressure on execution, better control when execution occurs and the ability to use Linq syntax to do the secondary filtering.

By arne on | .net | 2 comments
Tags: ,

Concurrent Podcast and Producer/Consumer approaches

As usual, I’ve been blogging over on the MindTouch Developer blog, and since the topics i post about over there have a pretty strong overlap with what I’d post here, I figured i might as well start cross-posting about it here.

Aside from various technical posts, Steve Bjork and I have started recording a Podcast about concurrent programming. It’s currently 2 episodes strong, with a third one coming soon. Information on past and future posts can always be found here.

Today’s post on the MindTouch dev blog is about the producer/consumer pattern and how i moved from using dedicated workers with a blocking queue to using Dream’s new ElasticThreaPool to dispatch work.

Writing a Tag Algebra compiler with Coco/R

This week, i was digging back into the Coco/R implemented parser of DekiScript, tracking down a bug, which turned out to not be in the parsing bits at all. It did, however, get me familiarized with Coco/R again. So I thought i’d give myself an hour to implement the parser for my Tag related boolean algebra with Coco/R. If i could pull it off, forget about the regex/state-machine approach i was considering.

Took me about 15 minutes to set up the classes to represent the intermediate AST and another 30 minutes for the grammar in Coco/R ATG format. After that I wrote a couple of unit tests to check that the parsing was right, only to realize that while AND and OR are left-to-right associative, NOT is right-to-left associative. Figuring out how to adjust the grammar for that actually took me another 10-15 minutes. But overall, I hit the one hour goal.

The Syntax Tree

Before tackling the grammar to parse, I needed to define data structures to represent the parsed syntax tree, which I’d later convert to executable code. The syntax is fairly simple:

(foo+bar)|(^foo+baz)

This can be represented by just 4 tree node types (with common parent TagExpression:

  • AndExpression
  • OrExpression
  • NotExpression
  • Tag

The parentheses are just tree structure artifacts, so are not represented in the AST.

The Coco/R grammar

I’ve broken the grammar up to discuss the various parts, but the below sections represent a single ATG file, the Coco/R grammar format.

COMPILER TagAlgebra

  public TagExpression Result = TagExpression.Empty;

The COMPILER defines the entrypoint PRODUCTIONfor the parser. The following lines. until the next grammar definition, are inserted into the generated Parser, and can be used to inject class fields, extra methods, etc. into the Parser source. The only thing I inserted was a field to hold the root of the AST and initialize it to empty.

IGNORECASE

This tells Coco/R that our grammar is case insensitive.

CHARACTERS
  tab = '\t'.
  eol = '\n'.
  cr = '\r'.
  nbsp = '\u00a0'. // 0xA0 = 'unbreakable space'
  shy = '\u00ad'.  // 0xAD = 'soft-hyphen'
  letter = 'a'..'z'.
  digit = '0'..'9'.

CHARACTERS defines characters that the parser should recognize for matches.

TOKENS
  tag =
    letter { letter | digit | "_" | ":" }
    .

The only token in the grammar are tag, which is composed from the characters defined above and extra quoted characters.

IGNORE eol + cr + tab + nbsp + shy

IGNORE tells Coco/R what characters have no meaning in parsing the input.

Next come the PRODUCTIONS, i.e. the meat of the grammar. These are the rules for matching input and converting it into code. Coco/R is an LL(1) parser generator, i.e. the grammar must be parsable from Left to Right with Left-canonical derivations and one look-ahead symbol. We also cannot have a loop in our grammar, i.e all possible branches have to lead to a terminal via a unique set of production matches.

PRODUCTIONS

  TagAlgebra                      (. Result = TagExpression.Empty; .)
    =
    [ BinaryExpr<out Result> ]
    .

The first production, is the entry point which, again, sets the result to an empty AST, since the same instance of the parser can parse multiple expressions. It then specifies 0 or 1 BinaryExpr productions.

  BinaryExpr<out TagExpression expr>  (.  expr = null; TagExpression right = null; .)
    =
    NotExpr<out expr> {               (. bool and = false; .)
      (
        "|"
        | "+"                         (. and = true; .)
      )
      NotExpr<out right>              (. expr = and ? (TagExpression)new AndExpression(expr,right) : new OrExpression(expr,right); .)
    }
    .

BinaryExpr is used for both AND and OR expressions, since both take two sub-expressions and combine them with a single operator. The production specifies a left side of a NotExpr followed by an optional operator and another NotExpr. I.e. should our first match happen to not be a BinaryExpr, the parser can fall through to NotExpr return its result instead, without matching the optional production body.

  NotExpr<out TagExpression expr> (. expr = null; int notCount = 0; .)
    =
    {
      "^"                         (. notCount++; .)
    }
    (
      "(" BinaryExpr<out expr> ")"
      | tag                       (. expr = new Tag(t.val); .)
    )                             (. for(var i=0;i<notCount;i++) expr = new NotExpression(expr); .)
    .

END TagAlgebra.

NotExpr, just like BinaryExpr optionally matches on its operator, requiring only that the end of the operation it matches either a BinaryExpr enclosed by parentheses (i.e. not a circular match back into BinaryExpr, since it requires additional surrounding matches), or it matches a tag, the ultimate terminal in the grammar.

There is one tricky bit in this production, i.e. the NOT operator can match multiple times, which means, we need to accumulate the number of operator matches and build the chain of NOT expressions wrapping the current expression once we know how many, if any, matched.

What to do with the AST

The nice thing with Coco/R is that it adds no runtime dependency at all, building a fully self-contained Scanner and Parser. With these built, it is now possible to take Tag Algebra expressions and turn them into an executable tree of Func calls, as described in “Boolean Algebra for Tag queries“.

The grammar could have been written to accumulate the unique tags and construct the Func tree right away, but the two benefits of going to an AST first, is that a)the AST can easily be rendered back into text form (even placing parentheses properly for expressions that previously had not parentheses), and b) the AST can easily be programatically composed with other expressions, or decomposed into sub-expressions, which can be used for caching and other efficiency operations.

I’ll probably play with Irony next, but the “no runtime dependency” and existing familiarity made Coco/R the winner this time.

By arne on | .net, geek | A comment?
Tags: , , , , , ,

Boolean Algebra for Tag queries

Currently working on a tag query engine and couldn’t find anything all that useful in the published approaches. I want to do arbitrary boolean algebras against a set of tags in a database, which seems to be out of scope of SQL approaches. All the various tagging schemas out there reduce to either just AND or just OR queries, but not complex logic. However, I want to be able to do something like:

(foo+bar)|(foo+baz)|(bar+^baz)

If there is a way to do this with SQL, i’d love to know. But the way i look at it, i really have to fetch all tags for each item and then do apply that formula to the list of tags on the item.

But let’s break down the matching problem itself into something i can execute. Let’s assume I’ve got a simple parser that can turn the above into an AST. Really, i can decompose any variation into three operations, AND(a,b), OR(a,b) and NOT(a). And I can represent those with some simple Func<> definitions:

Func<bool, bool, bool> AND = (a, b) => a && b;
Func<bool, bool, bool> OR = (a, b) => a || b;
Func<bool, bool> NOT = (a) => !a;

Assuming that i have boolean tokens for foo, bar and baz, the expressions becomes:

OR(AND(foo, bar), OR(AND(foo, baz), AND(bar,NOT(baz))))

Now, the above expression can be expressed as a function that takes three booleans describing the presence of the mentioned tags, ignoring any other tags that the item has, returning a boolean indicating a successful match. In C# that expression would look like this:

Func<bool[], bool> f = x => OR(AND(x[0], x[1]), OR(AND(x[0], x[2]), AND(x[1],NOT(x[2]))));

Next we need to generate this boolean map from the list of tags on the item. Assuming that the tag list is a list of strings, we can define an extension methods on IEnumerable<string> to generate the boolean map like this:

public static bool[] GetBinaryMap(this IEnumerable<string> tags, string[] mask) {
    var map = new bool[mask.Length];
    foreach(var x in tags) {
        for(var i = 0; i < mask.Length; i++) {
            if(x == mask[i]) {
                map[i] = true;
            }
        }
    }
    return map;
}

And with this we can define a linq query that will return us all matching items:

var mask = new[] { "foo", "bar", "baz"};
Func<bool[], bool> f = x => OR(AND(x[0], x[1]), OR(AND(x[0], x[2]), AND(x[1],NOT(x[2]))));
var match = from item in items
            where f(item.Tags.GetBinaryMap(mask))
            select item;

Clearly this is isn’t the fastest executing query, since we first had to create our items, each item in which has a collection of tags. But there is a lot of optimizations left on the table here, such as using our tag mask to pre-qualify items, breaking down the AST into sub-matches that could be used against a cache to find items, etc.

But at least we have a fairly simple way to take complex boolean algebra on tags and convert them into something that we can evaluate generically

By arne on | .net, geek | A comment?
Tags: , , , ,

Ultimate Interface Segregation: Dependency injection by Delegate

I’ve been on a bit of a tear about declaring dependency contracts and injecting only what is required. While examining the use of Interfaces in IoC and their shortcomings, I decided that taken to the extreme, dependencies come down to call dependencies, which could be modeled with delegates rather than interfaces. Instead of writing a novel, as I’ve been prone to, i thought I’d do a shorter post on my approach to this solution, and expand on the implementation in later posts.

To recap, in the SOLID principles, the Interface Segregation Principle states: Clients should not be forced to depend upon interfaces that they do not use. This means that interfaces should be fine-grained enough to expose no more than one responsibility. Taken to the extreme, this could be taken to mean that each interface only has a single method. There are valid SRP scenarios where a responsibility is modeled by more than one call, but let’s start with the simplest scenario first, then see how well it applies to more complex responsibilities later.

In C# we have delegates, which describe a single method call. A delegate instance is a reference to a method that encapsulates a specific instance of a class, without exposing the underlying class (unless your delegate is a static method). A delegate can even be used to expose internal, protected and private methods.

Instead of declaring a list of interfaces that the IoC container should inject, classes would define their dependencies as delegates. Taking the example from my duck typing post, we would get the following dependency declarations.

First, we have the same service provider, MessageQueue, which still doesn’t need to implement an interface:

public class MessageQueue
{
 public void Enqueue(string recipient, string message) { ... }
 public string TryDequeue(string recipient) { ... }
}

Next, we have the new Producer, now declaring its dependency has a delegate:

public class Producer : IProducer
{
 public delegate void EnqueueDelegate(string recipient, string message);
 public Producer(EnqueueDelegate dispatcher) { ... }
}

And finally, we have the new Consumer, also declaring a delegate for construction time injection:

public class Consumer : IConsumer
{
 public delegate string TryDequeueDelegate(string recipient);
 public Consumer(TryDequeueDelegate inbox) { ... }
}

Think of the delegate as your Method Interface. You could define your dependencies as Func‘s and Action‘s, but that would obfuscate your dependencies beyond recognition in most scenarios. By using an explicit delegate, you get to attach the dependency to the class that has the dependency, in addition to having a descriptive signature.

Now, if we were to wire this up manually we’d get something like this:

var queue = new MessageQueue();
IProducer producer = new Producer(queue.Enqueue);
IConsumer consumer = new Consumer(queue.TryDequeue);

That’s simple enough, but not really very scalable, once you get a lot of dependencies to wire up. What we really need is an IoC container that let’s us register delegates against classes, instead of having to have instances at dependency declaration time. Delegates can’t be cast from one to another and are not, strictly speaking, types, which posts some challenges with creating a type-safe registration interface. There are a number of ways to accomplish this syntax, which I will elaborate on in my next post.

By arne on | geek | A comment?
Tags: , , ,

C# duck-typing to the rescue for Interface Segregation

Interfaces are the method by which we get multiple inheritance in C#, Java, etc. They are contracts without implementation. We don’t get the messy resolution of which code to use from multiple base classes, because there’s only one inheritance chain that includes code.

They’re useful to let us provide one contract and multiple implementations or simply describe a contract for our code and allow someone else to come along and replace our implementation entirely.

In practice, though, I almost always use them purely for decoupling the contract from the code, so that I can replace the implementation with a mock version in unit tests. Hmm.. So, I use interfaces just to get around the yoke of the type system? Wait, why I am so in love with statically typed languages again?

Right… While the above sounds like the interface exists just so that I can mock my implementation, the real purpose of the interface is the ability for the consuming code to express a contract for its dependencies. It’s unfortunate that interface implementation forces them to be attached at the implementation side, which is why I say that Interface attachment is upside down. And it’s deep rooted in the language, after all it’s called Interface Inheritance not Dependency Contract Definition.

Dynamic Proxy

So, it’s not surprising that there is no CLR-level way around this limitation. Fortunately, you can always create a dynamic proxy that wraps your class and implements the Interface. Both Castle‘s DynamicProxy and LinFu‘s DynamicProxy are excellent frameworks for writing your own proxies. I’ve never tested them against each other, but have used both in production and neither showed up as culprits when time for profiling came about.

With a dynamic proxy, you can generate an object that claims to implement an interface but under the hood just has a interceptors that provide the call signature to let you respond correctly or proxy the call on to a class you are wrapping. I’ve previously covered how you can even use them to have a class inherit from more than one base class via a proxy. This is necessary if you want to remote an object, which requires a base of MarshalByRefObject, but you already have a base class.

However, proxies require a fair bit of hand-rolling so they are not the most lightweight way, development time wise, to attach an Interface.

Duck Typing

What would be really useful would be the ability to cast an object to an interface:

IUser user = new User() as IUser;

The above code would even be compile time verifiable, since we can simply see if the User implements the call signatures promised by IUser. This would be provide us strongly typed Duck Typing — an object that can quack ought to be able to be treated as a duck.

This is where LinFu goes a step further than just DynamicProxy and provides duck typing as well:

IUser user = DynamicObject(new User()).CreateDuck<IUser>();

DynamicObject‘s constructor takes an instance of a class to wrap. You can then create a duck from that dynamic object which automatically proxies the given interface and will call the appropriate method on the wrapped class on demand.

Using duck typing to satisfy the Interface Segregation Principle

Saying that you may have a class that has the perfect method signature but doesn’t implement an interface you already have, does sound rather contrived. However, forcing a class to implement an interface of your choosing does have some real benefits, aside from being able to abstract an existing class into a mockable dependency:

Clients should not be forced to depend upon interfaces that they do not use

One problem with interfaces is that they tell you everything an implementation can do. And often a class acts as a service that provides functionality to more than one client class, but provides just a single interface. That single interface may expose capabilities that you don’t care about.

Instead, interfaces should be fine-grained to only include the methods appropriate to the client. But that’s not always feasible. Aside from having a class implement lots of tiny interfaces, the service class does not know about the client’s requirements, so it really doesn’t know what the interfaces should include. The client, on the other hand, does know and can tailor exactly the right interface it wants as a contract with the dependency.

Suppose we have message queue for passing data between decoupled classes:

public class MessageQueue
{
 public void Enqueue(string recipient, string message) { ... }
 public string TryDequeue(string recipient) { ... }
}

Proper interface segregation would have us create a Dispatcher interface for our message Producer

public interface IMessageDispatcher
{
 void Enqueue(string recipient, string message);
}

public class Producer : IProducer
{
 public Producer(IMessageDispatcher dispatcher) { ... }
}

and an inbox interface for our message Consumer

public interface IMessageBox
{
 string TryDequeue(string recipient);
}

public class Consumer : IConsumer
{
 public Consumer(IMessageBox inbox) { ... }
}

Assuming that MessageQueue does not implement our interfaces (yes, in this case it would not have been a problem to have the class implement them both, but this is a simplified example with obvious segregation lines), we can now configure our IoC container (example uses AutoFac) to create the appropriately configured IProducer and IConsumer, each receiving exactly those capabilities they should depend on:

var queue = new MessageQueue();
var builder = new ContainerBuilder();
builder.Register<Producer>().As<IProducer>();
builder.Register<Consumer>().As<IConsumer>();
builder.Register(c => new DynamicObject(queue).CreateDuck<IMessageDispatcher>()).As<IMessageDispatcher>();
builder.Register(c => new DynamicObject(queue).CreateDuck<IMessageBox>()).As<IMessageBox>();

using (var container = builder.Build())
{
 var producer = container.Resolve<IProducer>();
 var consumer = container.Resolve<IConsumer>();
}

But what about C# 4.0 & Dynamic

While I think Dynamic objects in C# 4.0 are very cool, as of right now, they seem to have skipped over duck typing, at least in a strongly typed fashion.

Sure, once you have a dynamic instance, the compiler will let you call whatever signature you wish on it and defers checking until execution time. But that means we have no contract on it, if used as a dependency, nor can we use it to dynamically create objects that provide implementations for existing contracts. So, you’ve have to wrap a dynamic object with a proxy, in which case, LinFu’s existing duck typing already provides a superior solution.

The lack of casting to an interface, imho was already oversight with C# 3.0, which introduced anonymous classes that are so convenient for Linq projections, but can’t be passed out of the scope of the current method, due to a lack of type.

So don’t expect C# 4.0 to do anything to let you more easily attach your contracts at the dependency level. For the foreseeable future, this remains the territory of Dynamic Proxy.

Next time: Delegate injection

However, there is another way to deal with dependency injection that provides a fine-grained contract and imposes no proxies nor other requirement on the classes providing the dependency: Injection of the required capabilities as delegates

I’ve been experimenting with a framework to make this as painless as traditional IoC containers make dependency resolution. It’s still a bit rough around the edges, but I hope to have some examples to write about soon.

By arne on | geek | 1 comment
Tags: , , , ,