Moq rocks

Ok, so I'm not proud of it, but I've been a hold-out on mocking frameworks for a while. With the interface auto-generation that ReSharper gives me, I'd just gotten pretty fast at rolling my own mocks. Once or twice a year, I'd dip my toe into a mocking framework, find its syntax frustrating, and rather than getting properly used to it, I'd soon find myself replacing NotImplementedException in a stub for yet another custom mock object.

My reasoning was that if I could roll my own mocks just as fast as wiring up a mocking framework, why take on the extra dependency? And I figured I wasn't really causing myself too much extra work.

In the meantime, I even wrote a simple Arrange/Act/Assert mocking harness for Dream so I could better test REST services. So it's not like I didn't believe in the benefits of mocking harnesses.

Well, for the last couple of weeks I've been using Moq, and it's pretty much killed off all my desire to roll my own. I'm generally a huge fan of lambdas and have gotten used to thinking in expressions. Even with that, though, I wasn't able to get comfortable with the latest Rhino.Mocks. Probably just me. But from the very first attempt, Moq worked the way I think, and I was up and running.

// IFoo here is assumed to be a simple interface with a string Bar property
var mock = new Mock<IFoo>();
mock.Setup(x => x.Bar).Returns("bar").AtMostOnce();
var foo = mock.Object;
Assert.AreEqual("bar", foo.Bar);

I'm a convert!

Using TDD to learn new code

When I pick up a new framework or library, there's usually that learning curve where I get familiar with its API, find what works, what doesn't, etc. One habit I've gotten into is creating a TestFixture for everything I think I should be able to do and building a test for that assumption. The purpose of these tests is both to make sure the code does what I expect it to and to serve as a record of what I've already learned. If I later wonder how some call behaves, I first check my test signatures to see if I've already tested that behavior. If there is an appropriate test, I immediately know what the behavior will be, plus I now have working sample code showing how to do it.

For example, I was playing around with setting up Moq mocks through Autofac and wanted to come up with a registration that would give me a container-scoped mock that I could set up before executing a particular test. The resulting test looked like this:

public interface IMockWithAccessor
{
 IMockAccessorValue Accessor { get; }
}
public interface IMockAccessorValue
{
 string Foo { get; }
}

[Test]
public void Create_nested_mock_so_it_can_be_altered_in_container_scope()
{
 var builder = new ContainerBuilder();
 builder.Register(c => new Mock<IMockAccessorValue>())
  .As<Mock<IMockAccessorValue>>().ContainerScoped();
 builder.Register(c => c.Resolve<Mock<IMockAccessorValue>>().Object)
  .As<IMockAccessorValue>().ContainerScoped();
 builder.Register(c =>
 {
  var mockBuilder = new Mock<IMockWithAccessor>();
  mockBuilder.Setup(x => x.Accessor)
                       .Returns(c.Resolve<IMockAccessorValue>());
  return mockBuilder.Object;
 }).As<IMockWithAccessor>().ContainerScoped();

 using (var container = builder.Build().CreateInnerContainer())
 {
  var mockAccessorBuilder = container
                     .Resolve<Mock<IMockAccessorValue>>();
  mockAccessorBuilder.Setup(x => x.Foo).Returns("bar");
  Assert.AreEqual("bar", container
                     .Resolve<IMockWithAccessor>().Accessor.Foo);
 }
}

Sometimes, of course, my expectations are not met and the code does not allow me to do what I set out to do. These tests are even more valuable for future reference, as long as I make sure to rename the test to reflect the failed expectation and alter the asserts to reflect the actual behavior.

I was trying to figure out parametrized component registrations in Autofac. The example showed it being used with FactoryScope. I wondered whether, in default (Singleton) scope, Autofac would use the parameters to create one singleton per parameter set. My original test was named Parametrized_resolve_creates_different_singleton_per_parameter_Value. Well, it turned out that, no, Autofac does not vary singletons by parameters, and parametrized registrations only make sense in FactoryScope. The final test looks like this:

public class ParametrizedSingleton { }
[Test]
public void Parametrized_resolve_without_factory_scope_is_always_a_singleton()
{
 var builder = new ContainerBuilder();
 builder.Register((c, p) => new ParametrizedSingleton());
 using (var container = builder.Build())
 {
  var foo1 = container.Resolve<ParametrizedSingleton>(
                     new NamedParameter("type", "foo"));
  var foo2 = container.Resolve<ParametrizedSingleton>(
                     new NamedParameter("type", "foo"));
  var bar1 = container.Resolve<ParametrizedSingleton>(
                     new NamedParameter("type", "bar"));
  Assert.AreSame(foo1, foo2);
  Assert.AreSame(foo1, bar1);
 }
}

I usually keep these test fixtures in a separate test project in my solution as a permanent reference as I continue to develop the main code. It's proven useful a number of times when coming back to some old code and having to reacquaint myself with a third-party library.

Designing a Delegate Injection Container

In my last post I proposed using delegates instead of interfaces to declare dependencies for injection. While delegates are limited to a single function call, this is often sufficient for service dependencies. This is not a wholesale replacement for traditional IoC, though: at the end of the day you still have to have class instances that provide the methods bound by the delegates, and you want to resolve instances that require those delegates, so our container will still need to resolve class instances.

The main benefit of using delegates instead of interfaces is that delegates do not impose any requirements on the providing class, so dependencies can be defined by what the client needs rather than by what a service provides.

Mapping Delegate Dependencies

To illustrate this mapping, let's bring back the classes defined in the last post:

public class MessageQueue
{
 public void Enqueue(string recipient, string message) { ... }
 public string TryDequeue(string recipient) { ... }
}

public class Producer : IProducer
{
 public delegate void EnqueueDelegate(string recipient, string message);
 public Producer(EnqueueDelegate dispatcher) { ... }
}

public class Consumer : IConsumer
{
 public delegate string TryDequeueDelegate(string recipient);
 public Consumer(TryDequeueDelegate inbox) { ... }
}

What we need is a way to map a delegate to a method:

EnqueueDelegate => MessageQueue.Enqueue
TryDequeueDelegate => MessageQueue.TryDequeue

Aside from the fact that the above is not legal C# syntax, it carries the implicit assumption that we can resolve a canonical instance of MessageQueue, since MessageQueue.Enqueue really refers to a method on an instance of MessageQueue. I.e. our container must function like a regular IoC container so that we can resolve that instance.

The above solves the scenario of mapping one delegate to one implementation. In addition, we'd probably want the flexibility to map a particular implementation to a particular client class, such that:

Producer =>
  EnqueueDelegate => MessageQueue.Enqueue
Consumer =>
  TryDequeueDelegate => MessageQueue.TryDequeue

Using Expressions to capture delegate mappings

The usage scenarios described are simple enough to understand. If our container were initialized using strings or some external configuration file (XML, custom DSL, etc.), the actual reflection required for the injection of mappings wouldn't be too complex either. However, I abhor defining typed mappings without type-safety. This isn't so much about spelling mistakes and the like that the compiler can't catch; it is mostly about being able to navigate the mappings and the dependencies they describe, and the ability to refactor while keeping the mappings in sync.

It would be great if we could just say:

Note: I'm purposely using a syntax that mimics Autofac, since the implementation later on will be done as a hack on top of Autofac

builder.Define<Producer.EnqueueDelegate>().As<MessageQueue.Enqueue>();

That looks fine and could even work in the face of polymorphism, since knowing the signature of Producer.EnqueueDelegate we could reflect the proper overload of MessageQueue.Enqueue. However, C# has no syntax for getting at the MethodInfo of a method via generics (a method is not a type). There isn't even an equivalent to typeof(T) for members, the reason for which was well explained by Eric Lippert in In Foof We Trust: A Dialogue. The only way to get a MethodInfo relies on string-based reflection.
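
For illustration, the string-based alternative looks roughly like this (a throwaway sketch, not part of the proposed API); the method name is just a string, so the compiler can't catch typos or follow renames:

// string-based reflection: the only built-in way to get at a MethodInfo.
// the overload is picked by explicitly listing the parameter types.
MethodInfo enqueue = typeof(MessageQueue).GetMethod(
  "Enqueue", new[] { typeof(string), typeof(string) });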

Fortunately, C# 3.0 introduced a syntax that allows us to capture method calls as expression trees that we can decompose programmatically. This lets us express our method call like this:

MessageQueue messageQueue = new MessageQueue();
Expression<Producer.EnqueueDelegate> expr = (a,b) => messageQueue.Enqueue(a,b);

This expression conveniently infers the types of a and b. As a side note, Producer.EnqueueDelegate does not mean that EnqueueDelegate is a member of Producer. It's just a syntax artifact of nested declarations in C#, which in this case conveniently makes the delegate look attached to the class.

Unfortunately, we can't just add MessageQueue to the parameter list of that lambda. If we included it, its type could not be inferred, and once we declare MessageQueue explicitly as a lambda parameter, we'd be forced to declare all the other parameters as well. We want to express the above while only explicitly naming MessageQueue. To accomplish this we need a composite expression whose outer lambda takes the MessageQueue:

Expression<Func<MessageQueue, Expression<Producer.EnqueueDelegate>>> expr
  = x => (a, b) => x.Enqueue(a, b);
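
Just to show that this composite expression can be taken apart again, here's a minimal sketch (my own illustration, not part of the container code) of pulling the MethodInfo back out: the outer lambda's body is a quoted inner lambda whose body is the method call we're after.

var quotedInner = (UnaryExpression)expr.Body;             // ExpressionType.Quote
var innerLambda = (LambdaExpression)quotedInner.Operand;  // (a, b) => x.Enqueue(a, b)
var call = (MethodCallExpression)innerLambda.Body;        // x.Enqueue(a, b)
MethodInfo method = call.Method;                          // MethodInfo for MessageQueue.Enqueue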

Now we have enough syntactic sugar to describe our two registration scenarios in terms of the container builder. First, the global registration of the delegate against an implementation:

builder.Define<Consumer.TryDequeueDelegate>()
  .As<MessageQueue>(x => a => x.TryDequeue(a));
builder.Define<Producer.EnqueueDelegate>()
  .As<MessageQueue>(x => (a, b) => x.Enqueue(a, b));

And alternatively, the registration of delegates and their implementations in the context of a particular class:

builder.Register<Consumer>().As<IConsumer>()
  .With<Consumer.TryDequeueDelegate>()
  .As<MessageQueue>(x => a => x.TryDequeue(a));

builder.Register<Producer>().As<IProducer>()
  .With<Producer.EnqueueDelegate>()
  .As<MessageQueue>(x => (a, b) => x.Enqueue(a, b));

Next time, I'll go over the implementation of the above to get it working as an injection framework.

Searching a Tree of Objects with Linq, Revisited

A while back, I wrote about searching through a tree using Linq to Objects. That post was mostly snippets of code about delegates, lambdas, yield and how they apply to Linq -- more a technical exploration than an example. So I thought I'd follow it up with concrete extension methods to make virtually any tree searchable by Linq.

Linq, IEnumerable, yield

All that is required to search a tree with Linq is producing a list of all nodes in the tree, since Linq to Objects can operate on any IEnumerable. Really, Linq to Objects is a way of expressing operations we've been writing forever as loops with if/else blocks. That means there isn't any search magic going on; it's a linear traversal of all elements in a set, examining each to determine whether it matches our search criteria.
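
To make that equivalence concrete, here's a throwaway illustration (not from the original post) of the same filter written as a loop and as a Linq query:

var words = new[] { "foo", "bar", "baz" };

// the loop-and-if version we've been writing forever...
var viaLoop = new List<string>();
foreach (var word in words)
{
 if (word.StartsWith("b")) viaLoop.Add(word);
}

// ...and the same traversal expressed with Linq to Objects
var viaLinq = words.Where(word => word.StartsWith("b"));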

To turn a tree into a list of nodes we need to walk the tree and collect all children of every node. That's a simple task for a recursive method that carries along a list object to stuff every found node into. But there is a better way, using yield to return each item as it is encountered, so we don't have to carry along a collection at all. Iterators using yield implement a pattern in which a method can return more than once. For this reason, a method using yield in C# must return an IEnumerable, so that the caller gets a handle to an object through which it can traverse the results of those multiple returns.

IEnumerable is basically an unbounded sequence. This is also the reason why, unlike collections, it does not have a Count property. It is entirely possible for an enumerator to return an infinite series of items.
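
A quick illustration of that point (my own example, not from the original post): this iterator never terminates on its own, yet it works fine as long as the caller only takes what it needs.

public static IEnumerable<int> Naturals()
{
 var i = 0;
 while (true)
 {
  yield return i++;
 }
}

// the caller decides how much of the infinite sequence to consume
var firstTen = Naturals().Take(10).ToList();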

Together, IEnumerable and yield are a perfect match for our problem, i.e. recursively walking a tree of nodes and returning an unknown number of nodes.

Two types of Tree Traversal

Depth First

In depth-first traversal, the algorithm will continue to dig down a node's children until it reaches a leaf node (a node without children) before considering the next child of the current parent node.

Breadth First

In breadth-first traversal, the algorithm will return all nodes at a particular depth before considering the children at the next level. I.e. first return all the nodes from level 1, then all nodes from level 2, etc.

Tree to IEnumerable Extension methods

public static class TreeToEnumerableEx
{
 public static IEnumerable<T> AsDepthFirstEnumerable<T>(this T head, Func<T, IEnumerable<T>> childrenFunc)
 {
  // yield the current node, then recursively yield each child's entire subtree
  yield return head;
  foreach (var node in childrenFunc(head))
  {
   foreach (var child in AsDepthFirstEnumerable(node, childrenFunc))
   {
    yield return child;
   }
  }
 }

 public static IEnumerable<T> AsBreadthFirstEnumerable<T>(this T head, Func<T, IEnumerable<T>> childrenFunc)
 {
  yield return head;
  // re-enumerate our own output as the frontier: for every node already yielded,
  // yield its children, which thereby join the frontier in level order
  var last = head;
  foreach (var node in AsBreadthFirstEnumerable(head, childrenFunc))
  {
   foreach (var child in childrenFunc(node))
   {
    yield return child;
    last = child;
   }
   // if we've caught up to the last node yielded without finding new children,
   // the tree is exhausted
   if (last.Equals(node)) yield break;
  }
 }
}

This static class provides two extension methods that can be used on any object, as long as it's possible to express a function that returns all children of that object, i.e. the object is a node in some type of tree and has a method or property for accessing a list of its children.
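
For instance, nothing stops you from pointing these methods at a type you don't own. Here's a rough sketch (untested, just to illustrate the idea) that treats the file system as a tree of DirectoryInfo nodes:

var root = new DirectoryInfo(@"C:\projects");

// every directory under root, depth-first, queried with ordinary Linq
var binFolders = from dir in root.AsDepthFirstEnumerable(d => d.GetDirectories())
                 where dir.Name == "bin"
                 select dir.FullName;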

An Example

Let's use a hypothetical Tree model defined by this Node class:

public class Node
{
 private readonly List<Node> children = new List<Node>();

 public Node(int id)
 {
  Id = id;
 }

 public IEnumerable<Node> Children { get { return children; } }

 public Node AddChild(int id)
 {
  var child = new Node(id);
  children.Add(child);
  return child;
 }

 public int Id { get; private set; }
}

Each node simply contains a list of children and has an Id, so that we know what node we're looking at. The AddChild() method is a convenience method so we don't expose the child collection and no node can ever be added as a child twice.

The calling convention for a depth-first collection is:

IEnumerable<Node> nodes = node.AsDepthFirstEnumerable(n => n.Children);

The lambda expression n => n.Children is the function that will return the children of a node. It simply states: given n, return the value of n's Children property. A simple test to verify that our extension works, and to show the extension being used in a Linq query, looks like this:

[Test]
public void DepthFirst()
{
 // build the tree in depth-first order
 int id = 1;
 var depthFirst = new Node(id);
 var df2 = depthFirst.AddChild(++id);
 var df3 = df2.AddChild(++id);
 var df4 = df2.AddChild(++id);
 var df5 = depthFirst.AddChild(++id);
 var df6 = df5.AddChild(++id);
 var df7 = df5.AddChild(++id);

 // find all nodes in depth-first order and select just the Id of each node
 var IDs = from node in depthFirst.AsDepthFirstEnumerable(x => x.Children)
        select node.Id;

 // confirm that this list of IDs is in depth-first order
 Assert.AreEqual(new int[] { 1, 2, 3, 4, 5, 6, 7 }, IDs.ToArray());
}

For breadth-first collections, the calling convention is:

IEnumerable<Node> nodes = node.AsBreadthFirstEnumerable(n => n.Children);

Again, we can test that the extension works like this:

[Test]
public void BreadthFirst()
{
 // build the tree in breadth-first order
 var id = 1;
 var breadthFirst = new Node(id);
 var bf2 = breadthFirst.AddChild(++id);
 var bf3 = breadthFirst.AddChild(++id);
 var bf4 = bf2.AddChild(++id);
 var bf5 = bf2.AddChild(++id);
 var bf6 = bf3.AddChild(++id);
 var bf7 = bf3.AddChild(++id);

 // find all nodes in breadth-first order and select just the Id of each node
 var IDs = from node in breadthFirst.AsBreadthFirstEnumerable(x => x.Children)
       select node.Id;

 // confirm that this list of IDs is in breadth-first order
 Assert.AreEqual(new int[] { 1, 2, 3, 4, 5, 6, 7 }, IDs.ToArray());
}

Searching Trees

The tree used in the example is of course extremely simple, i.e. it doesn't even have any worthwhile data to query attached to a node. But these extension methods could be used on a node of any kind of tree, allowing the full power of Linq, grouping, aggregation, sorting, projection, etc. to be used on the tree.

As a final note, you may wonder why bother with depth-first vs. breadth-first at all? After all, in the end we examine every node! There is, however, one case where the choice of algorithm can be very important: you are looking for one match, or a particular number of matches. Since we are using yield, we can terminate the traversal at any time. Using the FirstOrDefault() extension on our Linq expression, the traversal stops as soon as one match is found. And if you have any knowledge of where that node might sit in the tree, the choice of search algorithm can be a significant performance factor.
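
As an example of that early termination (using the Node class and the breadthFirst tree from the tests above), the traversal below stops producing nodes the moment the match is found, so a node near the top of the tree is found almost for free:

var match = breadthFirst.AsBreadthFirstEnumerable(n => n.Children)
                        .FirstOrDefault(n => n.Id == 3);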

db4o 7.4 binaries for mono

As I talked about recently, the standard binaries for db4o have some issues with Mono, so I recompiled the unmodified source with the MONO configuration flag. I've packed up both the debug and release binaries and you can get them here. These are just the binaries (plus license). It's not the full db4o package. If you want the full package, just get it directly from the db4o site, set the MONO config flag and have Visual Studio rebuild the package.

This package should show up on the official db4o mono page shortly as well.

db4o indexing and query performance

Indexing in db4o is a bit non-transparent, imho. There's barely a blurb in their documentation app, and it just tells you how to create an index and how to remove it. But you can't easily inspect whether one exists, or whether it's being used. So I spent a good bit of time today trying to figure out why my queries were so slow: was an index created, and if so, was it being used? The final answer is, if querying is slow in db4o, you're not using an index, because, OMG, you'll know when you do an indexed query.

Index basics

Given an object such as

public class Foo
{
  public string Bar;
}

you create an index, globally (meh) for that object on all databases you create thereafter, with this call:

Db4oFactory.Configure().ObjectClass(typeof(Foo)).ObjectField("Bar").Indexed(true);

So far, straightforward enough. But what if you're using a property? Well, db4o does its magic by inspecting your underlying storage fields, so you have to index those, not the properties that expose them. That means that if our object is supposed to have a read-only property Bar, like this:

public class Foo
{
  private string bar;
  public Foo(string bar)
  {
    this.bar = bar;
  }
  public string Bar { get { return bar; } }
}

then the field you need to index is actually the private member bar:

Db4oFactory.Configure().ObjectClass(typeof(Foo)).ObjectField("bar").Indexed(true);

Given this idiosyncrasy, the obvious question is "what about automatic properties?" Well, as of right now the answer is: no such luck, because you'd have to reflect the compiler-generated backing field and index it, and you get no guarantee that the field is named the same from compiler to compiler or version to version. That probably also means that automatic properties are dangerous all around, because you may never get your data back if the backing field name changes, although on that conclusion I'm just speculating wildly.

Query performance

Index in hand, I decided to populate a DB, always checking whether the item already existed, using a db4o Native Query. That started at 1 ms query time and then increased linearly with every item added. That sure didn't seem like an indexed search to me. I finally discovered a useful resource on the db4o site, but unfortunately it's behind a login, so Google didn't help me find it and my link to it will only take you to the login page. That's a shame, because this bit of information ought to be somewhere in big bold letters!
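
For context, the kind of existence check I was running looked roughly like this (a sketch, assuming an open IObjectContainer named db, the Foo class from above and a candidate Foo instance):

// a Native Query: db4o translates the predicate into a SODA query when the
// optimization assemblies listed below are present
var existing = db.Query<Foo>(f => f.Bar == candidate.Bar);
var alreadyExists = existing.Count > 0;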

You must have the following DLLs available for Native Queries to be optimized into SODA queries, which apparently is the format that hits the index:

  • Db4objects.Db4o.Instrumentation.dll
  • Db4objects.Db4o.NativeQueries.dll
  • Mono.Cecil.dll
  • Cecil.FlowAnalysis.dll

The query will execute fine regardless of their presence, but the performance difference between the optimized, index-using query and the unoptimized Native Query is orders of magnitude. My queries went from 100-500 ms to 0.01 ms just by dropping those DLLs into my executable directory. Yeah, that's a useful change.

Interestingly enough, the same is not required for Linq queries. They seem to hit the index without the extra help (although Mono.Cecil and Cecil.FlowAnalysis need to be present just to run, so at least you get an error if they're missing). There currently appears to be about 1 ms of overhead for parsing Linq into SODA, but I'll take that hit for the syntactic sugar.

Conclusions

I'm pretty happy with the simplicity and performance of db4o so far. It seems like an ideal local, queryable persistence layer. The way it works does make me want to abstract my data model into simple data objects that are then converted into business entities. I'd rather have the attribute-based markup of ActiveRecord, but that's not a deal breaker.

Db4o on .NET and Mono

After failing to get a cross-platform sample of NHibernate/Sqlite going, I decided to try out Db4o. This is for a simple, local object persistence layer anyhow, nothing more than a local cache, so db4o sounded perfect.

The initial DLLs for 7.4 worked beautifully on .NET but ran into problems on Mono. Apparently db4o imports FlushFileBuffers from kernel32.dll if your build target is not CF or Mono. And in its call to FlushFileBuffers it uses FileStream.SafeFileHandle.DangerousGetHandle(), which is not yet implemented under Mono, resulting in this exception:

Unhandled Exception: System.NotImplementedException: The requested feature is not implemented.
  at System.IO.FileStream.get_SafeFileHandle () [0x00000]
  at Sharpen.IO.RandomAccessFile.Sync () [0x00000]
  at Db4objects.Db4o.IO.RandomAccessFileAdapter.Sync () [0x00000]
  ...

I found this page on the Db4o site, which suggested just falling back to FileStream.Handle. For me, however, that just resulted in this:

Unhandled Exception: System.EntryPointNotFoundException: FlushFileBuffers
  at (wrapper managed-to-native) Sharpen.IO.RandomAccessFile:FlushFileBuffers (intptr)
  at Sharpen.IO.RandomAccessFile.Sync () [0x00000]
  at Db4objects.Db4o.IO.RandomAccessFileAdapter.Sync () [0x00000]
  ...

So I simply defined MONO as a compilation symbol in Visual Studio and rebuilt it. I figure the only time this code will run on Windows is during testing, so treating it as Mono is fine. That did solve my issues, and I now have a DLL for db4o 7.4 that works beautifully across .NET and Mono from a single build.
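
To make it clearer what the MONO symbol switches, here's a rough illustration (not db4o's actual source) of the kind of conditional compilation involved:

using System;
using System.IO;
using System.Runtime.InteropServices;

public static class FileSync
{
#if !MONO
 [DllImport("kernel32.dll", SetLastError = true)]
 private static extern bool FlushFileBuffers(IntPtr handle);
#endif

 public static void Sync(FileStream stream)
 {
  stream.Flush();
#if !MONO
  // this P/Invoke path is what breaks under Mono: SafeFileHandle isn't
  // implemented there, and kernel32.dll isn't available anyway
  FlushFileBuffers(stream.SafeFileHandle.DangerousGetHandle());
#endif
 }
}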

Being a Linq nut, I immediately decided to skip the Native Query syntax and dive into the Linq syntax instead. That worked great on Mono 2.0.1, but unfortunately on the current Red Hat RPM (stuck back in 1.9.1 land), the Linq implementation isn't quite complete and you get this:

Unhandled Exception: System.NotImplementedException: The requested feature is not implemented.
  at System.Linq.Expressions.MethodCallExpression.Emit (System.Linq.Expressions.EmitContext ec) [0x00000]
  at System.Linq.Expressions.LambdaExpression.Emit (System.Linq.Expressions.EmitContext ec) [0x00000]
  at System.Linq.Expressions.LambdaExpression.Compile () [0x00000]
  at Db4objects.Db4o.Linq.Expressions.SubtreeEvaluator.EvaluateCandidate (System.Linq.Expressions.Expression expression) [0x00000]
  ...

But falling back from this syntax:

var items = from RosterItem r in db where r.CreatedAt > DateTime.Now.Subtract(TimeSpan.FromMinutes(10)) select r;

to the Native Query syntax (with delegates replaced by lambdas):

var items = db.Query<RosterItem>(r => r.CreatedAt > DateTime.Now.Subtract(TimeSpan.FromMinutes(10)));

It's still a fairly compact and straightforward syntax, so until I finish setting up my own CentOS Mono RPMs, I'll stick with it.

I need to run db4o through some more serious paces, but I like what I see so far.

Dream for REST and async

I've been doing a lot of work inside of MindTouch Dream as of late over at MindTouch, and I'm really digging it. Steve's put together an awesome framework for doing asynchronous programming on .NET and for treating all access as RESTful resources in your server-side code.

Now, coming from a very Dependency-Injection-heavy design philosophy, Dream has been a bit of an adjustment for me, but its capabilities, especially the coroutine approach for dealing with requests, are very powerful and fairly intuitive once you get your head around it.

In an effort to ease the Dream learning curve and cement my understanding of the code base, I'll be blogging articles about it as I go along, and cross-posting them to the MindTouch developer wiki as well. My first article was a continuation of Steve's Asynchronicity library series, this one about coroutines (read: yield) in Dream.

I've been using the C# Web Server project for my REST work up until recently, but I'm currently in the process of migrating it over to Dream. It just removes all the legwork and fits much better into the async workflow of the rest of notify.me.

Clearly I am biased, but seriously, if you need to build REST interfaces in .NET, Dream beats anything you can roll on your own in a reasonable amount of time, and definitely is about 1000% more powerful than trying to force WCF down the REST or even POX path.

IEnumerable.ForEach()

I keep wanting to call ForEach() on a collection, and noticed that it was inconsistent whether that extension method was available or not. I always figured that I wasn't importing the right namespace, and blamed ReSharper for not being smart enough to figure out which namespace I was missing. So I finally googled the problem only to find this. Turns out ForEach() isn't on IEnumerable, only on Array and List<T>. Meh. But thanks to Dave, I now have it in my core assembly.
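
The extension itself is only a few lines; mine looks roughly like this (a trivial sketch along the lines of the one linked above):

public static class EnumerableExtensions
{
 public static void ForEach<T>(this IEnumerable<T> enumerable, Action<T> action)
 {
  foreach (var item in enumerable)
  {
   action(item);
  }
 }
}

// usage: works on any IEnumerable<T>, not just Array and List<T>
new[] { "foo", "bar" }.Where(x => x.StartsWith("b")).ForEach(Console.WriteLine);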

A case for TDD

I know, it's been rather quiet here lately. I've just been slammed with coding, so writing things up is falling behind. In addition, my blogging time's going to be split between techblog.notify.me and www.mindtouch.com/blog. And I'm behind on both of those as well. I should have some fun Dream and DekiScript stuff for the MindTouch blog and some asynchronous programming for the notify.me blog soon. As soon as I can get myself to stop coding again.

So what makes me stop coding for a minute to babble on? It's just a quick case study of why TDD is important.

I've been doing Inversion of Control/Dependency Injection for about a year and a half, and while I've eased my way into it, I'm pretty much at the "an interface for every class" stage of having everything abstracted so I can easily mock things. But here and there, I take in third-party assemblies for my projects, and most of the time they are not well interfaced. Generally I try to create an interfaced facade so I can test my interaction with them in isolation, but depending on how many secondary classes their code uses, sometimes my facade gets lazy, leaving places I can't mock.

Now, I'm pretty religious about test coverage, but I do have holes where my facade leaves untestable bits. And this is where TDD shows its worth: when a feature is added or a refactor happens, almost with 100% certainty the bugs that manage to get into production are in the code that doesn't have test coverage.

The lesson here is that the time saved by not building a properly mockable facade, thereby torpedoing my testability, is repaid many times over in debugging later as bugs make it into production. Meh.