.NET 3.5

IEnumerable.ForEach()

I keep wanting to call ForEach() on a collection, and noticed that the method was inconsistently available. I always figured that I wasn’t importing the right namespace, and blamed ReSharper for not being smart enough to figure out which namespace I was missing. So I finally googled the problem only to find this. Turns out ForEach() isn’t on IEnumerable, only on Array (as a static method) and List. Meh. But thanks to Dave, I now have it in my core assembly.
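For reference, here is a minimal sketch of what such an extension method might look like. This is my own guess at the shape, not necessarily Dave's code; the class name EnumerableExtensions and the sample dictionary are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Hypothetical helper (not part of the framework): runs an action on each element.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");
        foreach (T item in source)
        {
            action(item);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data: file names mapped to sizes.
        var files = new Dictionary<string, int>();
        files["test1.mpg"] = 10;
        files["test2.mpg"] = 20;

        // Dictionary<TKey,TValue>.ValueCollection has no ForEach() of its own,
        // so this call resolves to the extension method above.
        int total = 0;
        files.Values.ForEach(size => total += size);
        Console.WriteLine(total); // prints 30
    }
}
```

On a List the built-in instance method still wins over the extension, so existing calls keep their behavior.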

By arne on | .net | A comment?

Addicted to .Net 3.5

Outside of LINQ, I thought that 3.5 was a lot of cool but not vital syntactic sugar. This weekend marks the first time since November that I fired up VS.NET 2k5 to build an app targeting the 2.0 framework, and I was amazed how much I’d already come to rely on that sugar. Now this might be seen as an invalidation of my preference for explicit, verbose syntax over the terse syntax of many scripting languages. I’d like to point out that terseness and expressiveness in syntax are two separate things. Syntactic sugar that lets me express my action in more concise code that easily conveys the meaning is not the same thing as using a terse vocabulary to keep the typing down and requiring memorization of abbreviated keywords to understand the code. Anyway, here are the parts of 3.5 I’ve missed more than once this weekend.

Extension Methods on IEnumerable

The plethora of extension methods on all things IEnumerable is largely due to LINQ, but the To* methods have become just a basic part of my vocabulary. Copying the Values of a Dictionary into an Array of the same type now seems just painfully verbose without the ToArray() method. Compare

FileInfo[] f = new FileInfo[files.Count];
files.Values.CopyTo(f,0);
return f;

with

return files.Values.ToArray();

Object/List initializer syntax

Now this I really thought of as frivolous. However, using objects that use DTOs as their initializer/storage, initialization does become rather awkward without loops or long constructors:

      FileInfoData[] remoteFileData = new FileInfoData[]
      {
        new FileInfoData(),
        new FileInfoData(),
        new FileInfoData()
      };
      remoteFileData[0].Name = "test1.mpg";
      remoteFileData[1].Name = "test2.mpg";
      remoteFileData[2].Name = "test10.mpg";
      FileInfo[] remoteFiles = new FileInfo[]
      {
        new FileInfo(remoteFileData[0]),
        new FileInfo(remoteFileData[1]),
        new FileInfo(remoteFileData[2]),
      };

versus

      FileInfo[] remoteFiles = new FileInfo[]
      {
        new FileInfo(new FileInfoData()
        {
          Name = "test1.mpg"
        }),
        new FileInfo(new FileInfoData()
        {
          Name = "test2.mpg"
        }),
        new FileInfo(new FileInfoData()
        {
          Name = "test10.mpg"
        }),
      };

Anonymous delegates are a pain and hard to read

I needed to pass in a delegate to a function as a factory callback. Perfect scenario for a nice concise lambda. But I was in 2.0, so that meant defining a delegate type and using anonymous delegate syntax, resulting in

  public delegate ILocalFileSystemManager LocalCreateDelegate(string localPath, string extension);

  public class FileSystemManagerFactory
  {
    // Factory callback supplied by the caller.
    private readonly LocalCreateDelegate localFactory;

    public FileSystemManagerFactory( LocalCreateDelegate localFactory )
    {
      this.localFactory = localFactory;
    }
  }

FileSystemManagerFactory factory = new FileSystemManagerFactory(
  delegate(string localPath, string extension)
  {
    return new MockLocalFileSystemManager(localPath, extension);
  });

instead of

  public FileSystemManagerFactory( Func<string,string,ILocalFileSystemManager> localFactory )
  {
    this.localFactory = localFactory;
  }

FileSystemManagerFactory factory = new FileSystemManagerFactory(
  (localPath, extension) => new MockLocalFileSystemManager(localPath, extension));

Automatic Properties

If there is one feature of C# (and Java as well) that gets code generated more than any other, it’s getters and setters. I’ve never liked how code generation tools created those for me, since I like having my private members in one place and properties in another. So I’ve been typing them out for years. But with C# 3.0, we got automatic properties. The two patterns, read/write properties and read-only properties, are oft repeated like this

string readwriteMember;
string readonlyMember;

public string ReadWrite
{
  get { return readwriteMember; }
  set { readwriteMember = value; }
}

public string ReadOnly { get { return readonlyMember; } }

Not a ton of code, but certainly takes more time to write than

public string ReadWrite { get; set; }

public string ReadOnly { get; private set; }

Ho hum... I’m in 2.0, so I’ll have to deal, but I certainly hope that 3.5 has a fast pick-up rate on the client (on the server, I can still control my environment).

By arne on | .net | A comment?

Searching a Tree of Objects with Linq

UPDATE: Posted a follow-up here.
I’ve finally had a legitimate use for LINQ to Objects, not just to make the syntax cleaner, but also to simplify the underlying code and give me a lot of flexibility without significant work.

The scenario

I have a tree of objects that have both a type and a name. The name is unique, the Type is not. The interface is this:

public interface INode
{
  int Id { get; }
  string Name { get; set; }
  string Type { get; set; }
  List<INode> Children { get; }
}

I want to be able to find a single named Node in the tree and I want to be able to retrieve a collection of all nodes for a particular type. The searchable interface could be expressed as this:

public interface ISearchableNode : INode
{
  INode FindByName(string name);
  IEnumerable<INode> FindByType(string type);
}

Both require me to walk the tree and examine each node, so clearly I just want to have one walk routine and generically evaluate the node in question. In C# 2.0 parlance, that means I could pass an anonymous delegate into my generic find routine and have it recursively iterate through all the children. I also pass along a resultset to be populated.

The signature for the evaluation delegate looks like this:

delegate bool FindDelegate(INode node);

but since I’m using C# 3.0 (i.e. .NET 3.5) I can use lambda expressions to avoid creating a delegate and simplify my syntax. Instead of FindDelegate, I can simply use Func<INode,bool>:

// Instead of this:
private void Find(INode node, List<INode> resultSet, FindDelegate findDelegate);
// called like this for a Name search:
Find(this, resultSet, delegate(INode node) { return node.Name == name; });

// I can use this:
private void Find(INode node, List<INode> resultSet, Func<INode, bool> f);
// called like this:
Find(this, resultSet, node => node.Name == name);

Thus giving me the following implementation for ISearchableNode:

public INode FindByName(string name)
{
  List<INode> resultSet = new List<INode>();
  Find(this, resultSet, x => x.Name == name);
  return resultSet.FirstOrDefault();
}

public IEnumerable<INode> FindByType(string type)
{
  List<INode> resultSet = new List<INode>();
  Find(this, resultSet, x => x.Type == type);
  return (IEnumerable<INode>)resultSet;
}

private void Find(INode node, List<INode> resultSet, Func<INode, bool> f)
{
  if (f(node))
  {
    resultSet.Add(node);
  }
  foreach (INode child in node.Children)
  {
    Find(child, resultSet, f);
  }
}

Problem solved, move on… Well, except there is significant room for improvement. Here are the two main issues that ought to be resolved:

  1. Syntax is limited to two types of searches and exposing the generic find makes for an ugly syntax. It would be much nicer if queries to the tree could be expressed in LINQ syntax.
  2. It’s also inefficient for the Name search, since I’m walking the entire tree, even if the first node matched the criteria.

LINQ to Hierarchical Data

In order to use LINQ to objects, I need to either create a custom query provider or implement IEnumerable. The latter is significantly simpler and could be expressed using the following interface:

public interface IQueryableNode : IEnumerable<INode> { }

Ok, ok, I don’t even need an interface, I could just implement IEnumerable<INode>… But what does that actually mean? In the simplest sense, I’m iterating over the node’s children; however, I also wish to descend into the children’s children and so on. So a simple foreach won’t do. I could just do the same tree walking with a resultset as I did above and return the Enumerator of the resulting list to implement the interface, but C# 2.0 introduced a much more useful way to implement non-linear Iterators, i.e. the yield keyword. Instead of building a list to be iterated over, yield lets the iterating code return values as they are found, which means it can be used for recursive iteration. Thus GetEnumerator is implemented simply as follows

#region IEnumerable<INode> Members
public IEnumerator<INode> GetEnumerator()
{
  yield return this;
  foreach (Node child in Children)
  {
    foreach (Node subchild in child)
    {
      yield return subchild;
    }
  }
}
#endregion

#region IEnumerable Members
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
  return this.GetEnumerator();
}
#endregion

Nice and simple and ready for LINQ.

Searching for all Nodes of a type becomes

var allBar = from n in x
                where n.Type == "bar"
                select n;
foreach (Node n in allBar)
{
  // do something with that node
}

and the search for a specifically named node becomes

INode node = (from n in x
              where n.Name == "i"
              select n).FirstOrDefault();

But the real benefit of this approach is that I don’t have hard-coded search methods, but can express much more complex queries in a very natural syntax without any extra code on the Node.
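As a rough illustration of such a query, here is a self-contained sketch. The Node class is a simplified stand-in for mine (it skips INode and enumerates Node directly), and the tree, the BuildTree and BarParents helpers and the query itself are invented for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for the post's Node class.
public class Node : IEnumerable<Node>
{
    public string Name { get; set; }
    public string Type { get; set; }
    public List<Node> Children { get; private set; }

    public Node(string name, string type, params Node[] children)
    {
        Name = name;
        Type = type;
        Children = new List<Node>(children);
    }

    // Same recursive yield pattern as above: this node first, then each subtree.
    public IEnumerator<Node> GetEnumerator()
    {
        yield return this;
        foreach (Node child in Children)
        {
            foreach (Node descendant in child)
            {
                yield return descendant;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Program
{
    // Made-up sample tree.
    public static Node BuildTree()
    {
        return new Node("root", "foo",
            new Node("a", "bar", new Node("c", "baz")),
            new Node("b", "bar"),
            new Node("d", "bar", new Node("e", "bar")));
    }

    // Names of all "bar" nodes that have children, sorted Z to A.
    public static List<string> BarParents(Node root)
    {
        var query = from n in root
                    where n.Type == "bar" && n.Children.Count > 0
                    orderby n.Name descending
                    select n.Name;
        return query.ToList();
    }

    public static void Main()
    {
        foreach (string name in BarParents(BuildTree()))
        {
            Console.WriteLine(name); // prints d, then a
        }
    }
}
```

The filtering, ordering and projection all live in the query; the Node class itself only knows how to enumerate its subtree.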

Deferred execution

As it turns out, using yield for the recursive iteration also solved the second issue. As yield returns values as it encounters them during iteration, the search doesn’t happen until the query is executed. And one of the side effects of LINQ syntax is that creating a query does not execute it until the result set is iterated over. Therefore, FirstOrDefault() actually short-circuits the query as soon as the first match (and in case of Name, it’s going to be the only match) is hit.
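One way to see the short-circuit is to count how many nodes the iterator actually reaches. The sketch below does that with a static Visited counter that I added purely for the demonstration; the Node class is again a simplified stand-in and the tree is made up.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for the post's Node class.
public class Node : IEnumerable<Node>
{
    // Demo-only counter: how many nodes the iterator has actually reached.
    public static int Visited;

    public string Name { get; private set; }
    public List<Node> Children { get; private set; }

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children = new List<Node>(children);
    }

    public IEnumerator<Node> GetEnumerator()
    {
        Visited++; // runs lazily, on the first MoveNext() for this node
        yield return this;
        foreach (Node child in Children)
        {
            foreach (Node descendant in child)
            {
                yield return descendant;
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Program
{
    // Made-up tree: root has children a and b, and a has child c.
    public static Node BuildTree()
    {
        return new Node("root", new Node("a", new Node("c")), new Node("b"));
    }

    public static void Main()
    {
        Node root = BuildTree();

        // Building the query walks nothing.
        var query = from n in root where n.Name == "a" select n;
        Console.WriteLine(Node.Visited); // prints 0

        // FirstOrDefault() stops at the first match: only root and a are reached.
        Node hit = query.FirstOrDefault();
        Console.WriteLine(hit.Name + " " + Node.Visited); // prints "a 2"

        // Counting every node walks the whole tree.
        Node.Visited = 0;
        int all = root.Count();
        Console.WriteLine(all + " " + Node.Visited); // prints "4 4"
    }
}
```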

By arne on | .net | 4 comments