ILoggable

A place to keep my thoughts on programming

 Subscribe

geekblog
[at]
claassen [dot] net

Powered by Blogger

Tuesday, February 26, 2008

Use .NET/Mono, it's the environmentally friendly choice

I just read what's got to be my favorite Miguel De Icaza quote to date:
In a world that is increasingly green, it is a waste of perfectly healthy computer cycles to interpret your code when you can use an optimizing JIT compiler to run your code.
I'm going to presume that's with tongue firmly implanted in cheek.

Labels: ,

Monday, February 18, 2008

Addicted to .Net 3.5

Outside of LINQ, i thought that 3.5 was a lot of cool but not vital syntactic sugar. This weekend marks the first time since November that I fired up VS.NET 2k5 to build an app targeting the 2.0 framework and I was amazed how much I'd already come to rely on that sugar. Now this might be seen as an invalidation of my preference of explicit, verbose syntax versus the terse syntax of many scripting languages. I'd like to point out that terseness and expressiveness in sytnax are two separate things. Syntactic sugar that let's me express my action in more concise code that easily conveys the meaning is not the same thing as using a terse vocabulary to keep the typing down and requiring memorization of abbreviated keywords to understand the code. Anyway, here are the parts of 3.5 I've missed more than once this weekend.

Extension Methods on IEnumerable

The plethora of extension methods on all things IEnumerable is largely due to LINQ, but the To* methods have become just a basic part of my vocubulary. Taking the Values of a Dictionary into an Array of the same type now is seems just painfully verbose without the ToArray() method. Compare
FileInfo[] f = new FileInfo[files.Count];
files.Values.CopyTo(f,0);
return f;
with
return files.Values.ToArray();

Object/List initializer syntax

Now this I really thought of as frivolous. However, using objects that use DTOs as their initializer/storage, initialization does become rather awkward without loops or long constructors:
      FileInfoData[] remoteFileData = new FileInfoData[]
      {
        new FileInfoData(),
        new FileInfoData(),
        new FileInfoData()
      };
      remoteFileData[0].name = "test1.mpg";
      remoteFileData[1].name = "test2.mpg";
      remoteFileData[2].name = "test10.mpg";
      FileInfo[] remoteFiles = new FileInfo[]
      {
        new FileInfo(remoteFileData[0]),
        new FileInfo(remoteFileData[1]),
        new FileInfo(remoteFileData[2]),
      };
versus
      FileInfo[] remoteFiles = new FileInfo[]
      {
        new FileInfo(new FileInfoData()
        {
          Name = "test1.mpg"
        }),
        new FileInfo(new FileInfoData()
        {
          Name = "test2.mpg"
        }),
        new FileInfo(new FileInfoData()
        {
          Name = "test10.mpg"
        }),
      };

Anonymous delegates are a pain and hard to read

I needed to pass in a delegate to a function as a factory callback. Perfect scenario for a nice concise lambda. But I was in 2.0, so that meant defining a delegate and anonymous delegate syntax resulting ing
  public delegate ILocalFileSystemManager LocalCreateDelegate(string localPath, string extension);

  public class FileSystemManagerFactory
  {
    public FileSystemManagerFactory( LocalCreateDelegate localFactory )
    {
      this.localFactory = localFactory;
    }
  }

FileSystemManagerFactory factory = new FileSystemManagerFactory(
  delegate(string localPath, string extension)
  {
    return new MockLocalFileSystemManager(localPath, extension);
  });
instead of
  public FileSystemManagerFactory( Func<string,string,ILocalFileSystemManager> localFactory )
  {
    this.localFactory = localFactory;
  }

FileSystemManagerFactory factory = new FileSystemManagerFactory(
  (localPath, extension)
  =>
  return new MockLocalFileSystemManager(localPath, extension);
  );

Automatic Properties

If there is one feature of C# (well and java as well) that is the most code generated, it's getters and setters. I've never liked how code generation tools created those for me, since i liked having my private members in one place and Properties in another. So i've been typing them out for years. But with C# 3.0, we got automatic properties. The two patterns, read/write properties and read-only properties are oft repeated like this
string readwriteMember;
string readonlyMember;

public string ReadWrite
{
  get { return readwriteMember; }
  set { readwriteMember = value; }
}

public string ReadOnly { get { return readonlyMember; }

Not a ton of code, but certainly takes more time to write than
public string ReadWrite { get; set; }

public string ReadOnly { get; private set; }
Ho hum.. I'm in 2.0, so i'll have to deal, but I certainly hope that 3.5 has a fast pick-up rate on the client (on the server, I can still control my environment).

Labels: ,

Friday, February 01, 2008

LINQ: Immutability vs. Deferred execution

The last couple of nights I've been playing with some Linq to Sql and a whole lot of Linq to Objects and I have to say where coming up with complex Regular Expressions used to be one of my favorite puzzles, coming up with complex projections and transformations through Linq is quickly taking its place. Simple Linq is well documented, but when it comes to aggregation, it's a lot sparser. I expect to write more of that up once I feel more comfortable with the syntax.

In the meantime, I wanted to write up some non-obvious observation about deferred execution with Linq. Considering the gotchas with lambdas, it's easy to extend the lessons learned to linq, since it is after all deferred execution. But what's different with Linq is that, while execution is deferred, the expression tree built via a query is also immutable. I came across this trying to do some simple query re-use.

Let's start with a simple DTO:

public class Order
{
  public Order(int id, int val, bool buyOrder)
  {
    Id = id;
    Value = val;
    IsBuyOrder = buyOrder;
  }
  public int Id { get; set; }
  public int Value { get; set; }
  public bool IsBuyOrder { get; set; }
}
And a set of this data:
Order[] orders = new Order[]
{
  new Order(1,2,true),
  new Order(2,2,false),
  new Order(3,4,true),
  new Order(4,4,false),
  new Order(5,6,true),
  new Order(6,6,false),
};

Let's split those into buy and sell orders:

var buyOrders = from order in orders
          where order.IsBuyOrder
          select order;

var sellOrders = from order in orders
                 where !order.IsBuyOrder
                 select order;

If we want to find the buy and the sell order with a value of 2, you'd think we could write one query and re-use it for both of those queries. Since both queries results in IEnumerable<Order>, how about we define a query source and assign the value of either above query.

IEnumerable<Order> orders2 = null;

var orderAtTwo = from order in orders2
                 where order.Value == 2
                 select order;

orders2 = buyOrders;
int buyOrderId = orderAtTwo.First().Id;

orders2 = sellOrders;
int sellOrderId = orderAtTwo.First().Id;

Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
Since the query is deferred until we call .First() on it, that seems like a reasonable syntax. Except this will result in an System.ArgumentNullException because our query grabbed a reference to orders2 at query definition, even though the query won't be executed until later. Giving orders2 a new value does not change the original reference in the immutable expression tree.

A way around this is to replace the actual contents of orders2. However, for us to do that, we have to turn it into the query source into a collection first.

orders2.Clear();
orders2.AddRange(buyOrders);
int buyOrderId = orderAtTwo.First().Id;

orders2.Clear();
orders2.AddRange(sellOrders);
int sellOrderId = orderAtTwo.First().Id;

Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
This gives us the expected
buy Id: 1, sell Id: 2
Let's put aside the awkwardness of clearing out a list and stuffing data back in, this code has another unfortunate sideeffect. .AddRange() actually executes the query passed to it, so we execute our buy and sell queries to populate orders2 and then execute orderAtTwo twice against those collections. The beauty of linq is that if you create a query from a query, your not running multiple queries, but building a more complex query to be executed. So, what we really want is query "re-use" that results in single expression trees at execution time.

To achieve this, we need to move the shared query into a separate method such as:

private IEnumerable<Order> GetTwo(IEnumerable<Order> source)
{
  return from order in source
         where order.Value == 2
         select order;
}
and the code becomes:
int buyOrderId = GetTwo(buyOrders).First().Id;
int sellOrderId = GetTwo(sellOrders).First().Id;

Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
This gives the same output as above, and we're only running two queries, each against the original collection. The method call means that we don't get to re-use an expression tree, since it builds a new one, combining the expression tree passed to it with the one it builds itself.

Labels: , ,