A while back I wrote that you really never have to write another delegate again, since any delegate can easily be expressed as an Action or Func. After all what's preferable? This:
var work = worker.ProcessTaskWithUser(delegate(Task t, User u) {
// define the work callback
});
or this:
var work = worker.ProcessTaskWithUser((t, u) => {
// define the work callback
});
I know I prefer lambdas over delegates. But this is just on the consuming end. The signature for the above could be either:
delegate Task TaskUserDelegate(Task inputTask, User contextUser);
IEnumerable<Task> ProcessTaskWithUser( TaskUserDelegate processCallback );
or:
IEnumerable<Task> ProcessTaskWithUser( Func<Task,User,Task> processCallback );
Either one can be used with the same lambda, so using the delegate doesn't inconvenience us in consumption. But writing the Func version is certainly more concise, so it seems like the winner once again. But in terms of consumption of that API, we've lost the signature of the method, which would explain what each parameter is used for. Sure, .Where(Func<T,bool> filter) is pretty self-explanatory, but .WhenDone(Func<T,V,string,T> callback) really doesn't tell us much of anything.
So there seems to be a straightforward usability rule of thumb: Use a delegate if the parameter's meaning isn't obvious from the usage of the lambda. But if the goal here is to make it easier for the consumer of the API, unfortunately it's not that simple, since the primary tool for communicating the API's documentation, intellisense, actually makes things worse.
For maximum usability, let's document the API so its meaning is discoverable:
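To make the "same lambda works for both" point concrete, here's a minimal sketch. WorkItem and Owner are hypothetical stand-ins for Task and User (so nothing collides with System.Threading.Tasks.Task), and RunNamed and RunFunc are names I made up for illustration, not part of any real API:

```csharp
using System;

// Hypothetical stand-ins for the Task and User types used in the post.
public class WorkItem { public string Name = ""; }
public class Owner { public string Login = ""; }

// Named delegate: the parameter names document what the callback receives.
public delegate WorkItem WorkItemOwnerDelegate(WorkItem inputItem, Owner contextOwner);

public static class Worker {
    // Signature using the named delegate.
    public static WorkItem RunNamed(WorkItemOwnerDelegate callback, WorkItem item, Owner owner) {
        return callback(item, owner);
    }

    // Signature using Func: more concise, but the parameter names are gone.
    public static WorkItem RunFunc(Func<WorkItem, Owner, WorkItem> callback, WorkItem item, Owner owner) {
        return callback(item, owner);
    }
}
```

A caller can hand the identical lambda text, say (t, u) => new WorkItem { Name = t.Name + "@" + u.Login }, to either method; the compiler converts it to whichever delegate type the parameter asks for.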
/// <summary>
/// The task user delegate is meant to transform a given task into a new one in the context of a user.
/// </summary>
/// <param name="inputTask">The task to transform.</param>
/// <param name="activeUser">The user context to use for the transform.</param>
/// <returns>A new task in the user's context.</returns>
delegate Task TaskUserDelegate(Task inputTask, User activeUser);
/// <summary>
/// Transform all tasks for a set of users.
/// </summary>
/// <param name="processCallback">Callback for transforming each task for a specific user</param>
/// <returns>Sequence of transformed tasks</returns>
IEnumerable<Task> ProcessTaskWithUser(TaskUserDelegate processCallback) {
//...
}
And this is what it looks like on code completion:

While TaskUserDelegate is well documented, this does not get exposed via intellisense. Worse, this signature tells us nothing about the arguments for our lambda. So, yeah, we created a better documented API, but made its discovery worse.
Now, let's do the same for the func signature:
/// <summary>
/// Transform all tasks for a set of users.
/// </summary>
/// <param name="processCallback">Callback for transforming each task for a specific user</param>
/// <returns>Sequence of transformed tasks</returns>
IEnumerable<Task> ProcessTaskWithUserx(Func<Task, User, Task> processCallback) {
//...
}
which gives us this completion:

Now we at least know the exact signature of the lambda we're creating, even if we don't know what the purpose of the arguments is.
In both cases, the best discoverability ends up being plain old textual documentation of the parameter. Even though the delegate provides extra documentation possibilities, access to them is inconvenient enough that, for expediency, I'd still have to vote for the Func signature.
The one exception to the rule would be a lambda that is meant as a dependency. I.e. a class or method that has a callback that you attach for later use, rather than immediate dispatch. In that case the lambda really functions as a single method interface and should be treated like any other dependency and be as explicit as possible.
Before getting into method slots, here's a quick review of the Promise lambda grammar:
lambda: [<signature>] <expression>;
signature: (<arg1>, ... <argN>[|<return-type>])
arg: [<type>] <argName>[=<init-expression>]
expression: <statement> | { <statement1>; ... <statementN>; }
A lambda can be called with positional arguments either with the parentheses-comma convention ( foo(x,y) ) or the space-separated convention ( foo x y ), or with a JSON object as argument ( foo{ bar: x, baz: y} ).
When I decided to use slots that you assign lambdas to as methods, I thought I'd be clever and make those slots polymorphic to get around shortcomings I perceived in the javascript model of just attaching functions to named fields. After listening to Rob Pike talk about Go at OSCON, I decided this bit of cleverness did not serve a useful purpose. In Go there are no overloads, because a different signature denotes different behavior and the method name should reflect that difference. Besides, even if you want overload-type behavior in Promise, you can get it via the JSON calling convention:
class Index {
Search:(|SearchResult) {
foreach(var keyvaluepair in $_) {
// handle undeclared named parameters
}
...
};
}
Basically the lambda signature is used to declare an explicit call contract, but using a JSON object argument, undeclared parameters can just as easily be passed in.
If a method is called with positional arguments instead of a JSON object, the default JSON object will contain a field called args with an array value:
class Index {
Search: {
...
};
}
Index.Search('foo','documents',10);
// $_ => { args: ['foo','documents',10] }
The above signature shows a method assigned a lambda without any signature, i.e. it accepts any input and returns an untyped object. Receiving $_.args is not contingent on that signature, it will always be populated, regardless of the lambda signature.
A class can also contain a wildcard method to catch all method calls that don't have an assigned slot.
class Index {
*: {
var (searchType) = $_._methodname./^Find_(.*)$/;
if(searchType.IsNil) {
throw new MethodMissingException();
}
...
};
}
The wild card method is a slot named *. Retrieving the call arguments is the same as with any other method without declared signature, i.e. $_ is used. In addition, the methodname used in the call is stuffed into $_ as the field _methodname.
The above example shows a method that accepts any call that starts with Find_ and takes the remainder of the name as the document type to find, such as Find_Images, Find_Pages, etc. This is done by using the built-in regex syntax, i.e. you can use ./<regex>/ and ./<regex>/<substitution>/ on any string (or the string an object converts to), similar to perl's m// and s///. Like perl, the call returns a list of captures, so using var with a list of fields, in this case one field called searchType, receives the captures if there is a match.
When a method is called that cannot be found on the Type, it throws a MethodMissingException. A wildcard method is simply a hook that catches that exception. By throwing it ourselves, our wildcard reverts to the default behavior for any method that doesn't match the desired pattern. This also gives parent classes or mix-ins the opportunity to fire their own wildcard methods.
Wildcard methods can only be declared in classes and mix-ins, not on Types. Types are supposed to be concrete contracts. The existence of a wildcard does mean that the class can satisfy any Type contract and can be used to dynamically implement type contracts without having to declare each method (think mocks).
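For readers who think in C#, the closest existing analogue I know of is the DLR's DynamicObject, whose TryInvokeMember hook fires for members the class does not define. The sketch below is only that, an analogy: DynamicIndex and the string it returns are made up for illustration, and nothing here is how Promise itself would be implemented.

```csharp
using System;
using System.Dynamic;
using System.Text.RegularExpressions;

// Hypothetical C# analogue of a wildcard slot, built on System.Dynamic.DynamicObject.
public class DynamicIndex : DynamicObject {
    private static readonly Regex FindPattern = new Regex("^Find_(.*)$");

    // The DLR calls this when a member is invoked that the class does not declare.
    public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result) {
        var match = FindPattern.Match(binder.Name);
        if (!match.Success) {
            result = null;
            return false; // decline, so the runtime reports the missing member as usual
        }
        var searchType = match.Groups[1].Value;
        result = "searching " + searchType + " for " + string.Join(",", args);
        return true;
    }
}
```

With dynamic index = new DynamicIndex(), a call such as index.Find_Images("foo", 10) is routed through the hook, while a name that doesn't match the pattern fails with a RuntimeBinderException, which plays roughly the role MethodMissingException plays above.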
Operators are really just methods called with the whitespace list syntax:
var x = 3;
var y = x + 5; // => 8
var z = x.+(5); // => 8
// most operators implement polish notation as appropriate
var v = x.+(5,6); // => 14
Operators are just methods, which means you can assign them yourself as well:
class Query {
List<Query> _compound;
op+: {
var q = Query();
q._compound.AddRange(_compound);
q._compound.AddRange($_.args);
q;
};
}
The only difference between a normal method slot and an operator slot is that the operator slot has the op prefix for disambiguation.
That concludes the overview of the things I think make Promise unique. There's certainly tons more to define for a functioning language, but most of that is going to be very much common syntax. So now it's time to buckle down and dig into antlr and the DLR to see what it will take to get some semblance of Promise functioning.
This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.
Lambdas in Promise, like other languages, are anonymous functions and first-class values. They act as closures over the scope they are defined in and can capture the free variables in their current lexical scope. Promise uses lambdas for all function definitions, going further even than javascript which has both named and anonymous functions. In Promise there are no named functions. Just slots that lambdas get assigned to for convenient access.
Straddling the statically/dynamically typed divide by allowing arguments and return values to optionally declare a Type, Promise mimics C# lambda syntax more than, say, LISP, javascript, etc. A simple lambda example looks like this:
var i = 0;
var incrementor = { ++i; };
print incrementor(); // => 1
This declaration doesn't have any input arguments to declare, so it uses a shortform of simply assigning a block to a variable. The standard form uses the lambda operator =>, so the above lambda could just as well be written as:
var incrementor = () => { ++i; };
I'm currently debating whether I really need the =>. It's mostly that i'm familiar with the form from C#. But given that there are no named functions, parentheses followed by a block can't occur otherwise, so there is no ambiguity. So, i'm still deciding whether or not to drop it:
var x = (x,y) => { x + y };
// vs.
var x = (x,y) { x + y };
The signature definition of lambdas borrows heavily from C#, using a left-hand side in parentheses for the signature, followed by the lambda operator. Input arguments can be typed, untyped or mixed:
var untypedAdd = (x,y) => { x + y; };
var typedAdd = (Num x, Num y) => { x + y; };
var mixedtypeAdd = (Num x, y) => { x + y; };
In dynamic languages, lambda definitions do not need a way to express whether they return a value–there is no type declaration, so whether or not to expect a value is convention and documentation driven. In C# on the other hand, a lambda can either not return a value (a void method, which uses one of the Action delegates) or return a value and declare it as the last type in the declaration using the Func delegates. In Promise all lambdas return a value, even if that value is nil (more about the special singleton instance nil later). Values can be returned by explicitly using the return keyword, otherwise it defaults simply to the value of the last statement executed before exiting the closure. Since return values can be typed, we need a way to declare that Type. Unlike C#, our lambdas aren't a delegate signature, so instead of reserving the last argument Type as the return Type, which would be ambiguous, Promise uses the pipe '|' character to optionally declare a return type:
var returnsUntyped = (x,y) => { x + y; };
var returnsTyped = (x,y|Num) => { x + y; };
var explicitReturn = (|Num) => { returnsTyped(1,2); };
Lambdas can also declare default values for arguments, which can be simple values or expressions:
var simple = (x=2,y=3) => { x + y; };
var complex= (x=simple()) => { x; };
Promise supports three different method calling styles. The first is the standard parentheses style as shown above. In this style, optional values can only be used by leaving out trailing arguments like this:
var f = (x=1,y=2,z=3) => { x + y +z; };
print f(2,2,2); // => 6
print f(2,2); // => 7
print f(4); // => 9
print f(); // => 6
If you want to omit a leading argument, you have to use the named calling style, using curly brackets, which was inspired by DekiScript. The curly bracket style uses json formatting, and since json is a first-class construct in Promise, calling the function by hand with {} or providing a json value behaves the same, allowing for simple dynamic argument construction:
print f{y: 1}; // => 5
print f{z: 1, y: 1}; // => 3
var args = {z: 5};
print f args; // => 8
Finally there is the whitespace style, which replaces parentheses and commas with whitespace. This style exists primarily to make DSL creation more flexible:
print f 2 2 2; // => 6
print f 2 2; // => 7
print f 4; // => 9
print f; // => 6
Note the final call simply uses the bare variable f. This is possible because in Promise a lambda requiring no arguments can take the place of a value and accessing the variable executes the lambda. Sometimes it's desirable to access or pass a reference to a lambda, not execute it, in which case the reference notation '&' is needed. Using reference notation on a value is harmless (at least that's my current thinking), since Promise has no value types, so the reference of a value is the value:
var x = 2;
var y = () => { x+10; };
var x2 = &x;
var y2 = &y;
var y3 = y;
x++;
print x2; // => 3;
print y2; // => 13;
print y3; // => 12;
The output of y3 is only 12, because assignment of y3 evaluated y, capturing x before it was incremented.
As mentioned above, lambdas can capture variables from their current scope. Scopes are nested, so a lambda can capture variables from any of the parent scopes:
var x = 1;
var l1 = (y) => {
return () => { x++; y++; x + y; };
};
print l1(2); // => 5
print l1(2); // => 6
print x; // => 3
Similar to javascript, a block is not a new scope. This is done primarily for scoping simplicity, even if it introduces some side-effects:
() => {
var x = 5;
if( x ) {
var x = 5; // illegal
var y = 10;
}
return y; // legal since the variable declaration was in the same scope
};
As I've said, lambdas are the basic building block, meaning there is no other type of function definition. You can use them as lazily evaluated values, you can pass them as blocks to be invoked by other blocks and, as I will discuss next time, Methods are basically polymorphic, named slots defined inside the closure of the class (i.e. capturing the class definition's scope), which is why there is no need for explicitly named functions.
A couple of weeks ago someone posted two links on twitter suggesting the trampolining superiority of Clojure to C#. (Can’t find that tweet anymore, and yes, that tweet was what finally motivated me to migrate my blog so i could post again.)
Well, the comparison wasn’t really apples to apples. The C# article “Jumping the trampoline in C# – Stack-friendly recursion” is trying to do a lot more than the Clojure article “Understanding the Clojure `trampoline’”. But maybe that’s the point as well: that with C# people end up building much more complex, verbose machinery when the simple style of Clojure is all that is needed. That determination wanders into the subjective, where language feuds are forged, and i’ll avoid that diversion this time around.
What struck me more was that you can do trampolining a lot more simply in C# if we mimic the Clojure example given. Let’s start by reproducing the stack overflow handicapped example first:
Func<int, int> funA, funB = null;
funA = n => n == 0 ? 0 : funB(--n);
funB = n => n == 0 ? 0 : funA(--n);
I’d dare say that the lambda syntax is more compact than the Clojure example.
Ok, ok, the body is artificially small, allowing me to replace it with a single expression, which isn’t a realistic scenario. Suffice it to say, you can get quite compact with expressions in C#.
Now, if we call funA(4), we’d get the same call sequence as in Clojure, i.e.
funA(4) -> funB(3) -> funA(2) -> funB(1) -> funA(0) -> 0
And if you, instead, call funA(100000), you’ll get a StackOverflowException.
So far so good, but here is where we diverge from Clojure. Clojure is dynamically typed, so it can return a number or an anonymous function that produces that number. We can’t do that (lest we return object, ick), but we can come pretty close.
The idea behind the Trampoline is that it unrolls the recursive calls into sequential calls, by having the functions involved return either a value or a continuation. The trampoline simply does a loop that keeps executing returned continuations until it gets a value, at which point it exits with that value.
What we need for C# then is a return value that can hold either a value or a continuation and with generics, we can create one class to cover this use case universally.
public class TrampolineValue<T> {
public readonly T Value;
public readonly Func<TrampolineValue<T>> Continuation;
public TrampolineValue(T v) { Value = v; }
public TrampolineValue(Func<TrampolineValue<T>> fn) { Continuation = fn; }
}
Basically it’s a container for either a value T or a func that produces a new value container. Now we can build our Trampoline:
public static class Trampoline {
public static T Invoke<T>(TrampolineValue<T> value) {
while(value.Continuation != null) {
value = value.Continuation();
}
return value.Value;
}
}
Let’s revisit our original example of two functions calling each other:
Func<int, TrampolineValue<int>> funA, funB = null;
funA = (n) => n == 0
    ? new TrampolineValue<int>(0)
    : new TrampolineValue<int>(() => funB(--n));
funB = (n) => n == 0
    ? new TrampolineValue<int>(0)
    : new TrampolineValue<int>(() => funA(--n));
Instead of returning an int, we simply return a TrampolineValue<int>. Now we can invoke funA without worrying about stack overflows like this:
Trampoline.Invoke(funA(100000));
Voila, the stack problem is gone. It may require a bit more plumbing than a dynamic solution, but not a lot more than adding type declarations, which will always be the syntactic differences between statically and dynamically typed. But with lambdas and inference, it doesn’t have to be much more verbose.
Using Trampoline.Invoke with TrampolineValue<T> is a fairly faithful translation of the Clojure example, but it doesn’t feel natural for C# and actually introduces needless verbosity. It’s functional rather than object-oriented, which C# can handle but it’s not its best face.
What TrampolineValue<T> and its invocation really represent is a lazily evaluated value. We really don’t care about the intermediaries, nor the plumbing required to handle it.
What we want is for funA to return a value. Whether that is the final value or lazily executes into the final value on examination is secondary. Whether or not TrampolineValue<T> contains a value or a continuation shouldn’t be our concern, neither should passing it to the plumbing that knows what to do about it.
So let’s internalize all this into a new return type, Lazy<T>:
public class Lazy<T> {
private readonly Func<Lazy<T>> _continuation;
private readonly T _value;
public Lazy(T value) { _value = value; }
public Lazy(Func<Lazy<T>> continuation) { _continuation = continuation; }
public T Value {
get {
var lazy = this;
while(lazy._continuation != null) {
lazy = lazy._continuation();
}
return lazy._value;
}
}
}
The code for funA and funB is almost identical, simply replacing TrampolineValue with Lazy:
Func<int, Lazy<int>> funA, funB = null;
funA = (n) => n == 0
    ? new Lazy<int>(0)
    : new Lazy<int>(() => funB(--n));
funB = (n) => n == 0
    ? new Lazy<int>(0)
    : new Lazy<int>(() => funA(--n));
And since the stackless chaining of continuations is encapsulated by Lazy, we can simply invoke it with:
var result = funA(100000).Value;
This completely hides the difference between a Lazy<T> that needs to have its continuation triggered and one that already has a value. Now that’s concise and easy to understand.
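The same shape works for recursion that carries state along. Here's a hedged sketch that sums 1..n with an accumulator. I've renamed the class to Bounce<T> (a made-up name) purely because .NET 4 and later ship their own System.Lazy<T>, which would collide with the name used above; SumTo is likewise just an illustration:

```csharp
using System;

// Same structure as the Lazy<T> above, under a hypothetical name that
// does not clash with System.Lazy<T>.
public class Bounce<T> {
    private readonly Func<Bounce<T>> _continuation;
    private readonly T _value;

    public Bounce(T value) { _value = value; }
    public Bounce(Func<Bounce<T>> continuation) { _continuation = continuation; }

    public T Value {
        get {
            // Keep running continuations in a loop until one yields a plain value.
            var current = this;
            while (current._continuation != null) {
                current = current._continuation();
            }
            return current._value;
        }
    }
}

public static class BounceDemo {
    // Sums 1..n. Each step returns a continuation instead of calling itself directly,
    // so the call stack never grows with n.
    public static Bounce<long> SumTo(long n, long acc) {
        return n == 0
            ? new Bounce<long>(acc)
            : new Bounce<long>(() => SumTo(n - 1, acc + n));
    }
}
```

BounceDemo.SumTo(100000, 0).Value walks all 100000 steps inside the while loop of the getter rather than on the stack.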
Currently working on a tag query engine and couldn’t find anything all that useful in the published approaches. I want to do arbitrary boolean algebras against a set of tags in a database, which seems to be out of scope of SQL approaches. All the various tagging schemas out there reduce to either just AND or just OR queries, but not complex logic. However, I want to be able to do something like:
(foo+bar)|(foo+baz)|(bar+^baz)
If there is a way to do this with SQL, i’d love to know. But the way i look at it, i really have to fetch all tags for each item and then apply that formula to the list of tags on the item.
But let’s break down the matching problem itself into something i can execute. Let’s assume I’ve got a simple parser that can turn the above into an AST. Really, i can decompose any variation into three operations, AND(a,b), OR(a,b) and NOT(a). And I can represent those with some simple Func<> definitions:
Func<bool, bool, bool> AND = (a, b) => a && b;
Func<bool, bool, bool> OR = (a, b) => a || b;
Func<bool, bool> NOT = (a) => !a;
Assuming that i have boolean tokens for foo, bar and baz, the expression becomes:
OR(AND(foo, bar), OR(AND(foo, baz), AND(bar,NOT(baz))))
Now, the above expression can be expressed as a function that takes three booleans describing the presence of the mentioned tags, ignoring any other tags that the item has, returning a boolean indicating a successful match. In C# that expression would look like this:
Func<bool[], bool> f = x => OR(AND(x[0], x[1]), OR(AND(x[0], x[2]), AND(x[1],NOT(x[2]))));
Next we need to generate this boolean map from the list of tags on the item. Assuming that the tag list is a list of strings, we can define an extension method on IEnumerable<string> to generate the boolean map like this:
public static bool[] GetBinaryMap(this IEnumerable<string> tags, string[] mask) {
var map = new bool[mask.Length];
foreach(var x in tags) {
for(var i = 0; i < mask.Length; i++) {
if(x == mask[i]) {
map[i] = true;
}
}
}
return map;
}
And with this we can define a linq query that will return us all matching items:
var mask = new[] { "foo", "bar", "baz"};
Func<bool[], bool> f = x => OR(AND(x[0], x[1]), OR(AND(x[0], x[2]), AND(x[1],NOT(x[2]))));
var match = from item in items
where f(item.Tags.GetBinaryMap(mask))
select item;
Clearly this isn’t the fastest executing query, since we first had to create our items, each of which has a collection of tags. But there are a lot of optimizations left on the table here, such as using our tag mask to pre-qualify items, breaking down the AST into sub-matches that could be used against a cache to find items, etc.
But at least we have a fairly simple way to take complex boolean algebra on tags and convert it into something that we can evaluate generically.
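The post assumes a parser without showing what it produces, so here is one hedged guess at the missing piece: a tiny AST plus a function that turns it into the Func<bool[], bool> used above. The node classes (Tag, Not, And, Or) and the TagQuery helper are all hypothetical names of mine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical AST node types; the parser that would build them is assumed, not shown.
public abstract class Node { }
public class Tag : Node { public readonly string Name; public Tag(string name) { Name = name; } }
public class Not : Node { public readonly Node Inner; public Not(Node inner) { Inner = inner; } }
public class And : Node {
    public readonly Node Left, Right;
    public And(Node left, Node right) { Left = left; Right = right; }
}
public class Or : Node {
    public readonly Node Left, Right;
    public Or(Node left, Node right) { Left = left; Right = right; }
}

public static class TagQuery {
    // Collects each distinct tag name in first-seen order; the result is the mask.
    public static string[] BuildMask(Node node) {
        var names = new List<string>();
        Collect(node, names);
        return names.ToArray();
    }

    private static void Collect(Node node, List<string> names) {
        if (node is Tag t) { if (!names.Contains(t.Name)) names.Add(t.Name); }
        else if (node is Not n) Collect(n.Inner, names);
        else if (node is And a) { Collect(a.Left, names); Collect(a.Right, names); }
        else if (node is Or o) { Collect(o.Left, names); Collect(o.Right, names); }
    }

    // Builds a predicate over the boolean map generated for the given mask.
    public static Func<bool[], bool> Compile(Node node, string[] mask) {
        if (node is Tag t) {
            int index = Array.IndexOf(mask, t.Name);
            if (index < 0) throw new ArgumentException("Tag not in mask: " + t.Name);
            return map => map[index];
        }
        if (node is Not n) {
            var inner = Compile(n.Inner, mask);
            return map => !inner(map);
        }
        if (node is And a) {
            var left = Compile(a.Left, mask);
            var right = Compile(a.Right, mask);
            return map => left(map) && right(map);
        }
        if (node is Or o) {
            var left = Compile(o.Left, mask);
            var right = Compile(o.Right, mask);
            return map => left(map) || right(map);
        }
        throw new ArgumentException("Unknown node type");
    }
}
```

The mask from BuildMask can be passed to GetBinaryMap and the compiled predicate dropped into the where clause in place of the hand-written f, so the index positions never have to be kept in sync by hand.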
This is definitely an edge case testing scenario, so i don’t know how useful this utility class is in general, but i thought it was kinda fun deferred execution stuff, so why not post it?
Here’s my scenario. I’ve built some Dream REST services that i’ve set up to create an inner Autofac container per request to create per request instances of objects — things like NHibernate ISession and other disposable resources that only make sense in a per request context.
Now i’m writing my unit tests around these services, and need to provide mocks for these inner container created objects. I should also mention that to test the services, i am firing up a DreamHost, since the services can only be tested in the context of the full pipeline. Yeah, i know, smells a bit functional, but that’s what I have to work with right now. And i need these objects to be ContainerScoped (i.e. per inner container singletons), so that multiple Resolve‘s return the same instance, but still return different instances on multiple requests. Ok, ok, i know the tests are doing too much… Like i said, this is an edge case. It’s not strictly a unit test, but i still want coverage on this code. Getting around this would require refactoring of code that’s not part of my project, so there you go.
What I want to do is set up the mock for the inner container instance on creation, which doesn’t happen until i’ve handed over execution control to the Act part of the test. This led me to create a factory that provides a hook for setting up the mock on creation of the mock:
public class DelegatedMoqFactory<T> where T : class
{
private Action<Mock<T>, IContext> setupCallback;
private Mock<T> mock;
public Mock<T> CurrentMock { get { return mock; } }
public T CreateInstance(IContext container)
{
mock = new Mock<T>();
if (setupCallback != null)
{
setupCallback(mock, container);
}
return mock.Object;
}
public void OnResolve(Action<Mock<T>, IContext> setupCallback)
{
this.setupCallback = setupCallback;
}
}
A sample autofac wire-up looks like this:
builder.RegisterGeneric(typeof(DelegatedMoqFactory<>));
builder.Register(c => c.Resolve<DelegatedMoqFactory<IAuthService>>().CreateInstance(c))
.ContainerScoped();
With a test setup of the IAuthService being done like this:
container.Resolve<DelegatedMoqFactory<IAuthService>>()
.OnResolve((m, c) => m.Setup(x =>
x.AuthenticateAndGetAccountId("authtoken")).Returns(1234));
The open generic of DelegatedMoqFactory is registered with default scope, since i want it to exist outside the inner scope, so that i can resolve it to wire up my expectations for the mock. Then on the first access for IAuthService inside the inner scope, the DelegatedMoqFactory creates the mock and calls my OnResolve callback to set up the mock.
The reason there is also a CurrentMock accessor is so that I can do verification on the mock after the inner container has gone out of scope, like this:
container.Resolve<DelegatedMoqFactory<IAuthService>>()
.CurrentMock.Verify(x =>
x.CreateAuthToken(It.IsAny<IAccount>()), Times.Never());
This class should be useful whenever you are testing some code that internally creates an inner container and scoping the objects usually created under ContainerScope as default scope doesn’t work (likely because there’s multiple inner containers). We still get per inner container instances, but get to wire them up with deferred setups that don’t come into play until the mocks are actually pulled from the inner container.
The last couple of nights I’ve been playing with some Linq to Sql and a whole lot of Linq to Objects and I have to say where coming up with complex Regular Expressions used to be one of my favorite puzzles, coming up with complex projections and transformations through Linq is quickly taking its place. Simple Linq is well documented, but when it comes to aggregation, it’s a lot sparser. I expect to write more of that up once I feel more comfortable with the syntax.
In the meantime, I wanted to write up a non-obvious observation about deferred execution with Linq. Considering the gotchas with lambdas, it’s easy to extend the lessons learned to linq, since it is after all deferred execution. But what’s different with Linq is that, while execution is deferred, the expression tree built via a query is also immutable. I came across this trying to do some simple query re-use.
Let’s start with a simple DTO:
public class Order {
    public Order(int id, int val, bool buyOrder) {
        Id = id;
        Value = val;
        IsBuyOrder = buyOrder;
    }
    public int Id { get; set; }
    public int Value { get; set; }
    public bool IsBuyOrder { get; set; }
}
And a set of this data:
Order[] orders = new Order[] {
    new Order(1,2,true),
    new Order(2,2,false),
    new Order(3,4,true),
    new Order(4,4,false),
    new Order(5,6,true),
    new Order(6,6,false),
};
Let’s split those into buy and sell orders:
var buyOrders = from order in orders
                where order.IsBuyOrder
                select order;
var sellOrders = from order in orders
                 where !order.IsBuyOrder
                 select order;
If we want to find the buy and the sell order with a value of 2, you’d think we could write one query and re-use it for both of those queries. Since both queries result in IEnumerable<Order>, how about we define a query source and assign it the value of either of the above queries?
IEnumerable<Order> orders2 = null;
var orderAtTwo = from order in orders2
                 where order.Value == 2
                 select order;
orders2 = buyOrders;
int buyOrderId = orderAtTwo.First().Id;
orders2 = sellOrders;
int sellOrderId = orderAtTwo.First().Id;
Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
Since the query is deferred until we call .First() on it, that seems like a reasonable syntax. Except this will result in a System.ArgumentNullException because our query grabbed a reference to orders2 at query definition, even though the query won’t be executed until later. Giving orders2 a new value does not change the original reference in the immutable expression tree.
A way around this is to replace the actual contents of orders2. However, for us to do that, we have to turn the query source into a collection first.
// assumed setup (not shown in the original snippet): orders2 is now a List<Order>
// and the query is defined against that list
var orders2 = new List<Order>();
var orderAtTwo = from order in orders2
                 where order.Value == 2
                 select order;
orders2.Clear();
orders2.AddRange(buyOrders);
int buyOrderId = orderAtTwo.First().Id;
orders2.Clear();
orders2.AddRange(sellOrders);
int sellOrderId = orderAtTwo.First().Id;
Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
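A smaller experiment makes the same point without the null. In this sketch (QueryCaptureDemo and its variable names are mine, purely for illustration) the source starts out as a real sequence, the query is defined, and then the variable is pointed elsewhere:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryCaptureDemo {
    public static int[] Run() {
        IEnumerable<int> source = new[] { 1, 2, 3 };

        // Where(...) receives the current value of 'source' as an argument right here.
        var evens = source.Where(n => n % 2 == 0);

        // Re-pointing the variable does not touch the query that was already built.
        source = new[] { 4, 6, 8 };

        // Execution is deferred until now, but it still runs over { 1, 2, 3 }.
        return evens.ToArray();
    }
}
```

Run() comes back with just the single element 2, not 4, 6, 8: the filtering is deferred, but the choice of source was made when the query was defined.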
This gives us the expected
buy Id: 1, sell Id: 2
Putting aside the awkwardness of clearing out a list and stuffing data back in, this code has another unfortunate side effect: .AddRange() actually executes the query passed to it, so we execute our buy and sell queries to populate orders2 and then execute orderAtTwo twice against those collections.
The beauty of linq is that if you create a query from a query, you’re not running multiple queries, but building a more complex query to be executed. So, what we really want is query “re-use” that results in single expression trees at execution time.
To achieve this, we need to move the shared query into a separate method such as:
private IEnumerable<Order> GetTwo(IEnumerable<Order> source) {
    return from order in source
           where order.Value == 2
           select order;
}
and the code becomes:
int buyOrderId = GetTwo(buyOrders).First().Id;
int sellOrderId = GetTwo(sellOrders).First().Id;
Console.WriteLine("buy Id: {0}, sell Id: {1}", buyOrderId, sellOrderId);
This gives the same output as above, and we’re only running two queries, each against the original collection. The method call means that we don’t get to re-use an expression tree, since it builds a new one, combining the expression tree passed to it with the one it builds itself.
This is closely related to my last post on deferred execution gotchas and it’s basically more “if you inline delegated code, you may easily overlook scope side-effects”. This time it’s about dealing with foreach and using the local item of each pass for deferred execution.
public void SpawnActions() {
    foreach (ActionContext context in contexts) {
        int id = context.Id;
        Action<int> callback = (workerNumber) => {
            Console.WriteLine("{0} Id: {1}/{2}", workerNumber, id, context.Id);
        };
        ThreadPool.QueueUserWorkItem(new WaitCallback(FutureExecute), callback);
    }
}

public void FutureExecute(object state) {
    int id = worker++;
    Action<int> callback = state as Action<int>;
    Thread.Sleep(500);
    callback(id);
}
The output looks like this:
0 Id: 0/9
1 Id: 1/9
2 Id: 2/9
3 Id: 3/9
4 Id: 4/9
5 Id: 5/9
6 Id: 6/9
7 Id: 7/9
8 Id: 8/9
9 Id: 9/9
So while the foreach scope variable context is kept alive for the deferred execution, it turns out that foreach re-uses the variable on each pass through the loop, and therefore when the Action finally executes, context refers to whatever item the loop reached last (this is how C# compilers before C# 5 treat the foreach variable). The fix is to copy the item into a variable scoped inside the loop body:
foreach (ActionContext context in contexts) {
    int id = context.Id;
    // locally scoped variable
    ActionContext c2 = context;
    Action<int> callback = (workerNumber) => {
        Console.WriteLine("{0} Id: {1}/{2}", workerNumber, id, c2.Id);
    };
    ThreadPool.QueueUserWorkItem(new WaitCallback(FutureExecute), callback);
}
And now our results are a bit more what we expected:
0 Id: 0/0
1 Id: 1/1
2 Id: 2/2
3 Id: 3/3
4 Id: 4/4
5 Id: 5/5
6 Id: 6/6
7 Id: 7/7
8 Id: 8/8
9 Id: 9/9
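The same capture behavior can be seen without any threads. This sketch uses a plain for loop, whose counter is a single variable shared by every iteration; CaptureDemo and its names are made up for the illustration:

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo {
    public static List<string> Run() {
        var callbacks = new List<Func<string>>();
        for (int i = 0; i < 3; i++) {
            int copy = i; // a fresh variable on every pass through the loop body
            // The lambda captures the variables themselves, not their current values:
            // 'i' is shared across iterations, 'copy' is not.
            callbacks.Add(() => i + "/" + copy);
        }

        var results = new List<string>();
        foreach (var callback in callbacks) {
            results.Add(callback()); // runs after the loop has finished, when i == 3
        }
        return results;
    }
}
```

Run() produces 3/0, 3/1 and 3/2: the shared counter shows its final value in every callback, while the per-iteration copy keeps the value it had when the lambda was created.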
I recently wrote about Action & Func, which along with lambda expressions let you do easy inline callbacks like this:
Utility.ActionDownloader.Download(
Configuration.GetAssetUri(dto.Url),
(Downloader d) =>
{
FloatContainer c = (FloatContainer)XamlReader.Load(d.ResponseText);
c.Initialize(dto);
});
That is, I can call a downloader and pass it, inline, a bit of code to execute once the download completes. The catch, of course, is that reading the code and following the usual visual flow hides the fact that c.Initialize(dto) doesn’t get called until some asynchronous time in the future. That has always been a side-effect of delegates, but until they became anonymous and inline, delegate code never looked like part of the current flow when it wasn’t.
What happened was that I needed my main routine to execute some code after FloatContainer was initialized, and by habit I created an Initialized event on FloatContainer. Of course this was superfluous, since my lambda expression called the synchronous Initialize, i.e. my action could be placed inline after the call to c.Initialize(dto) and be guaranteed to run after initialization had completed.
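To make the ordering concrete, here is a small sketch under stated assumptions: a Queue of Action stands in for the asynchronous downloader, and DeferredOrderDemo and the log strings are invented for illustration. It shows that the line after the call runs before the callback body, while a statement placed inside the callback after the synchronous step is guaranteed to run after it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo; the queue simulates "some asynchronous time in the future".
public static class DeferredOrderDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var pending = new Queue<Action>();

        log.Add("before call");
        // Looks like it runs here, but it is only stored for later.
        pending.Enqueue(() =>
        {
            log.Add("initialize");       // the synchronous step
            log.Add("after initialize"); // safe to put follow-up work here
        });
        log.Add("after call");

        // The "download" completes: run whatever was queued.
        while (pending.Count > 0)
        {
            pending.Dequeue()();
        }
        return log;
    }

    public static void Main()
    {
        // before call, after call, initialize, after initialize
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```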
This scenario just meant I created some superfluous code. However, I’m sure as I use lambda expressions more, there will be more pitfalls of writing code that doesn’t consider that its execution time is unknown, as is the state of the objects tied to the scope of the expression.
This last bit about objects tied to the expression scope is especially tricky and I think we will see some help in terms of Immutable concepts weaving their way into C# 3.x or 4.0, as the whole functional aspect of lambda expressions really works best when dealing with objects that cannot change state. Eric Lippert’s been laying the groundwork in a number of posts on the subject and while he constantly disclaims that his ponderings are not a roadmap for C#, I am still going to assume that his interest and recognition of the subject of Immutables will have some impact in a future revision of the language. Well, I at least hope it does.
With lambda expressions in C#, the Func generic delegate and its variations have been getting a lot of attention. So naturally, you might think that the lambda syntax is just a shortcut for creating anonymous delegates, whether they return values or not.
First let's look at the evolution of delegates from 1.1 to now. Delegates are simply the method equivalent of function pointers. They let you pass a method call as an argument for later execution. The cool thing (and a garbage collection pitfall) is that a delegate creates a lexical closure, i.e. the delegate carries with it the object that the method gets called on. For garbage collection this means that a delegate prevents that object from being collected. That's why it's important to unsubscribe from those events you subscribed to.
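As a quick illustration of the delegate carrying its object along, the sketch below binds a delegate to an instance method and inspects the delegate's Target property. NextValueDelegate, Counter and DelegateTargetDemo are hypothetical names used only here.

```csharp
using System;

// Hypothetical delegate and class, named only for this illustration.
public delegate int NextValueDelegate();

public class Counter
{
    public int X;
    public int Next() { return X++; }
}

public static class DelegateTargetDemo
{
    // The returned delegate holds a reference to the counter it was bound to.
    public static NextValueDelegate Bind(Counter counter)
    {
        return new NextValueDelegate(counter.Next);
    }

    public static void Main()
    {
        var counter = new Counter();
        NextValueDelegate next = Bind(counter);

        // Target is the object the method will be called on.
        Console.WriteLine(ReferenceEquals(next.Target, counter)); // True

        next();
        next();
        Console.WriteLine(counter.X); // 2
    }
}
```

As long as something holds on to the delegate, for example an event's subscriber list, the object behind Target stays reachable and cannot be collected.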
But I digress. Let's define a delegate that returns an int and a method that matches that delegate:
delegate int IntProducerDelegate();
public int x = 0;
public int IntProducer()
{
return x++;
}
With the original .NET 1.0 syntax we'd create the delegate like this:
IntProducerDelegate p1 = new IntProducerDelegate(IntProducer);
Now we can call p1() and get an integer back, and since it's a closure, each time we call p1() the originating object's x increases, as does our return value. Then, in .NET 2.0, we got anonymous delegates.
IntProducerDelegate p2 = delegate { return IntProducer(); };
// or with IntProducer's action inlined...
IntProducerDelegate p3 = delegate { return x++; };
This got rid of the need to create a method just to pass along a closure that manipulated our object at a later time. The other thing that anonymous delegates reinforce is that delegates just care about signature. IntProducerDelegate can get assigned any anonymous delegate that takes no arguments and returns an int. That sounds like a perfect scenario for generics and in .NET 3.5, we got just that, a set of generic delegates called Func. Using Func, we quickly get to our lambda expression replacing the original delegate syntax like this:
// create a new Func delegate that plays the same role as the IntProducerDelegate
// (the two delegate types are not interchangeable, so the variable is declared as Func<int>)
Func<int> p4 = new Func<int>(IntProducer);
// which means that we don't need IntProducerDelegate at all anymore
Func<int> p5 = delegate { return x++; };
// and the anonymous delegate can also be shorthanded with a lambda expression
Func<int> p6 = () => { return x++; };
// which says, given that we take no argument "()", execute and return the following "return x++;"
However, before there ever was Func, .NET 2.0 introduced the generic delegate Action, which is a natural counterpart to Func, encapsulating a method that does not return anything. Following through the example of the producer, we'll create a consumer like:
delegate void IntConsumerDelegate(int i);
public void IntConsumer(int i)
{
Console.WriteLine("The number is {0}", i);
}
Now following the same evolution of syntax we get this:
IntConsumerDelegate c1 = new IntConsumerDelegate(IntConsumer);
Action<int> c2 = new Action<int>(IntConsumer);
Action<int> c3 = delegate(int i) { Console.WriteLine("The number is {0}", i); };
Action<int> c4 = (i) => { Console.WriteLine("The number is {0}", i); };
So lambda syntax can be used to create either a Func or an Action. And that also means that we never need to explicitly declare another delegate, since variations of these two generic delegates are all the arsenal we need for storing lambda expressions of all kinds.
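To round this out, here is a short sketch of a few of those variations side by side. FuncActionDemo, Add and the log labels are hypothetical; the Func and Action overloads with several type parameters are the standard ones from the System namespace.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class showing Func and Action variations.
public static class FuncActionDemo
{
    public static int Add(int a, int b) { return a + b; }

    public static List<string> Run()
    {
        var log = new List<string>();

        // Two inputs, one result; assigned from a method group.
        Func<int, int, int> add = Add;
        // One input, bool result; expression-bodied lambda.
        Func<int, bool> isEven = n => n % 2 == 0;
        // Two inputs, no result.
        Action<string, int> record = (label, value) => log.Add(label + "=" + value);

        record("sum", add(2, 3));
        record("even", isEven(4) ? 1 : 0);
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // sum=5, even=1
    }
}
```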