Saying LINQ is about databases is missing its true benefits

Just came across a long discussion about LINQ in Java on the ODBMS blog (thanks to Miguel's tweet). There is some excellent discussion in there, but aside from a couple of people, the comments seemed to largely center on two objections:

  1. It's from MS so it's a bad idea to copy, and
  2. I don't think it adds anything over normal SQL syntax

The first is an unfortunate dismissal of a very powerful functional language construct because of its origin. And the second illustrates that the commenter does not truly understand what LINQ brings to the language in the first place.

Of course, I'd bet that 90% of .NET developers, if polled, would also equate LINQ with "type-safe SQL in the language", so this isn't a dig against Java people. Hopefully as Parallel LINQ gains some traction, this simplification will lose its hold on people.

LINQ, or Language INtegrated Query, is really a functional way of expressing operations on collections. If you decompose a lot of code, anywhere you are using loops to manipulate collections, LINQ is likely to give you a more concise and powerful expression for the same operation. And being functional, the implementation of how LINQ does this is opaque to the caller. The caller simply describes what operations should be done on the data sources, which allows the operations to be optimized for those data sources. The data could be in a database, it could come from XML or REST calls, or it could simply exist as an in-memory object graph. None of these things change the transformations desired.
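To make the loops-versus-expression point concrete, here is a minimal sketch. It is written in Java using the java.util.stream API as a rough stand-in for LINQ to objects, not LINQ itself, and the class and method names are made up for illustration. Both methods filter, sort and transform a list of words; the first uses loops and temporary lists, the second only describes the operations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; names are illustrative, not from any library.
class LoopVsPipeline {

    // Imperative version: explicit loops and temporary lists.
    static List<String> longWordsWithLoops(List<String> words, int minLength) {
        List<String> matches = new ArrayList<>();
        for (String w : words) {
            if (w.length() >= minLength) {
                matches.add(w);
            }
        }
        Collections.sort(matches);

        List<String> result = new ArrayList<>();
        for (String w : matches) {
            result.add(w.toUpperCase());
        }
        return result;
    }

    // Declarative version: states what should happen to the data,
    // and leaves the iteration to the library.
    static List<String> longWordsWithPipeline(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() >= minLength)
                .sorted()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }
}
```

The two methods return the same list for the same input. The second one has no temporary variables to keep track of, which is the kind of cleanup described above.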

I've only used LINQ once for SQL, although I use LINQ to objects, i.e. against IEnumerable<T>, almost daily, and it's done away with a lot of foreach loops with temporary variables, temporary lists, etc. But even that one scenario, against the apparently now deprecated DLINQ or Linq2Sql, illustrated that it wasn't just about replacing SQL with type-safe syntax: it allowed me to use one syntax for both database and local operations.

This project included doing a bunch of analytical processing against the data, including projections combining a number of data sources. Not all of this was expressible in SQL, and some of it wasn't a wise use of live DB queries, so performing the additional work in memory was a lot more efficient. Traditionally this type of work would exhibit a fairly obvious syntactic break between the local and the DB operations, and moving some part (say a sort of a sub-set) from SQL to local or vice versa would be a significant re-write. But using LINQ, the syntax was identical; it was merely a matter of deciding at what point the query should be turned into concrete data vs. a cursor against the DB. This is simply done by turning any IEnumerable into a List, which forces immediate execution of the query represented by the IEnumerable source. Either way, local or remote, the syntax stayed the same, and the decision of where processing should happen was in my hands.
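The "decide where the query becomes concrete data" idea can be sketched as follows, again in Java with java.util.stream as an approximation. Java streams do not translate to SQL, so the "remote" source here is simulated with a counter, and every name in the block is hypothetical. What it does show is the deferred part: building the query fetches nothing, and collecting into a List is the point where execution is forced.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class; the "remote" source is simulated.
class DeferredExecution {

    // Counts how many rows have been pulled from the pretend remote source.
    static final AtomicInteger rowsFetched = new AtomicInteger();

    // Stand-in for a query against a database: rows are only produced
    // when something downstream asks for them.
    static Stream<Integer> remoteRows() {
        return Stream.of(5, 3, 8, 1, 9, 2)
                .peek(row -> rowsFetched.incrementAndGet());
    }

    static List<Integer> sortedRowsAtLeast(int minValue) {
        // Still only a description of the query; nothing is fetched yet.
        Stream<Integer> query = remoteRows().filter(row -> row >= minValue);

        // Materialize: collecting into a List forces the query to run.
        List<Integer> local = query.collect(Collectors.toList());

        // Same style of operations, now running purely in memory.
        return local.stream().sorted().collect(Collectors.toList());
    }
}
```

Moving the sort to the other side of the materialization step means moving one line, not rewriting the query in a different language, which is the property the paragraph above describes for LINQ.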

I do support the goal of getting something akin to LINQ in Java, but I sure hope they don't attack the problem by creating some DB-centric query DSL. The greatest benefit of LINQ in .NET, imho, is that instead of hacking its syntax into the language, it was built out of general-purpose building blocks: anonymous delegates, lambda expressions, anonymous types, var syntax and object initializers. Each one of these pieces is a fundamental part of C# 3.0 and can be used independently, but together they allow LINQ to exist. Discussions of common fluent APIs for databases or whole new languages like SBQL miss the benefit of "Language Integrated" in LINQ.

Going to be interesting to follow how this evolves, since I personally think that LINQ is one of the key differentiators between Java and C# that isn't just syntactic sugar (even though many think it is just that).