ILoggable

A place to keep my thoughts on programming

geekblog [at] claassen [dot] net

Sunday, November 02, 2008

Saying LINQ is about databases is missing its true benefits

Just came across a long discussion about LINQ in Java on the ODBMS blog (thanks to Miguel's tweet). There is some excellent discussion in there, but aside from a couple of people the discussion seemed to largely center on
  1. It's from MS so it's a bad idea to copy, and
  2. I don't think it adds anything over normal SQL syntax

The first is an unfortunate dismissal of a very powerful functional language construct because of its origin. And the second illustrates that the commenter does not truly understand what LINQ brings to the language in the first place.

Of course, I'd bet that 90% of .NET developers, if polled, would also equate LINQ with "type-safe SQL in the language", so this isn't a dig against Java people. Hopefully as Parallel LINQ gains some traction, this simplification will lose its hold on people.

LINQ, or Language INtegrated Query, is really a functional way of expressing operations on collections. If you decompose a lot of code, anywhere you are using loops to manipulate collections, LINQ is likely to give you a more concise and powerful expression of the same operation. And being functional, the implementation of how LINQ does this is opaque to the caller. The caller simply describes what operations should be done on the data sources, allowing for optimization of the operations based on the data sources. That means the data could be in a database, could come from XML or REST calls, or could simply exist as an in-memory object graph. None of these things change the transformations desired.
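To make the "loops vs. describing the operation" point concrete on the java side, here is a minimal sketch using the Streams API that later shipped with Java 8. It is only an analogue of LINQ to objects, not LINQ itself, and the class and method names (CollectionQueryDemo, shortNamesWithLoop, shortNamesWithStream) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class CollectionQueryDemo {

    // Imperative version: a loop, a temporary list and an explicit sort.
    static List<String> shortNamesWithLoop(List<String> names, int maxLength) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() <= maxLength) {
                result.add(name.toUpperCase());
            }
        }
        Collections.sort(result);
        return result;
    }

    // Declarative version: describe the filter, the projection and the
    // ordering, and leave the iteration to the library.
    static List<String> shortNamesWithStream(List<String> names, int maxLength) {
        return names.stream()
                .filter(name -> name.length() <= maxLength)
                .map(String::toUpperCase)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("miguel", "arne", "bob", "alexander");
        System.out.println(shortNamesWithLoop(names, 4));
        System.out.println(shortNamesWithStream(names, 4));
    }
}
```

Both methods return the same list; the second one just states the filter, projection and sort instead of spelling out the loop and the temporary list.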

I've only used LINQ once for SQL, although I use LINQ to objects, i.e. against IEnumerable<T>, almost daily, and it's done away with a lot of foreach loops with temporary variables, temporary lists, etc. But even that one scenario, against the apparently now deprecated DLINQ or Linq2Sql, illustrated that it wasn't just about replacing SQL with type-safe syntax: it allowed me to use one syntax for both database and local operations.

This project included doing a bunch of analytical processing against the data, including projections combining a number of data sources. Not all of this was expressible in SQL, some of it wasn't a wise use of live DB queries, and performing the additional work in memory was a lot more efficient. Traditionally this type of work would exhibit a fairly obvious syntactic break between the local and the DB operations, and moving some part (say a sort of a sub-set) from SQL to local or vice versa would be a significant re-write. But using LINQ, the syntax was identical; it was merely a matter of deciding at what point the query should be turned into concrete data vs. a cursor against the DB. This is simply done by turning any IEnumerable into a List (for example with ToList()), forcing immediate execution of the query represented by the IEnumerable source. Either way, local or remote, the syntax stayed the same, and the decision of where processing should happen was in my hands.
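The "query vs. concrete data" distinction can be sketched in java as well, again with the later Java 8 Streams API standing in for IEnumerable. This is an in-memory illustration only, with no database involved; DeferredQueryDemo, evensOf and the counter are hypothetical names introduced for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DeferredQueryDemo {

    // Builds the query but does not run it. The counter (an illustrative
    // helper) records how many elements the filter has actually inspected.
    static Stream<Integer> evensOf(List<Integer> source, AtomicInteger inspected) {
        return source.stream().filter(n -> {
            inspected.incrementAndGet();
            return n % 2 == 0;
        });
    }

    public static void main(String[] args) {
        AtomicInteger inspected = new AtomicInteger();
        Stream<Integer> query = evensOf(Arrays.asList(1, 2, 3, 4), inspected);
        // Nothing has been evaluated yet; the stream is only a description.
        System.out.println("inspected before collect: " + inspected.get());
        // Collecting into a List forces execution, much like ToList() in C#.
        List<Integer> concrete = query.collect(Collectors.toList());
        System.out.println("inspected after collect: " + inspected.get());
        System.out.println(concrete);
    }
}
```

One difference worth keeping in mind: a java Stream can only be consumed once, whereas an IEnumerable can be enumerated repeatedly, so the analogy only goes so far.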

I do support the goal of getting something akin to LINQ in java, but I sure hope they don't attack the problem by creating some DB-centric query DSL. The greatest benefit of LINQ in .NET, imho, is that instead of hacking its syntax into the language, it was assembled from the building blocks of anonymous delegates, lambda expressions, anonymous types, var syntax and object initializers. Each one of these pieces is a fundamental part of C# 3.0 and can be used independently, but together they allow LINQ to exist. Discussion of common fluent APIs for databases or whole new languages like SBQL misses the benefit of "Language Integrated" in LINQ.

Going to be interesting to follow how this evolves, since I personally think that LINQ is one of the key differentiators between Java and C# that isn't just syntactic sugar (even though many think it is just that).


Monday, October 06, 2008

C# 3.0 language features vs. Java's OS community

I've been in a constant struggle for a number of years and it's a battle between wanting to program in C# but envying the variety of open source projects available on the JVM.

And this isn't so much a windows/linux thing, because mono vs. native .NET is really the least of my problems. The mono team has done an amazing job letting me write code without worry in Visual Studio and deploy on linux. With mono 2.0 just out, I have no code left that doesn't just work when I copy the DLLs over. The fact that I do a lot of .NET on linux is not a pain point for me.

But when it comes to open source libraries, the eco-system on java is just so much richer. Think of an API or protocol and google it for java and you'll find it. Do the same for .NET and your chances are much smaller. And if you find it, chances are it's a port of Jfoo to Nfoo. So you could say, I've got library envy.

So why not just go java? Well, there are a number of C# language features that many might accuse of being "sugar", but i find them incredibly useful to express my intent in code. Here are the things that keep me in C# 3.0:

So there you go, I'm a C# fan boy. I admit it. But still, i got this library envy. What's a guy to do? I see two possible paths, C# on the JVM or java libs on mono. Both are mostly possible, but I really need to do some spikes to know where the pain starts with the two options.

C# on the JVM: Mainsoft Grasshopper

Author in C#, cross-compile IL into java byte code. Mainsoft is apparently collaborating with the mono guys, so generally what makes it into mono in terms of language features will make it into Grasshopper. Need to take some of my projects and see what happens when i try this.

Java on C#: IKVM

Compile java code into .NET IL using IKVM. This is dependent on the state and compatibility of IKVM and Java. Need to take some popular java open source projects and see what happens when i try to build them using IKVM.

Either sounds like inviting a life of debugging edge cases that no one else cares about. But I don't see a better option. Am I overlooking some other path? Should i stop whining and just go java? Is the OS situation not nearly as bad on .NET as i make it out to be and I can just continue as is? Things that keep me up at night.


Tuesday, April 17, 2007

Reading a resource in eclipse or jar

Disclaimer: I've been trying to store some text resources in my jar so that I can easily package uncompiled content with my executable. This course of action seemed like the natural thing to do. However, what I had to do to actually access this file and allow for different environments, like running from eclipse vs. executing the jar, makes me wonder if I'm missing some essential facilities. I.e. can eclipse build the jar and execute it instead of running its class files, so that you can test jar-stored executions as part of eclipse's normal debug and execution? And is there some facility that lets you just say InputStream res = App.GetResourceStream()? Well, in the meantime here's what I cobbled together to get this working, since my googling didn't come up with a single source that put it all together.

First thing, we need to determine where we are executing, both because it's the base for finding our resource and because it tells us whether we're dealing with a directory hierarchy or a jar file:

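ProtectionDomain protectionDomain = this.getClass().getProtectionDomain();
CodeSource codeSource = protectionDomain.getCodeSource();
URL location = codeSource.getLocation();
// toURI() decodes escaped characters such as %20 in the path;
// it throws URISyntaxException, which the enclosing method must handle
File f = new File(location.toURI());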

Now we get an InputStream, depending on what environment we're in:

String resourcePath = "resources/xsl/register.xsl";
InputStream resourceStream = null;
if (f.isFile()) {
    // it's a file, so we assume it's a jar and look for our
    // resource as a JarEntry
    JarFile jar = new JarFile(f);
    JarEntry xslEntry = jar.getJarEntry(resourcePath);
    resourceStream = jar.getInputStream(xslEntry);
} else {
    // it's a directory, so we just append our relative resource
    // path
    resourceStream = new FileInputStream(f.getAbsolutePath() + "/"
            + resourcePath);
}

Now we have an InputStream and can ingest the file any way we like.
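For what it's worth, the JDK does have a facility along the lines I was asking about: Class.getResourceAsStream resolves a path against the classpath, so the same call works whether the classes are loaded from a directory or from a jar. A minimal sketch, assuming the resources folder ends up at the root of the classpath (for example because it sits inside an eclipse source folder); ResourceLoader and its method names are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ResourceLoader {

    // Looks the resource up on the classpath of the given class. A leading
    // "/" makes the path relative to the classpath root instead of the
    // class's own package. Returns null when the resource is not found.
    static InputStream open(Class<?> anchor, String absolutePath) {
        return anchor.getResourceAsStream(absolutePath);
    }

    // Reads the whole stream into a String, assuming UTF-8 content.
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Same path as in the post, made absolute with a leading slash.
        try (InputStream in = open(ResourceLoader.class, "/resources/xsl/register.xsl")) {
            if (in == null) {
                System.out.println("resource not found on the classpath");
            } else {
                System.out.println(readAll(in));
            }
        }
    }
}
```

The trade-off is that this only sees what is on the classpath, while the CodeSource approach above also tells you where on disk you are running from.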


Sunday, April 08, 2007

Are delegates the reason we have C#?

This may be ancient news, but I just came across an article that strongly implied that the reason .NET came about was because Sun didn't like Microsoft's addition of delegates to J++. That surely is a condensation of events, but it's certainly an interesting yarn.

I got into this whole subject because I needed an object to subscribe to the state change of another object in my java project. And I didn't want to create a tight coupling between otherwise unrelated objects. In C#, I would have created an event for the state change and have the second object subscribe to it. Done.

Alas, I'm in java and don't have delegates, i.e. no events in the way I'm used to. I remembered dealing with creating lots of anonymous inner classes during some long ago experiences with swing programming, which I always found to be less than transparent in presentation. So I figured I'd do some digging to see how delegation and events should be handled in java. I don't like seeing getFoo()/setFoo() in C# from someone not used to C# properties, and I don't want to subject someone else to my C#-ing of java code in return.

I started googling java, delegates and inner classes and came across a plethora of interesting articles, including the obligatory "C# is better than java because it has delegates" and "java kicks C#'s ass because it isn't littered with atrocious syntactic sugar like delegates when inner classes will do" variety. Of these, the most entertaining was a rather testy condemnation of delegates in J++ in the form of a white paper on Sun's site. Considering the stance Sun has taken over the years on java, i.e. binding the language, runtime and philosophy into a single indivisible unit, having MS try to subvert the language with what to many looks like a procedural programming throwback certainly could have been a significant motivator for the lawsuit that revoked Microsoft's java license. Looking at C# and the CLR, Microsoft obviously saw a lot of things it liked in the java language and the jvm. So the conclusion that .NET came about because Sun rejected delegates doesn't seem too far fetched, and it's my favorite version of the story so far.

Now, back to the problem at hand: how does one do delegation in java? The answer does appear to be inner classes functioning as callbacks. This certainly does the trick, but it's a bunch of code and interfaces to create, which in the end doesn't improve the readability of the code. As an illustration, here is the C# code and the java code I created to get the same effect. Note: I didn't need the extra information that C# events provide, i.e. the event source and event arguments, so I left them out of the java version to keep the code more concise. I also didn't do any checking if there are subscribers, etc -- read: this is an illustration not production code :)

Publisher

// C# Publisher
public class Publisher
{
  // create the event, which implicitly gives us add/delete subscribers
  public event EventHandler someAction;

  public void DoAction()
  {
    Console.WriteLine("Start action");
    //implicitly call all subscribers
    someAction(this, EventArgs.Empty);
    Console.WriteLine("End action");
  }
}
// java Publisher
public class Publisher {

    private List<EventHandler> subscribers = new ArrayList<EventHandler>();
    
    public void subscribeToAction(EventHandler notifier)
    {
        subscribers.add(notifier);
    }
    
    public void doAction()
    {
        System.out.println("Start action");
        for( EventHandler subscriber : subscribers ) {
            subscriber.handle(this);
        }
        System.out.println("End action");
    }
}

Subscriber

//C# Subscriber
public class Subscriber
{
  private string name;

  public Subscriber(string name)
  {
    this.name = name;
  }

  public void AttachToPublisher(Publisher publisher)
  {
    // subscribe to the event. This creates a closure for this particular
    // instance of Subscriber.
    publisher.someAction += new EventHandler(RespondToAction);
  }

  void RespondToAction(object sender, EventArgs e)
  {
    Console.WriteLine("Responding to action for '" + name + "'");
  }
}
// java Subscriber
public class Subscriber {

    private String name;
    
    public Subscriber(String name) {
        this.name = name;
    }
    
    public void attachToPublisher(Publisher publisher) {
        // create a new anonymous instance of the EventHandler
        // as a closure for this instance of Subscriber
        publisher.subscribeToAction(new EventHandler() {
            public void handle(Publisher publisher) {
                respondToAction();
            }
        }
        );
        
    }
    
    private void respondToAction() {
        System.out.println("Responding to action for '" + name + "'");
    }
}

EventHandler

In C# this is just built in plumbing. In java we create a simple interface that our anonymous inner class will implement:
public interface EventHandler {
    void handle(Publisher publisher);
}

Now we exercise the code:

      Publisher p = new Publisher();
      Subscriber s1 = new Subscriber("abc");
      Subscriber s2 = new Subscriber("xyz");
      s1.AttachToPublisher(p);
      s2.AttachToPublisher(p);
      p.DoAction();

The java code is virtually identical just with different casing for code style and both produce this output:

Start action
Responding to action for 'abc'
Responding to action for 'xyz'
End action
Now add lots of different events and unsubscribing of events, plus more complex EventHandlers, and the amount of code you end up writing quickly becomes significant. If there is one thing object oriented programming encourages us to do, it is to take repetitive code patterns and formulate reusable objects. Plenty of people in the java community have created delegate-like helpers that make delegation easier to read and maintain than simple inner classes, my favorite so far being Alex Winston's strongly typed approach.
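As a rough sketch of what such a reusable helper might look like (this is not Alex Winston's approach, just a hypothetical illustration built on java.util.function.Consumer, which only arrived in later versions of java), a small generic Event class can own the subscriber list and the unsubscribe logic once, so publishers don't have to repeat it. Event, subscribe, raise and EventDemo are all assumed names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical multicast helper: one instance per event a publisher exposes.
class Event<T> {
    // Copy-on-write so handlers may unsubscribe while the event is raised.
    private final List<Consumer<T>> handlers = new CopyOnWriteArrayList<>();

    // Registers a handler and returns a Runnable that removes it again.
    Runnable subscribe(Consumer<T> handler) {
        handlers.add(handler);
        return () -> handlers.remove(handler);
    }

    // Calls every current subscriber in subscription order.
    void raise(T payload) {
        for (Consumer<T> handler : handlers) {
            handler.accept(payload);
        }
    }
}

class EventDemo {
    // Subscribes two handlers, raises the event, drops one handler and
    // raises again; returns what the handlers recorded.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Event<String> someAction = new Event<>();
        Runnable unsubscribeAbc = someAction.subscribe(msg -> log.add("abc got " + msg));
        someAction.subscribe(msg -> log.add("xyz got " + msg));
        someAction.raise("first");
        unsubscribeAbc.run();
        someAction.raise("second");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Returning the unsubscribe action from subscribe is a design choice made here to avoid having to keep a reference to the handler around; it is roughly the job that -= does for C# events.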

So are delegates just syntactic sugar or a throw-back to procedural coding? For those with only a cursory understanding of delegates, they do just look like function pointers, like C, or at best, type-safe function pointers. But just like inner classes they create instance specific closures, plus they throw in functionality for handling multi-casting and handling synchronous and asynchronous invocation of the closure. I, at least, think delegates as a first-class citizen of the runtime make life easier, improve readability and do not detract from the object oriented nature of the surrounding code.


Thursday, March 29, 2007

Eclipse & referencing JARs

Being an eclipse newbie, setting up third party JARs was a bit painful to figure out. Well, not painful as such, just painful once i tried to move the project across platforms.

See, I was using the External JARs... option in the Build Path dialog. But that set up absolute paths, which is a bad idea even if you stay on the same platform. Going across platforms, C:\.. just wasn't an option. So I tried editing the .classpath by hand. That just tied the path to the root. Finally I figured out that if I added a lib directory to the project, the JARs I put inside of it were now browsable by the Add JARs option. Now everything builds happily across Mac and Windows. Joy.


Wednesday, March 28, 2007

Simple Java XML Serialization

Last time I used Java, JAXB was just being released. At least at the time, it was all about Schema definition and code generation. I loved the concept of serializing straight into XML and back, but code generation always is a bit of a nasty beast, imho. Plus, adding custom code to your class was also annoying. You basically had to extend your generated class to add anything custom to it.

When I started working with .NET, I was pleasantly surprised by their Attribute driven approach. Now the code and the XML definition were in one place and no post processing was required.

Back in Java land, I thought that with Java 5.0's addition of Annotations, maybe the meta-data code-markup method had been picked up as well. Sure enough there is JAXB 2.0 Reflection. However, if all I want to do is some config files and maybe some data serialization without the full Web Services juggernaut, that seems like a heavy handed way of doing it.

A little further digging brought me to Simple. One tiny jar and you are ready to define, serialize and deserialize your classes. Woot. Sure, my .NET bias is showing, but I really like this approach so much better than starting at the XSD/DTD and generating code. And creating code that lets you move objects from one side to the other is dead simple:

java

@Root(name="example")
public class Example {

   @Element(name="message")
   private String text;

   @Attribute(name="id")
   private int index;

   public Example() {
      super();
   }  

   public Example(String text, int index) {
      this.text = text;
      this.index = index;
   }

   public String getMessage() {
      return text;
   }

   public int getId() {
      return index;
   }
}

C#

[XmlRoot("example")]
public class Example
{

  private string text;
  private int index;

  public Example()
  {
  }

  public Example(string text, int index)
  {
    this.text = text;
    this.index = index;
  }

  [XmlElement("message")]
  public String Message
  {
    get { return text; }
    set { text = value; }
  }

  [XmlAttribute("id")]
  public int Id
  {
    get { return index; }
    set { index = value; }
  }
}

XML

<?xml version="1.0" encoding="UTF-8"?>
<example id="123">
  <message>Example message</message>
</example>

The main difference is that the Simple approach doesn't require there to be a public setter on the data, which is actually kind of strange. How does deserialization set the field? Does it use reflection? Regardless, two pieces of code, looking quite similar, allow you to serialize a class from C# to Java and back without a lot of work.
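On the reflection question: I haven't checked how Simple does it internally, so take this only as a demonstration that plain java reflection is enough to populate a private field without any setter. PlainMessage, ReflectionFieldDemo and setPrivateField are hypothetical names for the sketch:

```java
import java.lang.reflect.Field;

// A class with a private field and no setter, like the Example class above.
class PlainMessage {
    private String text;

    String getText() {
        return text;
    }
}

class ReflectionFieldDemo {

    // Writes a value into a (possibly private) field declared on the
    // target's own class, bypassing normal access checks.
    static void setPrivateField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot set field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        PlainMessage message = new PlainMessage();
        setPrivateField(message, "text", "Example message");
        System.out.println(message.getText());
    }
}
```

A serializer working this way would also need the no-argument constructor seen in the Example class, since it has to create the instance before it can fill in the fields.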


Tuesday, March 27, 2007

throws NobodyCaresException

I've been doing java again for a project for the last few weeks and it's been fun. You do get spoiled by the clean syntax C# Properties give you over the getter/setter pattern. And talk about notation differences. I'm so used to Pascal Case on everything except for fields and parameters that writing Camel Case on methods is taking some getting used to again. And don't even get me started on switching back and forth between Eclipse and VS.NET several times a day. My muscle memory is in shock. But once you get your mind in the right frame for the idiosyncrasies, everything else is so similar it's scary, especially with the Java 5.0 additions.

The difference, however, between the two platforms that I was always on the fence about is Checked Exceptions:

In Java, I love that I can declare what exceptions a method may throw and the IDE can pick this up and let you make intelligent decisions. In C#, you have to hope there are some mentions in the docs and otherwise you just find out at runtime and either have a catch all or add individual catches as you find exceptions.

But then, checked exceptions are in my mind the single most responsible party for the good old catch {} gem. And it's unavoidable. There are a number of cases where a non-type-safe argument is required, which, if it was wrong, would throw an exception. However, most of the time that argument isn't dynamic, but some constant you define. Really you know that the exception you are forced to handle will never be thrown and you put a catch {} in there, feeling justified. Soon enough the guilt is gone and you start using it when you just don't feel like dealing with that exception right now. Suddenly you're swallowing exceptions that someone up the chain might have cared about or that should have legitimately bubbled to the top. Bah.
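A typical instance of the "constant argument" case is passing a charset name as a string. One possible alternative to the empty catch, sketched here with a made-up class name (ConstantArgumentDemo), is to rethrow the supposedly impossible exception as an unchecked one, so that if it ever does happen it is not silently swallowed:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ConstantArgumentDemo {

    // URLEncoder.encode(String, String) declares a checked exception for
    // unknown charset names, even though "UTF-8" is a constant here.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every java platform is required to support UTF-8, so this
            // should be unreachable; fail loudly instead of using catch {}.
            throw new IllegalStateException("UTF-8 is required by the platform", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("a b"));
    }
}
```

It is still boilerplate, but it keeps the habit of swallowing exceptions from forming in the first place.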

Being a rabid fan of anything that makes code requirements machine discoverable, not being able to declare my exceptions feels dirty. And even for human discovery, documentation isn't sufficient, since it only covers the surface level, i.e. what I throw. Now, if I use someone else's code in there, I need to hope they were also diligent, and I have to add any exceptions they may throw and I don't handle to the ones I may throw. Yeah, that documentation is highly likely to fall behind the reality of the code.

Wouldn't it be nice if we could declare our exceptions as informational? Like throws, but you don't have to deal with it. Huh, wouldn't that be meta-data about your code, i.e. a perfect candidate for Attributes ne Annotations? I've seen rumblings here and there about this, but nobody's ever picked it up seriously, that I could find. I, for one, would love it if I could say

[Throws("InvalidOperationException,FooBarException")]
public void SomeMethod() { ... }
and have the IDE pick up on it to let me know and optionally generate warnings about things I'm not catching. I think that would be the best of both worlds.
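The java flavor of the same idea can be sketched with a custom annotation. To be clear, @Throws below is a hypothetical annotation defined right here, not something in the JDK, and no IDE I know of acts on it; the sketch only shows that the information can be attached to a method and read back by a tool via reflection. AnnotatedService and ThrowsInspector are likewise made-up names:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical informational annotation: documents exceptions a method may
// throw without forcing callers to catch them.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Throws {
    Class<? extends Throwable>[] value();
}

class AnnotatedService {
    @Throws({IllegalStateException.class, IllegalArgumentException.class})
    void someMethod() {
        // body omitted; the annotation is purely informational
    }
}

class ThrowsInspector {

    // Returns the simple names of the exceptions listed in @Throws on the
    // named method, the way a tool or IDE plugin might discover them.
    static List<String> declaredThrows(Class<?> type, String methodName) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (!method.getName().equals(methodName)) {
                continue;
            }
            Throws info = method.getAnnotation(Throws.class);
            if (info != null) {
                for (Class<? extends Throwable> thrown : info.value()) {
                    names.add(thrown.getSimpleName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(declaredThrows(AnnotatedService.class, "someMethod"));
    }
}
```

Using Class values rather than a comma separated string means the compiler at least checks that the named exception types exist.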
