A case for XML
XML gets maligned a lot. It's
enterprisey, bloated, overly complex, etc. And the abuses visited upon it, like trying to express flow control or whole DSLs in it or being proposed as some sort of panacea for all interop problems only compound this perception. But as long as you treat it as what it is,
data storage, I generally can find little justification to use something else. Not because it's the best, but because it's everywhere.
If you are your own consumer and you want a more efficient data storage, just go binary already. If you're not, then I bet your data consumers are just tickled that they have to add another parser to their repository of data ingestors. Jim Clark probably put it best when he said:
"For the payload format, XML has to be the mainstay, not because it's technically wonderful, but because of the extraordinary breadth of adoption that it has succeeded in achieving. This is where the JSON (or YAML) folks are really missing the point by proudly pointing to the technical advantages of their format: any damn fool could produce a better data format than XML."
Ok, I won't get religious on the subject, but mostly wanted to give a couple of examples, where the abilities and the adoption of XML have been a godsend for me. All this does assume you have a mature XML infrastructure. If you're dealing with XML via SAX or even are doing the parsing and writing by hand, then you are in a world of hurt, I admit. But unless it's a memory constraint there really is no reason to do that. Virtually every language has an XML DOM lib at this point.
I love namespaces
One feature a lot of people usually point to when they decry XML to me is namespaces. They can be tricky, i admit, and a lot of consumers of XML don't handle them right, causing problems. Like Blend puking on namespaces that weren't apparently hardcoded into its parser. But very simply, namespaces let you annotate an existing data format without messing with it.
<somedata droog:meta="some info about somedata">
<droog:metablock>And a whole block of extra data</droog:metablock>
</somedata>
Here's the scenario. I get data in XML and need to reference metadata for processing further down the pipeline. I could have ingested the XML and then written out my own data format. But that would mean I'd have to also do the reverse if I wanted to pass the data along or return it after some modifications and I have to define yet another data format. By creating my own namespace, I am able to annotate the existing data without affecting the source schema and I can simply strip out my namespace when passing the processed data along to someone else. Every data format should be so versatile.
Transformation, Part 1: Templating
When writing webapps, there are literally dozens of templating engines and there's constantly new ones emerging. I chose to learn XSLT some years back because I liked how
Cocoon and
AxKit handled web pages. Just create your data in XML and then transform it using XSLT according to the delivery needs. So far, nothing especially unique compared to other templating engines. Except unlike most engines, it didn't rely on some program creating the data and then invoking the templating code. XSLT works with dynamic Apps as easily as with static XML or third party XML without having.
Since those web site roots, I've had need for email templating and data transformation in .NET projects and was able to leverage the same XSLT knowledge. That means I don't have to pick up yet another tool to do a familiar task just a little differently.
What's the file format?
When I first started playing with Xaml, I was taking
Live For Speed geometry data and wanted to render it in WPF and Silverlight. Sure, I had to learn the syntax of the geometry constructs, but I didn't have to worry about figuring out the data format. I just used the more than familiar XmlDocument and was able to concentrate on geometry, not file formats.
Transformation, Part 2: Rewriting
Currently I'm working with Xaml again for a Silverlight project. My problem was that I had data visualization in Xaml format (coming out of Illustrator), as well as associated metadata (a database of context data) and I needed to attach the metadata to the geometry, along with behavior. Since the first two are output from other tools I needed a process that could be automated. One way would be to walk the Visual tree once loaded, create a parallel hierarchy of objects containing the metadata and behavior and attach their behavior to the visual tree. But i'd rather have the data do this for itself.
<Canvas x:Name="rolloverContainer_1" Width="100" Height="100">
<!-- Some geometry data -->
</Canvas>
<!-- becomes -->
<droog:RolloverContainer x:Name="rolloverContainer_1" Width="100" Height="100">
<!-- Some geometry data -->
</droog:RolloverContainer>
So I created custom controls that subclassed the geometry content containers. I then created a post-processing script that simply loaded the Xaml into the DOM and rewrote the geometry containers as the appropriate custom controls using object naming as an identifying convention. Now the wiring happens automatically at load, courtesy of Silverlight. Again, no special parser required, just using the same XmlDocument class I've used for years.
And finally, Serialization
I use XML serialization for over the wire transfers as well as data and configuration storage. In all cases, it lets me simply define my DTOs and use them as part of my object hierarchy without ever having to worry about persistence. I just save my object graph by serializing it to XML and rebuild the graph by deserializing the stream again.
I admit that this last bit does depend on some language dependent plumbing that's not all that standard. In .NET, it's built in and let's me mark in my objects with attributes. In Java, I use Simple for the same effect. Without this attribute driven mark up, I'd have to walk the DOM and build m objects by hand, which would be painful.
Sure, for data, binary serialization would be cheaper and more compact, but that misses the other benefits I get for free. The data can be ingested and produced by a wide variety of other platforms, I can manually edit it, or easily build tools for editing and generation, without any specialized coding.
For my Silverlight project, I'm currently using JSON as my serialization layer between client and server, since there currently is no XmlSerializer or even XmlDocument in Silverlight 1.1. It, too, was painless to generate and ingest and, admittedly, much more compact. But I then I added this bit to my DTO:
List<IContentContainer> Containers = new List<IContentContainer>();
It serialized just fine, but then on the other end it complained about there not being a no-argument constructor for IContentContainer. Ho Hum. Easily enough worked around for now, but I will be switching back to XML for this once Silverlight 2.0 fleshes out the framework. Worst case, I'll have to build XmlSerializerLitem, or something like that, myself.
All in all, XML has allowed me to do a lot of data related work without having to constantly worry about yet another file format, or parser. It's really not about being the best format, but about it virtually being everywhere and being supported with a mature toolchain across the vast majority of programming environment and that pays a lot of dividents, imho.
Labels: json, xaml, xml, xslt
Playing with Xaml and LFS
I thought it was about time to see how hard it would be to get
Live For Speed Track Maps into Xaml for use on the desktop or
Silverlight. I initially started with the LFS's SMX (simple mesh format) files, but taking triangle data and turning it into a manageable number of polygons for vector representation is proving to be an interesting exercise. I might get back to it later, but for now, have found a more productive approach.
LFS provides a file format called PTH which basically contains the drive line in world coordinates as a set of nodes for each track configuration. But here's cool part: Every node also includes orientation and the left and right distances from the node to the track edges and drivable area edges. Given the left and right edges from two nodes describes a rectangle. I decided to render this out as a series of polygons in Xaml and the results was a very usable representation of the track.
But this still generates a lot of polygons for such a relatively simple shape. I initially went down the path of just skipping over nodes and doing a reduction of detail that way. This approach, however, quickly deteriorated the detail significantly. My next approach was to take the edges of multiple node polygons and simply draw a single polygon encompassing them all. The file size didn't drop as significantly, but the detail didn't drop at all. Below is the comparison of the two polygon reduction techniques:
|
Skipped Node Polygon Reduction |
Edge following Polygon Reduction |
2 nodes/polygon
At this stage the two outputs are identical. And xaml files sizes are identical at about 90KB.
|
|
|
4 nodes/polygon
You can see a little detail reduction in the skipped node version. But file size reductions are drastic. 31KB for the skipped version, 40KB for edge-following.
|
|
|
8 nodes/polygon
Now, the level of detail reduction in the skipped version is becoming noticable. Since this is a vector format it lends itself to panning and zooming, but at this point the loss of detail would become annoying when zoomed. However, file size reductions continue to out pace the edge following with 14KB and 27KB respectively.
|
|
|
16 nodes/polygon
At this stage the skipped node format is basically unusable as a representation, while the edge following still looks as crisp as the original. File size for skipped has again halved, but really is not useful. Edge following is at 21KB, clearly slowing down in file size reduction
|
|
|
I continued with the edge following polygon reduction technique up to 128 nodes per polygon. However by 64, the file wasn't getting any smaller anymore (the verbosity of Xaml was negligible compared to the number of polygon points. The best final file size was about 17KB, which is quite acceptable for a Silverlight application, if you ask me.
Next I need to hook this Xaml up to my InSim code and then I'll be able to render race progression. This'll have to be inside of a WPF app for now, until Silverlight's CLR gets Sockets and I can port LFSLib.NET to it. Right now, i'd have to do constant server callbacks which wouldn't scale or perform well.
Labels: LFS, LFSLib.NET, live for speed, silverlight, xaml
Rich Internet App Development
If you've had the misfortune of mentioning AJAX in my presence, then you've heard me rant about the crappy user experience we are all willing to accept in the name of net connectness. This really is a lamentation about the state of Rich Internet Application Frameworks and my dislike for coding in Javascript. Well, it looks like there are more choices than I'd been aware of (the choice of google search terms makes all the difference). Still not what I'd hope, but at least its getting more digestible.
Running in the Browser
Programming based on the AJAX technique has certainly done much to elevate the quality of web apps, but I still feel they are always just a pale facsimile of good old desktop apps. Even the best webapp UI is generally great "for a webapp". However lots of libraries are emerging, as are s number of widget sets, so that's definitely improving. While most toolkits let you extend them, you're always doing your server in one language and your custom UI in javascript.
What I personally have hoped for was a VM in the browser that was addressable by a number of compilers creating bytecode. Everytime I see a platform that runs a VM underneath and doesn't let you address that VM directly, I feel like a great opportunity has been missed. Oddly, MS' CLR is the best example of a VM that lets you address it in virtually any language. They certainly didn't invent the concept but they've promoted it. I think Sun did a major disservice to itself and VMs in general when they married Java the language, the Virtual Machine and the Religion into a single marketing entity. I mean who even knows that there are lots of languages that can be used to target the JVM?
Compile to Javascript
A while ago I found
a post by Brendan Eich talking about the future of the Mozilla VM and mentioned
mono and the jvm as options. Yesterday, he
posted about open web standards and I seized the opporunity to ask about bytecode addressability of JS2's VM. His answer about legal issues is likely a big reason why mono was abandoned as an option:
"there won't be a standard bytecode, on account of at least (a) too much patent encrustation and (b) overt differences in VM architectures. We might standardize binary AST syntax for ES4 in a later, smaller ECMA spec -- I'm in favor."
But as he also pointed out there is always compiling to Javascript instead of bytecode. The options he and another poster mentioned were:
Of the three I initially liked the Morfik approach the best, but doing a bit more research, they seem to be well on the path of propagating the same patent issues that Brendan Eich attributes the lack of standard VMs to. Pity.
Looking around for Javascript compiler's I noticed that this approach is also under development at MS as Script# although it hasn't yet moved up to an official MS project. Interestingly this does pit MS vs. Google once again, framed in a C# vs. Java context with ASP.NET AJAX w/ Script# and GWT. And if there's anything that's just as great for innovation as open standards, in my opinion, it's competition between giants. I look forward to seeing them try to outdo each other.
Looking forward to Rich Apps
So far, we're stuck in the browser, and even with tabs, I sure hope this isn't the future of applications. If we are to move beyond browser, what are our options?
Clearly, Adobe is leading the RIA platform wars with Flash and with Flex, SWF certainly looks more like a platform than an animation tool forced to render User Interfaces.
And
Apollo certainly looks to push Flash as a platform as well as making it a stand-alone app platform. I certainly think this is going to be the juggernaut to beat. Given my dislike for the syntax of (Java|Ecma|Action)Script, it's unlikely to be my platform of choice. And i don't see a Adobe supporting cross-language compilation and support for Eclipse or Visual Studio at the expense of their Dev suite.
I really like the concept of WPF. It's philosophy is what I want the future to look like. Mark-up and code separate, common runtime addressable in many languages on client and server, well developed communications framework. Ah, it warms my heart.
But, a) it's closed, b) it's only Windows (i'll get to WPF/E in a sec) and c) boy, is it over-architected. Now, it's at 1.0 release and if there's anything about MS releases, they seldomly get it right in 1.0. We'll see what 2.0 looks like.
WPF/E looks like a combination of simplifying WPF (is this 2.0?) and going after Adobe. And with Script# and recent admissions of some type of CLR on Mac for WPF/E, we're looking at a trojan horse to get Rich Internnet Application Development in .NET established in both the browser and the desktop across platforms. Unfortunately, "across platforms" for MS still means Windows and Mac, while Adobe's Flash 9 has demonstrated their dedication to encompass the linux sphere as well. I don't think that's going to change... I just don't see MS going to linux with the CLR and I find the likelyhood of them leveraging mono's efforts just as unlikely. I wouldn't mind being wrong.
This isn't a platform
per se, but I've seen a lot of cool tech demo's using XUL and/or the
canvas tag. Also
looking at the work that Micheal Robertson's
AJAX13 has done, I think there are the makings of a stand-alone app platform here. If your runtime requirements are "install firefox, but you don't even have to use it as your browser if you don't want to", that's a pretty small barrier for a platform that runs everywhere. Personally, I hope someone straps mono or the
recently liberated jvm to XUL and builds a platform out of it (you'd get mature WS communication for free with either), because of all the options that looks the most appealing to me personally.
There's got to be more
Considering that GWT and Script# had eluded my radar up until today, I'm sure there's even more options for RIA's out there. I just hope that people developing platforms take the multi language lessons of the legacy platforms to heart. All the successful OS's of the past offered developers many ways of getting things done. You looked at your task, picked the language that was the best fit to the task and your style of programming and you delivered your solution. VMs have shown that supporting many languages in an OS independent way is viable, so if you're building a platform now, why would you choose to mandate a language and programming model. I sure hope that the reason for not going this route isn't going to be
"because the patent system is stopping me" -- that would be the ultimate crime of a system that was supposed to foster innovation.
Labels: ajax, gwt, javascript, morfik, ria, script#, wpf, xaml, xul