I thought I had only one syntax post left before diving into posts about attempting to implement the language. But starting on a post about method slots and operators, I decided that there was something else I needed to cover in more detail first: the illustrious JSON object.
I've alluded to JSON objects more than a couple of times in previous posts, generally as an argument for lambda calls. Since everything in Promise is a class, JSON objects are a bit of an anomaly. Simply, they are the serialization format of Promise, i.e. any object can be reduced to a JSON graph. As such, the format exists outside the normal class regime. It is also closer to BSON, as it will retain type information unless serialized to text, and can be serialized on the wire either as JSON or BSON. So it looks like JavaScript Object Notation (JSON), but it's really Promise object notation. For simplicity, I'm going to keep calling it JSON tho.
Creating a JSON object is the same as in JavaScript:
var a = {};
var b = [];
var c = { foo: ["bar","baz"] };
var d = { song: Song{name: "ShopVac"} };
The notation accepts hash and array initializers and their nesting, as well as object instances as values. Field names are always strings.
The last example shows that you can put Promise objects into a JSON graph, and the object initializer itself takes another JSON object. I explained in "Promise: IoC Type/Class mapping" that passing a JSON object to the Type allows the mapped class constructor to intercept it, but in the default case, it's simply a mapping of fields:
class Song {
_name;
Artist _artist;
Artist:(){ _artist; }
...
}
class Artist {
_name;
...
}
var song = Song{ name: "The Future Soon", artist: { name: "Jonathan Coulton" } };
// get the Artist object
var artist = song.Artist;
//serialize the object graph back to JSON
print song.Serialize();
// => { name: "The Future Soon", artist: { name: "Jonathan Coulton" } };
Lacking any intercepts and maps, the initializer will assign the value name to _name, and when it maps artist to _artist, the typed nature of _artist invokes its initializer with the JSON object from the artist field. Once .Serialize() is called, the graph is reduced to the most basic types possible, i.e. the Artist object is serialized as well. Since the serialization format is meant for passing DTOs, not Types, the type information (beyond fundamental types like String, Num, etc.) is lost at this stage. Circular references in the graph would be dropped–any object already encountered in serialization causes the field to be omitted. It is omitted rather than set to nil so that its use as an initializer does not set the slot to nil, but allows the default initializer to execute.
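Since Promise has no implementation yet, here is a rough Python sketch of the behavior described above, not actual Promise semantics. The `Song` and `Artist` classes mirror the example; the `serialize` function and the `_OMIT` marker are names made up for this illustration. It shows the default field mapping, the typed slot invoking its own initializer, and an already-encountered object being omitted from the output rather than set to nil.

```python
_OMIT = object()  # hypothetical marker meaning "leave this field out"


class Artist:
    def __init__(self, data=None):
        data = data or {}
        self._name = data.get("name")


class Song:
    def __init__(self, data=None):
        data = data or {}
        # Default mapping: field "name" lands in slot _name.
        self._name = data.get("name")
        # Typed slot: the nested JSON object is handed to Artist's initializer.
        self._artist = Artist(data["artist"]) if "artist" in data else None


def serialize(value, _seen=None):
    """Reduce an object graph to plain dicts, lists and fundamental values."""
    seen = set() if _seen is None else _seen
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if id(value) in seen:
        # Already encountered: omit the field instead of emitting nil.
        return _OMIT
    seen.add(id(value))
    if isinstance(value, list):
        items = [serialize(item, seen) for item in value]
        return [item for item in items if item is not _OMIT]
    if isinstance(value, dict):
        fields = value
    else:
        # Classed object: slot _name becomes field "name"; the type is lost.
        fields = {name.lstrip("_"): slot for name, slot in vars(value).items()}
    result = {}
    for key, field_value in fields.items():
        reduced = serialize(field_value, seen)
        if reduced is not _OMIT:
            result[key] = reduced
    return result


song = Song({"name": "The Future Soon", "artist": {"name": "Jonathan Coulton"}})
print(serialize(song))

# A graph that refers back to itself: the circular field is dropped.
node = {"name": "a"}
node["self"] = node
print(serialize(node))  # {'name': 'a'}
```

Omitting the field, rather than writing a nil, means the result can be fed straight back into an initializer without clobbering a slot's default.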
Above I mentioned that JSON field values are typed and showed the variable d set to have an object as the value of field song. This setting does not cause Song to be serialized. When assigning values into a JSON object, they retain their type until they are used as arguments for something that requires serialization or are manually serialized.
var d = { song: Song{name: "ShopVac"} };
// this works, since song is a Song object
d.song.Play();
var e = d.Serialize(); // { song: { name: "ShopVac" } }
// this will throw an exception
e.song.Play();
// this clones e
var f = e.Serialize();
Serialization can be called as many times as you want and acts as a clone operation for graphs lacking anything further to serialize. The clone is a lazy operation, making it very cheap. Basically, a pointer to the original JSON is returned and it is only fully cloned if either the original or the clone is modified. This means the penalty for calling .Serialize() on a fully serialized object is minimal, which makes it an ideal way to propagate data that is considered immutable.
JSON objects are fully dynamic and can be accessed and modified at will.
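One way the lazy clone could work is plain copy-on-write. The following Python sketch is only an assumption about a possible implementation; `JsonGraph`, its methods and the shared counter are hypothetical names, not part of Promise.

```python
import copy


class JsonGraph:
    """Hypothetical handle around fully serialized data (dicts, lists, scalars)."""

    def __init__(self, data, _sharers=None):
        self._data = data
        # One-element list shared by every handle that points at the same
        # data; it counts how many handles currently share it.
        self._sharers = _sharers if _sharers is not None else [1]

    def serialize(self):
        # Cheap "clone": just another handle onto the same underlying data.
        self._sharers[0] += 1
        return JsonGraph(self._data, self._sharers)

    def get(self, key):
        # Note: a real implementation would also have to guard nested values.
        return self._data[key]

    def set(self, key, value):
        # Copy on write: the first handle to modify detaches with its own copy.
        if self._sharers[0] > 1:
            self._sharers[0] -= 1
            self._data = copy.deepcopy(self._data)
            self._sharers = [1]
        self._data[key] = value


e = JsonGraph({"song": {"name": "ShopVac"}})
f = e.serialize()             # no copying happens here
print(f._data is e._data)     # True: both handles share one graph
f.set("rating", 5)            # the full clone happens only now
print(f._data is e._data)     # False
print("rating" in e._data)    # False: the original is untouched
```

Whichever side writes first pays for the copy; the other side keeps the existing data and, once it is the only handle left, writes in place.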
var x = {foo: "bar"};
// access by dot notation
print x.foo; // => "bar"
// access by name (for programmatic access or access of non-symbolic names)
print x["foo"]; // => "bar"
x.foo = ["bar","baz"]; // {foo: ["bar","baz"]}
x.bar = "baz"; // {bar: "baz", foo: ["bar", "baz"]};
// delete a field via self-reference
x.foo.Delete();
// or by name
x["foo"].Delete();
The reason JSON objects exist as entities distinct from class-defined objects is to provide a clear separation between objects with behavior and data-only objects. Attaching functionality to data should be an explicit conversion from a data object to a classed object, rather than mixing the two, JavaScript style.
Of course, this dichotomy could theoretically be abused with something like this:
var x = {};
x.foo = (x) { x*x; };
print x.foo(3); // => 9
I am considering disallowing the assignment of lambdas as field values, since they cannot be serialized, thus voiding this approach. I'll punt on the decision until implementation. If lambdas end up as first-class objects, the above would have to be explicitly prohibited, which may lead me to leave it in. If, however, I'd have to manually support this use case, I'm going to leave it out for sure.
JSON objects exist as a convenient data format internally and for getting data in and out of Promise. The ubiquity of JSON-like syntax in most dynamic languages and its easy mapping to object graphs make it the ideal choice for Promise to foster simplicity and interop.
This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.
XML gets maligned a lot. It’s enterprisey, bloated, overly complex, etc. And the abuses visited upon it, like trying to express flow control or whole DSLs in it or being proposed as some sort of panacea for all interop problems only compound this perception. But as long as you treat it as what it is, data storage, I generally can find little justification to use something else. Not because it’s the best, but because it’s everywhere.
If you are your own consumer and you want a more efficient data storage, just go binary already. If you’re not, then I bet your data consumers are just tickled that they have to add another parser to their repository of data ingestors. James Clark probably put it best when he said:
“For the payload format, XML has to be the mainstay, not because it’s technically wonderful, but because of the extraordinary breadth of adoption that it has succeeded in achieving. This is where the JSON (or YAML) folks are really missing the point by proudly pointing to the technical advantages of their format: any damn fool could produce a better data format than XML.”
Ok, I won’t get religious on the subject. I mostly wanted to give a couple of examples where the abilities and the adoption of XML have been a godsend for me. All this does assume you have a mature XML infrastructure. If you’re dealing with XML via SAX or are even doing the parsing and writing by hand, then you are in a world of hurt, I admit. But unless there’s a memory constraint, there really is no reason to do that. Virtually every language has an XML DOM lib at this point.
One feature a lot of people point to when they decry XML to me is namespaces. They can be tricky, I admit, and a lot of consumers of XML don’t handle them right, causing problems. Like Blend puking on namespaces that apparently weren’t hardcoded into its parser. But very simply, namespaces let you annotate an existing data format without messing with it.
<somedata droog:meta="some info about somedata">
  <droog:metablock>And a whole block of extra data</droog:metablock>
</somedata>
Here’s the scenario. I get data in XML and need to reference metadata for processing further down the pipeline. I could have ingested the XML and then written out my own data format. But that would mean I’d have to also do the reverse if I wanted to pass the data along or return it after some modifications and I have to define yet another data format. By creating my own namespace, I am able to annotate the existing data without affecting the source schema and I can simply strip out my namespace when passing the processed data along to someone else. Every data format should be so versatile.
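To make the annotate-then-strip workflow concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The namespace URI `urn:example:droog`, the helper names and the sample payload are assumptions made for the example; any URI you control would do.

```python
import xml.etree.ElementTree as ET

DROOG_NS = "urn:example:droog"  # hypothetical namespace URI
ET.register_namespace("droog", DROOG_NS)


def annotate(root, info, block_text):
    """Attach metadata in our own namespace without touching the source schema."""
    root.set(f"{{{DROOG_NS}}}meta", info)
    block = ET.SubElement(root, f"{{{DROOG_NS}}}metablock")
    block.text = block_text


def strip_annotations(root):
    """Remove every attribute and element that lives in our namespace."""
    prefix = f"{{{DROOG_NS}}}"
    # Snapshot the elements first so the tree can be modified safely.
    for element in list(root.iter()):
        for name in [n for n in element.attrib if n.startswith(prefix)]:
            del element.attrib[name]
        for child in list(element):
            if isinstance(child.tag, str) and child.tag.startswith(prefix):
                element.remove(child)


root = ET.fromstring("<somedata><item>payload</item></somedata>")
annotate(root, "some info about somedata", "And a whole block of extra data")
annotated = ET.tostring(root, encoding="unicode")
print(annotated)

strip_annotations(root)
stripped = ET.tostring(root, encoding="unicode")
print(stripped)  # <somedata><item>payload</item></somedata>
```

The source elements never change, so the stripped document is exactly what a consumer unaware of the extra namespace expects.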
When writing webapps, there are literally dozens of templating engines and new ones are constantly emerging. I chose to learn XSLT some years back because I liked how Cocoon and AxKit handled web pages. Just create your data in XML and then transform it using XSLT according to the delivery needs. So far, nothing especially unique compared to other templating engines. Except, unlike most engines, it didn’t rely on some program creating the data and then invoking the templating code. XSLT works with dynamic apps as easily as with static XML or third-party XML.
Since those web site roots, I’ve had need for email templating and data transformation in .NET projects and was able to leverage the same XSLT knowledge. That means I don’t have to pick up yet another tool to do a familiar task just a little differently.
When I first started playing with Xaml, I was taking Live For Speed geometry data and wanted to render it in WPF and Silverlight. Sure, I had to learn the syntax of the geometry constructs, but I didn’t have to worry about figuring out the data format. I just used the more than familiar XmlDocument and was able to concentrate on geometry, not file formats.
Currently I’m working with Xaml again for a Silverlight project. My problem was that I had data visualization in Xaml format (coming out of Illustrator), as well as associated metadata (a database of context data), and I needed to attach the metadata to the geometry, along with behavior. Since the first two are output from other tools, I needed a process that could be automated. One way would be to walk the Visual tree once loaded, create a parallel hierarchy of objects containing the metadata and behavior, and attach their behavior to the visual tree. But I’d rather have the data do this for itself.
<Canvas x:Name="rolloverContainer_1" Width="100" Height="100">
  <!-- Some geometry data -->
</Canvas>
<!-- becomes -->
<droog:RolloverContainer x:Name="rolloverContainer_1" Width="100" Height="100">
  <!-- Some geometry data -->
</droog:RolloverContainer>
So I created custom controls that subclassed the geometry content containers. I then created a post-processing script that simply loaded the Xaml into the DOM and rewrote the geometry containers as the appropriate custom controls using object naming as an identifying convention. Now the wiring happens automatically at load, courtesy of Silverlight. Again, no special parser required, just using the same XmlDocument class I’ve used for years.
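The original script used .NET's XmlDocument; the sketch below shows the same idea with Python's `xml.etree.ElementTree` instead. The `clr-namespace:Droog.Controls` URI, the `CONTROLS` table and the `rewrite_containers` helper are hypothetical, and the presentation namespace shown is an assumption about the input Xaml.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces; adjust to whatever the exported Xaml actually declares.
XAML_NS = "http://schemas.microsoft.com/winfx/2006/xaml/presentation"
X_NS = "http://schemas.microsoft.com/winfx/2006/xaml"
DROOG_NS = "clr-namespace:Droog.Controls"  # hypothetical custom control namespace

ET.register_namespace("", XAML_NS)
ET.register_namespace("x", X_NS)
ET.register_namespace("droog", DROOG_NS)

# Hypothetical naming convention: an x:Name prefix selects the custom control.
CONTROLS = {"rolloverContainer_": "RolloverContainer"}


def rewrite_containers(root):
    """Swap plain Canvas containers for custom controls, keyed on x:Name."""
    for element in list(root.iter(f"{{{XAML_NS}}}Canvas")):
        name = element.get(f"{{{X_NS}}}Name", "")
        for prefix, control in CONTROLS.items():
            if name.startswith(prefix):
                # Only the tag changes; attributes and children stay as-is.
                element.tag = f"{{{DROOG_NS}}}{control}"
                break
    return root


SOURCE = f"""<Canvas xmlns="{XAML_NS}" xmlns:x="{X_NS}">
  <Canvas x:Name="rolloverContainer_1" Width="100" Height="100" />
  <Canvas x:Name="background" Width="100" Height="100" />
</Canvas>"""

root = rewrite_containers(ET.fromstring(SOURCE))
print(ET.tostring(root, encoding="unicode"))
```

Because only the element name is rewritten, the geometry inside each container passes through untouched.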
I use XML serialization for over the wire transfers as well as data and configuration storage. In all cases, it lets me simply define my DTOs and use them as part of my object hierarchy without ever having to worry about persistence. I just save my object graph by serializing it to XML and rebuild the graph by deserializing the stream again.
I admit that this last bit does depend on some language-dependent plumbing that’s not all that standard. In .NET, it’s built in and lets me mark up my objects with attributes. In Java, I use Simple for the same effect. Without this attribute-driven markup, I’d have to walk the DOM and build my objects by hand, which would be painful.
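For readers outside .NET and Java, here is a small Python sketch of the same declarative idea, with dataclass field declarations standing in for serialization attributes. It is not XmlSerializer or Simple; the `Container` and `Page` DTOs and the `to_xml`/`from_xml` helpers are invented for the example and only handle strings, integers, nested DTOs and lists of DTOs.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints


@dataclass
class Container:  # hypothetical DTO
    name: str = ""
    width: int = 0


@dataclass
class Page:  # hypothetical DTO holding a list of other DTOs
    title: str = ""
    containers: List[Container] = field(default_factory=list)


def to_xml(obj, tag=None):
    """Scalars become attributes, nested DTOs and lists become child elements."""
    element = ET.Element(tag or type(obj).__name__)
    for f in fields(obj):
        value = getattr(obj, f.name)
        if isinstance(value, list):
            holder = ET.SubElement(element, f.name)
            holder.extend([to_xml(item) for item in value])
        elif is_dataclass(value):
            element.append(to_xml(value, f.name))
        else:
            element.set(f.name, str(value))
    return element


def from_xml(cls, element):
    """Rebuild the object graph, driven purely by the declared field types."""
    hints = get_type_hints(cls)
    kwargs = {}
    for f in fields(cls):
        hint = hints[f.name]
        if get_origin(hint) is list:
            (item_cls,) = get_args(hint)
            holder = element.find(f.name)
            children = list(holder) if holder is not None else []
            kwargs[f.name] = [from_xml(item_cls, child) for child in children]
        elif is_dataclass(hint):
            child = element.find(f.name)
            if child is not None:
                kwargs[f.name] = from_xml(hint, child)
        elif f.name in element.attrib:
            kwargs[f.name] = hint(element.attrib[f.name])  # str or int only
    return cls(**kwargs)


page = Page("Home", [Container("rolloverContainer_1", 100)])
text = ET.tostring(to_xml(page), encoding="unicode")
restored = from_xml(Page, ET.fromstring(text))
print(text)
print(restored == page)  # True
```

The DTO classes carry no persistence code of their own, which is the property the attribute-driven approach buys you.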
Sure, for data, binary serialization would be cheaper and more compact, but that misses the other benefits I get for free. The data can be ingested and produced by a wide variety of other platforms, I can manually edit it, or easily build tools for editing and generation, without any specialized coding.
For my Silverlight project, I’m currently using JSON as my serialization layer between client and server, since there currently is no XmlSerializer or even XmlDocument in Silverlight 1.1. It, too, was painless to generate and ingest and, admittedly, much more compact. But then I added this bit to my DTO:
List<IContentContainer> Containers = new List<IContentContainer>();
It serialized just fine, but then on the other end it complained about there not being a no-argument constructor for IContentContainer. Ho Hum. Easily enough worked around for now, but I will be switching back to XML for this once Silverlight 2.0 fleshes out the framework. Worst case, I’ll have to build XmlSerializerLite, or something like that, myself.
All in all, XML has allowed me to do a lot of data-related work without having to constantly worry about yet another file format or parser. It’s really not about being the best format, but about it being virtually everywhere and being supported with a mature toolchain across the vast majority of programming environments, and that pays a lot of dividends, imho.