

Implementing "exports" in C#

The other day I was musing about improving readability by liberating verbs from the containing objects that own them. Similar things are accomplished with mix-ins or traits in other languages, but I wanted to go a step further and allow very context-specific mapping, rather than just mixing in ambiguously named methods from other objects. Since the exports construct looked a lot like a map of expressions, I decided to see what replicating the behavior with existing C# would look like.

To review, this is what I want (in C#-like pseudo code + exports):

class PageWorkflow {
  UserService _userService exports { FindById => FindUserById };
  PageService _pageService exports {
    FindById        => FindPageById,
    UpdatePage(p,c) => Update(this p,c)
  };
  AuthService _authService exports {
    AuthorizeUserForPage(u,p,Permissions.Write) => UserCanUpdatePage(u,p)
  };

  UpdatePage(userid, pageid, content) {
    var user = FindUserById(userid);
    var page = FindPageById(pageid);
    if(UserCanUpdatePage(user,page)) {
      page.Update(content);
    } else {
      throw;
    }
  }
}

And this is the C# implementation of the above:

public class PageWorkflow {

    private readonly Func<int, User> FindUserById;
    private readonly Func<int, Page> FindPageById;
    private readonly Action<Page, string> Update;
    private readonly Func<User, Page, bool> UserCanUpdatePage;

    public PageWorkflow(IUserService userService, IPageService pageService, IAuthService authService) {
        FindUserById = (id) => userService.FindById(id);
        FindPageById = (id) => pageService.FindById(id);
        Update = (page, content) => pageService.UpdatePage(page, content);
        UserCanUpdatePage = (user, page) => authService.AuthorizeUserForPage(user, page, Permissions.Write);
    }

    public void UpdatePage(int userid, int pageid, string content) {
        var user = FindUserById(userid);
        var page = FindPageById(pageid);
        if(UserCanUpdatePage(user, page)) {
            Update(page, content);
        } else {
            throw new Exception();
        }
    }
}
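
For completeness, here's roughly what the service interfaces used by that constructor would have to look like; I'm inferring the member signatures from the calls above, they aren't definitions from an actual codebase:

public interface IUserService {
    User FindById(int id);
}

public interface IPageService {
    Page FindById(int id);
    void UpdatePage(Page page, string content);
}

public interface IAuthService {
    bool AuthorizeUserForPage(User user, Page page, Permissions permission);
}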

As I mentioned, it's all possible, short of the context-sensitive extension method on Page. Lacking extension methods, I was going to name the imported method UpdatePage, but since it would be a field, it conflicts with the UpdatePage workflow method despite the two functionally having different signatures.

All in all, the public workflow UpdatePage is pretty close to what I had wanted, but the explicit type declaration of each export makes it boilerplate that is likely not worth the trouble of writing, and, no, I won't even consider code generation.

This exercise, along with every other language feature I've dreamt up for Promise, does illustrate one thing quite clearly to me: my ideal language should provide programmatic access to its parser so that the syntax of the language can be extended by libraries. Internal DSLs are a nice start in most languages, but often they fall short of reducing the default boilerplate and just create new boilerplate. Sure, designing the language to be more succinct is desirable, but if anything, that only covers what the language creators could imagine. Being able to tweak a language into a DSL for the task at hand, a la MPS, seems a lot more flexible.

Is this added flexibility worth the loss of a common set of constructs shared by all programmers who know language X? It certainly could be abused to the point of incomprehensibility, but I would suggest that even if you know language X, joining any team working on a project of sufficient complexity adds its own wealth of implicit patterns and constructs. And worse than with a language whose compiler has been extended, those constructs are communicated via comments and documentation that are not part of language X, i.e. the compiler and IDE lack the ability to aid someone learning them.

Considering this, I really need to do a survey of languages that already offer this capability, as well as take a closer look at MPS and Nemerle, to see if the language I want is just a few parser rules away from an existing meta-programming language.

Of Workflows, Data and Services

This is yet another in my series of posts musing about what my ideal language would look like. This one is about readability.

Most code I write these days seems to utilize three kinds of classes: Data, Service and Workflow classes.

Data Classes

These are generally POCO object hierarchies with fields/accessors but minimal logic for manipulating that data. They should not have any dependencies. If an operation on a data object has a dependency, it's really a service for that data. Data objects don't get mocked/stubbed/faked, since we can just create and populate them.
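
To make that concrete, here is a minimal sketch in the same C#-like pseudo syntax I use later in this post; the class and field names are purely illustrative:

class Page {
  Id;
  Title;
  Content;
}

class User {
  Id;
  Name;
}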

Service Classes

These are really containers for verbs. The verbs could have been methods on the calling object, but by pulling them into these containers we enable a number of desirable capabilities:

  • re-use -- different workflows can use the same logic without creating their own copy
  • testing -- by faking the service we get greater control over testing different responses from the service
  • dependency abstraction -- there might be a number of other bits of logic that have to be invoked to do the verb's work, but they aren't a concern of the workflow
  • organization -- related verbs are collected in one place

Workflow Classes

These end up being classes, primarily because in most OO languages everything's a class, but really workflow classes are organizational constructs used to collect related workflows as procedural execution environments. They can be set up with prerequisites and promote code re-use via shared private members for common sub-tasks of the workflow. They are also responsible for the conditional and branching logic that does the actual work.

Actions (requests from users, triggered tasks, etc.) start at some entry point method on a workflow object, such as a REST endpoint, and manipulate data represented by Data objects using Services; the results trigger paths defined by the workflow.

Same construct, radically different purposes

Let's look at how this works out for a fictional content management scenario. I'm using a C#-like pseudo syntax to avoid unnecessary noise (ironic, since this post is all about readability):

class PageWorkflow {
  ...
  UpdatePage(userid, pageid, content) {
    var user = _userService.FindById(userid);
    var page = _pageService.FindById(pageid);
    if(_authService.AuthorizeUserForPage(user,page,Permissions.Write)) {
      _pageService.UpdatePage(page,content);
    } else {
      throw;
    }
  }
}

UpdatePage is part of PageWorkflow, i.e. a workflow in our workflow class. It is configured with _userService, _pageService and _authService as our service classes. Finally, user and page are instances of our data classes. Nice for maintainability and separation of concerns, but awkward from a readability perspective. It would be much more readable with syntax like this:

class PageWorkflow {
  ...
  UpdatePage(userid, pageid, content) {
    var user = FindUserById(userid);
    var page = FindPageById(pageid);
    if(UserCanUpdatePage(user,page)) {
      page.Update(content);
    } else {
      throw;
    }
  }
}

Much more like we think of the flow. Of course this could easily be done by creating those methods on PageWorkflow, but that's the first step toward building a god object, and don't even get me started on putting Update on the Page data object.

Importing verbs

So let's assume that this separation of purposes is desirable -- I'm sure there'll be plenty of people who will disagree with that premise, but the premise isn't the topic here. What we really want to do here is alias or import the functionality into our execution context. Something like this:

class PageWorkflow {
  UserService _userService exports { FindById => FindUserById };
  PageService _pageService exports {
    FindById        => FindPageById,
    UpdatePage(p,c) => Update(this p,c)
  };
  AuthService _authService exports {
    AuthorizeUserForPage(u,p,Permissions.Write) => UserCanUpdatePage(u,p)
  };
  ...
}

Do not confuse this with VB's or javascript's with keywords. Both import the entirety of the referenced object into the current scope. The much-maligned javascript version does this by importing it into the global namespace, which, given the dynamic nature of those objects, makes the variable use completely ambiguous. While VB kept scope ambiguity in check by forcing a . (dot) preceding the imported object's members, it is a shorthand that is only questionably more readable.

The above construct is closer to the @EXPORT syntax of the perl Exporter module, except that instead of exporting functions, exports exposes methods of an instance as methods on the current context. It also extends the export concept in three ways:

Aliasing

Instead of just blindly importing a method from a service class, the exports syntax allows for aliasing. This is useful because an imported method's name likely relied on the owning class for context and could collide with other imported methods, e.g. FindById exists on both PageService and UserService.

Argument rewriting

As the method name is rewritten, the argument order may no longer be appropriate, or we may want to change the argument modifiers, such as turning a method into an extension method.

UpdatePage(p,c) => Update(this p,c)

The above syntax captures the arguments into p and c, aliases the method into the current class's context and turns it into a method attached to p, i.e. the page, so that we can call page.Update(content).

Currying

But why stop at just changing the argument order and modifiers? We're basically defining expressions that translate one call into another, so why shouldn't we be able to make every argument an expression itself?

AuthorizeUserForPage(u,p,Permissions.Write) => UserCanUpdatePage(u,p)

This syntax curries the Permissions.Write argument so that we can define our aliased entry point without the last argument and instead name it to convey the write permission implicitly.

Writing workflows more like we think

Great, some new syntactic sugar. Why bother? Well, most language constructs are some level of syntactic sugar over the raw capabilities of the machine to let us express our intent more clearly. Generally syntactic sugar ought to meet two tests: make code easier to read and more compact to write.

The whole of the import mechanism could easily be accomplished (except maybe for the extension method rewrite) by creating those methods on PageWorkflow and calling the appropriate service members from there. The downside to this approach is that the methods are not differentiated from other methods in the body of PageWorkflow and are therefore not easily recognizable as aliasing constructs. In addition, the setup as wrapper methods is syntactically a lot heavier.
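
To make the comparison concrete, this is roughly what that heavier wrapper-method alternative would look like, sketched in the same pseudo syntax as the earlier examples:

class PageWorkflow {
  ...
  FindUserById(userid) { return _userService.FindById(userid); }
  FindPageById(pageid) { return _pageService.FindById(pageid); }
  UserCanUpdatePage(user, page) {
    return _authService.AuthorizeUserForPage(user, page, Permissions.Write);
  }
  ...
}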

The exports mechanism allows code to be crafted more closely to how we would talk about accomplishing the task, without compromising the design of the individual pieces or tying their naming and syntax to one particular workflow. It is localized to the definition of the service classes and provides a more concise syntax. In this way it aids the readability as well as the authoring of a common task.

But what about your Promise?

See what I did there? Err... yeah, I know it's horrible. I apologize.

I did want to post an update about Promise, since I've gone radio-silent after finishing up my series about features and syntax. I've started a deep dive into the DLR, but mostly got sidetracked with learning antlr since I don't want to build the AST by hand for testing. However, coming up with a grammar for Promise is the real stumbling block. So that's where I'm currently at, spending a couple of hours here and there playing with antlr and the grammar.

In the meantime, I've been doing more ruby coding over the last couple of weeks and even dove back into perl for some stuff, and the one thing I am more sure of is that I find dynamically typed code incredibly tedious to work with. The lack of static analysis and even simple contracts turns dynamic programming into a task of memorization, terse syntax and text search/replace. I'm not saying that the static type systems of yore are not equally painful with the hoops you have to jump through to explain contracts with any flexibility to the compiler, but given the choice I tend towards that end. This experience just solidifies my belief that a type system like Promise's, i.e. types describe contracts, but classes aren't typed other than by their ability to quack appropriately, would be so much more pleasant.

Promise: Method slots and operators

Before getting into method slots, here's a quick review of the Promise lambda grammar:

lambda: [<signature>] <expression>;

signature: (<arg1>, ... <argN>[|<return-type>])

arg: [<type>] <argName>[=<init-expression>]

expression: <statement> | { <statement1>; ... <statementN>; }

A lambda can be called with positional arguments either with the parentheses-comma convention ( foo(x,y) ) or the space-separated convention ( foo x y ), or with a JSON object as argument ( foo{ bar: x, baz: y} ).

Method Overload (revised)

When I decided to use slots that you assign lambdas to as methods, I thought I'd be clever and make those slots polymorphic to get around shortcomings I perceived in the javascript model of just attaching functions to named fields. After listening to Rob Pike talk about Go at OSCON, I decided this bit of cleverness did not serve a useful purpose. In Go there are no overloads, because a different signature denotes different behavior and the method name should reflect that difference. Besides, even if you want overload-type behavior in Promise, you can get it via the JSON calling convention:

class Index {
  Search:(|SearchResult) {
     foreach(var keyvaluepair in $_) {
       // handle undeclared named parameters
     }
     ...
  };
}

Basically the lambda signature is used to declare an explicit call contract, but using a JSON object argument, undeclared parameters can just as easily be passed in.
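
For example, a caller could hand Search named parameters that the signature never declared; the field names here are made up for illustration:

Index.Search{ query: "foo", type: "documents", limit: 10 };

// $_ => { query: "foo", type: "documents", limit: 10 }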

If a method is called with positional arguments instead of a JSON object, the default JSON object will contain a field called args with an array value:

class Index {
  Search: {
    ...
  };
}

Index.Search('foo','documents',10);

// $_ => { args: ['foo','documents',10] }

The above example shows a method assigned a lambda without any signature, i.e. it accepts any input and returns an untyped object. Receiving $_.args is not contingent on that missing signature; it will always be populated, regardless of the lambda signature.

Wildcard Method

A class can also contain a wildcard method to catch all method calls that don't have an assigned slot.

class Index {
  *: {
    var (searchType) = $_._methodname./^Find_(.*)$/;
    if(searchType.IsNil) {
       throw new MethodMissingException();
    }
    ...
  };
}

The wildcard method is a slot named *. Retrieving the call arguments is the same as with any other method without a declared signature, i.e. $_ is used. In addition, the method name used in the call is stuffed into $_ as the field _methodname.

The above example shows a method that accepts any call that starts with Find_ and takes the remainder of the name as the document type to find, such as Find_Images, Find_Pages, etc. This is done using the built-in regex syntax, i.e. you can use ./<regex>/ and ./<regex>/<substitution>/ on any string (or the string an object converts to), similar to perl's m// and s///. Like perl, the match returns a list of captures, so using var with a list of fields, in this case one field called searchType, receives the captures if there is a match.
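
As a standalone sketch of that capture behavior (using a literal string instead of $_._methodname, and assuming the match semantics just described):

var (searchType) = "Find_Images"./^Find_(.*)$/;
// searchType => "Images"

var (other) = "Search"./^Find_(.*)$/;
// no match, so other.IsNil is true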

When a method is called that cannot be found on the Type, it throws a MethodMissingException. A wildcard method is simply a hook that catches that exception. By throwing it ourselves, our wildcard reverts to the default behavior for any method that doesn't match the desired pattern. This also gives parent classes or mix-ins the opportunity to fire their own wildcard methods.

Wildcard methods can only be declared in classes and mix-ins, not on Types. Types are supposed to be concrete contracts. The existence of a wildcard does mean that the class can satisfy any Type contract and can be used to dynamically implement type contracts without having to declare each method (think mocks).

Operators

Operators are really just methods called with the whitespace list syntax

var x = 3;
var y = x + 5;  // => 8
var z = x.+(5); // => 8

// most operators implement polish notation as appropriate
var v = x.+(5,6); // => 14

Operators are just methods, which means you can assign them yourselves as well

class Query {
 List<Query> _compound;
 op+: {
    var q = Query();
    q._compound.AddRange(_compound);
    q._compound.AddRange($_.args);
    q;
  };
}

The only difference between a normal method slot and an operator slot is that the operator slot has the op prefix for disambiguation.

And now for the hard part

That concludes the overview of the things I think make Promise unique. There's certainly tons more to define for a functioning language, but most of that is going to be very much common syntax. So now it's time to buckle down and dig into antlr and the DLR to see what it will take to get some semblance of Promise functioning.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: Object notation and serialization

I thought I had only one syntax post left before diving into posts about attempting to implement the language. But starting on a post about method slots and operators, I decided that there was something else I needed to cover in more detail first: the illustrious JSON object.

I've alluded to JSON objects more than a couple of times in previous posts, generally as an argument for lambda calls. Since everything in Promise is a class, JSON objects are a bit of an anomaly. Simply put, they are the serialization format of Promise, i.e. any object can be reduced to a JSON graph. As such they exist outside the normal class regime. The format is also closer to BSON, as it retains type information unless serialized to text, and can be serialized on the wire either as JSON or BSON. So really it looks like javascript object notation (JSON) but it's really Promise object notation. For simplicity, I'm going to keep calling it JSON tho.

Initialization

Creating a JSON object is the same as in javascript:

var a = {};
var b = [];
var c = { foo: ["bar","baz"] };
var d = { song: Song{name: "ShopVac"} };

The notation accepts hash and array initializers and their nesting, as well as object instances as values. Field names are always strings.

Serialization

The last example shows that you can put Promise objects into a JSON graph, and the object initializer itself takes another JSON object. I explained in "Promise: IoC Type/Class mapping" that passing a JSON object to the Type allows the mapped class constructor to intercept it, but in the default case, it's simply a mapping of fields:

class Song {
  _name;
  Artist _artist;

  Artist:(){ _artist; }
  ...
}

class Artist {
  _name;
  ...
}

var song = Song{ name: "The Future Soon", artist: { name: "Johnathan Coulton" } };

// get the Artist object
var artist = song.Artist;

//serialize the object graph back to JSON
print song.Serialize();
// => { name: "The Future Soon", artist: { name: "Johnathan Coulton" } };

Lacking any intercepts and maps, the initializer will assign the value of name to _name, and when it maps artist to _artist, the typed nature of _artist invokes its initializer with the JSON object from the artist field. Once .Serialize() is called, the graph is reduced to the most basic types possible, i.e. the Artist object is serialized as well. Since the serialization format is meant for passing DTOs, not Types, the type information (beyond fundamental types like String, Num, etc.) is lost at this stage. Circular references in the graph would be dropped--any object already encountered in serialization causes the field to be omitted. It is omitted rather than set to nil so that its use as an initializer does not set the slot to nil, but allows the default initializer to execute.

Above I mentioned that JSON field values are typed and showed the variable d set to have an object as the value of field song. This setting does not cause Song to be serialized. When assigning values into a JSON object, they retain their type until they are used as arguments for something that requires serialization or are manually serialized.

var d = { song: Song{name: "ShopVac"} };

// this works, since song is a Song object
d.song.Play(); 

var e = d.Serialize(); // { song: { name: "ShopVac" } }

// this will throw an exception
e.song.Play();

// this clones e
var f = e.Serialize();

Serialization can be called as many times as you want and acts as a clone operation for graphs that have nothing further to serialize. The clone is a lazy operation, making it very cheap: basically a pointer to the original JSON is returned, and it is only fully cloned if either the original or the clone is modified. This means the penalty for calling .Serialize() on a fully serialized object is minimal, and it is an ideal way to propagate data that is considered immutable.
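
Continuing the example above, that copy-on-write behavior means modifying the clone f leaves e untouched (a sketch, assuming the lazy-clone semantics just described):

f.song.name = "Shop Vac Remix"; // this write triggers the actual copy
print e.song.name;              // => "ShopVac", the original is unaffected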

Access and modification

JSON objects are fully dynamic and can be accessed and modified at will.

var x = {foo: "bar"};

// access by dot notation
print x.foo; // => "bar"

// access by name (for programmatic access or access of non-symbolic names)
print x["foo"]; // => "bar"

x.foo = ["bar","baz"]; // {foo: ["bar","baz"]}
x.bar = "baz"; // {bar: "baz", foo: ["bar", "baz"]};

// delete a field via self-reference
x.foo.Delete();
// or by name
x["foo"].Delete();

The reason JSON objects exist as entities distinct from class-defined objects is to provide a clear separation between objects with behavior and data-only objects. Attaching functionality to data should be an explicit conversion from a data object to a classed object, rather than mixing the two, javascript-style.
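
For example, going from a data-only graph back to an object with behavior is an explicit resolution call rather than something the data grows on its own (a sketch based on the Song example above):

var data = { name: "ShopVac" };     // plain data, no behavior
var song = Song{ name: data.name }; // explicit conversion to a classed object
song.Play();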

Of course, this dichotomy could theoretically be abused with something like this:

var x = {};
x.foo = (x) { x*x; };
print x.foo(3); // => 9

I am considering disallowing the assignment of lambdas as field values, since they cannot be serialized, thus voiding this approach. I'll punt on the decision until implementation. If lambdas end up as first-class objects, the above would have to be explicitly prohibited, which may lead me to leave it in. If, however, I'd have to manually support this use case, I'm going to leave it out for sure.

JSON objects exist as a convenient data format internally and for getting data in and out of Promise. The ubiquity of JSON-like syntax in most dynamic languages and its easy mapping to object graphs makes it the ideal choice for Promise to foster simplicity and interop.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: Building the repository pattern on the language IoC

Before I get into the code samples, I should point out one more "construction" caveat and change from my previous writing: constructors don't have to be part of the Type. What does that mean? If you were to explicitly declare the Song Type and exclude the Song:(name) signature from the Type, it would still get invoked if someone were to call Song{name: "foo"}, i.e. given a JSON resolution call, the existence of fields is used to try to resolve to a constructor, resulting in a call to Song:(name). Of course that's assuming that instance resolution actually hits construction and isn't using a Use lambda or returning an existing ContextScoped instance.

A simple Repository

Let's assume we have some persistence layer session and that it can already fetch DTO entities, a la ActiveRecord. Now we want to add a repository for entities fetched so that unique entities from the DB always resolve to the same instance. A simple solution to this is just a lookup of entities at resolution time:

$#[Session].In(:session).ContextScoped;
$#[Dictionary].In(:session).ContextScoped;
$#[User].Use {
  var rep = Dictionary<string,User>();
  var name = $_.name;
  rep.Get(name) ?? rep.Set(name,Session.GetByName<User>(name));
};

In the above the $_ implicit JSON initializer argument is used to determine the lookup value. I.e. given a JSON object, we can use dot notation to get to its fields, such as $_.name. This name is then used to do a lookup against a dictionary. Promise adopts the C# ?? operator to mean "if nil, use this value instead", allowing us to call .Set on the dictionary with the result from the Session. There is no return since the last value of a lambda is returned implicitly and Set returns the value set into it.

One other thing to note is the registration of Dictionary as ContextScoped. Since Dictionary is a generic type, each variation of type arguments will create a new context instance of Dictionary. For our example this means that the lambda executed for User resolution always gets the same dictionary instance back.

context(:session) {
  var from = User{ name: request.from };
  var to = User{ name: request.to };
  var msg = Message{ from: from, to: to, body: request.body };
  msg.Send();
}

The usage of our setup stays nice and declarative. Getting User instances requires no knowledge of how the instance is created; the code just states which instance it wants, i.e. one with the given name. Swapping out the resolution behavior for a service layer to get users, a mock layer to test the code, or a different DB layer can all be done without changing the business logic operating on the User instances.

A better Repository

Of course the above repository is just a dictionary and only supports getting. It assumes that Session.GetByName<User> will succeed and even then only acts as a session cache. So let's create a simple Repository class that also creates new entities and lets them be saved.

class Repository<TEntity> {
  Session _session = Session();             // manual resolve/init
  +Dictionary<String,TEntity> _entities;    // automatic resolve/init

  Get:(name|TEntity) {
    var e = _entities[name] ?? _entities.Set(name,_session.GetByName<TEntity>(name) ?? TEntity{name});
    e.Save:() { _session.Save(e); };
    return e;
  }
}

Since the Repository class has dependencies of its own, this class introduces dependency injection as well. The simplest way is to just initialize the field using the empty resolver. In other languages this would be hardcoding construction, but with Promise this is of course implicit resolution against the IoC. Still, that's the same extraneous noise as C# and Java that I want to stay away from, even if the behavior is nicer. Instead of explicitly calling the resolver, Promise provides the plus (+) prefix to indicate that a field should be initialized at construction time.

The work of the repository is done in Get, which takes the name and returns the entity. As before, it does a lookup against the dictionary and otherwise sets an instance into the dictionary. However, now if the session returns nil, we call the entity's resolver with an initializer. But if we set up the resolver to call the repository, doesn't that just result in an infinite loop? To avoid this, Promise will never call the same resolver registration twice for one instance. Instead, resolution bubbles to the next higher context and its registration. That means, lacking any other registration, this call will just create a new instance.

Finally, we attach a Save() method to the entity instance, which captures the session and saves the entity back to the DB. This last bit is really just there to show how entities can be changed at runtime. As repositories go, it's actually a bad pattern and we'll fix it in the next iteration.

$#[Repository].In(:session).ContextScoped;
$#[User].Use { Repository().Get($_.name); };

The registration to go along with the Repository has gotten a lot simpler as well. Since the repository is context scoped and gets a dictionary and session injected, these two Types do not need to be registered as context scoped themselves. And User resolution now just calls the Repository getter.

context(:session) {
  var user = User{ name: request.name };
  user.email = request.email;
  user.Save();
}

The access to the instance remains unchanged, but now we can change its data and persist it back using the Save() method.

Now with auto-commit

As I mentioned, the attaching of Save() was mostly to show off monkey-patching and in itself is a bad pattern. A true repository should just commit for us. So let's change the repository to reflect this:

class Repository<TEntity> {
  +Session _session;
  +Dictionary<String,TEntity> _entities;
  _rollback = false;

  Get:(name|TEntity) {
    var e = _entities[name] ?? _entities.Set(name,_session.GetByName<TEntity>(name) ?? TEntity{name});
    return e;
  };

  Rollback:() { _rollback = true; };

  ~ {
    _entities.Each( (k,v) { _session.Save(v) } ) unless _rollback;
  }
}

By attaching a Disposer to the class, we get the opportunity to save all instances at context exit. But having automatic save at the end of the :session context begs for the ability to prevent committing data. For this, the Rollback() method simply sets a _rollback flag that governs whether we call save on the entities in the dictionary.

context(:session) {
  var user = User{ name: request.name };
  user.email = request.email;
}

We've iterated over our repository a couple of times, each time changing it quite a bit. The important thing to note, however, is that the repository itself, as well as the session, have stayed invisible from the business logic. Both are an implementation detail, while the business logic itself just cared about retrieving and manipulating users.

I hope that these past posts give a good overview of how language-level IoC is a simple yet powerful way to control instance lifespan and mapping without cluttering up code. Next time, I'll return to what can be done with methods, since fundamentally Promise tries to keep keywords to a minimum and treat everything as a method/lambda call.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: IoC Lifespan management

In my last post about Promise I explained how a Type can be mapped to a particular Class to override the implicit Type/Class mapping like this:

$#[User].Use<MockUser>;

This registration is global and always returns a new instance, i.e. it acts like a factory for the User Type, producing MockUser instances. In this post I will talk about the creation and disposal of instances and how to control that behavior via IoC and nested execution contexts.

Dependency Injection

So far all we have done is Type/Class mapping. Before I talk about lifespans I want to cover Dependency Injection, both because it's one of the first motivators for people to use an IoC container and because lifespan is affected by your dependencies as well. Unlike traditional dependency injection via constructors or setters, Promise can inject dependencies in a way that looks a lot more like Service Location without its drawbacks. We don't have constructors, just a resolution mechanism. We do not inject dependencies through the initialization call of the class; we simply declare fields and either execute resolution manually or have the language take care of it for us:

class TwitterAPI {
  _stream = NetworkStream{host: "api.twitter.com"};   // manual resolution
  +AuthProvider _authProvider;                        // automatic resolution
  ...
}

_stream simply calls the NetworkStream resolver as its initializer, which uses the IoC to resolve the instance, while _authProvider uses the plus (+) prefix on the field type to tell the IoC to initialize the field. The only difference in behavior is that the first allows the passing of an initializer JSON block, but using the resolver with just (); is identical to the + notation.

Instance Disposal and Garbage Collection

Promise eschews destructors and provides Disposers in their stead. What, you may ask, is the difference? Instance destruction does not happen until garbage collection, which happens at the discretion of the garbage collector. But disposal happens at context exit, which is deterministic behavior.

class ResourceHog {
   +Stream _stream; // likely automatic disposal promotion because of disposable field

  ~:{
      // explicit disposer
   };
}

Instances go through disposal if they either have a Disposer or have a field value that has a Disposer. The Disposer is a method slot named by a tilde (~). Of course the above example would only need a disposer if Stream was mapped to a non-disposing implementation. Accessing a disposed instance will throw an exception. Disposers are not part of the Type contract, which means that deciding whether or not to dispose an instance at context exit is a runtime decision made by the context.

Having deterministic clean-up behavior is very useful, but it does mean that if you capture an instance from an inner context in an outer context, it may suddenly be unusable. Not defining a Disposer may not be enough, since an instance with fields does not know until runtime whether one of the fields is disposable, and the instance may be promoted to disposable. The safest path for instances that need to be created in one context and used in another is to have them attached to either a common parent or the root context, both options covered below.

Defining instance scope

FactoryScoped

This default scope for creating a new instance per resolver invocation is called FactoryScoped and can also be manually set (or reset on an existing registration) like this:

// Setup (or reset) the default lifespan to factory
$#[User].FactoryScoped;

// two different instances
var bob = User{name: "bob"};
var mary = User{name: "mary"};

A .FactoryScoped instance may be garbage collected when no one is holding a reference to it anymore. Disposal will happen either at garbage collection or when its execution context is exited, whichever comes first.

ContextScoped

The other type of lifespan scoping is .ContextScoped:

// Setup lifespan as singleton in the current context
$#[Catalog].ContextScoped;

// both contain the same instance
var catalogA = Catalog();
var catalogB = Catalog();

This registration produces a singleton for the current execution context, giving everyone in that context the same instance at resolution time. This singleton is guaranteed to stay alive throughout the context's life and disposed at exit.

Defining execution contexts

All code in Promise runs in an execution context, i.e. at the very least there is always the default root context. If you never define another context, a context scoped instance will be a process singleton.

You can start a new execution scope at any time with a context block:

context {
  ...
}

Context scoped instances are singletons in the current scope. You can define nested contexts, each of which will get their own context scoped instances, providing the following behavior:

$#[Foo].ContextScoped;
context {
  var fooA = Foo();

  context {
    var fooB = Foo(); // a different instance from fooA
  }
}

Since the context scope is tied to the context the instance was resolved in, each nested context will get its own singleton.

Context names

But what if I'm in a nested context and want the instance to be a singleton attached to one of the parent contexts, or want a factory scoped instance to survive the current context? For finer control, you can target a specific context by name. The root context is always named :root, while any child context can be manually named at creation time. If not named, a new context is assigned a unique, random symbol.

println context.name; // => :root

context(:inner) {
  $#[Catalog].In(:inner).ContextScoped;
  $#[Cart].ContextScoped;

  var catalogA = Catalog();
  var cartA = Cart();

  context {
    var catalogB = Catalog(); // same instance as catalogA
    var catalogC = Catalog(); // same instance as A and B
    var cartB = Cart(); // different instance from cartA
    var cartC = Cart(); // same instance as cartB
  }
}

While .Use and .(Factory|Context)Scoped can be used in any order, the .In method on the registration should generally be the first method called in the chain. When omitted, the global version of the Type registration is modified, but when invoked with .In, a shadow registration is created for that Type in the specified context. The reason for the deterministic ordering is that registration is just chaining method calls, each modifying a registration instance and returning the modified instance. But .In is special in that it accesses one registration instance and returns a different one. Consider these three registrations:

$#[Catalog].In(:foo).Use<DbCatalog>.ContextScoped;
// vs.
$#[Catalog].ContextScoped.In(:foo).Use<DbCatalog>;
// vs.
$#[Catalog].ContextScoped.Use<DbCatalog>;

These registrations mean, in order:

  • "for the type Catalog in context :foo, make it context scoped and use the class DbCatalog,"
  • "for the type Catalog__, make it context scoped, and in context :foo__, use class DbCatalog," and
  • "for the type Catalog, make it context scoped, use the class DBCatalog and in context :foo ..."

The first is what is intended 99% of the time. The second one might have some usefulness, where a global setting is attached and then additional qualifications are added for context :foo. The last, however, is just accidental, since we set up the global case and then access the context specific one based on the global, only to not do anything with it.

This ambiguity of chained methods could be avoided by making the chain a set of modifications that are pending until some final command like:

$#[Catalog].ContextScoped.In(:foo).Use<DbCatalog>.Build;

Now it's a set of instructions that are order independent and not applied to the registry until the command to build the registration. I may revisit this later, but for right now, I prefer the possible ambiguity to the extraneous syntax noise and the possibility of unapplied registrations because .Build was left off.

What about thread isolation?

One other common scope in IoC is one that has thread affinity. I'm omitting it because as of right now I plan to avoid exposing threads at all. My plan is to use task-based concurrency with messaging between task workers and the ability to suspend and resume execution of methods a la coroutines instead. So the closest analog to thread affinity I can think of is that each task will be fired off with its own context. I haven't fully developed the concurrency story for Promise, but the usual thread spawn mechanism is just too imperative, where I'd like to stay declarative.

Putting it all together

With scoping and context specific registration, it is fairly simple to produce very custom behavior on instance access without leaking the rules about mapping and lifespan into the code itself. Next time I will show how all these pieces can be put together, to easily build the Repository Pattern on top of the language level IoC.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: IoC Type/Class mapping

Before I can get into mapping, I need to change the way I defined getting an instance in the Type and Class definitions:

Getting an instance in Promise, revisited

When I talked about Object.new, I alluded to it being a call on the Type, not the Class, with the IoC layer taking over, but I was still trapped in the constructor metaphor so ubiquitous in Object Oriented programming. .new is really not appropriate, since we don't know if what we are accessing is truly new. You never call a constructor, there is no access to such a beast; instead it can best be thought of as an instance accessor or instance resolution. To avoid confusing things further with a loaded term like new, I've modified the syntax to this:

// get an instance
var instance = Object();

We just use the Type name followed by empty parentheses, or in the case that we want to pass a JSON initializer to the resolution process we can use:

// get an instance w/ initializer
var instance = Object{ foo: "bar" };

As before, this is a call against the implicit Type Object, not the Class Object. And, also as before, creating your own constructor intercept is still a Class Method, but now one without a named slot. The syntax looks like this (using the previous post's example):

Song:(name) {
  var this = super;
  this._name = name;
  return this;
}

The important thing to remember is that the call is against the Type, but the override is against the Class. As such we have access to the constructor super, really the only place in the language where this is possible. Being a constructor overload does mean that a call to Song{ ... } will not necessarily result in a call to the Song class constructor intercept, either because of type mapping or lifespan management, but I'm getting ahead of myself.

How an instance is resolved

Confused yet? The Type/Class overlapping namespace approach does seem needlessly confusing when you start to dig into the details, but I feel it's a worthwhile compromise, since for the 99% use case it's an invisible distinction. Hopefully, once I work through everything, you shouldn't even worry about there being a difference between Type and Class -- things should just work, or my design is flawed.

In the spirit of poking into the guts of the design and explaining how this all should work, I'll stop hinting at the resolution process and instead dig into the actual usage of the context system.

The Default Case

// creates new instance of class User by default
var user = User{name: "bob"};

This call uses the implicit mapping of the User type to the User class and creates a new instance. If there is no intercept for the User() Class Method, the deserializer construction path is used, and if there exists a field called _name, it is initialized with "bob".

Type to Class mapping

// this happens implicitly
$#[User].Use<User>; // first one is the Type, the second is the Class

// Injecting a MockUser instance when someone asks for a User type
$#[User].Use<MockUser>;

Promise uses the convention of $ followed by a symbol for environment variables, popularized by perl and adopted by php and Ruby. In perl, $ actually is the general scalar variable prefix and there just happen to be certain system-populated globals. In Promise, like Ruby, $ is used for special variables only, such as the current environment, regex captures, etc. $# is the IoC registry. Using the array accessor with the Type name accesses the registry value for that Type, on which we call the method Use<>.

The Use<> call betrays that Promise supports a Generics system, which is pretty much a requirement the moment you introduce a Type system. Otherwise you can't create classes that operate on a variety of other typed instances without the caller having to cast instances coming out to what they expect. Fortunately Generics only come into play when you have chosen typed instances; otherwise you just treat them as dynamic duck-typed instances that you can call whatever you want on.

Type to lambda mapping

The above mapping is a straight resolution from a Type to a class. But sometimes you don't want a one-to-one mapping, but rather want a way to dynamically execute some code to make runtime decisions about construction. For this, you can use the lambda signature of .Use:

$#[User].Use {
  var this = Object $_;
  this:Name() { return _name; };
  return this;
};

The above is a simple example of how a dynamic type can be built at runtime to take the place of a typed instance. Of course any methods promised by User not implemented on that instance will result in a MethodMissing runtime exception on access.

The $_ environment variable is the implicit capture of the lambda's call arguments as a JSON construct. This allows our mapping to access whatever initializer was passed in at resolution time.

$#[User].Use { return MockUser $_; };

The above example looks like it's the same as the $#[User].Use<MockUser> example, but it has the subtle difference that MockUser in this scenario is the Type, not the Class. If MockUser were mapped as well, the resolved instance would be of another class.

Doing more than static mapping

But you don't have to create a new instance in the .Use lambda; you could do something like this:

// Don't do this!
var addressbook = AddressBook();
$#[AddressBook].Use { return addressbook; };

This will create a singleton for AddressBook, but it's a bad pattern, being a process-wide global. The example only serves to illustrate that .Use can take any lambda.

So far, mapping just looks like a look-up table from Type to Class, and worse, one that is statically defined across the executing process. Next time I will show how the IoC container isn't just a globally defined Class mapper. Using nested execution contexts, context-specific mappings and lifespan mappings, you can easily create factories, singletons and shared services, including repositories, and have those definitions change depending on where in your code they are accessed.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: Inversion of Control is the new garbage collection

Before continuing with additional forms of method definitions, I want to take a detour through the Inversion of Control facilities, since certain method resolution behavior relies on those facilities. IoC is one feature of Promise that is meant to not be seen or even thought about 99% of the time, but when you need to manipulate its default behavior it is a fairly broad topic, which I will cover in the next 3 or 4 posts. If you want to see code, you probably want to just go to the next post, since this one is mostly about the reasoning for inclusion of IoC in the language itself.

The evolution of managing instances

Not too long ago Garbage Collection was considered the domain of academic programming. My first experience with it was writing LISP on Symbolics LISP machines. And while it was a wonderful development experience, you got used to the Listener (think REPL on LISP machines) pausing and the Genera status bar blinking with (garbage-collect). Ok, but that was back on hardware significantly less powerful than my obsolete Razor flip-phone.

These days garbage collection is pretty much a given. The kind of people that say you have to use C to get things done are the same kind of people that used to say that you have to use assembly to get things done, i.e. they really are talking about edge cases. Even games are using scripting languages for much of their game logic these days. Whether it's complex generational garbage collection or simple reference counting, most languages are memory managed at this point.

The lifespan phases of an instance

But still we have the legacy of malloc and free with us. We still new-up instances, and while there's fairly little use of destructors, we still run into scenarios that require decommissioning of objects before garbage collection gets rid of them. And while on the subject of construction and destruction, we're still manually managing the lifespan from creation to when we purposely let instances drop out of scope so GC can do its magic.

Somehow while moving to garbage collection so that we don't have to worry about that plumbing, we kept the plumbing of manually handling construction, initialization and disposal. That doesn't seem like work related to solving the task at hand, but rather more like ceremony we've grown used to. We now have three phases in an instance lifespan, only one of which is actually useful to problem solving:

Construction/Initialization

Depending on which language you are using, this might be a single constructor stage (Java, C#, Ruby, et al) or an allocation and initialization stage (Smalltalk, Objective-C, et al). Either way, you do not want your code to start interacting with the instance until these stages are completed.

Operation

This is the useful stage of the instance, when it actually can fulfill its purpose in our program. This should really be the only stage we ever need to see.

Disposal/Destruction

We're done with the instance, so we need to clean up any references it has and resources it has a hold of and let the garbage collector do the rest. By definition, it has passed its useful life and we just want to make sure it's not interfering with anything still executing.

Complicating disposal is that most garbage-collected languages have non-deterministic destructors, which are not invoked until the time of collection and may run long after use of the instance has ceased. Since there are scenarios where clean-up needs to happen in a deterministic fashion (such as closing file and network handles), C# added the IDisposable pattern. This pattern seems more like an "oh, crap, what do we do about deterministic cleanup?" add-on than a language feature. It completely puts the onus on the programmer, both for calling .Dispose (unless in a using block) and for handling access to an already disposed instance.
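
For reference, this is the C# pattern in question -- the caller has to remember the using block (or call Dispose themselves), and nothing stops a later access to the already disposed instance:

using System.IO;

class Example {
    static void Main() {
        FileStream stream;
        using(stream = File.OpenRead("data.bin")) {
            stream.ReadByte();  // fine inside the using block
        }                       // Dispose() runs deterministically here
        // stream.ReadByte();   // compiles, but throws ObjectDisposedException at runtime
    }
}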

Enter Inversion of Control

For the most part, all we should care about is that when we want an instance with certain capabilities, we should be able to get access to one. Who cares whether it was freshly constructed or a shared instance or a singleton or whatever? Those are details that are important, but once defined they are not part of the user story we set out to satisfy.

In Java and C#, this need for pushing instance management out of the business logic and into dedicated infrastructure led to the creation of Inversion of Control containers, named thus because they invert the usual procedural flow of "create an object, hand it to another object's constructor as a dependency, etc." to "ask for the object you need and the dependency chain will be resolved for you". There are numerous articles on the benefits of Dependency Injection and Inversion of Control. One of the simplest explanations was given by John Munch to the Stackoverflow question "How to explain Dependency Injection to a 5-year-old":

When you go and get things out of the refrigerator for yourself, you can cause problems. You might leave the door open, you might get something Mommy or Daddy doesn't want you to have. You might even be looking for something we don't even have or which has expired.

What you should be doing is stating a need, "I need something to drink with lunch," and then we will make sure you have something when you sit down to eat.

But IoC goes beyond the wiring-up of object graphs that DI provides. It is also responsible for knowing when to hand you a singleton vs. a shared instance for the current scope vs. a brand new instance, and it handles disposal of those instances as their governing scopes are exited.

These frameworks are built on top of the existing constructor plumbing and use reflection to figure out how to take over the tasks that used to fall to the programmer. For Promise this plumbing is considered a natural extension of what we already expect of garbage collection and tries to be automatic and invisible.

By default every "constructor" access to an instance resolves the implicit Type to the Class of the same name, and creates an instance, i.e. behavior as you expect from OO languages. However, using nested execution scopes, lifespan management and Type mapping, this behavior can be modified without touching the business logic. In the next post, I'll start by explaining how the built in IoC works by tackling Type/Class mapping.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

Promise: Constructor revisionism

Only 3 posts into the definition of the language and already I'm changing previously published specs. Well, that's the way it goes.

I'm currently writing the article about language-level IoC which I alluded to previously, but I had not fully considered its effects on syntax. The key concept, tho, is that there is no construction, there is only instance resolution, which .new being a call on the Type, not the Class, hinted at. But that does mean that what you get does not necessarily represent a new instance.

And beyond naming implications, the meaning of the arguments passed into the resolution call is also ambiguous. They could be initialization values or they could be arguments to determine which instance of that Type to fetch (like in a Repository pattern). And if that's the case, overloading this process becomes tricky as well, since it should access the super class, which only makes sense in the construction paradigm.

Basically lots of syntactic implications I'm working through right now. The only thing that is certain is that .new will not make it through that review process.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.