geek

Mastodon Integration

Well, damn it. Here I am procrastinating on my archviz project with a server-side project. I am an infrastructure engineer at heart, so anything server related will always pull at me harder than any other project. So right after writing the Future Work section in my last post, I started looking more closely at Daniel's approach to using Mastodon for comments. The part that scratched my server-side geek itch was his chicken-and-egg problem, which forces this workflow:

  1. create post
  2. create toot
  3. update post with toot id

I just want to publish my post, have a matching toot generated automatically and have the original post be able to discover the corresponding toot_id. I want to keep my blog static and don't really want to add this to the build pipeline, so what I need is a web service that will lazily publish the toot and return the toot_id for a given URL.
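The lazy part of that service can be sketched in a few lines. This is only an illustration under assumptions: createTootLookup and getTootId are made-up names, publishToot is a stand-in for whatever actually posts to Mastodon, and the cache lives in memory rather than in a real store.

```javascript
// Hypothetical sketch: publish at most one toot per post URL and remember its id.
// `publishToot(postUrl)` is a placeholder; it may return an id or a promise of one.
function createTootLookup(publishToot) {
  const pending = new Map(); // post URL -> promise resolving to the toot id

  return function getTootId(postUrl) {
    if (!pending.has(postUrl)) {
      // First request for this URL: publish once and cache the promise, so
      // concurrent requests share one toot instead of creating duplicates.
      const promise = new Promise(function (resolve) {
        resolve(publishToot(postUrl));
      }).catch(function (err) {
        pending.delete(postUrl); // allow a retry after a failed publish
        throw err;
      });
      pending.set(postUrl, promise);
    }
    return pending.get(postUrl);
  };
}
```

The static page would then only need to ask the service for the toot_id of its own URL; the first visitor triggers the toot, everyone after that gets the cached id.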

Josh.js 0.3 roadmap: rethinking I/O

My goal has been to expand Josh.Shell to be a better emulation of a console window. It seems simple enough: execute a command, print the output in a div. But print something longer than your shell window and you immediately see where the real thing has a couple more tricks up its sleeve:

cat foo.txt

If foo.txt is too large, you could use less:

less foo.txt

And now you get pagination, plus a couple of other neat features.

Ok, let's implement that in Josh.Shell. Let's start with the less command. It could be hardcoded to know the div the content is displayed in, do some measuring and start paginating output. Aside from being ugly, you immediately run into another problem: in order to determine after which line to pause paginated output, how do you determine where one line ends and the next begins? That only further exposes the problem of output in a div: sure, the browser will wrap the text for you, but by delegating layout to the browser, you've lost all knowledge about the shape of the content.

To start taking back control over output we need the equivalent of TermCap, i.e. an abstraction of our terminal div that at least gives height and width in terms of characters. Next, we need to change output to be just a string of characters with line feeds. This does lead us down a rabbit hole where we'll eventually need to figure out how to handle ANSI terminal colors and other character markup, but for the time being, let's assume just plain-text ASCII.
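To make that concrete, here is a rough sketch of what knowing the terminal size in characters buys us. The termcap object and the wrapText and paginate helpers are illustrative names of my own, not existing Josh.js API, and the text is assumed to be plain ASCII with line feeds.

```javascript
// Hypothetical TermCap-like description of the terminal div, in characters.
const termcap = { width: 80, height: 24 };

// Break plain text into display lines no wider than `width` characters.
function wrapText(text, width) {
  const lines = [];
  text.split('\n').forEach(function (line) {
    if (line.length === 0) {
      lines.push(''); // keep blank lines, they take up a row too
      return;
    }
    for (let i = 0; i < line.length; i += width) {
      lines.push(line.slice(i, i + width));
    }
  });
  return lines;
}

// Group display lines into pages, reserving one row for a pager prompt.
function paginate(text, cap) {
  const lines = wrapText(text, cap.width);
  const rows = Math.max(1, cap.height - 1);
  const pages = [];
  for (let i = 0; i < lines.length; i += rows) {
    pages.push(lines.slice(i, i + rows));
  }
  return pages;
}
```

With the shape of the content back under our control, a pager only has to show one entry of paginate(text, termcap) at a time.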

Now we could implement a basic form of less. But chances are the first time you want to use less is a scenario such as this:

ls -lah /foo | less

i.e. we don't have a file that we want to display, we have the output of an existing command that we want to pipe into less. And this is where we run into our next problem: Josh.Readline has only one processor for the entire command line, i.e. the above will always be handled by the command handler attached to ls. And while we could make that command handler smart enough to understand | we'd have to do it for every command and then do the same for < and >.

Intercepting command and completion handling

No, what we need is a way to intercept readline processing before it goes to the command handler for either execution or completion, so that we can recognize command separators and handle each appropriately. It also means that commands should no longer return their output to the shell; instead, the pre-processor executing multiple commands receives the output and provides it as input for the next command.

The pre-processor work will go in Josh.Readline and can be discussed via Issue 16, while the piping behavior will be implemented on top of the pre-processor work, and discussion of it should happen on Issue 18.

Standard I/O

We certainly could just chain the callbacks, but we still have no way of providing input, and we'd end up being completely synchronous, i.e. one command would have to run to completion before its output could be piped to the next.

Rather than inventing some crazy custom callback scheme, what we are really looking at is just standard I/O. Rather than getting a callback to provide output, the command invocation should receive an environment, which provides input, output and error streams along with TermCap and a completion code callback. The input stream (stdin) can only be read from, while output (stdout) and error (stderr) can only be written to. As soon as the out streams are written to, the next receiver (command or shell) will be invoked with the output as its input. Stderr by default will invoke the shell regardless of what other commands are still in the pipeline.
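As a thought experiment, the environment described above could look something like the following. None of this is Josh.js API: createStream, runPipeline and the shell callbacks (print, printError, done) are names invented for this sketch, and chunks are passed along synchronously to keep it short.

```javascript
// A stream has a read-only side (for stdin) and a write-only side (for stdout).
function createStream() {
  const dataHandlers = [];
  const endHandlers = [];
  return {
    reader: {
      onData: function (fn) { dataHandlers.push(fn); },
      onEnd: function (fn) { endHandlers.push(fn); }
    },
    writer: {
      write: function (chunk) { dataHandlers.forEach(function (fn) { fn(chunk); }); },
      end: function () { endHandlers.forEach(function (fn) { fn(); }); }
    }
  };
}

// Run commands as a pipeline. Each command is a function(env).
function runPipeline(commands, shell, termcap) {
  // streams[i] feeds command i; the last stream feeds the shell.
  const streams = [];
  for (let i = 0; i <= commands.length; i++) {
    streams.push(createStream());
  }
  streams[commands.length].reader.onData(shell.print);

  // Start at the end of the pipeline so every receiver is listening
  // before the command in front of it writes anything.
  for (let i = commands.length - 1; i >= 0; i--) {
    commands[i]({
      stdin: streams[i].reader,
      stdout: streams[i + 1].writer,
      stderr: { write: shell.printError }, // errors go straight to the shell
      termcap: termcap,
      exit: function (code) {
        streams[i + 1].writer.end(); // tell the next receiver no more input is coming
        shell.done(i, code);
      }
    });
  }
  streams[0].writer.end(); // this sketch has no interactive input
}
```

A command such as ls would write each entry to env.stdout and call env.exit(0), while a filter would subscribe to env.stdin.onData and only call env.exit once env.stdin.onEnd fires, so output flows through the pipeline as it is produced.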

All these changes are planned for 0.3, a minor revision bump, because it will likely introduce some breaking changes. I don't want to stop supporting the ability to just return HTML, so the stdio model might be something to opt in, leaving the current model in place. If you have feedback on the stdio and TermCap work, please add to the discussion in Issue 14.

One other prerequisite for these changes is Issue 3. In a regular console, text followed by a backslash and a space and more text, or a quoted string, is treated as a single argument. Josh.Readline does not do this, which causes problems with completing and executing arguments with spaces in them. That will be even more of a problem once we support piping, so it needs to be fixed first.
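For reference, the argument handling in question can be sketched as a small tokenizer. splitArgs is a hypothetical helper, not the fix that will land in Josh.Readline, and it deliberately simplifies shell rules: a backslash escapes the next character everywhere except inside single quotes.

```javascript
// Split a command line into arguments, honouring backslash escapes and quotes.
function splitArgs(line) {
  const args = [];
  let current = '';
  let inArg = false; // true once an argument has started, even an empty "" one
  let quote = null;  // the quote character we are inside of, or null

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '\\' && i + 1 < line.length && quote !== "'") {
      current += line[++i]; // take the escaped character literally
      inArg = true;
    } else if (quote) {
      if (ch === quote) {
        quote = null;       // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;           // opening quote
      inArg = true;
    } else if (ch === ' ' || ch === '\t') {
      if (inArg) {          // unquoted whitespace ends the current argument
        args.push(current);
        current = '';
        inArg = false;
      }
    } else {
      current += ch;
      inArg = true;
    }
  }
  if (inArg) {
    args.push(current);
  }
  return args;
}
```

With this, cat my\ file.txt and cat "my file.txt" both yield the two arguments cat and my file.txt.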

Using Josh for form data

While Josh.js was built primarily for creating bash-style shells in the browser, the same mechanisms would be nice to use for form input. Yes, it does break the expected user behavior, but if you are creating a UI for unix people this might just be the geeky edge to push usability over the top. Problem was, Josh.js originally bound itself to the document root and you had to activate/deactivate it to take over key interception. While you could manually trigger this, it was less than ideal for having multiple instances of Josh.Readline on one page, each attached to an input field. With today's release of Josh.js (marked minor, since it's backwards compatible and breaking changes are on the horizon), readline can bind to an element and can even do so after the fact. In addition, the key bindings of Josh can be disabled or remapped to better match your use case. Finally, because taking over the UX of form input is unfortunately not quite as simple as just binding readline to an input element, Josh.js now includes Josh.Input to simplify the binding.

Josh.Input

There are two ways to create a josh form element. Either you attach Josh.Input to an <input> field or a <span>. The former preserves the standard look and feel of an input field, but has to play some tricks to properly handle cursor positioning. The latter can be styled any way you like and uses the underscore cursor behavior also used in Josh.Shell.

The former uses html like this:

<input id="input1" type="text" />

and binds it to Josh.Input with:

var cmd1 = new Josh.Input({id: "input1"});

to produce:

Input:

And the latter will generally use a span like this:

<span id="input2"></span>

The span can be adorned with whatever CSS you like to create the input look and feel, and is then bound to Josh.Input the same way with:

var cmd2 = new Josh.Input({id: "input2"});

creating an input like this:

Span:

Note that the two inputs share history and killring, which in addition to line editing behavior makes them so much more powerful than just plain old input boxes. Also note that the only reason we capture the object created by new Josh.Input is so that we could access its members, such as the Josh.Readline instance bound to the element. As I said, I wouldn't recommend using this in a regular form, since form input behavior on the web is rather well established, but a custom app that evokes a unix input mentality could certainly benefit from it.

Changes to Josh.js for Input

Adding Josh.Input revved Josh.js to 0.2.9 (i.e. no breaking changes), which allows Josh.Readline and by extension Josh.Shell to be bound to an element. When Josh.Shell is bound to an element, it will now activate/deactivate on focus (for this it will add a tabindex to the shell element if one isn't already there). Binding to an input or an input-mimicking span did illustrate that certain key bindings just don't make sense. Rather than hardcode a couple of binding exceptions for input, 0.2.9 also introduces the ability to bind and unbind the existing Readline commands to any key (modifiers work, but only a single modifier is recognized per key combo right now). This could be used to change the emacs-style bindings, but really is primarily intended to unbind commands that don't make sense. The bindable commands are:

  • complete - invoke command completion
  • done - invoke the command handler
  • noop - capture the key but don't do anything (bound to caps-lock, pause and insert)
  • history_top - go to and display the history top
  • history_end - go to and display the history end
  • history_next - go to and display the next item in the history
  • history_previous - go to and display the previous item in the history
  • end - cursor to end of line
  • home - cursor to beginning of line
  • left - cursor left
  • right - cursor right
  • cancel - interrupt the current command input
  • delete - delete character under cursor
  • backspace - delete character to the left of cursor
  • clear - clear the shell screen
  • search - start reverse search mode
  • wordback - cursor to previous word
  • wordforward - cursor to next word
  • kill_eof - kill text to end of line
  • kill_wordback - kill the previous word
  • kill_wordforward - kill the next word
  • yank - yank the current head of the killring to the cursor position
  • yank_rotate - if following yank, replace previously yanked text with next text in killring

Binding commands to keys is done with:

readline.bind(key,cmd);

and unbinding is done with:

readline.unbind(key);

where key is:

{
  keyCode: 120,
  // or, instead of keyCode:
  char: 'x',
  // plus, optionally, one of these modifiers:
  ctrlKey: true,
  metaKey: true
}

Josh also provides a simple lookup for special key keycodes:

Josh.Keys = {
    Special: {
      Backspace: 8,
      Tab: 9,
      Enter: 13,
      Pause: 19,
      CapsLock: 20,
      Escape: 27,
      Space: 32,
      PageUp: 33,
      PageDown: 34,
      End: 35,
      Home: 36,
      Left: 37,
      Up: 38,
      Right: 39,
      Down: 40,
      Insert: 45,
      Delete: 46
    }
  };

Josh.Input automatically unbinds Tab and Ctrl-R.

None of these changes affect existing usages of Josh.js. However, 0.3 is coming up soon and it may have some breaking changes (I will try to avoid them, but can't determine yet if that's possible). I'll talk about those plans in a future post.

List comprehension over non-enumerables in C#

As I was trying to finish up my C# implementation of Try[A] for Scando, I ran into some LINQ issues. Or at least I thought I did. Using the LINQ syntax kept failing to call my extension methods and instead fell back to the IEnumerable<T> ones. I thought to myself how unfortunate it was that the LINQ syntax was hard wired to IEnumerable<T>, otherwise it would be perfect to emulate Scala's for comprehensions.

Fortunately I had just watched Runar Bjarnason's excellent "Lambda: The Ultimate Dependency Injection Framework" and recalled usage of a container type similar to how Option<T> and Try<T> are used. I also recalled that projecting such a type with a function that also returned a new instance of that container type via map (roughly equivalent to Select) created nested containers, yet I had a test for Option<T> that looked like this:

[Test]
public void Can_chain_somes() {
 var r = Api.DoSomething()
    .Select(Api.Passthrough)
    .Select(Api.Passthrough)
    .GetOrElse("bar");
 Assert.AreEqual("foo", r);
}

This should have returned an Option<Option<Option<string>>> but instead just returned Option<string>. Obviously my implementation of Select was wrong, which would explain why LINQ syntax didn't use it. And that probably meant that my assumptions about IEnumerable were wrong as well. Runar solved the nested container problem by implementing flatMap which also enabled for comprehensions in Scala. For me that meant that I was also missing SelectMany to make LINQ work properly.

Scala For Comprehension

Before I get into LINQ, I should cover why wrapping a value in a single or no value collection is a useful thing in the first place. Let's look at an example of functions returning Option[A]:

for {
  user <- UserRepository.findById(1) if user.isActive
  profile <- ProfileRepository.getProfileByUser(user)
  url <- profile.url
} yield url

This will get a single Option[User], check that the user is active, fetch the user's profile and then return the profile url. What makes this code special is that we don't have to check that the previous call returned a value. If UserRepository returned None, then the if guard never gets executed (since it gets executed for all values in the "Collection") and likewise ProfileRepository is never called since no user was selected.

With for comprehensions and the Option container we can chain calls and have it handle the short circuiting along the way so we will always receive an Option of the yielded type without checking return values along the way.
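For readers more at home in JavaScript, the short-circuiting can be sketched with a minimal Option-like container. This is only an analogue of what the for comprehension expands to, not Scala's or Scando's implementation, and Some, None, profileUrl, findUser and findProfile are illustrative names.

```javascript
// None: the empty container. Every operation on it stays empty.
const None = {
  isDefined: false,
  map: function () { return None; },
  flatMap: function () { return None; },
  filter: function () { return None; },
  getOrElse: function (fallback) { return fallback; }
};

// Some: a container holding exactly one value.
function Some(value) {
  return {
    isDefined: true,
    map: function (f) { return Some(f(value)); },
    flatMap: function (f) { return f(value); }, // f must itself return Some or None
    filter: function (p) { return p(value) ? Some(value) : None; },
    getOrElse: function () { return value; }
  };
}

// Roughly the shape of the comprehension above: guard, then two dependent lookups.
// findUser and findProfile are hypothetical functions returning Some or None.
function profileUrl(findUser, findProfile, id) {
  return findUser(id)
    .filter(function (user) { return user.isActive; })
    .flatMap(function (user) { return findProfile(user); })
    .flatMap(function (profile) { return profile.url; });
}
```

If findUser returns None, neither the guard nor findProfile is ever called, which is the whole point of the construct.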

LINQ and IEnumerable

Ok, but that's subverting a list processing construct for its behavior on empty collections and Scala's excellent type inference deals with types changing throughout the comprehensions. Can LINQ pull that off?

Turns out, yep, it can. The LINQ syntactic sugar doesn't have any ties to operating on IEnumerable or any type at all. To mimic the Scala for comprehension above, we just need to implement 3 methods.

Select: Transforming C<T> to C<TResult>

For illustration purposes, I've created a container type C<T> with an accessor Value. It does not implement IEnumerable<T> as Option<T> originally did (I've since removed that, since the only reason for it had been that I thought I needed it for LINQ).

Let's start with the simplest comprehension style LINQ query:

from x in new C<int>(1) select x

At this point the compiler complains with "Could not find an implementation of the query pattern for source type 'Sandbox.C<int>'. 'Select' not found." Simple enough, let's add Select to C<T>:

public C<TResult> Select<TResult>(Func<T, TResult> selector) {
  return new C<TResult>(selector(Value));
}

Note: LINQ does not care whether its methods are provided as methods on the object or extensions. For this example I'm just attaching it straight on C<T> instead of using an Extension Method, so I don't need the this C<T> source argument.

Now that query works and we are getting back a C<T>, not an IEnumerable<T>. That means that we can call .Value directly on the result of the query.

SelectMany: Flattening C<C<T>>

Next up is passing the result from the first as input to the next call:

from x in new C<int>(1)
from y in new C<int>(x)
select y

Again, it won't compile. Now it complains about SelectMany. Since there are multiple overloads of SelectMany, let's start with the simplest, i.e. a selector to transform T into C<TResult>:

public C<TResult> SelectMany<TResult>(Func<T, C<TResult>> selector ) {
   return selector(Value);
}

Except that just leads to a new compiler error: No overload for method 'SelectMany' takes 2 arguments. To get a closer look at what the compiler wants to do, we can just rewrite the query using int[] instead of C<T>:

from x in new[] {1}
from y in new[] {x}
select y;

Then, using ILSpy and turning Query decompilation off, we can see what .NET actually translates that LINQ to:

new int[] {1}.SelectMany(x => new int[] {x}, (x, y) => y);

That's interesting. Instead of

SelectMany<TSource, TResult>(
  this IEnumerable<TSource> source,
  Func<TSource,IEnumerable<TResult>> selector
)

It's using

SelectMany<TSource, TCollection, TResult>(
  this IEnumerable<TSource> source,
  Func<TSource,IEnumerable<TCollection>> collectionSelector,
  Func<TSource,TCollection,TResult> resultSelector
)

but passes in a resultSelector that merely ignores the first argument and always returns the second, making it functionally equivalent to the first signature. Here's what that looks like as a method on C<T>:

public C<TResult> SelectMany<TCollection, TResult>(
    Func<T, C<TCollection>> collectionSelector, 
    Func<T, TCollection, TResult> resultCollector
) {
  var c = collectionSelector(Value);
  return new C<TResult>(resultCollector(Value, c.Value));
}

And now the LINQ query with multiple from clauses works.

Where: What does a short circuit of C<T> look like?

Finally, we want to add the guard clause, which in LINQ is simply Where:

from x in new C<int>(1) where x == 1
select x

with Where having the following signature

C<T> Where(Func<T, bool> predicate) { ... }

Implementing the guard poses a problem, though. Where will always return a C<T>, but what does a C<T> that failed the predicate check look like? Which also raises the question, what value does the from retrieve when encountering such a C<T>? If the point was that comprehension lets us short circuit the execution chain, then we need a way to indicate that C<T> doesn't have a value.

For this implementation, I've simply added an IsEmpty flag and use the no argument constructor to construct an empty version. While that takes care of implementing Where, it does point to an oversight in Select and SelectMany. Both need to be aware of IsEmpty and return an empty C<T> rather than executing the selectors in order to provide short-circuiting behavior. The full implementation of C<T> now looks like this:

public class C<T> {

  public C() { IsEmpty = true; }
  public C(T value) { Value = value; }

  public T Value { get; private set; }
  public bool IsEmpty { get; private set; }

  public C<T> Where(Func<T, bool> predicate) {
    if(IsEmpty) {
      return this;
    }
    return predicate(Value) ? this : new C<T>();
  }

  public C<TResult> Select<TResult>(Func<T, TResult> selector) {
    return IsEmpty ? new C<TResult>() : new C<TResult>(selector(Value));
  }

  public C<TResult> SelectMany<TCollection, TResult>(
    Func<T, C<TCollection>> collectionSelector,
    Func<T, TCollection, TResult> resultCollector
  ) {
    if(IsEmpty) {
      return new C<TResult>();
    }
    var c = collectionSelector(Value);
    return c.IsEmpty ? new C<TResult>() : new C<TResult>(resultCollector(Value, c.Value));
  }
}

That still does not answer how from x in a works, since that clearly pulls a value from the container. You would expect an enumerator to be required. The key to understanding how this construct works is that the query is transformed into a call to Select or SelectMany, neither of which has any concept of enumeration. Both are handed a selector that takes a T; the container decides whether to call it and returns the result as a new container. Since traditionally the container is IEnumerable, the natural association for from x in a is an enumeration, but a really is just an instance of the container, with no assumption about what it contains.

Applying LINQ comprehensions

Now that we know how to create a container type, we can create return values for our methods that follow those principles and rewrite the Scala example in C#:

from user in UserRepository.FindById(1) where user.isActive
from profile in ProfileRepository.GetProfileByUser(user)
from url in profile.url
select url

Essentially, C<T> is just a naive re-implementation of Option<T>. The reason I created it instead of just walking through the existing type was to illustrate this container type pattern. It's the pattern Option<T> uses, as does Try<T>, and it is used to implement the Reader Monad for functional dependency injection. The essence of this pattern is that by wrapping T in a container we can attach an environment to it to carry along additional data. In the case of Option, it was the ability to determine the presence or lack of a value for T. In the case of Try, it is the presence of an error that occurred while trying to produce T. And for functional dependency injection, it was the ability to carry along dependencies that a call at the end of a long chain needed without the intermediate calls having to be aware of them. By using the container, functions that care about the environment can take it as input or return it, while functions that don't can be lifted to fit the required shape and take part in the composition without ever needing to be aware of the environment being passed along.

Having used for comprehensions in Scala and seeing how it can remove a lot of imperative boiler plate that does nothing to aid in the comprehension of the workflow, it is nice to see that it requires just three methods to allow LINQ to be used for a similarly expressive syntax for composing method chains.

Scala in the key of C#: Option

Update: Since I first released Option, I've realized that its implementing IEnumerable was unnecessary for using LINQ syntax and actually made it less useful. I've updated the post to reflect these changes to Option.

I'm currently reading Daniel Westheide's excellent "The Neophyte's Guide to Scala" series and it's inspired me to port some concepts from Scala to C#. This is less about trying to replace Scala -- it has too many compelling aspects that cannot be mimicked -- and more about exploring concepts I come across in Scala in the realm I'm best versed in, and, hopefully, along the way, create some useful C# utilities. First up, the Option<T> type.

Get thee behind me, null

Reading The Option Type, I was struck by its simplicity and elegance. My Option<T> is inspired by Scala's Option[A], which in turn has its origin in Haskell's Maybe. On the surface it's a pretty simple container for a value that may or may not exist.

var some = Option<string>.Some("foo");

// manually checking for defined
Console.WriteLine(some.IsDefined ? some.Value : "not defined");
// => "foo"

// or use the built in helper
Console.WriteLine(some.GetOrElse("not defined"));
// => "foo"

// None is a singleton
var none = Option<string>.None;

Console.WriteLine(none.GetOrElse("not defined"));
// => "not defined"

Nobody likes NullReferenceException, so Option makes reference types behave more like nullable value types. But with the null coalescing operator ??, checking for null and substituting an alternative value is already very simple in .NET, so why bother with an Option port? After all, the most compelling usage of Option in Scala, pattern matching, just cannot be reproduced in .NET, so there goes a large part of the motivation. But once I learned that Option[A] behaves like an Iterable and therefore could use all the common collection methods, I was intrigued.

The Power of LINQ compels you

You see, an Option can be considered a collection of zero or one values. By implementing the Select, SelectMany and Where Extension Methods for Option we can use LINQ to chain calls together. This makes Option much more composable than manual null checks.

var title = (from url in ParseUrl(urlStr)
             from response in GetWebRequest(url)
             from doc in GetXmlBody(response)
             from t in GetTitleTag(doc)
             select t);
if(title.IsDefined) {
  Console.WriteLine(title.Value);
}

Each method returns an Option, for which the compiler inserts the appropriate extension method for the specific LINQ syntax. An undefined Option behaves just like an empty IEnumerable, i.e. the selector callback is skipped and the query is short circuited. Using the from x in a syntax uses SelectMany (if there is more than one from clause) and could just as easily have been written by manually chaining the calls:

var title = ParseUrl(urlStr)
            .SelectMany(GetWebRequest)
            .SelectMany(GetXmlBody)
            .SelectMany(GetTitleTag);
Console.WriteLine(title.GetOrElse("no title found"));

What Else can Option do for Me?

Option implements equality comparison based on the contained type, so two Options of the same type containing the same value will be equal, according to that type's rules. Along with that Option<T>.None equals null and all None regardless of type have the same hashcode, i.e. 0.

Finally, in addition to GetOrElse, there is also OrElse, which provides a way to substitute a different Option, allowing the chaining of many calls to get the first one that returns a value.

Scando

Option is available on github under the MIT license in the Scando repository. This repository is where I plan to put all code inspired by Scala. There's already an implementation of the Try type in there as well, which I will be writing about shortly.

CLI all the things: Introducing Josh.js

Every time I click my way through a hierarchy tree, I long for a simple BASH shell with TAB completion. It's such a simple thing, but TAB completion (usually implemented via the trusty Readline library) still ranks as one of the most productive tools in my book. So as I was once again trying to navigate to some page in a MindTouch site, I thought that all I really want is to treat this documentation hierarchy like a file system and mount it with a bash shell. Shortly after that I'd implemented a command line interface using our API and Miguel de Icaza's C# Readline-inspired library, GetLine. But I couldn't stop there, I wanted it built into the site itself as a Quake-style, dropdown console. I figured that this should be something that already exists in the JavaScript ecosystem, but alas, I only found a number of demos/implementations tightly integrated into whatever domain they were trying to create a shell for. The most inspired of them was the XKCD shell, but it also was domain specific. Worst of all, key-trapping and line editing was minimal and none of them even trapped TAB properly, leaving me with little interest in using any of them as a base for my endeavours.

Challenge Accepted

Thus the challenge was set: I wanted a CLI with full Readline support in the browser for every web project I work on. That means TAB completion, emacs-style line editing, killring and history with reverse search. Go! Of course, some changes from a straight port of Readline had to be made: commands needed to work using callbacks rather than synchronously, history needed to go into localStorage so it would survive page reloads, and the killring wouldn't co-operate with the clipboard. But other than that it was all workable, including a simple UI layer to deal with prompts, etc., to create a BASH-like shell.

Josh.js

The result of this work is the Javascript Online SHell, a collection of building blocks for adding a command line interface to any site:

  • Readline.js handles all the key-trapping to implement full Readline line editing.
  • History.js is a simple command history backed by localStorage.
  • Killring.js implements the cut/paste history mechanism popular in old skool, pre-clipboard unix applications.
  • Shell.js provides the UI layer to quickly create a console in the browser.
  • Pathhandler.js implements cd, pwd, ls and filepath TAB completion with simple hooks to adapt it to any hierarchy.

The site for josh.js provides detailed documentation and tutorials from the hello world scenario to browsing github repos as if they were local git file systems.

Go ahead, type ~ and check it out

For the fun of it, I've added a simple REST endpoint to this blog to get basic information for all published articles. Since I already publish articles using a YYYY/MM/Title naming convention, I turned the article list into a hierarchy along those path delimiters, so it can be navigated like a file system like this:

/2010
  /06
    /Some-Post
    /Another-Post
  /10
    /Yet-Another-Post

In addition I added a command, posts [n], to create paged lists of articles and a command, go, to navigate to any of these articles. Since the information required (e.g. id, title, path) is small enough to load quickly in its entirety, I decided to forego the more representative use of Josh.js with a REST callback for each command/completion and instead load it all at initialization and operate against the memory model. I wrote a 35 line node.js REST API to serve up the post JSON. On the first console activation, the console calls it, takes the list of articles and builds an in-memory tree of the YYYY/MM/Title hierarchy. Here is the API:

var config = require('../config/app.config');
var mysql = require('mysql');
var _ = require('underscore');
var connection = mysql.createConnection({
  host: 'localhost',
  database: config.mysql.database,
  user: config.mysql.user,
  password: config.mysql.password
});
connection.connect();
var express = require('express');
var app = express();
app.configure(function() {
  app.use(express.cookieParser());
  app.use(express.bodyParser());
});
app.get("/posts", function(req, res) {
  connection.query(
    "SELECT ID, post_title, post_name, post_date " +
      "FROM wp_posts " +
      "WHERE post_status = 'PUBLISH' AND post_type = 'post' " +
      "ORDER BY post_date DESC",
    function(err, rows, fields) {
      if(err) throw err;
      res.send(_.map(rows, function(row) {
        return {
          id: row.ID,
          name: row.post_name,
          published: row.post_date,
          title: row.post_title
        }
      }));
    });
});
app.listen(config.port);
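The tree-building step on the client is not shown here, so the following is a sketch of how it might look rather than the code running on this blog. buildPostTree and findNode are hypothetical names, published is assumed to arrive as an ISO date string, and the month is derived in UTC to keep the example deterministic.

```javascript
// Build an in-memory /YYYY/MM/Title tree from the post list returned by /posts.
function buildPostTree(posts) {
  const root = { name: '', path: '/', children: new Map() };
  posts.forEach(function (post) {
    const date = new Date(post.published);
    const year = String(date.getUTCFullYear());
    const month = String(date.getUTCMonth() + 1).padStart(2, '0');
    let node = root;
    [year, month, post.name].forEach(function (part) {
      if (!node.children.has(part)) {
        const parentPath = node === root ? '' : node.path;
        node.children.set(part, {
          name: part,
          path: parentPath + '/' + part,
          children: new Map()
        });
      }
      node = node.children.get(part);
    });
    node.post = post; // leaf nodes carry the article itself
  });
  return root;
}

// Resolve an absolute path such as "/2010/06" to its node, or undefined.
function findNode(root, path) {
  return path.split('/').filter(Boolean).reduce(function (node, part) {
    return node ? node.children.get(part) : undefined;
  }, root);
}
```

A lookup like findNode is the kind of thing the path handling hooks can be built on, since every node knows both its path and its children.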

Implementing unix filesystem style navigation in the console is as simple as adding Josh.PathHandler to the shell and providing implementations of getNode(path, callback) and getChildNodes(node, pathParts, callback). These two functions are responsible for finding nodes by path and finding a node's children, respectively, which is the plumbing required for pwd, ls, cd and TAB completion of paths. The posts command is even simpler: Since it only takes an optional numeric argument for the page to show, there is no completion handler to implement. The execution handler simply does a slice on the array of articles to get the desired "page", uses underscore.map to first transform the data into the viewmodel and then renders it with underscore.template:

_shell.setCommandHandler("posts", {
  exec: function(cmd, args, callback) {
    var arg = args[0];
    var page = parseInt(arg, 10) || 1; // default to the first page when no number is given
    var pages = Math.ceil(_posts.length / 10);
    var start = (page - 1) * 10;
    var posts = _posts.slice(start, start + 10);
    _console.log(posts);
    callback(_shell.templates.posts({
      posts: _.map(posts, function(post) {
        return {id: post.id, date: formatDate(post.published), title: post.title}
      }),
      page: page,
      pages: pages
    }));
  }
});

The final addition is the go command which acts either on the current "directory" or the provided path. Since any argument is a path, go gets to re-use Josh.PathHandler.pathCompletionHandler which ls and cd use.

_shell.setCommandHandler('go', {
  exec: function(cmd, args, callback) {
    return _pathhandler.getNode(args[0], function(node) {
      if(!node) {
        callback(_shell.templates.not_found({cmd: 'go', path: args[0]}));
      } else {
        root.location = '/geek/blog'+node.path;
      }
    });
  },
  completion: _pathhandler.pathCompletionHandler
});

Once called, the appropriate node is resolved and the path of the node used to change the window location. The console uses localStorage to track the open state of the console, so that navigating to a new page re-opens the console as appropriate while the page location is used to initialize the current directory in the console. The help, clear and history commands come for free.

Oh, and then there is wat? as a go shortcut to get to this article :)

What's next?

Go ahead, play with it, and let me know whether there are assumptions built in that prevent your favorite console scenario. For reference and those interested in the nitty gritty, the full, annotated source for the console I use on this blog can be found here. Most of the lift is done by Josh.js, and it's fairly simple to get your own console going. Check out the documentation here along with the tutorials that walk through some usage scenarios.

Josh.js is certainly ready for use and deployment, but until it's been put through its paces a bit more to figure out what's working and what's not, the API is still open for significant changes. I'm tracking things I am planning to do or working on via GitHub Issues. Please submit any problems there as well, or even better, provide a pull request. I hope my love for CLIs on websites will inspire others to do the same for their applications. I certainly will endeavour to include a shell in every web project I undertake for the foreseeable future.

The problem with Frameworks

Over the years, I've developed a dislike for frameworks, especially ORMs and web stacks such as Rails. But aside from complaining about "magic" and a vague icky feeling, I could never eloquently explain why. Meanwhile, all my spare time web projects have been done with Express and, man, I haven't had more fun in years, again without being able to eloquently explain why.

So when it came time to pick a stack for a Scala web app, I was staring at a sea of options. Play seemed to be the obvious choice, being part of the TypeSafe stack, but looking at their documentation I got that icky feeling again. I searched for some vs. posts to hear what people like/dislike about the many options available. And that's where I came across a paragraph that was the best explanation of why I stay clear of frameworks when I can:

"[...] frameworks are fine, but it is incumbent upon you to have an intimate understanding of how such frameworks abstract over this infrastructure, as well as the infrastructure itself. Some will disagree with that, arguing that the very reason you want these abstractions is so you don't need this understanding. If you buy that then I wish you luck; you'll surely need it should you decide to develop a non-trivial application."

                                             -- [Scalatra vs Unfiltered vs Lift vs Play](http://www.scala-lang.org/node/10817#comment-46844)

Frameworks mean you have to know more, not less, about what you are doing

It all comes down to the fallacy that an abstraction alleviates the need to understand what it abstracts. Sure, for the scenarios where what you are trying to do maps 100% to the target scenarios of the framework and you have no bugs, frameworks can really cut down on that boilerplate. This is why scaffolding examples of most frameworks are so magically concise. They exercise exactly what the framework was built to be best at. But the moment you step out of their wheelhouse or need to troubleshoot something (even if the bug is in your own code), you'll have to not only know how the abstracted system should work but also how the abstraction works, giving you two domains to master rather than just one.

The ORM example

I used to be a huge proponent of ORMs. I even wrote two and a half of them. My original love for them stemmed from believing that I could curb bad SQL getting into production by providing an abstraction. I was violently disabused of that notion and have since come to accept that badly written code isn't a structural, but rather a cultural problem of developers either not understanding what is bad or not caring because they are insulated from the pain their code causes.

But I was still convinced that ORMs were an inherent win. I went on to evangelize NHibernate, Linq-2-SQL, LLBLGen Pro. Entity Framework never got a chance, since I was already burned out by the time it stopped sucking.

Over time it became obvious ORMs were not saving me time or preventing mistakes. Quite the opposite: setup became increasingly complex, as you had to design schemas AND write mapping code to get them to work right. Usage went the same way: I knew what SQL I wanted to execute and now had a new dance of tweaking the ORM queries to do what I wanted. I was an abstraction whisperer. And that's not even counting the hours wasted with bugs. Debugging through the veil of abstraction usually ended up showing not an error in business logic or data structure but simply in usage or configuration of the ORM.

I had resisted it for several years, but finally had to admit that Ted Neward was right: ORMs are the Vietnam of Computer Science.

But... DRY, damnit!

One of the attractions of frameworks is that they drastically cut down on repetitive boilerplate code. I admit that using component libraries, you will likely do more manual wiring in your initial setup than with a framework, but this is generally a one-time setup per project. In my experience that one-time cost of explicit wire-up pales in comparison to the benefits of having explicit configuration that you can trace down when problems do occur. So my stance is that wire-up isn't repeated boilerplate but rather explicit specification and so does not violate the spirit of DRY.

For the most part I favor components that simplify working in an infrastructure's domain language over frameworks that try to hide that domain and expose it as a different one. Sooner or later you will have to understand what's happening under the hood and when that time comes, having a collection of helpers for working with the native paradigm beats trying to diagnose the interactions of two (or more) domains.

I fell victim to one of the classic blunders...

Nope, I didn't get involved in a land war in Asia, but I did let yet another exciting thought exercise trick me into picking up a coding project I don't have time for. Happenstance was supposed to be just a reflection on whether the perceived benefits of central control could be achieved in a decentralized, distributed and federated status network.

And boy, is designing the next twitter the new hawtness right now. But while the exercise was quite interesting and I have answered the question of "is it possible" to my own satisfaction, I just don't have the energy or clout to implement and evangelize it given the other projects I'm already on, which are giving me the evil eye over this distraction. So, rather than letting it linger, I'm just gonna admit that I'm going to stop working on the reference implementation of Happenstance.

What to take away from Happenstance

Before I turn my attention back to the work I've been neglecting, I do want to touch on the lessons I feel I took away from the exercise.

Subscription isn't a problem

Following is already well established in RSS, and PuSH solves the delivery of content in a near-realtime fashion. OStatus does a fine job of formalizing the existing infrastructure in a way to track other people's updates. And a feed having a URI gives it a canonical name, so to the most basic extent the question of origin is solved as well. You could make the claim that this solves all the issues of distributed, federated and decentralized. But it really doesn't, it just tells you where to get things.

Trusted delivery of adhoc messaging is the problem

What makes twitter, and by extension every walled garden, valuable is not the publishing of content, but the trusted aggregation of content that you are interested in. This is both the subscription of feeds and the delivery of what's generally called mentions now.

Mentions are like emails you are CC'ed on by virtue of your name occurring in the message. But unlike email, the keeper of the walled garden instills the trust that we've all lost in email authenticity due to the ease of faking the origin. Sure, Twitter has a spam problem, but you never doubt the origin of the message. You just don't care about certain messages arriving at all.

The way we read our timeline, we don't wonder if the content is faked; we trust that what is in our timeline is at least from the claimed sources. With aggregating distributed feeds you do trust that you only get messages that came from a feed you've personally validated, but the moment you add the capability to receive adhoc messages, every message has to become suspect.

Signed content

This is why I spent more time on the message-, name server- and meta-data signing than anything else on Happenstance. In the end, you are accumulating a store of messages that represents your own copy of messages passed around that may or may not have come directly from a trusted canonical uri. And for that network to be as valuable as twitter, the ability for your reader to authenticate the content as originating from where it claims to come from will make the difference between a valuable resource and the next spam haven that people abandon for the safety of the walled gardens.

And that's the one thing I've not seen anyone else address: Everyone is talking about all these competitors to twitter being able to interchange messages, which of course they must since none can rival the reach twitter has on its own. But none of the existing and proposed mechanisms address the issue of trust in content of a remote origin. Hopefully my exposition of the Happenstance design will at least inspire the emerging standard for status distribution to avoid the pitfalls that turned email into more spam than actual discourse.

Key management strategies for Happenstance

Fundamental to the design of Happenstance is the idea that everything you publish, messages and meta-data, is cryptographically signed. The need for the signature stems from the distributed nature of the system. I.e. as messages are published and copied to any number of aggregators to be included in any number of feeds, there needs to be a way to know that the message you are reading came from the claimed author, without having to read all messages directly on the author's feed.

The challenge this brings with it, especially when striving for consumer adoption, is that the goals of accessibility/usability and security can be at odds. For this reason, I've designed (and am currently refining/revising) the use of public key infrastructure (PKI) in such a fashion that private key compromise does not fatally affect the integrity of an author's identity.

A full explanation of how content is signed can be found in the Signing Spec. The short version is that feeds provide RSA public keys, and all message blocks are signed with an RSA private key in base64-encoded RSA-SHA256 format creating a _sig key in the message block like this:

{
  _sig: {
    name: 'key2',
    sig: 'xfZ4DmrcLbz8qPJoTwYg ...'
  },
  ...
}
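
To make that a bit more concrete, here is a rough Node.js sketch of producing and checking such a _sig entry. It is an illustration under assumptions, not the reference implementation: the exact bytes that get signed are defined by the Signing Spec, so serializing the block with JSON.stringify (minus _sig) is only a stand-in, and the names payloadOf, signBlock and verifyBlock are hypothetical.

```javascript
var crypto = require('crypto');

// ASSUMPTION: the real canonical form is defined by the Signing Spec.
// Here the block is simply serialized without its _sig entry, with
// top-level keys sorted so signer and verifier agree on the bytes.
function payloadOf(block) {
  var copy = {};
  Object.keys(block).sort().forEach(function(k) {
    if (k !== '_sig') copy[k] = block[k];
  });
  return JSON.stringify(copy);
}

// Returns a copy of the block with a _sig entry naming the key used.
function signBlock(block, keyName, privateKeyPem) {
  var payload = payloadOf(block);
  var signer = crypto.createSign('RSA-SHA256');
  signer.update(payload);
  var signed = JSON.parse(payload);
  signed._sig = {name: keyName, sig: signer.sign(privateKeyPem, 'base64')};
  return signed;
}

// Looks up the named public key from the feed and checks the signature.
function verifyBlock(block, publicKeys) {
  var name = block._sig && block._sig.name;
  if (!name || !Object.prototype.hasOwnProperty.call(publicKeys, name)) {
    return false; // unsigned, or signed with a key the feed does not list
  }
  var verifier = crypto.createVerify('RSA-SHA256');
  verifier.update(payloadOf(block));
  return verifier.verify(publicKeys[name].key, block._sig.sig, 'base64');
}

// Example with a throwaway key pair
var pair = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
  publicKeyEncoding: {type: 'spki', format: 'pem'},
  privateKeyEncoding: {type: 'pkcs8', format: 'pem'}
});
var feedKeys = {key2: {key: pair.publicKey}};
var signedMessage = signBlock({text: 'hello'}, 'key2', pair.privateKey);
```

Calling verifyBlock(signedMessage, feedKeys) checks the message against whatever public keys the feed currently lists, which is what an aggregator would do for every incoming message.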

The pitfalls of signing content in a service environment

In order for content to be signed, a private key has to be used. If anyone else gets their hands on your private key, they could fake messages appearing to come from you. While the fakes won't hold up to scrutiny, as I will explain below, it's still a trust issue. So the private key needs to be protected. Before I get into how a compromised key can be dealt with, let me explain why PKI for Happenstance may violate normal best practices for dealing with private keys, creating greater risk of compromise.

Ideally only you would have your private key, preferably encrypted and only decrypted in memory for the purpose of signing. That implies that all authoring has to happen client side. For some clients and users that is certainly feasible, but for non-technical users used to web applications the added level of complexity may be too cumbersome. These users have come to expect the experience provided by walled garden services with their implicit trust via authority (not that that trust can't be and hasn't been compromised).

Using a service to author and sign your messages will expose your private key to attack vectors one way or another. But since a hosted authoring experience can remove the complexity of key management, I've tried to enumerate some approaches to reduce the dangers, while simultaneously designing the ecosystem to survive the compromise of keys.

Service stores user encrypted key, sends it to the client for signing

Note that all strategies assume that communication with the host always happens over HTTPS.

The first option keeps the key encrypted by the user's passphrase at the server and sends it to the client at authoring time. The client side application can then prompt the user to decrypt it, sign content and send the signed content back. Basically, the authoring experience is implemented on the client, but hides the complexity of key management.

Drawbacks:

  • Need the password for every message (could cache in memory, which has dangers of its own)
  • Need a client that can handle encrypted RSA keys and can sign with them. At the time of this writing, I've found JavaScript for signing but only via unencrypted keys. I'm sure this could be addressed.
  • The encrypted key is frequently sent over the wire and its use by the client application may be exploited
  • If the service is compromised, the passphrase for each key stored at the server must be cracked, which isn't outside the realm of processing power these days.

It should be noted that this is the only approach in which the service provider might be considered untrusted, since they never get the passphrase. However they still have the key, so they could crack it or compromise the client authoring experience (since they provide it) to capture the key, so the service provider still needs to be trusted.

Service stores user encrypted key, requires password at signing

This approach moves authoring to the server, which means that the passphrase to decrypt the key must be sent to the server. This makes the client a lot thinner. Fewer moving parts, less chance for errors or attack vectors on the client, and once again the complexity of key management is hidden.

Drawbacks:

  • Need the password for every message (server could cache in memory, again introducing new dangers)
  • Passphrase is frequently sent over the wire
  • If the service is compromised, the passphrase for each key stored at the server must be cracked, which isn't outside the realm of processing power these days
  • Even worse, if the service is compromised, it could be made to capture the passphrases as well

You might think you could just use the password the user uses for login as the key passphrase, but in a properly designed system, the login password is stored in a one-way hashed form, i.e. the service has no way to "decrypt" the password, it just hashes the incoming password and compares hashes. The private key, however, even though it is part of an asymmetric encryption scheme, is itself encrypted symmetrically, i.e. you need the same passphrase you used to encrypt the private key to decrypt it in order to use it. So the key's passphrase must be available for every signing event.
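
The difference between the two kinds of secrets can be sketched with Node's crypto module. This is an illustration only and nothing here is prescribed by Happenstance: the choice of scrypt for the login hash, AES-256-CBC for encrypting the key, the passphrases and the function names are all assumptions made for the example.

```javascript
var crypto = require('crypto');

// Login password: only a salted one-way hash is stored. The service can
// re-hash an incoming password and compare, but never recover the original.
var salt = crypto.randomBytes(16);
var storedHash = crypto.scryptSync('login-secret', salt, 32);

function verifyPassword(password) {
  var candidate = crypto.scryptSync(password, salt, 32);
  return crypto.timingSafeEqual(candidate, storedHash);
}

// Private key: stored encrypted with a symmetric cipher, so the very same
// passphrase has to be supplied again each time the key is used.
var pair = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
  publicKeyEncoding: {type: 'spki', format: 'pem'},
  privateKeyEncoding: {
    type: 'pkcs8',
    format: 'pem',
    cipher: 'aes-256-cbc',
    passphrase: 'key-passphrase'
  }
});

function signWithPassphrase(text, passphrase) {
  var signer = crypto.createSign('RSA-SHA256');
  signer.update(text);
  // Throws if the passphrase cannot decrypt the stored key
  return signer.sign({key: pair.privateKey, passphrase: passphrase}, 'base64');
}
```

Nothing stored for the login (salt and hash) is of any help in decrypting pair.privateKey, which is why the passphrase has to travel with every signing request in this approach.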

Service stores key without user encryption, signs on behalf of user without additional authentication

This final approach is the most user friendly but wrests the control over the private key from the user. The server keeps the key in a form in which it can use it to sign messages on the behalf of the user without any additional authentication. The entire key management mechanism is hidden.

Drawbacks:

  • Complete reliance on the service to protect the key in case of service compromise

This one seems to require more trust in the service provider, but in reality each version is vulnerable if the service is a bad actor or compromised. However, this last method removes all need for the private key to ever be communicated between client and server. While the security of the key is now completely in the hands of the service provider, much like credit card storage, there are established practices for securely storing and using such data, such as via some form of one-way vault (usually a dedicated server not on the public network that accepts keys and can sign messages, but has no mechanism for retrieving the keys). Given that Happenstance allows for multiple keys and key expiration, a service provider that never divulges the private key to the user is likely to be the most secure and convenient approach.
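
The shape of such a one-way vault can be hinted at in a few lines. This is only an in-process analogy of what would really be a separate, locked-down server, and the names createSigningVault, store and sign are hypothetical; the point is simply that the interface offers a way to put keys in and to sign with them, but no way to read them back.

```javascript
var crypto = require('crypto');

// Hypothetical in-process stand-in for a dedicated signing server.
function createSigningVault() {
  var keys = new Map(); // only reachable from inside this closure

  return {
    // Accept a private key for a user; there is deliberately no getter
    store: function(userId, privateKeyPem) {
      keys.set(userId, privateKeyPem);
    },
    // Sign on behalf of a user and hand back only the signature
    sign: function(userId, text) {
      if (!keys.has(userId)) {
        throw new Error('no key stored for ' + userId);
      }
      var signer = crypto.createSign('RSA-SHA256');
      signer.update(text);
      return signer.sign(keys.get(userId), 'base64');
    }
  };
}
```

A caller holding the vault object can request signatures, but cannot export the key material, which mirrors the restriction the dedicated server would enforce at the network level.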

Mitigating the effects of compromised keys

Since all methods (even you storing your own key and never giving it out) can be compromised, the focus really should be on mitigating the effects a compromise can have. Since Happenstance does not actually encrypt data to hide it from prying eyes, but only uses keys to authenticate data in the clear, a compromised key's only use is for impersonation.

A compromised key could be used to push fake posts into the ecosystem in one of two ways: Create mentions to deliver messages directly to users impersonating the owner of the key, or by "re-posting" a faked post to their followers. Fortunately, a compromised key does not mean that the user has lost control over their feed and the user could simply generate a new key and remove the existing one from their feed meta data. This way any aggregator would immediately detect it as a fake when validating incoming messages.

The side-effect of key revocation is that all messages currently in flight and older messages being re-checked would also become invalid. To guard against this, a message failing validation should be re-fetched via its canonical uri and be re-validated. This way, upon removing a compromised key, all previously signed messages can be re-signed with a new key.
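
The validate, re-fetch, re-validate sequence could be sketched like this. It is a minimal illustration under assumptions: isValid stands in for the real signature check, refetch for an HTTP request against the message's canonical uri, and the uri field name is made up for the example.

```javascript
// Hypothetical sketch of aggregator-side validation.
//   isValid(message)       -> true if the signature checks out against the
//                             keys currently listed in the author's feed
//   refetch(uri, callback) -> retrieves the message from its canonical uri,
//                             calling back with the message or null
function validateMessage(message, isValid, refetch, callback) {
  if (isValid(message)) {
    return callback(message);
  }
  // The signature failed. The key may have been revoked and the message
  // re-signed since this copy was made, so ask the canonical source.
  refetch(message.uri, function(fresh) {
    // Accept only a fresh copy that validates; otherwise treat as fake
    callback(fresh && isValid(fresh) ? fresh : null);
  });
}
```

A faked message never exists at the author's canonical uri, so it fails both checks, while a legitimate message that was re-signed after a revocation passes on the second attempt.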

Multiple keys and dealing with loss of access to a key

Because loss of access to a key or compromise through personal or service provider negligence are realities that cannot be fully eliminated, it is important that Happenstance messaging can mitigate the loss of a key without compromising identity.

Compromise isn't the only reason an author might have for discontinuing the use of a key. A service provider might never have given out the private key (as suggested above). Or a service provider went out of business and the status of their key storage is unknown. Or the author forgot their passphrase, or lost the only backup of a key they used in a client-side authoring environment, etc.

In addition a user may have need for using multiple different keys due to using multiple authoring environments with different private key strategies. To support multiple keys, public keys are represented in the feed meta data like this:

{
  ...
  public_keys: {
    key2: {
      key: '-----BEGIN PUBLIC KEY----- ...'
    },
    key1: {
      key: '-----BEGIN PUBLIC KEY----- ...',
      expired: '2012-07-30T11:31:00Z'
    }
  },
  ...
}

Revocation of a key is simply the removal of the key from the author meta-data public_keys hash and re-signing all content signed by that key. Alternatively, a key can be expired by giving it an expired timestamp, meaning that any message signed by that key after that time should no longer be trusted.

Expiration is best for keys that are no longer accessible, rather than compromised, since faked messages with old timestamps could be created.
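
The lookup rules for removed and expired keys can be expressed as a small check. This is a sketch under assumptions: the function name isKeyUsable is hypothetical, and signedAt stands for whatever timestamp the message carries, which is author-supplied and therefore exactly why expiration alone does not protect against a compromised key.

```javascript
// Decide whether a signature made with keyName at signedAt may be trusted,
// given the public_keys hash from the author's feed meta data.
function isKeyUsable(publicKeys, keyName, signedAt) {
  if (!Object.prototype.hasOwnProperty.call(publicKeys, keyName)) {
    return false; // key was removed (revoked) or never existed
  }
  var entry = publicKeys[keyName];
  if (!entry.expired) {
    return true; // key is still active
  }
  // Expired key: only messages signed up to the expiration time count
  return new Date(signedAt).getTime() <= new Date(entry.expired).getTime();
}

// The meta data from the example above
var publicKeys = {
  key2: {key: '-----BEGIN PUBLIC KEY----- ...'},
  key1: {key: '-----BEGIN PUBLIC KEY----- ...', expired: '2012-07-30T11:31:00Z'}
};
```

With this data, a message signed by key1 before July 30th, 2012 is still accepted, one signed by key1 afterwards is not, and anything claiming a key that is no longer listed is rejected outright.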

You seriously want me to treat my keys as expendable?

Hopefully those better versed and more serious about the application of PKI haven't stopped reading before this. I know that as a PKI practice it does go against established norms for guarding private keys.

So to answer the above question: No, I do not. Happenstance is a spec and the implementation details are completely open. I'm simply providing strategies that I personally consider secure enough and which do not hinder broad user adoption. But it does not preclude anyone from creating authoring environments that only work on local, encrypted keys that are never shared. That's the point of Happenstance being broken up into small interoperating parts rather than being a large platform with lots of hard API requirements. Let the ecosystem and users decide which implementation strategy best suits their comfort level.

The problem with the Benevolent Dictator

Just 2 days left before app.net either funds or fades away. It's pretty close, so a last minute push might do it. But regardless of funding, they will have a tough road ahead, since weaning the internet off the "free" user-as-product paradigm has been a battle since the first .com boom where customer acquisition costs of over $1000/user with no revenue model were considered successful. I've signed up with app.net at the developer level and wish for them to succeed, but am afraid that it will either be a niche product or an interesting experiment.

Meanwhile MG Siegler has a somewhat more pessimistic assessment, i.e. either app.net will die because it sticks to its principles or succumb to the reality that being successful corrupts. If history is to be any indicator, his prognosis is a fairly safe one to make. And it goes to the heart of the problem of tackling something as fundamental as status network plumbing via a single vendor. As long as there is central control, the success of the controlling entity is likely to cause their best interest and their users' best interest to drift apart over time.

The annoying thing about this is that there doesn't have to be and neither should there be central control. Do you think that blogs would ever have been as big as they are now, if blogger.com or wordpress.com had been the sole vendor controlling the ecosystem and you could only blog or read a blog by having an account on their system? There is no owner to RSS or Atom, they're just specifications that address a simple but common pain of trying to syndicate content. Yet despite this lack of a benevolent dictator owning the space, a lot of companies were able to spring up and become very successful, without ever having to sign up and get a developer key.

And that is why I am working on the Happenstance specification. Mind you, I don't have the clout of Dalton Caldwell, but most of the plumbing we now take for granted did not originate from established entrepreneurs in our space, so I hope I will hit a nerve here with Happenstance that can get some traction. My aim is to design and implement something simple, useful and expandable enough that adopting it is a tiny lift and opens opportunities. The specification doesn't force any revenue model, so if implementors think they are better served by ad or subscription or some other model, they can all co-exist and benefit from the content being posted. I'm gonna keep plodding away at Happenstance and see whether the web is interested in being in control of their message feeds and being able to innovate on top of the basic feed without asking permission from a Benevolent Dictator.