Protecting your CouchDB Views

If you work with SQL or another relational database you most likely have your schema backed up somewhere under source control.  Maybe it’s a bunch of SQL scripts, maybe it’s the classes from which you generated your Entity Framework schema, but you almost certainly have some way of restoring your DB schema into a new database (at least I hope that you do!).

But what about CouchDB?

CouchDB, as anyone who has read the first sentence of a beginner’s guide will know, is a non-relational database and so it does not have a schema.  All of the data is stored as arbitrary JSON documents which can (and do) contain data in a wide range of formats.

The problem is that whilst there is no schema to “restore” into a new database, there is another very important construct: views.

CouchDB Views

Views within CouchDB define how you query the data.  Sure, you can always fall back to basic ID-lookup to retrieve documents, but as soon as you want to do any form of complicated (i.e. useful) querying then you will most likely need to create a view.

Each view comprises two JavaScript functions: a map function and an optional reduce function.  I don’t want to go into a lot of detail on the map-reduce algorithm or how CouchDB views work under the covers (there are plenty of other resources out there), but the important point here is that you have to write some code that will play a very significant role in how your application behaves – and that code should be in source control somewhere!
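
For example (a contrived one, purely for illustration), a view that counts documents by type needs nothing more than a one-line map function and CouchDB’s built-in _count reduce:

"views": {
	"countByType": {
		"map": "function (doc) { emit(doc.type, null); }",
		"reduce": "_count"
	}
}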

Storing Views in Source Control

In order to put our view code under source control we first need to get it into a format that can be saved to disk.  In CouchDB, views are stored in design documents and the design documents are stored as JSON, so we can get a serialized copy of the view definitions by just GETting the design document from CouchDB:

curl http://localhost:5984/databaseName/_design/designDocumentName
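
If you don’t have a pretty-printer to hand, one easy option is to pipe the output through Python’s built-in json.tool module:

curl http://localhost:5984/databaseName/_design/designDocumentName | python -m json.tool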

Pretty-printed, the contents of the design document appear as a JSON structure:

{
   "_id": "_design/designDocumentName",
   "_rev": "1-47b20721ccd032b984d3d46f61fa94a8",
   "views": {
       "viewName": {
           "map": "function (doc) {\r\n\t\t\t\tif (doc.value1 === 1) {\r\n\t\t\t\t\temit(\"one\", null);\r\n\t\t\t\t} else {\r\n\t\t\t\t\temit(\"other\", {\r\n\t\t\t\t\t\tother: doc.value1\r\n\t\t\t\t\t});\r\n\t\t\t\t}\r\n\t\t\t}"
        }
   },
   "filters": {
       "filterName": "function () {}"
   }
}

This is, at least, a serialized representation of the source for our view, and there are definitely some advantages to using this approach.  On the other hand, there are quite a few things wrong with using this structure in source control:

Unnecessary Data
The purpose of this exercise is to make sure that the view code is safely recoverable.  Whilst there is debatably some use in storing the ID, the revision (_rev) field describes the state of a particular database, will vary between installations, and shouldn’t be stored at all.

Functions as Strings
The biggest problem with this approach is that the map, reduce and filter functions are stored as strings.  You may be able to put up with this in simple examples, but as soon as the functions contain any complexity (or indentation, as seen above) they become completely unreadable: tabs and newlines are all concatenated into one huge several-hundred-character string, all stored on one line.  Whilst this is not a technical issue (you could still use these strings to restore the views), it makes change tracking impossible to follow – every change is on the same line!

As well as the readability issues, we also lose the ability to perform any kind of analysis on the view code.  Whether that is static analysis (such as jsLint), unit testing or something else, we cannot run any of it against a string.

An Alternative Format

Instead of taking a dump of the design documents directly from CouchDB, I would recommend using an alternative format geared towards readability and testability.  You could be pretty creative in exactly how you wanted to lay this out (one file per design document, one file per view…) but I have found that the structure below seems to work quite well:

exports.designDocumentName = {
	views: {
		viewName: {
			map: function (doc) {
				//some obviously-fake logic for demo purposes
				if (doc.value1 === 1) {
					emit("one", null);
				} else {
					emit("other", {
						other: doc.value1
					});
				}
			}
		}
	},
	filters: {
		filterName: function () { }
	}
};

exports.secondDesignDocument = {
	//...
};

This has several advantages over the original format:

  • It is much easier to read!  You get syntax highlighting, proper indentation and the other wonderful features of your favourite code editor
  • There is no redundant information stored
  • jsLint/jsHint can easily be configured to validate the functions
  • By using the CommonJS exports object, the code is available to unit tests and other utilities (more on that below)
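
To illustrate that last point, a short node script can require the definitions file and exercise a map function directly, with a fake emit function standing in for CouchDB.  This is only a sketch, and assumes the definitions above are saved as view-definitions.js:

var assert = require("assert");
var designDocs = require("./view-definitions");

//capture the calls that the map function makes to emit()
var emitted = [];
global.emit = function (key, value) {
	emitted.push({ key: key, value: value });
};

//run the map function against a fake document
designDocs.designDocumentName.views.viewName.map({ value1: 1 });

//...and check that it emitted what we expected
assert.equal(emitted.length, 1);
assert.equal(emitted[0].key, "one");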

There is one significant disadvantage though: because I have pulled this structure out of thin air, CouchDB has no way of understanding it.  This means that whilst my view code is safe and sound under source control I have no way of restoring it.  At least with the original document-dump approach I could manually copy/paste the contents of each design document into the database!

So how can we deal with that?

Restoring Views

As I mentioned above, one of the advantages of attaching the design documents to the CommonJS exports object is that they can be consumed by node utilities very easily.  To demonstrate this I have created a simple tool that can create or update design documents from a file such as the one above in a single command: view-builder.

You can see the source for the command on GitHub or you can install it using NPM.

npm install -g view-builder

After installation you can run the tool like this:

view-builder --url http://localhost:5984/databasename --defs ./view-definitions.js

This will go through the definitions and for each of the design documents…

  1. Download the latest version of the design document from the server
  2. Create a new design document if none already exists
  3. Compare each view and filter to identify any changes
  4. If changes are present, update the version on the server

The comparison is an important step in this workflow – updating a design document will cause CouchDB to rebuild all of the views within it; if you have a lot of data then this can be a very slow process!
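
Under the covers, any tool like this has to convert the exports-style definitions back into the JSON structure that CouchDB expects, which essentially means calling toString() on each function.  A minimal sketch of that conversion – a hypothetical helper, not the actual view-builder source – might look like this:

//convert one exports-style definition into a CouchDB design document,
//serializing each function back into a string
function toCouchFormat(name, definition) {
	var doc = { _id: "_design/" + name, views: {}, filters: {} };

	Object.keys(definition.views || {}).forEach(function (viewName) {
		var view = definition.views[viewName];
		doc.views[viewName] = { map: view.map.toString() };
		if (view.reduce) {
			doc.views[viewName].reduce = view.reduce.toString();
		}
	});

	Object.keys(definition.filters || {}).forEach(function (filterName) {
		doc.filters[filterName] = definition.filters[filterName].toString();
	});

	return doc;
}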

Now we have a human-readable design document definition that can be source-controlled, unit tested and then automatically applied to any database to which we have access.  Not bad…

Other Approaches

Whilst this system works for me, I can’t imagine that I am the first person to have considered this problem.  How is everyone else protecting their views?  Any suggestions or improvements in the comments are always welcome!

Single Page Applications using Node & Knockout

This post is going to be a short walkthrough on how to use Node and KnockoutJS to create a simple single page application.

What is a Single Page Application?

…a web application or web site that fits on a single web page with the goal of providing a more fluid user experience akin to a desktop application

That’s according to Wikipedia.  For the purposes of this post, a single page application (or SPA) will mean a web application for which we only want to serve up one HTML page.

That page will then link to a couple of javascript files which, in concert with a templating engine, will create and manipulate the content of the page.  All communication with the server will be through AJAX, and will only ever transfer JSON data – no UI content.

We will be using node to serve the page (plus scripts, styles, etc.) and to handle the API calls, while knockout will provide us with the client-side interaction and templating.

Serving the Single Page

First up: we need to configure node to serve up our single HTML page:

<html>
	<body>
		<h1>I'm a single page application!</h1>
	</body>
</html>

We’ll be adding more to that later, but let’s get node working first.

Express

We’re going to be using expressjs to implement the more interesting API calls later on, but we can make use of it here to serve up a static file for us as well.  To install express, use the node package manager by running the command below:

npm install express

Now we need to create a javascript file – app.js – to run in node.  This file will grab a reference to express using the require function and will start listening on port 3000.

var express = require("express"),
	app = express();

//start listening on port 3000
app.listen(3000);

Let’s see what happens when we run this.  In a command prompt, browse to the folder containing app.js and enter the command below to start node.

node app.js

Next, open your favourite browser and navigate to http://localhost:3000/index.html.  You should see something like this:

[screenshot: “Cannot GET /index.html” error in the browser]

This is express telling us that it cannot resolve the URL “/index.html”, which isn’t unreasonable – we haven’t told it how to yet.  We want express to respond with the contents of static files from the current folder (eventually including our styles and javascript), so let’s get that set up.

We do this in the app.configure method (before we call app.listen) using the express.static middleware and the current application folder (stored in the special __dirname node variable).

app.configure(function() {
	//tell express to serve static files from the special
	//node variable __dirname which contains the current
	//folder
	app.use(express.static(__dirname));
});

Restart the node application, refresh the browser and you should now see the content from our single page:

[screenshot: the single page content served successfully]

Conveniently, express will automatically return index.html if you don’t specify a file name, so we can get the same response from http://localhost:3000/

Creating the Page

The next step is to start building up content in the page.  We are going to need a few javascript resources – jQuery for the AJAX calls, Knockout for the view model – and I’m going to include my command pattern implementation to help with the actions.

For the page itself I’m going to pull in a page.js to contain our page-specific code, and we should probably include a stylesheet as I can’t stand Times New Roman.

Our HTML page now looks like this:

<html>
	<head>
		<title>SPA Example</title>
		<link rel="stylesheet" href="spa.css" />
	</head>
	<body>
		<h1>I'm a single page application</h1>
	</body>
	<script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.9.0.min.js"></script>
	<script src="http://ajax.aspnetcdn.com/ajax/knockout/knockout-2.2.1.js"></script>
	<script src="https://raw.github.com/stevegreatrex/JsUtils/master/JsUtils/utils.min.js"></script>
	<script src="page.js"></script>
</html>

I’m using a CDN for jQuery and knockout, and I’m pulling my command implementation directly from GitHub (sorry GitHub!).  I’m assuming that both spa.css and page.js are in the same folder as index.html.

Refresh the browser again (no need to restart node this time) and…

[screenshot: the styled page]

Much better!

Creating the View Model

As this is just a sample application I don’t want to get too distracted by the view model – the purpose of this post is to demonstrate the end-to-end flow rather than to focus on specific functionality.  With that in mind, let’s use the example functionality of a basic todo app (as that seems to be the thing to do).

Our view model will start off with a list of todo items which we will store in a knockout observableArray.  Each todo item will have a name and a complete flag.

For the time being, we’ll bootstrap the collection with a few sample items.

var TodoViewModel = function(data) {
	this.name = ko.observable(data.name);
	this.complete = ko.observable(data.complete);
};

var TodoListViewModel = function() {
	this.todoItems = ko.observableArray();
};

$(function() {
	var viewModel = new TodoListViewModel();

	//insert some fake todo items for now...
	viewModel.todoItems.push(new TodoViewModel({ name: "Pending Item", complete: false }));
	viewModel.todoItems.push(new TodoViewModel({ name: "Completed Item", complete: true }));

	ko.applyBindings(viewModel);
});

The view model is now being populated but there’s still nothing to see in our view – we need to add some HTML and start working with the knockout templating engine to get things to display.

Displaying Items using Knockout Templating

With knockout, the UI is data bound to the view model in order to generate HTML.  http://knockoutjs.com/ has a wealth of documentation and examples on how to achieve this, but for this example we are going to use three bindings: foreach to iterate through each of the todo list items; text to display the name; and checked to display the completed state.

<ul data-bind="foreach: todoItems">
	<li>
		<span data-bind="text: name"></span>
		<input type="checkbox" data-bind="checked: complete" />
	</li>
</ul>

Refresh the page in a browser and you should now see something like this:

[screenshot: the two fake todo items rendered in the list]

We now have the text and the completed state of our two fake todo items.  That’s all well and good, but what about when you want to get real data from the server?

Getting Real Data from the Server

In a single page application, data is acquired from the server using AJAX calls and our example today will be no different.  Unfortunately, our server doesn’t support any AJAX calls at the moment, so our next step is to configure a URL that will return some data; in this case: todo list items.

Configuring API Calls using Express

We want to set up an API call on our node server that will respond with a JSON list of todo items for the URL:

GET /api/todos

To set this up in express we use the app.get method, which accepts a path as the first parameter – in this case /api/todos – and a callback as the second.

app.get("/api/todos", function(req, res) {
	//...
});

The callback will now be invoked whenever we browse to http://localhost:3000/api/todos.  The two parameters on the callback are the request and the response objects, and we now want to use the latter to send JSON data back to the client.

Ordinarily you would be getting the data from some kind of backing store, but to keep things simple I’m just going to return a few fake items using the res.json method.  Here we are passing in the HTTP response code (200 – OK) and our data, then calling the res.end method to finish the response.

res.json(200, [
	{ name: "Item 1 from server", complete: false },
	{ name: "Item 2 from server", complete: false },
	{ name: "Completed Item from server", complete: true }
]);
res.end();

Now let’s hook up our view model to access that data…

Getting the View Model Speaking to the Server

As our server now expects a GET call we can use jQuery.getJSON to load the data from the client side.  Once we have the data, all we need to do is push it into our view model to update the UI.

var TodoListViewModel = function() {
	var self = this;
	this.todoItems = ko.observableArray();

	this.refresh = ko.command(function() {
		//make a call to the server...
		return $.getJSON("/api/todos");
	}).done(function(items) {
		//...and update the todoItems collection when the call returns
		var newItems = [];
		for (var i = 0; i < items.length; i++) {
			newItems.push(new TodoViewModel(items[i]));
		}
		self.todoItems(newItems);
	});

	//refresh immediately to load initial data
	this.refresh();
};

Note that I’ve used the command pattern in this example (to get some free loading indicators and error handling) but there’s no need to do so – a regular function would suffice.
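
For comparison – and since there’s nothing command-specific about the logic – the same refresh implemented as a regular function (still inside the constructor, where self is defined) would look something like this:

//the equivalent refresh implementation as a plain function
this.refresh = function () {
	$.getJSON("/api/todos").done(function (items) {
		var newItems = [];
		for (var i = 0; i < items.length; i++) {
			newItems.push(new TodoViewModel(items[i]));
		}
		self.todoItems(newItems);
	});
};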

Restart node, refresh the page and you should now see the data returned from the server.

[screenshot: the todo items returned from the server]

Sending Data back to the Server

We’ve managed to display data from the server, but what if we want to save a change from the client?

Let’s add another API method that expects a PUT call to /api/todos/[id] with a body containing the JSON data.  We’ll also need to add an id property to the fake data returned by the server so that we can reference it in the URL.
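
The updated fake data in the GET handler might look something like this (the id values here are made up for the example):

res.json(200, [
	{ id: 1, name: "Item 1 from server", complete: false },
	{ id: 2, name: "Item 2 from server", complete: false },
	{ id: 3, name: "Completed Item from server", complete: true }
]);
res.end();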

The configuration of the PUT URL looks very similar to the GET configuration from earlier.

app.put("/api/todos/:todoId", function(req, res) {
    //...
});

The only difference (besides the verb) is that our URL path now includes a parameter named “todoId”, signified by the prefixed colon.  This will allow us to access the value of the ID appended to the URL through the req.params object.

Our handler will also need access to the body of the request, and to provide that we need to configure express to use its body parser:

app.use(express.bodyParser());

Now we have access to the body of the request through the req.body property.

As our server doesn’t have a real backing store, there isn’t much we can do to actually process this call.  To demonstrate that it is actually getting through we’ll just log the details to the node console and respond with a 200 - OK for the time being.

app.put("/api/todos/:todoId", function(req, res) {
	console.log(req.params.todoId + ": " + JSON.stringify(req.body, null, 4));
	res.send(200);
	res.end();
});

We now need our view model to call this method whenever the value of the complete flag is updated by the user.  First off, let’s add another command that uses jQuery to make an AJAX call with the appropriate data.

var TodoViewModel = function(data) {
	var self = this;
	// name and complete observables as before

	this.updateServer = ko.command(function() {
		return $.ajax({
			url: "/api/todos/" + data.id,
			type: "PUT",
			contentType: "application/json",
			data: JSON.stringify({
				id: data.id,
				name: self.name(),
				complete: self.complete()
			})
		});
	});
};

This one is a bit more verbose than the getJSON call earlier as we need to call the jQuery.ajax method directly in order to PUT data.  It is also worth noting that the JSON object being sent is built from the current values of the name and complete observables, so any updates will be included.

We can now use the subscribe method on the observable complete flag (inside the TodoViewModel constructor, after updateServer has been defined) to ensure that the update function is automatically invoked whenever the flag changes.

this.complete.subscribe(this.updateServer);

Restart node, refresh the page, and try clicking on the check boxes.  You should see confirmation of the successful call to the server in the node window.

[screenshot: node console output showing the PUT request details]

Wrapping Up

This has only been a very simple example, but hopefully it demonstrates the architecture of a single page application and how one can be implemented using node and knockout.

ProxyApi: Now With Intellisense!

After announcing ProxyApi in my last post I had a few people suggest that it would be more useful if it included some kind of intellisense.

So…now it does! Install the new ProxyApi.Intellisense NuGet package and you will automatically have intellisense for the generated JavaScript API objects.

I’ve made this into a separate package for 2 reasons:

  1. The original ProxyApi package still works perfectly on its own; and
  2. The intellisense implementation is a little bit more intrusive than I would have liked

It works by adding a T4 template to the Scripts directory of your project that uses the ProxyApi classes to generate a temporary version of the script at design-time. That script is then added to _references.js so that it gets referenced for any JavaScript file in the solution.

This would be fine, but unfortunately Visual Studio doesn’t have any mechanism for regenerating the T4 template automatically, meaning that changes to the API or MVC controllers wouldn’t be reflected until you either manually rebuilt the templates or restarted VS. For the time being I have worked around this using a simple PowerShell script to re-evaluate all T4 templates after each build, but hopefully I can find a more elegant solution later.

Because this does add a slight performance penalty, and because not everyone would need intellisense support, I’ve left this as an extra package. If you prefer the vanilla ProxyApi then you can grab it here.

The next step will be generating TypeScript files using a similar mechanism, which would allow the intellisense to extend to the parameter types as well.

Watch this space…

Deserializing Interface Properties using Json.Net

The Problem

Let’s say that you have a simple class structure such as this one:

public class Thing
{
	public string Name { get; set; }
}

public class ThingContainer
{
	public Thing TheThing { get; set; }
}

Here we have a class ThingContainer that has a single reference property of type Thing, and Json.Net will do a great job of serializing and deserializing instances of ThingContainer without any extra help:

static void Main(string[] args)
{
	var container = new ThingContainer
	{
		TheThing = new Thing { Name = "something" }
	};
	var serialized = JsonConvert.SerializeObject(container, Formatting.Indented);
	Console.WriteLine(serialized);
	// {
	//   "TheThing": {
	//      "Name": "something"
	//   }
	// }
	var deserialized = JsonConvert.DeserializeObject<ThingContainer>(serialized);
	Console.WriteLine(deserialized.TheThing.Name);
	// "something"
}

Unfortunately the real world is rarely that simple, and today you are writing a well-designed model, so you can’t go around using concrete types everywhere. Instead, you want to specify your properties as interfaces:

public interface IThing
{
	string Name { get; set; }
}

public class Thing : IThing
{
	public string Name { get; set; }
}

public class ThingContainer
{
	//notice that the property is now of an interface type...
	public IThing TheThing { get; set; }
}

After making these changes the serialization will still work as before, but when we try to deserialize the model we get the following error:

Could not create an instance of type JsonSerialization.IThing. Type is an interface or abstract class and cannot be instantiated.

This means that the JSON deserializer has seen that there is a property of type IThing but doesn’t know what type of object it should create to populate it.

Enter JsonConverter

The solution to this is to explicitly tell the deserializer what type it should be instantiating, and we do this using an attribute – specifically the JsonConverterAttribute.

The JsonConverterAttribute is part of Json.Net, and allows you to specify a custom converter to handle serialization and deserialization of an object by extending JsonConverter.

public class ThingContainer
{
	[JsonConverter(typeof(/*CustomConverterType*/))]
	public IThing TheThing { get; set; }
}

In this case we are going to write a custom implementation of JsonConverter that will behave exactly as the non-attributed property would, but using a specific type.

The code below shows the shell of the converter class and the methods we need to override. Notice that we are specifying a generic type parameter TConcrete on the class – this will set the desired concrete type for the property when we actually use the attribute later.

public class ConcreteTypeConverter<TConcrete> : JsonConverter
{
	public override bool CanConvert(Type objectType)
	{
		//determine whether or not this converter can create an instance
		//of the specified object type
	}

	public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
	{
		//deserialize an object from the specified reader
	}

	public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
	{
		//serialize the object value
	}
}

The roles of the missing methods are fairly self-explanatory, and seeing as we’re feeling lazy today we’ll pick off the easy ones first.

CanConvert

What we should be doing in CanConvert is determining whether or not the converter can create values of the specified objectType. What I am actually going to do here is just return true (i.e. “yes, we can create anything”). This will get us through the example and leave the real implementation as an exercise for the reader…

WriteJson

The WriteJson method is responsible for serializing the object, and (as I mentioned above) the serialization of interface properties already works as expected. We can therefore just use the default serialization to fulfil our needs:

public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
{
	//use the default serialization - it works fine
	serializer.Serialize(writer, value);
}

ReadJson

The ReadJson method is where it starts to get interesting, as this is actually what is causing us the problem.

We need to re-implement this method so that it both instantiates and populates an instance of our concrete type.

Thankfully, Json.Net already knows how to populate the object – and any sub-objects, for that matter – so all we really need to do is get it to use the correct concrete type. We can do this by explicitly calling the Deserialize<T> overload on the serializer:

public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
{
	//explicitly specify the concrete type we want to create
	//that was set as a generic parameter on this class
	return serializer.Deserialize<TConcrete>(reader);
}

Usage and Final Code

Now that we have our custom attribute we just need to specify it on the model…

public class ThingContainer
{
	[JsonConverter(typeof(ConcreteTypeConverter<Thing>))]
	public IThing TheThing { get; set; }
}

…and we can deserialize without any errors!

The final code for the converter is below. There are quite a few extensions that could be made to this, but as a quick-fix to a problem that I come across often this will do the job nicely.

public class ConcreteTypeConverter<TConcrete> : JsonConverter
{
	public override bool CanConvert(Type objectType)
	{
		//assume we can convert to anything for now
		return true;
	}

	public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
	{
		//explicitly specify the concrete type we want to create
		return serializer.Deserialize<TConcrete>(reader);
	}

	public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
	{
		//use the default serialization - it works fine
		serializer.Serialize(writer, value);
	}
}