RESTful Reporting with Visual Studio Online


My team uses Visual Studio Online for work item tracking and generally speaking it has pretty good baked-in reporting.  I can see an overview of the current sprint, I can see capacity and I can see the burndown.  One area that I’ve always felt it was missing, however, is a way to analyse the accuracy of our estimations.

We actually make pretty good estimations, in general terms: we rarely over-commit and it’s unusual for us to add anything significant to a sprint because we’ve flown through our original stories.  This is based on a completely subjective guess at each person’s capacity and productivity which – over time – has given us a good overall figure that we know works for us.

But is that because our estimates are good, or because our bad estimates are fortuitously averaging out?  Does our subjective capacity figure still work when we take some people out of the team and replace them with others?

This is an area where the reporting within VSO falls down and the limitation boils down to one issue: there is no way to (easily) get the original estimate for a task once you start changing the remaining work.  So how can we get at this information?

Enter the API

I had seen a few articles on the integration options available for VSO but hadn’t had a chance to look into them in detail until recently.  The API is extensive: you can run almost any query through it that you can access through the UI, along with a bunch of useful team-related info.  Unfortunately the API suffers the same limitation as the VSO portal, but we can work around it with a little effort and the Work Item History API.

Getting the Data

There is nothing particularly complicated about pulling the relevant data from VSO:

  1. Get a list of sprints using the ClassificationNode API to access iterations
  2. Use Work Item Query Language to build a dynamic query and get the results through the Query API.  This gives us the IDs of each Task in the sprint
  3. For each Task, use the Work Item History API to get a list of all updates
  4. Use the update history to build up a picture of the initial state of each task

Point 4 has a few caveats, however.  The history API only records the fields that have changed in each revision so we don’t always get a complete picture of the Task from a single update.  There are a few scenarios that need to be handled:

  1. Task is created in the target sprint and has a time estimate assigned at the same time.  This is then reduced during the sprint as the Task moves towards completion
  2. Task is created in the target sprint but a time estimate is assigned at a later date before having time reduced as the sprint progresses
  3. Task is created in another sprint or iteration with a time assigned, then moved to the target sprint at a later date
  4. Task is created and worked on in another sprint, then is moved to the target sprint having been partially completed

The simplest scenario (#1 above) would theoretically mean that we could take the earliest update record with the correct sprint.  However, scenario 2 means that the first record in the correct sprint would have a time estimate of zero.  Worse, because we only get changes from the API we wouldn’t have the correct sprint ID on the same revision as the new estimate: it wouldn’t have changed!

The issue with scenario 3 is similar to #2: when the Task is moved to the target sprint the time estimate isn’t changed so isn’t included in the revision.

A simplistic solution that I initially tried was to simply take the maximum historical time estimate for the task (with the assumption that time goes down as the sprint progresses, not up).  Scenario 4 puts an end to this plan as the maximum time estimate could potentially be outside of the current sprint.  If I move a task into a sprint with only half its work remaining, I don’t really want to see the other half as being completed in this sprint.

Calculating the Original Estimate: Solution

The solution that I eventually went with here was to iterate through every historical change to the work item and store the “current” sprint and remaining work as each change was made.  That allows us to get the amount of remaining work at each update alongside the sprint in which it occurred; from this point, taking the maximum of the remaining work values recorded while the task was in the target sprint gives us a good number for the amount of work that we originally estimated.
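A minimal sketch of that idea is below. It is not the actual implementation (which was written for node): it assumes each update has already been reduced to a plain dictionary holding only the fields that changed in that revision, ordered oldest first, and the function name is hypothetical.

```python
from typing import Optional

ITERATION = "System.IterationPath"
REMAINING = "Microsoft.VSTS.Scheduling.RemainingWork"


def original_estimate(updates: list, target_sprint: str) -> float:
    """Largest remaining-work value seen while the task sat in target_sprint."""
    current_sprint: Optional[str] = None
    current_remaining = 0.0
    best = 0.0
    for changed in updates:  # oldest first; only changed fields are present
        # Carry forward anything this revision did not touch.
        if ITERATION in changed:
            current_sprint = changed[ITERATION]
        if REMAINING in changed:
            current_remaining = changed[REMAINING] or 0.0
        if current_sprint == target_sprint:
            best = max(best, current_remaining)
    return best


# Scenario 4: half the work was done in Sprint 1 before the move.
moved_in = [
    {ITERATION: "Sprint 1", REMAINING: 8.0},
    {REMAINING: 4.0},
    {ITERATION: "Sprint 2"},
    {REMAINING: 1.0},
]
print(original_estimate(moved_in, "Sprint 2"))  # 4.0, not the naive maximum of 8.0

# Scenario 2: the estimate is only added after the task is created.
late_estimate = [{ITERATION: "Sprint 2"}, {REMAINING: 6.0}, {REMAINING: 2.0}]
print(original_estimate(late_estimate, "Sprint 2"))  # 6.0
```

Tracking the running state rather than reading each revision in isolation is what handles scenarios 2 and 3, where the sprint and the estimate never change in the same revision.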

It does rely on the assumption that Task estimates aren’t increased after work has started (e.g. start at 2 hours, get 1 hour done then realise there’s more work so increase back to 2), but in this scenario we tend to create new tasks instead of adjusting existing ones (we did find more work, after all), which works for us.

Tying it all Together

Once I was able to get at the data it was relatively simple to wrap a reporting service around the implementation.  I went with node & express for the server-side implementation with a sprinkling of angular on top for the client, but visualising the data wasn’t the challenge here!

With this data available I can see a clear breakdown of how different developers affect the overall productivity of the team and can make decisions off the back of this.  I have also seen that having a live dashboard displaying some of the key metrics acts as a bit of a motivator for the people who aren’t getting through the work they expect to, which can’t be a bad thing.

I currently have the following information displayed:

  • Total remaining, completed and in-progress work based on our initial estimates
  • Percentage completion of the work
  • Absolute leaderboard; i.e. who is getting through the most work based on the estimates
  • Adjusted leaderboard; i.e. who is getting through the most work compared to our existing subjective estimates
  • Current tasks

I hope that the VSO API eventually reaches a point where I can pull this information out without needing to write code, but it’s good to know I can get the data if I need it!


Using Blob Snapshots to Backup Azure Virtual Machines

Now that Windows Azure Virtual Machines are out of preview it seems like a good time to look at how we can safeguard against the inevitable disasters that will befall those VMs.

Azure Virtual Machines use blob storage for both OS and data disks so we can get some basic backups going with nothing more than the blob API and a simple powershell script.

Setting Up the API

Before we can start writing anything to make use of the blob API we need to make sure that we have it downloaded and configured. How, you ask?

  • Firstly, install Windows Azure PowerShell, either as a direct download or through the Web Platform Installer
  • Next, create a new powershell script and import the API namespace
    Add-Type -Path "C:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\2012-10\bin\Microsoft.WindowsAzure.StorageClient.dll"

Setting Up Your Subscription

You will also need to configure powershell to use your Windows Azure account. Here’s how (as described by the excellent Michael Washam):

  • Download your Azure publish profile through the portal
  • Import the downloaded settings file using the Import-AzurePublishSettingsFile cmdlet
    Import-AzurePublishSettingsFile "c:\downloads\subscription.publishsettings"
  • Finally, set your subscription as active
    Set-AzureSubscription -SubscriptionName "[subscription name]"

At this point you are able to start using the API with your Azure account.

The CloudBlob Class

Most of the operations we want to invoke for our backup are going to be called on the CloudBlob class, which represents both the original blob and any snapshots that are acquired from Azure.

Finding the Right Blob

The first thing that we need to do is get a reference to the blob that contains the virtual hard disk (vhd) for our virtual machine, and to do that we need to dig through the portal.

Browse to the Dashboard for your virtual machine and you will see something like:

[Screenshot: the virtual machine dashboard in the Windows Azure portal, with the disk name highlighted]

Make a note of the disk name (highlighted) then browse to Virtual Machines > Disks. This will display a table listing all the disks that are being used by virtual machines, the VMs to which they are attached and their location. From the table, locate the disk with the matching name and grab the location URL, which includes both the storage account name and the disk blob name.

Make a note of both the storage account and disk blob names, and we have enough information to identify the correct blob.

Getting a Blob Reference

We now need to grab an instance of CloudBlob that represents our VM disk. We do this using the CloudBlobClient.GetBlobReference method, building up the required credentials from the storage account details.

$storageAccount = "[storage account name]"
$blobPath = "vhds/[disk blob name.vhd]"

#get the storage account key
$key = (Get-AzureStorageKey -StorageAccountName $storageAccount).Primary

#generate credentials based on the key
$creds = New-Object Microsoft.WindowsAzure.StorageCredentialsAccountAndKey($storageAccount,$key)

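#create an instance of the CloudBlobClient class, pointing it
#at the blob endpoint for the storage account
$blobUri = "https://$storageAccount.blob.core.windows.net"
$blobClient = New-Object Microsoft.WindowsAzure.StorageClient.CloudBlobClient($blobUri, $creds)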

#and grab a reference to our target blob
$blob = $blobClient.GetBlobReference($blobPath)

We now have a $blob variable that we can start using to manipulate the blob.

Creating Snapshots

Creating a new snapshot is an incredibly simple step – we can just call CreateSnapshot!

$snapshot = $blob.CreateSnapshot()

The $snapshot variable now contains another instance of CloudBlob that represents the snapshot, and we can use this to download the snapshot content at any point in the future.

That’s useful, but it’s not that useful as we’re unlikely to keep a reference to that object until the next time we need it. So how do we find snapshots that have been made in the past?

Finding Snapshots

The API includes a method on the CloudBlobContainer class that will list all blobs within a particular container (in this case, vhds). Unfortunately it does not do much in the way of filtering, so we need to add some code of our own.

#assume we can get our blob as before
$blob = Get-Blob

#we need to create an options object to pass to
#the ListBlobs method so that it includes snapshots
$options = New-Object Microsoft.WindowsAzure.StorageClient.BlobRequestOptions
$options.BlobListingDetails = "Snapshots"
$options.UseFlatBlobListing = $true
$snapshots = $blob.Container.ListBlobs($options);

#once we have the results we need to manually filter 
#the ones we aren't interested in just now
foreach ($snapshot in $snapshots)
{
	#make sure that the blob URI ends in our blobPath variable from earlier
	if ($snapshot.Uri -notmatch ".+$blobPath$") { continue }

	#and make sure that the SnapshotTime is set - this filters out
	#the current live version of the blob
	if ($snapshot.SnapshotTime -eq $null) { continue }

	Write-Output $snapshot
}

If we wrap this up in a Get-Snapshots function then it will return each snapshot of our blob in date order.

Now that we can get a list of snapshots associated with a blob, we want to be able to restore the “live” blob to a point in the past.

Restoring the Blob

Once we have a reference to both the current blob and the snapshot that we want to restore, it’s trivial to overwrite the current blob with the older version:

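#grab the snapshots for the blob as described above
$allSnapshots = Get-Snapshots

#restore to original version
#(assumes the CopyFromBlob method on CloudBlob)
$blob.CopyFromBlob($allSnapshots[0])

#restore to most recent snapshot
$blob.CopyFromBlob($allSnapshots[-1])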
We probably want to be a little bit more careful with this though, as we could inadvertently overwrite the current version and lose our data. To avoid this we can take another snapshot as a backup prior to restoring; this way we are at no risk of losing any data.

#grab the snapshots for the blob as described above
$allSnapshots = Get-Snapshots

#take a backup snapshot before restoring
$backupSnapshot = $blob.CreateSnapshot()

#and then restore safely
#(assumes the CopyFromBlob method on CloudBlob)
$blob.CopyFromBlob($allSnapshots[0])

One important thing to note about this approach: it simply restores the disk. Understandably, Azure gets touchy about you overwriting the disk of a live machine, so you will have to make sure that the VM is shut down and disassociated from the disk before you can restore.

Obviously this is not a complete backup solution but it can quickly give you a means to recover your Azure virtual machines to an earlier point. Everything is in powershell so it can easily be scheduled to run automatically, and the snapshots are fully accessible so a more in-depth backup process can pull them to another location as required.

Publish an Azure Web Site from the Command Line

Azure Web Sites, though still in preview, are a great way of quickly hosting a scalable site on Windows Azure without the overhead of setting up and maintaining a virtual machine.

One of the great features is the ability to use integrated Publish tools to push the project to the server. No need to manually build, package, transfer and deploy your code – Visual Studio handles everything for you.

Publishing from Visual Studio

Publishing refers to the process of deploying your ASP.NET web site to an Azure server, and it is very simple when working from Visual Studio.  A few good walkthroughs are available elsewhere so I won’t repeat them here; to summarise:

  1. Download publish settings from the Windows Azure management portal
  2. Configure a publish profile in the ASP.NET project
  3. Run the Publish wizard to push to Azure

This is very useful when getting started, but in the real world you don’t want to publish to the server from Visual Studio; you want to do it from your build server having run unit tests, code coverage, etc.

My build server is running TeamCity, so using MSBuild from the command line seems to be a good route to take. Let’s take a look at how we can get MSBuild to run that same publication for us.

Publishing using MSBuild

Of the 3 steps for publishing from Visual Studio, the first two are setup steps that need only be performed once.  As the output from these steps can be saved (and checked into source control), we are only interested in invoking step 3, but before we can do that we need to make a couple of amendments to the publish profile from step 2.

Modifying the Publish Profile

In step 2 above we created a publish profile that is located in the Properties\PublishProfiles folder:

 + Properties
   + PublishProfiles
     - DeployToAzure.pubxml

Note: by default, the pubxml file is named something like [AzureSiteName] – Web Deploy.pubxml; I have renamed it here to remove the reference to the site name.

Let’s take a look at that generated XML.

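<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <RemoteSitePhysicalPath />
    <!-- omitted for brevity -->
  </PropertyGroup>
  <ItemGroup>
    <MSDeployParameterValue Include="$(DeployParameterPrefix)DefaultConnection-Web.config Connection String" />
  </ItemGroup>
</Project>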

Most of the properties here are self-explanatory and have obviously come from the fields that were filled out during the publish wizard in Visual Studio.  We need to make 2 changes to this XML in order to use it from the command line:

  1. Specify the password from the publish settings
  2. Allow untrusted certificates (no idea why, but we get exceptions if we skip this step)

It goes without saying that, because we are going to save a password, you need to be careful where you save this file.  If there is any risk of this publish profile becoming externally visible then do not save the password.

Assuming that you have decided it is safe to store the password, we need to find out what it should be.  Go back to step 1 in the original Visual Studio instructions and find the downloaded publish settings file named [AzureSiteName]

This file is plain XML and contains profile details for both Web Deploy and FTP publication.  Locate the userPWD attribute on the Web Deploy node and add a Password node with this value to the publish profile. We can also add the AllowUntrustedCertificate node needed to avoid certificate exceptions.

  <!-- before... -->
  <Password>[password from .PublishSettings file]</Password>
  <AllowUntrustedCertificate>True</AllowUntrustedCertificate>
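
If you would rather not dig through the publish settings file by hand, the lookup described above could be scripted. The sketch below is an illustration only: the sample XML is a cut-down, made-up stand-in for a real publish settings file, and it assumes the Web Deploy profile is the publishProfile element whose publishMethod attribute is "MSDeploy".

```python
import xml.etree.ElementTree as ET

# Made-up, cut-down stand-in for a downloaded publish settings file.
SAMPLE = """<publishData>
  <publishProfile profileName="example - Web Deploy" publishMethod="MSDeploy"
                  userName="$example" userPWD="web-deploy-secret" />
  <publishProfile profileName="example - FTP" publishMethod="FTP"
                  userName="example" userPWD="ftp-secret" />
</publishData>"""


def web_deploy_password(publish_settings_xml: str) -> str:
    """Return the userPWD attribute of the Web Deploy profile."""
    root = ET.fromstring(publish_settings_xml)
    for profile in root.iter("publishProfile"):
        # Assumption: the Web Deploy profile is marked publishMethod="MSDeploy".
        if profile.get("publishMethod") == "MSDeploy":
            return profile.attrib["userPWD"]
    raise ValueError("no Web Deploy (MSDeploy) profile found")


print(web_deploy_password(SAMPLE))
```

The same caution applies here as to the publish profile itself: whatever reads the password should not write it anywhere externally visible.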

That’s all we need to change in the publish profile; now let’s take a look at the MSBuild command.

The MSBuild Command

Now that we have the publish profile configured, the MSBuild command is very simple:

msbuild MyProject.sln /p:DeployOnBuild=true /p:PublishProfile=DeployToAzure /p:Configuration=Release

The 3 flags tell MSBuild to deploy once the build completes; to use the DeployToAzure publish profile we have been modifying; and to build in Release mode (assuming you want to publish a release build).

You can run this from the command line on your local machine or as part of your build process and the site will be rebuilt and deployed directly to Azure!