The Joy of Profiling (part 2)

This is the second in a series of posts about performance enhancements. Part 1 covered how you should go about finding a suitable target when you sit down to make something faster; this post covers some of the ways you can speed it up.

Don’t Do It

It may sound obvious, but if you have found that a method is slowing down your application then the first thing you should ask is whether or not it needs to be called at all.

Consider the following example: updating and saving some entity objects where the CommitChanges call is the bottleneck.

public void UpdateAges()
{
  var people = repository.GetAllPeople();
  foreach (var person in people)
  {
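    //compare month and day only: a full date match would only catch people born today
    if (person.DateOfBirth.Month == DateTime.Today.Month
        && person.DateOfBirth.Day == DateTime.Today.Day)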
      person.Age += 1;
  }
  repository.CommitChanges();
}

Here we are saving changes to the repository whether or not anything changed. Instead we can add a flag showing whether or not anyone was actually updated and potentially avoid saving altogether:

public void UpdateAges()
{
  var people = repository.GetAllPeople();
  var anyBirthdaysToday = false;

  foreach (var person in people)
  {
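    //compare month and day only: a full date match would only catch people born today
    if (person.DateOfBirth.Month == DateTime.Today.Month
        && person.DateOfBirth.Day == DateTime.Today.Day)
    {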
      anyBirthdaysToday = true;
      person.Age += 1;
    }
  }

  if (anyBirthdaysToday)
    repository.CommitChanges();
}

Another example that crops up quite often (in application development, at least) is where a user opens a dialog window to perform some action or to edit an object. Something like this:

public void ShowEditPersonDialog(Person person)
{
  var editor = new EditPersonDialog(person);
  editor.ShowDialog();
  this.Repository.CommitChanges();
}

Here we are making the costly CommitChanges call as soon as the dialog window is closed, regardless of the user action.  If the user has cancelled their changes then there is no need for us to save (though we may need to roll changes back) so we can update our code to only commit changes when the dialog is confirmed:

public void ShowEditPersonDialog(Person person)
{
  var editor = new EditPersonDialog(person);
  if (editor.ShowDialog() == DialogResult.OK)
    this.Repository.CommitChanges();
}

Goodbye slow method!

Do It Less

If you absolutely have to call a slow method as part of your operation then the next step is to make sure that it is called the minimum number of times you can get away with. Consider this example:

public void MassWedding(string newName)
{
  foreach (var personId in repository.GetAllPeopleIds())
    UpdatePersonSurname(personId, newName);
}

private void UpdatePersonSurname(int personId, string surname)
{
  var person = repository.FindPerson(personId);
  person.Name = surname;
  repository.CommitChanges();
}

Here we call CommitChanges for every person that we update. Assuming that the repository saves all pending changes when we call CommitChanges, we can move that commit to the end of the loop:

public void MassWedding(string newName)
{
  foreach (var personId in repository.GetAllPeopleIds())
  {
    UpdatePersonSurname(personId, newName);
  }
  repository.CommitChanges();
}

private void UpdatePersonSurname(int personId, string surname)
{
  var person = repository.FindPerson(personId);
  person.Name = surname;
}

Whilst this example may seem obvious, it is easy to overlook this kind of repetition when the call hierarchy between the loop and the slow method is deep enough.

An easy-to-miss example of this problem is when updating a number of items where each update causes a refresh. If we consider a collection class that keeps track of the oldest person within it:

class PersonList : ObservableCollection<Person>
{
  //the oldest person currently in the list
  public Person Oldest { get; private set; }

  protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
  {
    base.OnCollectionChanged(e);

    //NewItems is null for actions such as Remove and Reset
    if (e.NewItems == null) return;

    foreach (Person newPerson in e.NewItems)
      newPerson.AgeChanged += this.OnPersonAgeChanged;
  }

  private void OnPersonAgeChanged(object sender, EventArgs e)
  {
    Person oldest = this.First();
    foreach (var person in this.Skip(1))
    {
      if (person.Age > oldest.Age)
        oldest = person;
    }
    this.Oldest = oldest;
  }
}

This class works by attaching an event handler to the AgeChanged event on each Person that is added to it, and then re-calculating the oldest person whenever an age is changed. If we were to iterate through the collection and update the age of everyone with a birthday today then we could potentially end up recalculating the oldest person many times unnecessarily.

To avoid this we can implement a method that tells the list to defer recalculations until all updates have been made. I generally prefer to implement this with a Using pattern as in the example below, which makes use of an IDisposable class that calls a constructor-specified Action on Dispose:

//Helper class that invokes an Action when the class is disposed
class ActionOnDispose : IDisposable
{
  private Action _onDispose;

  public ActionOnDispose(Action onDispose) {
    _onDispose = onDispose;
  }

  public void Dispose() {
    _onDispose();
  }
}

class PersonList : ObservableCollection<Person>
{
  //use a flag to store whether we are currently deferring refresh
  private bool _refreshDeferred;

  public IDisposable DeferRefresh() {
    //set the flag...
    _refreshDeferred = true;

    //...and return an object that, when disposed, resets the flag and
    //refreshes
    return new ActionOnDispose(() => {
      _refreshDeferred = false;
      this.OnPersonAgeChanged(null,null);
    });
  }

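  //the oldest person currently in the list
  public Person Oldest { get; private set; }

  private void OnPersonAgeChanged(object sender, EventArgs e) {
    //do nothing if the flag has been set or the list is empty
    if (_refreshDeferred || this.Count == 0) return;

    Person oldest = this.First();
    foreach (var person in this.Skip(1))
    {
      if (person.Age > oldest.Age)
        oldest = person;
    }
    this.Oldest = oldest;
  }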
}

Now we have reduced any number of possible refreshes down to a guaranteed single one.
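
To see the deferral in action, here is a rough, self-contained sketch. The Person class (an Age property that raises AgeChanged), the Oldest property and the RefreshCount counter are assumptions added purely for illustration; they are not part of the original classes.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.Linq;

//Hypothetical minimal Person: raises AgeChanged whenever Age is set
class Person
{
  private int _age;
  public event EventHandler AgeChanged;

  public int Age
  {
    get { return _age; }
    set { _age = value; AgeChanged?.Invoke(this, EventArgs.Empty); }
  }
}

class ActionOnDispose : IDisposable
{
  private readonly Action _onDispose;
  public ActionOnDispose(Action onDispose) { _onDispose = onDispose; }
  public void Dispose() { _onDispose(); }
}

class PersonList : ObservableCollection<Person>
{
  private bool _refreshDeferred;

  //illustration only: counts how often the oldest person is recalculated
  public int RefreshCount { get; private set; }
  public Person Oldest { get; private set; }

  protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
  {
    base.OnCollectionChanged(e);
    if (e.NewItems == null) return;
    foreach (Person newPerson in e.NewItems)
      newPerson.AgeChanged += OnPersonAgeChanged;
  }

  public IDisposable DeferRefresh()
  {
    _refreshDeferred = true;
    return new ActionOnDispose(() =>
    {
      _refreshDeferred = false;
      OnPersonAgeChanged(null, EventArgs.Empty);
    });
  }

  private void OnPersonAgeChanged(object sender, EventArgs e)
  {
    if (_refreshDeferred || Count == 0) return;

    RefreshCount++;
    Person oldest = this.First();
    foreach (var person in this.Skip(1))
    {
      if (person.Age > oldest.Age)
        oldest = person;
    }
    Oldest = oldest;
  }
}

static class Program
{
  static void Main()
  {
    var people = new PersonList();
    for (var i = 0; i < 5; i++)
      people.Add(new Person());

    //every update happens inside the using block, so the refresh
    //only runs when the block exits and Dispose is called
    using (people.DeferRefresh())
    {
      foreach (var person in people)
        person.Age += 1;
    }

    Console.WriteLine(people.RefreshCount); //prints 1
  }
}
```

Without the using block the same loop would recalculate the oldest person five times, once per age change.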

Only Do It Once

This is really an extension of the point above, but worth mentioning. If you can get away with running your slow operation once and then using the result throughout your application (see: Singleton Pattern) you can potentially save a lot of unnecessary calculation.
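
As a minimal sketch of the idea, assuming a made-up SettingsCache class whose LoadSettings method stands in for whatever slow call you are trying to avoid repeating (both names are hypothetical, as is the LoadCount counter):

```csharp
using System;
using System.Threading;

//Hypothetical cache: LoadSettings stands in for any slow call whose
//result does not change for the lifetime of the application
static class SettingsCache
{
  private static readonly object _lock = new object();
  private static string _settings;

  //illustration only: counts how often the slow call actually runs
  public static int LoadCount { get; private set; }

  public static string Settings
  {
    get
    {
      lock (_lock)
      {
        //only the first caller pays for the slow call
        if (_settings == null)
          _settings = LoadSettings();

        return _settings;
      }
    }
  }

  private static string LoadSettings()
  {
    LoadCount++;
    Thread.Sleep(100); //simulate slow work
    return "settings";
  }
}

static class Program
{
  static void Main()
  {
    var first = SettingsCache.Settings;  //slow: runs LoadSettings
    var second = SettingsCache.Settings; //fast: returns the cached value

    Console.WriteLine(SettingsCache.LoadCount); //prints 1
  }
}
```

This only works if the result genuinely stays the same; if it can go stale then you need some way of invalidating the cached value.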

Do It Later

So you’ve found your slow method and you’ve come to the conclusion that not only does it have to be run, it has to be run for each of the 10,000 items you’re creating…the next question you should ask is “do I need to do this now?”  Will each one of those 10,000 items need the result of the operation right away?

Take the example where you are validating a large number of models so that you can display the validation state when they are selected on the UI.  You might have something like the following:

class Model
{
  public bool IsValid { get; private set; }

  public void Validate()
  {
    //this line is slow
    this.IsValid = !EvaluateComplexBusinessRule();
  }

  public Model ()
  {
    this.Validate();
  }
}

In this example the model class is updating its validation state as soon as it is created, but we don’t actually need to use that validation result until something gets the value of the IsValid property.  So why don’t we defer that slow running calculation until someone actually cares about the result?

class Model
{
  private bool _isValidationDirty;
  private bool _isValid;
  public bool IsValid
  {
    get
    {
      if (_isValidationDirty)
      {
        _isValid = !EvaluateComplexBusinessRule();
        _isValidationDirty = false;
      }

      return _isValid;
    }
  }

  public void Validate()
  {
    _isValidationDirty = true;
  }

  public Model ()
  {
    this.Validate();
  }
}

This way we can call Validate as often as we like without incurring the cost of recalculating the validation state – we only do that when someone actually wants the result.

Note: in the validation example it’s quite likely that we will want to recalculate the validation state more than once, but there are scenarios where this doesn’t apply. If we can be sure that we only need to run a slow calculation once then we can make use of the Lazy<T> class, which takes care of the “evaluate this when someone asks for it” implementation:

  var result = new Lazy<double>(() => MySlowMethod());

  /* ...several hours later... */

  if (result.Value > 0) // <-- the calculation will be evaluated here
  {

  }

Do It In The Background

If you find yourself with a slow operation that can’t be avoided then it might be time to stop looking at actual performance and start considering perceived performance.  A user of your application will notice a wait much more if it is unexpected.  Think about the times you click a button and nothing seems to happen – isn’t that always more annoying than a ‘loading…’ message that pops up immediately until the task is done?

Performing long-running tasks in the background whilst the user is either notified of progress or is able to continue working is a great way of making an application feel faster, and this can be achieved by running tasks on a background thread.

Invoking a task on a background thread is a pretty simple thing to do, particularly with the Task Parallel Library introduced in .NET 4.0.  To run a method asynchronously is as simple as:

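static void Main(string[] args)
{
  var task = Task.Factory.StartNew(() => MyLongRunningMethod());

  //tasks run on background threads, so wait here or the
  //process may exit before the method has finished
  task.Wait();
}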

The tricky part with background operations is that you need to give some thought to how exactly the rest of the application should react to them.  Should other operations be able to run at the same time?  What about tasks that rely on the result of this operation?

Unfortunately there isn’t really a catch-all answer – it’s extremely dependent on the situation – but make sure that you have at least thought about it!
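
One common answer to the "tasks that rely on the result" question is to chain the dependent work as a continuation. The sketch below is only one option, and it assumes a made-up MyLongRunningMethod that returns a number after a short sleep:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Program
{
  //hypothetical stand-in for a slow calculation
  internal static int MyLongRunningMethod()
  {
    Thread.Sleep(500);
    return 42;
  }

  static void Main()
  {
    //start the slow work on a thread-pool thread
    Task<int> work = Task.Factory.StartNew(() => MyLongRunningMethod());

    //anything that depends on the result is chained as a continuation,
    //so it only runs once the result is available
    Task report = work.ContinueWith(t => Console.WriteLine("Result: " + t.Result));

    //the calling thread is free to carry on in the meantime
    Console.WriteLine("Loading...");

    //a console application exits when Main returns, so wait for the chain
    report.Wait();
  }
}
```

In a UI application the continuation usually needs to run back on the UI thread; passing TaskScheduler.FromCurrentSynchronizationContext() to ContinueWith is one way of arranging that.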

Wrapping Up

This is by no means an exhaustive list of ways to improve performance but hopefully points you in the right direction.

Happy profiling…


WPF DataGrid Performance

The WPF DataGrid is a fantastic component but it can get pretty slow with a large number of rows or columns.

Try these quick-and-easy changes to improve the performance:

  • Set the following attributes to enable row and column virtualization:
    EnableColumnVirtualization="True"
    EnableRowVirtualization="True"

  • Set a fixed column width – this reduces the number of layout recalculations that are required when rendering:
    ColumnWidth="100"

The Joy of Performance Profiling (part 1)

Performance: it’s great. It’s a feature.  Everyone appreciates a fast application. But despite our best efforts, it’s not unheard of for software to turn out a little (whisper it)… slow.

Maybe it runs OK on your development machine, or the test environment, but once it gets out there in the wild, away from the SSDs and multicore processors, it grinds to a halt. The problem is that whilst very few users would list “performance” as one of their most important features, a slow application has the power to drive people away in droves – no one wants their tools to slow them down.


I know a lot of developers who dread the idea of working on performance enhancements, but when it’s done right it is one of my personal favourite development activities.  You have a simple aim, you get to be a bit creative and the end result (assuming you succeed) will be universally loved – from a user’s perspective, you cannot implement “faster” badly.

You can take a range of approaches when trying to improve performance but, over a number of projects, I’ve settled on a system that seems to work pretty well.

1. Focus on a single operation

If your remit is “this application is slow – fix it” then it is tricky to know where to start.  A much simpler proposition would be “this button freezes up the UI for too long when clicked – fix it”.  As with any bug (and poor performance should be treated as a bug), you need a small, repeatable set of instructions to reproduce the problem so that you can investigate the cause and test your fix.

2. Measure the problem

…or “get a profiler”.

This is far and away the most important step in improving performance.  If you don’t know what is slow there is no way you can improve it; if you don’t know how slow it is then you can’t know if you have succeeded.

Trying to track down a bottleneck without tools to measure performance is always painful, so you will need to find a profiler to analyse where the time is being lost.  There are a number of such tools on the market, including one that comes free with Visual Studio, but my weapon of choice has always been the RedGate Performance Profiler.  The exact tool doesn’t matter too much, provided that it can observe your application and tell you how long each method (or, better yet, each line) is taking to run.  You are looking for:

  • Total time spent in the method (including children), preferably as a percentage of its calling method’s time
  • Hit count – the number of times that method was called

Once you have this information you are halfway there.
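
If you are stuck without a profiler, you can gather the same two numbers by hand for a few suspect calls. The sketch below is a poor substitute for a real tool, and the Timings helper is entirely made up for illustration, but it shows what "total time" and "hit count" mean in practice:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

//Hypothetical helper (not from any library): accumulates total elapsed
//time and hit count for each named operation. Not thread-safe.
static class Timings
{
  private class Entry { public long Ticks; public int Hits; }

  private static readonly Dictionary<string, Entry> _entries =
    new Dictionary<string, Entry>();

  public static void Measure(string name, Action action)
  {
    var watch = Stopwatch.StartNew();
    try
    {
      action();
    }
    finally
    {
      watch.Stop();

      Entry entry;
      if (!_entries.TryGetValue(name, out entry))
      {
        entry = new Entry();
        _entries[name] = entry;
      }

      entry.Ticks += watch.ElapsedTicks;
      entry.Hits++;
    }
  }

  public static int HitCount(string name)
  {
    Entry entry;
    return _entries.TryGetValue(name, out entry) ? entry.Hits : 0;
  }

  public static void Report()
  {
    foreach (var pair in _entries)
    {
      //Stopwatch ticks are converted to milliseconds via Stopwatch.Frequency
      var totalMs = pair.Value.Ticks * 1000.0 / Stopwatch.Frequency;
      Console.WriteLine("{0}: {1} hits, {2:F1} ms total",
        pair.Key, pair.Value.Hits, totalMs);
    }
  }
}

static class Program
{
  static void Main()
  {
    //Thread.Sleep stands in for the real work being measured
    for (var i = 0; i < 3; i++)
      Timings.Measure("CommitChanges", () => Thread.Sleep(20));

    Timings.Measure("GetRecordsForDate", () => Thread.Sleep(5));

    Timings.Report();
  }
}
```

A method with a small total time and a huge hit count points at a very different fix from one that is called once and takes seconds, which is why you want both figures.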

3. Pick one method

Once you’ve run your test case with the profiler attached, you should end up with a list of methods that were executed, ordered by the amount of time each took.

You are not looking for the slowest method, at least initially.  The slowest method will almost always be a high-level button click handler or page load, and will do a thousand tiny things before it completes.  If that method needs to be faster then you need to work out which of those thousand things is taking longer than it should, and then focus on that.

The ideal target method is one that has a single easily understood function that doesn’t rely too much on external resources and doesn’t have too many external effects.  Obviously this is not mandatory – if one method is clearly too slow, then that is your target no matter what it looks like –  but if you have a choice in the matter, you should probably pick GetRecordsForDate() or CommitChanges() over something high-level that does a lot of different things.

4. Make it faster

Having finally selected a method to work on, all you need to do is… make it faster! This is where the fun part starts, because you will need to be a little bit creative to see an improvement. There are a few common ways that you can get some positive results, but those will have to wait until part 2.