The Joy of Profiling (part 2)

This is the second in a series of posts about performance enhancements. Part 1 covered how you should go about finding a suitable target when you sit down to make something faster; this post covers some of the ways you can speed it up.

Don’t Do It

It may sound obvious, but if you have found that a method is slowing down your application then the first thing you should ask is whether or not it needs to be called at all.

Consider the following example: updating and saving some entity objects where the CommitChanges call is the bottleneck.

public void UpdateAges()
{
  var people = repository.GetAllPeople();
  foreach (var person in people)
  {
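    //a birthday is a matching month and day, not a matching date
    var today = DateTime.Today;
    if (person.DateOfBirth.Month == today.Month && person.DateOfBirth.Day == today.Day)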
      person.Age += 1;
  }
  repository.CommitChanges();
}

Here we are saving changes to the repository whether or not anything changed. Instead we can add a flag showing whether or not anyone was actually updated and potentially avoid saving altogether:

public void UpdateAges()
{
  var people = repository.GetAllPeople();
  var anyBirthdaysToday = false;

  foreach (var person in people)
  {
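    //a birthday is a matching month and day, not a matching date
    if (person.DateOfBirth.Month == DateTime.Today.Month && person.DateOfBirth.Day == DateTime.Today.Day)
    {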
      anyBirthdaysToday = true;
      person.Age += 1;
    }
  }

  if (anyBirthdaysToday)
    repository.CommitChanges();
}

Another example that crops up quite often (in application development, at least) is where a user opens a dialog window to perform some action or to edit an object. Something like this:

public void ShowEditPersonDialog(Person person)
{
  var editor = new EditPersonDialog(person);
  editor.ShowDialog();
  this.Repository.CommitChanges();
}

Here we are making the costly CommitChanges call as soon as the dialog window is closed, regardless of the user action.  If the user has cancelled their changes then there is no need for us to save (though we may need to roll changes back) so we can update our code to only commit changes when the dialog is confirmed:

public void ShowEditPersonDialog(Person person)
{
  var editor = new EditPersonDialog(person);
  if (editor.ShowDialog() == DialogResult.OK)
    this.Repository.CommitChanges();
}

Goodbye slow method!

Do It Less

If you absolutely have to call a slow method as part of your operation then the next step is to make sure that it is called the minimum number of times you can get away with. Consider this example:

public void MassWedding(string newName)
{
  foreach (var personId in repository.GetAllPeopleIds())
    UpdatePersonSurname(personId, newName);
}

private void UpdatePersonSurname(int personId, string surname)
{
  var person = repository.FindPerson(personId);
  person.Name = surname;
  repository.CommitChanges();
}

Here we are going to call CommitChanges for every person that we update. Assuming that our repository will save all pending changes when we call CommitChanges, we can move that commit to the end:

public void MassWedding(string newName)
{
  foreach (var personId in repository.GetAllPeopleIds())
  {
    UpdatePersonSurname(personId, newName);
  }
  repository.CommitChanges();
}

private void UpdatePersonSurname(int personId, string surname)
{
  var person = repository.FindPerson(personId);
  person.Name = surname;
}

Whilst this example may seem obvious, it is easy to overlook this kind of problem when the call hierarchy between the loop and the slow method is long enough.

An easy-to-miss example of this problem is when updating a number of items where each update causes a refresh. Consider a collection class that keeps track of the oldest person within it:

class PersonList : ObservableCollection<Person>
{
  protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
  {
    base.OnCollectionChanged(e);
    //NewItems is null for actions such as Remove and Reset
    if (e.NewItems == null) return;
    foreach (Person newPerson in e.NewItems)
      newPerson.AgeChanged += this.OnPersonAgeChanged;
  }

  private void OnPersonAgeChanged(object sender, EventArgs e)
  {
    Person oldest = this.First();
    foreach (var person in this.Skip(1))
    {
      if (person.Age > oldest.Age)
        oldest = person;
    }
  }
}

This class works by attaching an event handler to the AgeChanged event on each Person that is added to it, and then re-calculating the oldest person whenever an age is changed. If we were to iterate through the collection and update the age of everyone with a birthday today then we could potentially end up recalculating the oldest person many times unnecessarily.

To avoid this we can implement a method that tells the list to defer recalculations until all updates have been made. I generally prefer to implement this with a using block, as in the example below, which makes use of an IDisposable class that calls a constructor-specified Action on Dispose:

//Helper class that invokes an Action when the class is disposed
class ActionOnDispose : IDisposable
{
  private Action _onDispose;

  public ActionOnDispose(Action onDispose) {
    _onDispose = onDispose;
  }

  public void Dispose() {
    _onDispose();
  }
}

class PersonList : ObservableCollection<Person>
{
  //use a flag to store whether we are currently deferring refresh
  private bool _refreshDeferred;

  public IDisposable DeferRefresh() {
    //set the flag...
    _refreshDeferred = true;

    //...and return an object that, when disposed, resets the flag and
    //refreshes
    return new ActionOnDispose(() => {
      _refreshDeferred = false;
      this.OnPersonAgeChanged(null,null);
    });
  }

  private void OnPersonAgeChanged(object sender, EventArgs e) {
    //do nothing if the flag has been set
    if (_refreshDeferred) return;

    Person oldest = this.First();
    foreach (var person in this.Skip(1))
    {
      if (person.Age > oldest.Age)
        oldest = person;
    }
  }
}

Now we have reduced any number of possible refreshes down to a guaranteed single one.
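To show how the calling code might look, here is a minimal, self-contained sketch. The Person class, the cut-down CountingPersonList, its RefreshCount and Oldest properties and the DeferRefreshDemo wrapper are all hypothetical stand-ins invented for illustration (the real PersonList above derives from ObservableCollection); the point is only that wrapping the updates in a using block leaves us with a single recalculation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

//Hypothetical minimal Person that raises AgeChanged whenever Age is set
class Person
{
    private int _age;
    public event EventHandler AgeChanged;

    public int Age
    {
        get { return _age; }
        set
        {
            _age = value;
            var handler = AgeChanged;
            if (handler != null) handler(this, EventArgs.Empty);
        }
    }
}

//Cut-down stand-in for PersonList that counts how often it recalculates
class CountingPersonList
{
    private readonly List<Person> _people = new List<Person>();
    private bool _refreshDeferred;

    public int RefreshCount { get; private set; }
    public Person Oldest { get; private set; }
    public IEnumerable<Person> People { get { return _people; } }

    public void Add(Person person)
    {
        _people.Add(person);
        person.AgeChanged += OnPersonAgeChanged;
    }

    public IDisposable DeferRefresh()
    {
        _refreshDeferred = true;
        return new Resumer(this);
    }

    private void OnPersonAgeChanged(object sender, EventArgs e)
    {
        //do nothing while refreshes are deferred
        if (_refreshDeferred) return;

        RefreshCount++;
        Oldest = _people.OrderByDescending(p => p.Age).FirstOrDefault();
    }

    //resets the flag and refreshes once when disposed
    private sealed class Resumer : IDisposable
    {
        private readonly CountingPersonList _owner;
        public Resumer(CountingPersonList owner) { _owner = owner; }

        public void Dispose()
        {
            _owner._refreshDeferred = false;
            _owner.OnPersonAgeChanged(null, EventArgs.Empty);
        }
    }
}

static class DeferRefreshDemo
{
    //returns the number of recalculations caused by updating five people
    public static int Run()
    {
        var list = new CountingPersonList();
        for (var i = 0; i < 5; i++) list.Add(new Person());

        using (list.DeferRefresh())
        {
            var age = 30;
            foreach (var person in list.People) person.Age = age++;
        }

        return list.RefreshCount;
    }
}
```

Without the using block the same loop would recalculate the oldest person once per update. One thing to bear in mind with this sketch is that nested DeferRefresh calls are not handled: the first Dispose would switch refreshing back on.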

Only Do It Once

This is really an extension of the point above, but worth mentioning. If you can get away with running your slow operation once and then using the result throughout your application (see: Singleton Pattern) you can potentially save a lot of unnecessary calculation.
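As a rough illustration, the simplest version of this is to cache the result in a field the first time it is calculated. The PriceCatalogue class, its GetTotal method and the CalculationCount property below are made-up names for the sake of the sketch, and the Thread.Sleep stands in for whatever makes your real operation slow.

```csharp
using System;
using System.Threading;

class PriceCatalogue
{
    //null until the total has been calculated for the first time
    private decimal? _cachedTotal;

    //only here so we can see how many times the slow path ran
    public int CalculationCount { get; private set; }

    public decimal GetTotal()
    {
        if (!_cachedTotal.HasValue)
            _cachedTotal = CalculateTotalSlowly();

        return _cachedTotal.Value;
    }

    //hypothetical slow operation
    private decimal CalculateTotalSlowly()
    {
        CalculationCount++;
        Thread.Sleep(100);
        return 123.45m;
    }
}
```

Calling GetTotal repeatedly on the same instance only pays for the slow calculation on the first call. Note that this sketch is not thread-safe, and it assumes the result never goes stale; if the underlying data can change you will need some way of clearing the cached value.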

Do It Later

So you’ve found your slow method and you’ve come to the conclusion that not only does it have to be run, it has to be run for each of the 10,000 items you’re creating…the next question you should ask is “do I need to do this now?”  Will each one of those 10,000 items need the result of the operation right away?

Take the example where you are validating a large number of models so that you can display the validation state when they are selected on the UI.  You might have something like the following:

class Model
{
  public bool IsValid { get; private set; }

  public void Validate()
  {
    //this line is slow
    this.IsValid = !EvaluateComplexBusinessRule();
  }

  public Model ()
  {
    this.Validate();
  }
}

In this example the model class is updating its validation state as soon as it is created, but we don’t actually need to use that validation result until something gets the value of the IsValid property.  So why don’t we defer that slow running calculation until someone actually cares about the result?

class Model
{
  private bool _isValidationDirty;
  private bool _isValid;
  public bool IsValid
  {
    get
    {
      if (_isValidationDirty)
      {
        _isValid = !EvaluateComplexBusinessRule();
        _isValidationDirty = false;
      }

      return _isValid;
    }
  }

  public void Validate()
  {
    _isValidationDirty = true;
  }

  public Model ()
  {
    this.Validate();
  }
}

This way we can call Validate as often as we like without incurring the cost of recalculating the validation state - we only do that when someone actually wants the result.

Note: in the validation example it’s quite likely that we will want to recalculate the validation state more than once, but there are scenarios where this doesn’t apply. If we can be sure that we only need to run a slow calculation once then we can make use of the Lazy<T> class (added in .NET 4.0), which takes care of the “evaluate this when someone asks for it” implementation:

  var result = new Lazy<double>(() => MySlowMethod());

  /* ...several hours later... */

  if (result.Value > 0) // <-- the calculation will be evaluated here
  {

  }
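If you want to convince yourself of when the calculation actually runs, a small sketch like the one below can help. LazyDemo, its Calls counter and the body of MySlowMethod are hypothetical; Lazy<T>.Value and Lazy<T>.IsValueCreated are the real members being exercised.

```csharp
using System;

static class LazyDemo
{
    //counts how many times the slow method has actually run
    public static int Calls;

    //hypothetical stand-in for a slow calculation
    private static double MySlowMethod()
    {
        Calls++;
        return 1.5;
    }

    //returns true if the calculation was deferred and then run exactly once
    public static bool Run()
    {
        Calls = 0;
        var result = new Lazy<double>(() => MySlowMethod());

        //nothing has been calculated yet
        bool createdBeforeUse = result.IsValueCreated;

        //the first access runs the calculation, the second reuses the result
        double first = result.Value;
        double second = result.Value;

        return !createdBeforeUse
            && result.IsValueCreated
            && Calls == 1
            && first == second;
    }
}
```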

Do It In The Background

If you find yourself with a slow operation that can’t be avoided then it might be time to stop looking at actual performance and start considering perceived performance.  A user of your application will notice a wait much more if it is unexpected.  Think about the times you click a button and nothing seems to happen - isn’t that always more annoying than a ‘loading…’ message that pops up immediately until the task is done?

Performing long-running tasks in the background whilst the user is either notified of progress or is able to continue working is a great way of making an application feel faster, and this can be achieved by running tasks on a background thread.

Invoking a task on a background thread is a pretty simple thing to do, particularly with the Task Parallel Library introduced in .NET 4.0.  Running a method asynchronously is as simple as:

static void Main(string[] args)
{
  var task = Task.Factory.StartNew(() => MyLongRunningMethod());
  //...do other work, then wait so that a console app doesn't exit early
  task.Wait();
}

The tricky part with background operations is that you need to give some thought to how exactly the rest of the application should react to them.  Should other operations be able to run at the same time?  What about tasks that rely on the result of this operation?

Unfortunately there isn’t really a catch-all answer - it’s extremely dependent on the situation - but make sure that you have at least thought about it!
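For the specific case of work that relies on the result of the background operation, one option is to chain it on as a continuation. The sketch below assumes a MyLongRunningMethod that returns an int; the BackgroundExample class, the RunAsync name and the 42 are invented for illustration, whereas Task.Factory.StartNew and ContinueWith are part of the Task Parallel Library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class BackgroundExample
{
    //hypothetical stand-in for a slow calculation
    private static int MyLongRunningMethod()
    {
        Thread.Sleep(200);
        return 42;
    }

    public static Task<string> RunAsync()
    {
        //start the slow work on a thread pool thread
        Task<int> slow = Task.Factory.StartNew(() => MyLongRunningMethod());

        //this only runs once the slow task has finished
        return slow.ContinueWith(t => "Result: " + t.Result);
    }
}
```

The caller gets a Task<string> back immediately and can carry on working. In a UI application you would typically also need to marshal the continuation back onto the UI thread before touching any controls, which this sketch does not attempt.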

Wrapping Up

This is by no means an exhaustive list of ways to improve performance, but hopefully it points you in the right direction.

Happy profiling…