Thursday, March 29, 2007

Thread Safety In .NET Collections

Most .NET collections are not thread-safe. Take the generic List<T> as an example; MSDN describes its thread safety as follows:

Thread Safety

Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

A List<T> can support multiple readers concurrently, as long as the collection is not modified. Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
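As a minimal sketch of what "lock the collection during the entire enumeration" can look like, the class below guards a list with a private lock object. The class name LockedNames and its members are hypothetical and exist only for this illustration; they are not part of the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper for illustration only.
public class LockedNames
{
    private readonly object _sync = new object();
    private readonly List<string> _names = new List<string>();

    // Writers take the same lock that readers use.
    public void Add(string name)
    {
        lock (_sync)
        {
            _names.Add(name);
        }
    }

    // The lock is held for the whole enumeration, so no writer that
    // goes through Add can modify the list while the loop is running.
    public int TotalLength()
    {
        lock (_sync)
        {
            int total = 0;
            foreach (string name in _names)
            {
                total += name.Length;
            }
            return total;
        }
    }
}
```

The cost of this approach is that a slow enumeration blocks every writer, and every other reader, for as long as the loop runs.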

So in a multi-threaded environment we need to implement read/write locking whenever a list may be modified. That is a lot of extra work, and it is prone to deadlock in a complicated system.
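For comparison, here is a rough sketch of the read/write locking mentioned above. It assumes .NET 3.5 or later, where ReaderWriterLockSlim is available (earlier versions offer the older ReaderWriterLock class). GuardedList and its members are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper for illustration only.
public class GuardedList<T>
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly List<T> _items = new List<T>();

    // Only one writer at a time, and no readers while writing.
    public void Add(T item)
    {
        _lock.EnterWriteLock();
        try
        {
            _items.Add(item);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Many readers may enumerate at once; the read lock is held
    // until the loop finishes.
    public void ForEach(Action<T> action)
    {
        _lock.EnterReadLock();
        try
        {
            foreach (T item in _items)
            {
                action(item);
            }
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```

Even in this small sketch the extra work is visible: every access path needs its own try/finally, and the callback passed to ForEach must not call Add, because that would try to take the write lock while the read lock is still held.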

Is there an easier way? Can we build a new collection and then switch the list reference to the newly constructed collection? The following code demonstrates this approach:
using System;
using System.Collections.Generic;
using System.Threading;

class Test
{
    public class Person
    {
        public string Name;
        public int Age;

        public Person(string name, int age)
        {
            Name = name;
            Age = age;
        }
    }

    private static List<Person> _personList = new List<Person>();

    static void Main()
    {
        // Build the initial list
        for (int i = 0; i < 10; i++)
        {
            _personList.Add(new Person("Name" + i.ToString(), i));
        }

        new Thread(new ThreadStart(LoopthroughList)).Start();

        // Starting the following thread makes the enumeration in LoopthroughList
        // throw an InvalidOperationException:
        // "Collection was modified; enumeration operation may not execute."
        //new Thread(new ThreadStart(UpdateListUnsafely)).Start();

        // The following thread does not cause an exception
        new Thread(new ThreadStart(UpdateListSafely)).Start();

        Console.Read();
    }

    static void LoopthroughList()
    {
        Console.WriteLine("First loop");
        // foreach keeps enumerating the list object it started with, even if
        // _personList is pointed at a new list in the meantime.
        foreach (Person person in _personList)
        {
            Console.WriteLine("ListHashCode:{0} ListCount:{1} Name:{2} Age:{3}",
                _personList.GetHashCode(), _personList.Count, person.Name, person.Age);
            Thread.Sleep(100);
        }

        Console.WriteLine("Second loop");
        foreach (Person person in _personList)
        {
            Console.WriteLine("ListHashCode:{0} ListCount:{1} Name:{2} Age:{3}",
                _personList.GetHashCode(), _personList.Count, person.Name, person.Age);
        }
    }

    static void UpdateListUnsafely()
    {
        Thread.Sleep(500);
        // Modifies the very list that another thread is enumerating
        _personList.RemoveAt(_personList.Count - 1);
        _personList.Add(new Person("A New Person", 20));
    }

    static void UpdateListSafely()
    {
        Thread.Sleep(500);
        // Work on a private copy, then switch the shared reference to it
        List<Person> newList = new List<Person>(_personList);
        newList.RemoveAt(newList.Count - 1);
        newList.RemoveAt(newList.Count - 1);
        // Note: the copy is shallow, so this Person object is shared with the old list
        newList[newList.Count - 1].Name = "Modified Name";
        newList.Add(new Person("A New Person", 20));
        _personList = newList;
        Console.WriteLine("List updated! ListCount:{0}", _personList.Count);
    }
}
Result:
The result looks promising: no error occurs when the reference is switched (although theoretically we should lock this operation). Is such an implementation really thread-safe? Maybe, maybe not. The problem is that I couldn't find an official answer from Microsoft.
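One way to make the copy-and-switch idea more defensible is sketched below. This is not an official Microsoft pattern; it is a sketch under two assumptions: the field is marked volatile so that readers see the most recently published reference, and writers are serialized with a lock so that two concurrent updates cannot overwrite each other's copy. SnapshotList and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder class for illustration only.
public class SnapshotList<T>
{
    private readonly object _writeLock = new object();

    // volatile: a reader picks up the most recently published list.
    private volatile List<T> _items = new List<T>();

    // Readers fetch the reference once and enumerate that snapshot.
    // Callers must treat the returned list as read-only.
    public List<T> Snapshot()
    {
        return _items;
    }

    // Writers copy, modify the copy, then publish it. The lock keeps
    // two writers from copying the same old list and losing an update.
    public void Add(T item)
    {
        lock (_writeLock)
        {
            List<T> copy = new List<T>(_items);
            copy.Add(item);
            _items = copy; // the previous list is never modified again
        }
    }
}
```

A reader would call Snapshot() once, keep the result in a local variable, and use only that variable inside its loop, so that the count and the items always come from the same list. Note that this protects the list structure only: the copy is shallow, so changing a field of an element, as the demo does with Name, is still an unsynchronized write to an object that both lists share.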