Sunday, May 02, 2010

Test MongoDB In Windows Environment

NoSQL movement becomes more and more popular. Nowadays most NoSQL product environments are Unix-based. MongoDB is one of few solutions that provide Windows installation and .NET provider. This exercise evalutes MongoDB's performance in Windows platform.

The test machines' configuration:
Local tests and remote tests are both examined in this exercise. Local tests are conducted inside one physical box. MongoDB server and client process are running in separated machines in remote tests, and connection between server and client machines are 100 Mbit/Sec.

A simple document with key (unique-indexed) value (non-indexed) pair is used for testing. All fields (including internal _id field) will store GUID value:
{
_id: "12345678-1234-1234-1234-123456789012", // Internal field
key: "12345678-1234-1234-1234-123456789012", // Unique-indexed
value: "12345678-1234-1234-1234-123456789012" // Non-indexed
}
Manually handling "_id" field without using Oid format is not recommended. The reason of doing that in our tests is efficiently do update and delete operations without extra search.

Local tests are conducted inside one physical machine. MongoDB server and client process are running in separated machines in remote tests, and connection between server and client machines are 100Mbit/Sec.

Local test results:


(All in document actions per second, 1M=1000000)




























































Document number
1M 2M 4M 8M 10M
Insert 32210 29540 27160 22100 20530
Update 18560 18160 18010 17900 17540
Delete 12660 12510 11980 11610 11550
Search on _id 13340 13330 13350 12990 12130
Search on key 9190 9280 9300 9110 9010
Search on value 7.2 3.6 1.8 0.9 0.7


Remote test results:





























































Document number 1M 2M 4M 8M 10M
Insert 31450 30700

22420 18810 17600

Update 18180 18160 17400 18100 17830
Delete 12550 12270 11730 11680 11530
Search on _id 2410

2440

2410

2420

2420

Search on key 1470

1470

1450

1450

1440

Search on value 7.2 3.6 1.8 0.9 0.7


Findings:
  1. Fastest operation is data insert.
  2. Search on _id is faster than search on other indexed field.
  3. Search on indexed field seems in constant time, not proportional to size of the data set.
  4. Search on non-indexed field shows very poor performance, and the time spent is proportional to the size the data set. This is similar to a table scan scenario in a relational database.
  5. There's no big gap in insert/update/delete between local test and remote test.
  6. Remote search is significantly slower than local search.
By digging into source code of MongDB-CSharp driver, I found that insert, update and delete functions in the client side are implemented by one-way IO writing without any acknowledgement, i.e. keep sending data to TCP NetworkStream. Streamed data can be efficiently buffered in both client and server side. From client point of view it's asynchronous process. That explains why there's no big difference between remote and local tests in batch insert/update/delete scenarios. On the other hand, read or search (find related functions) requires two-way communication. The client sends a request to the server then waits the response to be completed. The server process time plus the round-trip network delay are all counted for one request action. That's why remote search are significantly slower the tests conducted in local box.

I thought the relatively low speed of indexed search in our test results could be caused by my single-threaded test application. So I modified the code and did the tests with 2, 4, 6, 8 and 10 threads. But surprisingly the performance did not improve at all.

The test code is listed below. It is simple and may be not accurate enough. But it gave me an idea how it performs in general with Windows environment.

Test code:
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Diagnostics;
using System.Threading;
using MongoDB.Driver;

public class Program
{
struct TimeUnit
{
public int NumberOfRecords;
public long MilliSeconds;
public TimeUnit(int num, long ms)
{
NumberOfRecords = num;
MilliSeconds = ms;
}
}
const int MaxDocuments = 1000000;
const int NonIndexSearchCount = 1000;
const int ThreadCount = 1;
const string DBConnectionString = "Server=127.0.0.1";
const string DBName = "TestMongoDB";
const string DBCollectionName = "TestCollection";
static string[] testContent = new string[MaxDocuments];
static object lockObject = new object();
static Dictionary<string, List<TimeUnit>> resultTimes = new Dictionary<string, List<TimeUnit>>();

static void Main(string[] args)
{
// Initialize data
for (int i = 0; i < MaxDocuments; i++)
testContent[i] = Guid.NewGuid().ToString();

// Start Threads
List<Thread> threads = new List<Thread>();
for (int i = 0; i < ThreadCount; i++)
{
int max = MaxDocuments / ThreadCount;
int insertStart = i * max;
int searchStart = insertStart + max / 2;
int searchEnd = searchStart + max / 10;
Thread thread = new Thread(() => TestMongoDB(insertStart, searchStart, searchEnd, max));
thread.Start();
threads.Add(thread);
//ThreadPool.QueueUserWorkItem(o => TestMongoDB(insertStart, searchStart, searchEnd, max));
}
Console.WriteLine();
// Wait threads to be completed
foreach (Thread thread in threads)
thread.Join();

// Display results
Console.WriteLine("Results:");
foreach (var key in resultTimes.Keys)
{
List<TimeUnit> times = resultTimes[key];
Console.Write(key + ":");
long totalTime = 0, totalNumber = 0;
foreach (TimeUnit st in times)
{
totalTime += st.MilliSeconds;
totalNumber += st.NumberOfRecords;
Console.Write(st.MilliSeconds + " ");
}
Console.WriteLine(string.Format("{0} speed is : {1}/sec", key, totalNumber * 1000.0 / totalTime));
}
Console.WriteLine("Done");
Console.Read();
}

/// <summary>
/// MongoDB test. Document format:
/// {
/// _id, // internal field
/// key, // indexed field
/// value // non-indexed field
/// }
/// </summary>
static void TestMongoDB(int insertStart, int searchStart, int searchEnd, int max)
{
// Init MongoDB
Mongo mongo = new Mongo(DBConnectionString);
IMongoDatabase db;
IMongoCollection documents;
Document doc = new Document();
Stopwatch watch = new Stopwatch();

mongo.Connect();
db = mongo[DBName];
documents = db.GetCollection(DBCollectionName);

// Setup Index on key field
lock (lockObject)
{
if (!documents.MetaData.Indexes.ContainsKey("key"))
{
Document indexDoc = new Document { { "key", 1 } };
documents.MetaData.CreateIndex(indexDoc, true);
}
}

// Insert data
watch.Start();
for (int i = insertStart; i < insertStart + max; i++)
{
Document newdoc = new Document();
newdoc["_id"] = newdoc["key"] = newdoc["value"] = testContent[i];
documents.Insert(newdoc);
}
watch.Stop();
AddTimeUnit("insert", new TimeUnit(max, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB insert {0} records: {1} ms",
max, watch.ElapsedMilliseconds.ToString()));
watch.Reset();

// Search on id field
watch.Start();
for (int i = searchStart; i < searchEnd; i++)
{
doc["_id"] = testContent[i];
var doc1 = documents.FindOne(doc);
string value = doc1["value"].ToString();
}
watch.Stop();
AddTimeUnit("search-id", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB {0} search on id field over {1} records: {2} ms",
searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
watch.Reset();
doc = new Document();

// Search on indexed "key" field
watch.Start();
for (int i = searchStart; i < searchEnd; i++)
{
doc["key"] = testContent[i];
var doc1 = documents.FindOne(doc);
string value = doc1["value"].ToString();
}
watch.Stop();
AddTimeUnit("search-key", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB {0} search on indexed field over {1} records: {2} ms",
searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
watch.Reset();
doc = new Document();

// Search on non-indexed "value" field
watch.Start();
for (int i = searchStart; i < searchStart + NonIndexSearchCount; i++)
{
doc["value"] = testContent[i];
var doc1 = documents.FindOne(doc);
string value = doc1["value"].ToString();
}
watch.Stop();
AddTimeUnit("search-value", new TimeUnit(NonIndexSearchCount, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB {0} search on non-indexed field over {1} records: {2} ms",
NonIndexSearchCount, documents.Count(), watch.ElapsedMilliseconds));
watch.Reset();
doc = new Document();

// Update test
watch.Start();
for (int i = searchStart; i < searchEnd; i++)
{
doc["_id"] = testContent[i];
//var doc1 = documents.FindOne(doc);
doc["key"] = testContent[i];
doc["value"] = i.ToString();
documents.Update(doc);
}
watch.Stop();
AddTimeUnit("update", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB {0} update over {1} records: {2} ms",
searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
watch.Reset();
doc = new Document();

// Delete test
watch.Start();
for (int i = searchStart; i < searchEnd; i++)
{
doc["_id"] = testContent[i];
//var doc1 = documents.FindOne(doc);
documents.Delete(doc);
}
watch.Stop();
AddTimeUnit("delete", new TimeUnit(searchEnd - searchStart, watch.ElapsedMilliseconds));
Console.WriteLine(string.Format("MongoDB {0} delete over {1} records: {2} ms",
searchEnd - searchStart, documents.Count(), watch.ElapsedMilliseconds));
}

private static void AddTimeUnit(string key, TimeUnit st)
{
lock (lockObject)
{
if (resultTimes.ContainsKey(key))
{
List<TimeUnit> times = resultTimes[key];
times.Add(st);
}
else
{
List<TimeUnit> times = new List<TimeUnit>();
times.Add(st);
resultTimes.Add(key, times);
}
}
}
}