[C++] An exercise in optimisation #1 - ImmutableString

Introduction

This post outlines a problem I had to solve regarding using a string as a key for a hash table, and the steps that were taken to optimise the problem by applying the solution of an immutable string type.

I may make this into a series if I can find interesting enough examples that I can suitably simplify the code for.

The Problem

I was creating a key -> value pair map for use with a C++ serialisation system; the keys had to be strings as they were used as identifiers for JSON, and around 99% of the time the key would be a string literal.

Solution #1 - STL String

Due to the requirements of storing non-literal keys, my first naive attempt at this involved using a std::string.

The performance of this solution was terrible. I no longer have accurate profiles from my real use-case, however the operation in question would have taken 2 - 4 seconds under the legacy serialisation system, and under my new system it was taking 40 - 60 seconds. The profile itself showed no real hot-spots, but rather a “death by a thousand cuts”.

Solution #2 - ImmutableString

Despite it being hard to see from the profile, I suspected one of the main issues was due to the string copying for the keys.

Considering that the keys were never mutated, had I not had the requirement of storing non-literal keys, I’d have just used a const char* as that would let us just take a pointer to the literal rather than copy the whole string.

After a bit of thought, I realised that I could create an ImmutableString type for C++ that would let me store a literal by a pointer, and store a non-literal as a ref-counted object. This would optimise my 99% case excellently as I’d just be copying a pointer, as well as optimising my 1% case by only copying the string once and then ref-counting it.

The performance of this solution was much improved. The operation in question was now only taking 6 - 10 seconds, and while this wasn’t as good as the legacy system, it was far superior to attempt #1 (the new system would always be slower due to the way it worked internally, however it was further optimised at a later date to reduce the amount of array reallocations).

The ImmutableString Type

As mentioned above, the ImmutableString type had two requirements:

  1. Store literal strings as a pointer.
  2. Store non-literal strings as a ref-counted copy.

My actual implementation was templated to allow it to be used with both a char and wchar_t type, however for simplicity this version will only support a char type.

Let’s take a look at requirement #2 first. This was solved using an internal proxy object to store the string contents along side a ref-count; this proxy also had a 64-byte scratch buffer to avoid having to allocate from the heap for relatively small strings (64 worked well for my use-case; you may want to adjust this value, or even remove the scratch buffer entirely depending on your requirements).

struct StringProxy
{
	StringProxy(
		const char *const str,
		const size_t strLen
		)
		: m_str(nullptr)
		, m_refCount(1)
	{
		const size_t strBufLen = strLen + 1;
		if(strBufLen <= _countof(m_strScratch))
		{
			strcpy_s(m_strScratch, _countof(m_strScratch), str);
			m_str = m_strScratch;
		}
		else
		{
			char *const tmpStr = new char[strBufLen];
			strcpy_s(tmpStr, strBufLen, str);
			m_str = tmpStr;
		}
	}

	~StringProxy()
	{
		if(m_str != m_strScratch)
			SafeArrayDelete(m_str);
	}

	void AddRef()
	{
		++m_refCount;
	}

	void Release()
	{
		if(!--m_refCount)
			delete this;
	}

	const char *m_str;
	short       m_refCount; // Should probably be an atomic
	char        m_strScratch[64];
};

The above code should be relatively self explanatory. Given a string, the constructor works out whether it will fit into the scratch buffer and either copies it into that buffer, or allocates a buffer from the heap and copies it into that. The destructor deletes “m_str” if its buffer was allocated from the heap, and AddRef and Release take care of the ref-count, deleting “this” when the ref-count reaches zero. _countof is a MSVC function which returns the size of an array.

Now let’s take a look at requirement #1. We can abuse the fact that a string literal has a type of const char[] to assume that any variable given to an ImmutableString of type const char[] is a string literal (or at the very least, a string that will persist longer than the ImmutableString instance) and therefore just store it as a pointer.

class ImmutableString
{
public:
	template <size_t StrLen>
	ImmutableString(
		const char (&str)[StrLen]
		)
		: m_str(str)
		, m_strLen(StrLen-1)
		, m_proxy(nullptr) // no need for a proxy as we don't have to delete the string
	{
	}

private:
	const char  *m_str;
	size_t       m_strLen;
	StringProxy *m_proxy;
};

That funky template syntax defines a constructor which only accepts an array of type const char, filling in “StrLen” with the size of the given array.

So that takes care of our string literal case, now what about the non-literal case? You might think that would be as simple as adding a constructor which accepts a const char* as well as the templated version above? Unfortunately this isn’t the case, as this causes the compiler to always favour the const char* variant and never use the templated variant for string literals.

To work around this, I found a solution on Stack Overflow which suggested using an intermediary struct as a constructor parameter. This overloads the compilers complexity cost calculation when constructing an ImmutableString from a string literal to have it favour the templated constructor, rather than try and convert the const char[] to a const char* to pass into the intermediary struct.

//! Used as an intermediary:
//! http://stackoverflow.com/questions/2041355/c-constructor-accepting-only-a-string-literal
struct ConstructionWrapper
{
	ConstructionWrapper(
		const char *const str
		)
		: m_str(str)
	{
	}

	const char *m_str;
};
ImmutableString(
	ConstructionWrapper strWrapper
	)
	: m_str(nullptr)
	, m_strLen(strlen(strWrapper.m_str))
	, m_proxy(new StringProxy(strWrapper.m_str, m_strLen))
{
	m_str = m_proxy->m_str;
}

That’s really it; the class now solves both of our requirements. The rest of the code in the ImmutableString class dealt with maintaining the ref-count when copying the string (and moving it in C++11), as well as ensuring the proxy object was correctly destroyed.

The full source code for ImmutableString is available here, along with a simple test case which shows how the constructors and copy/move operators work.

 
comments powered by Disqus