Wednesday, February 16, 2005

What is under the hood of .Net String.GetHashCode()?

Ever wonder what is the hashing algorithm used inside the .Net String class's GetHashCode() override? Well, here is the fruit of a couple of hours of my labour.
C++ (Unmanaged):
ULONG HashString(LPCWSTR szStr)
{
ULONG hash = 5381;
int c;

while ((c = *szStr) != 0)
{
hash = ((hash << 5) + hash) ^ c;
++szStr;
}
return hash;
}

C++ Managed:

int Class1::HashString(String *szStr)
{
int hash = 5381;
int c;
int len = szStr->Length;
for (int i = 0; i < len; i++)
{
char f = szStr->get_Chars(i);
int c = f;
if (c < 0) c += 256; // Some characters like '£', '¬' are returned with a negative values!
hash = ((hash << 5) + hash) ^ c;
}
return hash;
}

C#:
int HashString(string szStr)
{
int hash = 5381;
int c;
int i = 0;

while(i < szStr.Length)
{
c = (int)szStr[i];

hash = ((hash << 5) + hash) ^ c;
i++;
}
return hash;
}

Enjoy!