Wednesday, February 16, 2005

What is under the hood of .Net String.GetHashCode()?

Ever wonder what is the hashing algorithm used inside the .Net String class's GetHashCode() override? Well, here is the fruit of a couple of hours of my labour.
C++ (Unmanaged):
ULONG HashString(LPCWSTR szStr)
{
ULONG hash = 5381;
int c;

while ((c = *szStr) != 0)
{
hash = ((hash << 5) + hash) ^ c;
++szStr;
}
return hash;
}

C++ Managed:

int Class1::HashString(String *szStr)
{
int hash = 5381;
int c;
int len = szStr->Length;
for (int i = 0; i < len; i++)
{
char f = szStr->get_Chars(i);
int c = f;
if (c < 0) c += 256; // Some characters like '£', '¬' are returned with a negative values!
hash = ((hash << 5) + hash) ^ c;
}
return hash;
}

C#:
int HashString(string szStr)
{
int hash = 5381;
int c;
int i = 0;

while(i < szStr.Length)
{
c = (int)szStr[i];

hash = ((hash << 5) + hash) ^ c;
i++;
}
return hash;
}

Enjoy!

3 comments:

Anonymous said...

home equity loan

Anonymous said...

That is very helpful.Is this possible to get this code in VB.NET?

C#:
int HashString(string szStr)
{
int hash = 5381;
int c;
int i = 0;

while(i < szStr.Length)
{
c = (int)szStr[i];

hash = ((hash << 5) + hash) ^ c;
i++;
}
return hash;
}

Anonymous said...

This is what I came up with for VB.NET. But since I needed to have compiler integer checks turned on, it's only the 24 least significant bits. Turn off compiler checks and remove the 24 bit mask for the full 32 bits.

Shared Function GetHashCode24Bit(ByVal text As String) As Integer
GetHashCode24Bit = 5381
Dim limit As Integer = text.Length - 1
Dim index As Integer
For index = 0 To limit
GetHashCode24Bit = (((GetHashCode24Bit * 32) + GetHashCode24Bit) Xor Convert.ToInt32(text.Chars(index))) And &HFFFFFF
Next
#If Debug Then
Dim textHashCodeDotNet1 As Integer = text.GetHashCode And &HFFFFFF
If GetHashCode24Bit <> textHashCodeDotNet1 Then
MsgBox("For text = " & text & ", GetHashCode24Bit = " & GetHashCode24Bit & ", but GetHashCode = " & textHashCodeDotNet1)
End If
#End If
End Function