String comparison performance, and regular expression, in vb.net

by Svelmoe 26. May 2009 16:50

Having spent some time optimizing processes lately, I decided to run a quick benchmark of the various string comparisons to see how they performed against each other.
The ones I was interested in were String.Equals and String.Compare, so I decided to do some quick testing.

I made a WinForm and used the following code to test with:

        ' String.Equals, invariant culture, ignoring case
        For i As Integer = 0 To 100000
            If String.Equals("item", "ITEM", StringComparison.InvariantCultureIgnoreCase) Then
                Debug.WriteLine("yes")
            End If
        Next
        ' String.Equals, current culture, ignoring case
        For i As Integer = 0 To 100000
            If String.Equals("item", "ITEM", StringComparison.CurrentCultureIgnoreCase) Then
                Debug.WriteLine("yes")
            End If
        Next
        ' String.Equals, ordinal comparison, ignoring case
        For i As Integer = 0 To 100000
            If String.Equals("item", "ITEM", StringComparison.OrdinalIgnoreCase) Then
                Debug.WriteLine("yes")
            End If
        Next
        ' String.Compare with ignoreCase = True (uses the current culture)
        For i As Integer = 0 To 100000
            If String.Compare("item", "ITEM", True) = 0 Then
                Debug.WriteLine("yes")
            End If
        Next
        ' String.CompareOrdinal is case-sensitive, so both strings are lower case here
        For i As Integer = 0 To 100000
            If String.CompareOrdinal("item", "item") = 0 Then
                Debug.WriteLine("yes")
            End If
        Next
        ' Compiled, case-insensitive regular expression, created once outside the loop
        Dim regex As New System.Text.RegularExpressions.Regex("item", System.Text.RegularExpressions.RegexOptions.Compiled Or System.Text.RegularExpressions.RegexOptions.IgnoreCase)
        For i As Integer = 0 To 100000
            If regex.IsMatch("item") Then
                Debug.WriteLine("yes")
            End If
        Next


The specifics of the code aren’t that interesting; the point was to produce comparable numbers, so I just compare one string, “item”, to “ITEM” (except for CompareOrdinal, which is case-sensitive and is mostly there as a baseline).

Anyway, I was quite surprised by the numbers when I ran the code through RedGate’s ANTS performance profiler.



For String.Equals with the various options, the following numbers were measured:
StringComparison.InvariantCultureIgnoreCase took 51.490ms (the actual number is irrelevant; it is the relative difference I’m interested in).

StringComparison.CurrentCultureIgnoreCase took 65.747ms, which is close enough to the above that I’ll call the two similar.

StringComparison.OrdinalIgnoreCase, however, took only 12.378ms for the same number of iterations and true checks. This surprised me a bit.

String.Compare took 66.004ms, which is similar to CurrentCultureIgnoreCase. That makes sense, since this overload of String.Compare compares using the current culture.

CompareOrdinal took only 8.056ms, but it is of course case-sensitive, so it doesn’t quite fit this situation (unless normalizing the case first, for example with ToUpper, costs less than the difference).

Regex.IsMatch took a staggering 375.127ms with the Compiled and IgnoreCase options, but then again, you shouldn’t use regular expressions for something as trivial as this anyway. Regular expressions are a very powerful tool in their own right; this was just for comparison.

The big surprise is how large the difference is between using StringComparison.OrdinalIgnoreCase in String.Equals and the other options. I’ll refer to the MSDN documentation for the specifics of the different types (http://msdn.microsoft.com/en-us/library/system.stringcomparison.aspx), but in short, Ordinal, as the name says, compares the character values themselves and ignores culture-specific matching rules.

So if that is acceptable in a given situation, String.Equals with OrdinalIgnoreCase was by far the fastest of the case-insensitive options I tested.
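
As a minimal sketch of what that switch looks like in code, the snippet below wraps the ordinal comparison in a small helper. The module name and the SameText helper are hypothetical names made up for this illustration; only String.Equals, String.Compare and StringComparison come from the framework.

```vbnet
Imports System

Module OrdinalCompareSketch
    ' Hypothetical helper: case-insensitive equality without any culture rules.
    Function SameText(a As String, b As String) As Boolean
        Return String.Equals(a, b, StringComparison.OrdinalIgnoreCase)
    End Function

    Sub Main()
        ' Culture-sensitive form: the result depends on the current culture.
        Console.WriteLine(String.Compare("item", "ITEM", True) = 0)

        ' Ordinal form: the same answer regardless of culture settings.
        Console.WriteLine(SameText("item", "ITEM"))

        ' The static String.Equals accepts Nothing, so no separate null check is needed.
        Console.WriteLine(SameText(Nothing, "ITEM"))
    End Sub
End Module
```

The first line may print a different result under some cultures (Turkish casing of "i" is the usual example), which is exactly the kind of culture-specific matching the ordinal comparison skips.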

Again, note that the absolute numbers aren’t important. The relative difference persisted when I ran the code several times, with OrdinalIgnoreCase winning out significantly each time.
In a production function of mine, I was able to shave several percent off the execution time simply by switching from String.Compare to String.Equals with the OrdinalIgnoreCase option.
That is an easy performance gain in my book.
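
For anyone who wants to check the relative difference without a profiler, here is a rough sketch that times the same comparisons with System.Diagnostics.Stopwatch. This is a swapped-in technique, not how the numbers above were measured, so the absolute values will not match them. The module, TimeIt and Report are hypothetical names for this example, and the Debug.WriteLine calls are left out so that only the comparison itself is timed.

```vbnet
Imports System
Imports System.Diagnostics

Module CompareTimingSketch
    Const Iterations As Integer = 100000

    ' Runs one comparison Iterations times and returns the elapsed milliseconds.
    Function TimeIt(compare As Func(Of Boolean)) As Double
        Dim hits As Integer = 0
        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            ' Count the true results so the comparison result is actually used.
            If compare.Invoke() Then hits += 1
        Next
        watch.Stop()
        Return watch.Elapsed.TotalMilliseconds
    End Function

    ' Prints one labelled timing.
    Sub Report(label As String, compare As Func(Of Boolean))
        Console.WriteLine("{0}: {1:F3} ms", label, TimeIt(compare))
    End Sub

    Sub Main()
        Report("Equals InvariantCultureIgnoreCase",
               Function() String.Equals("item", "ITEM", StringComparison.InvariantCultureIgnoreCase))
        Report("Equals CurrentCultureIgnoreCase",
               Function() String.Equals("item", "ITEM", StringComparison.CurrentCultureIgnoreCase))
        Report("Equals OrdinalIgnoreCase",
               Function() String.Equals("item", "ITEM", StringComparison.OrdinalIgnoreCase))
        Report("Compare ignoreCase",
               Function() String.Compare("item", "ITEM", True) = 0)
        Report("CompareOrdinal",
               Function() String.CompareOrdinal("item", "item") = 0)
    End Sub
End Module
```

Run it a few times and compare the ratios rather than the raw numbers; the first call of each method also pays for JIT compilation, so a warm-up pass would make the figures steadier.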

Email validate

by Svelmoe 12. September 2007 13:57

This is a semi-strong validation of e-mail addresses that also allows Danish national characters. It is not perfect, but decent if you need a reasonably solid validation of an e-mail address, and I am constantly updating it for personal use.

Note: Beware of the line breaks added; the pattern must be joined into a single line before use.

^[0-9a-zA-ZæøåÆØÅ][\w\.\-_æøåÆØÅ]*[a-zA-Z0-9æøåÆØÅ\.\-]@
[a-zA-Z0-9æøåÆØÅ]([\w\.\-_æøåÆØÅ][a-zA-Z0-9æøåÆØÅ])*\.
[a-zA-ZæøåÆØÅ][a-zA-ZæøåÆØÅ\.]*[a-zA-Z]$


Yes, I know it is far from perfect, but it works for the purposes I run into.
Also remember that sometimes a loose validation is much better if you can't validate completely. However, there are times when a stronger, more restrictive validation is useful.

 

At least for me :) 
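
A minimal sketch of using the pattern from VB.NET, with the three lines joined into one string. The module name and IsValidEmail are hypothetical names for this example, and the second line of the pattern is written with an opening `[` directly after the `@`, which is my assumption about how the pattern is meant to read.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EmailValidateSketch
    ' The three pattern lines joined into a single string (no line breaks).
    ' Assumption: the part after "@" starts with an opening "[".
    Private Const EmailPattern As String =
        "^[0-9a-zA-ZæøåÆØÅ][\w\.\-_æøåÆØÅ]*[a-zA-Z0-9æøåÆØÅ\.\-]@" &
        "[a-zA-Z0-9æøåÆØÅ]([\w\.\-_æøåÆØÅ][a-zA-Z0-9æøåÆØÅ])*\." &
        "[a-zA-ZæøåÆØÅ][a-zA-ZæøåÆØÅ\.]*[a-zA-Z]$"

    ' Built once and reused for every check.
    Private ReadOnly EmailRegex As New Regex(EmailPattern)

    ' Hypothetical helper: True when the candidate matches the pattern.
    Function IsValidEmail(candidate As String) As Boolean
        Return candidate IsNot Nothing AndAlso EmailRegex.IsMatch(candidate)
    End Function

    Sub Main()
        Console.WriteLine(IsValidEmail("svelmoe@example.com"))     ' True
        Console.WriteLine(IsValidEmail("søren@rødgrød.dk"))        ' True
        Console.WriteLine(IsValidEmail("a@example.com"))           ' False: needs two characters before the @
        Console.WriteLine(IsValidEmail("no-at-sign.example.com"))  ' False: no @ at all
        Console.WriteLine(IsValidEmail("info@test.com"))           ' False: see the note below
    End Sub
End Module
```

One thing the last case shows: because the group after the `@` consumes characters two at a time, the pattern as written rejects addresses such as info@test.com, where the final dot sits an even number of characters after the `@`. That is worth knowing before relying on it for stricter validation.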

Find HTML tags

by Svelmoe 30. August 2007 14:01

Some regular expressions for finding HTML tags in code.

They are, as is always the case with regular expressions in my view, not perfect, but they'll do for fast, simple parsing.

Find HTML tags:
<[^>]*>

This can easily be modified to find HTML tags except for anchor tags (for example):
<(?!/?[aA]\b)[^>]*>
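
Below is a small sketch of both searches in VB.NET. The module name, FindTags and the sample HTML are made up for this example. The anchor-skipping pattern used here relies on a negative lookahead, `<(?!/?a\b)[^>]*>`, combined with RegexOptions.IgnoreCase; that choice is an assumption for illustration, made because a character class such as `[^>aA]` also rejects every tag that merely contains the letter a (`<table>`, `<span>` and so on).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HtmlTagSketch
    ' Hypothetical helper: returns every match of the pattern in the HTML text.
    Function FindTags(html As String, pattern As String) As String()
        Dim found As MatchCollection = Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
        Dim tags(found.Count - 1) As String
        For i As Integer = 0 To found.Count - 1
            tags(i) = found(i).Value
        Next
        Return tags
    End Function

    Sub Main()
        Dim html As String = "<p>See <a href=""x.html"">this</a> <b>page</b></p>"

        ' All tags.
        Console.WriteLine(String.Join(" ", FindTags(html, "<[^>]*>")))
        ' prints: <p> <a href="x.html"> </a> <b> </b> </p>

        ' All tags except opening and closing anchor tags.
        Console.WriteLine(String.Join(" ", FindTags(html, "<(?!/?a\b)[^>]*>")))
        ' prints: <p> <b> </b> </p>
    End Sub
End Module
```

Like the patterns themselves, this is only meant for quick and simple parsing; a `>` inside an attribute value or an HTML comment will confuse it.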

 

Numbers in regular expression.

by Svelmoe 23. August 2007 14:04

Made these regular expressions some time ago to parse/validate numbers in text.

Validate a number with comma as decimal separator (at most two decimals) and period as group separator:
^(\d+|([1-9]|[1-9]\d|[1-9]\d{2})(\.\d{3})*)(,\d{1,2})?$

Validate a number with period as decimal separator (at most two decimals) and comma as group separator:
^(\d+|([1-9]|[1-9]\d|[1-9]\d{2})(,\d{3})*)(\.\d{1,2})?$


Hope they are useful.
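
A short sketch of how the two patterns might be wrapped for validation in VB.NET. The module and function names are hypothetical; the patterns are the two shown above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NumberValidateSketch
    ' Comma as decimal separator, period as group separator (for example 1.234,56).
    Private ReadOnly CommaDecimal As New Regex("^(\d+|([1-9]|[1-9]\d|[1-9]\d{2})(\.\d{3})*)(,\d{1,2})?$")

    ' Period as decimal separator, comma as group separator (for example 1,234.56).
    Private ReadOnly PeriodDecimal As New Regex("^(\d+|([1-9]|[1-9]\d|[1-9]\d{2})(,\d{3})*)(\.\d{1,2})?$")

    Function IsCommaDecimalNumber(text As String) As Boolean
        Return text IsNot Nothing AndAlso CommaDecimal.IsMatch(text)
    End Function

    Function IsPeriodDecimalNumber(text As String) As Boolean
        Return text IsNot Nothing AndAlso PeriodDecimal.IsMatch(text)
    End Function

    Sub Main()
        Console.WriteLine(IsCommaDecimalNumber("1.234.567,89"))  ' True
        Console.WriteLine(IsCommaDecimalNumber("1234,5"))        ' True: grouping is optional
        Console.WriteLine(IsCommaDecimalNumber("12.34,56"))      ' False: groups must have three digits
        Console.WriteLine(IsPeriodDecimalNumber("1,234.56"))     ' True
        Console.WriteLine(IsPeriodDecimalNumber("1,234.567"))    ' False: at most two decimals
    End Sub
End Module
```

The patterns only validate the text; converting it to a number afterwards still needs a parse with the matching culture or separators.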

 

About Svelmoe

My real name is Allan Svelmøe Hansen.

I live in Denmark, where I have worked as a developer for hedal:kruse:brohus, using SQL Server and the .NET framework, since 2004. Svelmoe.dk is a place for my everyday thoughts and reactions and the occasional technical blog entry.

I also blog about SQL and MS SQL Server at www.execsql.com, so if you are looking for more on that, please visit that website.




Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010 Svelmoe.dk