|
This article is to demonstrate how easy it is to write a primitive CLR regular expression function with C# and SQL Server 2005. We will solve the problem of separating numeric from nonnumeric characters. Let’s use the string ‘123abc45d[#*6’ as our example input. We want one of two results returned, either ‘123456’ or ‘abcd[#*’.
We start by creating a new SQL-CLR C# Project in Visual Studio. Next we add a new User-Defined Function and call it fctRegex. Then we copy the following code into our file.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlString fctRegex(string sRegex, string regType)
{
Regex regNumeric = new Regex("[0-9]*");
MatchCollection mcNumeric = regNumeric.Matches(sRegex);
string sTheRest = sRegex,
sMatch = "";
foreach (Match mt in mcNumeric)
{
sMatch += mt.Value.ToString();
}
sTheRest = regNumeric.Replace(sRegex, "");
if (regType == "int")
return new SqlString(sMatch);
else
return new SqlString(sTheRest);
}
};
|
When calling our function, we send in the string to be parsed and what type of information we want returned. Due to the simple nature of our example we return either the integers or everything excluded the integers. We use the built in Regular Expression namespace to handle our regular expressions and matching process. Once we have our regex set, we loop through the string searching for matches. Each match is stored in our variable sMatch. After the entire string has been searched we remove all occurrences of the regex pattern and store it in sTheRest. The type that was sent in with the function determines which variable is returned.
When we build our project it should add the necessary assembly to the database. From this point we can either create/modify the test.sql in our project or open up SQL Server Management Studio (SSMS) and test our new function. Here is a sample script:
select dbo. fctRegex( '342342132*fdsdf23sda5452bbppspe[s' , 'int' ),dbo.fctRegex ('342342132*fdsdf23sda5452bbppspe[s' , 'string' )
The result set we get back is
| (No column name) |
(No column name) |
| 342342132235452 |
*fdsdfsdabbppspe[s |
As you can see, this function is relatively easy to create. Without much work you can take the function above and create more complex regular expression patterns to search for.
|