Articles

Current Articles | Categories | Search | Syndication

By Andrew Novick @ Monday, May 26, 2008 12:00 AM :: 135 Views :: 0 Comments

This article was originally published on the author's site at: http://www.novicksoftware.com/Articles/sql-2005-product-user-defined-aggregate.htm

Where you can find the code as a download. 

You may also be interested in the related discussion that has a SQL implementaion of the aggregate provided by Joe Celko at: A Note on SQL Aggregates    

SQL Server doesn't have a Product aggregate function.  A recent blog post by Karen Watterson pointed out that Microsoft Knowledge Base article Q89656: Simulating a PRODUCT() Aggregate Function discusses how to achieve the Product aggregate functionality using the POWER function.  We e-mailed about it and she challenged me to create a Product aggregate in SQL Server 2005.  I decided to accept the challenge. 

SQL Server 2005 allows the creation of User Defined Aggregates in VB.Net or C#.   I started with a new  Visual Basic/Database type project and I selected SQL Server Project.  That created the project without any classes.  The next step is to add a User Defined Aggregate Class to the project and code it.  So here it is, my first attempt at a udagg:

Imports System
Imports System.Data.Sql
Imports System.Data.SqlTypes
Imports System.Runtime.InteropServices

''' <summary>
''' Sample user-defined aggregate that takes the product
''' of multiple values.  NULLs are ignored, except that if there
''' are no non-null values, the result is NULL.
'''
</summary>
''' <remarks>This is sample code created with a Beta version of
''' Visual Studio 2005. Do not use this in a production application.
'''
''' by Andrew Novick  http://www.NovickSoftware.com
'''
</remarks>
<Serializable()> _
<StructLayout(LayoutKind.Sequential)> _
<SqlUserDefinedAggregate(Format.Native, IsInvariantToNulls:=True, _
                    IsInvariantToOrder:=True, IsNullIfEmpty:=True, _
                    IsInvariantToDuplicates:=False)> _
Public Class Product

    Private m_Accumulator As SqlTypes.SqlDouble
    Private m_HasNonNull As Boolean ' Tracks that any non-nulls
    ' have been passed to the Accumulate or Merge methods.

    ''' <summary>
    ''' Called once when the class is being initialized or reused.
    ''' </summary>
    ''' <remarks></remarks>
    Public Sub Init()
        m_Accumulator = 1.0
        m_HasNonNull = False
    End Sub

    ''' <summary>
    ''' Called once for each row.  value could be NULL.
    ''' </summary>
    ''' <param name="value">New value to aggregate by
    ''' multiplication</param>
    ''' <remarks></remarks>
    Public Sub Accumulate(ByVal value As SqlDouble)
     
       if value.isnull then exit sub     

        m_Accumulator *= value
        m_HasNonNull = True
    End Sub

    ''' <summary>
    ''' Joins two Product aggregates.
    ''' </summary>
    ''' <param name="value">The Product aggregate to join to this one.</param>
    ''' <remarks></remarks>
    Public Sub Merge(ByVal value As Product)

        if value is nothing then exit sub
        if not value.m_HasNonNull then exit sub
        m_Accumulator *= value.m_Accumulator
        m_HasNonNull = True
       
    End Sub

    ''' <summary>
    ''' Returns the result.
    ''' </summary>
    ''' <returns>Product of all non-null arguments.</returns>
    ''' <remarks></remarks>
    Public Function Terminate() As SqlDouble
        if not m_HasNonNull then
            return sqldouble.null
        else
            Return m_Accumulator
        end if
    End Function
End
Class

A user defined aggregate must implement four methods:

Init Initializes the aggregate and reinitializes it when SQL Server chooses to reuse an aggregate class instead of creating a new one. The Product.Init method resets m_Accumulator to 1 and the m_HasNonNull boolean back to false.  These put the aggregate back to it starting point.
Accumulate Called once per row to be aggregated. Product.Aggregate keeps track of non-NULL arguments and does the multiplication.
Merge SQL Server may decide to use multiple threads to perform aggregation and each would use its own ugagg class.  Then the results accumulated into one of the classes is merged into another to combine the results.  Notice that Product.Merge is handed a Product aggregate, not a SQLDouble.  Private members are available to it, which makes the merge process simple.
Terminate Finally, the terminate method returns the results.  Product.Terminate checks to see if any non-NULL values were aggregated.  If not, it returns null.  If non-NULLs were encountered, the product is returned.

There are a couple of choices that have to be made in the design of the Product aggregate:

  • What sort of precision should be maintained?

  • How to handle NULLs?

I chose to use SQLDouble as the data type for the aggregate.  For starters, There's only a limited choice of types that can be when using the Format.Native attribute.  SQLDouble is one of them.  Without Format.Native, the aggregate must implement the IBinarySerializable interface and I preferred not to do that for a first cut at a udagg.

The artribute IsNullIfEmpty:=True covers the case where there are no values to aggregate by returning NULL.  But what's the Product of only NULL values?  I decided it was NULL.  I suppose that an argument could be made for 1.  After all, one raised to the zero power is one.  And you might even argue for zero.  In the end, I decided to mimic the behavior of the SUM aggregate which returns NULL when given only NULL values.  However, the point of udagg's is that these decisions are now in our hands.

Visual Studio 2005 takes care of deploying the assembly to SQL Server 2005 (f.k.a Yukon).  If the assembly is added to SQL Server manually, a CREATE AGGREGATE statement is required.  Here's a sample of what it looks like with an appropriate GRANT statement:

CREATE AGGREGATE [Product](@Value float)
                             Returns [float]
       EXTERNAL NAME [SampleAggregate].[SampleAggregate.Product]
go
GRANT EXEC ON dbo.Product TO PUBLIC
go

After using the Build -> Deploy menu command you're ready to try out Product.  My first instinct was to run my tests in the the new SQL Server Management Studio, which takes over the function of Query Analyzer, but I decided to stick to Visual Studio.  My understanding is that they're actually the same program, just configured differently.  The Visual Studio 2005 Solution Explorer has a node for TestScripts.  All you have to do is pick Add Test Script from its context menu.  I added a script called Test.sql.  Here it is:

-- Simple case with 3 integer sample numbers.
SELECT dbo.product(sample) [Three integers]
    FROM (           SELECT CAST(1.0 as float) as Sample
           UNION ALL SELECT 3
           UNION ALL SELECT 5
         ) Numbers

-- Case with 3 floating point sample numbers and a NULL value
SELECT    dbo.Product(Sample) [Three floats and one NULL]
    FROM (           SELECT CAST(1.745 as float) as Sample
           UNION ALL SELECT 20.7
           UNION ALL SELECT NULL -- <<< NULL value
           UNION ALL SELECT 23.4
         ) Numbers

-- Case of just a NULL value.
SELECT dbo.product(sample) [Just a NULL]
    FROM (           SELECT CAST(NULL as float) as Sample
         ) Numbers

I set Test.sql as the "Default Debug Script" using its context menu and pressed F5, equivalent to Debug->Start.  Here are the results:

(1 row(s) affected)
Three integers 
---------------
15             
(1 row(s) returned)
(1 row(s) affected)
Three floats and one NULL
---------------
845.2431       
(1 row(s) returned)
(1 row(s) affected)
Just a NULL    
---------------
1              
(1 row(s) returned)
Finished running sp_executesql.

There seem to be a few more system messages than in SQL Server 2005 than in 2000 but everything seems in order.  Integers were converted to float by SQL Server.  NULLs were handled the way that I intended.  It worked.

Okay.  It didn't really happen so easily.  I spent a fair amount of time going back and forth between the documentation and the book A First Look at SQL Server 2005 for Developers and my sample code.  All-in-all it wasn't that difficult.

SQL CLR User Defined Aggregate SQL 2005 2008 Andrew Novick

Posted @ Monday, May 26, 2008 12:00 AM by Andrew Novick
Previous Page | Next Page
Comments
Currently, there are no comments. Be the first to post one!
You must be logged in to post a comment. You can login here

Survey

Which of the following CLR objects are you currently using in SQL Server?




Submit Survey  View Results

Links


Copyright 2007 by SQLCLR.net Terms Of Use Privacy Statement
Website graphics provided by Matt Green Designs
Page generated in 0.7343609 seconds.

All information and example code on this site is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from its use.

This site is in no way affiliated with Microsoft. Unless specifically stated otherwise, nothing should be construed to represent the official positions or opinions of Microsoft and/or its Employees.