Home | About | Resources | Contact | Partners
MUST Logo

Layout for printing

Download Free MUST Trial Today



"MUST allows our business to use Access to flexibly RAD our systems and then easily move production systems to SQL Server"
Karen Spencer, Partner and Business Manager,
Gravitas IT solutions

"FANTASTIC! What was going to take me hours hand (re)coding Stored Procs just worked..."
David Peach, I.T. Management & Support Services,
Qbt Consulting

"It has saved me a lot of time"
Julian Baker,
The Access Man



Using UNPIVOT to normalize mapped data sets in SQL Server (2nd November 2011)

In this article I want to describe a technique in SQL Server, which can be used to remap data that would otherwise be computationally expensive to perform in Access with VBA or in .NET. In our own applications we have reduced processing time from typically 20 minutes to around 5 seconds using this technique on larger data sets.

The sample script file for this article will allow you to create the denormalised data set using the SQL Server Northwind Database and then normalise the data, but we are going to skip the steps in creating the denormalised data (we will return to this later in the article) and head straight for the solution to this difficult problem.

The denormalized sample data set

We start with the following data set:-



Here, we have a classification of data which means that if a new category was added, we would have to restructure our table to have a new field, and what we would like to do is have the data normalised, so we can easily add new categories, and still return to the above presentation of the data using for exampl a SQL Server PIVOT (or Access Crosstab).

We will start by showing the final structure that we want to construct, below we have created a normalised table for the data:-

CREATE TABLE tblSalesSummaryNormalised(
  PK_Summary     INT IDENTITY(1,1) PRIMARY KEY,
  OrderId                INT,
  CategoryName   VARCHAR(15),
  OrderValue          DECIMAL(16,2)
)
GO

This table would present the normalized data as shown below (once populated with data):-



The Mapping Table

In a more complex situation you could need to remap the column names (we will show this later), but to keep things simple we choose to remap the data to the same column names as field values. We start with a mapping table as shown below, which contains field values matching the old column values:-



UNPIVOT

Not only does SQL Server provide a feature to PIVOT (CROSSTAB) data, but it also offers the UNPIVOT syntax to in part reverse the operation.

The following view can be used to UNPIVOT (normalize) the data:-

-- Processing Statement to select our data
CREATE VIEW UnPivot_tblSalesSummary_vw
AS
SELECT
    Q1.OrderID,
    CategoryName,
    Q1.Value As OrderValue
FROM tblCategoriesMap
 INNER JOIN
  (SELECT
  OrderID,
  tblPivot.Property,
  tblPivot.Value
  FROM
    (SELECT
      OrderID,
      CONVERT(decimal(16,2),[Beverages]) As [Beverages],
      CONVERT(decimal(16,2),[Condiments]) As [Condiments],
      CONVERT(decimal(16,2),[Confections]) As [Confections],
      CONVERT(decimal(16,2),[Dairy Products]) As [Dairy Products],
      CONVERT(decimal(16,2),[Grains/Cereals]) As [Grains/Cereals],
      CONVERT(decimal(16,2),[Meat/Poultry]) As [Meat/Poultry],
      CONVERT(decimal(16,2),[Produce]) As [Produce],
      CONVERT(decimal(16,2),[Seafood]) As [Seafood]
      FROM tblSalesSummary
      ) DataValues
  UNPIVOT (Value FOR Property
      IN([Beverages],
      [Condiments],
      [Confections],
      [Dairy Products],
      [Grains/Cereals],
      [Meat/Poultry],
      [Produce],
      [Seafood])
  ) as tblPivot
) Q1
ON tblCategoriesMap.CategoryName = Q1.Property
WHERE Q1.Value IS NOT NULL
GO

This results in the following data set:-



Then to populate the now normalised table we use the following:-

INSERT INTO tblSalesSummaryNormalised(
  OrderId,
  CategoryName,
  OrderValue)

SELECT OrderID,CategoryName,OrderValue
FROM UnPivot_tblSalesSummary_vw
ORDER BY OrderID
GO

How we used the PIVOT operation to get the denormalized data set

It is worth taking a moment to consider how we generated our demoralised data set, for that operation we used the PIVOT statement:-

-- PIVOT (CROSSTAB) the data
CREATE VIEW SummaryOfSales_vw
AS
SELECT   OrderID,
  [Beverages],
  [Condiments],
  [Confections],
  [Dairy Products],
  [Grains/Cereals],
  [Meat/Poultry],
  [Produce],[Seafood]
FROM
    (SELECT       OrderId,       CategoryName,       LineItemValue
      FROM ProductInfo_vw       ) AS SourceTable
PIVOT
(
SUM(LineItemValue) FOR CategoryName IN ([Beverages],
    [Condiments],
    [Confections],
    [Dairy Products],
    [Grains/Cereals],
    [Meat/Poultry],
    [Produce],
    [Seafood])
) AS PivotTable
GO

Then we used the following to generate a table of data:-

SELECT * INTO tblSalesSummary
FROM SummaryOfSales_vw
GO

More complex remapping of data

To show how to extend the above example to a more advanced remap we have the following script where we add a 3 character column called ReMap:-

ALTER TABLE tblCategoriesMap ADD ReMap VARCHAR(3)
GO

UPDATE tblCategoriesMap
SET Remap = LEFT(CategoryName,3)
GO

This means that for example the column name Dairy Products would be mapped to a value Dai, we have used a simple 3 character code system, but in your applicatuion the remapping could be significantly different, for example mapping a name to a code ( Dairy Products becomes 001).

We need to create a new results table:-

CREATE TABLE tblSalesSummaryNormalisedRemap(
  PK_Summar    INT IDENTITY(1,1) PRIMARY KEY,
  OrderId    INT,
  ReMap    VARCHAR(3),
  OrderValue    DECIMAL(16,2)
)
GO

Then we create a view to normalise the data set:-

CREATE VIEW UnPivot_tblSalesSummaryRemap_vw
AS
SELECT
    Q1.OrderID,
    Remap,
    Q1.Value As OrderValue
FROM tblCategoriesMap
  INNER JOIN
  (SELECT
    OrderID,
    tblPivot.Property,
    tblPivot.Value
    FROM
      (SELECT
         OrderID,
        CONVERT(decimal(16,2),[Beverages]) As [Beverages],
        CONVERT(decimal(16,2),[Condiments]) As [Condiments],
        CONVERT(decimal(16,2),[Confections]) As [Confections],
        CONVERT(decimal(16,2),[Dairy Products]) As [Dairy Products],
        CONVERT(decimal(16,2),[Grains/Cereals]) As [Grains/Cereals],
        CONVERT(decimal(16,2),[Meat/Poultry]) As [Meat/Poultry],
        CONVERT(decimal(16,2),[Produce]) As [Produce],
        CONVERT(decimal(16,2),[Seafood]) As [Seafood]
      FROM
        tblSalesSummary
      ) DataValues
      UNPIVOT (Value FOR Property
      IN([Beverages],
        [Condiments],
        [Confections],
        [Dairy Products],
        [Grains/Cereals],
        [Meat/Poultry],
        [Produce],
        [Seafood])
    ) AS tblPivot
  ) Q1
ON tblCategoriesMap.CategoryName = Q1.Property
WHERE Q1.Value IS NOT NULL
GO

-- Putting this together to generate the new data

INSERT INTO tblSalesSummaryNormalisedRemap(
  OrderId,
  ReMap,
  OrderValue)

SELECT OrderID,ReMap,OrderValue
FROM UnPivot_tblSalesSummaryReMap_vw
ORDER BY OrderID
GO

This then results in the following remapped data set:-



VBA Recordset operations

If we had to write a solution to this problem in VBA, then we would take advantage of the ability to use a field value in one recordset, to lookup a field name in another recordset, the key line in the following code is:-

varData = rstDenormalised(rstMap!CategoryName)

The UNPIVOT code in this article illustrates how to achieve a similar effect in SQL Server. Below we have included an example of VBA code to remap the data (the demo database is included in the download).

Sub upv_RecordSetOnAnARecordset()
  Dim db As Database
  Dim rstDenormalised As Recordset
  Dim rstMap As Recordset
  Dim rstNormalised As Recordset
  Dim varData As Variant
  Set db = CurrentDb
  Set rstDenormalised = _
      db.OpenRecordset("tblSalesSummary", dbOpenDynaset)
  Set rstMap = _
      db.OpenRecordset("tblCategoriesMap", dbOpenDynaset)
  Set rstNormalised = _
      db.OpenRecordset("tblSalesSummaryNormalisedRemap", _
                  dbOpenDynaset)

  Do While Not rstDenormalised.EOF
    ' Search accross each field in the map
    rstMap.MoveFirst
      ' get the value from each category
      varData = rstDenormalised(rstMap!CategoryName)
      If Not Nz(varData) = 0 Then
        With rstNormalised
          .AddNew
          !OrderId = rstDenormalised!OrderId
          !ReMap = rstMap!ReMap
          !OrderValue = varData
          .Update
        End With
      rstMap.MoveNext
    Loop
    rstDenormalised.MoveNext
  Loop
  MsgBox "Completed"
End Sub

Download Code

The sample TSQL can be downloaded from here.

MUST+SQL allows you to upsize into multiple schemas splitting up an application into manageable chunks or bringing together multiple databases and simplifying security. Click below to improve your development productivity.



Top of Page