In this article I want to describe a technique in SQL Server, which can be used to remap data that would otherwise
be computationally expensive to perform in Access with VBA or in .NET. In our own applications we have reduced processing time
from typically 20 minutes to around 5 seconds using this technique on larger data sets.
The sample script file for this article will allow you to create the denormalised data set using the SQL Server Northwind
Database and then normalise the data, but we are going to skip the steps in creating the denormalised data
(we will return to this later in the article) and head straight for the solution to this difficult problem.
The denormalized sample data set
We start with the following data set:-
Here, we have a classification of data which means that if a new category was added, we would have
to restructure our table to have a new field, and what we would like to do is have the data normalised, so we can easily add
new categories, and still return to the above presentation of the data using for exampl a SQL Server PIVOT (or Access Crosstab).
We will start by showing the final structure that we want to construct, below we have created a normalised table for the data:-
CREATE TABLE tblSalesSummaryNormalised(
PK_Summary INT IDENTITY(1,1) PRIMARY KEY,
OrderId INT,
CategoryName VARCHAR(15),
OrderValue DECIMAL(16,2)
)
GO
This table would present the normalized data as shown below (once populated with data):-
The Mapping Table
In a more complex situation you could need to remap the column names (we will show this later),
but to keep things simple we choose to remap the data to the same column names as field values.
We start with a mapping table as shown below, which contains field values matching the old column values:-
UNPIVOT
Not only does SQL Server provide a feature to PIVOT (CROSSTAB) data, but it also offers the UNPIVOT syntax to
in part reverse the operation.
The following view can be used to UNPIVOT (normalize) the data:-
-- Processing Statement to select our data
CREATE VIEW UnPivot_tblSalesSummary_vw
AS
SELECT
Q1.OrderID,
CategoryName,
Q1.Value As OrderValue
FROM tblCategoriesMap
INNER JOIN
(SELECT
OrderID,
tblPivot.Property,
tblPivot.Value
FROM
(SELECT
OrderID,
CONVERT(decimal(16,2),[Beverages]) As [Beverages],
CONVERT(decimal(16,2),[Condiments]) As [Condiments],
CONVERT(decimal(16,2),[Confections]) As [Confections],
CONVERT(decimal(16,2),[Dairy Products]) As [Dairy Products],
CONVERT(decimal(16,2),[Grains/Cereals]) As [Grains/Cereals],
CONVERT(decimal(16,2),[Meat/Poultry]) As [Meat/Poultry],
CONVERT(decimal(16,2),[Produce]) As [Produce],
CONVERT(decimal(16,2),[Seafood]) As [Seafood]
FROM tblSalesSummary
) DataValues
UNPIVOT (Value FOR Property
IN([Beverages],
[Condiments],
[Confections],
[Dairy Products],
[Grains/Cereals],
[Meat/Poultry],
[Produce],
[Seafood])
) as tblPivot
) Q1
ON tblCategoriesMap.CategoryName = Q1.Property
WHERE Q1.Value IS NOT NULL
GO
This results in the following data set:-
Then to populate the now normalised table we use the following:-
INSERT INTO tblSalesSummaryNormalised(
OrderId,
CategoryName,
OrderValue)
SELECT OrderID,CategoryName,OrderValue
FROM UnPivot_tblSalesSummary_vw
ORDER BY OrderID
GO
How we used the PIVOT operation to get the denormalized data set
It is worth taking a moment to consider how we generated our demoralised data set, for that operation we used the PIVOT statement:-
-- PIVOT (CROSSTAB) the data
CREATE VIEW SummaryOfSales_vw
AS
SELECT
OrderID,
[Beverages],
[Condiments],
[Confections],
[Dairy Products],
[Grains/Cereals],
[Meat/Poultry],
[Produce],[Seafood]
FROM
(SELECT
OrderId,
CategoryName,
LineItemValue
FROM ProductInfo_vw
) AS SourceTable
PIVOT
(
SUM(LineItemValue)
FOR CategoryName IN ([Beverages],
[Condiments],
[Confections],
[Dairy Products],
[Grains/Cereals],
[Meat/Poultry],
[Produce],
[Seafood])
) AS PivotTable
GO
Then we used the following to generate a table of data:-
SELECT * INTO tblSalesSummary
FROM SummaryOfSales_vw
GO
More complex remapping of data
To show how to extend the above example to a more advanced remap we have the following script
where we add a 3 character column called ReMap:-
ALTER TABLE tblCategoriesMap ADD ReMap VARCHAR(3)
GO
UPDATE tblCategoriesMap
SET Remap = LEFT(CategoryName,3)
GO
This means that for example the column name Dairy Products would be mapped to a value Dai, we have used a simple
3 character code system, but in your applicatuion the remapping could be significantly different, for example mapping a
name to a code ( Dairy Products becomes 001).
We need to create a new results table:-
CREATE TABLE tblSalesSummaryNormalisedRemap(
PK_Summar INT IDENTITY(1,1) PRIMARY KEY,
OrderId INT,
ReMap VARCHAR(3),
OrderValue DECIMAL(16,2)
)
GO
Then we create a view to normalise the data set:-
CREATE VIEW UnPivot_tblSalesSummaryRemap_vw
AS
SELECT
Q1.OrderID,
Remap,
Q1.Value As OrderValue
FROM tblCategoriesMap
INNER JOIN
(SELECT
OrderID,
tblPivot.Property,
tblPivot.Value
FROM
(SELECT
OrderID,
CONVERT(decimal(16,2),[Beverages]) As [Beverages],
CONVERT(decimal(16,2),[Condiments]) As [Condiments],
CONVERT(decimal(16,2),[Confections]) As [Confections],
CONVERT(decimal(16,2),[Dairy Products]) As [Dairy Products],
CONVERT(decimal(16,2),[Grains/Cereals]) As [Grains/Cereals],
CONVERT(decimal(16,2),[Meat/Poultry]) As [Meat/Poultry],
CONVERT(decimal(16,2),[Produce]) As [Produce],
CONVERT(decimal(16,2),[Seafood]) As [Seafood]
FROM
tblSalesSummary
) DataValues
UNPIVOT (Value FOR Property
IN([Beverages],
[Condiments],
[Confections],
[Dairy Products],
[Grains/Cereals],
[Meat/Poultry],
[Produce],
[Seafood])
) AS tblPivot
) Q1
ON tblCategoriesMap.CategoryName = Q1.Property
WHERE Q1.Value IS NOT NULL
GO
-- Putting this together to generate the new data
INSERT INTO tblSalesSummaryNormalisedRemap(
OrderId,
ReMap,
OrderValue)
SELECT OrderID,ReMap,OrderValue
FROM UnPivot_tblSalesSummaryReMap_vw
ORDER BY OrderID
GO
This then results in the following remapped data set:-
VBA Recordset operations
If we had to write a solution to this problem in VBA, then we would take advantage of the ability to use a field value in one recordset,
to lookup a field name in another recordset, the key line in the following code is:-
varData = rstDenormalised(rstMap!CategoryName)
The UNPIVOT code in this article illustrates how to achieve a similar effect in SQL Server.
Below we have included an example of VBA code to remap the data (the demo database is included in the download).
Sub upv_RecordSetOnAnARecordset()
Dim db As Database
Dim rstDenormalised As Recordset
Dim rstMap As Recordset
Dim rstNormalised As Recordset
Dim varData As Variant
Set db = CurrentDb
Set rstDenormalised = _
db.OpenRecordset("tblSalesSummary", dbOpenDynaset)
Set rstMap = _
db.OpenRecordset("tblCategoriesMap", dbOpenDynaset)
Set rstNormalised = _
db.OpenRecordset("tblSalesSummaryNormalisedRemap", _
dbOpenDynaset)
Do While Not rstDenormalised.EOF
' Search accross each field in the map
rstMap.MoveFirst
' get the value from each category
varData = rstDenormalised(rstMap!CategoryName)
If Not Nz(varData) = 0 Then
With rstNormalised
.AddNew
!OrderId = rstDenormalised!OrderId
!ReMap = rstMap!ReMap
!OrderValue = varData
.Update
End With
rstMap.MoveNext
Loop
rstDenormalised.MoveNext
Loop
MsgBox "Completed"
End Sub
Download Code
The sample TSQL
can be downloaded from here.
MUST+SQL allows you to upsize into multiple schemas splitting up an application into manageable chunks or
bringing together multiple databases and simplifying security. Click below to improve your
development productivity.
Top of Page