BP(K*D) = i

March 25, 2010

Hey readers,  here is an excerpt from another blogger Jeff McQuigg

The full post can be found here.

I think the most interesting part about this post is discerning the difference between BI and DW.  It reminds me of the telescope/microscope analogy:  BI is about making reactive and more importantly proactive business decisions.  While some BI may involve analyzing historical trends, the newer the data, the more relevant it is.

Read the rest of this entry »

Data Quality Part II

February 26, 2010

In Part I of this series, I showed how to create rules around data in a table. This post will expand on the idea of data quality and the need for data quality integration in your systems and applications. Ideally, a good data quality solution will help to reduce the amount of ETL (and associated overhead) required to incorporate data sets between applications, and will help to create data that is both efficient and valuable to your organization. Read the rest of this entry »

Extract Variable Records to the Same File

January 22, 2010

Hello my faithful reader(s).  I have been away for a while but do not despair:  I am back and 2010 promises some really cool posts.  Today’s post centers on extracting variable record types to a single flat file.  This applies to data warehouse type files that have  header, detail & trailer records somewhat similar to those I have seen for terradata and other warehouse applications.  There will also be a future post in this blog that will describe how to load this data type into your warehouse along with some built in intelligence.

Read the rest of this entry »

Parent-Child records in SSIS

December 1, 2009

Converting a File with Parent-Child Records in SSIS

This article talks about converting a file with Parent-Child records using SSIS.  Parent-child files have a header record (parent) with an undetermined amount of subsequent detail records(child).  The transformation extracts non-blank records, creates a key for each record and splits that record by type and then merges the records back together to create a flattened record with a 1-to-1 relationship. 

Read the article at:


Data Quality Part – I

December 1, 2009

Since the blog is named data integrity, I figured I would look up what “data integrity” truly meant.  In searching for the definition on google, data integrity came to be defined as ‘data that is complete or whole. All characteristics of the data including business rules, rules for how pieces of data relate, dates, definitions and lineage must be correct for data to be complete’.  With that in mind, I am going to start a number of posts on Data Quality and Data Profiling over the coming weeks. 

Read the rest of this entry »

Dynamic SQL using CHARINDEX

November 11, 2009

Recently I was called upon to create a new field from an existing field.  The specifications called for the new field to be composed of any data between two dashes.  The dashes could be at any positional value and the data between the dashes was not a consistent length.  

Read the rest of this entry »

Comparing tables with TABLEDIFF

October 30, 2009

Recently, I stumbled across a pretty cool utility called tablediff.  This utility is shipped standard with Microsoft SQL Server and is used to compare the data in two tables for non-convergence (dissimilarities).  The contingencies for this tool require that the tables being compared both have a primary key or a similar field.  This utility can be used from the command prompt or in a batch file which means it can be called from SSIS.   Read the rest of this entry »

Creating Dyanamic Email Content using SSIS

October 27, 2009

From time to time, I will be required to send emails with dynamic content to a number of users within an organization.  Typically, a user will want data in that email that is relevant to them or relevant to their subordinates for reporting/notification purposes.  Thankfully, SSIS can handle this task fairly easily.

Read the rest of this entry »

Connecting to an XLSX using SSIS

October 16, 2009

You have likely noticed that documents, templates, spreadsheets, and presentations that you create in the 2007 Office release are saved with new file-name extensions with an x at the end of the extension. For example, when you save a spreadsheet in Excel, the file now uses the .xlsx extension, instead of the .xls extension.  In the 2007 Office release, Microsoft has adopted and XML-based file format that is said to improve file compression and provide better integration and interoperability of data.  Unfortunately, SSIS cannot connect to an XLSX file using the Excel Connection Manager. 

Read the rest of this entry »

Zip Utility Compression Test

October 14, 2009

No matter how much storage space you have, it’s never seems to be enough.  This is when file compression can become crucial to saving space on your file system, moving files quickly between servers and performing day-to-day tasks.  With just a quick search there are an overwhelming number of file compression utilities on the market.  This post outlines a test of 3 popular utilities for both file compression and execution time.

Read the rest of this entry »