Data Reflections
Discussions of data warehousing, transformation and analysis by the Business Intelligence Team at Tallan.
Tallan

July 9, 2009

Understanding Metadata Management

Filed under: Analysis Services, Data Modeling, ETL, SQL Server, Uncategorized, Warehousingbgager @ 2:18 pm

Metadata Management- Fun stuff, right?  Depending on what your relationship to an organization’s data is, it can have many different meanings. Database developers and administrators tend to focus on technical metadata, which provides descriptive information on table structures, job streams, ETL mappings, data lineage, etc.

Business users have a need for metadata as well. Attribute definitions, business rules, formulas & calculations, and system definitions provide context to your data and help turn it into real information. Often in the form of a corporate glossary or data dictionary, business metadata can provide clarity to an organization.

One of the challenges of getting your organization to buy in on the need for Metadata Management is that it is often not well understood by business stakeholders. Have you ever been in a meeting that was called because of a dispute over the definition of a business term? Have you ever seen the same field, like “Transactions” for instance, listed on two different reports with different numbers? When business terminology is ambiguous and not well documented, productivity is lost and resources are wasted trying to understand and resolve such discrepancies. It also impacts the trust your people have in your data. And it happens over and over again in the average company. These are Metadata Management issues and they are costing you money.

In an effort to describe how Metadata Management impacts your organization, we have provided some resources that we hope will help. The first is a webinar that I have presented to businesses interested in understanding Metadata Management and some of the options available when implementing a solution. It is available here. This webinar describes the differences between business and technical metadata, outlines its role in the organization, and suggests a framework for creating a metadata repository from the ground up.

We have also posted a case study that explains what can go wrong in an organization that lacks a metadata management strategy. If you are wondering how expensive this oversight can be, the paper describes a project gone wrong because of the lack of available metadata. You may find similarities in your own company.

Half the battle with Metadata Management is in understanding what metadata is and the role it plays in your organization. We hope this content will help with that understanding and get you thinking about how much more efficient you can be with better data management.In future posts, we will get more into the nuts and bolts of implementing a solution.

April 10, 2009

Introduction to Analysis Services

Filed under: Analysis Services, Microsoft, SQL Server, Warehousingachandler @ 1:00 pm

This post is intended to introduce Analysis Services 2005/2008 foundational concepts.

SQL Server Analysis Services is a component included in the Microsoft SQL Server product, and its use is fully covered under the same license (which applies to Integration Services and Reporting Services as well). Like the database engine, SSAS has a range of features stratified by Express, Workgroup, Standard Edition and Enterprise Editions.

SSAS is an OLAP, multi-dimensional database. While a relational data warehouse can also be described as OLAP, products such SSAS, Cognos and Essbase have fundamental differences. These OLAP servers:

· Aggregate data from a variety of sources into a compressed format optimized for query response
· Emphasize end-user navigation with modeling capabilities such as hierarchies, or drill-paths
· Employ MDX as the standard query language for retrieving result sets
· Use XMLA as the standard communication mechanism between clients and servers
· Automate the aggregation of granular data

OLAP browsing tools, such as Excel Pivot tables, enable users to drag in measures (facts in DW terminology) and constrain and slice them by dimensional attributes (columns in a dimensional DW table). This is a typical browsing session of the AdventureWorks cube in Management Studio:


(more…)

December 4, 2008

Partitioned Fact Tables

Filed under: ETL, Warehousingachandler @ 1:06 pm

Once tables grow into the millions of records, they become candidates for partitioning. Table partitioning offers many benefits, particularly in warehouse environments. Since data is split into smaller units of storage, backups can target filegroups with a higher rate of change. Systems with multiple CPUs see improved query performance as partitioned data leads to greater parallelism. Perhaps most significant is the ability to swap in huge amounts of data by partition switching, an operation that is practically instantaneous.

Tables can be partitioned horizontally or vertically. With vertical partitioning, columns are split out into separate physical tables. This post focuses on horizontally partitioned tables, which take advantage of new constructs Microsoft added in the 2005 release - partition functions and partition schemes. Table partitioning is an Enterprise Edition only feature in SQL Server 2005 and 2008.
(more…)

September 23, 2008

Developing a Dimensional Model w/ conformed dimensions and facts

Filed under: Data Modeling, Warehousingocash @ 9:04 am

Conformed Dimensions and Facts

Data marts are often developed to represent important systems within a company.  Over time additional data marts are added, and eventually there is a desire extract data across multiple marts.  Extracting data across data marts can be cumbersome and some times impossible if the data marts were not designed to share common dimensions, also known as conformed dimensions. Ideally since the dimension criteria would be the same for each data mart, selecting data across multiple marts would be as routine as selecting data from a single data mart.

Going hand and hand with conformed dimensions, conformed facts involve standardizing facts across multiple marts.  Adopting conformed facts eliminates the ambiguity of having facts that possess the same name but have different underlying calculations.

In this blog we use simple data marts to demonstrate the use of both conformed dimensions and facts but the same concept applies to EDW’s and AS Cubes as well.

Now, to see how conformed dimensions and facts are applied during development lets start at the beginning with our ER diagram.

(more…)

September 8, 2008

SCD via SQL Stored Procedure

Filed under: ETL, Warehousingsrichards @ 7:37 am

We are going to revisit the issue of dealing with Slowly Changing Dimensions in a data warehouse. We have seen a demonstration of using the SCD transformation that is available in SQL Server Integration Services (SSIS); however, this is not always the best option for processing an SCD. When the volume of rows you’re dealing with is substantial, this creates a significant, and usually unacceptable, performance hit. Another valid reason why you may choose not to implement the SSIS transformation is flexibility. The SSIS transformation requires your begin and end dates to be datetime data types, which could prove to be inconvenient when your SCD needs to tie back to a date and/or time dimension. 

One alternative we are going to exhibit is using a SQL Server stored procedure. This example demonstrates the implementation of a Type 2 SCD, preserving the change history in the dimension table by creating a new row when there are changes. (more…)

August 14, 2008

The SCD Transformation in SSIS

Filed under: ETL, Warehousingpross @ 10:00 am

In this post, we will demonstrate the Slowly Changing Dimension (SCD) transformation that is available for use in a Data Flow task within SQL Server Integration Services (SSIS).  We’ll walk through an example of how to use it, explaining the terms as we go. In addition, I’ll demonstrate some “extra effort” kind of modifications that can be made for a more thorough usage.  After that, we’ll discuss the pros and cons of using it at all.  And we’ll wrap up with some tips on how to tune the transformation for better performance.
(more…)

July 31, 2008

A Warehousing Overview

Filed under: Data Modeling, Warehousingachandler @ 10:00 am

Data warehousing is a big subject. This overview is intended to cover some of the most representative issues on a high level: the nature of OLAP systems, star schemas, facts and dimensions, and differing perspectives (Inmon vs. Kimball) on warehouse design.

OLTP vs. OLAP

OLTP systems are the operational databases supporting applications. They are highly normalized, and focused on CRUD operations.

OLAP databases are usually arranged in star schemas and are built for speed in retrieving aggregated data.
(more…)

© 2008 DataReflections.Net  : : : :All rights reserved.  : : : :Tallan, Inc.