Friday, March 30, 2012

Relationshiptype in type 2 Slowly Changing Dimensions

Hi,

I've got a cusomer dimension buil on a type 2 slowly changing dimension with following attributes:

customer group

customer name

customer number (business key)

surrogate key

I've got one hierarchie:

customer group -- customer number

and defined a one to many relationship between customergroup and customer.

What I am not sure about:

Can I choose relationshiptype = Rigid for this relation?

When a customer is moved to another customer group, a new record in the dimension table will be added.

Before change

Group A nr. 12345 surr. key 1

after the change, a second record was added

Group A nr. 12345 surr. key 1

Group B nr. 12345 surr. key 2

So the relationship between customer group and customer number is not one to many, but many to many in the dimension table.

So, do i have to define the relation as flexible ?

or is there some misunderstanding......

Keme

Hello. You use surrogate keys so rigid attribute relations will not be a problem. As long as you keep your source key for customer in the cube I cannot see any problems with this approach.

If you have a problem with dynamic attributes in a dimension, that changes frequently, you can build a new dimension table for them. In the AdventureWorks cube and DW you can see a separate dimension for Geography.

People change address so instead of keeping this dynamic relationship in a customer dimension table you make a separate dimension table and a new foreign key in the fact table.

In AdventureWorksDW Geography is related to Customer but I would prefer a Geography foreign key in the fact table. if you use this approach type1, type2 or type3 SCD:s will not be a problem.

Downside: You will not be able to drill down from customer country to a single customer.

Rigid attribute relations will always require a full process of the dimension and the cube when there is a change.

Regards

Thomas Ivarsson

|||

Hi Thomas,

thanks for your reply.

>> You use surrogate keys so rigid attribute relations will not be a problem. As long as you keep your source key for customer in the cube I cannot see any problems with this approach.

My problem is, that in my example the key for the relation customer --> customer group is not unique

Group A nr. 12345 surr. key 1

Group B nr. 12345 surr. key 2

it chould be customergroup customer should be 1:n ; But customer nr. 12345 has two groups in the table.

So referential integrity for the attribute relation is destroyed and if I change ErrorConfiguration, KeyDuplicate from 'Ignore' to 'ReportAndContinue' the dimension fails to process.

The surrogate key is still unique, but it seems that does not help for the relation beween customer and customer group.

I read 'The Microsoft Data Warehouse Toolkit" from the KImball Group and followed the instructions for building the relational DWH with slowly changing dimensions and surrogate keys.

What I'm not sure about ist how to handle relations for the changing attributes ofSCD2 Dimensions.

First choice was to have related all attributes to the dimension key as dimension designer by default does. Worked fine.

Then I read about the importance of attribute relations on performance because of natural hierarchies, so I decided to better declare them. (Type rigid does fail, and I'm not sure about the impact of Type 'flexible' and if it is still a help to performance then)

Your suggestĂ­on with a separate table would be working, but I thought it would be the nice thing about a well built relational DWH that you can put a cube on top by a few clicks ?

|||

OK. So if you build your attribute relation between customer and the customergroup with the surrogate key, this does not work?

The relation between the surrogate customer key and the customergroup must be one to many, from the customergroup level?

It seems to me that you have build this relation on the source customer key?

Have missed some important information.

Regards

Thomas Ivarsson

|||

You are right. I built the relation on the source customer key.

For my example, you are right. In this constellation, the problem is solved when using the surrogate key for the relation.

So, please try this one:

The dimension table has the following attributes:

catgegory -- subcategory -- item--surrogate key

In the source system theres the relation category-subcategory-item. > natural hierarchie with 3 levels

So what i would do in SAS is:

1. Within Attribute surrogate key, I define a relation (one to many) with subcategory

2. Within Subcategory I define a relationship (one to many) with category

The problem would occur, if a subcategory changed to another category. Then the constellation in the dimension table would be:

Category A Subcategory a item 4711 SK 1

Category B Subcategory a item 4711 SK 1

Then I would get the 'key duplicate' problem for 'Subcategory a' again.

It is all a bit theoretical, but the main question to me is:

I used star schema for a relational DWH: The dimension is totally denormalized.

I use surrogate keys for the dimensions and dimensions are implemented as SCD with all attributes type 2.

Now, if I have more than 2 related attributes in a dimensio, I have either to use snowflake design or leave out the relationship ?

Regards,

Keme

|||

Hello. I think that you are on the right track.

If the relation between the subcategory and the category change over time, and you would like to keep history, for this relation, you will have to use surrogate keys here as well. The only option then, from what i know, is to snowflake the dimension.

If you see the upper relation as type1 SCD you can keep the starschema design. Simply do an update in the ETL process.

When you do a significant change like changing relations above the leaf level in a dimension, I think the idea of SCD fails. You will expand the dimension tree continously and make it even harder to analyse data over the time dimension. Categories and subcategories will no longer be the same. It is better, from my experience to treat these changes as SCD I and do a change that erase history.

Kind Regards

Thomas Ivarsson

|||

Hello Thomas,

Thank you.

Think, now it is clear to me.

Regards

Keme

No comments:

Post a Comment