Sunday, September 23, 2012

Elements of the Dimensional Model

OLTP vs OLAP :
Describe the application that business is looking to build.

Business Processes :
Identify the business processes. Each of the business processes typically becomes a tables in data warehouse called FACT tables.

Fact Tables:> Granularity is at which level of detail the data will be stored for each fact table.
> Trasaction Fact : Transaction records for individual events as they happened
> Periodic Snapshot : Aggregated summary of data.
> Accumulating Snapshot :
> Additive, semi additive, non - additive Fact tables
> Factless Fact tables

Data Warehouse Bus Architecture :
> Define a standard interface for the data warehouse environment, separate data marts can be implemented by different groups at different times. The separate data marts can be plugged together and usefully coexist if they adhere to the standard.
> The team designs a master suite of standardized dimensions and facts that have uniform interpretation across the enterprise. This establishes the data architecture framework.

Data Warehouse Bus Matrix
The Data Warehouse Business Matrix is the tool of choice for documenting and communicating the Dimensional Modeling process so far. Business Processes vs Dimensions

Conformed Dimensions:
Dimension tables are not conformed if the attributes are labeled differently or contain different values. If a customer or product dimension is deployed in a nonconformed manner, then either the separate data marts cannot be used together or, worse, attempts to use them together will produce invalid results.
Flavours:
> Conformed dimensions mean the exact same thing with every possible fact table to which they are joined.
> Sometimes dimensions are needed at a rolled-up level of granularity. Perhaps the roll-up dimension is required because the fact table represents aggregated facts that are associated with aggregated dimensions

Dimensional Authority:The dimension authority has responsibility for defining, maintaining, and publishing a particular dimension or its subsets to all the data mart clients who need it.

SQL-92 - Four Isolation Levels:

Overview
While creating or reusing an existing acess method, it is required to know the degree of isolation supported between processes attempting to access the same data. This is commonly defined by referring to the SQL-92 Isolation levels.

SQL-92 defines four Isolation Levels:
  • Read Uncommitted
  • Read Committed
  • Repeatable Read
  • Serializable
Isolations levels are stated in terms of three prohibited operation sequences, called Phenomena:
  1. Dirty Read
  2. Non-Repeatable Read
  3. Phantom Read
If an access method does not support Serializable, then it is possible that the data set qualifying for a particular query may change within a transaction because of an update by another transaction. If this can occur, then it is very important for user to know what level is supported, so that they may design their applications accordingly. Phenomena define the ways in which the qualifying data set may change in the transaction.



This occurs if one transaction can see the results of the actions of another transaction before it commits. Consider the following example:


Transaction 1Write(a)


Rollback
Transaction 2


Read(a)
If Transaction 2 reads the value written by Transaction 1, then a DIRTY READ has occurred.



This occurs if the results of one transactions can be modified or deleted by another transaction before it commits. This is illustrated below:





Transaction 1Read(a)





Read(a)
Transaction 2


Write/Delete(a)Commit
If Transaction 1 gets a different result from the each Read, then a NON-REPEATABLE READ has occurred



This occurs if the results of a query in one transaction can be changed by another transaction before it commits. This is illustrated below:





Transaction 1Select(criteria)





Select(criteria)
Transaction 2


Update/Create(match to criteria)Commit
If Transaction 1 gets a different result from each Select, then a PHANTOM READ has occurred



The SQL-92 Isolation levels are defined in terms of what Phenomena can occur:
Isolation LevelP1 P2P3
Read Uncommitted YES YES YES
Read CommittedNOYESYES
Repeatable ReadNONOYES
SerializableNONONO
Thus if Read Committed is supported you will never experience a DIRTY READ, but you might experience a NON-REPEATABLE READ or a PHANTOM READ.
The above order is in terms of "strength" in that Read Uncommitted is less restrictive, and thus less "strong" or at a "lower level" (of isolation) than Read Committed.