10 May 2012

Optional Feature not implemented -- ODBC call error

The problem:
The setup: an Access application with one form linked correctly to a table.
The form loads fine, and the data in the table is displayed correctly.
The person who needs to make a change (in this case: the business analyst) can't. He enters the new data into the field, and as soon as he leaves it he gets the following error message:
ODBC -- call failed
[Microsoft][ODBC SQL Server Driver]Optional feature not implemented (#0)
Possible solutions:
The results I found by googling the problem usually relate to cases where there is some code involved. The problem is that here there is absolutely no code, so I can't tell "which line it is". Nor can it be a "parameter passing" issue, as it was for some people.
Does he have the right permissions on the database?
Apparently he does: I've given the business analyst the required permissions on the database (namely db_datareader and db_datawriter), and the following works as it should:




Also, the following code runs perfectly fine when executed straight on the SQL Server (note the REVERT after each impersonation, so the contexts don't stack):

EXECUTE AS LOGIN = 'BusinessAnalyst';
SELECT SUSER_NAME(), USER_NAME();
REVERT;
--gives 'BusinessAnalyst', 'BusinessAnalyst'

EXECUTE AS LOGIN = 'BusinessAnalyst';
SELECT *
FROM dbo.dimTable;
REVERT;
--produces the results, as expected from someone with db_datareader permissions

EXECUTE AS LOGIN = 'BusinessAnalyst';
UPDATE dbo.dimTable
SET [column] = 'value'
WHERE ID = 1;
REVERT;
--produces the results, as expected from someone with db_datawriter permissions



Needless to say, when I try to make the change from my own instance of MS Access it runs perfectly fine. And no, granting him higher permissions, like db_ddladmin, didn't help either.
So obviously the problem is in the ODBC driver. But where?
Since it works on my machine and doesn't on his, we decided to log in from my machine.
I then had to reconfigure the ODBC connection, and that's where I found the error:
instead of choosing "SQL Server" as the driver, you need to choose "SQL Server Native Client 10.0".
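To make the fix concrete, here is a minimal sketch of the two DSN-less connection strings involved; the server and database names are placeholders, not the real ones, and the only difference between the two strings is the Driver value:

```python
# Hypothetical connection strings; MyServer/MyDW are made-up placeholder names.
old_conn_str = (
    "Driver={SQL Server};"  # legacy driver - raises "Optional feature not implemented"
    "Server=MyServer;Database=MyDW;Trusted_Connection=Yes;"
)
new_conn_str = (
    "Driver={SQL Server Native Client 10.0};"  # the driver that fixed it
    "Server=MyServer;Database=MyDW;Trusted_Connection=Yes;"
)
```

The same driver name applies whether you configure a DSN in the ODBC Administrator or build the string by hand.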

True, the BA didn't have the driver installed on his machine, but he's a smart guy, so he googled for it, chose the Native Client, and sent me the following beautiful email:

DONE,

WORKING :)




Excellent! Now I'm back to work. There's a new fact table coming right up...

Extra comment, October 2013: in our case the problem resulted from the SQL back end sitting on an x64 machine. Please note that for an Access front end you may need your SQL Server on an x86 machine.

07 February 2012

OLE DB error: OLE DB or ODBC error: Operation canceled; HY008.

Got this error after adding an attribute to a dimension.
Looking closely at the dimension, and then straight into the SQL Server, I found that the attribute column had two possible representations for "no value":
NULL and '' (empty string).
Solution: making sure all the "unknown" values were the same (I chose the empty string) fixed the issue, and the dimension was processed successfully.
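A minimal sketch of the problem and the fix, using SQLite rather than SQL Server and a made-up table name: before the fix the dimension column has two distinct "unknown" keys, and after normalising NULL to the empty string it has one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dimExample (id INTEGER, attr TEXT)")
con.executemany("INSERT INTO dimExample VALUES (?, ?)",
                [(1, "A"), (2, None), (3, "")])

# Two distinct "no value" representations before the fix (NULL and '')
before = con.execute(
    "SELECT COUNT(DISTINCT COALESCE(attr, 'NULL-marker')) "
    "FROM dimExample WHERE attr IS NULL OR attr = ''"
).fetchone()[0]

# The fix: normalise NULL to the empty string
con.execute("UPDATE dimExample SET attr = '' WHERE attr IS NULL")
after = con.execute(
    "SELECT COUNT(DISTINCT attr) FROM dimExample WHERE attr = ''"
).fetchone()[0]
print(before, after)  # 2 1
```

The same UPDATE, run against the real dimension table, is what let the dimension process successfully.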
Update 22/11/2013
A different solution: it wasn't a time-out; the real cause was hidden in the "warning" at the end of the dimension-processing error message. The details are here:

http://www.sqlservercentral.com/blogs/dknight/2009/03/02/cube-processing-error-hy008/

27 January 2012

The differences between a transactional database and an analytical one

Imagine the following scenario:
A business decision has been made to create a BI project. After the whole ETL process has been completed and the decisions have been made about what's in and what's out of the BI project; after all the data has been incorporated from several different systems (including the Excel spreadsheets in the "My Documents" folder); after the cube has been processed and reports have been created to please the business, somebody (else) from the business comes along with a question about one of the good old transactional systems, and expects the new flashy BI project to answer it. Now you need to come up with a good reason why "no can do" is actually the correct answer.
So here's an extract from an email I sent regarding this issue. I've changed the names and some of the details to avoid identification, so the transactional database is called "DB" and the analytical database is called "BI". If you decide to cut & paste this post and send it to your business analyst, change "DB" to the name of the particular database, and do the same with "BI".
More homework: compare the DB with the BI project. What are the questions the DB is supposed to answer quickly and efficiently? What are the questions the BI project is supposed to answer quickly and efficiently? Once done, write them down as "Examples" and spread them across your email.
Dear Business Analyst, 

Please find below some thorough information about DB, BI and the differences between them.

1.       Purpose:
a.       DB is a transactional database. It is meant to answer questions about the here & now. It is optimised for UPDATE, INSERT & DELETE operations.
b.      BI is an analytical database. It is optimised for aggregation (GROUP BY with SUM, AVG, COUNT etc.) and for reporting.
2.       Time:
a.       DB is a transactional database, and therefore it doesn’t retain historic information. While some data is being kept for auditing purposes, if you want to find out how many red bicycles were sold in the month of March 2011 you'd look in the AdventureWorks_DW, not in AdventureWorks.
b.     BI is an analytical database and therefore it maintains historic information across the time dimension. The question in (a) is considered pretty easy for BI querying.
c.       Granularity – DB changes constantly; BI changes according to the frequency of the feed. While you can change the frequency of the feed, the changes are not captured by the minute, and we'll always "miss" a change in order to take a snapshot of the bigger picture.
3.       Differences in values:
Theoretically, you'd expect that if you download the data from BI at a certain point in time and compare it to the DB, the values would be identical. Unfortunately, that is not true. It could be because the DB contains values that are incorrect; it could be because the DB keeps some fields for the purpose of answering questions relevant to its own role, but those fields are not exported to the BI as they convey very little meaning in terms of the bigger picture. As the BI tries to incorporate data from more than one system, that's quite likely.
In summary:
If you compare DB and BI at a certain point in time, you'll find the differences vast and you'll probably get pretty frustrated by them. This comparison is useless.
What BI can give you (if we put the correct data inside) are changes and trends across time. BI isn’t meant for an exact snapshot at a certain point in time; that's the purpose of the DB. 

Sincerely Yours, 
Your BI developer
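The contrast in the email can be sketched with a toy example; the table and names are made up, and SQLite stands in for both systems. The DB answers a "here & now" point query by key; the BI answers an aggregate over time:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, "
    "product TEXT, month TEXT, qty INTEGER)"
)
con.executemany("INSERT INTO sales VALUES (?,?,?,?)", [
    (1, "red bicycle", "2011-03", 2),
    (2, "red bicycle", "2011-03", 1),
    (3, "blue bicycle", "2011-04", 5),
])

# Transactional question (DB-shaped): what is order 2, right now?
row = con.execute(
    "SELECT product, qty FROM sales WHERE order_id = 2"
).fetchone()

# Analytical question (BI-shaped): how many red bicycles were sold in March 2011?
total = con.execute(
    "SELECT SUM(qty) FROM sales "
    "WHERE product = 'red bicycle' AND month = '2011-03'"
).fetchone()[0]
print(row, total)  # ('red bicycle', 1) 3
```

Each system is optimised for its own shape of question; asking the other one is what generates the frustration described above.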

18 January 2012

Errors in the OLAP storage engine: The attribute key cannot be found when processing

The following error message appeared while I was trying to create the "bridge table" that appears so nicely in the many-to-many examples by Marco Russo & Alberto Ferrari (the object names are blanked out here):
Errors in the OLAP storage engine: The attribute key cannot be found when processing: Table: '', Column: '', Value: ''. The attribute is ''. Errors in the OLAP storage engine: The attribute key was converted to an unknown member because the attribute key was not found. Attribute Client Group of Dimension: from Database: , Cube: TestMtoM, Measure Group: Bridge Client Adviser A, Partition: v Bridge Client Adviser A, Record: 97.


The view:

CREATE VIEW [dbo].[vBridge_ClientAdviserA]
AS

SELECT DISTINCT
      [ClientGroup]
     ,[dimAdviserId]
FROM [dbo].dimClient



As I explained in my previous post, I needed to create those "bridge tables" with the ClientGroup (the unique ID, or "natural key", of the client, which is necessary for tracking changes in the SCD) and the adviser ID. The problem was with the link between the dimension and the measure group: SSAS must have that link (a foreign key - primary key relationship) set up, and if it is violated then processing stops with this error.
Of course, I had set up keys for indexing, and that should have done the trick, correct? The problem was that I had created the bridge view based on the source (dimension) table, which means that errors that would otherwise have been raised weren't.
So where did the problem lie? In the simple fact that in my factless fact table (which is another view that refers to dimClient) some "client groups" were missing. Although I was sure I had data integrity, it was in fact missing for some of the clients. I was sure that a certain structure of the data existed (that's what I was specifically told!) so I didn't bother to set up keys/checks/conditions to verify it.
So I had to check it myself (I did that using a LEFT JOIN on the factless fact table) and found that some rows (one is enough, but I believe there were several) weren't strictly adhering to the business rules that had been set up so carefully by my business analyst. As he kindly told me to "ignore it", I did, and created the bridge table as follows:

SELECT DISTINCT
      [ClientGroup]
     ,[dimAdviserId]
FROM [dbo].factClient


Please keep in mind that factClient is a view based on dimClient, which picks up all the clients matching a certain condition (a WHERE clause) and cross joins them with the dimDate table.
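The integrity check I should have run up front can be sketched like this; SQLite stands in for SQL Server and the table/column names are simplified. A LEFT JOIN keeps only the fact rows with no matching dimension row, which are exactly the orphan keys that make SSAS raise "the attribute key cannot be found":

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dimClient (ClientGroup TEXT)")
con.execute("CREATE TABLE factClient (ClientGroup TEXT)")
con.executemany("INSERT INTO dimClient VALUES (?)", [("CG1",), ("CG2",)])
# CG3 exists in the fact view but not in the dimension - the data-integrity hole
con.executemany("INSERT INTO factClient VALUES (?)",
                [("CG1",), ("CG2",), ("CG3",)])

orphans = con.execute("""
    SELECT f.ClientGroup
    FROM factClient AS f
    LEFT JOIN dimClient AS d ON d.ClientGroup = f.ClientGroup
    WHERE d.ClientGroup IS NULL
""").fetchall()
print(orphans)  # [('CG3',)]
```

Running the equivalent query against the real views is a far quicker diagnosis than waiting for SSAS to fail during processing.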

Conclusion:
The moral of the story is obvious: never assume data integrity! Always verify it, either with a check constraint, a key, or a simple query. If you don't, SSAS will do it for you - and you know how descriptive those error messages are!

13 January 2012

How I've implemented a Many to Many relationship in SSAS


Figure 1 - the data source view.


A Many to Many relationship example in SSAS: an implementation with a difference.
It all started with a pretty innocent request:
At my work we have advisers, and they serve clients.
Sometimes clients move between advisers, and we would like to keep track of that.
So, how do you do that? Or, in technical terms:

How do you measure changes in a measure across a Slowly Changing Dimension?

(You can use a many-to-many relationship for distinct counts as well: more about it in here.)
In my case there are actually two types of references to "clients". The first is the reference to dimClient, which is the original slowly changing dimension (SCD); the second is the factless fact table: or, to be exact, a view that picks up the relevant clients from the dimClient table and cross joins them with the date table, so that I get the changes per client per month.
As the clients sit in a factless fact table, I googled "factless fact table", and before long (actually, at link #7) I reached this invaluable article: The Many-to-Many Revolution. It was written by Marco Russo & Alberto Ferrari and it is brilliant. I'm not even trying to closely imitate it - I'm just giving a short example of how it worked in my cube, that's all.
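The factless fact view described above can be sketched like this; the table and column names are simplified stand-ins, and SQLite stands in for SQL Server. Cross joining the clients with the months yields one row per client per month, which is what lets you see who moved where, and when:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dimClient (ClientGroup TEXT)")
con.execute("CREATE TABLE dimDate (month TEXT)")
con.executemany("INSERT INTO dimClient VALUES (?)", [("CG1",), ("CG2",)])
con.executemany("INSERT INTO dimDate VALUES (?)", [("2011-08",), ("2011-09",)])

# Every client paired with every month: the shape of the factless fact view
rows = con.execute("""
    SELECT c.ClientGroup, d.month
    FROM dimClient AS c
    CROSS JOIN dimDate AS d
    ORDER BY c.ClientGroup, d.month
""").fetchall()
print(len(rows))  # 4 rows: 2 clients x 2 months
```

In the real view the cross join is filtered by a WHERE clause so that only the relevant clients appear in each month.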

So the problem:
There's an SCD (slowly changing dimension) called dimClient. For all sorts of reasons (you can read about them in here if you're really interested) I've made a "factless" fact table out of it. Now the end user (or the business analyst, whatever you'd call him) would like to know what happened to different clients over time. The first question concerned advisers: clients move between advisers, so how do you keep track of that?

The SQL part of the solution

After I'd read the article closely, printed the relevant part (in this case: pages 63-69) and read it thoroughly several times, I created three new views:

The original view, factClient, became the measure group Adviser A.
A new view, called v_bridge_Adviser, with the following structure:

SELECT DISTINCT
      [dimDateId]
     ,[dimAdviserId]
     ,[ClientGroup]
FROM [dbo].[factClient]



This view gives me another link between the ClientGroup (the natural key for every client), the adviser ID and the date ID. This view became the measure group Adviser B.

The next two views are actually identical:


CREATE VIEW [dbo].[vBridge_ClientAdviserA]
AS

SELECT DISTINCT
      [ClientGroup]
      ,[dimAdviserId]

  FROM factClient


and


CREATE VIEW [dbo].[vBridge_ClientAdviserB]
AS

SELECT DISTINCT
      [ClientGroup]
      ,[dimAdviserId]

  FROM factClient



Brilliant!
The Data Modelling part of the solution
Now all I had to do was:
add the tables to the DSV
make sure they are linked correctly
create the new measure groups ("Adviser A", "Adviser B", "Bridge Client Adviser A" and "Bridge Client Adviser B")
process the data
find an incompatibility in the data source
sort it out (details in the next post)
And this is how it looks in the "measure group":





Figure 2 - the cube and the measures usage
  Please note:
Measure group: "Adviser A" is simply fact client
Measure group "Adviser B" is v_Bridge_Adviser
Measure group "Client Adviser A" is v_Bridge_Adviser A, and;
Measure group "Client Adviser B" is v_Bridge_Adviser B

The most important tip is:
Make sure that the dimensions and the measure groups are all linked correctly.
If they aren't, you'll get "weird" numbers that you'll have no way of reconciling, and that's a very frustrating experience.
As the entire process is done with DSVs and cubes, you need to be able to read the visual graphs even better than you read code.
Another tip (sounds obvious, but still): first link all the "regular" measure groups, and only after that the many-to-many ones.

The dimensions:
dimDate 1 and dimDate are the "dear old" date dimensions. I had to insert dimDate into the cube a second time; there was no need to redefine the dimension itself (hierarchy and all that stuff).
dimAdviser 1 and dimAdviser are the adviser dimensions, referring to the dimAdviser table in the DSV (see figure 1).
vBridgeClient 1 is the vBridgeClient view (I don't know why I ended up with two of these among the dimensions; some things I found too difficult to deal with). It gives you the list of distinct ClientGroup members (the natural key - there's only one natural key for clients in my DW). Getting the distinct list enables me not only to get the distinct count, but also to see who exactly moved where; very important data, as it is an SCD type 2.

And after all is said & done, this is the very complicated MDX I wrote to use it (oops - the user can do it by drag & drop himself!):

SELECT
[Adviser A].[Id].&[20] ON 0,
NON EMPTY [Adviser B].[Id].allmembers ON 1
FROM [TestMtoM]
WHERE ([Date A].[Id].[Aug 2011],
[Date B].[Year-Month].[Month Year].&[Oct 2011])




Then I was asked to do the same, not with the Adviser dimension but with the Category dimension. That's a different dimension, but when it came to implementation I realised something very important:
I don't need the "Bridge Category A" and "Bridge Category B" measure groups!
They could be useful in Marco & Alberto's example, but not in mine. They are for tracking changes in a dimension linked to a fact table - I'm dealing with changes in a factless fact table. In my case there is far less data every month, and the view that links the advisers, clients & dates isn't large; the total comes to less than half a million rows - that's not a lot! (And that's for a year and a half's worth of data, while we don't need to retain history for more than three years.)
Therefore I deleted the "Bridge Client Adviser A" and "Bridge Client Adviser B" measure groups, and this is how the end result looks:

As you can clearly see, while we still need the many-to-many relationship, we don't need it in as many measure groups as in the more complicated arrangement used before.
Yes, it works brilliantly, and yes, the data is accurate; it gives you the end results as it should!

One last thing: yes, it is worth your while to set up the measure groups correctly. Of course you could write those complicated SQL queries yourself, but why bother? This way your business analyst can create the reports him/herself, in the best interest of the business, without me worrying about the details of the display.