15 June 2017

How to: Cut & Paste from Excel

I don't know if any of you have read the book "Copying & Pasting from Stack Overflow", but even if you copy from Excel you need to be aware of the following:
If you copy from an old file you might copy the connections & the names as well, not just the values. It's important to either select "values & formulas" after the copying, or alternatively, just clear the name manager:

05 June 2017

Partition by Half-a-year using windows functions

Let's say you're using Windows functions, and you want to partition by a quarter.
That's simple, you'd just do:

SELECT SUM ( [MyValue]) OVER (PARTITION BY [Employee], YEAR ([Valuation date]), DATEPART (qq,[Valuation date])  ORDER BY ROW_ID)

But what happens if you would like to run a sum by half a year? If it's just getting the value (first or second half of year), you'd play around with

SELECT CASE WHEN month < 7 THEN 1 ELSE 2 END AS [HalfAYear]

 or with

SELECT CASE WHEN DATEPART (qq, Valuation date)  < 3 THEN 1 ELSE 2 END AS [HalfAYear]

 and that would do.
But since we need a running sum per half a year, you'd like to use this code:

SUM ( [MyValue]) OVER (PARTITION BY [Employee], YEAR ([Valuation date]), YEAR (DATEADD(m, 6, [Valuation date]))  ORDER BY ROW_ID)

Using the YEAR for the first time gives me the current year, when I'm using YEAR over "add 6 month" I'm using the "Australian Financial Year", but this way I am partitioning by half a year. Please note that at the end of the day I don't know which half it is, but it doesn't matter, since I don't care about the label, I care about the results.

03 May 2017

Pointer to a function in SQL ?!

If you're coming from C then you would know that there's a thing called a  a function pointer.
It's good to put in a structure, and then whenever you need to "compare" you would call the function pointer. In the "sort" function call in one of the C's library you can see it, so you can compare apples to oranges to your heart's delight.
But what happens in SQL?
Let's say I have different funds (which I do) and they require different calculation according to their Fund type.
So funds A, B, C & D have calculation which is completely different the funds E, F & G.
Had I used a programming language like C I could say that a Fund could have a pointer to a function in their structure, and calculate of that. But how do I handle this in SQL?
I've designed a solution which would look like this:
I would add a field (column) to the Fund table called "CalcType" which would refer to the calculation type. Now, I can use the following code:


SELECT *
FROM PerformanceCalc1 (@param) --that's a function
INNER JOIN dbo.FUND
WHERE Fund.CalcType = 1
UNION ALL
SELECT *
FROM PerformanceCalc2 (@param) --that's a function
INNER JOIN dbo.FUND
WHERE Fund.CalcType = 2
UNION ALL
SELECT *
FROM PerformanceCalc3 (@param) --that's a function
INNER JOIN dbo.FUND
WHERE Fund.CalcType = 3



This calculation has its limitation: each Performance Calc functions run on all the funds, but the return set is limited to the appropriate Fund Type. The number of funds I have is as such that it doesn't really matter; if I had more it could have been an issue. Yet, what I like about this solution is that it is set based in its thinking, and not using Dynamics SQL (which requires elevated permissions of the end-user, if we want her to run it).

06 April 2017

Sub totals in Excel's Pivot Tables - which total should I sub?

This wonderful feature shows you why Pivot Tables (Power Query or other pivot table) wins hands-down vs Power BI, the much hyped MS app.
Let's say you have a Pivot Table (which I do, plenty of them).
And let's say they are summed by different values: Year, Quarter, and forecast vs. Actual (just an example). But you want to pick which columns (or rows) you want the sub totals to run on.
You can choose if you want to sub total all column or only few of them.
Let's look at an example:
This is what the header looks like:



Now I would like to have the total on the Year, but not on the Quarter.
First of all I need to go to PivotTalbe tools -> Design -> and pick the Subtotals that I want
( chose the Show all Sub Totals at the bottom of the group; you can pick top, but don't chose non).




I can go to the columns definition, right click the setting (FyQ, in my case) , choose the Field Settings, and then I have the choice of Subtotals: Automatic or None. So If I want it to be summed I would and if not then I don't.

The beauty here is that I don't need complicated MDX functions in order to design my Pivot table. Great!

23 March 2017

Rolling Average SQL

How many scripts you need to see for rolling average (AVG) in SQL? Well, it never hurts to see another one...
This is the beauty of Windows Functions, they take a complicated problem and put it in a simple script.

(BTW - I've taken the Sales numbers from here - a great webiste, but I believe it's for PostgresSQL, and then added the year 2016 and the extra numbers. I believe you'll be able to find the pattern.)

CREATE TABLE #temp  (StartDate DATE, Sales INT)

INSERT #temp (StartDate, Sales)
VALUES ('20150110', 5)
,('20150210', 3)
,('20150310', 7)
,('20150410', 8)
,('20150510', 2)
,('20150610', 3)
,('20150710', 6)
,('20160110', 8)
,('20160210', 6)
,('20160310', 10)
,('20160410', 11)
,('20160510', 5)
,('20160610', 6)
,('20160710', 9)



If you just want the rolling average (you can replace the AVG function with SUM, MIN, MAX as well) for all the time periods, you need to run the following code:


SELECT
StartDate,
AVG (sales) OVER ( order by StartDate ) AS AVGSales
FROM #temp


But if you want to partition by YEAR, for example, you'd better run the following code:


SELECT
StartDate,
AVG (sales) OVER (Partition  by YEAR (StartDAte) order by StartDate ) AS AVGSales
FROM #temp



Short & sweet!

(this is a simple example; I need to run this on a total assets for a fund. It's amazing how those extra digits get us confused!)

Don't forget :)

DROP TABLE #temp


25 January 2017

MDX with Excel - for future reference

Let's say you have data model that you want to enquire on. If you're using Power Query (in Excel) you might want to investigate a little further. In order to do that you need to unleash the power of MDX, which is complete and utterly crazy. The problem with MDX (unlike plain Excel functions) is that it's really hard to know what you're doing and there's very little documentation. Nevertheless it's important to read the documentation, and you'll get the hand of it - after a while.

So, how do I find out what was the last date in my cube that actually gave me some data?

My model consist of an SQL database, in which there's a table called dimPeriod, and measures which are stored in a Fact Table. The data is queried from Excel using Stored Procedures, and reported via PowerQuery. Obviously enough, the dimPeriod table includes days from the last couple of years to the next 20+ years. That's enough, and it's great as it warns me if I'm trying to enter the wrong date (yes, I'm using foreign keys to enforce the referential integrity).
Let's just say that [ALL ACP] is the table where there are measures. So I thought of using something like TopCount function to create a set. Unfortunately, it didn't give me the correct results:

=CUBESET("ThisWorkbookDataModel","TopCount([ALL ACP].[Datekey].Children,5)","Max Date Reported")


Explanation:
Cubeset - creates a cube set
"ThisWorkbookDataModel" - the data model that is stored in this workbook. That's true for Excel 2013, and basically, once you type the "Cubeset" function, open bracekts and the double qoute, it comes up with the name of the model.
TopCount - give you the first number
[All Acp] - the table that has measures in it
DateKey - yep, that's the datekey
Children - otherwise you'll get "all"
5 - just a number. I could (or should) have used 1, but 5 gives me a range of numbers to work with.

After inserting that set to a cell in Excel, all I needed was to enquire after that set:

=CUBERANKEDMEMBER("ThisWorkbookDataModel",$G$30,1)


G30 is where the code above is inserted.
A problem: I got the minimum date.
Solution:

=CUBESET("ThisWorkbookDataModel","BottomCount([ALL ACP].[Datekey].Children,5)","Max Date Reported")


Changing from TopCount to BottomCount gave me the latest date to come in. Please note that there are no measures in the query.

You have two options to query on the result, which is in DateKey format.
The first one is to query the relevant data member. Let's say, you're looking for FY Year, you would use the following code:

=CUBEMEMBER("ThisWorkbookDataModel","EXISTS([Dim Period].[FY Month Name].children, [Dim Period].[DateKey].["&G32&"])")

Where G32 is where you result is
Alternatively, just get out the year, the month and the date using:
Year

 =LEFT(G32,4)

Month:

=MID(G32,5,2)

Day:

=RIGHT(G32,2)

Combining the Excel Formulas with the MDX ones gives you lots of power to manipulate the end results.