(It's Been a Long Time Coming... 🎵🎹) Getting Date Quarters for Months in Python - Or "Why Date and Time Dimension Tables Exist"
Hi. It's been a while. :-)
So I recently saw a question on StackOverflow called "First month of quarter given month in Python" where they were asking how to work out a date related issue in Python. There are some wonderful techincal and mathematical solutions there, but it reminded me very much of proper architecture design, especially when the project will be for a non-hobby.
User jonrsharpe nailed it with his answer, IMHO. In the continued balancing act of storage space vs. compute power, a dataset will usually win over a computation. Not always, just usually. And even when it doesn't, depending on circumstances, a shared dataset is VITAL to avoid computation bugs when it is needed more than a few times, or by more than one department. It can be a lifesaver when it comes to reporting, as well as Fact-Dimension architecture or even a Data Vault architecture.
Create, share, and maintain the single version of the truth.
Here's a code sample for SQL Server to get you started. I used something like this many years ago and then it expanded over time. Edit and change as you and your organisation sees fit, for example changing the Day Of Week number if you want to start on a Sunday or changing your Fiscal Quarter. You might have a need to include a column like "Fiscal Half." This sample includes things that you as a programmer/engineer might not consider important, or even know about, but business might want. For example, the ISO Week.
DECLARE @StartDate DATE = '20230101'; DECLARE @NumberOfYears INT = 100; SET DATEFIRST 1; SET DATEFORMAT mdy; SET LANGUAGE US_ENGLISH; DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate); WITH DimDatePrep AS ( SELECT 'PK_DateID' = CAST(REPLACE(CAST(D AS VARCHAR), '-', '') AS BIGINT) ,'Date' = CAST(D AS DATE) ,'Year' = YEAR(CAST(D AS DATE)) ,'QuarterNumber' = CAST(DATEPART(Q,D) AS BIGINT) ,'QuarterName' = CONCAT('Q',DATEPART(Q,D)) ,'MonthNumber' = CAST(DATEPART(MM,D) AS BIGINT) ,'MonthName' = SUBSTRING(DATENAME(MONTH, D), 1, 3) ,'Week' = DATEPART(WEEK, D) ,'ISOWeek' = (DATEPART(dy,DATEDIFF(dd,0,CAST(D AS DATE))/7*7+3)+6)/7 ,'DayOfYear' = DATEDIFF(DAY,STR(YEAR(D),4)+'0101',D)+1 ,'DayOfMonth' = DAY(D) ,'DOWNumber' = DATEPART(DW, D) ,'DOWName' = SUBSTRING(DATENAME(DW, D), 1, 3) ,'FiscalYear' = CAST((CASE WHEN DATEPART(MONTH,D) > 6 THEN (YEAR(D) + 1) ELSE YEAR(D) END) AS BIGINT) ,'FiscalQuarterNumber' = CAST((CASE WHEN MONTH(D) BETWEEN 1 AND 3 THEN '3' WHEN MONTH(D) BETWEEN 4 AND 6 THEN '4' WHEN MONTH(D) BETWEEN 7 AND 9 THEN '1' WHEN MONTH(D) BETWEEN 10 AND 12 THEN '2' END ) AS BIGINT) ,'FiscalQuarterName' = CAST((CASE WHEN MONTH(D) BETWEEN 1 AND 3 THEN 'Q3' WHEN MONTH(D) BETWEEN 4 AND 6 THEN 'Q4' WHEN MONTH(D) BETWEEN 7 AND 9 THEN 'Q1' WHEN MONTH(D) BETWEEN 10 AND 12 THEN 'Q2' END ) AS VARCHAR) ,'FiscalMonthNumber' = CAST((CASE WHEN MONTH(D) = 7 THEN '1' WHEN MONTH(D) = 8 THEN '2' WHEN MONTH(D) = 9 THEN '3' WHEN MONTH(D) = 10 THEN '4' WHEN MONTH(D) = 11 THEN '5' WHEN MONTH(D) = 12 THEN '6' WHEN MONTH(D) = 1 THEN '7' WHEN MONTH(D) = 2 THEN '8' WHEN MONTH(D) = 3 THEN '9' WHEN MONTH(D) = 4 THEN '10' WHEN MONTH(D) = 5 THEN '11' WHEN MONTH(D) = 6 THEN '12' END ) AS BIGINT) ,'ISOYear' = YEAR(DATEADD(DAY, (4 - DATEPART(WEEKDAY, D)), D)) ,'WeekStarting' = DATEADD(DAY, 1 - DATEPART(WEEKDAY, D), CAST(D AS DATE)) ,'WeekEnding' = DATEADD(DAY, 7 - DATEPART(WEEKDAY, D), CAST(D AS DATE)) FROM ( SELECT D = DATEADD(DAY, RN - 1, @StartDate) FROM ( SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate)) RN = ROW_NUMBER() OVER (ORDER BY [S1].[object_id]) FROM [sys].[all_objects] AS [S1] CROSS JOIN [sys].[all_objects] AS [S2] ORDER BY [S1].[object_id] ) AS X ) AS Y ) --INSERT INTO [dbo].[DimDate] -- ( -- [PK_DateID], -- [Date], -- [Year], -- [QuarterNumber], -- [QuarterName], -- [MonthNumber], -- [MonthName], -- [Week], -- [ISOWeek], -- [DayOfYear], -- [DayOfMonth], -- [DOWNumber], -- [DOWName], -- [FiscalYear], -- [FiscalQuarterNumber], -- [FiscalQuarterName], -- [FiscalMonthNumber], -- [ISOYear], -- [WeekStarting], -- [WeekEnding] -- ) SELECT * FROM DimDatePrep;
Here's what the output will look like:
Note that "ISOWeek" for 2023/01/01 is "52" even though it's the first calendar day of the year.
To go along with your DimDate, you might have a need for a DimTime. Normally it's only 1 440 entries, one for each minute, and you can add metadata onto that, like "Hour" or something more tuned to what your business needs, like "PartOfDay" segmented into "Red-Eye", "Mid-Morning", "Early Afternoon", "Late Afternoon", and "Evening". You might have a requirement to go down into the seconds or even parts of a second, although that is rarer than having a need for just minutes.
Enjoy and happy architecting! :-)
Comments
Post a Comment