Friday, 7 February 2025

 How I created my first Chatbot ?

In my current project, Chatbot development initiative was started by our Project Manager and it opened up an opportunity for all of us to learn about it. I also started exploring about it and with the guidance and help from my project manager, YouTube, online blogs / documents and ChatGPT, I was able to develop my first little chatbot that leveraged ChatGPT LLM.

I would like to share what all components I have used and how I developed it.

Use Case : Tableau Admin Assistant, that answers questions on Licenses, Users, Data sources, Jobs, etc.

Tools & Libraries
  1. Python
  2. Streamlit
  3. Tableau APIs
  4. OpenAI (Model gpt-4o-mini )
Python Libraries
  • openai : to call ChatGPT LLM
  • streamlit : this is our UI developer
  • os 
  • sys
  • pandas : for data processing
  • matplotlib.pyplot : for plotting the charts

Flow chart of Chatbot
  1. User Asks Question from UI
  2. Send the question to ChatGPT along with the dataset schema
  3. Inform GPT through prompt that code should be generated in Python
  4. Capture the ChatGPT response
  5. Execute the code by performing small cleanup on the code received from GPT
  6. Display code output on the UI
Coding

Import the below libraries

import openai
import streamlit as st
import os
import sys # for commandline aurguments
import pandas as pd
import matplotlib.pyplot as plt

Create a dropdown on the UI for the user to select the subject on which question needs to be asked 

v_Query = st.selectbox('Query',['Select','Licenses','Datasources','Groups','Users']) 

 Based upon what subject user has selected, I will be deciding my source file that holds the subject information

if v_Query == 'Select':

    st.write("Please select subject to query from the above dropdown.")

elif v_Query == 'Licenses':

    File = "Usage.txt"

    data = pd.read_csv(File)

    columns = data.columns.tolist()

Set openai api_key

openai.api_key = 'place_your_open_ai_api_key_here'

Once question is received, execute the below code. 

if v_userQuestions == '':
    st.write("How can I help you?")
else:

    def callPrompt(v_userQuestions,v_dataSample):
        prompt = f"""
            I have this dataset with the columns {columns}.
            
            Generate python code based upon the question asked
            
            Questions : {v_userQuestions}
                  """
        #print(prompt)
        
        return prompt

    def callOPenAI(v_userQuestions):

        v_dataSample = data.head(100).to_string(index=False) # I'm loading the sample data here
        #print(v_dataSample)
        #prompt = callPrompt(v_userQuestions,v_dataSample) # create a prompt with sample data
        prompt = callPrompt(v_userQuestions,columns) # create a prompt with list of columns
        
        response = openai.ChatCompletion.create(model = 'gpt-4o-mini'
                                    ,messages = [
                                                  {"role" : "system", "content" : "you are an intelligent assistant to generate python code using the columns provided. Use python only. My data frame variable is data. Convert Timestamp column to date before applying in the filter. Store final output always in FinalResult variable. Always import pandas libray "}
                                                , {"role":"user", "content":prompt}
                                                ]
                                    , max_tokens = 500
                                    ,temperature=0)
                                    
        #print(response["choices"][0]["message"]["content"])
        print(response)
        openAIResponse = response["choices"][0]["message"]["content"]
        return openAIResponse
        
        
    openAIResponse = callOPenAI(v_userQuestions)
    #st.write(openAIResponse) # print python code on UI for validation purpose

    openAIResponse = openAIResponse.strip() # leading and trailing whitespaces
    # Remove triple backticks and 'python' if present
    openAIResponse = openAIResponse.replace("'''python", "").replace("```python", "")
    openAIResponse = openAIResponse.replace("'''", "").replace("```", "")
    openAIResponse = openAIResponse.strip()

    execResult = {}
    #st.pyplot(exec(openAIResponse))
    exec(openAIResponse,{'data':data},execResult) # pass data as a global scope. Follow the systax. exec returns none. To store its output, use local variable example execResults.
    result = execResult['FinalResult']
    st.write(result)

Important Functions

ChatCompletion.create : This is an OpenAI function that takes model, message, max_tokens, temperature as arguments.

  • message are sent in the form of role and content. Role defines who is speaking. Content is the question or the text.
  • max_tokens limits gpt response to number of tokens. Change depends on numbers of tokens.
  • temperature ranges from 0 to 1. 0 means model will be non-creative. Greater than 0 means model starts getting creative.


     # model = select the model within the GPT versions like GPT4/3 etc

    # message = The chat completion function requires a list of messages, each with a role ("system", "user", or "assistant") and content.
    # A list of dictionaries where each dictionary represents a message in the conversation.
    # role: Defines who is speaking. It can be "system" (to set context), "user" (the input from the user), or "assistant" (the model's response).
    # content: The actual text content of the message.
    # max_tokens = max number of token you want to limit in this request. OpenAI will not go beyond the max limits as charge depends on tokens consumed.
    # temperature = can range from 0 to 1. 0 means model will not be creative. with temperatur greater that 0, model starts being creative towards the answers.
    # n = numbers of outputs model can send in the reponse
   
Reading the Response : ChatGPT sends response in the json format. Based upon the number of answers we have requested ChatGPT, it sends multiple answers. 
To retrieve the first answer follow the syntax, response["choices"][0]["message"]["content"]

Chatbot Screens





Thank you for reading !!!

 

Sunday, 9 July 2023

Tableau Relationships | How Tableau Relationships Helped us Implement an OR Filter?

Scenario

We had a use case to implement an OR Filter in the Tableau reports.

How did we Solve it?

We used relationships to achieve this. I would like to explain this with an example.

Dataset for the Use Case

Consider the below data. It has daily sales by vehicle name. This dataset has vehicle name as A,B,C,D and E. 

Main Dataset



Assume, your organization had an old list of vehicles that your management people are more familiar to it and would want to see that list only, as filter on the reports. Assume, below is the old list of name of vehicles. For example, current day vehicle A was known as A LXI, A VXI, A DXI, which are now all known as only A. Similarly, others.

Old list of vehicles


User wants to see the Vehicle filter with the list of old values rather than new values. But the data that you return will be based upon the new list of vehicle name.

Example, regardless of user selecting A LXI,A VXI,A DXI, you should show the below data





Relationships to Rescue

1. Prepare the dataset as shown below, Main Dataset and Old List of Vehicles will be in relation based upon the Vehicle ID field and ensure it is Many to Many relation (default set by Tableau). No human intervention in this.


2. Go to sheet and do the below,
  1. Bring the fields from the Main Dataset into the view as shown below
  2. For the purpose of filter, use the vehicle name filter from the Old List of Vehicles. This is where, OR will be implemented.
Your view will appear something as shown below.




Now, observe the filtering behavior, You select A LXI OR A VXI OR A DXI, your view would always return the A vehicle data.

   Output, with different filter selection
 


Don't you think this is the OR behavior one would expect?

How did this work? 

Tableau relationships are different from Join. Had, it been a join, then records would have always duplicated regardless of you use Vehicle Name field from the Old List of Vehicles in the view or outside the view but relationship, filters the duplicates unless the field from the non-fact table is consumed inside the view. Relationship are driven based upon cardinality, referential integrity and usage of fields from the related tables/dimensions.


Potential Issue : If you are already a master of relationships in Tableau, then you can ignore this, otherwise you should definitely read it. 

Ensure, you use the fields from the Old List of Vehicles, only for the purpose of filtering or in parameters, As long as they are away from the View, Tableau doesn't duplicate the records and on this one single feature of of relationship, OR filtering is implemented.

See below, what happens, if you use the field from the Old List of Vehicles dataset. Your data gets multiplied based upon the number of occurrences of joining records in the Old List of Vehicles.



My Comment on the Use Case used in this Blog

You may consider, the use case I have explained is trivial and may be non-realistic but think from the perspective of How filtering is implemented and in case if you have any such scenario, whether this solution works.

Alternate Approach

You could use this old list of vehicles in a separate dataset and through parameters, you can implement this filter. While this article is being written, Tableau doesn't support multi-select parameters by native feature so you have to implement multi-select parameters through calculations, which is achievable. That approach, I will share later. But it is not feasible, if you have multiple such filters like imagine, 4 filters or 5 filters are this way.

Monday, 19 June 2023

SQL | Basics, but tricky scenarios, to refresh your knowledge.

 If you have lost SQL touch, considering the continuous transformation that is happening across software industry (Data Warehouse and Business Intelligence) with the introduction of self service BI tools and advanced ETL tools, then here are some basic SQL scenarios that will definitely brush-up your SQL. Consider answering them on your own before you look at the solution. You may definitely have a better approach for the solutions provided, I'm just sharing my solutions.

Disclaimer : Below questions are not my own creations but are the ones asked to me by SQL experts. Solutions are mine.

  1. Consider below table, can you get the percentage of products that are both low fat and recyclable ?
/*
CREATE TABLE products (	product_id INT,
brand_name VARCHAR(100),
product_name VARCHAR (100),
is_low_fat_flg BIT,
is_recyclabl e_flg BIT
)


INSERT INTO products VALUES
 (1,NULL,'A',1,1),
 (1,NULL,'B',1,1),
 (1,NULL,'C',1,1),
 (1,NULL,'D',1,0),
 (1,NULL,'E',1,1),
 (1,NULL,'F',0,1),
 (1,NULL,'G',1,1),
 (1,NULL,'H',1,0),
 (1,NULL,'I',1,1),
 (1,NULL,'J',0,1),
 (1,NULL,'K',1,1),
 (1,NULL,'L',1,1),
 (1,NULL,'M',0,0),
 (1,NULL,'N',1,1),
 (1,NULL,'O',1,1),
 (1,NULL,'P',0,1),
 (1,NULL,'Q',1,1),
 (1,NULL,'R',1,1),
 (1,NULL,'S',1,0),
 (1,NULL,'T',0,1),
 (1,NULL,'U',1,1),
 (1,NULL,'V',1,1),
 (1,NULL,'W',1,1),
 (1,NULL,'X',0,0),
 (1,NULL,'Y',1,1),
 (1,NULL,'Z',1,1)
 */
--  SQL
 SELECT CAST(SUM(IIF(is_low_fat_flg = 1 and is_recyclable_flg = 1,1,NULL)) AS FLOAT) / COUNT(*)
 FROM products

2. Consider below the table, get the difference of sales of apple and orange at date level.


 /*
 CREATE TABLE fruit_sales (dateid DATE, fruit VARCHAR(50), sold DECIMAL)

 INSERT INTO fruit_sales VALUES
 ('2015-01-01'  ,'Apple' ,31 ),
 ('2015-01-01'  ,'Orange'  ,  19 ),
 ('2015-01-02'  ,'Apple'   ,  37 ),
 ('2015-01-02'  ,'Orange'  ,  26 ),
 ('2015-01-03'  ,'Apple'   ,  21 ),
 ('2015-01-03'  ,'Orange'  ,  21 ),
 ('2015-01-04'  ,'Apple'  ,  35 ), 
 ('2015-01-04'  ,'Orange'  ,  27)

 */

-- SQL

 SELECT dateid, ABS(SUM(IIF(fruit = 'Apple',sold,NULL)) - SUM(IIF(fruit = 'Orange',sold,NULL)))
 FROM fruit_sales
 GROUP BY dateid

3. Consider below the table, write a query that shows the US and France sales, by date. Grain is 1 row/day.

 /*
 CREATE TABLE country_sales(dateid   DATE, country VARCHAR(2), amt DECIMAL )
 INSERT INTO country_sales VALUES
( '2013-01-01' , 'US' ,         5 ) ,
( '2013-01-02', 'US' ,        8  ),
( '2013-01-03' , 'US' ,        10  ),
( '2013-01-01' , 'FR' ,         6  ),
( '2013-01-02' , 'FR' ,         6  ),
( '2013-01-05' , 'FR' ,         7  ),
( '2013-01-08' , 'GB' ,        10  ),
( '2013-01-01' , 'FB' ,         5  ),
( '2013-01-06' , 'JP' ,         8  ),
( '2013-01-07' , 'JP' ,         8 )
*/

-- SQL
SELECT dateid,country,SUM(amt)
FROM country_sales
WHERE country IN ('US','FR')
GROUP BY dateid, country

-- check what happens if you use the below code

SELECT dateid,SUM(IIF(country = 'US',amt,NULL)),SUM(IIF(country = 'FR',amt,NULL))
FROM country_sales
WHERE country IN ('US','FR')
GROUP BY dateid

4. Consider below the table, write a query to get the students who have borrowed at least 5 books.

/*
CREATE TABLE Students(Stid INT, Sname VARCHAR(50))
CREATE TABLE BooksBorrowed(Bid INT,Stid INT, BookName VARCHAR(100))

INSERT INTO Students VALUES
(1,'A'),
(2,'B'),
(3,'C'),
(4,'D'),
(5,'E'),
(6,'F'),
(7,'G'),
(8,'H'),
(9,'I'),
(10,'J'),
(11,'K'),
(12,'L'),
(13,'M'),
(14,'N'),
(15,'O'),
(16,'P'),
(17,'Q'),
(18,'R'),
(19,'S'),
(20,'T'),
(21,'U'),
(22,'V'),
(23,'W'),
(24,'X'),
(25,'Y'),
(26,'Z')



INSERT INTO BooksBorrowed VALUES
(1,1,'Book1'),
(2,3,'Book2'),
(3,5,'Book3'),
(4,7,'Book4'),
(5,9,'Book5'),
(6,11,'Book6'),
(7,1,'Book7'),
(8,3,'Book8'),
(9,7,'Book9'),
(10,9,'Book9'),
(11,11,'Book8'),
(12,2,'Book7'),
(13,4,'Book6'),
(14,6,'Book5'),
(15,1,'Book4'),
(16,1,'Book3'),
(17,3,'Book2'),
(18,3,'Book1'),
(19,3,'Book5'),
(20,9,'Book6'),
(21,9,'Book7'),
(22,9,'Book8'),
(23,12,'Book9'),
(24,14,'Book1'),
(25,16,'Book2'),
(26,18,'Book3'),
(27,18,'Book4'),
(28,12,'Book6'),
(29,14,'Book8'),
(30,14,'Book4'),
(31,16,'Book3'),
(32,18,'Book7'),
(33,14,'Book3'),
(34,12,'Book6'),
(35,12,'Book7'),
(36,16,'Book8'),
(37,16,'Book2'),
(38,16,'Book1')
*/

-- SQL

SELECT S.Sname,B.*
FROM Students S INNER JOIN
(SELECT Stid,count(bid) TotalBooksBorowed
FROM BooksBorrowed
GROUP BY Stid
HAVING COUNT(*)>=5) B
	ON S.Stid = B.Stid

5. Use the above schema and get the students who have barrowed no books

SELECT S.stid,S.Sname,B.stid
FROM Students S 
LEFT JOIN BooksBorrowed B
ON S.Stid = B.Stid
WHERE B.Stid IS NULL

6. Use the above code and get the students who have borrowed least number of books. I have not provided solution for this question.

7. Consider the below code and Get the average sales by given month

/*
CREATE TABLE Items (ItemId VARCHAR(10),StartDate DATE,EndDate DATE,Price DECIMAL)
INSERT INTO Items VALUES
('mobile','2020-01-01','2020-01-15',100),
('mobile','2020-01-16','2020-01-20',150)

CREATE TABLE Sales(Saleid INT,ItemId VARCHAR(10),PurchaseDate DATE,Quantity INT)
INSERT INTO Sales VALUES
(100,'mobile','2020-01-05',100),
(101,'mobile','2020-01-17',200)

SELECT * FROM Items
SELECT * FROM Sales
*/

--SQL
--  Get the average sales by given month

SELECT MONTH(PurchaseDate), SUM(S.quantity * I.Price)/COUNT(*) TotalAmount
FROM Sales S
INNER JOIN Items I 
	ON S.ItemId = I.ItemId AND PurchaseDate BETWEEN I.StartDate AND I.EndDate
GROUP BY MONTH(PurchaseDate)

Please do leave comments below, if you see this post as helpful or if there are any issues with the solutions provided.

Tableau | Password Protected Dashboard

 Let’s develop simple password protected tableau dashboard. This dashboard accepts UserID and Pin from user and shows data only after authenticating user. This implementation involves very basic mechanics as main intention of the post is to educate how to implement userid and password protection for dashboards.  Following are the steps to do this,

  1. Store UserIds and Pins in a master table. Here table variable is used.
  2. Take UserId and Pin from user through dashboard parameter
  3. Pass these parameters to SQL Server database stored procedure
  4. Stored procedure authenticates user and returns data accordingly.

Before we begin the post, it is assumed that the reader is aware of following things

  1. Basic understanding of Tableau, Connecting to data source from Tableau
  2. Calling stored procedures from Tableau
  3. SQL Server Stored Procedures
  4. Parameters in Tableau

 

  1. Develop simple SQL procedures that returns data after authenticating userid and pin

Following is the SP developed for this post. This SP takes user id & pin from user and returns Customer details if authentication is successful. Otherwise no data is displayed.

CREATE PROCEDURE Sp_tableaupwdsecureddashboard(@UserID VARCHAR(4) = NULL, @Pin    VARCHAR(4) = NULL)
AS
BEGIN
SET nocount ON

-- Master Table with UserIds and Pwds
DECLARE @Users TABLE
(
userid VARCHAR(10),
pin    VARCHAR(4)
)

INSERT INTO @Users
VALUES      ('East',
'1234'),
('West',
'5678'),
('North',
'2222')

-- Validate Userid & Pin and then return the result
IF (SELECT Count(*)
FROM   @Users
WHERE  userid = @UserID
AND pin = @Pin) > 0
BEGIN
SELECT [CustomerName] = C.firstname + ' ' + Isnull(C.middlename, '')
+
' '
+ Isnull(C.lastname, ''),
C.birthdate,
C.gender,
C.yearlyincome,
[CustomerAddress] = C.addressline1 + ' '
+ Isnull(C.addressline2, ''),
C.phone,
C.emailaddress
FROM   [AdventureWorksDW2012].[dbo].[dimcustomer] C
END
-- Return blank set
ELSE
BEGIN
SELECT [CustomerName] = C.firstname + ' ' + Isnull(C.middlename, '')
+
' '
+ Isnull(C.lastname, ''),
C.birthdate,
C.gender,
C.yearlyincome,
[CustomerAddress] = C.addressline1 + ' '
+ Isnull(C.addressline2, ''),
C.phone,
C.emailaddress
FROM   [AdventureWorksDW2012].[dbo].[dimcustomer] C
WHERE  1 = 0
END
END

2. Open Tableau Desktop and connect to the server and database where this SP is deployed.

2

 

3. Take New sheet and create two parameters UserId and Pin as shown below

UserId

3.1

Pin

3.2

4. In data source connection, connect to the SP developed for this dashboard,

Connect to the SP and map the parameters created.

4

 

5. Go to sheet and show UserId and Pin parameters. Create simple view as shown below. Enter Valid UserId and Pin in the parameters. If userid and pin are valid, then data is displayed.

5

 

6. No data is shown for invalid userid and password

6

Please let know if there are any simple ways of achieving this. Will try to post this workbook to tableau public and update the link here shortly.

Tableau | Code Dirty | Show/Hide Multiple Parameters based upon another Parameter

 Hi

You might have come across a requirement while working on Tableau, where you are required to show/hide parameters based upon another parameter. In this blog, I would like to show, how to hide multiple parameters dynamically.

Requirement : I have four parameters, Geography, Region, State and City. Geography parameter will have values as 'Region', 'State', 'City'. If Region is selected in the Geography parameter, then only Region Parameter should be shown to the user and rest two (State and City) should be hidden. Similarly, if City is selected in Geography parameter, then hide State and Region parameters.

This is how parameters look like (I'm using Tableau's Sample Super Store dataset)

Geography

Geography parameter

Region

Region Parameter

State

State parameter

City

City Parameter

Solution: In short, solution is to place the parameters to show or hide (Region, State, City) in a horizontal container and make them move left/right based upon the selected parameter value (Geography). To achieve the movements, use sheets.

  1. Create a calculated field Geography Filter, that holds Geography parameter values. This will be used in the sheets that will create parameter movement.
  1. Create four sheets, LHS, LHS1, RHS, RHS1 as shown below. Observe the parameter values to be selected in each of the sheet. At any point of time, out of four sheets, only two will be active.
  2. LHS : Create a sheet, LHS, and apply Geography filter by selecting Region and add a dummy header like "" in rows

3. Duplicate LHS and create LHS1. Here set Geography Filter to Region and State and add a dummy header like "" in rows

4. Duplicate LHS and rename it to RHS. Set Geography Filter to State and City and add a dummy header like "" in rows

5. Duplicate RHS and rename it to RHS1. Set geography filter to City only and a dummy header like "" in rows

6. Take new dashboard and add a horizontal container. Add the three parameters, Region, State & City and add the four sheets created as shown below in the same order. Order is important here.

8. Size of the container is also important. For example, I'm using the parameters width as 100 pixels in my dashboard. So, the 3 parameters' width will be set to 100 pixels each (fixed) and will add 200 pixels more to the container for the sheets to make the movement, taking the container width to 500 pixels, as sown below. Remember, at any given point of time, only two sheets will be active.

Format the sheets properly, hide titles, remove borders, add the dummy header, make it full screen. In short, follow all the rules of sheet toggling in Tableau

8. Bring the Geography parameter to the dashboard, hide the tile of each sheet (LHS/1, RHS/1) and see the parameter making the movements.

if region is selected in Geography, look at the position of Region parameter.

if State is selected in Geography, look at the position of State parameter, it has taken the position of Region parameter.

If I do some formatting and place white blank objects on the right and left hand side on the horizontal container and just show one parameter, in this case, only state will be shown in the view and everything will be behind the blanks,

State

City

Format further and see now

Will try to post it on gallery and share the link here, soon. if you think it is helpful, then please leave a comment, otherwise let me know of doing this in a better way.

Tableau | 4 or N Months Rolling Sales Example Without Tableau Calculations

Before you read : Method used in the post is through calculated fields. You can achieve the same using Table calculation with few clicks, but this way of writing will make you create dynamic Rolling Sales like 4 months, 3 months, 2 months as per user requirement which Tableau doesn't support - at least until the time this blog is written.

Requirement : Using Tableau, show 4 months rolling sales until previous or last closing month. For example, if current month is Sept 2021, then you are requested to show rolling sales from May 2021 to Aug 2021.

  1. May-21 should show sales from Feb-21 until May-21
  2. Jun-21 should show sales from Mar-21 until Jun-21
  3. Jul-21 should show sales from Apr-21 until Jul-21
  4. Aug-21 should show sales from May-21 until Aug-21

Dataset : I will be using sample super store dataset provided by Tableau

1. Bring last 7 months of data in the view. As I have to show 4 months rolling, I need current 4 months and previous 3 months of data.

4 months rolling date filter
// Since we are calculating 4 months rolling, load view with data from last 7 months
DATETRUNC('month',[Order Date])>= DATEADD('month',-7,DATETRUNC('month',TODAY())) 
AND
DATETRUNC('month',[Order Date])< DATETRUNC('month',TODAY()) // Untill Last Month

2. Rank the months in ascending order. Here, I use Order Date as date column

Rank Order Date (ascending)
RANK_DENSE(MAX([Order Date]),'asc')

3. Create view as shown below

4. Observe the ranks created for each month and pick the data for current four months i.e May-21. Jun-21, Jul-21, Aug-21. Rolling Sales for the month of May-21 will be Running Total from Rank 1 to Rank 4 i.e Feb-21 to May21. Similarly, for Jul-21, Rolling sales will be Running Total from Rank 3 to Rank 6. You end up creating below logic,

Below part of code, you will have to replicate to make it dynamic based upon l4 months, 3 months, 2 months etc.

4 Months Rolling Sales 
IF [Rank Order Date (ascending)] = 4 
    THEN RUNNING_SUM(IF  [Rank Order Date (ascending)]>=1 
    AND [Rank Order Date (ascending)]<=4 THEN SUM([Sales]) END ) 
//Jun
ELSEIF [Rank Order Date (ascending)] = 5 
    THEN RUNNING_SUM(IF  [Rank Order Date (ascending)]>=2 
    AND [Rank Order Date (ascending)]<=5 THEN SUM([Sales]) END )
// Jul
ELSEIF [Rank Order Date (ascending)] = 6 
    THEN RUNNING_SUM(IF  [Rank Order Date (ascending)]>=3 
    AND [Rank Order Date (ascending)]<=6 THEN SUM([Sales]) END )
//Aug
ELSEIF [Rank Order Date (ascending)] = 7 
    THEN RUNNING_SUM(IF  [Rank Order Date (ascending)]>=4 
    AND [Rank Order Date (ascending)]<=7 THEN SUM([Sales]) END )

END

5. Bring 4 Months Rolling Sales measure to view.

6. Remove the previous 3 months from the view using another rank filter. Select only the months with rank 4 or more because we are showing last four months rolling sales. Apply this filter in the new

4 Months Rolling Rank Filer
[Rank Order Date (ascending)]>=4

View is ready for use

This logic works dynamically for last four months because I have a dynamic date filter in the view. if you want to show last 5 months rolling sales, then you will have to include [(2*5)-1] = 9 months of data in the view and accordingly add one more if else condition to fetch rolling sales for one more month and adjust the rank range accordingly.

If you think this post is helpful or if there is any other better optimized way of doing this, then please let me know through comment.

Thank you.

Power BI | DAX | Convert SQL to DAX

 Hi

If you are new to DAX, then this post will help you compare DAX with SQL and get an overview of DAX. It is assumed that the reader is aware of SQL (basics).

Consider the below tables that I have taken from Tableau's Sample Super Store dataset. You can download this data set from Tableau site or search Tableau Sample Super Store data set. It is an excel.

I would like to get all the orders from the city 'Los Angeles' and the people who have ordered them. From RDBMS querying perspective, my SQL would look something like this

SELECT  O.OrderID [City_OrderID]
	   ,O.OrderDate [City_OrderDate]
	   ,O.Sales [City_Slaes]
	   ,O.Profit [City_Profit]
	   ,O.Quantity [City_Quantity]
	   ,O.Region [City_Region]
	   ,P.Person [Person]
FROM Orders O
INNER JOIN People P ON O.Region = P.Region
WHERE City = 'Los Angeles'

How would you write a DAX query to get the same results as above from a Power BI data set? Below is the DAX

OrdersByCity = 
 SELECTCOLUMNS(FILTER(GENERATE(Orders,People),Orders[Region] 
                    = People[Region] && Orders[City] = "Los Angeles")
                               ,"OrderID",Orders[Order ID]
                               ,"OrderDate",Orders[Order Date]
                               ,"Sales",Orders[Sales]
                               ,"Profit",Orders[Profit]
                               ,"Quantity",Orders[Quantity]
                               ,"City_Region",Orders[Region]
                               ,"Person",People[Person])

Output from Power BI

GENERATE is a JOIN, FILTER is a WHERE clause, SELECTCOLUMNS is a SELECT clause

I have written this post to make a comparison between SQL and DAX querying. In case if you are new to DAX, like me, then it may help you in getting started, if you have little bit of SQL background.

-Yadnesh Madiwale