There is no harm in using the Scalar UDF’s if it is used for Arithmetic operation or other purposes which doesn’t involves accessing any table. Always create such Scalar UDF’s with SCHEMABINDING to avoid Spooling.
But, Scalar UDF’s are not good option if it is accessing the table.
Let us evaluate the above statement.
We have a table that holds all the Stocks and another table that hold the Day wise Stock Price for every Stock. We want to fetch the Latest Stock Price of all the Stocks for a given date.
Let us first create the tables.
-- Table for Stock master CREATE TABLE [dbo].[StockMaster] ( StockID INT NOT NULL IDENTITY(1,1) CONSTRAINT PK_StockMaster_StockID PRIMARY KEY CLUSTERED , StockName VARCHAR(50) ) GO -- Table for Daily Stock Price CREATE TABLE [dbo].[StockPrice] ( ID NUMERIC NOT NULL IDENTITY(1,1) CONSTRAINT PK_StockPrice_ID PRIMARY KEY CLUSTERED , StockID INT , PriceDate DATE , StockPrice NUMERIC(18,4) ) CREATE UNIQUE NONCLUSTERED INDEX UNC_StockPrice_PriceDate_StockID ON [dbo].[StockPrice] (PriceDate DESC, StockID ASC)
Now let’s fill these table with some data. Please wait for some time to complete the execution of the queries for our analysis.
-- Query to insert records into StockMaster table SET NOCOUNT ON DECLARE @StockID INT SET @StockID = 1 WHILE (@StockID <= 20000) BEGIN INSERT INTO [dbo].[StockMaster] (StockName) VALUES ('Stock ' + CAST(@StockID AS VARCHAR(50))) SET @StockID = @StockID + 1; END GO -- Query to insert records in StockPrice for every Stock and for every Date since 2016-10-01 to till date SET NOCOUNT ON DECLARE @Date DATE , @StockID INT SET @Date = '2016-10-01' WHILE (@Date <= CAST(GETDATE() AS DATE)) BEGIN SET @StockID = 1 WHILE (@StockID <= 20000) BEGIN INSERT INTO [dbo].[StockPrice] (StockID, PriceDate, StockPrice) VALUES (@StockID, @Date, (CHECKSUM(GETDATE()) % 100)) SET @StockID = @StockID + 1; END SET @Date = DATEADD(DD, 1, @Date); END GO
We now have the data available for our Analysis. Let’s fetch the Latest Stock Price for all the Stocks for a given date using the Scalar UDF.
First create the Scalar UDF to return the latest stock price for a given date.
-- Scalar UDF to return the Latest Stock Price for a given date CREATE FUNCTION [dbo].[ufn_GetLatestStockPrice] ( @StockID INT , @AsOnDate DATE ) RETURNS NUMERIC(18,4) AS BEGIN DECLARE @StockPrice NUMERIC(18,4) DECLARE @LatestPriceDate DATE SELECT @LatestPriceDate = MAX([PriceDate]) FROM [dbo].[StockPrice] WHERE [PriceDate] <= @AsOnDate AND [StockID] = @StockID SELECT @StockPrice = [StockPrice] FROM [dbo].[StockPrice] WHERE [PriceDate] = @LatestPriceDate AND [StockID] = @StockID RETURN COALESCE(@StockPrice, 0); END GO
Now, run the following query which is using scalar UDF to return the latest Stock Price for a given date.
-- Original Query with Scalar function SET NOCOUNT ON SET STATISTICS IO ON SET STATISTICS TIME ON DECLARE @AsOnDate DATE SET @AsOnDate = '2016-10-13' SELECT StockID , StockName , [dbo].[ufn_GetLatestStockPrice] (StockID, @AsOnDate) AS [StockPrice] FROM [dbo].[StockMaster]
Let’s look at the stats.
It took 241219 ms which is equivalent to 241 seconds.
Workarounds to avoid scalar UDF’s
Workaround 1… Highly recommended
-- Workaround 1 to replace the Scalar UDF (Highly recommended) SET NOCOUNT ON DECLARE @AsOnDate DATE SET @AsOnDate = '2016-10-13' IF OBJECT_ID('tempdb..#Max_PriceDate') IS NOT NULL DROP TABLE #Max_PriceDate SELECT StockID , MAX(PriceDate) AS [Max_PriceDate] INTO #Max_PriceDate FROM [dbo].[StockPrice] WHERE [PriceDate] <= @AsOnDate GROUP BY StockID CREATE NONCLUSTERED INDEX IX_#Max_PriceDate_StockID ON #Max_PriceDate(StockID) SELECT A.StockID , A.StockName , C.StockPrice FROM [dbo].[StockMaster] A INNER JOIN #Max_PriceDate B ON B.StockID = A.StockID INNER JOIN [dbo].[StockPrice] C ON C.PriceDate = B.Max_PriceDate AND C.StockID = A.StockID
Let’s look at the stats.
Isn’t it Great! It took only 121 ms.
Workaround 2
-- Workaround 2 to replace the Scalar UDF SET NOCOUNT ON DECLARE @AsOnDate DATE SET @AsOnDate = '2016-10-13' SELECT A.StockID , A.StockName , C.StockPrice FROM [dbo].[StockMaster] A CROSS APPLY ( SELECT MAX(PriceDate) AS [Max_PriceDate] FROM [dbo].[StockPrice] WHERE [PriceDate] <= @AsOnDate AND StockID = A.StockID ) B INNER JOIN [dbo].[StockPrice] C ON C.PriceDate = B.Max_PriceDate AND C.StockID = A.StockID
Let’s look at the stats.
Not bad! It took only 210 ms.
Workaround 3
-- Workaround 3 to replace the Scalar UDF SET NOCOUNT ON DECLARE @AsOnDate DATE SET @AsOnDate = '2016-10-13' ; WITH cte_max_stock_price AS ( SELECT StockID , MAX(PriceDate) AS [Max_PriceDate] FROM [dbo].[StockPrice] WHERE [PriceDate] <= @AsOnDate GROUP BY StockID ) SELECT A.StockID , A.StockName , C.StockPrice FROM [dbo].[StockMaster] A INNER JOIN cte_max_stock_price B ON B.StockID = A.StockID INNER JOIN [dbo].[StockPrice] C ON C.PriceDate = B.Max_PriceDate AND C.StockID = A.StockID
Let’s look at the stats.
Sounds good! It took only 232 ms.
Conclusion
Scalar UDF’s are not good option if it is accessing the table. Always try to avoid it. There are many ways to avoid it like we have discussed some in this article. Out of the 3 workarounds we discussed here in this Article, the Workaround 1 with temporary table was the most optimized one with 121 ms ONLY.
Cheers! Happy learning and do share your feedback 🙂
Two serious flaws here, firstly, why not use an inline table valued functions? Secondly, how can anyone assess this without a test data set? Further, statistics IO and TIME will affect the measurements, not the best way for timing code executions.
LikeLike
Thanks Eirikur for your valuable feedbacks!
Test data is provided and anyone can surely asses it😃
You have brought up a valid point to utilize the inline table valued functions. It can be surely used and the function will ultimately have the query mentioned in Workaround 2 or 3.
One important point to note is we can’t create temporary table in inline table. However, it could be a good reusable solution with slight performance impact.
LikeLike