Work around to scalar functions – SQL Server

There is no harm in using the Scalar UDF’s if it is used for Arithmetic operation or other purposes which doesn’t involves accessing any table. Always create such Scalar UDF’s with SCHEMABINDING to avoid Spooling.

But, Scalar UDF’s are not good option if it is accessing the table.

Let us evaluate the above statement.

We have a table that holds all the Stocks and another table that hold the Day wise Stock Price for every Stock. We want to fetch the Latest Stock Price of all the Stocks for a given date.

Let us first create the tables.

-- Table for Stock master
CREATE TABLE [dbo].[StockMaster]
(
	StockID			INT NOT NULL IDENTITY(1,1)
				CONSTRAINT PK_StockMaster_StockID PRIMARY KEY CLUSTERED
	, StockName		VARCHAR(50)
)

GO

-- Table for Daily Stock Price
CREATE TABLE [dbo].[StockPrice]
(
	ID			NUMERIC NOT NULL IDENTITY(1,1)
				CONSTRAINT PK_StockPrice_ID PRIMARY KEY CLUSTERED
	, StockID		INT
	, PriceDate		DATE
	, StockPrice	NUMERIC(18,4)
)

CREATE UNIQUE NONCLUSTERED INDEX UNC_StockPrice_PriceDate_StockID
ON [dbo].[StockPrice] (PriceDate DESC, StockID ASC)

Now let’s fill these table with some data. Please wait for some time to complete the execution of the queries for our analysis.

-- Query to insert records into StockMaster table
SET NOCOUNT ON

DECLARE @StockID		INT

SET @StockID	= 1
WHILE (@StockID <= 20000)
	BEGIN
		INSERT INTO [dbo].[StockMaster] (StockName) VALUES ('Stock ' + CAST(@StockID AS VARCHAR(50)))

		SET @StockID	=	@StockID + 1;
	END

GO

-- Query to insert records in StockPrice for every Stock and for every Date since 2016-10-01 to till date
SET NOCOUNT ON

DECLARE @Date		DATE
	, @StockID		INT

SET @Date		= '2016-10-01'

WHILE (@Date	<= CAST(GETDATE() AS DATE))
	BEGIN
		SET @StockID	= 1
		WHILE (@StockID <= 20000)
			BEGIN
				INSERT INTO [dbo].[StockPrice] (StockID, PriceDate, StockPrice) VALUES (@StockID, @Date, (CHECKSUM(GETDATE()) % 100))

				SET @StockID	=	@StockID + 1;
			END

		SET @Date	=	DATEADD(DD, 1, @Date);
	END

GO

We now have the data available for our Analysis. Let’s fetch the Latest Stock Price for all the Stocks for a given date using the Scalar UDF.

First create the Scalar UDF to return the latest stock price for a given date.

-- Scalar UDF to return the Latest Stock Price for a given date
CREATE FUNCTION [dbo].[ufn_GetLatestStockPrice]
(
	@StockID			INT
	, @AsOnDate			DATE
)
RETURNS NUMERIC(18,4)
AS
	BEGIN
		DECLARE @StockPrice			NUMERIC(18,4)
		DECLARE @LatestPriceDate	DATE

		SELECT @LatestPriceDate		=	MAX([PriceDate])
		FROM [dbo].[StockPrice]
		WHERE [PriceDate]	<=	@AsOnDate
			AND [StockID]	=	@StockID

		SELECT @StockPrice	=	[StockPrice]
		FROM [dbo].[StockPrice]
		WHERE [PriceDate]	=	@LatestPriceDate
			AND [StockID]	=	@StockID

		RETURN COALESCE(@StockPrice, 0);
	END

GO

Now, run the following query which is using scalar UDF to return the latest Stock Price for a given date.

-- Original Query with Scalar function
SET NOCOUNT ON

SET STATISTICS IO ON
SET STATISTICS TIME ON

DECLARE @AsOnDate	DATE

SET @AsOnDate		= '2016-10-13'

SELECT StockID
	, StockName
	, [dbo].[ufn_GetLatestStockPrice] (StockID, @AsOnDate) AS [StockPrice]
FROM [dbo].[StockMaster]

Let’s look at the stats.

stats1

It took 241219 ms which is equivalent to 241 seconds.

Workarounds to avoid scalar UDF’s

Workaround 1… Highly recommended
-- Workaround 1 to replace the Scalar UDF (Highly recommended)
SET NOCOUNT ON

DECLARE @AsOnDate	DATE

SET @AsOnDate		= '2016-10-13'

IF OBJECT_ID('tempdb..#Max_PriceDate') IS NOT NULL DROP TABLE #Max_PriceDate
SELECT StockID
	, MAX(PriceDate) AS [Max_PriceDate]
INTO #Max_PriceDate
FROM [dbo].[StockPrice]
WHERE [PriceDate]	<=	@AsOnDate
GROUP BY StockID

CREATE NONCLUSTERED INDEX IX_#Max_PriceDate_StockID	ON #Max_PriceDate(StockID)

SELECT A.StockID
	, A.StockName
	, C.StockPrice
FROM [dbo].[StockMaster] A
	INNER JOIN #Max_PriceDate B
		ON B.StockID	=	A.StockID
	INNER JOIN [dbo].[StockPrice] C
		ON C.PriceDate	=	B.Max_PriceDate
		AND C.StockID	=	A.StockID

Let’s look at the stats.

stats2

Isn’t it Great! It took only 121 ms.

Workaround 2
-- Workaround 2 to replace the Scalar UDF
SET NOCOUNT ON

DECLARE @AsOnDate	DATE

SET @AsOnDate		= '2016-10-13'

SELECT A.StockID
	, A.StockName
	, C.StockPrice
FROM [dbo].[StockMaster] A
	CROSS APPLY
		(
			SELECT MAX(PriceDate) AS [Max_PriceDate]
			FROM [dbo].[StockPrice]
			WHERE [PriceDate]	<=	@AsOnDate
				AND StockID	=	A.StockID
		) B
	INNER JOIN [dbo].[StockPrice] C
		ON C.PriceDate	=	B.Max_PriceDate
		AND C.StockID	=	A.StockID

Let’s look at the stats.

stats3

Not bad! It took only 210 ms.

Workaround 3
-- Workaround 3 to replace the Scalar UDF
SET NOCOUNT ON

DECLARE @AsOnDate	DATE

SET @AsOnDate		= '2016-10-13'

; WITH cte_max_stock_price AS
(
	SELECT StockID
		, MAX(PriceDate) AS [Max_PriceDate]
	FROM [dbo].[StockPrice]
	WHERE [PriceDate]	<=	@AsOnDate
	GROUP BY StockID
)

SELECT A.StockID
	, A.StockName
	, C.StockPrice
FROM [dbo].[StockMaster] A
	INNER JOIN cte_max_stock_price B
		ON B.StockID	=	A.StockID
	INNER JOIN [dbo].[StockPrice] C
		ON C.PriceDate	=	B.Max_PriceDate
		AND C.StockID	=	A.StockID

Let’s look at the stats.

stats4

Sounds good! It took only 232 ms.

Conclusion

Scalar UDF’s are not good option if it is accessing the table. Always try to avoid it. There are many ways to avoid it like we have discussed some in this article. Out of the 3 workarounds we discussed here in this Article, the Workaround 1 with temporary table was the most optimized one with 121 ms ONLY.

Cheers! Happy learning and do share your feedback 🙂

3 thoughts on “Work around to scalar functions – SQL Server

  1. Two serious flaws here, firstly, why not use an inline table valued functions? Secondly, how can anyone assess this without a test data set? Further, statistics IO and TIME will affect the measurements, not the best way for timing code executions.

    Like

    1. Thanks Eirikur for your valuable feedbacks!

      Test data is provided and anyone can surely asses it😃

      You have brought up a valid point to utilize the inline table valued functions. It can be surely used and the function will ultimately have the query mentioned in Workaround 2 or 3.

      One important point to note is we can’t create temporary table in inline table. However, it could be a good reusable solution with slight performance impact.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s