WHILE LOOP can be used for batch processing and can be helpful if you are dealing with huge data processing.
Recently I did an analytics project where I had to iterate through 70K customers and generate the Monthly, Last One Year and Since Inception returns of all the customers and their respective benchmark index since 2003 to till date for every month.
So in this scenario if you notice, there are 3 interrelated iterations involved each for Client, Month and type of returns.
Initially I ran the logic with simple WHILE LOOP iterating one record at a time and kept it running over the weekends but it was not completed. So on Monday morning I tried something else with slight twicking. I utilized the Recursive CTE for month and did cross join with my dataset to get the Cash Flow which is the main ingredient of Return (XIRR). I converted the WHILE LOOP to execute in batch of few different sample batch size for the POC.
I tried with various batch size and putting below my observations after POC with each batch size with atleast 4 iterations. Below analysis is of average time taken for one client for all the months and for a single type of return.
800 MS with batch size of 1
500 MS with batch size of 50
400 MS with batch size of 100
280 MS with batch size of 500
350 MS with batch size of 1000
If you see the analysis then batch size of 500 gave us the better results.
Now I modified the logic to process data based on the range of Customer ID’s and type of return supplied as input to the stored procedure. Then I taken advantage of the server capacity and executed the 18 parallel processes using separate 18 SQL Jobs. 6 jobs for Monthly Return each for 12K clients. 6 jobs for Last One Year Return and 6 job for Since Inception Return. Result was outstanding. Process which ran for 2 days and did not completed now got completed in 1.5 hours.
Don’t you think it is worth to give a try.
Significant performance gain can be achieved by implementing batch processing using the WHILE LOOP by choosing right candidate for batch size.
Aforesaid result could be different on different databases based on database and server setting and server configuration.