Did you ever noticed thousands of lines of code in a single Stored Procedure? Did you faced performance issues in such Stored Procedure? Did you faced difficulties in debugging such Stored Procedure?
Did you ever noticed the millions and billions of rows in a single Table? Did you faced the difficulties and performance issues in dealing with the table data viz. Querying data, Modifying data etc.?
Sorry for asking a bunch of question! But it will give you the essence of decoupling of code and data. The question asked at the beginning of this article gives the challenges that we can face if our code and data is not decoupled.
Code Decoupling is breaking a larger set of code into smaller logical, manageable and reusable objects and arranging them in a proper sequence to achieve re-usability, easy maintainability, easy code debugging, easy deployment, better performance because of optimized execution plan and parallelism, parallel development and parallel unit testing etc.
/* Stored procedure to - 1. parse the XML input (50+ input parameters in multiple hierarchies like customer info, order summary, order detail etc.), 2. log the data in staging table, 3. perform various DML operation, 4. perform the analytics; and 5. finally return the output Situation could much more complicated. Believe me, I have seen 10000 lines of code in a single stored procedures. */ CREATE PROCEDURE [dbo].[usp_DecouplingTest] ( @InputXML XML ) AS BEGIN BEGIN TRY -- Multiple lines of code to parse XML input and store it in temporary tables, table variables and other variable -- This can be part of seperate reusable stored procedure or function (Object 1) BEGIN -- Trigger Object 1 END -- Generic code to log the data in staging table.. Multiple lines of code depending upon the number of tables and number of columns in table -- This can be part of seperate reusable stored procedure (Object 2) -- Condition(s) / Strategy(s) (if any) BEGIN -- Trigger Object 2 END -- Multiple DML operations depending upon the situation...It can also be multiple lines of code -- This can also be decoupled into another stored procedure(s) (Object 3) -- Condition(s) / Strategy(s) (if any) BEGIN -- Trigger Object 3 END -- Complex analytics code comprising of multiple lines of code which can be made generic and reusable. Result can be returned in tabular format or in XML or JSON format -- This can be part of seperate reusable stored procedure (Object 4) -- Condition(s) / Strategy(s) (if any) BEGIN -- Trigger Object 4 and store the output in temporary table, table variable, String variable or XML variable for further use END -- Perform further calculation (if any) (Object 5) -- Condition(s) / Strategy(s) (if any) BEGIN -- Trigger Object 5 and return the final output END END TRY BEGIN CATCH -- Multiple lines of code to capture and return the error -- This can be part of seperate reusable stored procedure END CATCH END
Did you ever thought to apply Design Patterns in your Stored Procedures? If not, then you should. Somehow we can’t apply the design patterns ditto as we can do in C#, Java and in other programming language but we can definitely make use of some of the patterns to implement it in SQL Server programming. Some of the design patterns that I think we can partially implement in SQL Server are Adapter pattern, Factory pattern, Strategy pattern etc.
The above example partially implements the Adapter and Strategy patterns.
You must be wondering about Data Decoupling. I am not sure if this terminology is present in the market or not, but I usually use this terminology for distributing the rows of a heavy table into multiple tables having smaller data sets.
Data Decoupling in my case helped a lot in avoiding hot spot, locks, deadlocks and performance issues; by having the smaller data sets in separate tables instead of one table with big data set, and with the appropriate use of Clustered Index.
It can be implemented using Dynamic SQL with the use of parameterized query.
Avoid use of inline queries. It’s not safe in terms of Security as well as some of the future version of SQL Server functionalities do not support the inline queries. If you have used the inline queries then make it parametrized.
How to implement data decoupling?
- Develop a stored procedure to dynamically create the table with fix schema by first checking the existence of the table before actually creating it;
- Whenever you are performing any insert, update or delete
- Call the Stored Procedure mentioned in Step 1;
- Perform the INSERT, UPDATE or DELETE using Dynamic SQL with the help of parameterized query;
- Whenever you are performing read operation;
- Check the existence of the table using Dynamic SQL with the help of parameterized query;
- Perform the SELECT query by using the Dynamic SQL with the help of parameterized query;
- If you want to read from multiple tables at the same time, make use of while loop and iterate through multiple tables and store the output in a temporary table for the further use;
- For precisely implementing the Data Decoupling, apply the following best practices
- Make use of Adapter pattern and develop the generic adapters;
- Develop the generic adapter for creating tables with appropriate constraints and indexes;
- Develop the generic adapter for write;
- Develop the generic adapter for historical data. This will be single point of trigger to read the data from the table;
- Develop the generic adapter for latest data. Consume adapter mentioned in point no. 4;
- Develop the generic adapter(s) for various summary which requires read from various tables and then aggregation or simple listing. Consume adapter mentioned in point no. 4;
- Try to enforce the read on Clustered Index in all the read adapters by enforcing the input to adapter accordingly;
- This will help to control the way the data is written and read from the database. No one will be able to access the table directly. Data access will be through adapters only;
- Create a separate database schema as a place holder for the dynamic tables;
- Create the separate database schema as a place holder for the generic adapters. Limit the access to this place holder so that only authorized person can make change to the generic adapters.
If we are concerned with the performance then each and every solution and suggestion that can help improve the performance matters. Decoupling is one of the least focused area in the database design and programming, but it has a lot of potential to achieve greater performance.
This is a personal weblog. The opinions expressed here represent my own and not those of my employer.
All data and information provided on this site by Author is for informational purposes only. Author makes no representations as to accuracy, completeness, correctness, suitability, or validity of any information on this site and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use.
Please don’t use any of the examples discussed in this article in Production without evaluating it based on your need.