25 August, 2010

SQL Optimization

Process and data modelingImage via WikipediaI didn't think about optimizing my SQL scripts until some recent events, where I was called by one of the guys in the IT Infra team told me some bad news: I'm in the top 3 list. And that isn't something to be proud of: it was the list of the scripts that is being monitored as being resource-hogging.

I immediately checked on the said script, and to my surprise, it was a very simple one!

Which means one thing: I never thought about writing SQL scripts that are optimized. for 10 years, I was simply writing scripts, not minding whether or not they are optimized.

But the good side of that is this: help was available.
The guy who flagged me of the problem is also the one who would eventually show me how to do optimization. I said I was one who didn't formally learn SQL scripting, and it was all self-study, trial and error. So the big error happened.

And to my surprise, he has the same story to tell! (Which means I have nowhere to run, but to optimize my queries...)

How did we go through the optimization? Just a few things really did happen, and I can say it worked, because my daily report running at midnight had a drastic runtime reduction - from 2.5hours to less than an hour. And I'm still optimizing the scripts, of course, with the script runtime still reducing from time to time.

And so, below are what we did, 'we' at the start, then 'me' to carry on the work, till today.


5 Simple Steps taken to optimize the scripts:
1. Table Indexing
2. Optimized Views
3. Local Tables
4. Break Where Predicates
5. Rearrange the filters

Table Indexing
My colleague pointed out that before you even write a script (and my thought raced back 10 years ago when I started writing queries, huh?) is to know whether the table holding the data I want is indexed or not. If the table(s) are indexed, go ahead and write your queries against it. if not, try the next one: optimized views

Optimized Views
Many a times an index will not be possible - due to system constraints. An index taxes the RDBMS to a certain degree, and indexing one table that has updates (insert, delete, update) frequently creates more problem than it would solve. The solution? Some top-gun guys who know the trade would just create optimized views, indexed for optimum run, and independent from the main table. Of course, these views would be updated regularly, like hourly, since that is the very reason they are created - to get away from the problem of instantaneous updates that will inhibit the creation of indexes. So don't forget that optimized views, while indexed, are usually updated on an hourly basis.

Local Tables
Sometimes there are no indexed tables, and though there are optimized views, these are enough, or simply, the data you want is not in them. Views cause the problem of additional system load if they are used with additonal criteria. A view is already a query, so additional filtering against a view will pose more system load.
So create your own local table. Create your own local tables to store temporary data. Run simple queries and store the data in your local tables. You'll be surprised how it helps!

Break Where Predicates
Okay, the very use of the local tables is when you take a complex WHERE clause, break it into individual queries, small enough to have one, single WHERE clause, and run it, saving the data in the local table. And don't forget that you should be indexing these local tables for optimum run!

For example, if you have 4 filters in the WHERE clause, break it, store the data in your local tables, and run against each previous local table.

Try out cascaded query style, running queries against the previous data, and see if that doesn't speed up your scripts.

Try using JOIN in the FROM clause instead of putting everything in the WHERE clause.

Rearrange the filters
If the filters in the WHERE clause can't be broken or can't be reduced, try rearranging the filters. Goal is to begin from the top with the filter that will immediately reduce the data size being captured, for the succeeding filters to work against as the script progresses. And it may not be done at first try, so this is a continual process, even many years after. (Hey, that is exactly my experience here).

So there you have it. 5 simple steps to optimizing queries. While these I have applied, and have worked for me, the optimization steps are not limited to these 5 alone. As in software development, the saying is true: "There is nothing permanent but change." Hopefully this helps you, and it helped me - a lot.


Till then!


Enhanced by Zemanta