Wednesday, February 25, 2015

SQL Server Knowledge Refresh - Tuning

As a contractor, a problem to face is to periodically refresh the knowledge on certain areas in order to prepare for new job applications. I want to make an effort to prepare an outline of major knowledge points to easy my future life.

Most of them are pasted from Internet sources. My motto is copy it, repeat it and spread it.

Clear the caches
--do not do it in production environment
checkpoint
dbcc dropcleanbuffers
dbcc [freeProcCache|FreeSystemCache|FlushProcInDB()]

also be able to free just one plan if you have the handle.

Enable some features/monitors (to set measurement)

set statistics [time|io]
set showplan [text|xml] or graphic execution plan in studio
--clear wait stats
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR)

DMVs to Use

-system info: dm_os_performance_counters, dm_os_wait_stats
-query info: dm_exec_requests; dm_exec_sql_text
-index info: dm_db_index_usage_stats; dm_io_virtual_file_stats

select xxx
From sys.dm_exec_requests ER
Cross Apply sys.dm_exec_sql_text(ER.sql_handle) ST
Where session_id = @@SPID

Given no context, what to look in execution plan:
they are not necessarily bad or wrong, sometimes they are good, but a place to start to make sure they are good.
scans: table, index
look up
parallelism:
estimated vs actual rows accessed:
physical reads: lack of memory or index
hash join, groud by, having etc will cause it to use tempdb

Some topics that makes bad queries

Default Cursor
Default cursor is the normal way most of people use cursor in SQL Server, it is global and update-able, even most of case writer have no intention to use as that. It is resource heavy and slow.

It can be made lighter and faster by explicitly declare as local and non-updateable.

declare c cursor local fast_forward
for select name from atable;

Cursor is expensive in SQL Server, not that severe in Oracle.

Not In vs Left Join
Not Exists and EXCEPT are better but EXCEPT adds extinct sort operator just like what UNION does, so duplication is removed.

has to consider the nullable columns.

RDBMS is good at relations, but not good at find no relations because it has to go through all the relationships to decide there is no relation.


Where In vs Where Exists

within in, exists, join, apply subquery, use exists over join, because it does not care the other details and it works with short circuit, it stops earlier and do not bother to scan the rest of data set. worst case is same as using join.

Reduce Recompile

reason to recompile:
--expected
create proc.. with recompile
exec proc with recompile
sp_recompile xx
plan was aged out of memory

--UnExpected
big changes after last execution: schema change, newer index statistics etc
interleave ddl or dml

sp_executesql vs exec()

sp_executesql , a good way to compose dynamic SQL because 1. it can be parameterized and 2. force recompile to choose right execution plan each time parameter changes

(Dynamic SQL is the same as normal SQL)
set @sql=N' xx=@xxIn and xxx=@xxxIn';
exec sp_executesql @sql,N'@xxIn int, @xxxIn date', @xxIn, @xxxIn;

parameterization is better than concatenation.

Comma-delimited parameters

string splitting is expensive, using table-values parameters is preffered approach

Implicit conversion is performance killer

Temp table vs table variable
later has less lock overhead, but otherwise temp table is better because it has statistics, can be indexed and support parallelism.

they are all stored in tempdb

UDFs

moving functions to inline form to improve performance.


Checking Execution plan

If they are bad is depends, but worth to pay attention to these keywards.
slow plan usually has:
lookup
  RID lookup:nonclustered index
  key loopup:clustered index
meaning at least one column is not in index. you got trouble when it is for lot of rows
spool
  it is a cache in the query processor. it is there because the table is lack of adequate indexes or uniqueness information. it is implemented as hiden table in tempdb(not good)
  table spool
    lazy spool
    eager spool
  index spool(eager spool)

sort
  order by, merge join, stream aggregation, windowing...
  not linear
  do you really need that sort? maybe do it in client.
hash
  hash match join
  hash match aggregate
  good for olap bad for oltp
nested loops
Scan
  index scan
  table scan

No comments: