Monday, March 12, 2012

requesting query optimization help

so here are the tables, with a little info about each one:
create table areas (
areaid uniqueidentifier primary key not null, -- PK, clustered
areanumber int identity(1, 1) not null, -- indexed
areaname varchar(100) not null,
regionnumber int not null) -- FK to regions table PK
the areas table has about 16000 rows, and three unindexed columns not
listed here.
create table regions (
regionnumber int identity(1, 1) primary key not null, -- PK, clustered
regionname varchar(100) not null,
parentregionnumber int not null) -- FK to regions table PK
the regions table has about 800 rows, and about 12 columns not listed
here
create table regionrelationships (
regionnumber int primary key not null,
relatedregionnumber int primary key not null) -- PK is clustered
the regionrelationships table has about 1700 rows, and no unlisted
columns
and here are the various versions of the stored procedure query i'm
trying to optimize with the execution plan costs listed with each one.
the original stored procedure body looked liked this:
SELECT DISTINCT a.areanumber, a.areaname
FROM areas AS a WITH (NOLOCK)
WHERE a.regionnumber = 56
ORDER BY a.areaname
cost: .205
then business rules caused a change in the stored procedure body, and
the cost went way up:
SELECT DISTINCT a.areanumber, a.areaname
FROM areas AS a WITH (NOLOCK)
WHERE a.regionnumber = 56 OR a.regionnumber IN (
SELECT relatedregionnumber
FROM regionrelationships
WHERE regionnumber = 56)
ORDER BY a.areaname
cost: .518
i've tweaked a bunch of stuff, and this is the best i've been able to
do so far:
SELECT DISTINCT a.areanumber, a.areaname
FROM areas AS a WITH (NOLOCK)
LEFT JOIN regionrelationships AS r
ON r.relatedregionnumber = a.regionnumber
WHERE a.regionnumber = 56 OR r.regionnumber = 56
ORDER BY areaname
cost: .417
most of the remaining cost is in a hash right outer join (35%, mostly
CPU) and a cluster scan on areas (44%, mostly I/O)
i've tried a lot of things, including:
* using the areanumber as the primary key of the areas table instead of
the areaid, to make the cluster index smaller, to try to make cluster
scans faster (no effect at all)
* creating clustered indexes on columns other than the primary keys,
such as the regionrelationships regionnumber column, to try to make
hash scans faster (no effect at all)
so i'm just wondering, i'm new to this aspect of optimization. is there
anything here that stands out as something i can do to improve the
performance of this stored procedure? hopefully i've provided enough
information.
thanks in advance for any help
jasonOn 15 Mar 2006 13:45:49 -0800, jason wrote:
(snip)
>then business rules caused a change in the stored procedure body, and
>the cost went way up:
>SELECT DISTINCT a.areanumber, a.areaname
>FROM areas AS a WITH (NOLOCK)
>WHERE a.regionnumber = 56 OR a.regionnumber IN (
> SELECT relatedregionnumber
> FROM regionrelationships
> WHERE regionnumber = 56)
>ORDER BY a.areaname
>cost: .518
>
>i've tweaked a bunch of stuff, and this is the best i've been able to
>do so far:
(snip)
Hi Jason,
Try this version instead:
SELECT a.areanumber, a.areaname
FROM areas AS a WITH (NOLOCK)
WHERE a.regionnumber = 56
UNION ALL
SELECT a.areanumber, a.areaname
FROM areas AS a
INNER JOIN regionrelationships AS r
ON r.relatedregionnumber = a.regionnumber
WHERE r.regionnumber = 56
AND r.relatedregionnumber <> 56
ORDER BY areaname
Indexing suggestions:
- In regionrelationships, make sure that regionnumber is the first of
the two columns in the primary key.
- In areas, have a clustered index with regionnumber as first column. Or
(maybe even better) a nonclustered index on (regionnumber, areaname,
areanumber).
Hugo Kornelis, SQL Server MVP|||thank you very much sir!

No comments:

Post a Comment