Reposition Subtotals: Reproducable bug, yay!

Steps to reproduce:
--Create test table of one uniqueidentifier column
CREATE TABLE [dbo].[UniverseCollapse] (
[SomeID] [uniqueidentifier] NOT NULL
)
--Add clustered index
CREATE CLUSTERED INDEX [IX_UniverseCollapse_1] ON
[dbo].[UniverseCollapse]([SomeID])
--Add test row
INSERT INTO UniverseCollapse (SomeID) VALUES
('F9CB8628-571D-463B-9E82-0AC4D0A3C2EB')
--Run the select over and over. Sometimes returns zero rows, sometimes
returns one.
SELECT SomeID
FROM UniverseCollapse
WHERE SomeID = 'F9CB8628-571D-463B-9E82-0AC4D0A3C2EB'
AND '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
Run the select over and over and sometimes you get zero rows, sometimes you
get one.
Without the clustered index, it never fails.
Looking closer at the two WHERE clause conditions:
1) SomeID = '...'
This is always the case, as it is the value that was inserted.
2) '6c03...' <> newid()
This is a trueism, since guid's don't conflict very that often.
If we run just:
SELECT SomeID
FROM UniverseCollapse
WHERE SomeID = 'F9CB8628-571D-463B-9E82-0AC4D0A3C2EB'
we always get one row. If we run:
SELECT SomeID
FROM UniverseCollapse
WHERE '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
we always get one row. Have both criteria together, and we sometimes get
zero rows.
Execution Plan:
|--Filter(
WHERE:(
[UniverseCollapse].[SomeID]<newid()
OR
[UniverseCollapse].[SomeID]>newid() ))
|--Clustered Index Seek(
OBJECT:(
[foo].[dbo].[UniverseCollapse].[IX_UniverseCollapse_1]),
SEEK:([UniverseCollapse].[SomeID]=F9CB8628-571D-463B-9E82-0AC4D0A3C2EB)
ORDERED FORWARD)
The "ID < newid OR ID > newid" is very interesting. If "newid()" is
evaluated twice, once for each half of the WHERE clause, and it happens to
get a GUID that has swung the other way, it could then eliminate the row
when it shouldn't.
If i only execute:
SELECT SomeID
FROM UniverseCollapse
WHERE '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
i get the plan:
|--Filter(
WHERE:(
STARTUP
R(6C03C85D-8151-43ED-858F-7EC5D887689B<>newid()) ))
|--Clustered Index Scan(
OBJECT:([foo].[dbo].[UniverseCollapse].[IX_UniverseCollapse_1]))
And it's only evaluating newid() once. So the question then becomes, "Why
does it fail when a clustered index is there?" Looking at the execution
plan if i remove the clustered index:
|--Filter(
WHERE:(STARTUP EXPR(6C03C85D-8151-43ED-858F-7EC5D887689B<>newid())))
|--Table Scan(
OBJECT:([foo].[dbo].[UniverseCollapse]),
WHERE:([UniverseCollapse].[SomeID]=F9CB8628-571D-463B-9E82-0AC4D0A3C2EB))
It doesn't do a "guid > newid or guid < newid", but instead does a "Startup
Expression". Which means that the problem is in fact that newID is being
evaluated twice for the "field <> newid()"
Yay! i found a bug! So do i get a cheque or something?Oh. SQL Server 2000 w/SP4
select @.@.version
Microsoft SQL Server 2000 - 8.00.2039 (Intel X86)
May 3 2005 23:18:38
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
Doesn't fail in 2k5, which does a "value <> newid()" rather than "value <
newid() or value > newid()"|||Yes, it's a bug (at least IMO).
However, since it's fixed in 2005, I'd say your chances of getting a SQL
2000 fix are about equal to the chance of getting
'6c03c85d-8151-43ed-858f-7ec5d887689b' the next time you run NewID() <grin>.
Tom
"Ian Boyd" <ian.msnews010@.avatopia.com> wrote in message
news:uB0m1xbbIHA.1164@.TK2MSFTNGP02.phx.gbl...
> Oh. SQL Server 2000 w/SP4
> select @.@.version
> Microsoft SQL Server 2000 - 8.00.2039 (Intel X86)
> May 3 2005 23:18:38
> Copyright (c) 1988-2003 Microsoft Corporation
> Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
>
> Doesn't fail in 2k5, which does a "value <> newid()" rather than "value <
> newid() or value > newid()"
>
>|||And as for the check, "it's in the mail" <grin>
Tom
"Tom Cooper" <tomcooper@.comcast.no.spam.please.net> wrote in message
news:eruUuCcbIHA.4208@.TK2MSFTNGP04.phx.gbl...
> Yes, it's a bug (at least IMO).
> However, since it's fixed in 2005, I'd say your chances of getting a SQL
> 2000 fix are about equal to the chance of getting
> '6c03c85d-8151-43ed-858f-7ec5d887689b' the next time you run NewID()
> <grin>.
> Tom
> "Ian Boyd" <ian.msnews010@.avatopia.com> wrote in message
> news:uB0m1xbbIHA.1164@.TK2MSFTNGP02.phx.gbl...
>> Oh. SQL Server 2000 w/SP4
>> select @.@.version
>> Microsoft SQL Server 2000 - 8.00.2039 (Intel X86)
>> May 3 2005 23:18:38
>> Copyright (c) 1988-2003 Microsoft Corporation
>> Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 4)
>>
>> Doesn't fail in 2k5, which does a "value <> newid()" rather than "value <
>> newid() or value > newid()"
>>
>|||> And as for the check, "it's in the mail" <grin>
> Tom
Excellent! Maybe i can buy a copy of 2005.|||Hi Ian,
Here is a good article about changes in behavior for this function, between
versions 2000 and 2005.
Change in Behavior of RAND and NEWID in SQL Server 2005
http://www.sqlmag.com/Articles/ArticleID/97032/97032.html?Ad=1
Enjoy it,
AMB
"Ian Boyd" wrote:
> Steps to reproduce:
> --Create test table of one uniqueidentifier column
> CREATE TABLE [dbo].[UniverseCollapse] (
> [SomeID] [uniqueidentifier] NOT NULL
> )
> --Add clustered index
> CREATE CLUSTERED INDEX [IX_UniverseCollapse_1] ON
> [dbo].[UniverseCollapse]([SomeID])
> --Add test row
> INSERT INTO UniverseCollapse (SomeID) VALUES
> ('F9CB8628-571D-463B-9E82-0AC4D0A3C2EB')
> --Run the select over and over. Sometimes returns zero rows, sometimes
> returns one.
> SELECT SomeID
> FROM UniverseCollapse
> WHERE SomeID = 'F9CB8628-571D-463B-9E82-0AC4D0A3C2EB'
> AND '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
>
> Run the select over and over and sometimes you get zero rows, sometimes you
> get one.
> Without the clustered index, it never fails.
> Looking closer at the two WHERE clause conditions:
> 1) SomeID = '...'
> This is always the case, as it is the value that was inserted.
> 2) '6c03...' <> newid()
> This is a trueism, since guid's don't conflict very that often.
> If we run just:
> SELECT SomeID
> FROM UniverseCollapse
> WHERE SomeID = 'F9CB8628-571D-463B-9E82-0AC4D0A3C2EB'
> we always get one row. If we run:
> SELECT SomeID
> FROM UniverseCollapse
> WHERE '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
> we always get one row. Have both criteria together, and we sometimes get
> zero rows.
> Execution Plan:
> |--Filter(
> WHERE:(
> [UniverseCollapse].[SomeID]<newid()
> OR
> [UniverseCollapse].[SomeID]>newid() ))
> |--Clustered Index Seek(
> OBJECT:(
> [foo].[dbo].[UniverseCollapse].[IX_UniverseCollapse_1]),
> SEEK:([UniverseCollapse].[SomeID]=F9CB8628-571D-463B-9E82-0AC4D0A3C2EB)
> ORDERED FORWARD)
> The "ID < newid OR ID > newid" is very interesting. If "newid()" is
> evaluated twice, once for each half of the WHERE clause, and it happens to
> get a GUID that has swung the other way, it could then eliminate the row
> when it shouldn't.
>
> If i only execute:
> SELECT SomeID
> FROM UniverseCollapse
> WHERE '6c03c85d-8151-43ed-858f-7ec5d887689b' <> newID()
> i get the plan:
> |--Filter(
> WHERE:(
> STARTUP
> R(6C03C85D-8151-43ED-858F-7EC5D887689B<>newid()) ))
> |--Clustered Index Scan(
> OBJECT:([foo].[dbo].[UniverseCollapse].[IX_UniverseCollapse_1]))
> And it's only evaluating newid() once. So the question then becomes, "Why
> does it fail when a clustered index is there?" Looking at the execution
> plan if i remove the clustered index:
> |--Filter(
> WHERE:(STARTUP EXPR(6C03C85D-8151-43ED-858F-7EC5D887689B<>newid())))
> |--Table Scan(
> OBJECT:([foo].[dbo].[UniverseCollapse]),
> WHERE:([UniverseCollapse].[SomeID]=F9CB8628-571D-463B-9E82-0AC4D0A3C2EB))
> It doesn't do a "guid > newid or guid < newid", but instead does a "Startup
> Expression". Which means that the problem is in fact that newID is being
> evaluated twice for the "field <> newid()"
>
> Yay! i found a bug! So do i get a cheque or something?
>
>|||> Here is a good article about changes in behavior for this function,
> between
> versions 2000 and 2005.
> Change in Behavior of RAND and NEWID in SQL Server 2005
> http://www.sqlmag.com/Articles/ArticleID/97032/97032.html?Ad=1
It should be mentioned that these changes to newid() are not intended for
this case. These changes ensure that Rand() and Newid() are only evaluated
once for each reference. Before if you aliased a column that was a Rand or a
NewID, it would generate a new value for each time the alias is used.
For example in:
SELECT
CASE
WHEN TheValue < 0.5 THEN 'LessThanPoint5'
ELSE 'GreaterThanPoint5'
END AS Test1,
CASE
WHEN TheValue < 0.5 THEN 'LessThanPoint5'
ELSE 'GreaterThanPoint5'
END AS Test2,
CASE
WHEN TheValue < 0.5 THEN 'LessThanPoint5'
ELSE 'GreaterThanPoint5'
END AS Test3
FROM (SELECT rand() as TheValue) ADerivedTable
it happens that for any given execution, Test1 Test2 and Test3 need not be
equal, even through you're selecting rand() only once. Because the rand
column is aliased, it generates a new value.
This is also a bug, but not quite the situation i have, since i am not
referencing the rand or newid column more than once. The optimizer is
internally turning my one reference into two references - and then exposing
the bug.

Wednesday, March 7, 2012

Reproducable bug, yay!

No comments:

Post a Comment

Reposition Subtotals

Blog Archive

About Me