How to avoid scanning all records when querying top rows by a complex condition

Question:

I have a table and data like this:

create table AmountObjects
(
  objectId integer,
  unixTimestamp integer,
  amount integer,

  PRIMARY KEY  
  (
      [objectId] ASC,
      [unixTimestamp] ASC  
  )
);

insert into AmountObjects values (1, 1, 33);
insert into AmountObjects values (1, 2, 33);
insert into AmountObjects values (1, 3, 33);
insert into AmountObjects values (1, 4, 33);
insert into AmountObjects values (1, 5, 33);
insert into AmountObjects values (1, 6, 33);
insert into AmountObjects values (1, 7, 33);
insert into AmountObjects values (1, 8, 33);
insert into AmountObjects values (1, 9, 33);
insert into AmountObjects values (1, 10, 33);

I want to query the most recent records, filtered by date and by cumulative amount, but a query like this scans every record for each object:

select 
    a.objectId,
    a.unixTimestamp,
    a.amount,
    s.total
from AmountObjects a
cross apply
(
    select sum(amount) total from AmountObjects stat 
    where a.unixTimestamp <= stat.unixTimestamp and a.objectId = stat.objectId 
) s
where 
    unixTimestamp >= 9
    or s.total <= 150

My question is: how can I query this data without scanning all of the rows for each object?

Thanks

Answer 1:

The only way I can think of doing this involves computing the backwards running total twice: once to find the target timestamp below which rows should be ignored (short-circuited by the TOP 1), and then again to get the running totals for the rows above it (using a seek to read only that range of rows).

Unless a very high proportion of the rows can be ignored, this is unlikely to be an improvement over the simpler approach of calculating the running total for everything and discarding what you don't need.

WITH DistinctObjects
     AS (SELECT DISTINCT objectId
         FROM   AmountObjects a),
     MinTimeStampsByObjectId
     AS (SELECT do.objectId,
                ca.minUnixTimeStamp
         FROM   DistinctObjects do
                CROSS APPLY (SELECT ISNULL((SELECT TOP 1 unixTimeStamp
                                            FROM   (SELECT *,
                                                           SUM(ao.amount)
                                                             OVER (
                                                               ORDER BY ao.unixTimeStamp DESC) AS total
                                                    FROM   AmountObjects ao
                                                    WHERE  ao.objectId = do.objectId) d
                                            WHERE  total > 150
                                            ORDER  BY d.unixTimeStamp DESC), -1))ca(minUnixTimeStamp))
SELECT ca2.*
FROM   MinTimeStampsByObjectId mts
       CROSS APPLY (SELECT *,
                           SUM(ao.amount)
                             OVER (
                               ORDER BY ao.unixTimeStamp DESC) AS total
                    FROM   AmountObjects ao
                    WHERE  ao.objectId = mts.objectId
                           AND ao.unixTimeStamp > IIF(mts.minUnixTimeStamp > 8,8,mts.minUnixTimeStamp)) ca2
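
To make the short-circuit idea concrete, here is a minimal Python sketch of the logic rather than of the SQL Server plan itself. It assumes one object's rows arrive newest first (as a backward seek on the primary key would deliver them) and that amounts are never negative, so the running total can only grow and the scan can stop at the first row that fails both conditions. The function name `latest_rows` and its parameters are hypothetical.

```python
from typing import Iterable, List, Tuple

Row = Tuple[int, int, int]  # (objectId, unixTimestamp, amount)


def latest_rows(rows_desc: Iterable[Row], min_ts: int = 9, max_total: int = 150):
    """Walk one object's rows newest-first and stop at the first row that
    satisfies neither condition.

    Assumption (not stated in the thread): amounts are non-negative, so once
    the running total exceeds max_total it never drops back below it.
    """
    kept: List[Tuple[int, int, int, int]] = []
    examined = 0
    total = 0
    for object_id, ts, amount in rows_desc:
        examined += 1
        total += amount  # backwards running total
        if ts >= min_ts or total <= max_total:
            kept.append((object_id, ts, amount, total))
        else:
            break  # every older row would fail both conditions too
    return kept, examined


# Sample data from the question: timestamps 10 down to 1, amount 33 each.
rows = [(1, ts, 33) for ts in range(10, 0, -1)]
kept, examined = latest_rows(rows)
print([r[1] for r in kept], examined)
```

On the sample data this keeps timestamps 10 through 7 and reads only five of the ten rows, which is the saving the two-pass query is trying to achieve.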

Answer 2:

This should implement the same logic and be more efficient:

select a.*
from (select a.objectId, a.unixTimestamp, a.amount,
             sum(a.amount) over (partition by a.objectId order by a.unixTimeStamp desc) as total
      from AmountObjects a
     ) a
where unixTimestamp >= 9 or total <= 150;

However, it will still scan all the rows.
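
As a rough check of what this query returns for the sample data, the windowed running total can be modelled in Python. This is only an illustration of the arithmetic for a single object, not of how SQL Server executes the query; the variable names are hypothetical.

```python
from itertools import accumulate

# Sample data for objectId 1, ordered by unixTimestamp descending.
timestamps_desc = list(range(10, 0, -1))
amounts = [33] * 10

# Equivalent of SUM(amount) OVER (ORDER BY unixTimeStamp DESC):
# every row contributes to the running total, so every row is read.
totals = list(accumulate(amounts))

result = [
    (ts, total)
    for ts, total in zip(timestamps_desc, totals)
    if ts >= 9 or total <= 150
]
print(result)
```

All ten rows take part in the running total even though only four survive the filter.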

Posted on 2019-01-05 05:49