How to avoid scanning all records when querying top rows by a complex condition


Question:

I have a table and data like this:

create table AmountObjects
(
  objectId integer,
  unixTimestamp integer,
  amount integer,

  PRIMARY KEY  
  (
      [objectId] ASC,
      [unixTimestamp] ASC  
  )
);

insert into AmountObjects values (1, 1, 33);
insert into AmountObjects values (1, 2, 33);
insert into AmountObjects values (1, 3, 33);
insert into AmountObjects values (1, 4, 33);
insert into AmountObjects values (1, 5, 33);
insert into AmountObjects values (1, 6, 33);
insert into AmountObjects values (1, 7, 33);
insert into AmountObjects values (1, 8, 33);
insert into AmountObjects values (1, 9, 33);
insert into AmountObjects values (1, 10, 33);

I want to query the most recent records, filtered by date and by cumulative amount, but queries like this one scan all of an object's records:

select 
    a.objectId,
    a.unixTimestamp,
    a.amount,
    s.total
from AmountObjects a
cross apply
(
    select sum(amount) total from AmountObjects stat 
    where a.unixTimestamp <= stat.unixTimestamp and a.objectId = stat.objectId 
) s
where 
    unixTimestamp >= 9
    or s.total <= 150

My question is: how can I query this data without scanning all of an object's rows?

Thanks


Answer 1:

The only way I can think of doing this involves computing the backwards running total twice: once to find the target timestamp at or below which rows should be ignored (short-circuited by the TOP 1), and then again to get the running totals for the rows above it (using a seek to read only that range of rows).

Unless a very high proportion of rows can be ignored, this is unlikely to improve on the simpler approach of calculating the running total for everything and discarding what you don't need.

WITH DistinctObjects
     AS (SELECT DISTINCT objectId
         FROM   AmountObjects a),
     MinTimeStampsByObjectId
     AS (SELECT do.objectId,
                ca.minUnixTimeStamp
         FROM   DistinctObjects do
                CROSS APPLY (SELECT ISNULL((SELECT TOP 1 unixTimeStamp
                                            FROM   (SELECT *,
                                                           SUM(ao.amount)
                                                             OVER (
                                                               ORDER BY ao.unixTimeStamp DESC) AS total
                                                    FROM   AmountObjects ao
                                                    WHERE  ao.objectId = do.objectId) d
                                            WHERE  total > 150
                                            ORDER  BY d.unixTimeStamp DESC), -1))ca(minUnixTimeStamp))
SELECT ca2.*
FROM   MinTimeStampsByObjectId mts
       CROSS APPLY (SELECT *,
                           SUM(ao.amount)
                             OVER (
                               ORDER BY ao.unixTimeStamp DESC) AS total
                    FROM   AmountObjects ao
                    WHERE  ao.objectId = mts.objectId
                           AND ao.unixTimeStamp > IIF(mts.minUnixTimeStamp > 8,8,mts.minUnixTimeStamp)) ca2 
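
The two passes described above can be sketched outside the database to make the short circuit visible. The following is only an illustration in Python, not SQL Server's actual execution plan: the in-memory `rows` list and the helper names `find_cutoff` and `rows_above` are hypothetical, and the sketch assumes amounts are non-negative, so the backward running total never decreases once it has passed the limit.

```python
from bisect import bisect_right

# Hypothetical stand-in for one object's rows, sorted by unixTimestamp
# ascending like the primary key: (unixTimestamp, amount).
rows = [(ts, 33) for ts in range(1, 11)]


def find_cutoff(rows, limit=150):
    """Pass 1: walk backwards and stop at the first row whose
    backward running total exceeds the limit."""
    total = 0
    visited = 0
    for ts, amount in reversed(rows):
        visited += 1
        total += amount
        if total > limit:
            return ts, visited  # plays the role of TOP 1 ... WHERE total > 150
    return -1, visited          # plays the role of ISNULL(..., -1)


def rows_above(rows, cutoff, min_ts=9):
    """Pass 2: jump to the rows above the cutoff and recompute the
    backward running total for that range only."""
    # Rows with unixTimestamp >= min_ts must always be returned,
    # which is what IIF(cutoff > 8, 8, cutoff) does in the query.
    lower = min(cutoff, min_ts - 1)
    # Binary search on the sorted list, standing in for an index seek.
    start = bisect_right(rows, (lower, float("inf")))
    result, total = [], 0
    for ts, amount in reversed(rows[start:]):
        total += amount
        result.append((ts, amount, total))
    return result


cutoff, visited = find_cutoff(rows)
for row in rows_above(rows, cutoff):
    print(row)
```

On the sample data the first pass stops after five rows, at unixTimestamp 6, and the second pass touches only timestamps 7 through 10. The saving therefore grows with the share of older rows that never need to be read, which matches the caveat in the answer.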

Answer 2:

This should implement the same logic and be more efficient:

select a.*
from (select a.objectId, a.unixTimestamp, a.amount,
             sum(a.amount) over (partition by a.objectId order by a.unixTimeStamp desc) as total
      from AmountObjects a
     ) a
where unixTimestamp >= 9 or total <= 150;

However, it will still scan all the rows.
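
To check what the windowed query returns on the sample data from the question, it can be run against a throwaway database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, which is an assumption made purely for convenience: it requires an SQLite build with window function support (3.25 or later), and it says nothing about SQL Server's query plan or the number of rows it reads. An `order by` is added so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table AmountObjects ("
    " objectId integer, unixTimestamp integer, amount integer,"
    " primary key (objectId, unixTimestamp))"
)
# Same sample data as in the question: ten rows of amount 33.
conn.executemany(
    "insert into AmountObjects values (?, ?, ?)",
    [(1, ts, 33) for ts in range(1, 11)],
)

# The windowed running total, newest row first within each object.
query = """
select a.objectId, a.unixTimestamp, a.amount, a.total
from (select objectId, unixTimestamp, amount,
             sum(amount) over (partition by objectId
                               order by unixTimestamp desc) as total
      from AmountObjects) a
where a.unixTimestamp >= 9 or a.total <= 150
order by a.unixTimestamp desc
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With ten rows of 33 each, the backward running total is 33, 66, 99, 132, 165 and so on, so only timestamps 10, 9, 8 and 7 satisfy the filter.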

  • Posted on 2019-01-05 05:49