较之于
group by,开窗函数over的好处在于:over返回的是group by之后再join的结果。也就是说,over返回的大小和原表格应该是一致的,且能够获取到除了group by之外的column。
create table temp.orders_temp(name VARCHAR(20), cate int, s int, d int);
insert into orders_temp values
("a", 1, 1, 13),
("b", 2, 2, 112),
("c", 3, 3, 12),
("d", 2, 4, 1),
("e", 1, 5, 3),
("f", 3, 6, 4),
("g", 2, 7, 76),
("h", 3, 8, 5),
("i", 2, 9, 1);
select
name as nm,
cate as ct,
s,
d,
max(s) over() as a,
max(s) over(partition by cate) b,
max(d) over(partition by name) c,
sum(s) over(partition by cate) d,
sum(d) over(partition by name) e,
sum(s) over(partition by cate order by s) f,
sum(s) over(partition by cate order by s rows between unbounded preceding and unbounded following) g,
sum(s) over(partition by cate order by s rows between unbounded preceding and current row) h,
sum(s) over(partition by cate order by d) i,
sum(s) over(partition by cate order by d rows between unbounded preceding and unbounded following) j,
sum(s) over(partition by cate order by d rows between unbounded preceding and current row) k,
rank() over(partition by cate order by s) l,
row_number() over() m,
row_number() over(partition by cate order by s) n,
row_number() over(partition by cate order by d) o
from temp.orders_temp;
nm ct s d a b c d e f g h i j k l m n o
a 1 1 13 9 5 13 6 13 1 6 1 6 6 6 1 1 1 2
e 1 5 3 9 5 3 6 3 6 6 6 5 6 5 2 5 2 1
b 2 2 112 9 9 112 22 112 2 22 2 22 22 22 1 2 1 4
d 2 4 1 9 9 1 22 1 6 22 6 13 22 4 2 4 2 1
g 2 7 76 9 9 76 22 76 13 22 13 20 22 20 3 7 3 3
i 2 9 1 9 9 1 22 1 22 22 22 13 22 13 4 9 4 2
c 3 3 12 9 8 12 17 12 3 17 3 17 17 17 1 3 1 3
f 3 6 4 9 8 4 17 4 9 17 9 6 17 6 2 6 2 1
h 3 8 5 9 8 5 17 5 17 17 17 14 17 14 3 8 3 2
over总结:
-
over()是对全局进行操作; -
over(partition by oo)是将数据集按oo的值不同切分成若干组,分别对每个组整组(从头至尾)进行操作,相当于省略了rows between unbounded preceding and unbounded following; -
over(partition by oo order by xx)是oo切分好的组,按照xx排序后,对当前组当前“值”(而不是“行” (current row),对比i和k):当分组里的xx列都是不重复的值,此时,相当于省略了rows between unbounded preceding and current row,如h和f;当分组里的xx列出现重复的值,相同的值 是最小的操作单位,而rows between unbounded preceding and current row严格按照当前行进行操作。
对比over 中 order by和rows between unbounded preceding and current row:
-
order by是 按值操作,如果省略rows,则是从该组第一行到当前“值”(如果有重复,则是到相同值的最后一行,比如i); -
rows是 按行操作。
数据切片
f和i
nm ct s d f i
a 1 1 13 1 6
e 1 5 3 6 5
b 2 2 112 2 22
d 2 4 1 6 13
g 2 7 76 13 20
i 2 9 1 22 13
c 3 3 12 3 17
f 3 6 4 9 6
h 3 8 5 17 14
i和k
nm ct s d i k
a 1 1 13 6 6
e 1 5 3 5 5
b 2 2 112 22 22
d 2 4 1 13 4
g 2 7 76 20 20
i 2 9 1 13 13
c 3 3 12 17 17
f 3 6 4 6 6
h 3 8 5 14 14
d按升序排序后的i
ct s d i
1 5 3 5
1 1 13 6
2 4 1 13
2 9 1 13
2 7 76 20
2 2 112 22
3 6 4 6
3 8 5 14
3 3 12 17
d按升序排序后的k
ct s d k
1 5 3 5
1 1 13 6
2 4 1 4
2 9 1 13
2 7 76 20
2 2 112 22
3 6 4 6
3 8 5 14
3 3 12 17











网友评论