一、背景
需求:求【昨日触达】、【本周拜访】、【本月活动数】。
背景:三个维度的数据都按日统计在一张表里。
首先想到的就是left join +group by。
1.1 优化前语句
select
user_id,
user_name,
org_id,
org_name,
sum(reach_num) dayReachNum,
sum(visit_num) weekVisitNum,
sum(activity_num) mouthActivityNum
from
(
select
srsm.user_id,
srsm.user_name,
srsm.org_id,
srsm.org_name,
srsd.reach_num,
srsw.visit_num,
srsm.activity_num
from
standard_reach_statistics srsm
left join standard_reach_statistics srsd on
srsd.user_id = srsm.user_id
and srsd.statistical_date between '2024-01-01' and '2024-01-01'
left join standard_reach_statistics srsw on
srsm.user_id = srsw.user_id
and srsw.statistical_date between '2024-01-01' and '2024-01-07'
where
srsm.statistical_date between '2024-01-01' and '2024-01-31'
) tt
group by
user_id,
user_name,
org_id,
org_name
分析这条语句存在的问题:
二、优化后语句
思路:
- 解决后两个left join表数据过大问题:可以先把后两个left join语句sum出来,在left join;
select
srsm.user_id,
reach_num_sum dayReachNum,
visit_num_sum weekVisitNum ,
sum(srsm.activity_num) mouthActivityNum
from
standard_reach_statistics srsm
left join (
select
srsd.user_id,
sum(srsd.reach_num) reach_num_sum
from
standard_reach_statistics srsd
where
srsd.statistical_date between '2024-01-01' and '2024-01-01'
group by
user_id ) sum_day on
srsm.user_id = sum_day.user_id
left join (
select
srsw.user_id,
sum(srsw.visit_num)visit_num_sum
from
standard_reach_statistics srsw
where
srsw.statistical_date between '2024-01-01' and '2024-01-07'
group by
user_id ) sum_month on
srsm.user_id = sum_month.user_id
where
srsm.statistical_date between '2024-01-01' and '2024-01-31'
group by
user_id;
优化后效率由原来的50s,变成了86ms,效率提升明显。