背景
select t1.p_date,t1.member_id,top3_domains,CASE WHEN label_code = 'realtime' THEN COLLECT_SET(bayes_first) END as realtime_bysfrom create_top3 t1 join consume_profile t2 on t1.p_date=t2.p_date and cast(t1.member_id as string)=cast(t2.member_id as string)group by 1,2,3,4报错:
aggregate functions are not allowed in GROUP BY, but found CASE WHEN (t2.`label_code` = 'realtime') THEN collect_set(t2.`bayes_first`) END;
原因&解决方案
你遇到的错误是因为在 GROUP BY 子句中,不能包含聚合函数(如 COLLECT_SET)之外的其他列或表达式。在这种情况下,你可以使用子查询来解决问题。以下是一种修改方法:
SELECT t1.p_date,t1.member_id,top3_domains,MAX(CASE WHEN t2.label_code = 'realtime' THEN t2.collect_set_bayes_first END) AS realtime_bys
FROM create_top3 t1
JOIN (SELECTp_date,member_id,label_code,COLLECT_SET(bayes_first) AS collect_set_bayes_firstFROM consume_profileWHERE label_code IN ('text_bayes_first_realtime', 'text_bayes_first_short_term', 'text_bayes_first_long_term')GROUP BY p_date,member_id,label_code) t2
ON t1.p_date = t2.p_date AND cast(t1.member_id as string) = cast(t2.member_id as string)
GROUP BY 1, 2, 3;
声明:本文章有 AI 辅助