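Case background: a colleague ran some code in the company's data exploration tool (part of it is shown below) and got an error saying that subqueries inside IN are not supported. Yet the same code ran on a virtual machine without any error and returned results, which was puzzling.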
SELECT t1.SIGN_CODE AS bus_src,t1.ORGANIZATION_NO,t3.loan_amts,t4.restSum,NULL AS c1,NULL AS c2,NULL AS c3,t5.draft_cnt,t5.draft_amt,t5.draft_amt AS draft_balance,NULL AS c4
FROM FDM_SOR.SOR_EVT_TBL_FB_CUST t1
where t1.FB_CUST_CODE in (
select e.CUST_CODE from FDM_SOR.SOR_EVT_TBL_FB_CREDIT e where e.COMPANY_CODE='5103'
)
and t1.FB_CUST_CODE in (
select e.FB_CUST_CODE from FDM_SOR.SOR_EVT_TBL_FB_LOAN e where CURRENT_SETTLE_FLAG != 1
)
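A quick Baidu search suggested that Hive's support for subqueries is very limited: it only allows subqueries in the FROM clause of a SELECT statement. If Hive rejects a subquery you have written, check whether it can be rewritten as a join. For example, an IN subquery can be written as a semi join or a join.
As shown here, use joins to replace the subqueries inside IN: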
SELECT t1.SIGN_CODE AS bus_src,t1.ORGANIZATION_NO,t1.loan_amts,t1.restSum,NULL AS c1,NULL AS c2,NULL AS c3,t1.draft_cnt,t1.draft_amt,t1.draft_amt AS draft_balance,NULL AS c4
FROM FDM_SOR.SOR_EVT_TBL_FB_CUST t1
inner join(
select e.CUST_CODE as FB_CUST_CODE from FDM_SOR.SOR_EVT_TBL_FB_CREDIT e where e.COMPANY_CODE='5103'
) a11
on t1.FB_CUST_CODE = a11.FB_CUST_CODE
inner join
(
select e.FB_CUST_CODE from FDM_SOR.SOR_EVT_TBL_FB_LOAN e where CURRENT_SETTLE_FLAG != 1
) c11
on t1.FB_CUST_CODE = c11.FB_CUST_CODE
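One caveat about the rewrite: an inner join is not always equivalent to IN. If a subquery returns the same customer code more than once, the inner join repeats the matching t1 row once per match, whereas IN returns it only once. The semi join mentioned earlier avoids this. Below is a minimal sketch using Hive's LEFT SEMI JOIN with the same tables and filters; the select list is trimmed to two t1 columns for brevity, and this is an illustration rather than the query the colleague actually ran.

```sql
-- Sketch: IN subqueries expressed as LEFT SEMI JOINs.
-- Each t1 row is returned at most once, even if the right side has duplicates.
SELECT t1.SIGN_CODE AS bus_src,
       t1.ORGANIZATION_NO
FROM FDM_SOR.SOR_EVT_TBL_FB_CUST t1
LEFT SEMI JOIN (
    -- customers that have a credit record for company 5103
    SELECT e.CUST_CODE AS FB_CUST_CODE
    FROM FDM_SOR.SOR_EVT_TBL_FB_CREDIT e
    WHERE e.COMPANY_CODE = '5103'
) a11
ON (t1.FB_CUST_CODE = a11.FB_CUST_CODE)
LEFT SEMI JOIN (
    -- customers that have a loan whose settle flag is not 1
    SELECT e.FB_CUST_CODE
    FROM FDM_SOR.SOR_EVT_TBL_FB_LOAN e
    WHERE e.CURRENT_SETTLE_FLAG != 1
) c11
ON (t1.FB_CUST_CODE = c11.FB_CUST_CODE);
```

With LEFT SEMI JOIN, columns from the right-hand side may only be referenced in the ON condition, not in the SELECT list or WHERE clause, which is why the filters live inside the derived tables.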
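However, since the subquery runs fine on the virtual machine, Hive clearly can support subqueries inside IN. So why does the web-based exploration tool report an error? I checked the official Hive documentation and found the cause: starting with version 0.13, Hive supports more kinds of subqueries, such as IN and NOT IN subqueries. If the Hive you are using does not support subqueries such as IN, EXISTS, or NOT IN, it is most likely a version older than 0.13.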
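For an older Hive that rejects NOT IN subqueries, one common workaround is an anti-join: a LEFT OUTER JOIN followed by an IS NULL filter on the right side. The query below is a hypothetical example built from the same tables (it is not part of the original case) that finds customers with no unsettled loan. Note that it is only equivalent to NOT IN when no NULL codes are involved: if the subquery yields a NULL, NOT IN returns no rows at all, while the anti-join still returns the unmatched ones.

```sql
-- Hypothetical sketch: NOT IN rewritten as an anti-join for pre-0.13 Hive.
SELECT t1.SIGN_CODE AS bus_src,
       t1.ORGANIZATION_NO
FROM FDM_SOR.SOR_EVT_TBL_FB_CUST t1
LEFT OUTER JOIN (
    -- DISTINCT keeps each customer code once on the right side
    SELECT DISTINCT e.FB_CUST_CODE
    FROM FDM_SOR.SOR_EVT_TBL_FB_LOAN e
    WHERE e.CURRENT_SETTLE_FLAG != 1
) c11
ON (t1.FB_CUST_CODE = c11.FB_CUST_CODE)
-- keep only t1 rows that found no match on the right side
WHERE c11.FB_CUST_CODE IS NULL;
```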