目录
一、概述
二、行动
1.准备数据
2.数据清洗
3.substring_index函数
4.split函数实现
一、概述
想将hive表中包含逗号的字段按逗号做分隔符进行分列操作
二、行动
1.准备数据
--1
select '{1,2,3,4,5,6,7,8}' as num_str --使用的数据
2.数据清洗
--2
select num_str,replace(replace(num_str,'{',''),'}','') as clean1 --清洗掉大括号from (select '{1,2,3,4,5,6,7,8}' as num_str --使用的数据) t
3.substring_index函数
函数介绍:
substring_index(string A, string delim, int count)
返回分隔符出现次数之前字符串A中的子字符串,count>0是从前往后截取,count<0是从后往前截取
--3select num_str,substring_index(replace(replace(num_str,'{',''),'}',''),',',1) as index1 --从前往后1个分隔符,substring_index(replace(replace(num_str,'{',''),'}',''),',',2) as index2 --从前往后2个分隔符,substring_index(replace(replace(num_str,'{',''),'}',''),',',3) as index3 --从前往后3个分隔符,substring_index(replace(replace(num_str,'{',''),'}',''),',',-1) as index_1 --从后往前1个分隔符,substring_index(replace(replace(num_str,'{',''),'}',''),',',-2) as index_2 --从后往前2个分隔符,substring_index(replace(replace(num_str,'{',''),'}',''),',',-3) as index_3 --从后往前3个分隔符from (select '{1,2,3,4,5,6,7,8}' as num_str --使用的数据) t
4.split函数实现
split(string str, string pat)
数组(注意下角标数字从0开始)
string str :待分割字符串
string pat:分割符
--4select num_str,split(replace(replace(num_str,'{',''),'}',''), ',')[0] as s1,split(replace(replace(num_str,'{',''),'}',''), ',')[1] as s2,split(replace(replace(num_str,'{',''),'}',''), ',')[2] as s3,split(replace(replace(num_str,'{',''),'}',''), ',')[3] as s4,split(replace(replace(num_str,'{',''),'}',''), ',')[4] as s5,split(replace(replace(num_str,'{',''),'}',''), ',')[5] as s6,split(replace(replace(num_str,'{',''),'}',''), ',')[6] as s7,split(replace(replace(num_str,'{',''),'}',''), ',')[7] as s8
from (select '{1,2,3,4,5,6,7,8}' as num_str --使用的数据) t
由此可以看出,split函数可以实现我想要的效果,over~