文章目录
- 1.建表语句
- 2.主键约束
- 3.主键约束的意义
- 参考文献
1.建表语句
先看一下官方给的完整的见表语句:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)[(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])][COMMENT table_comment][PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)][CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS][SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)[STORED AS DIRECTORIES][[ROW FORMAT row_format] [STORED AS file_format]| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)][LOCATION hdfs_path][TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_nameLIKE existing_table_or_view_name[LOCATION hdfs_path];data_type: primitive_type| array_type| map_type| struct_type| union_type -- (Note: Available in Hive 0.7.0 and later)primitive_type: TINYINT| SMALLINT| INT| BIGINT| BOOLEAN| FLOAT| DOUBLE| DOUBLE PRECISION -- (Note: Available in Hive 2.2.0 and later)| STRING| BINARY -- (Note: Available in Hive 0.8.0 and later)| TIMESTAMP -- (Note: Available in Hive 0.8.0 and later)| DECIMAL -- (Note: Available in Hive 0.11.0 and later)| DECIMAL(precision, scale) -- (Note: Available in Hive 0.13.0 and later)| DATE -- (Note: Available in Hive 0.12.0 and later)| VARCHAR -- (Note: Available in Hive 0.12.0 and later)| CHAR -- (Note: Available in Hive 0.13.0 and later)array_type: ARRAY < data_type >map_type: MAP < primitive_type, data_type >struct_type: STRUCT < col_name : data_type [COMMENT col_comment], ...>union_type: UNIONTYPE < data_type, data_type, ... > -- (Note: Available in Hive 0.7.0 and later)row_format: DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char][MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char][NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later)| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]file_format:: SEQUENCEFILE| TEXTFILE -- (Default, depending on hive.default.fileformat configuration)| RCFILE -- (Note: Available in Hive 0.6.0 and later)| ORC -- (Note: Available in Hive 0.11.0 and later)| PARQUET -- (Note: Available in Hive 0.13.0 and later)| AVRO -- (Note: Available in Hive 0.14.0 and later)| JSONFILE -- (Note: Available in Hive 4.0.0 and later)| INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classnamecolumn_constraint_specification:: [ PRIMARY KEY|UNIQUE|NOT NULL|DEFAULT [default_value]|CHECK [check_expression] ENABLE|DISABLE NOVALIDATE RELY/NORELY ]default_value:: [ LITERAL|CURRENT_USER()|CURRENT_DATE()|CURRENT_TIMESTAMP()|NULL ] constraint_specification:: [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ][, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ][, CONSTRAINT constraint_name FOREIGN KEY (col_name, ...) REFERENCES table_name(col_name, ...) DISABLE NOVALIDATE [, CONSTRAINT constraint_name UNIQUE (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ][, CONSTRAINT constraint_name CHECK [check_expression] ENABLE|DISABLE NOVALIDATE RELY/NORELY ]
2.主键约束
其中关于主键约束的定义语句如下:
PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE RELY/NORELY
在 Hive 中,可以使用 PRIMARY KEY 子句来定义主键约束,以确保表中的每行数据都具有唯一标识。主键约束要求表中的每个行都必须具有唯一的主键值,这可以通过在表中定义一个或多个列来实现。
当使用 PRIMARY KEY 子句定义主键约束时,可以使用 DISABLE NOVALIDATE 子句来禁用主键约束的验证。
CREATE TABLE my_table (id INT,name STRING,PRIMARY KEY (id) DISABLE NOVALIDATE
);
注意: 因为 Hive 目前对主键约束支持的不是很完善,现在暂不支持默认的 ENABLE 和 VALIDATE,所以定义主键约束时,「必须带上 DISABLE NOVALIDATE」,不然会报如下错误:
SemanticException [Error 10326]: Invalid Constraint syntax ENABLE/ENFORCED feature not supported yet. Please use DISABLE/NOT ENFORCED instead. (state=42000,code=1032
既然需要禁用主键约束,那么创建一个无法约束的主键意义何在呢?
3.主键约束的意义
对于唯一约束,目前 Hive 是不支持的,但是其他一些约束是支持的。
进入 HIVE-16575 的版本信息,会看到如下说明:
虽然 Hive 不支持主键与外键约束,但它们将被存储并可以使用 RELY 进行重写/优化。
定义主键约束时,RELY/NORELY 是可选的,缺省为 RELY。
如果一个约束指定 RELY,也就是希望 HIVE 基于代价的优化器 CBO(Cost-Based Optimizer)使用约束信息来获得更好的统计信息,得到更好的执行计划。
具体到主键约束,一般主键常用于多表关联查询,利用主键约束,可以删除不必要的连接( JOIN ELIMINATION),得到更好的执行计划。
参考文献
Hive LanguageManual - DDL
Hive 建表语句解析 - 阿里云开发者社区