In Importing data from Sqoop into Hive External Table with Avro encoding I blogged about how to sqoop data from an RDBMS into Hive. But I wanted to take it to the next step by moving the downloaded data into an ORC table. I followed these steps to achieve that. First, look at the schema of the existing Avro-backed table:
show create table CUSTOMER;
You will get output that looks something like this; it contains the schema of the table:
CREATE EXTERNAL TABLE `CUSTOMER`(
  `contactid` int COMMENT 'from deserializer',
  `firstname` string COMMENT 'from deserializer',
  `lastname` string COMMENT 'from deserializer',
  `email` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
  'hdfs://sandbox.hortonworks.com:8020/tmp/customer/data'
TBLPROPERTIES (
  'avro.schema.url'='hdfs:///tmp/customer/schema/customer.avsc',
  'transient_lastDdlTime'='1431719666')
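The 'avro.schema.url' property points to the Avro schema file on HDFS that Sqoop generated during the import. As a rough sketch (the field names are taken from the columns above; the exact file Sqoop writes for your table may differ), /tmp/customer/schema/customer.avsc would look something like this:

```json
{
  "type": "record",
  "name": "CUSTOMER",
  "fields": [
    {"name": "contactid", "type": ["null", "int"],    "default": null},
    {"name": "firstname", "type": ["null", "string"], "default": null},
    {"name": "lastname",  "type": ["null", "string"], "default": null},
    {"name": "email",     "type": ["null", "string"], "default": null}
  ]
}
```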
Next, create an ORC version of the table with the same columns. Note that with ORC no ROW FORMAT DELIMITED clause is needed (ORC manages its own layout), and STORED AS ORC must come before LOCATION and TBLPROPERTIES:

CREATE EXTERNAL TABLE `CUSTOMER_ORC`(
  `contactid` int,
  `firstname` string,
  `lastname` string,
  `email` string)
STORED AS ORC
LOCATION 'hdfs://sandbox.hortonworks.com:8020/tmp/customer/data_orc'
TBLPROPERTIES ("orc.compress"="SNAPPY","orc.row.index.stride"="20000");
Finally, copy the data from the Avro table into the ORC table:

insert into table CUSTOMER_ORC select * from CUSTOMER;
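As a quick sanity check (my own addition, not part of the original steps), you can verify the copy succeeded by comparing row counts between the two tables:

```sql
-- Both counts should match if the INSERT ... SELECT copied every row
SELECT COUNT(*) FROM CUSTOMER;
SELECT COUNT(*) FROM CUSTOMER_ORC;
```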