Convert HIVE table to AVRO format and export as AVRO file
Step 1:
Create an new table using AVRO SERDE based off the original table in HIVE. You can do it in HUE data browser:
CREATE TABLE avro_test_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.literal'='{
"namespace": "testnamespace.avro",
"name": "testavro",
"type": "record",
"fields": [ {"name":"strt_tstmp","type":"string"},{"name":"end_tstmp","type":"string"},{"name":"stts_cd","type":"int"}]
}');
Create an new table using AVRO SERDE based off the original table in HIVE. You can do it in HUE data browser:
CREATE TABLE avro_test_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.literal'='{
"namespace": "testnamespace.avro",
"name": "testavro",
"type": "record",
"fields": [ {"name":"strt_tstmp","type":"string"},{"name":"end_tstmp","type":"string"},{"name":"stts_cd","type":"int"}]
}');
This will create a new table in AVRO compatible format in HIVE.
Step 2: Load data from the original table
INSERT OVERWRITE TABLE avro_test_table SELECT * FROM tablename;
Step 3: Go to HUE file browser and see if the file is created. It might be created with file names as some zeroes without any extension. You should look for this file in one of the sub-folders of your application directory for ex: /data/apps/appname/appname_db/avro_test_table/xxx
Step 4:
Run the following command from command line in Hadoop
hadoop fs -cat /data/apps/appname/appname_db/avro_test_table/xxx > testdata.avro
Step 5: Look for this avro file in your home directory in edge node.
Comments