apache pig - Pig takes a long time to store into HBase
Hi, I am new to Hadoop.
Recently, I put a large number of text files into HDFS. I want to read these files and put them into HBase using Pig (LOAD, STORE). However, I found that storing into HBase takes a very long time.
Has anyone met a similar situation before? If so, how did you solve the problem?
Thanks
I faced the same issue when using HBaseStorage. HBaseStorage actually loads data into HBase with sequential Put operations; it is not a bulk load. See the unresolved JIRA: https://issues.apache.org/jira/browse/pig-2921
However, I saw a significant performance difference after switching to the importtsv option: http://hbase.apache.org/book/ops_mgt.html#importtsv
The bulk load involves 3 steps:
1. Pig: read the data from the source, format it to match the HBase table structure, and store it to HDFS.
2. importtsv: prepare the StoreFiles to be loaded via completebulkload.
3. completebulkload: move the generated StoreFiles into the HBase table (it is essentially a cut-and-paste).
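As a rough sketch of steps 2 and 3, assuming the Pig job has already stored tab-separated rows on HDFS: the table name, column mapping, and both paths below are hypothetical placeholders, and the class names are the ones from the HBase book of that era, so they may differ in your HBase version. The script only prints the commands unless you set RUN=1.

```shell
#!/bin/sh
# Hypothetical placeholders; replace with your own values.
TABLE="mytable"                           # target HBase table (assumed to exist already)
COLUMNS="HBASE_ROW_KEY,cf:col1,cf:col2"   # maps each TSV field to the row key or a column
TSV_DIR="/tmp/pig_tsv_output"             # HDFS dir where the Pig job stored its output
HFILE_DIR="/tmp/hfiles"                   # HDFS dir where importtsv writes StoreFiles

# Print each command by default; run it for real only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Step 2: importtsv with bulk output writes StoreFiles
# instead of issuing Puts against the table.
run hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  "-Dimporttsv.columns=${COLUMNS}" \
  "-Dimporttsv.bulk.output=${HFILE_DIR}" \
  "${TABLE}" "${TSV_DIR}"

# Step 3: completebulkload moves the generated StoreFiles into the table.
run hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  "${HFILE_DIR}" "${TABLE}"
```

Check the printed commands first, then rerun with RUN=1 on a node that has the hbase client configured.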
Hope this is useful :)