jruby - HBase shell scan bytes to string conversion
When I scan an HBase table, I would like to see integers as strings (not their binary representation). I can do the conversion, but I have no idea how to write the scan statement using the Java API from the HBase shell:
 org.apache.hadoop.hbase.util.Bytes.toString("\x48\x65\x6c\x6c\x6f\x20\x48\x42\x61\x73\x65".to_java_bytes)
 org.apache.hadoop.hbase.util.Bytes.toString("Hello HBase".to_java_bytes)
I would be happy to have examples of a scan that searches binary data (longs) and outputs normal strings. I am using the HBase shell, not Java.
HBase stores data as byte arrays (untyped). Therefore, if you perform a table scan, the data is displayed in a common format (an escaped hexadecimal string), e.g.:
 "\x48\x65\x6c\x6c\x6f\x20\x48\x42\x61\x73\x65" -> Hello HBase
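To make the escaped display format concrete, here is a minimal plain-Ruby sketch of how a byte array can be rendered as an escaped string. The helper name escape_bytes is hypothetical, and the rule used (printable ASCII shown as-is, everything else as \xNN) only approximates what the shell does; the exact set of characters left unescaped may differ between HBase versions.

```ruby
# Hypothetical helper: approximates how a cell value is rendered as an
# escaped string. Printable ASCII (except the backslash) is shown as-is,
# every other byte becomes \xNN.
def escape_bytes(bytes)
  bytes.map { |b|
    if b >= 0x20 && b <= 0x7e && b != 0x5c
      b.chr
    else
      format('\\x%02X', b)
    end
  }.join
end

text_value = "Hello HBase".bytes      # a string stored as bytes
int_value  = [1].pack('N').bytes      # the integer 1 as 4 big-endian bytes

puts escape_bytes(text_value)   # prints Hello HBase
puts escape_bytes(int_value)    # prints \x00\x00\x00\x01
```

This is why a stored integer shows up as an unreadable run of \x00 escapes in a plain scan: the shell has no type information and can only print the raw bytes.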
If you want to get back the typed value from the serialized byte array, you have to do the conversion manually. You have the following options:
- Java code (Bytes.toString(...))
- hack the to_string function in $HBASE/HOME/lib/ruby/hbase/table.rb: replace toStringBinary with toInt for non-meta tables
- write a get/scan JRuby function which converts the byte array to the appropriate type
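Whichever option you pick, the conversion itself is the same: the bytes are interpreted as a big-endian integer or as UTF-8 text. The following plain-Ruby sketch mirrors what Bytes.toInt, Bytes.toLong and Bytes.toString do on the Java side. The helper names decode_int, decode_long and decode_string are hypothetical and only serve as an illustration; inside the HBase shell you would call the Bytes methods directly.

```ruby
# Hypothetical helpers that mirror Bytes.toInt / Bytes.toLong / Bytes.toString.
# HBase serializes numbers in big-endian byte order.
def decode_int(raw)
  raise ArgumentError, "expected 4 bytes" unless raw.bytesize == 4
  raw.unpack('l>').first   # signed 32-bit, big-endian
end

def decode_long(raw)
  raise ArgumentError, "expected 8 bytes" unless raw.bytesize == 8
  raw.unpack('q>').first   # signed 64-bit, big-endian
end

def decode_string(raw)
  raw.dup.force_encoding('UTF-8')   # reinterpret the bytes as UTF-8 text
end

puts decode_int("\x00\x00\x00\x2A".b)                                   # prints 42
puts decode_long("\x00\x00\x00\x00\x00\x00\x01\x00".b)                  # prints 256
puts decode_string("\x48\x65\x6c\x6c\x6f\x20\x48\x42\x61\x73\x65".b)    # prints Hello HBase
```

Note that the decoder has to match the type that was written: an 8-byte long cannot be read with the 4-byte integer decoder, which is why the sketch checks the length before unpacking.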
Since you want to stay in the HBase shell, consider the last option:
 Create a file get_result.rb:
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client.ResultScanner
import org.apache.hadoop.hbase.client.Result
import java.util.ArrayList

# Simple function equivalent to scan 'test', {COLUMNS => 'c:c2'}
def get_result()
  htable = HTable.new(HBaseConfiguration.new, "test")
  # Scan only column family "c", qualifier "c2"
  rs = htable.getScanner(Bytes.toBytes("c"), Bytes.toBytes("c2"))
  output = ArrayList.new
  output.add "ROW\t\t\t\t\t\tCOLUMN\+CELL"
  rs.each { |r|
    r.raw.each { |kv|
      row = Bytes.toString(kv.getRow)
      fam = Bytes.toString(kv.getFamily)
      ql = Bytes.toString(kv.getQualifier)
      ts = kv.getTimestamp
      # Decode the cell value as a 4-byte integer instead of printing raw bytes
      val = Bytes.toInt(kv.getValue)
      output.add " #{row} \t\t\t\t\t\t column=#{fam}:#{ql}, timestamp=#{ts}, value=#{val}"
    }
  }
  output.each {|line| puts "#{line}\n"}
end

Load it in the HBase shell and use it:
 require '/path/to/get_result'
 get_result
Note: modify/enhance/fix the code according to your needs.