[HBase] Backing up tables with CopyTable


Unless otherwise noted, all posts on this blog are original. Please credit the source when reposting: Big data enthusiast (http://www.lubinsu.com/)

Permalink: [HBase] Backing up tables with CopyTable (http://www.lubinsu.com/index.php/archives/322)

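CopyTable usage:
The target table must be created before running the command; CopyTable does not create it.
CopyTable can restrict the copy to a time range or a row range, rename the table, rename column families, and optionally copy delete markers and deleted cells. For example:
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=dstClusterZK:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable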
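The two timestamps in the command above are epoch milliseconds, and they are exactly one hour apart. A minimal dry-run sketch of how such a window can be derived follows; it only prints the command, and the variable names (START_MS, WINDOW_MS, END_MS) are illustrative assumptions, not part of CopyTable.

```shell
#!/bin/sh
# Dry-run sketch: derive --endtime from --starttime plus a window length.
# Variable names are hypothetical; nothing here talks to HBase.
set -eu

START_MS=1265875194289              # window start, epoch milliseconds
WINDOW_MS=$((60 * 60 * 1000))       # one hour expressed in milliseconds
END_MS=$((START_MS + WINDOW_MS))    # window end, epoch milliseconds

# Print the command instead of running it, so it can be reviewed first.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=${START_MS} --endtime=${END_MS} --peer.adr=dstClusterZK:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable"
```

Printing rather than executing keeps the sketch safe to try on a machine without HBase; remove the echo once the generated command looks right.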
1. Same cluster, different table name
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=tableCopy srcTable
2. Copying a table to another cluster
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=dstClusterZK:2181:/hbase srcTable
With this form, the source and target tables have the same name.
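The --peer.adr value has three colon-separated parts: the ZooKeeper quorum, the ZooKeeper client port, and the parent znode of the target cluster. As a rough sketch, it can be assembled from separate variables; the variable names are assumptions for illustration, and the real values should be taken from the target cluster's configuration.

```shell
#!/bin/sh
# Dry-run sketch: assemble the --peer.adr argument from its three parts.
# The values below are placeholders; use the target cluster's own settings.
set -eu

ZK_QUORUM="192.168.0.113,192.168.0.114,192.168.0.115"  # comma-separated ZooKeeper hosts
ZK_PORT=2181                                            # ZooKeeper client port
ZNODE_PARENT="/hbase"                                   # parent znode of the target cluster

PEER_ADR="${ZK_QUORUM}:${ZK_PORT}:${ZNODE_PARENT}"

# Print the command instead of running it.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=${PEER_ADR} srcTable"
```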
Reference (from the Apache HBase Reference Guide):

CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The target table must already exist. The usage is as follows:

$ ./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help
Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>

Options:
 rs.class     hbase.regionserver.class of the peer cluster,
              specify if different from current cluster
 rs.impl      hbase.regionserver.impl of the peer cluster
 startrow     the start row
 stoprow      the stop row
 starttime    beginning of the time range (unixtime in millis)
              without endtime means from starttime to forever
 endtime      end of the time range. Ignored if no starttime specified.
 versions     number of cell versions to copy
 new.name     new table's name
 peer.adr     Address of the peer cluster given in the format
              hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
 families     comma-separated list of families to copy
              To copy from cf1 to cf2, give sourceCfName:destCfName.
              To keep the same name, just give "cfName"
 all.cells    also copy delete markers and deleted cells

Args:
 tablename    Name of the table to copy

Examples:
 To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
 $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable

For performance consider the following general options:
 It is recommended that you set the following to >=100. A higher value uses more memory but
 decreases the round trip time to the server and may increase performance.
   -Dhbase.client.scanner.caching=100
 The following should always be set to false, to prevent writing data twice, which may produce
 inaccurate results.
   -Dmapred.map.tasks.speculative.execution=false
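According to the usage line, the general options come right after the class name and before the CopyTable-specific options. A small dry-run sketch of where the two recommended -D settings would go follows; the table names tableCopy and srcTable are reused from the earlier example, and the GENERAL_OPTS variable is an illustrative assumption.

```shell
#!/bin/sh
# Dry-run sketch: place the recommended general options before the
# CopyTable-specific options. GENERAL_OPTS is a hypothetical variable name.
set -eu

GENERAL_OPTS="-Dhbase.client.scanner.caching=100 -Dmapred.map.tasks.speculative.execution=false"

CMD="hbase org.apache.hadoop.hbase.mapreduce.CopyTable ${GENERAL_OPTS} --new.name=tableCopy srcTable"

# Print the command instead of running it.
echo "$CMD"
```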
Examples:
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1478448000000 --endtime=1478591994506 --peer.adr=192.168.0.113,192.168.0.114,192.168.0.115:2181:/hbase --families=txjl --new.name=hy_membercontacts_bk hy_membercontacts
# Back up by time range
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1478448000000 --endtime=1478591994506 --new.name=hy_membercontacts_bk hy_membercontacts
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1477929600000 --endtime=1478591994506 --new.name=hy_linkman_tmp hy_linkman
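The start times used above are epoch milliseconds; 1478448000000, for instance, is 2016-11-07 00:00:00 in UTC+8. The following sketch converts a wall-clock time into that form. It assumes GNU date (the -d option is not available in BSD or macOS date), and the helper name to_epoch_ms and the end date are made up for illustration.

```shell
#!/bin/sh
# Sketch: convert wall-clock times to epoch milliseconds for
# --starttime/--endtime. Assumes GNU date; to_epoch_ms is a hypothetical helper.
set -eu

to_epoch_ms() {
  # date prints whole seconds; appending "000" turns them into milliseconds
  printf '%s000\n' "$(date -d "$1" +%s)"
}

START_MS=$(to_epoch_ms '2016-11-07 00:00:00 +0800')  # same start as the example above
END_MS=$(to_epoch_ms '2016-11-09 00:00:00 +0800')    # illustrative end time

# Print the command instead of running it.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=${START_MS} --endtime=${END_MS} --new.name=hy_membercontacts_bk hy_membercontacts"
```

Writing the UTC offset into the date string keeps the result independent of the time zone of the machine that runs the script.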
# Back up the whole table
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=hy_mobileblacklist_bk_before_del hy_mobileblacklist
# Extra: querying by time range (run inside the HBase shell)
scan 'hy_linkman', {COLUMNS => 'lxr:sguid', TIMERANGE => [1478966400000, 1479052799000]}
scan 'hy_mobileblacklist', {COLUMNS => 'mobhmd:sguid', TIMERANGE => [1468719824000, 1468809824000]}
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=hy_mobileblacklist_bk_before_del_20161228 hy_mobileblacklist
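The last command puts a date suffix on the backup table name. If backups like this are taken regularly, the suffix could be generated instead of typed; the sketch below is one possible way to do it. The variable names are assumptions, and the target table still has to be created by hand before CopyTable runs.

```shell
#!/bin/sh
# Dry-run sketch: build a date-suffixed backup table name.
# SRC_TABLE, STAMP and BACKUP_TABLE are hypothetical variable names.
set -eu

SRC_TABLE="hy_mobileblacklist"
STAMP=$(date +%Y%m%d)                               # today's date as YYYYMMDD
BACKUP_TABLE="${SRC_TABLE}_bk_before_del_${STAMP}"

# Reminder: the target table must exist before the copy starts.
echo "Create table '${BACKUP_TABLE}' first, then run:"
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=${BACKUP_TABLE} ${SRC_TABLE}"
```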
