hudi 0.9.0适配hbase 2.2.6

总览

hudi中,hbase可以作为索引数据的存储,hudi默认使用的hbase版本为1.2.3。

在hbase从1.x升级到2.x之后,其api发生了较大的变化,直接修改hudi中hbase的版本是不合适的,即会发生编译错误。

本文对部分源码进行修改以使hbase 2.2.6适配hudi 0.9.0

编译报错

如果我们直接修改hbase的版本为2.2.6的话,会出现如下编译错误:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project hudi-common: Compilation failure: Compilation failure: 
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[181,34] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HFilePathForReader,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[309,93] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[534,51] incompatible types: org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HoodieKVComparator cannot be converted to org.apache.hadoop.hbase.CellComparator
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[537,51] incompatible types: org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HoodieKVComparator cannot be converted to org.apache.hadoop.hbase.CellComparator
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[72,24] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[80,24] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,int,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[114,56] incompatible types: org.apache.hadoop.hbase.io.hfile.HFileBlock cannot be converted to java.nio.ByteBuffer
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[149,27] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[180,54] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[200,50] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[224,28] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable keyScanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hudi-common

针对上述问题,我们发现主要的兼容性问题有4个:

  1. HFileBootstrapIndex#createReader方法参数问题
  2. HFileScanner#getKeyValue方法在hbase 2.2.6中已经不存在了
  3. HFile#withComparator方法的入参为CellComparator类型,而源码中的则是HoodieKVComparator
  4. HFileReaderImpl#getMetaBlock方法返回参数的变化由ByteBuffer变为HFileBlock

hudi源码修改

针对上述问题,我们进行如下修改:

  1. 查看Hbase源码,我们发现HFileBootstrapIndex#createReader增加了一个类型为boolean的参数primaryReplicaReader,该参数说明如下:

    /**
    * Creates reader with cache configuration disabled
    * @param fs filesystem
    * @param path Path to file to read
    * @return an active Reader instance
    * @throws IOException Will throw a CorruptHFileException
    * (DoNotRetryIOException subtype) if hfile is corrupt/invalid.
    */
    public static Reader createReader(FileSystem fs, Path path, Configuration conf)
      throws IOException {
    // The primaryReplicaReader is mainly used for constructing block cache key, so if we do not use
    // block cache then it is OK to set it as any value. We use true here.
    return createReader(fs, path, CacheConfig.DISABLED, true, conf);
    }

    所以我们可以对HFileBootstrapIndex类的182-183行进行如下修改从而解决这个问题(增加了一个true参数):

      HFile.Reader reader = HFile.createReader(fileSystem, new HFilePathForReader(hFilePath),
          new CacheConfig(conf), true, conf);

    (在编译中可能会遇到其他相同的createReader参数问题,用上述方法进行修改即可)

  2. HFileScanner#getKeyValue方法在hbase升级之后已经替换为HFileScanner#getCell,具体可参考HFile的提交历史.

所以我们可以对HFileBootstrapIndex类的309行修改为:

          keys.add(converter.apply(getUserKeyFromCellKey(CellUtil.getCellKeyAsString(scanner.getCell()))));

(在编译中可能会遇到其他相同的getKeyValue方法问题,用上述方法进行修改即可)

  1. 在hbase升级之后,我们可以看到HFile#withComparator需要的参数为CellComparator:

hudi 0.9.0适配hbase 2.2.6

所以我们可以通过修改HFileBootstrapIndex的585-586行,使HoodieKVComparator继承CellComparatorImpl

  public static class HoodieKVComparator extends CellComparatorImpl {
  }

(在编译中可能会遇到其他相同问题,用上述方法进行修改即可)

  1. HFileReaderImpl#getMetaBlock方法返回参数具体问题如下:

hudi 0.9.0适配hbase 2.2.6

我们再来看一下这个方法的变更历史:
hudi 0.9.0适配hbase 2.2.6
hudi 0.9.0适配hbase 2.2.6

由此我们知道可以通过修改HoodieHFileReader的115行,改为如下:

      ByteBuff serializedFilter = reader.getMetaBlock(KEY_BLOOM_FILTER_META_BLOCK, false).getBufferWithoutHeader();

(在编译中可能会遇到相同问题,用上述方法进行修改即可)

其他问题

由于hudi 0.9.0的jetty版本和hbase 2.2.6的jetty版本存在冲突,所以我们需要排除掉hbase的jetty,使用hudi要求的jetty版本。
这个问题可能在编译阶段不会报错,但是在运行阶段是会报错的。

那么解决了上述问题之后,就可以让hudi使用hbase 2.2.6啦!

0 0 投票数
文章评分

本文为从大数据到人工智能博主「xiaozhch5」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。

原文链接:https://lrting.top/backend/2041/

(0)
上一篇 2021-11-12 19:49
下一篇 2021-11-12 20:07

相关推荐

订阅评论
提醒
guest

0 评论
内联反馈
查看所有评论
0
希望看到您的想法,请您发表评论x