tonglin0325's personal homepage

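Converting protobuf to Avro with avro-protobuf

The avro-protobuf project provides the ProtobufDatumReader class, which can derive an Avro schema from a Java class generated from a protobuf definition.

Usage is as follows:

1. Add the dependencies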

<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-protobuf</artifactId>
    <version>1.11.1</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.21.7</version>
</dependency>

2. Define the protobuf schema in a file named other.proto, with the following content:

syntax = "proto3";
package com.acme;

message MyRecord {
    string f1 = 1;
    OtherRecord f2 = 2;
}

message OtherRecord {
    int32 other_id = 1;
}

Generate the Java classes from the protobuf definition:

protoc -I=./ --java_out=./src/main/java ./src/main/proto/other.proto
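
The Java code in the next step imports com.acme.Other, and the Avro namespace in the output is com.acme.Other. That name comes from protoc: when the java_outer_classname option is not set, the messages are nested in an outer class named after the .proto file. The following is a simplified sketch of that naming rule, not protoc's actual implementation. The class and method names are hypothetical, and it only handles underscores in the file name.

```java
import java.util.Arrays;
import java.util.Collection;

// Simplified sketch (hypothetical helper, not part of protoc or protobuf-java).
final class OuterClassNameSketch {

    // Approximates the default outer class name for a .proto file:
    // strip the extension, camel-case the underscore-separated parts, and
    // append "OuterClass" if the result clashes with a top-level type name.
    static String defaultOuterClassName(String protoFileName, Collection<String> topLevelTypeNames) {
        String base = protoFileName.endsWith(".proto")
                ? protoFileName.substring(0, protoFileName.length() - ".proto".length())
                : protoFileName;
        StringBuilder name = new StringBuilder();
        for (String part : base.split("_")) {
            if (part.isEmpty()) {
                continue; // skip empty parts produced by leading or doubled underscores
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        String candidate = name.toString();
        return topLevelTypeNames.contains(candidate) ? candidate + "OuterClass" : candidate;
    }

    public static void main(String[] args) {
        // other.proto declares MyRecord and OtherRecord, so there is no clash.
        System.out.println(defaultOuterClassName("other.proto", Arrays.asList("MyRecord", "OtherRecord")));
    }
}
```

For other.proto this yields Other, so the generated message classes are com.acme.Other.MyRecord and com.acme.Other.OtherRecord.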

3. Write the Java code

package com.example.demo;

import com.acme.Other;
import org.apache.avro.protobuf.ProtobufDatumReader;

public class Demo {

    public static void main(String[] args) {
        // Build a reader for the protoc-generated class; its schema is derived from the protobuf descriptor.
        ProtobufDatumReader<Other.MyRecord> datumReader = new ProtobufDatumReader<>(Other.MyRecord.class);
        // Pretty-print the derived Avro schema.
        System.out.println(datumReader.getSchema().toString(true));
    }
}

The output is as follows:

{
  "type" : "record",
  "name" : "MyRecord",
  "namespace" : "com.acme.Other",
  "fields" : [ {
    "name" : "f1",
    "type" : {
      "type" : "string",
      "avro.java.string" : "String"
    },
    "default" : ""
  }, {
    "name" : "f2",
    "type" : [ "null", {
      "type" : "record",
      "name" : "OtherRecord",
      "fields" : [ {
        "name" : "other_id",
        "type" : "int",
        "default" : 0
      } ]
    } ],
    "default" : null
  } ]
}

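Note: the conversion from a protobuf schema to an Avro schema is not always exact. For example, a protobuf uint32 (0 to 2^32 - 1) is always converted to an Avro int (-2^31 to 2^31 - 1), so values above 2^31 - 1 do not fit in the target type, which may cause problems. One workaround is to use the tooling provided by Confluent Schema Registry; see: Converting a protobuf schema to an Avro schema with Confluent Schema Registry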
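
To make the range mismatch concrete, here is a minimal plain-Java sketch that does not use avro-protobuf. It shows what happens to a uint32 value once it is held in a signed 32-bit int, which is the range of an Avro int. The class and method names are hypothetical and chosen only for illustration.

```java
// Minimal sketch (hypothetical helper): a uint32 value carried in a signed 32-bit int.
final class Uint32AsAvroInt {

    // Narrow an unsigned 32-bit value (held in a long) to a signed 32-bit int.
    // The low 32 bits are kept, so values above 2^31 - 1 come out negative.
    static int toSignedInt(long uint32Value) {
        if (uint32Value < 0 || uint32Value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("not a uint32 value: " + uint32Value);
        }
        return (int) uint32Value;
    }

    // Reinterpret the signed bits as unsigned to recover the original value.
    static long toUnsignedLong(int signedValue) {
        return Integer.toUnsignedLong(signedValue);
    }

    public static void main(String[] args) {
        long original = 3000000000L;                // valid uint32, larger than 2^31 - 1
        int stored = toSignedInt(original);
        System.out.println(stored);                 // prints -1294967296
        System.out.println(toUnsignedLong(stored)); // prints 3000000000
    }
}
```

A consumer that reads the Avro int without reinterpreting it as unsigned would see the negative number, which is the kind of problem the note above warns about.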