Parquet schema parser.
Provides methods to parse and validate a message type string into a Parquet Type.
Example
use parquet::schema::parser::parse_message_type;
let message_type = "
message spark_schema {
OPTIONAL BYTE_ARRAY a (UTF8);
REQUIRED INT32 b;
REQUIRED DOUBLE c;
REQUIRED BOOLEAN d;
OPTIONAL group e (LIST) {
REPEATED group list {
REQUIRED INT32 element;
}
}
}
";
let schema = parse_message_type(message_type).expect("Expected valid schema");
println!("{:?}", schema);
Structs
- Parser 🔒 Internal schema parser. Traverses the message type using the tokenizer and parses each group or primitive type recursively.
- Tokenizer 🔒 Tokenizer that splits a message type string into tokens separated by the characters defined in the is_schema_delim method. The tokenizer also preserves delimiters as tokens. It provides an Iterator interface for processing tokens and also allows stepping back to reprocess previous tokens.
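The tokenizer behaviour described above can be illustrated with a small standalone sketch. This is not the crate's private Tokenizer: the type `SketchTokenizer`, the method name `backtrack`, and the delimiter set in `is_delim` are all assumptions made for illustration, and the real is_schema_delim may accept a different set of characters. The sketch only shows the three properties named in the description: splitting on delimiters and whitespace, keeping delimiters as tokens, and stepping back.

```rust
/// Assumed delimiter set (hypothetical); the crate's `is_schema_delim` may differ.
fn is_delim(c: char) -> bool {
    matches!(c, ';' | '{' | '}' | '(' | ')' | '=' | ',')
}

/// Hypothetical stand-in for the private `Tokenizer`, for illustration only.
struct SketchTokenizer<'a> {
    tokens: Vec<&'a str>,
    index: usize,
}

impl<'a> SketchTokenizer<'a> {
    fn new(input: &'a str) -> Self {
        let mut tokens = Vec::new();
        // Byte offset where the current non-delimiter token started, if any.
        let mut start: Option<usize> = None;
        for (i, c) in input.char_indices() {
            if c.is_whitespace() || is_delim(c) {
                // Whitespace or a delimiter ends the token in progress.
                if let Some(s) = start.take() {
                    tokens.push(&input[s..i]);
                }
                // Delimiters are preserved as tokens; whitespace is dropped.
                if is_delim(c) {
                    tokens.push(&input[i..i + c.len_utf8()]);
                }
            } else if start.is_none() {
                start = Some(i);
            }
        }
        // Flush a trailing token that was not followed by a delimiter.
        if let Some(s) = start {
            tokens.push(&input[s..]);
        }
        SketchTokenizer { tokens, index: 0 }
    }

    /// Step back one token so the next call to `next` yields it again.
    fn backtrack(&mut self) {
        self.index = self.index.saturating_sub(1);
    }
}

impl<'a> Iterator for SketchTokenizer<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        let token = self.tokens.get(self.index).copied();
        if token.is_some() {
            self.index += 1;
        }
        token
    }
}

fn main() {
    let mut tokens = SketchTokenizer::new("message m { REQUIRED INT32 b; }");
    assert_eq!(tokens.next(), Some("message"));
    assert_eq!(tokens.next(), Some("m"));
    // Step back and reprocess the previous token.
    tokens.backtrack();
    assert_eq!(tokens.next(), Some("m"));
    let rest: Vec<&str> = tokens.collect();
    println!("{:?}", rest);
}
```

Tokenizing eagerly into a vector keeps stepping back trivial (it is just an index decrement), at the cost of holding all tokens in memory, which is usually acceptable for schema strings of this size.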
Functions
- assert_token 🔒
- parse_bool 🔒
- parse_i32 🔒
- parse_message_type Parses a message type string into a Parquet Type which, for example, could be used to extract individual columns. Returns a Parquet general error when parsing or validation fails.
- parse_timeunit 🔒