Transfer data between the Arrow memory format and JSON line-delimited records.
See the module level documentation for the
reader and writer for usage examples.
§Binary Data uses Base16 Encoding
As per RFC 7159, JSON cannot encode arbitrary binary data. This crate works around that
limitation by encoding and decoding binary data as a hexadecimal string (i.e.,
Base16 encoding).
Note that Base16 has only 50% space efficiency (i.e., the encoded data is twice as large
as the original). If that is an issue, we recommend converting binary data to and from a
more compact encoding such as Base64 instead. See the Base64 Encoding Example below for details.
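To make the size trade-off concrete, here is a minimal sketch that uses only the standard library. The helpers `to_hex` and `base64_len` are hypothetical names introduced for illustration and are not part of this crate; the lowercase digits are an arbitrary choice for the sketch, and `base64_len` assumes the padded standard Base64 alphabet.

```rust
/// Hypothetical helper (not part of this crate): encode bytes as a
/// lowercase hexadecimal (Base16) string, two characters per input byte.
fn to_hex(bytes: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(bytes.len() * 2);
    for &b in bytes {
        out.push(DIGITS[(b >> 4) as usize] as char); // high nibble
        out.push(DIGITS[(b & 0x0f) as usize] as char); // low nibble
    }
    out
}

/// Hypothetical helper: length of padded Base64 output for `n` input bytes.
/// Every group of up to 3 input bytes becomes 4 output characters.
fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}

fn main() {
    let data = b"\xDE\x00\xFF";
    let hex = to_hex(data);

    // Base16 doubles the size: 3 bytes become 6 characters.
    assert_eq!(hex, "de00ff");
    assert_eq!(hex.len(), 2 * data.len());

    // Padded Base64 needs only 4 characters for the same 3 bytes.
    assert_eq!(base64_len(data.len()), 4);

    // For a larger payload the ratio is the same: 2x versus roughly 4/3x.
    assert_eq!(2 * 3000, 6000);
    assert_eq!(base64_len(3000), 4000);
}
```

For large binary columns, the difference between doubling the payload and growing it by about a third can be significant, which is the motivation for the Base64 approach.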
§Base64 Encoding Example
Base64 is a common binary-to-text encoding scheme with a space efficiency of 75%. The
following example shows how to use the [arrow_cast] crate to encode binary data to Base64
before converting it to JSON and how to decode it back.
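use std::io::Cursor;
use std::sync::Arc;
use arrow_array::{BinaryArray, RecordBatch, StringArray};
use arrow_array::cast::AsArray;
use arrow_cast::base64::{b64_decode, b64_encode, BASE64_STANDARD};
use arrow_json::{LineDelimitedWriter, ReaderBuilder};
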
// The data we want to write
let input = BinaryArray::from(vec![b"\xDE\x00\xFF".as_ref()]);
// Base64 encode it to a string
let encoded: StringArray = b64_encode(&BASE64_STANDARD, &input);
// Write the StringArray to JSON
let batch = RecordBatch::try_from_iter([("col", Arc::new(encoded) as _)]).unwrap();
let mut buf = Vec::with_capacity(1024);
let mut writer = LineDelimitedWriter::new(&mut buf);
writer.write(&batch).unwrap();
writer.finish().unwrap();
// Read the JSON data
let cursor = Cursor::new(buf);
let mut reader = ReaderBuilder::new(batch.schema()).build(cursor).unwrap();
let batch = reader.next().unwrap().unwrap();
// Reverse the base64 encoding
let col: BinaryArray = batch.column(0).as_string::<i32>().clone().into();
let output = b64_decode(&BASE64_STANDARD, &col).unwrap();
assert_eq!(input, output);
Re-exports§
pub use self::reader::Reader;
pub use self::reader::ReaderBuilder;
pub use self::writer::ArrayWriter;
pub use self::writer::Encoder;
pub use self::writer::EncoderFactory;
pub use self::writer::EncoderOptions;
pub use self::writer::LineDelimitedWriter;
pub use self::writer::Writer;
pub use self::writer::WriterBuilder;
Modules§
Macros§
Enums§
- StructMode - Specifies what is considered valid JSON when reading or writing RecordBatches or StructArrays.
Traits§
- JsonSerializable - Trait declaring any type that is serializable to JSON. This includes all primitive types (bool, i32, etc.).