GEnum
╰── GArrowJSONReadUnexpectedFieldBehavior
GObject
├── GArrowCSVReadOptions
├── GArrowCSVReader
├── GArrowFeatherFileReader
├── GArrowJSONReadOptions
├── GArrowJSONReader
├── GArrowRecordBatchFileReader
╰── GArrowRecordBatchReader
    ├── GArrowRecordBatchStreamReader
    ╰── GArrowTableBatchReader
GArrowRecordBatchReader is a base class for reading record batches in stream format from input.
GArrowRecordBatchStreamReader is a class for reading record batches in stream format from input synchronously.
GArrowRecordBatchFileReader is a class for reading record batches in file format from input.
GArrowFeatherFileReader is a class for reading columns in Feather file format from input.
GArrowCSVReader is a class for reading a table in CSV format from input.
GArrowJSONReader is a class for reading a table in JSON format from input.
GArrowRecordBatchReader * garrow_record_batch_reader_import (gpointer c_abi_array_stream
,GError **error
);
An imported GArrowRecordBatchReader on success, NULL on error.
You don't need to release the passed struct ArrowArrayStream *, even if this function reports an error.
[transfer full][nullable]
Since: 6.0.0
GArrowRecordBatchReader * garrow_record_batch_reader_new (GList *record_batches
,GArrowSchema *schema
,GError **error
);
record_batches | A list of GArrowRecordBatch. | [element-type GArrowRecordBatch]
schema | A GArrowSchema to conform to. | [nullable]
error | Return location for a GError or NULL. | [nullable]
Since: 6.0.0
gpointer garrow_record_batch_reader_export (GArrowRecordBatchReader *reader
,GError **error
);
An exported GArrowRecordBatchReader as struct ArrowArrayStream * on success, NULL on error.
It should be freed with the ArrowArrayStream::release callback then g_free() when no longer needed.
[transfer full][nullable]
Since: 6.0.0
GArrowSchema *
garrow_record_batch_reader_get_schema (GArrowRecordBatchReader *reader
);
Since: 0.4.0
GArrowRecordBatch * garrow_record_batch_reader_get_next_record_batch (GArrowRecordBatchReader *reader
,GError **error
);
garrow_record_batch_reader_get_next_record_batch has been deprecated since version 0.5.0 and should not be used in newly-written code.
Use garrow_record_batch_reader_read_next() instead.
Since: 0.4.0
GArrowRecordBatch * garrow_record_batch_reader_read_next_record_batch (GArrowRecordBatchReader *reader
,GError **error
);
garrow_record_batch_reader_read_next_record_batch has been deprecated since version 0.8.0 and should not be used in newly-written code.
Use garrow_record_batch_reader_read_next() instead.
Since: 0.5.0
GArrowRecordBatch * garrow_record_batch_reader_read_next (GArrowRecordBatchReader *reader
,GError **error
);
Since: 0.8.0
GArrowTable * garrow_record_batch_reader_read_all (GArrowRecordBatchReader *reader
,GError **error
);
Since: 6.0.0
GList *
garrow_record_batch_reader_get_sources
(GArrowRecordBatchReader *reader
);
Since: 13.0.0
GArrowTableBatchReader *
garrow_table_batch_reader_new (GArrowTable *table
);
Since: 0.8.0
void garrow_table_batch_reader_set_max_chunk_size (GArrowTableBatchReader *reader
,gint64 max_chunk_size
);
Set the desired maximum chunk size of record batches.
The actual chunk size of each record batch may be smaller, depending on actual chunking characteristics of each table column.
Since: 12.0.0
GArrowRecordBatchStreamReader * garrow_record_batch_stream_reader_new (GArrowInputStream *stream
,GError **error
);
Since: 0.4.0
GArrowRecordBatchFileReader * garrow_record_batch_file_reader_new (GArrowSeekableInputStream *file
,GError **error
);
Since: 0.4.0
GArrowSchema *
garrow_record_batch_file_reader_get_schema
(GArrowRecordBatchFileReader *reader
);
Since: 0.4.0
guint
garrow_record_batch_file_reader_get_n_record_batches
(GArrowRecordBatchFileReader *reader
);
Since: 0.4.0
GArrowMetadataVersion
garrow_record_batch_file_reader_get_version
(GArrowRecordBatchFileReader *reader
);
Since: 0.4.0
GArrowRecordBatch * garrow_record_batch_file_reader_get_record_batch (GArrowRecordBatchFileReader *reader
,guint i
,GError **error
);
garrow_record_batch_file_reader_get_record_batch has been deprecated since version 0.5.0 and should not be used in newly-written code.
Use garrow_record_batch_file_reader_read_record_batch() instead.
Since: 0.4.0
GArrowRecordBatch * garrow_record_batch_file_reader_read_record_batch (GArrowRecordBatchFileReader *reader
,guint i
,GError **error
);
Since: 0.5.0
GArrowFeatherFileReader * garrow_feather_file_reader_new (GArrowSeekableInputStream *file
,GError **error
);
Since: 0.4.0
gint
garrow_feather_file_reader_get_version
(GArrowFeatherFileReader *reader
);
Since: 0.4.0
GArrowTable * garrow_feather_file_reader_read (GArrowFeatherFileReader *reader
,GError **error
);
Since: 0.12.0
GArrowTable * garrow_feather_file_reader_read_indices (GArrowFeatherFileReader *reader
,const gint *indices
,guint n_indices
,GError **error
);
reader | A GArrowFeatherFileReader. |
indices | The indices of the columns to be read. | [array length=n_indices]
n_indices | The number of indices. |
error | Return location for a GError or NULL. | [nullable]
Since: 0.12.0
GArrowTable * garrow_feather_file_reader_read_names (GArrowFeatherFileReader *reader
,const gchar **names
,guint n_names
,GError **error
);
reader | A GArrowFeatherFileReader. |
names | The names of the columns to be read. | [array length=n_names]
n_names | The number of names. |
error | Return location for a GError or NULL. | [nullable]
Since: 0.12.0
GArrowCSVReadOptions *
garrow_csv_read_options_new (void
);
Since: 0.12.0
void garrow_csv_read_options_add_column_type (GArrowCSVReadOptions *options
,const gchar *name
,GArrowDataType *data_type
);
Add value type of a column.
Since: 0.12.0
void garrow_csv_read_options_add_schema (GArrowCSVReadOptions *options
,GArrowSchema *schema
);
Add value types for columns in the schema.
Since: 0.12.0
GHashTable *
garrow_csv_read_options_get_column_types
(GArrowCSVReadOptions *options
);
The column name and value type mapping of the options.
[transfer full][element-type gchar* GArrowDataType]
Since: 0.12.0
void garrow_csv_read_options_set_null_values (GArrowCSVReadOptions *options
,const gchar **null_values
,gsize n_null_values
);
options | A GArrowCSVReadOptions. |
null_values | The values to be processed as null. | [array length=n_null_values]
n_null_values | The number of the specified null values. |
Since: 0.14.0
gchar **
garrow_csv_read_options_get_null_values
(GArrowCSVReadOptions *options
);
The values to be processed as null.
If the number of values is zero, this returns NULL.
It's a NULL-terminated string array. It must be freed with g_strfreev() when no longer needed.
[nullable][array zero-terminated=1][element-type utf8][transfer full]
Since: 0.14.0
void garrow_csv_read_options_add_null_value (GArrowCSVReadOptions *options
,const gchar *null_value
);
Since: 0.14.0
void garrow_csv_read_options_set_true_values (GArrowCSVReadOptions *options
,const gchar **true_values
,gsize n_true_values
);
options | A GArrowCSVReadOptions. |
true_values | The values to be processed as true. | [array length=n_true_values]
n_true_values | The number of the specified true values. |
Since: 0.14.0
gchar **
garrow_csv_read_options_get_true_values
(GArrowCSVReadOptions *options
);
The values to be processed as true.
If the number of values is zero, this returns NULL.
It's a NULL-terminated string array. It must be freed with g_strfreev() when no longer needed.
[nullable][array zero-terminated=1][element-type utf8][transfer full]
Since: 0.14.0
void garrow_csv_read_options_add_true_value (GArrowCSVReadOptions *options
,const gchar *true_value
);
Since: 0.14.0
void garrow_csv_read_options_set_false_values (GArrowCSVReadOptions *options
,const gchar **false_values
,gsize n_false_values
);
options | A GArrowCSVReadOptions. |
false_values | The values to be processed as false. | [array length=n_false_values]
n_false_values | The number of the specified false values. |
Since: 0.14.0
gchar **
garrow_csv_read_options_get_false_values
(GArrowCSVReadOptions *options
);
The values to be processed as false.
If the number of values is zero, this returns NULL.
It's a NULL-terminated string array. It must be freed with g_strfreev() when no longer needed.
[nullable][array zero-terminated=1][element-type utf8][transfer full]
Since: 0.14.0
void garrow_csv_read_options_add_false_value (GArrowCSVReadOptions *options
,const gchar *false_value
);
Since: 0.14.0
void garrow_csv_read_options_set_column_names (GArrowCSVReadOptions *options
,const gchar **column_names
,gsize n_column_names
);
options | A GArrowCSVReadOptions. |
column_names | The column names (if empty, they are read from the first row after “n-skip-rows”). | [array length=n_column_names]
n_column_names | The number of the specified column names. |
Since: 0.15.0
gchar **
garrow_csv_read_options_get_column_names
(GArrowCSVReadOptions *options
);
The column names.
If the number of values is zero, this returns NULL.
It's a NULL-terminated string array. It must be freed with g_strfreev() when no longer needed.
[nullable][array zero-terminated=1][element-type utf8][transfer full]
Since: 0.15.0
void garrow_csv_read_options_add_column_name (GArrowCSVReadOptions *options
,const gchar *column_name
);
GArrowCSVReader * garrow_csv_reader_new (GArrowInputStream *input
,GArrowCSVReadOptions *options
,GError **error
);
Since: 0.12.0
GArrowTable * garrow_csv_reader_read (GArrowCSVReader *reader
,GError **error
);
Since: 0.12.0
GArrowJSONReadOptions *
garrow_json_read_options_new (void
);
Since: 0.14.0
GArrowJSONReader * garrow_json_reader_new (GArrowInputStream *input
,GArrowJSONReadOptions *options
,GError **error
);
Since: 0.14.0
GArrowTable * garrow_json_reader_read (GArrowJSONReader *reader
,GError **error
);
Since: 0.14.0
#define GARROW_TYPE_RECORD_BATCH_READER (garrow_record_batch_reader_get_type())
struct GArrowRecordBatchReaderClass { GObjectClass parent_class; };
#define GARROW_TYPE_TABLE_BATCH_READER (garrow_table_batch_reader_get_type())
struct GArrowTableBatchReaderClass { GArrowRecordBatchReaderClass parent_class; };
struct GArrowRecordBatchStreamReader;
It wraps arrow::ipc::RecordBatchStreamReader.
struct GArrowRecordBatchFileReader;
It wraps arrow::ipc::RecordBatchFileReader.
#define GARROW_TYPE_FEATHER_FILE_READER (garrow_feather_file_reader_get_type())
struct GArrowFeatherFileReaderClass { GObjectClass parent_class; };
#define GARROW_TYPE_CSV_READ_OPTIONS (garrow_csv_read_options_get_type())
The GArrowJSONReadUnexpectedFieldBehavior values correspond to arrow::json::UnexpectedFieldBehavior values.
#define GARROW_TYPE_JSON_READ_OPTIONS (garrow_json_read_options_get_type())
“record-batch-file-reader”
property“record-batch-file-reader” gpointer
The raw std::shared_ptr<arrow::ipc::RecordBatchFileReader> *.
Owner: GArrowRecordBatchFileReader
Flags: Write / Construct Only
“allow-newlines-in-values”
property“allow-newlines-in-values” gboolean
Whether values are allowed to contain CR (0x0d) and LF (0x0a) characters.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: FALSE
Since: 0.12.0
“allow-null-strings”
property“allow-null-strings” gboolean
Whether string / binary columns can have null values.
If TRUE, then strings in "null_values" are considered null for string columns.
If FALSE, then all strings are valid string values.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: FALSE
Since: 0.14.0
“block-size”
property “block-size” int
Block size we request from the IO layer; also determines the size of chunks when “use-threads” is TRUE.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Allowed values: >= 0
Default value: 1048576
Since: 0.12.0
“check-utf8”
property“check-utf8” gboolean
Whether to check UTF8 validity of string columns.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.12.0
“delimiter”
property “delimiter” char
Field delimiter character.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Allowed values: >= 0
Default value: 44
Since: 0.12.0
“escape-character”
property “escape-character” char
Escaping character. This is used only when “is-escaped” is TRUE.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Allowed values: >= 0
Default value: 92
Since: 0.12.0
“generate-column-names”
property“generate-column-names” gboolean
Whether to autogenerate column names if column-names is empty. If TRUE, column names will be of the form 'f0', 'f1'... If FALSE, column names will be read from the first CSV row after n-skip-rows.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: FALSE
“ignore-empty-lines”
property“ignore-empty-lines” gboolean
Whether empty lines are ignored. If FALSE, an empty line represents a simple empty value (assuming a one-column CSV file).
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.12.0
“is-double-quoted”
property“is-double-quoted” gboolean
Whether a quote inside a value is double quoted.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.12.0
“is-escaped”
property“is-escaped” gboolean
Whether escaping is used.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: FALSE
Since: 0.12.0
“is-quoted”
property“is-quoted” gboolean
Whether quoting is used.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.12.0
“n-skip-rows”
property“n-skip-rows” guint
The number of header rows to skip (not including the row of column names, if any).
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: 0
Since: 0.15.0
“quote-character”
property “quote-character” char
Quoting character. This is used only when “is-quoted” is TRUE.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Allowed values: >= 0
Default value: 34
Since: 0.12.0
“use-threads”
property“use-threads” gboolean
Whether to use the global CPU thread pool.
Owner: GArrowCSVReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.12.0
“csv-table-reader”
property“csv-table-reader” gpointer
The raw std::shared_ptr<arrow::csv::TableReader> *.
Owner: GArrowCSVReader
Flags: Write / Construct Only
“input”
property“input” GArrowInputStream *
The input stream to be read.
Owner: GArrowCSVReader
Flags: Read / Write / Construct Only
“feather-reader”
property“feather-reader” gpointer
The raw std::shared_ptr<arrow::ipc::feather::Reader> *.
Owner: GArrowFeatherFileReader
Flags: Write / Construct Only
“allow-newlines-in-values”
property“allow-newlines-in-values” gboolean
Whether objects may be printed across multiple lines (for example, pretty-printed).
If FALSE, input must end with an empty line.
Owner: GArrowJSONReadOptions
Flags: Read / Write
Default value: FALSE
Since: 0.14.0
“block-size”
property “block-size” int
Block size we request from the IO layer; also determines the size of chunks when “use-threads” is TRUE.
Owner: GArrowJSONReadOptions
Flags: Read / Write
Allowed values: >= 0
Default value: 1048576
Since: 0.14.0
“schema”
property“schema” GArrowSchema *
Schema for passing custom conversion rules.
Owner: GArrowJSONReadOptions
Flags: Read / Write
Since: 0.14.0
“unexpected-field-behavior”
property“unexpected-field-behavior” GArrowJSONReadUnexpectedFieldBehavior
How to handle fields outside the explicit schema.
Owner: GArrowJSONReadOptions
Flags: Read / Write
Default value: GARROW_JSON_READ_INFER_TYPE
Since: 0.14.0
“use-threads”
property“use-threads” gboolean
Whether to use the global CPU thread pool.
Owner: GArrowJSONReadOptions
Flags: Read / Write
Default value: TRUE
Since: 0.14.0
“input”
property“input” GArrowInputStream *
The input stream to be read.
Owner: GArrowJSONReader
Flags: Read / Write / Construct Only
“json-table-reader”
property“json-table-reader” gpointer
The raw std::shared_ptr<arrow::json::TableReader> *.
Owner: GArrowJSONReader
Flags: Write / Construct Only
“record-batch-reader”
property“record-batch-reader” gpointer
The raw std::shared_ptr<arrow::RecordBatchReader> *.
Owner: GArrowRecordBatchReader
Flags: Write / Construct Only
“sources”
property“sources” gpointer
The sources of this reader.
Owner: GArrowRecordBatchReader
Flags: Write / Construct Only