CSV Convert Options
Usage
csv_convert_options(
check_utf8 = TRUE,
null_values = c("", "NA"),
true_values = c("T", "true", "TRUE"),
false_values = c("F", "false", "FALSE"),
strings_can_be_null = FALSE,
col_types = NULL,
auto_dict_encode = FALSE,
auto_dict_max_cardinality = 50L,
include_columns = character(),
include_missing_columns = FALSE,
timestamp_parsers = NULL,
decimal_point = "."
)Arguments
- check_utf8
Logical: check UTF8 validity of string columns?
- null_values
Character vector of recognized spellings for null values. Analogous to the
na.stringsargument toread.csv()ornainreadr::read_csv().- true_values
Character vector of recognized spellings for
TRUEvalues- false_values
Character vector of recognized spellings for
FALSEvalues- strings_can_be_null
Logical: can string / binary columns have null values? Similar to the
quoted_naargument toreadr::read_csv()- col_types
A
SchemaorNULLto infer types- auto_dict_encode
Logical: Whether to try to automatically dictionary-encode string / binary data (think
stringsAsFactors). This setting is ignored for non-inferred columns (those incol_types).- auto_dict_max_cardinality
If
auto_dict_encode, string/binary columns are dictionary-encoded up to this number of unique values (default 50), after which it switches to regular encoding.- include_columns
If non-empty, indicates the names of columns from the CSV file that should be actually read and converted (in the vector's order).
- include_missing_columns
Logical: if
include_columnsis provided, should columns named in it but not found in the data be included as a column of typenull()? The default (FALSE) means that the reader will instead raise an error.- timestamp_parsers
User-defined timestamp parsers. If more than one parser is specified, the CSV conversion logic will try parsing values starting from the beginning of this vector. Possible values are (a)
NULL, the default, which uses the ISO-8601 parser; (b) a character vector of strptime parse strings; or (c) a list of TimestampParser objects.- decimal_point
Character to use for decimal point in floating point numbers.
Examples
tf <- tempfile()
on.exit(unlink(tf))
writeLines("x\n1\nNULL\n2\nNA", tf)
read_csv_arrow(tf, convert_options = csv_convert_options(null_values = c("", "NA", "NULL")))
#> # A tibble: 4 x 1
#> x
#> <int>
#> 1 1
#> 2 NA
#> 3 2
#> 4 NA
open_csv_dataset(tf, convert_options = csv_convert_options(null_values = c("", "NA", "NULL")))
#> FileSystemDataset with 1 csv file
#> 1 columns
#> x: int64