java.lang.Object
org.apache.arrow.dataset.file.JniWrapper
JniWrapper for filesystem-based Dataset implementations.
Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static JniWrapper | get() | |
| long | makeFileSystemDatasetFactory(String uri, int fileFormat, String[] serializedFragmentScanOptions) | Create FileSystemDatasetFactory and return its native pointer. |
| long | makeFileSystemDatasetFactoryWithFiles(String[] uris, int fileFormat, String[] serializedFragmentScanOptions) | Create FileSystemDatasetFactory and return its native pointer. |
| void | writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate) | Write the content in an ArrowArrayStream into files. |

Method Details
get
public static JniWrapper get()

makeFileSystemDatasetFactory
public long makeFileSystemDatasetFactory(String uri, int fileFormat, String[] serializedFragmentScanOptions)
Create FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
Parameters:
uri - file URI to read, either a file or a directory.
fileFormat - file format ID.
serializedFragmentScanOptions - serialized FragmentScanOptions.
Returns:
the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.

makeFileSystemDatasetFactoryWithFiles
public long makeFileSystemDatasetFactoryWithFiles(String[] uris, int fileFormat, String[] serializedFragmentScanOptions)
Create FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
Parameters:
uris - list of file URIs to read, each pointing to an individual file.
fileFormat - file format ID.
serializedFragmentScanOptions - serialized FragmentScanOptions.
Returns:
the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.

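The uris parameter expects one URI per file. As a minimal sketch of preparing such an array from local paths, the following uses only the Java standard library; the class name FileUriSketch and the helper toFileUris are hypothetical and not part of Arrow, and the sketch assumes the files live on the local filesystem so that a file: URI is appropriate.

```java
import java.nio.file.Paths;
import java.util.List;

public class FileUriSketch {
    // Hypothetical helper: turn local paths into absolute file: URIs,
    // one entry per individual file.
    public static String[] toFileUris(List<String> localPaths) {
        return localPaths.stream()
                .map(p -> Paths.get(p).toAbsolutePath().normalize().toUri().toString())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Example paths are placeholders; the files need not exist for the conversion.
        String[] uris = toFileUris(List.of("data/part-0.parquet", "data/part-1.parquet"));
        for (String uri : uris) {
            System.out.println(uri);
        }
    }
}
```

The resulting array could then be passed as the uris argument. Remote filesystems would need URIs built differently, which this sketch does not cover.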
writeFromScannerToFile
public void writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate)
Write the content in an ArrowArrayStream into files. This internally depends on the C++ write API: FileSystemDataset::Write.
Parameters:
streamAddress - the ArrowArrayStream address.
fileFormat - target file format (ID).
uri - target file URI.
partitionColumns - columns used to partition output files.
maxPartitions - maximum partitions to be included in written files.
baseNameTemplate - file name template used to make partitions. E.g. "dat_{i}", where i is the current partition ID across all written files.
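To illustrate how a baseNameTemplate such as "dat_{i}" maps to file base names, here is a minimal sketch of the substitution in plain Java. It only mimics the documented "{i}" placeholder; the class BaseNameTemplateSketch and the method expand are hypothetical, and the check that the placeholder is present is an assumption of this sketch rather than documented behavior of the native writer.

```java
public class BaseNameTemplateSketch {
    // Hypothetical helper: replace the "{i}" placeholder with a partition ID.
    public static String expand(String template, int i) {
        if (!template.contains("{i}")) {
            // Assumption of this sketch: a template without "{i}" is rejected.
            throw new IllegalArgumentException("template must contain \"{i}\": " + template);
        }
        return template.replace("{i}", Integer.toString(i));
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(expand("dat_{i}", i));
        }
        // prints:
        // dat_0
        // dat_1
        // dat_2
    }
}
```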