java.lang.Object
org.apache.arrow.dataset.file.JniWrapper
JniWrapper for filesystem-based Dataset implementations.
Method Summary
Modifier and Type | Method | Description
static JniWrapper | get() |
long | makeFileSystemDatasetFactory(String uri, int fileFormat, String[] serializedFragmentScanOptions) | Create FileSystemDatasetFactory and return its native pointer.
long | makeFileSystemDatasetFactoryWithFiles(String[] uris, int fileFormat, String[] serializedFragmentScanOptions) | Create FileSystemDatasetFactory and return its native pointer.
void | writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate) | Write the content in an ArrowArrayStream into files.
Method Details
get
makeFileSystemDatasetFactory
public long makeFileSystemDatasetFactory(String uri, int fileFormat, String[] serializedFragmentScanOptions)
Create a FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
- Parameters:
- uri - file URI to read, either a file or a directory.
- fileFormat - file format ID.
- serializedFragmentScanOptions - serialized FragmentScanOptions.
- Returns:
- the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.
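Because the returned long is a raw native address, code that holds it directly has to make sure it is released exactly once. The sketch below shows one hypothetical way to guard such a value. NativeHandle and its releaser callback are illustrative names and are not part of the Arrow API; the actual release call is assumed to be supplied by the caller.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongConsumer;

/** Hypothetical owner for a raw native address; not an Arrow class. */
final class NativeHandle implements AutoCloseable {
    private final long address;
    private final LongConsumer releaser; // assumed: caller-supplied release routine
    private final AtomicBoolean closed = new AtomicBoolean(false);

    NativeHandle(long address, LongConsumer releaser) {
        if (address == 0L) {
            throw new IllegalArgumentException("null native pointer");
        }
        this.address = address;
        this.releaser = releaser;
    }

    /** Returns the raw address, refusing to hand it out after close(). */
    long address() {
        if (closed.get()) {
            throw new IllegalStateException("handle already closed");
        }
        return address;
    }

    /** Runs the releaser at most once, even if close() is called repeatedly. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            releaser.accept(address);
        }
    }
}
```

Used in a try-with-resources statement, a wrapper like this keeps the address from being released twice or used after release.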
makeFileSystemDatasetFactoryWithFiles
public long makeFileSystemDatasetFactoryWithFiles(String[] uris, int fileFormat, String[] serializedFragmentScanOptions)
Create a FileSystemDatasetFactory and return its native pointer. The pointer points to an intermediate shared_ptr of the factory instance.
- Parameters:
- uris - list of file URIs to read, each pointing to an individual file.
- fileFormat - file format ID.
- serializedFragmentScanOptions - serialized FragmentScanOptions.
- Returns:
- the native pointer of the arrow::dataset::FileSystemDatasetFactory instance.
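The uris argument is an array of URI strings, one per file. As a minimal sketch of preparing such an array from local paths, the helper below uses only the standard library. FileUris and toUris are hypothetical names, and the assumption that "file:" URIs produced by Path.toUri() are the form the native side expects is not confirmed by this page.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of the Arrow API. */
final class FileUris {
    /** Converts each path to an absolute URI string, preserving input order. */
    static String[] toUris(List<Path> files) {
        return files.stream()
                .map(p -> p.toAbsolutePath().toUri().toString())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Example paths are placeholders; they do not need to exist.
        String[] uris = toUris(Arrays.asList(
                Paths.get("/data/part-0.parquet"),
                Paths.get("/data/part-1.parquet")));
        for (String uri : uris) {
            System.out.println(uri);
        }
    }
}
```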
writeFromScannerToFile
public void writeFromScannerToFile(long streamAddress, long fileFormat, String uri, String[] partitionColumns, int maxPartitions, String baseNameTemplate)
Write the content in an ArrowArrayStream into files. Internally, this depends on the C++ write API: FileSystemDataset::Write.
- Parameters:
- streamAddress - the ArrowArrayStream address.
- fileFormat - target file format (ID).
- uri - target file URI.
- partitionColumns - columns used to partition output files.
- maxPartitions - maximum number of partitions to be included in the written files.
- baseNameTemplate - file name template used to make partitions, e.g. "dat_{i}", where i is the current partition ID across all written files.
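To make the baseNameTemplate convention concrete, the sketch below expands the "{i}" placeholder for a given partition ID. It only illustrates the naming pattern described above. BaseNameTemplate and expand are hypothetical names, and the actual substitution is performed by the native writer, whose exact behavior is not shown on this page.

```java
/** Hypothetical illustration of the "{i}" placeholder; not an Arrow class. */
final class BaseNameTemplate {
    /** Replaces every "{i}" in the template with the partition ID. */
    static String expand(String template, int partitionId) {
        return template.replace("{i}", Integer.toString(partitionId));
    }

    public static void main(String[] args) {
        // Show the base names generated for the first three partition IDs.
        for (int i = 0; i < 3; i++) {
            System.out.println(expand("dat_{i}", i));
        }
    }
}
```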