pyarrow.hdfs.connect
- pyarrow.hdfs.connect(host='default', port=0, user=None, kerb_ticket=None, extra_conf=None)
DEPRECATED: Connect to an HDFS cluster.
All parameters are optional and should only be set if the defaults need to be overridden.
Authentication should be automatic if the HDFS cluster uses Kerberos. However, if a username is specified, then the ticket cache will likely be required.
Deprecated since version 2.0: pyarrow.hdfs.connect is deprecated; please use pyarrow.fs.HadoopFileSystem instead.
- Parameters:
  - host: NameNode. Set to "default" for fs.defaultFS from core-site.xml.
  - port: NameNode's port. Set to 0 for default or logical (HA) nodes.
  - user: Username when connecting to HDFS; None implies login user.
  - kerb_ticket: Path to Kerberos ticket cache.
  - extra_conf: dict, default None. Extra key/value pairs for config; will override any hdfs-site.xml properties.
- Returns:
  - filesystem: HadoopFileSystem
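Since this function is deprecated in favor of pyarrow.fs.HadoopFileSystem, the following is a minimal migration sketch, not the library's own implementation. The helper names hadoop_fs_kwargs and connect_hdfs are hypothetical; the sketch assumes that the parameters listed above map onto same-named keyword arguments of pyarrow.fs.HadoopFileSystem, and that options left at None can simply be omitted.

```python
def hadoop_fs_kwargs(host="default", port=0, user=None,
                     kerb_ticket=None, extra_conf=None):
    """Collect connection settings, dropping optional ones left at None.

    Hypothetical helper: mirrors the parameters of the deprecated
    pyarrow.hdfs.connect so they can be forwarded to HadoopFileSystem.
    """
    kwargs = {"host": host, "port": port}
    optional = {"user": user, "kerb_ticket": kerb_ticket, "extra_conf": extra_conf}
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs


def connect_hdfs(**settings):
    """Open a connection through the non-deprecated filesystem API."""
    # Imported lazily so the helper above can be used without pyarrow installed.
    from pyarrow import fs

    return fs.HadoopFileSystem(**hadoop_fs_kwargs(**settings))
```

Calling connect_hdfs() with no arguments would request fs.defaultFS from core-site.xml, while connect_hdfs(user="alice", kerb_ticket="/tmp/krb5cc_1000") would pass an explicit user and ticket cache. Both the user name and the ticket path here are placeholders, and an actual connection requires a reachable cluster and a working libhdfs installation.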
Notes
The first time you call this method, it will take longer than usual due to JNI spin-up time.
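Because the first call pays the JNI start-up cost, one possible pattern is to create the filesystem object once and reuse it. The sketch below illustrates that idea with functools.lru_cache; it is an assumption about usage, not something the PyArrow documentation prescribes, and make_cached_connector is a hypothetical name. The connect argument is any callable that takes a host and a port.

```python
import functools


def make_cached_connector(connect):
    """Wrap a connect(host, port) callable so each (host, port) pair
    is connected only once and later calls return the same object."""

    @functools.lru_cache(maxsize=None)
    def cached(host="default", port=0):
        return connect(host, port)

    return cached
```

With PyArrow installed, passing pyarrow.fs.HadoopFileSystem as the connect callable would give a function whose first call opens the connection and whose later calls with the same arguments return the cached filesystem.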