datafusion.DataFrame
class datafusion.DataFrame
Bases: object
A PyDataFrame represents a logical plan and provides an API for composing statements. Use it to build a plan, then call .collect() to execute that plan and collect the result. Execution of a plan runs natively in Rust on Arrow, in a multi-threaded environment.
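The split between building a plan and executing it can be illustrated with a small pure-Python sketch. This is not the datafusion API: `ToyFrame`, its fields, and its callable-based `filter` and `select` are hypothetical stand-ins invented for illustration, and the real class takes expression objects and runs in Rust. The sketch only shows the idea that each method call records a step and returns a new frame, while `collect()` is the point where work happens.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Step = Callable[[List[Row]], List[Row]]


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a lazy DataFrame; NOT the datafusion API."""

    rows: Tuple[Row, ...]          # source data
    steps: Tuple[Step, ...] = ()   # recorded plan, in order

    def filter(self, predicate: Callable[[Row], bool]) -> "ToyFrame":
        # Record the step and return a new frame; nothing is executed yet.
        step = lambda rows: [r for r in rows if predicate(r)]
        return ToyFrame(self.rows, self.steps + (step,))

    def select(self, *names: str) -> "ToyFrame":
        # Record a projection onto the named columns.
        step = lambda rows: [{n: r[n] for n in names} for r in rows]
        return ToyFrame(self.rows, self.steps + (step,))

    def collect(self) -> List[Row]:
        # Only here is the recorded plan actually run.
        rows = list(self.rows)
        for step in self.steps:
            rows = step(rows)
        return rows


source = ToyFrame(({"a": 1, "b": 4}, {"a": 2, "b": 5}, {"a": 3, "b": 6}))
plan = source.filter(lambda r: r["a"] > 1).select("b")  # builds a plan only
result = plan.collect()                                 # executes the plan
print(result)  # [{'b': 5}, {'b': 6}]
```

Because every call returns a new frame, `source` stays unchanged and can be reused to build other plans.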
__init__()
    Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(): Initialize self.
collect(): Executes the plan, returning a list of `RecordBatch`es. Unless some order is specified in the plan, there is no guarantee of the order of the result.
schema(): Returns the schema from the logical plan
show(): Print the result, 20 lines by default
aggregate()
collect()
    Executes the plan, returning a list of `RecordBatch`es. Unless some order is specified in the plan, there is no guarantee of the order of the result.
filter()
join()
limit()
schema()
    Returns the schema from the logical plan
select()
show()
    Print the result, 20 lines by default
sort()