pyarrow.TableGroupBy
- class pyarrow.TableGroupBy(table, keys)
Bases: object
A grouping of columns in a table on which to perform aggregations.
- Parameters:
- table
pyarrow.Table
Input table to execute the aggregation on.
- keys
str or list[str]
Name of the grouped columns.
Examples
>>> import pyarrow as pa
>>> t = pa.table([
...     pa.array(["a", "a", "b", "b", "c"]),
...     pa.array([1, 2, 3, 4, 5]),
... ], names=["keys", "values"])
Grouping of columns:
>>> pa.TableGroupBy(t,"keys")
<pyarrow.lib.TableGroupBy object at ...>
Perform aggregations:
>>> pa.TableGroupBy(t,"keys").aggregate([("values", "sum")])
pyarrow.Table
values_sum: int64
keys: string
----
values_sum: [[3,7,5]]
keys: [["a","b","c"]]
- __init__(self, table, keys)
Methods
__init__(self, table, keys)
aggregate(self, aggregations): Perform an aggregation over the grouped columns of the table.
- aggregate(self, aggregations)
Perform an aggregation over the grouped columns of the table.
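- Parameters:
- aggregations
list[tuple(str, str)] or list[tuple(str, str, FunctionOptions)]
List of tuples, one per aggregation. Each tuple gives the name of the column to aggregate, the name of the aggregation function, and optionally the options for that function. Pass an empty list to get a single row for each group containing only the keys.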
- Returns:
Table
Results of the aggregation functions.
Examples
>>> import pyarrow as pa
>>> t = pa.table([
...     pa.array(["a", "a", "b", "b", "c"]),
...     pa.array([1, 2, 3, 4, 5]),
... ], names=["keys", "values"])
>>> t.group_by("keys").aggregate([("values", "sum")])
pyarrow.Table
values_sum: int64
keys: string
----
values_sum: [[3,7,5]]
keys: [["a","b","c"]]
>>> t.group_by("keys").aggregate([])
pyarrow.Table
keys: string
----
keys: [["a","b","c"]]