pyarrow.compute.hash_tdigest

pyarrow.compute.hash_tdigest(array, group_id_array, *, memory_pool=None, options=None, q=0.5, delta=100, buffer_size=500, skip_nulls=True, min_count=0)

Calculate approximate quantiles of a numeric array for each group, using the T-Digest algorithm.

By default, the 0.5 quantile (median) is returned. Nulls and NaNs are ignored. An array of nulls is returned if there are no valid data points.

Parameters
  • array (Array-like or scalar-like) – Argument to compute function: the numeric values to aggregate.

  • group_id_array (Array-like or scalar-like) – Argument to compute function: the group ids that assign each value to a group.

  • memory_pool (pyarrow.MemoryPool, optional) – If not passed, will allocate memory from the default memory pool.

  • options (pyarrow.compute.TDigestOptions, optional) – Parameters altering compute function semantics.

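  • q (float or sequence of float, optional) – Probability levels of the quantiles to approximate. All values must be in [0, 1]. Parameter for TDigestOptions constructor. Either options or q can be passed, but not both at the same time.

  • delta (int, optional) – Compression parameter for the T-Digest algorithm. Parameter for TDigestOptions constructor. Either options or delta can be passed, but not both at the same time.

  • buffer_size (int, optional) – Buffer size for the T-Digest algorithm. Parameter for TDigestOptions constructor. Either options or buffer_size can be passed, but not both at the same time.

  • skip_nulls (bool, optional) – Whether to skip (ignore) nulls in the input. If False, any null in the input forces the output to null. Parameter for TDigestOptions constructor. Either options or skip_nulls can be passed, but not both at the same time.

  • min_count (int, optional) – Minimum number of non-null values in the input. If the number of non-null values is below min_count, the output is null. Parameter for TDigestOptions constructor. Either options or min_count can be passed, but not both at the same time.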