pyarrow.compute.extract_regex

pyarrow.compute.extract_regex(strings, /, pattern, *, options=None, memory_pool=None)

Extract substrings captured by a regex pattern.

For each string in strings, match the regular expression and, if the match succeeds, emit a struct whose field names and values come from the regular expression’s named capture groups. If the input is null or the regular expression fails to match, a null output value is emitted.

Regular expression matching is done using the Google RE2 library.

Parameters
strings : Array-like or scalar-like

Input strings to match against the pattern.

pattern : str

Regular expression with named capture fields.

options : pyarrow.compute.ExtractRegexOptions, optional

Alternative way of passing options.

memory_pool : pyarrow.MemoryPool, optional

If not passed, will allocate memory from the default memory pool.