dask.dataframe.map_partitions
- dask.dataframe.map_partitions(func, *args, meta='__no_default__', enforce_metadata=True, transform_divisions=True, **kwargs)[source]
Apply Python function on each DataFrame partition.
- Parameters
- funcfunction
Function applied to each partition.
- args, kwargs :
Arguments and keywords to pass to the function. At least one of the args should be a Dask.dataframe. Arguments and keywords may contain
Scalar
,Delayed
or regular python objects. DataFrame-like args (both dask and pandas) will be repartitioned to align (if necessary) before applying the function.- enforce_metadatabool
Whether or not to enforce the structure of the metadata at runtime. This will rename and reorder columns for each partition, and will raise an error if this doesn’t work or types don’t match.
- metapd.DataFrame, pd.Series, dict, iterable, tuple, optional
An empty
pd.DataFrame
orpd.Series
that matches the dtypes and column names of the output. This metadata is necessary for many algorithms in dask dataframe to work. For ease of use, some alternative inputs are also available. Instead of aDataFrame
, adict
of{name: dtype}
or iterable of(name, dtype)
can be provided (note that the order of the names should match the order of the columns). Instead of a series, a tuple of(name, dtype)
can be used. If not provided, dask will try to infer the metadata. This may lead to unexpected results, so providingmeta
is recommended. For more information, seedask.dataframe.utils.make_meta
.