7d039e0c0a
## What changes were proposed in this pull request? * Add a "function type" argument to pandas_udf. * Add a new public enum class `PandasUdfType` in pyspark.sql.functions * Refactor udf related code from pyspark.sql.functions to pyspark.sql.udf * Merge "PythonUdfType" and "PythonEvalType" into a single enum class "PythonEvalType" Example: ``` from pyspark.sql.functions import pandas_udf, PandasUDFType pandas_udf('double', PandasUDFType.SCALAR): def plus_one(v): return v + 1 ``` ## Design doc https://docs.google.com/document/d/1KlLaa-xJ3oz28xlEJqXyCAHU3dwFYkFs_ixcUXrJNTc/edit ## How was this patch tested? Added PandasUDFTests ## TODO: * [x] Implement proper enum type for `PandasUDFType` * [x] Update documentation * [x] Add more tests in PandasUDFTests Author: Li Jin <ice.xelloss@gmail.com> Closes #19630 from icexelloss/spark-22409-pandas-udf-type. |
||
---|---|---|
.. | ||
benchmarks | ||
src | ||
pom.xml |