spark-instrumented-optimizer/dev/requirements.txt
itholic a9c5b1a5c8 [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
### What changes were proposed in this pull request?

This PR proposes adding two Python packages, `mlflow` and `sklearn`, to enable the MLflow test in pandas API on Spark.

### Why are the changes needed?

To enable the MLflow test in pandas API on Spark.

### Does this PR introduce _any_ user-facing change?

No, it's test-only.

### How was this patch tested?

Manually tested locally with `python/run-tests --testnames pyspark.pandas.mlflow`.

Closes #33567 from itholic/SPARK-36254.

Lead-authored-by: itholic <haejoon.lee@databricks.com>
Co-authored-by: Haejoon Lee <44108233+itholic@users.noreply.github.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit abce61f3fd)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-30 00:04:59 -07:00

# PySpark dependencies (required)
py4j
# PySpark dependencies (optional)
numpy
pyarrow
pandas
scipy
plotly
mlflow>=1.0
sklearn
matplotlib<3.3.0
# PySpark test dependencies
xmlrunner
# Linter
mypy
flake8
# Documentation (SQL)
mkdocs
# Documentation (Python)
pydata_sphinx_theme
ipython
nbsphinx
numpydoc
jinja2<3.0.0
sphinx<3.1.0
sphinx-plotly-directive
# Development scripts
jira
PyGithub
# pandas API on Spark Code formatter.
black
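
The file above groups its entries under `#` comment headers and pins a few packages with simple comparisons (`mlflow>=1.0`, `matplotlib<3.3.0`). As a minimal sketch of how that layout could be read programmatically, the snippet below groups requirement lines by the most recent comment header. The names `parse_requirements`, `SAMPLE`, and `_REQ` are hypothetical and not part of Spark's tooling, and the regular expression is a deliberate simplification that covers only the forms seen in this file, not the full requirement syntax pip accepts.

```python
import re

# A few lines copied from the requirements file above, used as sample input.
SAMPLE = """\
# PySpark dependencies (required)
py4j
# PySpark dependencies (optional)
numpy
mlflow>=1.0
matplotlib<3.3.0
"""

# Package name, then an optional comparison operator and version.
# Assumption: one constraint per line, no extras, markers, or URLs.
_REQ = re.compile(r"^([A-Za-z0-9._-]+)\s*(?:(==|>=|<=|~=|!=|<|>)\s*(\S+))?$")


def parse_requirements(text):
    """Group requirement lines under the most recent '#' comment header.

    Returns a dict mapping each header to a list of
    (name, operator, version) tuples; operator and version are None
    for unpinned packages.
    """
    groups = {}
    section = "(no section)"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A comment line starts a new group.
            section = line.lstrip("#").strip()
            groups.setdefault(section, [])
            continue
        match = _REQ.match(line)
        if match is None:
            raise ValueError(f"unsupported requirement line: {line!r}")
        groups.setdefault(section, []).append(match.groups())
    return groups
```

Calling `parse_requirements(SAMPLE)` yields one entry per header, which makes it easy to list, for example, only the optional PySpark dependencies and their version bounds.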