5 commits

Author SHA1 Message Date
HyukjinKwon 41af409b7b [SPARK-35303][PYTHON] Enable pinned thread mode by default
### What changes were proposed in this pull request?

PySpark added pinned thread mode in https://github.com/apache/spark/pull/24898 to map each Python thread to its own JVM thread. Previously, a single JVM thread could be reused by several Python threads, which broke the inheritance of thread-local state, especially when multiple jobs ran in parallel. To fix this completely, we should enable this mode by default.
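
The effect of thread reuse can be illustrated without Spark. The following is a plain-Python analogy, not PySpark code: `threading.local` stands in for JVM thread-local state, and `set_job_group`, `get_job_group`, `run_shared`, and `run_pinned` are hypothetical helpers introduced only for this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for thread-local state kept on the JVM side (an assumption
# made for this analogy; no Spark API is used here).
_local = threading.local()


def set_job_group(name):
    _local.job_group = name


def get_job_group():
    return getattr(_local, "job_group", None)


def run_shared(callers):
    """Serve every caller from one reused worker thread."""
    seen = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for name in callers:
            if name is not None:
                pool.submit(set_job_group, name).result()
            seen.append(pool.submit(get_job_group).result())
    return seen


def run_pinned(callers):
    """Give every caller its own dedicated worker thread."""
    seen = []
    for name in callers:
        result = []

        def work(name=name, result=result):
            if name is not None:
                set_job_group(name)
            result.append(get_job_group())

        worker = threading.Thread(target=work)
        worker.start()
        worker.join()
        seen.append(result[0])
    return seen


# The second caller never sets a job group of its own.
shared = run_shared(["job-a", None])
pinned = run_pinned(["job-a", None])
print(shared, pinned)
```

With the shared worker, the second caller observes the first caller's value; with one thread per caller, it observes nothing, which is the isolation that a one-to-one thread mapping is meant to provide.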

### Why are the changes needed?

To correctly support parallel job submission and management.

### Does this PR introduce _any_ user-facing change?

Yes. Each Python thread is now mapped one-to-one to a JVM thread.

### How was this patch tested?

Existing tests should cover it.

Closes #32429 from HyukjinKwon/SPARK-35303.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-06-18 12:02:29 +09:00
Yikun Jiang 31555f7779 [SPARK-34630][PYTHON][FOLLOWUP] Add __version__ into pyspark init __all__
### What changes were proposed in this pull request?
This patch adds `__version__` to `pyspark.__init__.__all__` so that `__version__` is exported explicitly. See https://github.com/apache/spark/pull/32110#issuecomment-817331896 for details.

### Why are the changes needed?
1. Export `__version__` explicitly
2. Clean up the `noqa: F401` on `__version__`
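
As a rough illustration of what listing a name in `__all__` does, the sketch below builds a throwaway module in memory and star-imports from it. The module name `demo_pkg`, the `greet` function, and the placeholder version string are hypothetical and are not part of PySpark.

```python
import sys
import types

# Source of a hypothetical package __init__ (not the real pyspark one).
source = '''
from os import path  # imported helper, not meant for re-export

__version__ = "0.0.0"  # placeholder value

__all__ = ["greet", "__version__"]


def greet():
    return "hi"
'''

# Register the module so that a regular import statement can find it.
demo = types.ModuleType("demo_pkg")
exec(source, demo.__dict__)
sys.modules["demo_pkg"] = demo

# A star-import copies exactly the names listed in __all__,
# including dunder names that would otherwise be skipped.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if name != "__builtins__")
print(exported)
```

Names listed in `__all__` are also treated as used by pyflakes-based linters, which is why the `noqa: F401` marker becomes unnecessary.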

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Python-related CI passed.

Closes #32125 from Yikun/SPARK-34629-Follow.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: zero323 <mszymkiewicz@gmail.com>
2021-04-14 23:36:25 +02:00
Yikun Jiang 4c1ccdabe8 [SPARK-34630][PYTHON] Add typehint for pyspark.__version__
### What changes were proposed in this pull request?
This PR adds a type hint for `pyspark.__version__`, as mentioned in [SPARK-34630](https://issues.apache.org/jira/browse/SPARK-34630).

### Why are the changes needed?
There was a short discussion in https://github.com/apache/spark/pull/31823#discussion_r593830911 .

After further investigation of [1][2], we can see that `pyspark.__version__` is added by [setup.py](c06758834e/python/setup.py (L201)), which embeds `__version__` into the pyspark module. That means `__init__.pyi` is the right place to add the type hint for `__version__`.

So this patch adds the type hint for `__version__` in `pyspark/__init__.pyi`.
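
For illustration only, a module-level annotation of this kind might look like the sketch below. In a `.pyi` stub the declaration would normally carry no value (`__version__: str`); the exact line added by the patch is an assumption here, and the placeholder value and the `major` helper are hypothetical.

```python
# Runtime counterpart of a stub declaration such as `__version__: str`.
# The value is a placeholder, not a real Spark version.
__version__: str = "0.0.0"


def major(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".")[0])


# With the annotation in place, a type checker can verify that
# __version__ is passed where a str is expected.
major_version = major(__version__)
print(major_version)
```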

[1] [PEP-396 Module Version Numbers](https://www.python.org/dev/peps/pep-0396/)
[2] https://packaging.python.org/guides/single-sourcing-package-version/
### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
1. Disable `ignore_errors` at ee7bf7d962/python/mypy.ini (L132)

2. Run mypy:
- Before fix
```shell
(venv) ➜  spark git:(SPARK-34629) ✗ mypy --config-file python/mypy.ini python/pyspark | grep version
python/pyspark/pandas/spark/accessors.py:884: error: Module has no attribute "__version__"
```

- After fix
```shell
(venv) ➜  spark git:(SPARK-34629) ✗ mypy --config-file python/mypy.ini python/pyspark | grep version
```
no output

Closes #32110 from Yikun/SPARK-34629.

Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2021-04-11 10:40:08 +09:00
Josh Soref 13fd272cd3 Spelling r common dev mlib external project streaming resource managers python
### What changes were proposed in this pull request?

This PR intends to fix typos in the sub-modules:
* `R`
* `common`
* `dev`
* `mllib`
* `external`
* `project`
* `streaming`
* `resource-managers`
* `python`

Split per srowen https://github.com/apache/spark/pull/30323#issuecomment-728981618

NOTE: The misspellings have been reported at 706a726f87 (commitcomment-44064356)

### Why are the changes needed?

Misspelled words make it harder to read / understand content.

### Does this PR introduce _any_ user-facing change?

There are various fixes to documentation, etc.

### How was this patch tested?

No testing was performed

Closes #30402 from jsoref/spelling-R_common_dev_mlib_external_project_streaming_resource-managers_python.

Authored-by: Josh Soref <jsoref@users.noreply.github.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-11-27 10:22:45 -06:00
zero323 31a16fbb40 [SPARK-32714][PYTHON] Initial pyspark-stubs port
### What changes were proposed in this pull request?

This PR proposes migrating [`pyspark-stubs`](https://github.com/zero323/pyspark-stubs) into the Spark codebase.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

Yes. This PR adds type annotations directly to Spark source.

This can affect how development tools behave for users who haven't used `pyspark-stubs`.

### How was this patch tested?

- [x] MyPy tests of the PySpark source
    ```
    mypy --no-incremental --config python/mypy.ini python/pyspark
    ```
- [x] MyPy tests of Spark examples
    ```
    MYPYPATH=python/ mypy --no-incremental --config python/mypy.ini examples/src/main/python/ml examples/src/main/python/sql examples/src/main/python/sql/streaming
    ```
- [x] Existing Flake8 linter

- [x] Existing unit tests

Tested against:

- `mypy==0.790+dev.e959952d9001e9713d329a2f9b196705b028f894`
- `mypy==0.782`

Closes #29591 from zero323/SPARK-32681.

Authored-by: zero323 <mszymkiewicz@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-09-24 14:15:36 +09:00