### What changes were proposed in this pull request?
This PR proposes to upgrade cloudpickle from 1.5.0 to 1.6.0.
The release effectively contains a single fix:
4510be850d
From a cursory look, this is not a regression, and the case is not supported by the standard-library `pickle` module either:
```python
>>> import pickle
>>> pickle.dumps({}.keys())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot pickle 'dict_keys' object
```
So it seems fine not to backport.
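For anyone hitting this on a release without the fix, here is a minimal sketch of the behaviour and one possible workaround. It uses only the standard library; `try_pickle` and `settings` are hypothetical names introduced for illustration and are not part of cloudpickle or PySpark.

```python
import pickle


def try_pickle(obj):
    """Return the pickled bytes, or None if the stdlib pickler rejects obj."""
    try:
        return pickle.dumps(obj)
    except TypeError:
        return None


settings = {"a": 1, "b": 2}

# A dict view is rejected by the standard-library pickler.
view_bytes = try_pickle(settings.keys())

# Materializing the view as a list gives the pickler a supported type.
list_bytes = try_pickle(list(settings.keys()))
restored = pickle.loads(list_bytes)
```

The round trip yields a plain list rather than a `dict_keys` view, which is usually enough when only the key values matter.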
### Why are the changes needed?
To leverage bug fixes from the cloudpickle upstream.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Jenkins and GitHub Actions builds will test it.
Closes #31007 from HyukjinKwon/cloudpickle-upgrade.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR intends to fix typos in the sub-modules:
* `R`
* `common`
* `dev`
* `mlib`
* `external`
* `project`
* `streaming`
* `resource-managers`
* `python`
Split per srowen https://github.com/apache/spark/pull/30323#issuecomment-728981618
NOTE: The misspellings have been reported at 706a726f87 (commitcomment-44064356)
### Why are the changes needed?
Misspelled words make it harder to read / understand content.
### Does this PR introduce _any_ user-facing change?
There are various fixes to documentation, etc.
### How was this patch tested?
No testing was performed.
Closes #30402 from jsoref/spelling-R_common_dev_mlib_external_project_streaming_resource-managers_python.
Authored-by: Josh Soref <jsoref@users.noreply.github.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This PR aims to upgrade PySpark's embedded cloudpickle to the latest cloudpickle v1.5.0 (see https://github.com/cloudpipe/cloudpickle/blob/v1.5.0/cloudpickle/cloudpickle.py).
### Why are the changes needed?
There are many bug fixes. For example, the bug described in the JIRA:
dill unpickling fails because cloudpickle defines `types.ClassType`, which is undefined in dill. This results in the following error:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/apache_beam/internal/pickler.py", line 279, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 317, in loads
    return load(file, ignore)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 305, in load
    obj = pik.load()
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 577, in _load_type
    return _reverse_typemap[name]
KeyError: 'ClassType'
```
See also https://github.com/cloudpipe/cloudpickle/issues/82. This was fixed for cloudpickle 1.3.0+ (https://github.com/cloudpipe/cloudpickle/pull/337), but PySpark's cloudpickle.py doesn't have this change yet.
More notably, it now supports the C pickle implementation with Python 3.8, which hugely improves performance. This has already been adopted in other projects such as Ray.
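As a rough illustration of the mechanism involved, Python 3.8 lets a subclass of the C-accelerated `pickle.Pickler` customize reduction through the `reducer_override` hook. The sketch below is standard library only and does not reproduce cloudpickle's internals; `ViewPickler` and `dumps` are hypothetical names, and rebuilding a dict view through `operator.methodcaller` is an assumption made for the example.

```python
import io
import operator
import pickle


class ViewPickler(pickle.Pickler):
    """Hypothetical pickler that adds dict_keys support (Python 3.8+)."""

    def reducer_override(self, obj):
        if isinstance(obj, type({}.keys())):
            # At load time this calls methodcaller("keys") on a rebuilt dict.
            return operator.methodcaller("keys"), (dict.fromkeys(obj),)
        # Defer to the default pickling machinery for everything else.
        return NotImplemented


def dumps(obj):
    """Serialize obj with ViewPickler and return the resulting bytes."""
    buffer = io.BytesIO()
    ViewPickler(buffer, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return buffer.getvalue()


# The payload loads with the ordinary unpickler; no custom class is needed.
restored_view = pickle.loads(dumps({"a": 1, "b": 2}.keys()))
```

Because the hook returns `NotImplemented` for every other object, built-in types still go through the unmodified C code path, which is where the performance benefit comes from.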
### Does this PR introduce _any_ user-facing change?
Yes, the bug fixes described above. Internally, users can also leverage the fast cloudpickle backed by C pickle.
### How was this patch tested?
Jenkins will test it out.
Closes #29114 from HyukjinKwon/SPARK-32094.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>