ded1a7495b
## What changes were proposed in this pull request?

My local default locale is not among the available locales reported by `Locale`, so when I ran the Python tests locally, the `StopWordsRemover` test failed with errors like:

```
Traceback (most recent call last):
  File "/spark-1/python/pyspark/ml/tests/test_feature.py", line 87, in test_stopwordsremover
    stopWordRemover = StopWordsRemover(inputCol="input", outputCol="output")
  File "/spark-1/python/pyspark/__init__.py", line 111, in wrapper
    return func(self, **kwargs)
  File "/spark-1/python/pyspark/ml/feature.py", line 2646, in __init__
    self.uid)
  File "/spark-1/python/pyspark/ml/wrapper.py", line 67, in _new_java_obj
    return java_obj(*java_args)
  File "/spark-1/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1554, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/spark-1/python/pyspark/sql/utils.py", line 93, in deco
    raise converted
pyspark.sql.utils.IllegalArgumentException: 'StopWordsRemover_4598673ee802 parameter locale given invalid value en_TW.'
```

As per HyukjinKwon's advice, instead of setting the locale just to make the test pass, it is better to fall back to a workable locale when the system default locale cannot be found among the available locales in the JVM. Otherwise, users have to manually change the system locale or access the private property `_jvm` in PySpark.

## How was this patch tested?

Added a test and tested manually.

```
scala> val remover = new StopWordsRemover().setInputCol("raw").setOutputCol("filtered")
19/07/14 19:20:03 WARN StopWordsRemover: Default locale set was [en_TW]; however, it was not found in available locales in JVM, falling back to en_US locale. Set param `locale` in order to respect another locale.
```

Closes #25133 from viirya/pytest-default-locale.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
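
The fallback rule described above can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the code from the patch (the actual change lives on the JVM side and is not shown here). The names `resolve_default_locale`, `FALLBACK_LOCALE`, and the sample set of available locales are hypothetical; the warning text is copied from the log output above.

```python
import logging

logger = logging.getLogger("StopWordsRemoverSketch")

# Fallback named in the warning message quoted in the PR description.
FALLBACK_LOCALE = "en_US"


def resolve_default_locale(system_default, available_locales):
    """Return system_default if it is available, otherwise the fallback.

    Hypothetical helper: it mirrors the behaviour the PR describes,
    using plain strings instead of java.util.Locale objects.
    """
    if system_default in available_locales:
        return system_default
    logger.warning(
        "Default locale set was [%s]; however, it was not found in available "
        "locales in JVM, falling back to %s locale. Set param `locale` in "
        "order to respect another locale.",
        system_default,
        FALLBACK_LOCALE,
    )
    return FALLBACK_LOCALE


# Assumed sample of locales a JVM might report as available.
available = {"en_US", "en_GB", "zh_TW"}

print(resolve_default_locale("en_TW", available))  # en_US (fallback, with a warning)
print(resolve_default_locale("en_GB", available))  # en_GB (kept as-is)
```

With this rule, constructing the transformer no longer fails on an unrecognised system default. As the warning itself suggests, a caller who wants a different locale can still set the `locale` param explicitly.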

| Name |
|---|
| benchmarks |
| src |
| pom.xml |