spark-instrumented-optimizer/python/pyspark/sql
HyukjinKwon 00cb2f99cc [SPARK-28881][PYTHON][TESTS] Add a test to make sure toPandas with Arrow optimization throws an exception per maxResultSize
### What changes were proposed in this pull request?
This PR proposes to add a test case for:

```bash
./bin/pyspark --conf spark.driver.maxResultSize=1m
```

```python
spark.conf.set("spark.sql.execution.arrow.enabled", True)
spark.range(10000000).toPandas()
```

```
Empty DataFrame
Columns: [id]
Index: []
```

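which can silently return partial results (here, an empty DataFrame) instead of raising an error when the collected data exceeds `spark.driver.maxResultSize` (see https://github.com/apache/spark/pull/25593#issuecomment-525153808). This regression appeared between Spark 2.3 and Spark 2.4, and was later fixed accidentally.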

### Why are the changes needed?
To prevent the same regression in the future.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
A test was added.
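
For illustration only, a test of this kind might look roughly like the sketch below. It is not necessarily the test added by this PR: the class name, the `10k` limit, the partition count, and the `"is bigger than"` message fragment are assumptions, and it presumes `pyspark`, `pandas`, and `pyarrow` are installed. Because `spark.driver.maxResultSize` is read when the driver starts, the sketch sets it while building the session rather than at runtime.

```python
import unittest

try:
    from pyspark.sql import SparkSession
    have_pyspark = True
except ImportError:
    have_pyspark = False


@unittest.skipUnless(have_pyspark, "pyspark is not installed")
class MaxResultArrowSketch(unittest.TestCase):  # hypothetical class name
    @classmethod
    def setUpClass(cls):
        # maxResultSize must be set before the driver starts; if a session
        # already exists, getOrCreate() will not apply this setting.
        cls.spark = (
            SparkSession.builder
            .master("local[2]")
            .appName(cls.__name__)
            .config("spark.driver.maxResultSize", "10k")  # assumed small limit
            .getOrCreate()
        )
        # Enable Arrow and disable the non-Arrow fallback so that a failure
        # surfaces as an exception instead of a silent retry.
        cls.spark.conf.set("spark.sql.execution.arrow.enabled", "true")
        cls.spark.conf.set("spark.sql.execution.arrow.fallback.enabled", "false")

    @classmethod
    def tearDownClass(cls):
        if hasattr(cls, "spark"):
            cls.spark.stop()

    def test_exception_by_max_results(self):
        # Many small partitions: the serialized results should exceed the
        # limit, and toPandas() is expected to raise rather than return
        # a partial or empty DataFrame.
        with self.assertRaisesRegex(Exception, "is bigger than"):
            self.spark.range(0, 10000, 1, 100).toPandas()
```

Saved as a module, this can be run with `python -m unittest`; the test is skipped when `pyspark` is unavailable.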

Closes #25594 from HyukjinKwon/SPARK-28881.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-27 17:30:06 +09:00
avro [SPARK-28698][SQL] Support user-specified output schema in to_avro 2019-08-13 20:52:16 +08:00
tests [SPARK-28881][PYTHON][TESTS] Add a test to make sure toPandas with Arrow optimization throws an exception per maxResultSize 2019-08-27 17:30:06 +09:00
__init__.py [SPARK-22369][PYTHON][DOCS] Exposes catalog API documentation in PySpark 2017-11-02 15:22:52 +01:00
catalog.py [SPARK-24665][PYSPARK][FOLLOWUP] Use SQLConf in PySpark to manage all sql configs 2018-08-17 10:18:08 +08:00
column.py [SPARK-28031][PYSPARK][TEST] Improve doctest on over function of Column 2019-06-13 11:04:41 +09:00
conf.py [SPARK-23698][PYTHON] Resolve undefined names in Python 3 2018-08-22 10:06:59 -07:00
context.py [SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis 2019-01-17 19:40:39 -06:00
dataframe.py [SPARK-28378][PYTHON] Remove usage of cgi.escape 2019-07-14 15:26:00 +09:00
functions.py [SPARK-28777][PYTHON][DOCS] Fix format_string doc string with the correct parameters 2019-08-19 20:44:46 -07:00
group.py [SPARK-24722][SQL] pivot() with Column type argument 2018-08-04 14:17:32 +08:00
readwriter.py [SPARK-28471][SQL] Replace yyyy by uuuu in date-timestamp patterns without era 2019-07-28 20:36:36 -07:00
session.py [SPARK-27995][PYTHON] Note the difference between str of Python 2 and 3 at Arrow optimized 2019-06-11 18:43:59 +09:00
streaming.py [SPARK-28651][SS] Force the schema of Streaming file source to be nullable 2019-08-09 18:54:55 +09:00
types.py [SPARK-28454][PYTHON] Validate LongType in createDataFrame(verifySchema=True) 2019-08-08 11:47:25 +09:00
udf.py [SPARK-28273][SQL][PYTHON] Convert and port 'pgSQL/case.sql' into UDF test base 2019-07-09 10:50:07 +08:00
utils.py [SPARK-27609][PYTHON] Convert values of function options to strings 2019-07-18 13:37:03 +09:00
window.py [MINOR][PYSPARK][SQL][DOC] Fix rowsBetween doc in Window 2019-06-14 09:56:37 +09:00