spark-instrumented-optimizer/python/pyspark/sql
HyukjinKwon e5abbab0ed [SPARK-30128][DOCS][PYTHON][SQL] Document/promote 'recursiveFileLookup' and 'pathGlobFilter' in file sources 'mergeSchema' in ORC
### What changes were proposed in this pull request?

This PR adds and exposes the options, 'recursiveFileLookup' and 'pathGlobFilter' in file sources 'mergeSchema' in ORC, into documentation.

- `recursiveFileLookup` at file sources: https://github.com/apache/spark/pull/24830 ([SPARK-27627](https://issues.apache.org/jira/browse/SPARK-27627))
- `pathGlobFilter` at file sources: https://github.com/apache/spark/pull/24518 ([SPARK-27990](https://issues.apache.org/jira/browse/SPARK-27990))
- `mergeSchema` at ORC: https://github.com/apache/spark/pull/24043 ([SPARK-11412](https://issues.apache.org/jira/browse/SPARK-11412))

**Note that** `timeZone` option was not moved from `DataFrameReader.options` as I assume it will likely affect other datasources as well once DSv2 is complete.

### Why are the changes needed?

To document available options in sources properly.

### Does this PR introduce any user-facing change?

In PySpark, `pathGlobFilter` can be set via `DataFrameReader.(text|orc|parquet|json|csv)` and `DataStreamReader.(text|orc|parquet|json|csv)`.

### How was this patch tested?

Manually built the doc and checked the output. Option setting in PySpark is rather a logical change. I manually tested one only:

```bash
$ ls -al tmp
...
-rw-r--r--   1 hyukjin.kwon  staff     3 Dec 20 12:19 aa
-rw-r--r--   1 hyukjin.kwon  staff     3 Dec 20 12:19 ab
-rw-r--r--   1 hyukjin.kwon  staff     3 Dec 20 12:19 ac
-rw-r--r--   1 hyukjin.kwon  staff     3 Dec 20 12:19 cc
```

```python
>>> spark.read.text("tmp", pathGlobFilter="*c").show()
```

```
+-----+
|value|
+-----+
|   ac|
|   cc|
+-----+
```

Closes #26958 from HyukjinKwon/doc-followup.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-12-23 09:57:42 +09:00
..
avro [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas 2019-12-11 01:26:29 -08:00
tests [SPARK-29188][PYTHON] toPandas (without Arrow) gets wrong dtypes when applied on empty DF 2019-12-12 20:49:10 +09:00
__init__.py [SPARK-27463][PYTHON][FOLLOW-UP] Miscellaneous documentation and code cleanup of cogroup pandas UDF 2019-09-30 22:25:35 +09:00
catalog.py [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 2019-09-09 10:19:40 -05:00
cogroup.py [SPARK-27463][PYTHON][FOLLOW-UP] Miscellaneous documentation and code cleanup of cogroup pandas UDF 2019-09-30 22:25:35 +09:00
column.py [SPARK-29664][PYTHON][SQL] Column.getItem behavior is not consistent with Scala 2019-11-01 12:25:48 +09:00
conf.py [SPARK-23698][PYTHON] Resolve undefined names in Python 3 2018-08-22 10:06:59 -07:00
context.py [MINOR][PYSPARK][DOCS] Fix typo in example documentation 2019-11-01 11:55:29 -07:00
dataframe.py [SPARK-30200][SQL][FOLLOW-UP] Expose only explain(mode: String) in Scala side, and clean up related codes 2019-12-16 14:42:35 +09:00
functions.py [MINOR][DOCS] Fix documentation for slide function 2019-12-16 16:29:09 +09:00
group.py [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs 2019-09-17 17:13:50 -07:00
readwriter.py [SPARK-30128][DOCS][PYTHON][SQL] Document/promote 'recursiveFileLookup' and 'pathGlobFilter' in file sources 'mergeSchema' in ORC 2019-12-23 09:57:42 +09:00
session.py [MINOR][PYSPARK][DOCS] Fix typo in example documentation 2019-11-01 11:55:29 -07:00
streaming.py [SPARK-30128][DOCS][PYTHON][SQL] Document/promote 'recursiveFileLookup' and 'pathGlobFilter' in file sources 'mergeSchema' in ORC 2019-12-23 09:57:42 +09:00
types.py [SPARK-29798][PYTHON][SQL] Infers bytes as binary type in createDataFrame in Python 3 at PySpark 2019-11-08 12:10:39 -08:00
udf.py [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs 2019-09-17 17:13:50 -07:00
utils.py [SPARK-29376][SQL][PYTHON] Upgrade Apache Arrow to version 0.15.1 2019-11-15 13:27:30 +09:00
window.py [SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations 2019-09-01 10:15:00 -05:00