spark-instrumented-optimizer/python/pyspark/sql
Reynold Xin 42de5253f3 [SPARK-11745][SQL] Enable more JSON parsing options
This patch adds the following options to the JSON data source, for dealing with non-standard JSON files:
* `allowComments` (default `false`): ignores Java/C++ style comments in JSON records
* `allowUnquotedFieldNames` (default `false`): allows unquoted JSON field names
* `allowSingleQuotes` (default `true`): allows single quotes in addition to double quotes
* `allowNumericLeadingZeros` (default `false`): allows leading zeros in numbers (e.g. 00012)
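
As a rough illustration of how these options might be switched on from Python, the sketch below applies each one through a reader's `option(key, value)` method before calling `json(path)`. The option names come from the list above. The helper name `read_nonstandard_json`, the dictionary name, and the choice to enable every option at once are assumptions made for the example, not part of the patch.

```python
# Option names are taken from the commit message; enabling all of them
# together is an assumption made for this example.
NONSTANDARD_JSON_OPTIONS = {
    "allowComments": "true",
    "allowUnquotedFieldNames": "true",
    "allowSingleQuotes": "true",
    "allowNumericLeadingZeros": "true",
}


def read_nonstandard_json(reader, path, options=NONSTANDARD_JSON_OPTIONS):
    """Apply each option to a DataFrameReader-like object, then load `path`.

    `reader` is assumed to expose `option(key, value)` (returning a reader)
    and `json(path)`, as PySpark's DataFrameReader does. This helper is
    hypothetical and only wraps those two calls.
    """
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.json(path)
```

With a live context this could be called as `read_nonstandard_json(sqlContext.read, "records.json")`, where `sqlContext` and the file name are placeholders.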

To avoid threading many individual options through the json package, I introduced a new `JSONOptions` case class that defines all JSON config options.
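
The idea of gathering every parsing flag into a single options object can be sketched in Python with a frozen dataclass. This is only an analogue of the pattern, not the Scala `JSONOptions` class from the patch. The defaults mirror the list above, while the snake_case field names, the `from_parameters` constructor, and the string-to-boolean rule are assumptions.

```python
from dataclasses import dataclass
from typing import Mapping


def _to_bool(value: str) -> bool:
    # Assumed parsing rule: only the string "true" (any case) enables a flag.
    return value.strip().lower() == "true"


@dataclass(frozen=True)
class JSONOptions:
    """Hypothetical Python analogue of an options holder for JSON parsing."""

    # Defaults follow the commit message.
    allow_comments: bool = False
    allow_unquoted_field_names: bool = False
    allow_single_quotes: bool = True
    allow_numeric_leading_zeros: bool = False

    @classmethod
    def from_parameters(cls, parameters: Mapping[str, str]) -> "JSONOptions":
        """Build options from a string map, falling back to the defaults."""
        defaults = cls()

        def get(key: str, default: bool) -> bool:
            return _to_bool(parameters[key]) if key in parameters else default

        return cls(
            allow_comments=get("allowComments", defaults.allow_comments),
            allow_unquoted_field_names=get(
                "allowUnquotedFieldNames", defaults.allow_unquoted_field_names
            ),
            allow_single_quotes=get(
                "allowSingleQuotes", defaults.allow_single_quotes
            ),
            allow_numeric_leading_zeros=get(
                "allowNumericLeadingZeros", defaults.allow_numeric_leading_zeros
            ),
        )
```

Callers then pass one `JSONOptions` value around instead of four separate booleans, for example `JSONOptions.from_parameters({"allowComments": "true"})`.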

Also updated documentation to explain these options.

Scala

![screen shot 2015-11-15 at 6 12 12 pm](https://cloud.githubusercontent.com/assets/323388/11172965/e3ace6ec-8bc4-11e5-805e-2d78f80d0ed6.png)

Python

![screen shot 2015-11-15 at 6 11 28 pm](https://cloud.githubusercontent.com/assets/323388/11172964/e23ed6ee-8bc4-11e5-8216-312f5983acd5.png)

Author: Reynold Xin <rxin@databricks.com>

Closes #9724 from rxin/SPARK-11745.
2015-11-16 00:06:14 -08:00

| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [SPARK-10373] [PYSPARK] move @since into pyspark from sql | 2015-09-08 20:56:22 -07:00 |
| `column.py` | [SPARK-9014] [SQL] Allow Python spark API to use built-in exponential operator | 2015-09-11 15:19:04 -07:00 |
| `context.py` | [SPARK-11671] documentation code example typo | 2015-11-12 15:42:30 -08:00 |
| `dataframe.py` | [SPARK-11420] Updating Stddev support via Imperative Aggregate | 2015-11-12 13:47:34 -08:00 |
| `functions.py` | [SPARK-11567] [PYTHON] Add Python API for corr Aggregate function | 2015-11-10 15:47:10 -08:00 |
| `group.py` | [SPARK-11690][PYSPARK] Add pivot to python api | 2015-11-13 10:31:17 -08:00 |
| `readwriter.py` | [SPARK-11745][SQL] Enable more JSON parsing options | 2015-11-16 00:06:14 -08:00 |
| `tests.py` | [SPARK-9830][SQL] Remove AggregateExpression1 and Aggregate Operator used to evaluate AggregateExpression1s | 2015-11-10 11:06:29 -08:00 |
| `types.py` | [SPARK-11158][SQL] Modified _verify_type() to be more informative on Errors by presenting the Object | 2015-10-18 11:39:19 -07:00 |
| `utils.py` | [SPARK-11322] [PYSPARK] Keep full stack trace in captured exception | 2015-10-28 21:45:00 -07:00 |
| `window.py` | [SPARK-10373] [PYSPARK] move @since into pyspark from sql | 2015-09-08 20:56:22 -07:00 |