---
layout: global
title: JSON Files
displayTitle: JSON Files
---

<div class="codetabs">

<div data-lang="scala" markdown="1">
Spark SQL can automatically infer the schema of a JSON dataset and load it as a `Dataset[Row]`.
This conversion can be done using `SparkSession.read.json()` on either a `Dataset[String]`
or a JSON file.

Note that a file supplied as _a JSON file_ is not a typical JSON file. Each
line must contain a separate, self-contained, valid JSON object. For more information, please see
[JSON Lines text format, also called newline-delimited JSON](http://jsonlines.org/).

For a regular multi-line JSON file, set the `multiLine` option to `true`.

{% include_example json_dataset scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala %}
</div>

<div data-lang="java" markdown="1">
Spark SQL can automatically infer the schema of a JSON dataset and load it as a `Dataset<Row>`.
This conversion can be done using `SparkSession.read().json()` on either a `Dataset<String>`
or a JSON file.

Note that a file supplied as _a JSON file_ is not a typical JSON file. Each
line must contain a separate, self-contained, valid JSON object. For more information, please see
[JSON Lines text format, also called newline-delimited JSON](http://jsonlines.org/).

For a regular multi-line JSON file, set the `multiLine` option to `true`.

{% include_example json_dataset java/org/apache/spark/examples/sql/JavaSQLDataSourceExample.java %}
</div>

<div data-lang="python" markdown="1">
Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame.
This conversion can be done using `SparkSession.read.json` on a JSON file.

Note that a file supplied as _a JSON file_ is not a typical JSON file. Each
line must contain a separate, self-contained, valid JSON object. For more information, please see
[JSON Lines text format, also called newline-delimited JSON](http://jsonlines.org/).

For a regular multi-line JSON file, set the `multiLine` parameter to `True`.

{% include_example json_dataset python/sql/datasource.py %}
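
To make the one-object-per-line requirement concrete, the following is a minimal sketch in plain Python using only the standard `json` module. It is not Spark's parser; the helper names `parse_json_lines` and `parses_line_by_line` and the sample records are hypothetical and chosen for illustration. It shows that JSON Lines text parses line by line, while a regular multi-line JSON document does not and has to be read as a whole, which is the kind of file the `multiLine` parameter is meant for.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON: one self-contained object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parses_line_by_line(text):
    """Return True if every non-empty line is valid JSON on its own."""
    try:
        parse_json_lines(text)
        return True
    except json.JSONDecodeError:
        return False


# Illustrative JSON Lines content: each line is a complete JSON object.
json_lines_text = (
    '{"name": "Michael"}\n'
    '{"name": "Andy", "age": 30}\n'
    '{"name": "Justin", "age": 19}\n'
)

# Illustrative regular JSON document spread over several lines.
multi_line_text = """[
  {"name": "Michael"},
  {"name": "Andy", "age": 30}
]
"""

records = parse_json_lines(json_lines_text)
print(records)

# The first line of the multi-line document is just "[", which is not
# valid JSON by itself, so line-by-line parsing fails.
print(parses_line_by_line(multi_line_text))

# Parsing the whole document at once succeeds.
whole = json.loads(multi_line_text)
print(whole)
```

A file shaped like `json_lines_text` matches the default expectation described above, while a file shaped like `multi_line_text` is the case where `multiLine` would be set to `True`.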
</div>

<div data-lang="r" markdown="1">
Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame using
the `read.json()` function, which loads data from a directory of JSON files where each line of the
files is a JSON object.

Note that a file supplied as _a JSON file_ is not a typical JSON file. Each
line must contain a separate, self-contained, valid JSON object. For more information, please see
[JSON Lines text format, also called newline-delimited JSON](http://jsonlines.org/).

For a regular multi-line JSON file, set the named parameter `multiLine` to `TRUE`.

{% include_example json_dataset r/RSparkSQLExample.R %}

</div>

<div data-lang="sql" markdown="1">

{% highlight sql %}

CREATE TEMPORARY VIEW jsonTable
USING org.apache.spark.sql.json
OPTIONS (
  path "examples/src/main/resources/people.json"
);

SELECT * FROM jsonTable;

{% endhighlight %}

</div>

</div>