spark-instrumented-optimizer/project
Gengliang Wang 395860a986 [SPARK-24768][SQL] Have a built-in AVRO data source implementation
## What changes were proposed in this pull request?

Apache Avro (https://avro.apache.org) is a popular data serialization format. It is widely used in the Spark and Hadoop ecosystems, especially for Kafka-based data pipelines. With the external package https://github.com/databricks/spark-avro, Spark SQL can read and write Avro data. Making spark-avro built-in can provide a better experience for first-time users of Spark SQL and Structured Streaming. We expect the built-in Avro data source to further improve the adoption of Structured Streaming.
The proposal is to inline the code from the spark-avro package (https://github.com/databricks/spark-avro). The target release is Spark 2.4.

[Built-in AVRO Data Source In Spark 2.4.pdf](https://github.com/apache/spark/files/2181511/Built-in.AVRO.Data.Source.In.Spark.2.4.pdf)

## How was this patch tested?

Unit tests.

Author: Gengliang Wang <gengliang.wang@databricks.com>

Closes #21742 from gengliangwang/export_avro.
2018-07-12 13:55:25 -07:00

| File | Last commit | Date |
| --- | --- | --- |
| build.properties | [SPARK-24419][BUILD] Upgrade SBT to 0.13.17 with Scala 2.10.7 for JDK9+ | 2018-05-30 05:18:18 -07:00 |
| MimaBuild.scala | [SPARK-23070] Bump previousSparkVersion in MimaBuild.scala to be 2.2.0 | 2018-01-15 22:32:38 +08:00 |
| MimaExcludes.scala | [SPARK-6237][NETWORK] Network-layer changes to allow stream upload. | 2018-06-26 15:56:58 -07:00 |
| plugins.sbt | [SPARK-22269][BUILD] Run Java linter via SBT for Jenkins | 2018-05-24 14:19:32 +08:00 |
| SparkBuild.scala | [SPARK-24768][SQL] Have a built-in AVRO data source implementation | 2018-07-12 13:55:25 -07:00 |