5f83c6991c
CSV is the most common data format in the "small data" world. It is often the first format people want to try when they see Spark on a single node. Having to rely on a third-party component for this leads to a poor experience for new users. This PR merges the popular spark-csv data source package (https://github.com/databricks/spark-csv) with Spark SQL. This is a first PR to bring the functionality to Spark 2.0 master. We will complete the items outlined in the design document (see JIRA attachment) in follow-up pull requests.

Author: Hossein <hossein@databricks.com>
Author: Reynold Xin <rxin@databricks.com>

Closes #10766 from rxin/csv.
217 B
~ Version 1.0
~ Using a non-standard comment char to test CSV parser defaults are overridden
1,2,3,4,5.01,2015-08-20 15:57:00
6,7,8,9,0,2015-08-21 16:58:01
~0,9,8,7,6,2015-08-22 17:59:02
1,2,3,4,5,2015-08-23 18:00:42
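
The file above marks comment lines with `~` rather than a more conventional character, so a reader configured with that comment character should return only the three data rows. The following is a minimal sketch of that expected behavior using only the Python standard library. It is not the Spark implementation; the helper name `read_csv_skipping_comments` and its `comment_char` parameter are hypothetical and exist only for illustration.

```python
import csv
import io

# Contents of the test file shown above, reproduced verbatim.
SAMPLE = """\
~ Version 1.0
~ Using a non-standard comment char to test CSV parser defaults are overridden
1,2,3,4,5.01,2015-08-20 15:57:00
6,7,8,9,0,2015-08-21 16:58:01
~0,9,8,7,6,2015-08-22 17:59:02
1,2,3,4,5,2015-08-23 18:00:42
"""


def read_csv_skipping_comments(text, comment_char="~"):
    """Hypothetical helper: drop lines that start with comment_char, parse the rest.

    Assumption: a comment is any line whose first character is comment_char.
    """
    data_lines = (
        line for line in io.StringIO(text) if not line.startswith(comment_char)
    )
    return list(csv.reader(data_lines))


rows = read_csv_skipping_comments(SAMPLE)
for row in rows:
    print(row)
```

With `~` as the comment character, the two header comments and the line beginning `~0,9,8,7,6` are skipped. If the comment character were left at a different value such as `#`, all six lines would be parsed as data, which is the situation the test file is designed to catch.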