spark-instrumented-optimizer

History

Reynold Xin 578bfeeff5 [SPARK-7654][SQL] DataFrameReader and DataFrameWriter for input/output API This patch introduces DataFrameWriter and DataFrameReader. DataFrameReader interface, accessible through SQLContext.read, contains methods that create DataFrames. These methods used to reside in SQLContext. Example usage: ```scala sqlContext.read.json("...") sqlContext.read.parquet("...") ``` DataFrameWriter interface, accessible through DataFrame.write, implements a builder pattern to avoid the proliferation of options in writing DataFrame out. It currently implements: - mode - format (e.g. "parquet", "json") - options (generic options passed down into data sources) - partitionBy (partitioning columns) Example usage: ```scala df.write.mode("append").format("json").partitionBy("date").saveAsTable("myJsonTable") ``` TODO: - [ ] Documentation update - [ ] Move JDBC into reader / writer? - [ ] Deprecate the old interfaces - [ ] Move the generic load interface into reader. - [ ] Update example code and documentation Author: Reynold Xin <rxin@databricks.com> Closes #6175 from rxin/reader-writer and squashes the following commits: b146c95 [Reynold Xin] Deprecation of old APIs. bd8abdf [Reynold Xin] Fixed merge conflict. 26abea2 [Reynold Xin] Added general load methods. 244fbec [Reynold Xin] Added equivalent to example. 4f15d92 [Reynold Xin] Added documentation for partitionBy. 7e91611 [Reynold Xin] [SPARK-7654][SQL] DataFrameReader and DataFrameWriter for input/output API.		2015-05-15 22:00:31 -07:00
..
scala-2.10/src/main	[SPARK-7396] [STREAMING] [EXAMPLE] Update KafkaWordCountProducer to use new Producer API	2015-05-06 17:44:43 -07:00
src/main	[SPARK-7654][SQL] DataFrameReader and DataFrameWriter for input/output API	2015-05-15 22:00:31 -07:00
pom.xml	update the deprecated CountMinSketchMonoid function to TopPctCMS function	2015-04-25 08:42:38 -04:00