spark-instrumented-optimizer

History

gatorsmile b28fe4a4a9 [SPARK-18538][SQL] Fix Concurrent Table Fetching Using DataFrameReader JDBC APIs	2016-12-01 15:42:30 +08:00

### What changes were proposed in this pull request?

The following two `DataFrameReader` JDBC APIs ignore the user-specified parameters of parallelism degree.

```Scala
def jdbc(
    url: String,
    table: String,
    columnName: String,
    lowerBound: Long,
    upperBound: Long,
    numPartitions: Int,
    connectionProperties: Properties): DataFrame
```

```Scala
def jdbc(
    url: String,
    table: String,
    predicates: Array[String],
    connectionProperties: Properties): DataFrame
```

This PR is to fix the issues. To verify the behavior correctness, we improve the plan output of `EXPLAIN` command by adding `numPartitions` in the `JDBCRelation` node.

Before the fix,

```
== Physical Plan ==
Scan JDBCRelation(TEST.PEOPLE) [NAME#1896,THEID#1897] ReadSchema: struct<NAME:string,THEID:int>
```

After the fix,

```
== Physical Plan ==
Scan JDBCRelation(TEST.PEOPLE) [numPartitions=3] [NAME#1896,THEID#1897] ReadSchema: struct<NAME:string,THEID:int>
```

### How was this patch tested?

Added the verification logics on all the test cases for JDBC concurrent fetching.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #15975 from gatorsmile/jdbc.
benchmarks	[SPARK-17335][SQL] Fix ArrayType and MapType CatalogString.	2016-09-03 19:02:20 +02:00
src	[SPARK-18538][SQL] Fix Concurrent Table Fetching Using DataFrameReader JDBC APIs	2016-12-01 15:42:30 +08:00
pom.xml	[SPARK-17346][SQL][TEST-MAVEN] Generate the sql test jar to fix the maven build	2016-10-05 18:11:31 -07:00