spark-instrumented-optimizer/python
Davies Liu 885d1621bc [SPARK-3500] [SQL] use JavaSchemaRDD as SchemaRDD._jschema_rdd
Currently, SchemaRDD._jschema_rdd is SchemaRDD, the Scala API (coalesce(), repartition()) can not been called in Python easily, there is no way to specify the implicit parameter `ord`. The _jrdd is an JavaRDD, so _jschema_rdd should also be JavaSchemaRDD.

In this patch, change _schema_rdd to JavaSchemaRDD, also added an assert for it. If some methods are missing from JavaSchemaRDD, then it's called by _schema_rdd.baseSchemaRDD().xxx().

BTW, Do we need JavaSQLContext?

Author: Davies Liu <davies.liu@gmail.com>

Closes #2369 from davies/fix_schemardd and squashes the following commits:

abee159 [Davies Liu] use JavaSchemaRDD as SchemaRDD._jschema_rdd
2014-09-12 19:05:39 -07:00
..
lib [SPARK-2305] [PySpark] Update Py4J to version 0.8.2.1 2014-07-29 19:02:06 -07:00
pyspark [SPARK-3500] [SQL] use JavaSchemaRDD as SchemaRDD._jschema_rdd 2014-09-12 19:05:39 -07:00
test_support [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
.gitignore SPARK-1004. PySpark on YARN 2014-04-29 23:24:34 -07:00
epydoc.conf [SPARK-2538] [PySpark] Hash based disk spilling aggregation 2014-07-24 22:53:47 -07:00
run-tests [SPARK-3094] [PySpark] compatitable with PyPy 2014-09-12 18:42:50 -07:00