spark-instrumented-optimizer

Wenchen Fan d57daf1f77 [SPARK-13593] [SQL] improve the `createDataFrame` to accept data type string and verify the data	2016-03-08 14:00:03 -08:00

## What changes were proposed in this pull request?

This PR improves the `createDataFrame` method so that it also accepts a data type string, which lets users convert a Python RDD to a DataFrame easily, for example `df = rdd.toDF("a: int, b: string")`. It also supports a flat schema, so users can convert an RDD of int to a DataFrame directly; each int is automatically wrapped in a row. If a schema is given, we now check that the real data matches it and throw an error if it doesn't.

## How was this patch tested?

New tests in `test.py` and a doc test in `types.py`.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #11444 from cloud-fan/pyrdd.
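
The three behaviours described in this commit (parsing a data type string, wrapping flat values into rows, and verifying data against the schema) can be illustrated with a small pure-Python sketch. This is not PySpark's implementation and does not use the PySpark API: `PY_TYPES`, `parse_schema`, and `to_rows` are hypothetical names, the type table covers only a few simple types, and the `value` column name for a flat schema is an assumption made for this example.

```python
# Illustrative stand-in only; NOT how PySpark implements createDataFrame.
# PY_TYPES, parse_schema and to_rows are hypothetical names for this sketch.

PY_TYPES = {"int": int, "string": str, "double": float, "boolean": bool}


def parse_schema(schema):
    """Parse "a: int, b: string" into [("a", int), ("b", str)].

    A bare type such as "int" is treated as a flat, single-column schema.
    The column name "value" is an assumption of this sketch.
    """
    if ":" not in schema:
        return [("value", PY_TYPES[schema.strip()])]
    fields = []
    for part in schema.split(","):
        name, type_name = part.split(":")
        fields.append((name.strip(), PY_TYPES[type_name.strip()]))
    return fields


def to_rows(data, schema):
    """Wrap flat values into 1-tuples and check every value against the schema."""
    fields = parse_schema(schema)
    flat = ":" not in schema
    rows = []
    for item in data:
        row = (item,) if flat else tuple(item)
        if len(row) != len(fields):
            raise ValueError("expected %d fields, got %d" % (len(fields), len(row)))
        for value, (name, expected) in zip(row, fields):
            # Strict check: bool is not accepted where int is declared.
            if type(value) is not expected:
                raise TypeError("field %r: expected %s, got %s"
                                % (name, expected.__name__, type(value).__name__))
        rows.append(row)
    return rows
```

In this sketch, `to_rows([1, 2, 3], "int")` wraps each integer into a one-field row, while `to_rows([(1, 2)], "a: int, b: string")` raises a `TypeError` because the second field is not a string.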
docs	[SPARK-13154][PYTHON] Add linting for pydocs	2016-02-12 02:13:06 -08:00
lib	[SPARK-12652][PYSPARK] Upgrade Py4J to 0.9.1	2016-01-12 14:27:05 -08:00
pyspark	[SPARK-13593] [SQL] improve the `createDataFrame` to accept data type string and verify the data	2016-03-08 14:00:03 -08:00
test_support	[SPARK-13509][SPARK-13507][SQL] Support for writing CSV with a single function call	2016-02-29 09:44:29 -08:00
.gitignore	[SPARK-3946] gitignore in /python includes wrong directory	2014-10-14 14:09:39 -07:00
pylintrc	[SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs	2016-03-07 14:48:02 -08:00
run-tests	[SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system	2015-06-27 20:24:34 -07:00
run-tests.py	[SPARK-12243][BUILD][PYTHON] PySpark tests are slow in Jenkins.	2016-03-07 12:06:46 -08:00