spark-instrumented-optimizer/python/pyspark/mllib
Reynold Xin e98dfe627c [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
- The old implicit would convert RDDs directly to DataFrames, and that added too many methods.
- toDataFrame -> toDF
- Dsl -> functions
- implicits moved into SQLContext.implicits
- addColumn -> withColumn
- renameColumn -> withColumnRenamed

Python changes:
- toDataFrame -> toDF
- Dsl -> functions package
- addColumn -> withColumn
- renameColumn -> withColumnRenamed
- add toDF functions to RDD on SQLContext init
- add flatMap to DataFrame

Author: Reynold Xin <rxin@databricks.com>
Author: Davies Liu <davies@databricks.com>

Closes #4556 from rxin/SPARK-5752 and squashes the following commits:

5ef9910 [Reynold Xin] More fix
61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into SPARK-5752
ff5832c [Reynold Xin] Fix python
749c675 [Reynold Xin] count(*) fixes.
5806df0 [Reynold Xin] Fix build break again.
d941f3d [Reynold Xin] Fixed explode compilation break.
fe1267a [Davies Liu] flatMap
c4afb8e [Reynold Xin] style
d9de47f [Davies Liu] add comment
b783994 [Davies Liu] add comment for toDF
e2154e5 [Davies Liu] schema() -> schema
3a1004f [Davies Liu] Dsl -> functions, toDF()
fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
97dd47c [Davies Liu] fix mistake
6168f74 [Davies Liu] fix test
1fc0199 [Davies Liu] fix test
a075cd5 [Davies Liu] clean up, toPandas
663d314 [Davies Liu] add test for agg('*')
9e214d5 [Reynold Xin] count(*) fixes.
1ed7136 [Reynold Xin] Fix build break again.
921b2e3 [Reynold Xin] Fixed explode compilation break.
14698d4 [Davies Liu] flatMap
ba3e12d [Reynold Xin] style
d08c92d [Davies Liu] add comment
5c8b524 [Davies Liu] add comment for toDF
a4e5e66 [Davies Liu] schema() -> schema
d377fc9 [Davies Liu] Dsl -> functions, toDF()
6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
2015-02-13 23:03:22 -08:00
stat [SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model 2015-02-02 23:04:55 -08:00
__init__.py [SPARK-4821] [mllib] [python] [docs] Fix for pyspark.mllib.rand doc 2014-12-17 14:12:46 -08:00
classification.py [SPARK-4822] Use sphinx tags for Python doc annotations 2014-12-17 17:31:24 -08:00
clustering.py [SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model 2015-02-02 23:04:55 -08:00
common.py [SPARK-5223] [MLlib] [PySpark] fix MapConverter and ListConverter in MLlib 2015-01-13 12:50:31 -08:00
feature.py [SPARK-4822] Use sphinx tags for Python doc annotations 2014-12-17 17:31:24 -08:00
linalg.py [SPARK-5469] restructure pyspark.sql into multiple files 2015-02-09 20:49:22 -08:00
rand.py [SPARK-4891][PySpark][MLlib] Add gamma/log normal/exp dist sampling to P... 2015-01-08 15:03:43 -08:00
recommendation.py [SPARK-5536] replace old ALS implementation by the new one 2015-02-02 23:49:09 -08:00
regression.py [SPARK-4531] [MLlib] cache serialized java object 2014-11-21 15:02:31 -08:00
tests.py [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames 2015-02-13 23:03:22 -08:00
tree.py [SPARK-5094][MLlib] Add Python API for Gradient Boosted Trees 2015-01-30 00:39:44 -08:00
util.py [SPARK-4324] [PySpark] [MLlib] support numpy.array for all MLlib API 2014-11-10 22:26:16 -08:00