spark-instrumented-optimizer/python/pyspark
Xiangrui Meng 825d4fe47b [SPARK-3136][MLLIB] Create Java-friendly methods in RandomRDDs
Though we don't use default arguments for methods in RandomRDDs, they are still not easy for Java users to use because the output type is either `RDD[Double]` or `RDD[Vector]`. Java users should expect `JavaDoubleRDD` and `JavaRDD[Vector]`, respectively. We should create dedicated methods for Java users, and allow default arguments in Scala methods in RandomRDDs, to make life easier for both Java and Scala users. This PR also contains documentation for random data generation. brkyvz

Author: Xiangrui Meng <meng@databricks.com>

Closes #2041 from mengxr/stat-doc and squashes the following commits:

fc5eedf [Xiangrui Meng] add missing comma
ffde810 [Xiangrui Meng] address comments
aef6d07 [Xiangrui Meng] add doc for random data generation
b99d94b [Xiangrui Meng] add java-friendly methods to RandomRDDs
2014-08-19 16:06:48 -07:00
mllib [SPARK-3136][MLLIB] Create Java-friendly methods in RandomRDDs 2014-08-19 16:06:48 -07:00
__init__.py [SPARK-2724] Python version of RandomRDDGenerators 2014-07-31 20:32:57 -07:00
accumulators.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
broadcast.py [SPARK-1065] [PySpark] improve supporting for large broadcast 2014-08-16 16:59:34 -07:00
cloudpickle.py [SPARK-791] [PySpark] fix pickle itemgetter with cloudpickle 2014-07-29 01:02:18 -07:00
conf.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
context.py [SPARK-1065] [PySpark] improve supporting for large broadcast 2014-08-16 16:59:34 -07:00
daemon.py [SPARK-2898] [PySpark] fix bugs in daemon.py 2014-08-10 13:00:38 -07:00
files.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
java_gateway.py [SPARK-2894] spark-shell doesn't accept flags 2014-08-09 21:11:00 -07:00
join.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
rdd.py [SPARK-2790] [PySpark] fix zip with serializers which have different batch sizes. 2014-08-19 14:46:32 -07:00
rddsampler.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
resultiterable.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
serializers.py [SPARK-2790] [PySpark] fix zip with serializers which have different batch sizes. 2014-08-19 14:46:32 -07:00
shell.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
shuffle.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
sql.py [SQL] Using safe floating-point numbers in doctest 2014-08-16 11:26:51 -07:00
statcounter.py StatCounter on NumPy arrays [PYSPARK][SPARK-2012] 2014-08-01 22:33:25 -07:00
storagelevel.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
tests.py [SPARK-2790] [PySpark] fix zip with serializers which have different batch sizes. 2014-08-19 14:46:32 -07:00
worker.py [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL. 2014-08-18 20:42:19 -07:00