Commit graph

260 commits

Author SHA1 Message Date
Matei Zaharia 5b3a3e28d7 Complain if Python and NumPy versions are too old for MLlib 2014-01-14 12:27:58 -08:00
Matei Zaharia 4c28a2bad8 Update some Python MLlib parameters to use camelCase, and tweak docs
We've used camel case in other Spark methods so it felt reasonable to
keep using it here and make the code match Scala/Java as much as
possible. Note that parameter names matter in Python because it allows
passing optional parameters by name.
2014-01-11 22:30:48 -08:00
Matei Zaharia 9a0dfdf868 Add Naive Bayes to Python MLlib, and some API fixes
- Added a Python wrapper for Naive Bayes
- Updated the Scala Naive Bayes to match the style of our other
  algorithms better and in particular make it easier to call from Java
  (added builder pattern, removed default value in train method)
- Updated Python MLlib functions to not require a SparkContext; we can
  get that from the RDD the user gives
- Added a toString method in LabeledPoint
- Made the Python MLlib tests run as part of run-tests as well (before
  they could only be run individually through each file)
2014-01-11 22:30:48 -08:00
Hossein Falaki 754f5300a1 Added predictAll python function to MatrixFactorizationModel 2014-01-06 12:19:43 -08:00
Hossein Falaki 04132ea9b2 Added Rating deserializer 2014-01-06 12:19:08 -08:00
Hossein Falaki 8d0c2f7399 Added python binding for bulk recommendation 2014-01-04 16:23:17 -08:00
Tor Myklebust 9cbcf81453 Remove commented code in __init__.py. 2013-12-25 14:12:42 -05:00
Tor Myklebust 5e71354cb7 Fix copypasta in __init__.py. Don't import anything directly into pyspark.mllib. 2013-12-25 14:10:55 -05:00
Tor Myklebust 02208a175c Initial weights in Scala are ones; do that too. Also fix some errors. 2013-12-25 00:53:48 -05:00
Tor Myklebust 05163057a1 Split the mllib bindings into a whole bunch of modules and rename some things. 2013-12-25 00:08:05 -05:00