Shivaram Venkataraman
07da72b451
Remove duplicate loss history and clarify why.
...
Also some minor style fixes.
2013-07-29 16:25:17 -07:00
Karen Feng
2d6da9195a
Alphabetized imports
2013-07-29 15:50:52 -07:00
Karen Feng
478a2886d9
Added started tasks to progress bar
2013-07-29 14:51:07 -07:00
Karen Feng
e04a37a332
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-29 14:32:48 -07:00
Reynold Xin
fe7298b587
Merge pull request #741 from pwendell/usability
...
Fix two small usability issues
2013-07-29 14:01:00 -07:00
Karen Feng
43a2cc15c0
Use Bootstrap progress bars in web UI
2013-07-29 13:37:24 -07:00
shivaram
c34c0f6a7c
Merge pull request #731 from pxinghao/master
...
Adding SVM and Lasso
2013-07-29 13:18:10 -07:00
Xinghao
2b2630ba3c
Style fix
...
Lines shortened to < 100 characters
2013-07-29 09:22:49 -07:00
Xinghao
07f17439a5
Fix validatePrediction functions for Classification models
...
Classifiers return categorical (Int) values that should be compared
directly
2013-07-29 09:22:31 -07:00
Xinghao
3a8d07df8c
Deleting extra LogisticRegressionGenerator and RidgeRegressionGenerator
2013-07-29 09:20:26 -07:00
Xinghao
75f3757300
Fix rounding error in LogisticRegression.scala
2013-07-29 09:19:56 -07:00
Matei Zaharia
d8158ced12
Merge branch 'master' of github.com:mesos/spark
2013-07-29 02:52:02 -04:00
Matei Zaharia
497f55755f
Add docs about ipython
2013-07-29 02:51:43 -04:00
Matei Zaharia
feba7ee540
SPARK-815. Python parallelize() should split lists before batching
...
One unfortunate consequence of this fix is that we materialize any
collections that are given to us as generators, but this seems necessary
to get reasonable behavior on small collections. We could add a
batchSize parameter later to bypass auto-computation of batch size if
this becomes a problem (e.g. if users really want to parallelize big
generators nicely).
2013-07-29 02:51:43 -04:00
Matei Zaharia
d75c308695
Use None instead of empty string as it's slightly smaller/faster
2013-07-29 02:51:43 -04:00
Matei Zaharia
96b50e82dc
Allow python/run-tests to run from any directory
2013-07-29 02:51:43 -04:00
Matei Zaharia
b5ec355622
Optimize Python foreach() to not return as many objects
2013-07-29 02:51:43 -04:00
Matei Zaharia
b9d6783f36
Optimize Python take() to not compute entire first partition
2013-07-29 02:51:43 -04:00
Xinghao
c823ee1e2b
Replace map-reduce with dot operator using DoubleMatrix
2013-07-28 22:17:53 -07:00
Xinghao
96e04f4cb7
Fixed SVM and LR train functions to take Int instead of Double for Classification
2013-07-28 22:12:39 -07:00
Xinghao
9398dced03
Changed Classification to return Int instead of Double
...
Also minor changes to formatting and comments
2013-07-28 21:39:19 -07:00
Xinghao
67de051bbb
SVMSuite and LassoSuite rewritten to follow closely with LogisticRegressionSuite
2013-07-28 21:09:56 -07:00
Xinghao
29e042940a
Move data generators to util
2013-07-28 20:39:52 -07:00
Matei Zaharia
72ff62a37c
Two fixes to IPython support:
...
- Don't attempt to run worker processes with ipython (that can cause
some crashes as ipython prints things to standard out)
- Allow passing some IPYTHON_OPTS to launch things like the notebook
2013-07-28 22:23:13 -04:00
Xinghao
ccfa362dde
Change *_LocalRandomSGD to *LocalRandomSGD
2013-07-28 10:33:57 -07:00
Dmitriy Lyubimov
0862494d44
typo
2013-07-27 23:16:20 -07:00
Dmitriy Lyubimov
f5067abe85
changes per comments.
2013-07-27 23:08:00 -07:00
Dmitriy Lyubimov
b241fcfb35
reverting changes to SparkBuild.scala
2013-07-27 22:53:01 -07:00
Matei Zaharia
f11ad72d4e
Some fixes to Python examples (style and package name for LR)
2013-07-27 21:12:22 -04:00
Karen Feng
077f2dad22
Fixed outdated bugs
2013-07-27 16:39:36 -07:00
Patrick Wendell
bcafb36c1e
Slight wording change
2013-07-27 16:03:50 -07:00
Patrick Wendell
8177165ac4
Log executor on finish
2013-07-27 16:02:06 -07:00
Patrick Wendell
c2223e6801
Improve catch scope and logging for client stop()
...
This does two things:
1. Catches the more general `TimeoutException`, since those can be thrown.
2. Logs at info level when a timeout is detected.
2013-07-27 16:02:06 -07:00
Karen Feng
5a93e3c58c
Cleaned up code based on pwendell's suggestions
2013-07-27 15:55:26 -07:00
Karen Feng
dcc4743a95
Moved val now to render
2013-07-27 12:52:53 -07:00
Karen Feng
1714693324
Current time called once with value now
2013-07-27 12:24:41 -07:00
Dmitriy Lyubimov
6a47cee721
style
2013-07-26 22:35:13 -07:00
Dmitriy Lyubimov
0c391feb73
Maximum task failures configurable
2013-07-26 22:34:43 -07:00
Dmitriy Lyubimov
23f3e0f117
mixing in SharedSparkContext for the kryo-collect test
2013-07-26 19:15:11 -07:00
Xinghao
b0bbc7f6a8
Resolve conflicts with master, removed regParam for LogisticRegression
2013-07-26 18:57:39 -07:00
Xinghao
071afe2a33
New files from merge with master
2013-07-26 18:21:20 -07:00
Xinghao
10fd3949e6
Making ClassificationModel serializable
2013-07-26 17:49:11 -07:00
Xinghao
f0a1f95228
Rename LogisticRegression, SVM and Lasso to *_LocalRandomSGD
2013-07-26 17:36:14 -07:00
Xinghao
f74a03c6d8
Multiple changes
...
- Changed LogisticRegression regularization parameter to 0
- Removed println from SVM predict function
- Fixed "Lasso" -> "SVM" in SVMGenerator
- Added comment in Updater.scala to indicate L1 regularization leads to
soft thresholding proximal function
2013-07-26 17:29:44 -07:00
Karen Feng
bd4cc52e30
Made metrics Option instead of Some, fixed NullPointerException
2013-07-26 17:23:18 -07:00
Reynold Xin
f3d72ff2fe
Merge pull request #739 from markhamstra/toolsPom
...
Missing tools/pom.xml scalatest dependency
2013-07-26 17:19:27 -07:00
Reynold Xin
cb366774c8
Merge pull request #738 from harsha2010/pruning
...
Fix bug in Partition Pruning.
2013-07-26 16:59:30 -07:00
Mark Hamstra
3fc6408903
Added missing scalatest dependency
2013-07-26 16:10:20 -07:00
harshars
392d7474fd
Code review
2013-07-26 15:23:15 -07:00
harshars
72cf7ec0e5
Indentation
2013-07-26 15:16:41 -07:00