Commit graph

3337 commits

Author SHA1 Message Date
shivaram 744da8eefd Merge pull request #679 from ryanlecompte/master
Make binSearch method tail-recursive for RidgeRegression
2013-07-07 17:42:25 -07:00
ryanlecompte be123aa6ef update to use ListBuffer, faster than Vector for append operations 2013-07-07 15:35:06 -07:00
Shivaram Venkataraman 4af0d63cb1 Remove akka LogLevel fix as we no longer use spray 2013-07-07 10:42:43 -07:00
Shivaram Venkataraman d362d0f411 Ignore stderr when calling cat on a non-existing file 2013-07-07 04:09:46 -07:00
Shivaram Venkataraman 3350ad0d7f Catch RejectedExecution exception in Checkpoint handler. 2013-07-07 04:09:37 -07:00
Shivaram Venkataraman 7d6d9e6ab2 Set DriverSuite log level to WARN 2013-07-07 04:09:15 -07:00
Shivaram Venkataraman a948f06725 Suppress log messages in sbt test with two changes:
1. Set akka log level to ERROR before shutting down the actorSystem.
This avoids akka log messages (like Spray) from falling back to INFO
on the Stdout logger
2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that
stack trace thrown during shutdown is handled by SLF4J instead of stdout
2013-07-07 04:09:08 -07:00
Matei Zaharia 3cc6818f13 Merge pull request #668 from shimingfei/guava-14.0.1
update guava version from 11.0.1 to 14.0.1
2013-07-06 19:51:20 -07:00
ryanlecompte f78f8d0b41 fix formatting and use Vector instead of List to maintain order 2013-07-06 16:46:53 -07:00
Matei Zaharia ebe1efc862 Merge remote-tracking branch 'pwendell/ui-updates' 2013-07-06 16:46:15 -07:00
Matei Zaharia fd6665122b Fix some other references to Cloudera Avro and updated Avro version 2013-07-06 16:45:15 -07:00
Patrick Wendell 32b9d21a97 Fix occasional failure in UI listener.
If a task fails before the metrics are initialized, it remains possible
that the metrics field will be `None`. This patch accounts for that possibility
by keeping metrics as an `Option` at all times.
2013-07-06 16:40:02 -07:00
Matei Zaharia 22161887ee Merge pull request #676 from c0s/asf-avro
Use standard ASF published avro module instead of a proprietary built one
2013-07-06 16:18:15 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 94871e4703 Merge pull request #655 from tgravescs/master
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Matei Zaharia 3f918b33f8 Merge pull request #672 from holdenk/master
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-06 12:45:18 -07:00
Matei Zaharia 2a36e5449b Merge pull request #673 from xiajunluan/master
Add config template file for fair scheduler feature
2013-07-06 12:43:21 -07:00
Matei Zaharia 7ba7fa110b Merge pull request #674 from liancheng/master
Bug fix: SPARK-789
2013-07-06 11:45:08 -07:00
Matei Zaharia f4416a1d7e Merge pull request #681 from BlackNiuza/memory_leak
Remove active job from idToActiveJob when job finished or aborted
2013-07-06 11:41:58 -07:00
BlackNiuza 44a2440039 Remove active job from idToActiveJob when job finished or aborted 2013-07-07 01:33:09 +08:00
Patrick Wendell 37abe84212 Tracking some task metrics even during failures. 2013-07-06 09:19:59 -07:00
Matei Zaharia e063e29af8 Merge pull request #680 from tdas/master
Fixed major performance bug in Network Receiver
2013-07-05 21:54:52 -07:00
Tathagata Das 280418ac45 Reduced the number of Iterator to ArrayBuffer copies in NetworkReceiver. 2013-07-05 21:38:21 -07:00
ryanlecompte 757e56dfc7 make binSearch a tail-recursive method 2013-07-05 19:54:28 -07:00
shivaram bf1311e6d2 Merge pull request #678 from mateiz/ml-examples
Start of ML package
2013-07-05 17:32:44 -07:00
Matei Zaharia 8bbe907556 Replaced string constants in test 2013-07-05 17:25:23 -07:00
Patrick Wendell 84b7fc54e6 Enforcing correct sort order for formatted strings 2013-07-05 17:21:08 -07:00
Matei Zaharia 653043beb6 Renamed files to match package 2013-07-05 17:18:55 -07:00
Matei Zaharia de67deeaab Addressed style comments from Ryan LeCompte 2013-07-05 17:16:49 -07:00
Matei Zaharia 43b24635ee Renamed ML package to MLlib and added it to classpath 2013-07-05 11:38:53 -07:00
Matei Zaharia 399bd65ef5 Fixed compile error due to merge 2013-07-05 11:27:06 -07:00
Shivaram Venkataraman 0e33c88cbd Rename package gradient to optimization 2013-07-05 11:15:19 -07:00
Shivaram Venkataraman 09f187a400 Add top-level methods for regression methods.
Also add multiple versions of them to make it easier to call them from Java.
2013-07-05 11:15:19 -07:00
Matei Zaharia 9441d3ef09 Use random seeds for K-means and ALS, and increase tolerance in tests
Random seeds make more sense by default for a machine learning library
because other libraries behave the same way (people expect to be able to
run the algorithm multiple times and get a better answer), but we can
add configuration later if needed. Tests that depend on specific seed
choices seem brittle.
2013-07-05 11:15:19 -07:00
Matei Zaharia e7d49388e3 Added unit test for K-means, and fixed some bugs 2013-07-05 11:15:19 -07:00
Matei Zaharia 652ea0f1d8 Allow RDD.takeSample to give samples bigger than the RDD
Before, when withReplacement was set to true, we would not get a sample
bigger than the RDD's count().

Conflicts:
	core/src/main/scala/spark/RDD.scala
	core/src/test/scala/spark/RDDSuite.scala
2013-07-05 11:15:13 -07:00
Matei Zaharia cffe3340c5 Fix logistic regression test failure and test suite cleanup 2013-07-05 11:13:46 -07:00
Shivaram Venkataraman 496c7548bb Change test to use fewer iterations 2013-07-05 11:13:46 -07:00
Matei Zaharia 52f491125e Implementation of k-means and k-means|| 2013-07-05 11:13:46 -07:00
Matei Zaharia 39684eafe3 Formatting 2013-07-05 11:13:46 -07:00
Matei Zaharia 6586c5e28b Added a SparkContext accessor to RDD 2013-07-05 11:13:46 -07:00
Matei Zaharia 43dae967d7 Renamed "als" package to "recommendation" 2013-07-05 11:13:46 -07:00
Matei Zaharia d3ce898b8e Scaffolding and model for K-means 2013-07-05 11:13:46 -07:00
Matei Zaharia 3c046a6eca Some small fixes to ALS. 2013-07-05 11:13:46 -07:00
Matei Zaharia 6f0ebb2db2 Remove unused import 2013-07-05 11:13:46 -07:00
Matei Zaharia d903b3887f Initial implementation of Alternating Least Squares.
Includes unit tests and sample data to run on.
2013-07-05 11:13:46 -07:00
Matei Zaharia 05be233ce2 Removed dependency on Apache Commons Math 2013-07-05 11:13:46 -07:00
Shivaram Venkataraman 39ed41652b Move to regression, util and gradient packages 2013-07-05 11:13:46 -07:00
Shivaram Venkataraman 43b398db6a Fix logistic regression to not center data.
Also add a feature to get the intercept correct and test these
using a small unit test.
2013-07-05 11:13:45 -07:00
Shivaram Venkataraman 6dd3a816c8 Use a private constructor instead of private vars 2013-07-05 11:13:45 -07:00