6387 commits

Author SHA1 Message Date
Prashant Sharma 355a033893 SPARK-1005 Ning upgrade 2014-01-06 14:38:27 +05:30
Xusen Yin 05e6d5b454 Added GradientDescentSuite 2014-01-06 16:54:00 +08:00
Tathagata Das ac1f4b06c1 Added a hashmap to cache file mod times. 2014-01-05 23:42:53 -08:00
Holden Karau 2dc83de72e CR feedback (sbt -> sbt/sbt and correct JAR path in script) :) 2014-01-05 23:29:26 -08:00
Patrick Wendell a2e7e04974 Merge pull request #333 from pwendell/logging-silence
Quiet ERROR-level Akka Logs

This fixes an issue I've seen where akka logs a bunch of things at ERROR level when connecting to a standalone cluster, even in the normal case. I noticed that even when lifecycle logging was disabled, the netty code inside of akka still logged away via akka's EndpointWriter class. There are also some other log streams that I think are new in akka 2.2.1 that I've disabled.

Finally, I added some better logging to the standalone client. This makes it clearer what is going on when a connection failure occurs. Previously it never explicitly said whether a connection attempt had failed.

The commit messages here have some more detail.
2014-01-05 22:37:36 -08:00
Holden Karau 7d0094bb56 Finish documentation changes 2014-01-05 22:12:47 -08:00
Holden Karau 5a598b2d7b Fix indentation 2014-01-05 22:07:32 -08:00
Holden Karau d86dc74d79 Code review feedback 2014-01-05 22:05:30 -08:00
Patrick Wendell 675d7eb4f0 Responding to Aaron's review 2014-01-05 21:23:14 -08:00
Xusen Yin a72107284a fix logistic loss bug 2014-01-06 12:30:17 +08:00
Lian, Cheng eb24684748 Fixed test suite compilation errors 2014-01-06 11:26:59 +08:00
Reynold Xin 5b0986a1d6 Merge pull request #334 from pwendell/examples-fix
Removing SPARK_EXAMPLES_JAR in the code

This rewrites all of the examples to use the `SparkContext.jarOfClass` mechanism for loading the examples jar. This is necessary for environments like YARN and the Standalone mode where example programs will be submitted from inside the cluster rather than at the client using `./spark-example`.

This still leaves SPARK_EXAMPLES_JAR in place in the shell scripts for setting up the classpath if `./spark-example` is run.
2014-01-05 19:25:09 -08:00
Lian, Cheng 5c152e3e21 Fixed several compilation errors in test suites 2014-01-06 10:39:05 +08:00
Tathagata Das 2394794591 Merge branch 'filestream-fix' into driver-test
Conflicts:
	streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
2014-01-06 02:23:53 +00:00
Tathagata Das 8e88db3ca5 Bug fixes to the DriverRunner and minor changes here and there. 2014-01-06 02:21:56 +00:00
Reza Zadeh 746148bc18 fix docs to use SparseMatrix 2014-01-05 18:03:57 -08:00
Lian, Cheng a4048ff31e Get rid of `Either[ActorRef, ActorSelection]`
Although we can send messages via an ActorSelection, it would be better to identify the actor and obtain an ActorRef first, so that we can get informed earlier if the remote actor doesn't exist, and get rid of the annoying Either wrapper.
2014-01-06 09:18:17 +08:00
Reynold Xin f4b924f662 Merge pull request #335 from rxin/ser
Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf.

This maintains backward compatibility with older serializers implemented by users.
2014-01-05 17:11:47 -08:00
Reynold Xin 63f906322d Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf.
This maintains backward compatibility with older serializers implemented by users.
2014-01-05 15:52:43 -08:00
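The constructor fallback described above can be sketched as follows. This is a minimal illustration in Python rather than Spark's own Scala code; `Conf`, `NewSerializer`, `LegacySerializer`, and `instantiate_serializer` are hypothetical names invented for the example, and the real change may differ in detail.

```python
import inspect


class Conf:
    """Hypothetical stand-in for a configuration object such as SparkConf."""

    def __init__(self, settings=None):
        self.settings = dict(settings or {})


class NewSerializer:
    """Hypothetical serializer whose constructor accepts a conf object."""

    def __init__(self, conf):
        self.conf = conf


class LegacySerializer:
    """Hypothetical older, user-written serializer with a zero-arg constructor."""

    def __init__(self):
        self.conf = None


def accepts_one_argument(cls):
    """Return True if cls can be constructed with one positional argument."""
    try:
        inspect.signature(cls).bind(object())
    except TypeError:
        return False
    return True


def instantiate_serializer(cls, conf):
    """Prefer the conf-accepting constructor; fall back to the zero-arg one."""
    if accepts_one_argument(cls):
        return cls(conf)
    return cls()
```

Inspecting the signature first, instead of calling `cls(conf)` and catching `TypeError`, avoids hiding a `TypeError` raised inside a constructor that does accept the argument.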
Patrick Wendell 94fdcda896 Provide logging when attempts to connect to the master fail.
Without these it's a bit less clear what's going on for the user.

One thing I realized when doing this is that akka itself actually retries
the initial association. So the retry we currently have is redundant with
akka's.
2014-01-05 15:16:01 -08:00
Patrick Wendell aaaa673184 Quiet akka when remote lifecycle logging is disabled.
I noticed when connecting to a standalone cluster Spark gives a bunch
of Akka ERROR logs that make it seem like something is failing.

This patch does two things:

1. Akka dead letter logging is turned on/off according to the existing
   lifecycle spark property.
2. We explicitly silence akka's EndpointWriter log in log4j. This is necessary
   because for some reason that log doesn't pick up on the lifecycle
   logging settings. After a few hours of debugging this was the only solution
   I found that worked.
2014-01-05 15:15:59 -08:00
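The second step, silencing one named logger regardless of other logging settings, can be illustrated with Python's standard `logging` module. This is an analogy only, not the log4j configuration used in the patch; the logger name `"EndpointWriter"` and the helper `silence_logger` are assumptions made for the sketch.

```python
import logging


def silence_logger(name):
    """Suppress all records from the named logger, including ERROR and CRITICAL."""
    logger = logging.getLogger(name)
    # A level above CRITICAL means no standard-level record is enabled.
    logger.setLevel(logging.CRITICAL + 1)
    # Do not pass anything up to handlers on parent loggers either.
    logger.propagate = False
    return logger


# Hypothetical logger name standing in for the noisy component.
noisy = silence_logger("EndpointWriter")
noisy.error("this message is dropped")
```

Other loggers keep their normal behavior, which is the point of targeting a single name instead of raising the global level.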
Patrick Wendell 79f52809c8 Removing SPARK_EXAMPLES_JAR in the code 2014-01-05 11:49:42 -08:00
Holden Karau df92f1c025 reindent 2014-01-04 21:48:35 -08:00
Holden Karau d7d95a099f And update docs to match 2014-01-04 21:45:22 -08:00
Holden Karau 0d6700eb5a Make sbt in the sbt directory 2014-01-04 21:44:26 -08:00
Holden Karau d2a5c75a4d Spelling 2014-01-04 21:44:04 -08:00
Holden Karau b4a1ffc6c2 Switch from sbt to ./sbt in the README file 2014-01-04 20:17:30 -08:00
Holden Karau 97123be1d7 Pass commands down to system sbt as well 2014-01-04 20:16:56 -08:00
Holden Karau 9e9a913c2f Add a script to download sbt if not present on the system 2014-01-04 20:08:35 -08:00
Reynold Xin d43ad3ef2c Merge pull request #292 from soulmachine/naive-bayes
standard Naive Bayes classifier

Implements the standard Naive Bayes classifier. This is an updated version of #288, which was closed because of a mistake.
2014-01-04 16:29:30 -08:00
Hossein Falaki 8d0c2f7399 Added python binding for bulk recommendation 2014-01-04 16:23:17 -08:00
Dan Crankshaw 86404dac74 Merge pull request #127 from jegonzal/MapByPartition
Adding mapEdges and mapTriplets by Partition
2014-01-04 14:55:54 -08:00
Reza Zadeh 06c0f7628a use SparseMatrix everywhere 2014-01-04 14:28:07 -08:00
Joey e68cdb1b82 Merge pull request #124 from jianpingjwang/master
refactor and bug fix
2014-01-04 13:46:02 -08:00
Joseph E. Gonzalez 6592be2594 slightly more efficient map operation 2014-01-04 13:13:20 -08:00
Joseph E. Gonzalez 0b3efbcf62 Adding partition level mapEdges and mapTriplets. This is necessary to support computation with random number generation. 2014-01-04 13:13:20 -08:00
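Why a partition-level map helps with random number generation can be sketched as follows: the user function runs once per partition, so it can create a single generator and reuse it for every edge in that partition. The sketch is plain Python, not the GraphX API; `map_edges_by_partition` and `add_noise` are hypothetical names, and edges are reduced to bare weights for brevity.

```python
import random


def map_edges_by_partition(partitions, fn):
    """Call fn(partition_id, edge_iterator) once per partition and collect results."""
    return [list(fn(pid, iter(edges))) for pid, edges in enumerate(partitions)]


def add_noise(pid, edges):
    """Add random noise to each edge weight using one generator per partition."""
    rng = random.Random(pid)  # created once per partition, seeded by its id
    for weight in edges:
        yield weight + rng.random()


# Two partitions of edge weights (illustrative data only).
partitions = [[1.0, 2.0], [3.0]]
noisy = map_edges_by_partition(partitions, add_noise)
```

With a per-edge map there is no natural place to set up the generator once, so each call would either reseed it or share hidden state across partitions.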
Joey 280ddf64bd Merge pull request #121 from ankurdave/more-simplify
Simplify GraphImpl internals further
2014-01-04 12:54:41 -08:00
Reza Zadeh cdff9fc858 prettify 2014-01-04 12:44:04 -08:00
Reza Zadeh e9bd6cb51d new example file 2014-01-04 12:33:22 -08:00
Reza Zadeh 8bfcce1ad8 fix tests 2014-01-04 11:52:42 -08:00
Reza Zadeh 35adc72794 set methods 2014-01-04 11:30:36 -08:00
Andrew Or 4de9c9554c Use AtomicInteger for numRunningTasks 2014-01-04 11:16:30 -08:00
Thomas Graves ad35c1a5f2 Fix handling of empty SPARK_EXAMPLES_JAR 2014-01-04 11:42:17 -06:00
Tathagata Das 3d4474330d Removed the exponential backoff for testing. 2014-01-04 08:39:00 -08:00
Reza Zadeh 73daa700bd add k parameter 2014-01-04 01:52:28 -08:00
Andrew Or 2db7884f6f Address Mark's comments 2014-01-04 01:20:09 -08:00
Reza Zadeh 26a74f0c41 using decomposed matrix struct now 2014-01-04 00:38:53 -08:00
Reza Zadeh d2d5e5e062 new return struct 2014-01-04 00:15:04 -08:00
Andrew Or 4296d96c82 Assign spill threshold as a fraction of maximum memory
Further, divide this threshold by the number of tasks running concurrently.

Note that this does not guard against the following scenario: a new task
quickly fills up its share of the memory before old tasks finish spilling
their contents, in which case the total memory used by such maps may exceed
what was specified. Currently, spark.shuffle.safetyFraction mitigates the
effect of this.
2014-01-04 00:00:57 -08:00
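The arithmetic described in this commit can be sketched as below. It is a rough illustration under stated assumptions, not Spark's implementation: the function name, the `memory_fraction` parameter, and the way the safety fraction enters the product are all hypothetical; only the idea of "a fraction of maximum memory, divided by the number of concurrently running tasks" comes from the commit message.

```python
def spill_threshold(max_memory_bytes, memory_fraction, safety_fraction, num_running_tasks):
    """Return the per-task memory budget (in bytes) before a map should spill.

    memory_fraction: assumed share of maximum memory reserved for these maps.
    safety_fraction: assumed extra margin, in the spirit of
        spark.shuffle.safetyFraction mentioned in the commit message.
    """
    if num_running_tasks < 1:
        raise ValueError("num_running_tasks must be at least 1")
    shared_budget = max_memory_bytes * memory_fraction * safety_fraction
    # Each concurrently running task gets an equal slice of the shared budget.
    return int(shared_budget / num_running_tasks)
```

For example, with 1,000,000,000 bytes of maximum memory, both fractions set to 0.5, and 4 running tasks, each task may use 62,500,000 bytes before spilling. As the commit message notes, this does not stop a newly started task from filling its slice before older tasks finish spilling.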
Patrick Wendell 10fe23bc34 Merge pull request #329 from pwendell/remove-binaries
SPARK-1002: Remove Binaries from Spark Source

This adds a few changes on top of the work by @scrapcodes.
2014-01-03 23:50:14 -08:00