Commit graph

5199 commits

Author SHA1 Message Date
Patrick Wendell e688e11206 Add log4j exclusion rule to maven.
To make this work I had to rename the defaults file. Otherwise
maven's pattern matching rules included it when trying to match
other log4j.properties files.

I also fixed a bug in the existing maven build where two
<transformers> tags were present in assembly/pom.xml
such that one overwrote the other.
2014-01-07 12:56:24 -08:00
Reynold Xin 15d9534501 Merge pull request #318 from srowen/master
Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance

Sorry if this is too abrupt or not a welcome set of changes, but thought I'd see if I could contribute a little. I'm a Java developer and just getting seriously into Spark. So I thought I'd suggest a number of small changes to the couple of Java parts of the code to make it a little tighter, more standard and even a bit faster.

Feel free to take all, some or none of this. Happy to explain any of it.
2014-01-07 08:10:02 -08:00
Reynold Xin 468af0fa03 Merge pull request #348 from prabeesh/master
spark -> org.apache.spark

Changed package name spark to org.apache.spark which was missing in some of the files
2014-01-07 08:09:01 -08:00
Sean Owen 4b92a20232 Issue #318 : minor style updates per review from Reynold Xin 2014-01-07 09:38:45 +00:00
Patrick Wendell c3cf0475e8 Merge pull request #339 from ScrapCodes/conf-improvements
Conf improvements

There are two new features.

1. Allow users to set arbitrary akka configurations via spark conf.

2. Allow configuration to be printed in logs for diagnosis.
2014-01-07 00:54:25 -08:00
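
The two features listed above could be sketched as follows. This is a minimal Python illustration, not the code from the pull request: it assumes that a setting is forwarded to akka when its key starts with `akka.`, and the names `akka_settings` and `format_conf` as well as the sample keys are hypothetical.

```python
def akka_settings(conf):
    """Return the entries of a flat key/value config whose keys start with 'akka.'.

    The 'akka.' prefix rule is an assumption made for this sketch.
    """
    return {k: v for k, v in conf.items() if k.startswith("akka.")}


def format_conf(conf):
    """Render the configuration as sorted 'key=value' lines, suitable for a log."""
    return "\n".join(f"{k}={v}" for k, v in sorted(conf.items()))


# Hypothetical sample configuration.
conf = {
    "spark.app.name": "demo",
    "akka.loglevel": "DEBUG",
    "akka.actor.debug.receive": "on",
}

forwarded = akka_settings(conf)   # only the two akka.* entries
dump = format_conf(conf)          # one line per setting, for diagnosis
```

Printing the full, sorted configuration once at startup makes it easy to compare two runs line by line.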
Reynold Xin a862cafacf Merge pull request #331 from holdenk/master
Add a script to download sbt if not present on the system

As per the discussion on the dev mailing list, this script will use the system sbt if present or otherwise attempt to install the sbt launcher. If that fails, the fallback error message instructs the user to install sbt. While the URLs it fetches from aren't controlled by the Spark project directly, they are stable and the current authoritative sources.
2014-01-07 00:18:20 -08:00
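
The selection logic described above (prefer the system sbt, otherwise fall back to a local launcher) might look roughly like this. It is a Python sketch rather than the shell script from the pull request; the launcher path `sbt/sbt-launch.jar` and the function name `choose_sbt` are assumptions, and the download step itself is left out.

```python
import shutil


def choose_sbt(launcher_path="sbt/sbt-launch.jar", which=shutil.which):
    """Return the command prefix to run sbt with.

    Uses the sbt found on PATH when there is one; otherwise falls back to a
    launcher JAR at launcher_path (hypothetical location, assumed to have
    been downloaded already).
    """
    system_sbt = which("sbt")
    if system_sbt is not None:
        return [system_sbt]
    return ["java", "-jar", launcher_path]
```

Passing `which` as a parameter keeps the lookup easy to replace in tests, for example `choose_sbt(which=lambda name: None)` to exercise the fallback branch.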
Holden Karau 60a7a6b31a Use awk to extract the version 2014-01-06 23:45:27 -08:00
Prashant Sharma c729fa7c8e formatting related fixes suggested by Patrick. 2014-01-07 13:08:16 +05:30
Prashant Sharma b84dc780d3 Allow configuration to be printed in logs for diagnosis. 2014-01-07 13:01:43 +05:30
Prashant Sharma b3018811e1 Allow users to set arbitrary akka configurations via spark conf. 2014-01-07 13:01:43 +05:30
Holden Karau b590adb2ad Put quotes around arguments passed down to system sbt 2014-01-06 23:31:39 -08:00
prabeesh a91f14cfdc spark -> org.apache.spark 2014-01-07 12:21:20 +05:30
Patrick Wendell b97ef218f3 Merge pull request #346 from sproblvem/patch-1
Update stop-slaves.sh

The most recent version changed the directory structure, but the script "sbin/stop-all.sh" was not changed accordingly. As a result, "sbin/stop-all.sh" can't stop the slave nodes.
2014-01-06 20:12:57 -08:00
sproblvem dea4ba9d80 Update stop-slaves.sh
The most recent version changed the directory structure, but the script "sbin/stop-all.sh" was not changed accordingly. As a result, "sbin/stop-all.sh" can't stop the slave nodes.
2014-01-07 11:11:59 +08:00
Patrick Wendell e4d6057b66 Merge pull request #343 from pwendell/build-fix
Fix test breaking downstream builds

This wasn't detected in the pull-request-builder because it manually sets SPARK_HOME. I'm going to change that (it shouldn't do this) to make it like the other builds.
2014-01-06 14:56:54 -08:00
Patrick Wendell 9272a004af Fix test breaking downstream builds 2014-01-06 13:03:19 -08:00
Patrick Wendell 93bf96205d Merge pull request #340 from ScrapCodes/sbt-fixes
Made Java options apply during tests so that they are self-explanatory.
2014-01-06 11:42:41 -08:00
Patrick Wendell 60edeb3d65 Merge pull request #338 from ScrapCodes/ning-upgrade
SPARK-1005 Ning upgrade
2014-01-06 11:40:32 -08:00
Patrick Wendell c708e81793 Merge pull request #341 from ash211/patch-5
Clarify spark.cores.max in docs

It controls the count of cores across the cluster, not on a per-machine basis.
2014-01-06 11:35:48 -08:00
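
A small worked example of the distinction: the setting caps the total number of cores granted across all workers, not the number taken from each machine. The Python sketch below is only an illustration of that arithmetic; the function `grant_cores` and its fill-in-order policy are assumptions, and the real scheduler may distribute cores differently while respecting the same total.

```python
def grant_cores(free_cores_per_worker, cores_max):
    """Grant cores worker by worker until the cluster-wide cap is reached."""
    remaining = cores_max
    granted = []
    for free in free_cores_per_worker:
        take = min(free, remaining)  # never exceed what is left of the cap
        granted.append(take)
        remaining -= take
    return granted


# Three workers with 8 free cores each and a cluster-wide cap of 10.
granted = grant_cores([8, 8, 8], cores_max=10)
total = sum(granted)  # bounded by the cap, not by 8 per machine
```

With a per-machine reading, the same value would have allowed up to 24 cores on this cluster; with the cluster-wide reading the total never exceeds 10.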
Patrick Wendell 33fcb91e81 Merge pull request #342 from tgravescs/fix_maven_protobuf
Change protobuf version for yarn alpha back to 2.4.1

The maven build for yarn-alpha uses the wrong protobuf version and hence the generated assembly jar doesn't work with Hadoop 0.23.  Removing the setting for the yarn-alpha profile since the default protobuf version is 2.4.1 at the top of the pom file.
2014-01-06 11:19:23 -08:00
Patrick Wendell 357083c29f Merge pull request #330 from tgravescs/fix_addjars_null_handling
Fix handling of empty SPARK_EXAMPLES_JAR

Currently, if SPARK_EXAMPLES_JAR is left unset you get a null pointer exception when running the examples (at least on Spark on YARN). The null now gets turned into the string "null" when it's put into the SparkConf, so addJar no longer properly ignores it. This fixes that so that it can be left unset.
2014-01-06 10:29:04 -08:00
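
The failure mode described above can be reproduced in miniature. The following is a Python analogue, not the Spark code: Python stringifies a missing value to "None" where the JVM produces "null", and the helper names `add_jar` and `conf_value` are hypothetical.

```python
def add_jar(jars, path):
    """Append path to jars, ignoring unset (None) or empty values."""
    if not path:
        return
    jars.append(path)


def conf_value(value):
    """Store a setting as a string, but keep a missing value as None."""
    return None if value is None else str(value)


unset = None  # stands in for an environment variable that was never set

jars = []
add_jar(jars, unset)               # ignored, as intended
add_jar(jars, str(unset))          # bug: the string "None" is treated as a path

fixed = []
add_jar(fixed, conf_value(unset))  # ignored again: the missing value survives
```

The point is that the emptiness check has to run before the value is converted to a string, or the conversion has to preserve the missing value.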
Thomas Graves 1f7c090e4b Change protobuf version for yarn alpha back to 2.4.1 2014-01-06 12:04:22 -06:00
Andrew Ash 2dd4fb5698 Clarify spark.cores.max
It controls the count of cores across the cluster, not on a per-machine basis.
2014-01-06 09:01:46 -08:00
Sean Owen 7379b2915f Merge remote-tracking branch 'upstream/master' 2014-01-06 15:13:16 +00:00
Thomas Graves 25446dd931 Add warning to null setJars check 2014-01-06 07:58:59 -06:00
Prashant Sharma 2d0825e9f4 Made Java options apply during tests so that they are self-explanatory. 2014-01-06 16:03:31 +05:30
Prashant Sharma 355a033893 SPARK-1005 Ning upgrade 2014-01-06 14:38:27 +05:30
Holden Karau 2dc83de72e CR feedback (sbt -> sbt/sbt and correct JAR path in script) :) 2014-01-05 23:29:26 -08:00
Patrick Wendell a2e7e04974 Merge pull request #333 from pwendell/logging-silence
Quiet ERROR-level Akka Logs

This fixes an issue I've seen where akka logs a bunch of things at ERROR level when connecting to a standalone cluster, even in the normal case. I noticed that even when lifecycle logging was disabled, the netty code inside of akka still logged away via akka's EndpointWriter class. There are also some other log streams that I think are new in akka 2.2.1 that I've disabled.

Finally, I added some better logging to the standalone client. This makes it more clear when a connection failure occurs what is going on. Previously it never explicitly said if a connection attempt had failed.

The commit messages here have some more detail.
2014-01-05 22:37:36 -08:00
Holden Karau 7d0094bb56 Finish documentation changes 2014-01-05 22:12:47 -08:00
Holden Karau 5a598b2d7b Fix indentation 2014-01-05 22:07:32 -08:00
Holden Karau d86dc74d79 Code review feedback 2014-01-05 22:05:30 -08:00
Patrick Wendell 675d7eb4f0 Responding to Aaron's review 2014-01-05 21:23:14 -08:00
Reynold Xin 5b0986a1d6 Merge pull request #334 from pwendell/examples-fix
Removing SPARK_EXAMPLES_JAR in the code

This rewrites all of the examples to use the `SparkContext.jarOfClass` mechanism for loading the examples jar. This is necessary for environments like YARN and the Standalone mode, where example programs will be submitted from inside the cluster rather than at the client using `./spark-example`.

This still leaves SPARK_EXAMPLES_JAR in place in the shell scripts for setting up the classpath if `./spark-example` is run.
2014-01-05 19:25:09 -08:00
Reynold Xin f4b924f662 Merge pull request #335 from rxin/ser
Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf.

This maintains backward compatibility with older serializers implemented by users.
2014-01-05 17:11:47 -08:00
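
The fallback described above can be sketched as follows. This is a Python analogue of the idea, not the Scala reflection code from the patch: it inspects the constructor signature, passes the configuration when the constructor accepts an argument, and otherwise calls the zero-argument constructor. The names `instantiate`, `NewSerializer`, and `LegacySerializer` are hypothetical.

```python
import inspect


def instantiate(cls, conf):
    """Build cls(conf) if its constructor takes an argument, else cls()."""
    params = inspect.signature(cls).parameters  # excludes 'self'
    if len(params) >= 1:
        return cls(conf)
    # Older, user-written classes: fall back to the zero-arg constructor.
    return cls()


class NewSerializer:
    """Serializer written against the newer, conf-accepting constructor."""

    def __init__(self, conf):
        self.conf = conf


class LegacySerializer:
    """Serializer written before the constructor took a configuration."""

    def __init__(self):
        self.conf = None


new = instantiate(NewSerializer, {"buffer.mb": "2"})
old = instantiate(LegacySerializer, {"buffer.mb": "2"})
```

Checking the signature first, rather than catching an error from a failed call, avoids masking exceptions raised inside the constructor itself.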
Reynold Xin 63f906322d Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf.
This maintains backward compatibility with older serializers implemented by users.
2014-01-05 15:52:43 -08:00
Patrick Wendell 94fdcda896 Provide logging when attempts to connect to the master fail.
Without these it's a bit less clear what's going on for the user.

One thing I realized when doing this is that akka itself actually retries
the initial association. So the retry we currently have is redundant with
akka's.
2014-01-05 15:16:01 -08:00
Patrick Wendell aaaa673184 Quiet akka when remote lifecycle logging is disabled.
I noticed when connecting to a standalone cluster Spark gives a bunch
of Akka ERROR logs that make it seem like something is failing.

This patch does two things:

1. Akka dead letter logging is turned on/off according to the existing
   lifecycle spark property.
2. We explicitly silence akka's EndpointWriter log in log4j. This is necessary
   because for some reason that log doesn't pick up on the lifecycle
   logging settings. After a few hours of debugging this was the only solution
   I found that worked.
2014-01-05 15:15:59 -08:00
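
The second step above, explicitly silencing one logger that ignores the general setting, can be illustrated with Python's standard `logging` module. This is an analogue only: the patch works with log4j, and both the logger name `akka.remote.EndpointWriter` and the helper `quiet_logger` are assumptions made for this sketch.

```python
import logging


def quiet_logger(name, lifecycle_logging_enabled):
    """Raise the named logger's threshold unless lifecycle logging is enabled."""
    logger = logging.getLogger(name)
    if not lifecycle_logging_enabled:
        # Only CRITICAL records pass; ERROR-level noise is dropped.
        logger.setLevel(logging.CRITICAL)
    return logger


# Assumed logger name, used here only for illustration.
endpoint_log = quiet_logger("akka.remote.EndpointWriter",
                            lifecycle_logging_enabled=False)
endpoint_log.error("association failed")  # suppressed by the raised threshold
```

Setting the level on the specific logger leaves every other logger at its configured level, which is why the workaround targets a single class.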
Patrick Wendell 79f52809c8 Removing SPARK_EXAMPLES_JAR in the code 2014-01-05 11:49:42 -08:00
Holden Karau df92f1c025 reindent 2014-01-04 21:48:35 -08:00
Holden Karau d7d95a099f And update docs to match 2014-01-04 21:45:22 -08:00
Holden Karau 0d6700eb5a Make sbt in the sbt directory 2014-01-04 21:44:26 -08:00
Holden Karau d2a5c75a4d Spelling 2014-01-04 21:44:04 -08:00
Holden Karau b4a1ffc6c2 Switch from sbt to ./sbt in the README file 2014-01-04 20:17:30 -08:00
Holden Karau 97123be1d7 Pass commands down to system sbt as well 2014-01-04 20:16:56 -08:00
Holden Karau 9e9a913c2f Add a script to download sbt if not present on the system 2014-01-04 20:08:35 -08:00
Reynold Xin d43ad3ef2c Merge pull request #292 from soulmachine/naive-bayes
standard Naive Bayes classifier

Implements the standard Naive Bayes classifier. This is an updated version of #288, which was closed by mistake.
2014-01-04 16:29:30 -08:00
Thomas Graves ad35c1a5f2 Fix handling of empty SPARK_EXAMPLES_JAR 2014-01-04 11:42:17 -06:00
Patrick Wendell 10fe23bc34 Merge pull request #329 from pwendell/remove-binaries
SPARK-1002: Remove Binaries from Spark Source

This adds a few changes on top of the work by @scrapcodes.
2014-01-03 23:50:14 -08:00
Patrick Wendell 604fad9c39 Merge remote-tracking branch 'apache-github/master' into remove-binaries
Conflicts:
	core/src/test/scala/org/apache/spark/DriverSuite.scala
	docs/python-programming-guide.md
2014-01-03 21:29:33 -08:00