ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Ankur Dave	91227566bc	Merge remote-tracking branch 'spark-upstream/master' into HEAD Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala pom.xml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala	2014-01-08 21:19:08 -08:00
Patrick Wendell	bc81ce040d	Merge remote-tracking branch 'apache-github/master' into standalone-driver Conflicts: core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala pom.xml	2014-01-08 00:38:31 -08:00
Patrick Wendell	c0f0155eca	Merge pull request #313 from tdas/project-refactor Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc. At a high level, the changes are as follows. 1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules. 2. To use the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._`. For the Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`. 3. Each external project has its own Scala and Java unit tests. Note that the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in SparkBuild.scala. In streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see the comment inside the pom.xml for more information). 4. Jars of the external projects have been added to the examples project but not to the assembly project. 5. In some files, imports have been rearranged to conform to the Spark coding guidelines.	2014-01-07 22:21:52 -08:00
Patrick Wendell	e21a707a13	Adding unit tests and some refactoring to promote testability.	2014-01-07 15:39:47 -08:00
Reynold Xin	a862cafacf	Merge pull request #331 from holdenk/master Add a script to download sbt if not present on the system As per the discussion on the dev mailing list this script will use the system sbt if present or otherwise attempt to install the sbt launcher. The fall back error message in the event it fails instructs the user to install sbt. While the URLs it fetches from aren't controlled by the spark project directly, they are stable and the current authoritative sources.	2014-01-07 00:18:20 -08:00
Holden Karau	60a7a6b31a	Use awk to extract the version	2014-01-06 23:45:27 -08:00
Patrick Wendell	93bf96205d	Merge pull request #340 from ScrapCodes/sbt-fixes Made java options to be applied during tests so that they become self explanatory.	2014-01-06 11:42:41 -08:00
Tathagata Das	3b4c4c7f4d	Merge remote-tracking branch 'apache/master' into project-refactor Conflicts: examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala	2014-01-06 03:05:52 -08:00
Prashant Sharma	2d0825e9f4	Made java options to be applied during tests so that they become self explanatory.	2014-01-06 16:03:31 +05:30
Prashant Sharma	355a033893	SPARK-1005 Ning upgrade	2014-01-06 14:38:27 +05:30
Holden Karau	2dc83de72e	CR feedback (sbt -> sbt/sbt and correct JAR path in script) :)	2014-01-05 23:29:26 -08:00
Holden Karau	9e9a913c2f	Add a script to download sbt if not present on the system	2014-01-04 20:08:35 -08:00
Patrick Wendell	604fad9c39	Merge remote-tracking branch 'apache-github/master' into remove-binaries Conflicts: core/src/test/scala/org/apache/spark/DriverSuite.scala docs/python-programming-guide.md	2014-01-03 21:29:33 -08:00
Patrick Wendell	9e6f3bdcda	Changes on top of Prashant's patch. Closes #316	2014-01-03 18:30:17 -08:00
Prashant Sharma	94f2fffa23	fixed review comments	2014-01-03 14:43:37 +05:30
Prashant Sharma	b4bb80002b	Merge branch 'master' into spark-1002-remove-jars	2014-01-03 12:12:04 +05:30
Raymond Liu	ebdfa6bb97	Using name yarn-alpha/yarn instead of yarn-2.0/yarn-2.2	2014-01-03 12:14:38 +08:00
Raymond Liu	a47ebf7228	Add yarn/common/src/test dir in building script	2014-01-03 12:14:38 +08:00
Raymond Liu	d1a6f7aabc	Use unmanaged source dir to include common yarn code	2014-01-03 12:14:37 +08:00
Raymond Liu	3dc379ce5a	Reorganize yarn related codes into sub projects to remove duplicate files.	2014-01-03 12:12:37 +08:00
Prashant Sharma	8821c3a526	Deleted py4j jar and added to assembly dependency	2014-01-02 13:09:46 +05:30
Matei Zaharia	0e5b2adb5c	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: project/SparkBuild.scala	2014-01-01 13:28:54 -05:00
Reynold Xin	8b8e70ebde	Merge pull request #73 from falaki/ApproximateDistinctCount Approximate distinct count Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.	2013-12-31 17:48:24 -08:00
Matei Zaharia	ba9338f104	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2013-12-31 18:23:14 -05:00
Tathagata Das	97630849ff	Added pom.xml for external projects and removed unnecessary dependencies and repositories from other poms and sbt.	2013-12-31 00:28:57 -08:00
Hossein Falaki	b75d7c98bc	Added stream 2.5.1 jar dependency	2013-12-30 19:29:17 -08:00
Tathagata Das	f4e4066191	Refactored kafka, flume, zeromq, mqtt as separate external projects, with their own self-contained scala API, java API, scala unit tests and java unit tests. Updated examples to use the external projects.	2013-12-30 11:13:24 -08:00
Matei Zaharia	b4ceed40d6	Merge remote-tracking branch 'origin/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala	2013-12-29 15:08:08 -05:00
Tathagata Das	6e43039614	Refactored streaming project to separate out the twitter functionality.	2013-12-26 18:02:49 -08:00
Binh Nguyen	040dd3ecd5	upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final	2013-12-24 14:58:18 -08:00
Prashant Sharma	2573add94c	spark-544, introducing SparkConf and related configuration overhaul.	2013-12-25 00:09:36 +05:30
Reynold Xin	fc80b2e693	Show full stack trace and time taken in unit tests.	2013-12-23 21:20:20 -08:00
Aaron Davidson	eaf6a269b1	[SPARK-959] Explicitly depend on org.eclipse.jetty.orbit jar Without this, in some cases, Ivy attempts to download the wrong file and fails, stopping the whole build. See bug for more details. (This is probably also the beginning of the slow death of our recently prettified dependencies. Form follow function.)	2013-12-18 23:37:31 -08:00
Patrick Wendell	c6f95e603e	Attempt with extra repositories	2013-12-16 21:53:51 -08:00
Prashant Sharma	a854cc536d	Review comments on the PR for scala 2.10 migration.	2013-12-13 15:19:51 +05:30
Prashant Sharma	589b83a18f	Disabled yarn 2.2 and added a message in the sbt build	2013-12-12 16:25:30 +05:30
Prashant Sharma	f4c73df5c9	Merge branch 'akka-bug-fix' of github.com:ScrapCodes/incubator-spark into akka-bug-fix	2013-12-11 10:22:44 +05:30
Prashant Sharma	603af51bb5	Merge branch 'master' into akka-bug-fix Conflicts: core/pom.xml core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala pom.xml project/SparkBuild.scala streaming/pom.xml yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala	2013-12-11 10:21:53 +05:30
Prashant Sharma	0b82b5af1e	added eclipse repository for spark streaming.	2013-12-11 08:17:02 +05:30
Harvey Feng	1b6e450771	Use published "org.spark-project.akka-*" in sbt build for Hadoop-2.2 dependencies. This also includes: -Change `isNewYarn` to `isNewHadoop`, since the protobuf-2.5 dependency is from Hadoop-2.2 itself. -Regexp bugfix Credits to @alig for this patch.	2013-12-03 00:28:33 -08:00
Harvey Feng	afe4fe7f5e	Merge remote-tracking branch 'origin/master' into yarn-2.2 Conflicts: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala	2013-11-26 15:03:03 -08:00
Harvey Feng	a1a1c62a3e	Add optional Hadoop 2.2 settings in sbt build. If the Hadoop used is version 2.2 or derived from it, then Spark will be compiled against protobuf-2.5 and a protobuf-2.5 version of Akka 2.0.5.	2013-11-26 14:58:41 -08:00
Prashant Sharma	44fd30d3fb	Merge branch 'master' into scala-2.10-wip Conflicts: core/src/main/scala/org/apache/spark/rdd/RDD.scala project/SparkBuild.scala	2013-11-25 18:10:54 +05:30
Reynold Xin	6bcac986b2	Merge branch 'master' of github.com:apache/incubator-spark	2013-11-25 15:47:47 +08:00
Matei Zaharia	859d62dc2a	Merge pull request #151 from russellcardullo/add-graphite-sink Add graphite sink for metrics This adds a metrics sink for graphite. The sink must be configured with the host and port of a graphite node and optionally may be configured with a prefix that will be prepended to all metrics that are sent to graphite.	2013-11-24 16:19:51 -08:00
Aaron Davidson	ce1d2af7e4	Use Kafka 2.10 (again)	2013-11-14 23:00:39 -08:00
Aaron Davidson	f629ba95b6	Various merge corrections I've diff'd this patch against my own -- since they were both created independently, this means that two sets of eyes have gone over all the merge conflicts that were created, so I'm feeling significantly more confident in the resulting PR. @rxin has looked at the changes to the repl and is resoundingly confident that they are correct.	2013-11-14 22:13:09 -08:00
Raymond Liu	d4cd32330e	Some fixes for previous master merge commits	2013-11-15 10:22:31 +08:00
Raymond Liu	a60620b76a	Merge branch 'master' into scala-2.10	2013-11-14 12:44:19 +08:00
Matei Zaharia	9290e5bcd2	Merge pull request #165 from NathanHowell/kerberos-master spark-assembly.jar fails to authenticate with YARN ResourceManager The META-INF/services/ sbt MergeStrategy was discarding support for Kerberos, among others. This pull request changes to a merge strategy similar to sbt-assembly's default. I've also included an update to sbt-assembly 0.9.2, a minor fix to its zip file handling.	2013-11-13 16:48:44 -08:00
Raymond Liu	0f2e3c6e31	Merge branch 'master' into scala-2.10	2013-11-13 16:55:11 +08:00
Matei Zaharia	f49ea28d25	Merge pull request #137 from tgravescs/sparkYarnJarsHdfsRebase Allow spark on yarn to be run from HDFS. Allows the spark.jar, app.jar, and log4j.properties to be put into hdfs. Allows you to specify the files on a different hdfs cluster and it will copy them over. It makes sure permissions are correct and makes sure to put things into public distributed cache so they can be reused amongst users if their permissions are appropriate. Also add a bit of error handling for missing arguments.	2013-11-12 19:13:39 -08:00
Nathan Howell	48eac0bcbf	Upgrade to sbt-assembly 0.9.2	2013-11-12 13:29:25 -08:00
Nathan Howell	23146a6705	spark-assembly.jar fails to authenticate with YARN ResourceManager sbt-assembly is setup to pick the first META-INF/services/org.apache.hadoop.security.SecurityInfo file instead of merging them. This causes Kerberos authentication to fail, this manifests itself in the "info:null" debug log statement: DEBUG SaslRpcClient: Get token info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null DEBUG SaslRpcClient: Get kerberos info proto:interface org.apache.hadoop.yarn.api.ApplicationClientProtocolPB info:null ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] DEBUG UserGroupInformation: PrivilegedAction as:foo@BAR (auth:KERBEROS) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:583) WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] ERROR UserGroupInformation: PriviledgedActionException as:foo@BAR (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] This previously would just contain a single class: $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo Archive: assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo org.apache.hadoop.security.AnnotatedSecurityInfo And now has the full list of classes: $ unzip -c assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar META-INF/services/org.apache.hadoop.security.SecurityInfo Archive: assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.2.0.jar inflating: META-INF/services/org.apache.hadoop.security.SecurityInfo org.apache.hadoop.security.AnnotatedSecurityInfo org.apache.hadoop.mapreduce.v2.app.MRClientSecurityInfo org.apache.hadoop.mapreduce.v2.security.client.ClientHSSecurityInfo org.apache.hadoop.yarn.security.client.ClientRMSecurityInfo org.apache.hadoop.yarn.security.ContainerManagerSecurityInfo org.apache.hadoop.yarn.security.SchedulerSecurityInfo org.apache.hadoop.yarn.security.admin.AdminSecurityInfo org.apache.hadoop.yarn.server.RMNMSecurityInfoClass	2013-11-12 13:27:50 -08:00
tgravescs	17bb9a27b2	Add mockito to the sbt build	2013-11-11 10:01:23 -06:00
Josh Rosen	a37ff0f1db	Add spark-tools assembly to spark-class classpath. This allows the JavaAPICompletenessChecker to be run with Spark 0.8+.	2013-11-09 13:42:45 -08:00
Russell Cardullo	ef85a51f85	Add graphite sink for metrics This adds a metrics sink for graphite. The sink must be configured with the host and port of a graphite node and optionally may be configured with a prefix that will be prepended to all metrics that are sent to graphite.	2013-11-08 16:36:03 -08:00
Ankur Dave	5064f9b2d2	Merge remote-tracking branch 'spark-upstream/master' Conflicts: project/SparkBuild.scala	2013-10-30 15:59:09 -07:00
Patrick Wendell	af4a529f6e	Exclude jopt from kafka dependency. Kafka uses an older version of jopt that causes bad conflicts with the version used by spark-perf. It's not easy to remove this downstream because of the way that spark-perf uses Spark (by including a spark assembly as an unmanaged jar). This fixes the problem at its source by just never including it.	2013-10-25 09:20:30 -07:00
Prashant Sharma	c77ca1fed9	Updating to latest akka 2.2.3, which fixes our only failing Driver Suite	2013-10-24 16:11:40 +05:30
Matei Zaharia	dadfc63b03	Fix Maven build to use MQTT repository	2013-10-23 15:29:22 -07:00
Matei Zaharia	dd659642e7	Merge pull request #64 from prabeesh/master MQTT Adapter for Spark Streaming MQTT is a machine-to-machine (M2M)/Internet of Things connectivity protocol. It was designed as an extremely lightweight publish/subscribe messaging transport. You may read more about it here: http://mqtt.org/. Message Queue Telemetry Transport (MQTT) is an open message protocol for M2M communications. It enables the transfer of telemetry-style data in the form of messages from devices like sensors and actuators to mobile phones, embedded systems on vehicles, or laptops and full-scale computers. The protocol was invented by Andy Stanford-Clark of IBM and Arlen Nipper of Cirrus Link Solutions. This protocol enables a publish/subscribe messaging model in an extremely lightweight way. It is useful for connections with remote locations where code size and network bandwidth are constraints. MQTT is one of the widely used protocols for the 'Internet of Things'. This protocol is gaining traction as more and more devices get connected to the internet and they all produce data. Researchers and companies predict some 25 billion devices will be connected to the internet by 2015. Plugin/Support for MQTT is available in popular MQs like RabbitMQ, ActiveMQ etc. Support for MQTT in Spark will help people with Internet of Things (IoT) projects to use Spark Streaming for their real-time data processing needs (from sensors and other embedded devices etc).	2013-10-23 15:07:59 -07:00
Matei Zaharia	731c94e91d	Merge pull request #56 from jerryshao/kafka-0.8-dev Upgrade Kafka 0.7.2 to Kafka 0.8.0-beta1 for Spark Streaming Conflicts: streaming/pom.xml	2013-10-21 23:31:38 -07:00
Matei Zaharia	8de9706b86	Merge pull request #66 from shivaram/sbt-assembly-deps Add SBT target to assemble dependencies This pull request is an attempt to address the long assembly build times during development. Instead of rebuilding the assembly jar for every Spark change, this pull request adds a new SBT target `spark` that packages all the Spark modules and builds an assembly of the dependencies. So the work flow that should work now would be something like ``` ./sbt/sbt spark # Doing this once should suffice ## Make changes ./sbt/sbt compile ./sbt/sbt test or ./spark-shell ```	2013-10-18 20:32:39 -07:00
Joseph E. Gonzalez	1856b37e9d	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	2013-10-18 12:21:19 -07:00
prabeesh	29245605bf	remove unused dependency	2013-10-17 09:57:30 +05:30
Shivaram Venkataraman	0a4b76fcc2	Rename SBT target to assemble-deps.	2013-10-16 17:05:46 -07:00
prabeesh	06de3d516d	added mqtt adapter library dependencies	2013-10-16 13:38:37 +05:30
Patrick Wendell	35befe07bb	Fixing spark streaming example and a bug in examples build. - Examples assembly included a log4j.properties which clobbered Spark's - Example had an error where some classes weren't serializable - Did some other clean-up in this example	2013-10-15 22:55:43 -07:00
Shivaram Venkataraman	051cd960d9	Merge branch 'master' of https://github.com/apache/incubator-spark into sbt-assembly-deps	2013-10-15 13:26:40 -07:00
Joseph E. Gonzalez	ef7c369092	merged with upstream changes	2013-10-14 22:56:42 -07:00
jerryshao	c23cd72b4b	Upgrade Kafka 0.7.2 to Kafka 0.8.0-beta1 for Spark Streaming	2013-10-12 20:00:42 +08:00
Shivaram Venkataraman	c441904bce	Add a comment and exclude tools	2013-10-11 18:23:15 -07:00
Matei Zaharia	c71499b779	Merge pull request #19 from aarondav/master-zk Standalone Scheduler fault tolerance using ZooKeeper This patch implements full distributed fault tolerance for standalone scheduler Masters. There is only one master Leader at a time, which is actively serving scheduling requests. If this Leader crashes, another master will eventually be elected, reconstruct the state from the first Master, and continue serving scheduling requests. Leader election is performed using the ZooKeeper leader election pattern. We try to minimize the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of retries and session monitoring on top of the ZooKeeper client. Master failover follows directly from the single-node Master recovery via the file system (patch `d5a96fe`), save that the Master state is stored in ZooKeeper instead. Configuration: By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE). By setting spark.deploy.recoveryMode to ZOOKEEPER and setting spark.deploy.zookeeper.url to an appropriate ZooKeeper URL, ZooKeeper recovery mode is enabled. By setting spark.deploy.recoveryMode to FILESYSTEM and setting spark.deploy.recoveryDirectory to an appropriate directory accessible by the Master, we will keep the behavior from `d5a96fe`. Additionally, places where a Master could be specified by a spark:// url can now take comma-delimited lists to specify backup masters. Note that this is only used for registration of NEW Workers and application Clients. Once a Worker or Client has registered with the Master Leader, it is "in the system" and will never need to register again.	2013-10-10 17:16:42 -07:00
Prashant Sharma	26860639c5	Merge branch 'scala-2.10' of github.com:ScrapCodes/spark into scala-2.10 Conflicts: core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala project/SparkBuild.scala	2013-10-10 09:42:23 +05:30
Shivaram Venkataraman	484166d520	Add new SBT target for dependency assembly	2013-10-09 04:24:34 -07:00
Prashant Sharma	7be75682b9	Merge branch 'master' into wip-merge-master Conflicts: bagel/pom.xml core/pom.xml core/src/test/scala/org/apache/spark/ui/UISuite.scala examples/pom.xml mllib/pom.xml pom.xml project/SparkBuild.scala repl/pom.xml streaming/pom.xml tools/pom.xml In scala 2.10, a shorter representation is used for naming artifacts so changed to shorter scala version for artifacts and made it a property in pom.	2013-10-08 11:29:40 +05:30
Reynold Xin	213b70a2db	Merge pull request #31 from sundeepn/branch-0.8 Resolving package conflicts with hadoop 0.23.9 Hadoop 0.23.9 is having a package conflict with easymock's dependencies. (cherry picked from commit `023e3fdf00`) Signed-off-by: Reynold Xin <rxin@apache.org>	2013-10-07 10:54:22 -07:00
Martin Weindel	9b0c9c893d	scala 2.10 requires Java 1.6, using Scala 2.10.3, resolved maven-scala-plugin warning	2013-10-05 21:41:09 +02:00
Prashant Sharma	c810ee0690	Merge branch 'master' into scala-2.10 Conflicts: core/src/test/scala/org/apache/spark/DistributedSuite.scala project/SparkBuild.scala	2013-10-05 15:52:57 +05:30
Du Li	9fd6bba60d	ask ivy/sbt to check local maven repo under ~/.m2	2013-10-01 15:46:51 -07:00
Prashant Sharma	5829692885	Merge branch 'master' into scala-2.10 Conflicts: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressUI.scala docs/_config.yml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala	2013-10-01 11:57:24 +05:30
Aaron Davidson	f549ea33d3	Standalone Scheduler fault tolerance using ZooKeeper This patch implements full distributed fault tolerance for standalone scheduler Masters. There is only one master Leader at a time, which is actively serving scheduling requests. If this Leader crashes, another master will eventually be elected, reconstruct the state from the first Master, and continue serving scheduling requests. Leader election is performed using the ZooKeeper leader election pattern. We try to minimize the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of retries and session monitoring on top of the ZooKeeper client. Master failover follows directly from the single-node Master recovery via the file system (patch 194ba4b8), save that the Master state is stored in ZooKeeper instead. Configuration: By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE). By setting spark.deploy.recoveryMode to ZOOKEEPER and setting spark.deploy.zookeeper.url to an appropriate ZooKeeper URL, ZooKeeper recovery mode is enabled. By setting spark.deploy.recoveryMode to FILESYSTEM and setting spark.deploy.recoveryDirectory to an appropriate directory accessible by the Master, we will keep the behavior from 194ba4b8. Additionally, places where a Master could be specified by a spark:// url can now take comma-delimited lists to specify backup masters. Note that this is only used for registration of NEW Workers and application Clients. Once a Worker or Client has registered with the Master Leader, it is "in the system" and will never need to register again. Forthcoming: Documentation, tests (! - only ad hoc testing has been performed so far) I do not intend for this commit to be merged until tests are added, but this patch should still be mostly reviewable until then.	2013-09-26 15:04:23 -07:00
Reynold Xin	3f283278b0	Removed scala -optimize flag.	2013-09-26 13:58:10 -07:00
Reynold Xin	c514cd1587	Merge pull request #930 from holdenk/master Add mapPartitionsWithIndex	2013-09-26 13:48:20 -07:00
Prashant Sharma	604dc40996	Sync with master and some build fixes	2013-09-26 11:40:02 +05:30
Prashant Sharma	7ff4c2d399	fixed maven build for scala 2.10	2013-09-26 10:48:24 +05:30
Patrick Wendell	6079721fa1	Update build version in master	2013-09-24 11:41:51 -07:00
Prashant Sharma	276c37a51c	Akka 2.2 migration	2013-09-22 08:20:12 +05:30
Joseph E. Gonzalez	55696e2584	GraphX now builds with all merged changes.	2013-09-17 22:42:12 -07:00
Joseph E. Gonzalez	8b59fb72c4	Merging latest changes from spark main branch	2013-09-17 20:56:12 -07:00
Patrick Wendell	c856860c5b	Bumping Mesos version to 0.13.0	2013-09-15 12:46:26 -07:00
Prashant Sharma	383e151fd7	Merge branch 'master' of git://github.com/mesos/spark into scala-2.10 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala project/SparkBuild.scala	2013-09-15 10:55:12 +05:30
Prashant Sharma	20c65bc334	Fixed repl suite	2013-09-15 10:43:06 +05:30
Holden Karau	68068977b8	Fix build on ubuntu	2013-09-14 20:51:11 -07:00
Patrick Wendell	91a59e6b10	Merge pull request #919 from mateiz/jets3t Add explicit jets3t dependency, which is excluded in hadoop-client	2013-09-11 10:21:48 -07:00
Patrick Wendell	0c1985b153	Fix HDFS access bug with assembly build. Due to this change in HDFS: https://issues.apache.org/jira/browse/HADOOP-7549 there is a bug when using the new assembly builds. The symptom is that any HDFS access results in an exception saying "No filesystem for scheme 'hdfs'". This adds a merge strategy in the assembly build which fixes the problem.	2013-09-10 22:05:13 -07:00
Matei Zaharia	f117dc6d0d	Add explicit jets3t dependency, which is excluded in hadoop-client	2013-09-10 06:39:25 +00:00
Patrick Wendell	f68848d95d	Merge pull request #906 from pwendell/ganglia-sink Clean-up of Metrics Code/Docs and Add Ganglia Sink	2013-09-08 18:32:16 -07:00
Matei Zaharia	0b957997ad	Merge pull request #908 from pwendell/master Fix target JVM version in scala build	2013-09-08 15:30:16 -07:00
Patrick Wendell	27bd74c8ad	Fix target JVM version in scala build	2013-09-08 14:37:45 -07:00
Patrick Wendell	8de8ee5d3c	Ganglia sink	2013-09-08 10:08:18 -07:00
Patrick Wendell	a8e376ec0f	Merge pull request #904 from pwendell/master Adding Apache license to two files	2013-09-07 21:16:01 -07:00
Patrick Wendell	6d2198643c	Adding Apache license to two files	2013-09-07 20:46:58 -07:00
Jey Kottalam	30a32c8335	Minor YARN build cleanups	2013-09-06 11:31:16 -07:00
Prashant Sharma	4106ae9fbf	Merged with master	2013-09-06 17:53:01 +05:30
Matei Zaharia	59218bdd49	Add Apache parent POM	2013-09-02 18:34:03 -07:00
Matei Zaharia	5701eb92c7	Fix some URLs	2013-09-01 14:13:16 -07:00
Matei Zaharia	46eecd110a	Initial work to rename package to org.apache.spark	2013-09-01 14:13:13 -07:00
Matei Zaharia	666d93c294	Update Maven build to create assemblies expected by new scripts This includes the following changes: - The "assembly" package now builds in Maven by default, and creates an assembly containing both hadoop-client and Spark, unlike the old BigTop distribution assembly that skipped hadoop-client - There is now a bigtop-dist package to build the old BigTop assembly - The repl-bin package is no longer built by default since the scripts don't rely on it; instead it can be enabled with -Prepl-bin - Py4J is now included in the assembly/lib folder as a local Maven repo, so that the Maven package can link to it - run-example now adds the original Spark classpath as well because the Maven examples assembly lists spark-core and such as provided - The various Maven projects add a spark-yarn dependency correctly	2013-08-29 21:19:06 -07:00
Matei Zaharia	8d81358a05	Provide more memory for tests	2013-08-29 21:19:06 -07:00
Matei Zaharia	53cd50c069	Change build and run instructions to use assemblies This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them. As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.	2013-08-29 21:19:04 -07:00
Reynold Xin	9db1e50344	Revert "Merge pull request #841 from rxin/json" This reverts commit `1fb1b09928`, reversing changes made to `c69c48947d`.	2013-08-26 11:05:14 -07:00
Jey Kottalam	a9db1b7b6e	Upgrade SBT IDE project generators	2013-08-23 10:27:18 -07:00
Jey Kottalam	b7f9e6374a	Fix SBT generation of IDE project files	2013-08-23 10:26:37 -07:00
Jey Kottalam	281b6c5f28	Re-add removed dependency on 'commons-daemon' Fixes SBT build under Hadoop 0.23.9 and 2.0.4	2013-08-22 15:45:45 -07:00
Matei Zaharia	ae8ba83ef2	Merge pull request #855 from jey/update-build-docs Update build docs	2013-08-22 10:14:54 -07:00
Matei Zaharia	8a36fd09dd	Merge pull request #854 from markhamstra/pomUpdate Synced sbt and maven builds to use the same dependencies, etc.	2013-08-22 10:13:35 -07:00
Jey Kottalam	f9cc1fbf27	Remove references to unsupported Hadoop versions	2013-08-21 17:14:36 -07:00
Mark Hamstra	ff6f1b0500	Synced sbt and maven builds	2013-08-21 13:50:24 -07:00
Reynold Xin	af602ba9d3	Downgraded default build hadoop version to 1.0.4.	2013-08-21 11:38:24 -07:00
Matei Zaharia	aa2b89d98d	Merge remote-tracking branch 'jey/hadoop-agnostic' Conflicts: core/src/main/scala/spark/PairRDDFunctions.scala	2013-08-20 10:14:15 -07:00
Jey Kottalam	6f6944c807	Update SBT build to use simpler fix for Hadoop 0.23.9	2013-08-19 12:33:13 -07:00
Jey Kottalam	67b593607c	Rename YARN build flag to SPARK_WITH_YARN	2013-08-16 14:00:05 -07:00
Jey Kottalam	b1d99744a8	Fix SBT build under Hadoop 0.23.x	2013-08-16 13:50:12 -07:00
Jey Kottalam	8add2d7a59	Fix repl/assembly when YARN enabled	2013-08-16 13:50:12 -07:00
Jey Kottalam	3f98eff63a	Allow make-distribution.sh to specify Hadoop version used	2013-08-16 13:50:09 -07:00
Reynold Xin	c961c19b7b	Use the JSON formatter from Scala library and removed dependency on lift-json. It made the JSON creation slightly more complicated, but reduces one external dependency. The scala library also properly escape "/" (which lift-json doesn't).	2013-08-15 18:23:01 -07:00
Jey Kottalam	a0f0848463	Update default version of Hadoop to 1.2.1	2013-08-15 16:50:37 -07:00
Jey Kottalam	cb4ef19214	yarn support	2013-08-15 16:50:37 -07:00
Jey Kottalam	273b499b9a	yarn sbt	2013-08-15 16:50:37 -07:00
Jey Kottalam	69c3bbf688	dynamically detect hadoop version	2013-08-15 16:50:37 -07:00
Matei Zaharia	d9588183fa	Update to Mesos 0.12.1	2013-08-13 18:51:35 -07:00
jerryshao	320e87e7ab	Add MetricsServlet for Spark metrics system	2013-08-12 13:23:23 +08:00
Matei Zaharia	dce5e47435	Merge pull request #800 from dlyubimov/HBASE_VERSION Pull HBASE_VERSION in the head of sbt build	2013-08-09 21:53:45 -07:00
Matei Zaharia	cd247ba5bb	Merge pull request #786 from shivaram/mllib-java Java fixes, tests and examples for ALS, KMeans	2013-08-09 20:41:13 -07:00
Dmitriy Lyubimov	27f674f82b	fewer words	2013-08-09 13:54:41 -07:00
Dmitriy Lyubimov	ae95b57469	Pull HBASE_VERSION in the head of sbt build	2013-08-09 12:45:18 -07:00
Matei Zaharia	5a4003c1ac	Update to Chill 0.3.1	2013-08-08 13:30:27 -07:00
Shivaram Venkataraman	471fbadd0c	Java examples, tests for KMeans and ALS - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java - Renames class methods from `train` to `run` to enable static methods to be called from Java. - Add unit tests which check if both static / class methods can be called. - Also add examples which port the main() function in ALS, KMeans to the examples project. Couple of minor changes to existing code: - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init	2013-08-06 15:43:46 -07:00
Joseph E. Gonzalez	499a0d8383	Merged graphx from @rxin into master	2013-08-06 12:28:29 -07:00
Matei Zaharia	e466a55a6b	Revert Mesos version to 0.9 since the 0.12 artifact has target Java 7	2013-08-01 15:45:21 -07:00
Matei Zaharia	b2b86c2575	Merge pull request #753 from shivaram/glm-refactor Build changes for ML lib	2013-07-31 15:51:39 -07:00
Matei Zaharia	14bf2fe039	Merge pull request #749 from benh/spark-executor-uri Added property 'spark.executor.uri' for launching on Mesos.	2013-07-31 14:18:16 -07:00
Shivaram Venkataraman	15fd0d619d	Add mllib, bagel to repl dependencies Also don't build an assembly jar for them	2013-07-30 18:31:11 -07:00
Reynold Xin	3b1ced83fb	Exclude older version of Snappy in streaming and examples.	2013-07-30 17:25:36 -07:00
Reynold Xin	368c58eac5	Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compression Conflicts: project/SparkBuild.scala	2013-07-30 16:04:18 -07:00
Shivaram Venkataraman	48851d4dd9	Add bagel, mllib to SBT assembly. Also add jblas dependency to mllib pom.xml	2013-07-30 14:03:15 -07:00
Benjamin Hindman	f6f46455eb	Added property 'spark.executor.uri' for launching on Mesos without requiring Spark to be installed. Using 'make_distribution.sh' a user can put a Spark distribution at a URI supported by Mesos (e.g., 'hdfs://...') and then set that when launching their job. Also added SPARK_EXECUTOR_URI for the REPL.	2013-07-29 23:32:52 -07:00
ryanlecompte	8e0939f5a9	refactor Kryo serializer support to use chill/chill-java	2013-07-24 20:43:57 -07:00
jerryshao	5730193e0c	Fix some typos	2013-07-24 14:57:47 +08:00
jerryshao	576528f0f9	Add dependency of Codahale's metrics library	2013-07-24 14:57:46 +08:00
Josh Rosen	c83680434b	Add JavaAPICompletenessChecker. This is used to find methods in the Scala API that need to be ported to the Java API. To use it: ./run spark.tools.JavaAPICompletenessChecker Conflicts: project/SparkBuild.scala run run2.cmd	2013-07-22 16:11:49 -07:00
Liang-Chi Hsieh	d1738d72ba	also exclude asm for hadoop2. hadoop1 looks like no need to do that too.	2013-07-20 00:37:24 +08:00
Liang-Chi Hsieh	3aad452653	fix a bug in build process that pulls in two versions of ASM.	2013-07-19 02:29:46 +08:00
Matei Zaharia	cad48edb70	Merge pull request #708 from ScrapCodes/dependencies-upgrade Dependency upgrade Akka 2.0.3 -> 2.0.5	2013-07-16 21:41:28 -07:00
Matei Zaharia	af3c9d5042	Add Apache license headers and LICENSE and NOTICE files	2013-07-16 17:21:33 -07:00
Prashant Sharma	2748e73eb9	Dependency upgrade Akka 2.0.3 -> 2.0.5	2013-07-16 16:08:46 +05:30
Prashant Sharma	9d7781c4e1	Adding commons io as dependency	2013-07-15 12:03:48 +05:30
Prashant Sharma	a3494d405d	Merge branch 'master' of github.com:mesos/spark into scala-2.10 Conflicts: core/src/main/scala/spark/Utils.scala core/src/test/scala/spark/ui/UISuite.scala project/SparkBuild.scala run	2013-07-15 11:15:55 +05:30
Matei Zaharia	668b0dc6a7	Merge branch 'master' of github.com:mesos/spark	2013-07-13 19:10:46 -07:00
Matei Zaharia	cd28d9c147	Merge remote-tracking branch 'origin/pr/662' Conflicts: bin/compute-classpath.sh	2013-07-13 19:10:00 -07:00
seanm	c4d5b01e44	changing com.google.code.findbugs maven coordinates	2013-07-13 14:56:23 -06:00
Prashant Sharma	e86d5dbaad	Merge branch 'master' into master-merge Conflicts: README.md core/pom.xml core/src/main/scala/spark/deploy/JsonProtocol.scala core/src/main/scala/spark/deploy/LocalSparkCluster.scala core/src/main/scala/spark/deploy/master/Master.scala core/src/main/scala/spark/deploy/master/MasterWebUI.scala core/src/main/scala/spark/deploy/worker/Worker.scala core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala core/src/main/scala/spark/storage/BlockManagerUI.scala core/src/main/scala/spark/util/AkkaUtils.scala pom.xml project/SparkBuild.scala streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala	2013-07-12 14:49:16 +05:30
Prashant Sharma	69ae7ea227	Removed some unnecessary code and fixed dependencies	2013-07-11 18:30:18 +05:30
Matei Zaharia	3cc6818f13	Merge pull request #668 from shimingfei/guava-14.0.1 update guava version from 11.0.1 to 14.0.1	2013-07-06 19:51:20 -07:00
Matei Zaharia	1ffadb2d9e	Merge remote-tracking branch 'pwendell/ui-updates' Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala core/src/main/scala/spark/util/AkkaUtils.scala pom.xml	2013-07-06 15:51:41 -07:00
Matei Zaharia	43b24635ee	Renamed ML package to MLlib and added it to classpath	2013-07-05 11:38:53 -07:00
Matei Zaharia	05be233ce2	Removed dependency on Apache Commons Math	2013-07-05 11:13:46 -07:00
Reynold Xin	6a9a9a364c	Minor clean up of the RidgeRegression code. I am not even sure why I did this :s.	2013-07-05 11:13:45 -07:00
Matei Zaharia	729e463f64	Import RidgeRegression example Conflicts: run	2013-07-05 11:13:41 -07:00
Gavin Li	94238aae57	fix dependencies	2013-07-03 18:08:38 +00:00
Mingfei	04567a1771	update guava version from 11.0.1 to 14.0.1	2013-07-03 17:43:37 +08:00
Prashant Sharma	a5f1f6a907	Merge branch 'master' into master-merge Conflicts: core/pom.xml core/src/main/scala/spark/MapOutputTracker.scala core/src/main/scala/spark/RDD.scala core/src/main/scala/spark/RDDCheckpointData.scala core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/Utils.scala core/src/main/scala/spark/api/python/PythonRDD.scala core/src/main/scala/spark/deploy/client/Client.scala core/src/main/scala/spark/deploy/master/MasterWebUI.scala core/src/main/scala/spark/deploy/worker/Worker.scala core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala core/src/main/scala/spark/rdd/BlockRDD.scala core/src/main/scala/spark/rdd/ZippedRDD.scala core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/BlockManagerMaster.scala core/src/main/scala/spark/storage/BlockManagerMasterActor.scala core/src/main/scala/spark/storage/BlockManagerUI.scala core/src/main/scala/spark/util/AkkaUtils.scala core/src/test/scala/spark/SizeEstimatorSuite.scala pom.xml project/SparkBuild.scala repl/src/main/scala/spark/repl/SparkILoop.scala repl/src/test/scala/spark/repl/ReplSuite.scala streaming/src/main/scala/spark/streaming/StreamingContext.scala streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala	2013-07-03 11:43:26 +05:30
Matei Zaharia	5cfcd3c336	Remove Twitter4J specific repo since it's in Maven central	2013-06-29 15:37:27 -07:00
Reynold Xin	564d902d79	Merge branch 'master' of github.com:mesos/spark into graph Conflicts: run run2.cmd	2013-06-29 15:30:21 -07:00
Evan Chan	1107b4d55b	Merge branch 'master' into 2013-06/assembly-jar-deploy Conflicts: run Previous changes that I made to run and set-dev-classpath.sh instead have been folded into compute-classpath.sh	2013-06-28 17:18:35 -07:00
Matei Zaharia	32370da4e4	Don't use forward slash in exclusion for JAR signature files	2013-06-25 22:08:19 -04:00
Evan Chan	d2f46ac680	Merge branch 'master' into 2013-06/assembly-jar-deploy Conflicts: run	2013-06-25 14:50:16 -07:00
Tathagata Das	c89af0a7f9	Merge branch 'master' into streaming Conflicts: .gitignore	2013-06-24 23:57:47 -07:00
Patrick Wendell	91ec5a1a04	Changing JSON protocol and removing spray code	2013-06-22 10:31:36 -07:00
Matei Zaharia	b350f34703	Increase memory for tests to prevent a crash on JDK 7	2013-06-22 07:48:20 -07:00
Evan Chan	071ff7efa1	Enable building a fat jar for the Spark REPL	2013-06-20 17:53:23 -07:00
Matei Zaharia	ae7a5da6b3	Fix some dependency issues in SBT build (same will be needed for Maven): - Exclude a version of ASM 3.x that comes from HBase - Don't use a special ASF repo for HBase - Update SLF4J version - Add sbt-dependency-graph plugin so we can easily find dependency trees	2013-06-20 18:44:46 +02:00
Matei Zaharia	7902baddc7	Update ASM to version 4.0	2013-06-19 13:34:30 +02:00
Matei Zaharia	dbfab49d2a	Merge remote-tracking branch 'milliondreams/casdemo' Conflicts: project/SparkBuild.scala	2013-06-18 14:55:31 +02:00
Matei Zaharia	73f4c7d2d1	Merge pull request #605 from esjewett/SPARK-699 Add hBase example (retry of pull request #596)	2013-06-18 04:21:17 -07:00
Matei Zaharia	2ab311f4ce	Removed second version of junit test plugin from plugins.sbt	2013-06-18 00:40:25 +02:00
Christopher Nguyen	479442a9b9	Add zeroLengthPartitions() test to make sure, e.g., StatCounter.scala can handle empty partitions without incorrectly returning NaN	2013-06-15 17:35:55 -07:00
Matei Zaharia	5b5b5aedbf	Fixed a few test issues due to Akka 2.1, as well as SBT memory. Unfortunately, in Akka 2.1, ActorSystem.awaitTermination hangs for remote actors, and Akka also leaves a non-daemon Netty thread even when run in daemon mode. Thus I had to comment out some of the calls to awaitTermination, and we still have one failing test.	2013-06-08 01:09:24 -07:00
Rohit Rai	6d8423fd1b	Adding deps to examples/pom.xml Fixing exclusion in examples deps in SparkBuild.scala	2013-06-02 13:03:45 +05:30
Rohit Rai	3be7bdcefd	Adding example to make Spark RDD from Cassandra	2013-06-01 19:32:17 +05:30
Reynold Xin	b0403d3f2b	Merge branch 'master' of github.com:mesos/spark into graph Conflicts: run	2013-06-01 00:48:27 -07:00
Reynold Xin	f742435f18	Removed the duplicated netty dependency in SBT build file.	2013-05-16 14:31:03 -07:00
Reynold Xin	f3491cb89b	Merge branch 'master' of github.com:mesos/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/test/scala/spark/DistributedSuite.scala project/SparkBuild.scala	2013-05-15 00:31:52 -07:00
Reynold Xin	81ad2fa331	Merge branch 'jdbc' of github.com:koeninger/spark Conflicts: project/SparkBuild.scala	2013-05-14 23:12:00 -07:00
Cody Koeninger	b16c4896f6	add test for JdbcRDD using embedded derby, per rxin suggestion	2013-05-14 23:44:04 -05:00
Ethan Jewett	ee6f6aa6cd	Add hBase example	2013-05-09 18:33:38 -05:00
Reynold Xin	012c9e5ab0	Revert "Merge pull request #596 from esjewett/master" because the dependency on hbase introduces netty-3.2.2 which conflicts with netty-3.5.3 already in Spark. This caused multiple test failures. This reverts commit `0f1b7a06e1`, reversing changes made to `aacca1b8a8`.	2013-05-09 14:20:01 -07:00
Reynold Xin	90577ada69	Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/DiskStore.scala project/SparkBuild.scala	2013-05-07 15:56:19 -07:00
Ethan Jewett	02e8cfa617	HBase example	2013-05-04 12:31:30 -05:00
Reynold Xin	f54bc544c5	Merge branch 'master' of github.com:mesos/spark into graph	2013-05-02 17:25:09 -07:00
Jey Kottalam	207afe4088	Remove spark-repl's extraneous dependency on spark-streaming	2013-05-01 16:57:31 -07:00
Prashant Sharma	4041a2689e	Updated to latest stable scala 2.10.1 and akka 2.1.2	2013-05-01 11:35:35 +05:30
Matei Zaharia	f1f92c88eb	Build against Hadoop 1 by default	2013-04-29 17:08:45 -07:00
Prashant Sharma	4b4a36ea7d	Fixed pom.xml with updated dependencies.	2013-04-29 12:55:43 +05:30
Matei Zaharia	1b169f190c	Exclude old versions of Netty, which had a different Maven organization	2013-04-25 19:52:12 -07:00
Matei Zaharia	eef9ea1993	Update unit test memory to 2 GB	2013-04-25 00:42:29 -07:00
Matei Zaharia	01d9ba5038	Add back line removed during YARN merge	2013-04-25 00:11:27 -07:00
Prashant Sharma	ad88f083a6	scala 2.10 and master merge	2013-04-24 18:08:26 +05:30
Mridul Muralidharan	3b594a4e3b	Do not add signature files - results in validation errors when using assembled file	2013-04-24 10:18:25 +05:30
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
Mridul Muralidharan	adcda84f96	Pull latest SparkBuild.scala from master and merge conflicts	2013-04-24 08:57:25 +05:30
Mridul Muralidharan	5b85c715c8	Revert back to 2.0.2-alpha : 0.23.7 has protocol changes which break against cloudera	2013-04-24 02:57:51 +05:30
Mridul Muralidharan	8faf5c51c3	Patch from Thomas Graves to improve the YARN Client, and move to more production ready hadoop yarn branch	2013-04-24 02:31:57 +05:30
Prashant Sharma	185bb9525a	Manually merged scala-2.10 and master	2013-04-22 14:14:03 +05:30
Mridul Muralidharan	7acab3ab45	Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo	2013-04-22 08:01:13 +05:30
Matei Zaharia	17e076de80	Turn on forking in test JVMs to reduce the pressure on perm gen and code cache sizes due to having 2 instances of the Scala compiler and a bunch of classloaders.	2013-04-18 22:25:57 -07:00
Mridul Muralidharan	5d891534fd	Move back to 2.0.2-alpha, since 2.0.3-alpha is not available in cloudera yet. Also, add netty dependency explicitly to prevent resolving to older 2.3x version. Additionally, comment out retrievePattern to ensure correct netty is picked up	2013-04-17 05:54:43 +05:30
Matei Zaharia	ec5e553b41	Merge pull request #558 from ash211/patch-jackson-conflict Don't pull in old versions of Jackson via hadoop-core	2013-04-14 08:20:13 -07:00
Matei Zaharia	ed336e0d44	Fix tests from different projects running in parallel in SBT 0.12	2013-04-11 22:29:37 -04:00
Prashant Sharma	9f26318bbd	Fixed previously removed dependencies	2013-04-10 14:46:42 +05:30
Andrew Ash	18bd41d1a3	Don't pull in old versions of Jackson via hadoop-core	2013-04-09 14:44:47 -04:00
Matei Zaharia	65caa8f711	Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0' Conflicts: docs/_config.yml project/SparkBuild.scala	2013-04-08 12:43:17 -04:00
Matei Zaharia	1cb3eb9762	Merge remote-tracking branch 'kalpit/master' Conflicts: project/SparkBuild.scala	2013-04-07 20:54:18 -04:00
Matei Zaharia	b362df39ea	Merge pull request #552 from MLnick/master Bumping version for Twitter Algebird to latest	2013-04-07 17:17:52 -07:00
Mridul Muralidharan	6798a09df8	Add support for building against hadoop2-yarn : adding new maven profile for it	2013-04-07 17:47:38 +05:30
Reynold Xin	3728e1bc40	Code to run bagel vs graph experiments.	2013-04-07 15:05:46 +08:00
shane-huang	df47b40b76	Shuffle Performance fix: Use netty embedded OIO file server instead of ConnectionManager Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages change reference from io.Source to scala.io.Source to avoid looking into io.netty package Signed-off-by: shane-huang <shengsheng.huang@intel.com>	2013-04-07 14:37:12 +08:00
Andy Konwinski	5555811bd5	Update build to Scala 2.9.3	2013-04-04 13:26:45 -07:00
Nick Pentreath	0f54344fd8	Bumping Algebird version in examples now that it supports JDK 1.6	2013-04-03 13:15:34 +02:00
Reynold Xin	f130eb624c	Merge branch 'master' of github.com:mesos/spark into graph	2013-04-01 20:06:30 +08:00
Jey Kottalam	bc8ba222ff	Bump development version to 0.8.0	2013-03-28 15:42:01 -07:00
kalpit	f0164e5047	upgraded sbt version, sbt plugins and some library dependencies to latest stable version	2013-03-26 17:49:29 -07:00
Holden Karau	8456d673e2	Re-enable deprecation warnings since there are only two	2013-03-24 17:30:23 -07:00
Holden Karau	e104a76016	Makes the syntax highlighting on the build file not broken in emacs.	2013-03-24 16:16:05 -07:00
Reynold Xin	ba9d00c44a	Merge branch 'master' into graph Conflicts: run2.cmd	2013-03-18 18:30:14 +08:00
Prashant Sharma	15530c2b23	porting of repl to scala-2.10	2013-03-17 10:47:17 +05:30
seanm	42822cf95d	changing streaming resolver for akka	2013-03-13 11:40:42 -06:00
seanm	4aa1205202	adding typesafe repo to streaming resolvers so that akka-zeromq is found	2013-03-11 12:37:29 -06:00
Hiral Patel	664e5fd24b	Fix reference bug in Kryo serializer, add test, update version	2013-03-07 22:16:11 -08:00
Matei Zaharia	db9b90fdbd	Change version to 0.7.1-SNAPSHOT for development branch	2013-02-27 09:15:26 -08:00
Matei Zaharia	7e67c626ee	Change version number to 0.7.0	2013-02-25 20:30:47 -08:00
Matei Zaharia	6494cab19d	Update Hadoop dependency to 1.0.4	2013-02-25 15:38:21 -08:00
Prashant Sharma	254acb1666	Moving akka dependency resolver to shared.	2013-02-25 13:37:07 +05:30
Tathagata Das	5ab37be983	Fixed class paths and dependencies based on Matei's comments.	2013-02-24 16:24:52 -08:00
Tathagata Das	f282bc4960	Changed Algebird from 0.1.9 to 0.1.8	2013-02-24 12:44:12 -08:00
Tathagata Das	24c0cd6168	Fixed resolver for akka-zeromq	2013-02-22 18:23:29 -08:00
Tathagata Das	cfa65ebff1	Merge pull request #480 from MLnick/streaming-eg-algebird [Streaming] Examples using Twitter's Algebird library	2013-02-22 12:29:04 -08:00
Nick Pentreath	718474b9c6	Bumping Algebird to 0.1.9	2013-02-21 12:11:31 +02:00
Prashant Sharma	4e5b09664c	fixes corresponding to review feedback at pull request #479	2013-02-20 19:14:52 +05:30
Reynold Xin	19d3b059e3	Merge branch 'master' into graph	2013-02-19 12:44:05 -08:00
Reynold Xin	81c4d19c61	Maven and sbt build changes for SparkGraph.	2013-02-19 12:43:13 -08:00
Prashant Sharma	f7d3e309cb	ZeroMQ stream as receiver	2013-02-19 19:32:52 +05:30
Nick Pentreath	315ea069e8	Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird Conflicts: project/SparkBuild.scala	2013-02-19 13:58:05 +02:00
Nick Pentreath	015893f0e8	Adding streaming HyperLogLog example using Algebird	2013-02-19 13:21:33 +02:00
Tathagata Das	def8126d77	Added TwitterInputDStream from example to StreamingContext. Renamed example TwitterBasic to TwitterPopularTags.	2013-02-14 17:49:43 -08:00
Charles Reiss	0f81025eca	Add easymock to SBT configuration.	2013-01-29 18:55:42 -08:00
Matei Zaharia	6e3754bf47	Add Maven build file for streaming, and fix some issues in SBT file As part of this, changed our Scala 2.9.2 Kafka library to be available as a local Maven repository, following the example in (http://blog.dub.podval.org/2010/01/maven-in-project-repository.html)	2013-01-20 19:22:24 -08:00
Tathagata Das	cd1521cfdb	Merge branch 'master' into streaming Conflicts: core/src/main/scala/spark/rdd/CoGroupedRDD.scala core/src/main/scala/spark/rdd/FilteredRDD.scala docs/_layouts/global.html docs/index.md run	2013-01-15 12:08:51 -08:00
folone	25c0739bad	Moved to scala 2.10.0. Notable changes are: - akka 2.0.3 → 2.1.0 - spray 1.0-M1 → 1.1-M7 For now the repl subproject is commented out, as scala reflection api changed very much since the introduction of macros.	2013-01-14 09:52:11 +01:00
Matei Zaharia	6d1c230281	Merge pull request #357 from tysonjh/master JSON support added to WebUI	2013-01-10 19:06:07 -08:00
Tyson	549ee388a1	Removed io.spray spray-json dependency as it is not needed.	2013-01-09 15:12:23 -05:00
Tyson	6e8c8f61c4	Added the spray implicit marshaller library Added the io.spray JSON library	2013-01-09 10:40:33 -05:00
Stephen Haberman	c3f1675f9c	Retrieve jars to a flat directory so * can be used for the classpath.	2013-01-08 14:44:33 -06:00
Tathagata Das	64dceec293	Merge branch 'streaming-merge' into dev-merge	2013-01-07 16:54:35 -08:00
Shivaram Venkataraman	aed368a970	Update Hadoop dependency to 1.0.3 as 0.20 has Sun specific dependencies. Also fix SequenceFileRDDFunctions to pick the right type conversion across Hadoop versions	2013-01-07 15:57:33 -08:00
Tathagata Das	af8738dfb5	Moved Spark Streaming examples to examples sub-project.	2013-01-06 19:31:54 -08:00
Patrick Wendell	518111573f	Merge pull request #8 from radlab/twitter-example Adding a Twitter InputDStream with an example	2012-12-29 14:23:01 -08:00
Tathagata Das	7c33f76291	Merge branch 'mesos' into dev-merge	2012-12-26 19:19:07 -08:00
Patrick Wendell	9ac4cb1c5f	Adding a Twitter InputDStream with an example	2012-12-21 17:18:19 -08:00
Matei Zaharia	3334b7c6b5	Merge pull request #341 from rxin/4a3fb06ac2d11125feb08acbbd4df76d1e91b677 Kryo2 update against Spark master	2012-12-21 15:31:23 -08:00
Reynold Xin	eac566a7f4	Merge branch 'master' of github.com:mesos/spark into dev Conflicts: core/src/main/scala/spark/MapOutputTracker.scala core/src/main/scala/spark/PairRDDFunctions.scala core/src/main/scala/spark/ParallelCollection.scala core/src/main/scala/spark/RDD.scala core/src/main/scala/spark/rdd/BlockRDD.scala core/src/main/scala/spark/rdd/CartesianRDD.scala core/src/main/scala/spark/rdd/CoGroupedRDD.scala core/src/main/scala/spark/rdd/CoalescedRDD.scala core/src/main/scala/spark/rdd/FilteredRDD.scala core/src/main/scala/spark/rdd/FlatMappedRDD.scala core/src/main/scala/spark/rdd/GlommedRDD.scala core/src/main/scala/spark/rdd/HadoopRDD.scala core/src/main/scala/spark/rdd/MapPartitionsRDD.scala core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala core/src/main/scala/spark/rdd/MappedRDD.scala core/src/main/scala/spark/rdd/PipedRDD.scala core/src/main/scala/spark/rdd/SampledRDD.scala core/src/main/scala/spark/rdd/ShuffledRDD.scala core/src/main/scala/spark/rdd/UnionRDD.scala core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/BlockManagerId.scala core/src/main/scala/spark/storage/BlockManagerMaster.scala core/src/main/scala/spark/storage/StorageLevel.scala core/src/main/scala/spark/util/MetadataCleaner.scala core/src/main/scala/spark/util/TimeStampedHashMap.scala core/src/test/scala/spark/storage/BlockManagerSuite.scala run	2012-12-20 14:53:40 -08:00
Reynold Xin	9397c5014e	Let the slave notify the master block removal.	2012-12-20 01:37:09 -08:00
Patrick Wendell	3ff9710265	Adding Flume InputDStream	2012-12-07 16:42:39 -08:00
Denny	556c38ed91	Added kafka JAR	2012-12-05 11:54:42 -08:00
Denny	0c1de43fc7	Working on kafka.	2012-11-06 09:41:42 -08:00
Matei Zaharia	863a55ae42	Merge remote-tracking branch 'public/master' into dev Conflicts: core/src/main/scala/spark/BlockStoreShuffleFetcher.scala core/src/main/scala/spark/KryoSerializer.scala core/src/main/scala/spark/MapOutputTracker.scala core/src/main/scala/spark/RDD.scala core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/executor/Executor.scala core/src/main/scala/spark/network/Connection.scala core/src/main/scala/spark/network/ConnectionManagerTest.scala core/src/main/scala/spark/rdd/BlockRDD.scala core/src/main/scala/spark/rdd/NewHadoopRDD.scala core/src/main/scala/spark/scheduler/ShuffleMapTask.scala core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/BlockMessage.scala core/src/main/scala/spark/storage/BlockStore.scala core/src/main/scala/spark/storage/StorageLevel.scala core/src/main/scala/spark/util/AkkaUtils.scala project/SparkBuild.scala run	2012-10-24 23:21:00 -07:00
Matei Zaharia	0967e71a00	Bump up version to 0.7.0-SNAPSHOT for master branch	2012-10-22 11:49:42 -07:00
Matei Zaharia	902a608187	Update version to 0.6.1-SNAPSHOT to show this is in development	2012-10-22 11:43:57 -07:00
Thomas Dudziak	d9c2a89c57	Support for Hadoop 2 distributions such as cdh4	2012-10-18 16:08:54 -07:00
Reynold Xin	4a3fb06ac2	Updated Kryo to 2.20.	2012-10-16 01:10:01 -07:00
Patrick Wendell	629dd2691e	Removing credentials line in build.	2012-10-14 19:33:39 -07:00
Matei Zaharia	f8768da418	Comment out PGP stuff for publish-local to work	2012-10-14 17:37:21 -07:00
Matei Zaharia	64b52166ee	Changed default Hadoop version back to 0.20.205	2012-10-14 09:51:34 -07:00
Matei Zaharia	ce6b5a3ee5	Uncomment Maven publishing stuff and set version to 0.6.0	2012-10-13 15:55:39 -07:00
Patrick Wendell	6d328f54d0	Changing tabs to spaces	2012-10-10 18:54:22 -07:00
Patrick Wendell	3ed172ea59	Adding code for publishing to Sonatype. By default - I'm leaving this commented out. This is because there is a bug in the PGP signing plugin which causes it to activate even during a publish-local. So we'll just uncomment when we decide to publish.	2012-10-10 17:25:29 -07:00
Andy Konwinski	5897567679	Removes the included mesos-0.9.0.jar and adds a libraryDependency to the build file so that mesos-0.9.0-incubating.jar (which contains the same class files, but has a slightly different name) will be pulled down from Maven Central instead.	2012-10-03 08:58:05 -07:00
Matei Zaharia	6112b1a83c	Don't build an assembly for the REPL	2012-10-02 17:08:16 -07:00
Matei Zaharia	a925754675	Place Spray in front of Cloudera in Maven search path	2012-10-02 12:02:00 -07:00
Matei Zaharia	22684653a5	Revert "Place Spray repo ahead of Cloudera in Maven search path" This reverts commit `42e0a68082`.	2012-10-02 12:01:32 -07:00
Matei Zaharia	42e0a68082	Place Spray repo ahead of Cloudera in Maven search path	2012-10-02 11:37:19 -07:00
Patrick Wendell	6fee76d6d5	publish-local should go to maven + ivy by default	2012-10-01 15:34:47 -07:00
Reynold Xin	5783236ae6	Added a new command "pl" in sbt to publish to both Maven and Ivy.	2012-10-01 00:17:13 -07:00
Matei Zaharia	35cc9f13e9	Update Akka to 2.0.3	2012-09-24 14:17:10 -07:00
Matei Zaharia	1f539aa473	Update Scala version dependency to 2.9.2	2012-09-24 14:12:48 -07:00
Tathagata Das	7419d2c7ea	Added transformRDD DStream operation and TransformedDStream. Added sbt assembly option for streaming project.	2012-09-02 02:35:17 -07:00
Matei Zaharia	5a8015d2db	Merge remote-tracking branch 'public/dev' into dev	2012-08-24 16:11:44 -07:00
Denny	0008994044	merged dev branch	2012-08-02 16:00:33 -07:00

640 commits