Commit graph

1931 commits

Author SHA1 Message Date
Imran Rashid c73107500e send sparkHome as String instead of File over network 2013-01-21 11:21:39 -08:00
Imran Rashid 5bf73df7f0 oops, fix stupid compile error 2013-01-21 11:21:33 -08:00
Imran Rashid aae5a920a4 get sparkHome the correct way 2013-01-21 11:21:28 -08:00
Imran Rashid f116d6b5c6 executor can use a different sparkHome from Worker 2013-01-21 11:21:22 -08:00
Matei Zaharia 414b41e1d8 Merge pull request #362 from stephenh/hadoopconf
Add SparkContext.hadoopConfiguration
2013-01-21 11:18:48 -08:00
Stephen Haberman 6ded481999 Merge branch 'master' into hadoopconf
Conflicts:
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/api/java/JavaSparkContext.scala
2013-01-21 12:56:48 -06:00
Stephen Haberman 69a417858b Also use hadoopConfiguration in newAPI methods. 2013-01-21 12:42:11 -06:00
Matei Zaharia c0b9ceb8c3 Log remote lifecycle events in Akka for easier debugging 2013-01-21 00:23:53 -08:00
Matei Zaharia 4750907c3d Update run script to deal with change to build of REPL shaded JAR 2013-01-20 21:05:17 -08:00
Matei Zaharia 6e3754bf47 Add Maven build file for streaming, and fix some issues in SBT file
As part of this, changed our Scala 2.9.2 Kafka library to be available
as a local Maven repository, following the example in
(http://blog.dub.podval.org/2010/01/maven-in-project-repository.html)
2013-01-20 19:22:24 -08:00
Matei Zaharia c7b5e5f1ec Merge pull request #389 from JoshRosen/python_rdd_checkpointing
Add checkpointing to the Python API
2013-01-20 17:10:44 -08:00
Matei Zaharia 14360aba89 Merge pull request #390 from JoshRosen/spark-654
Fix PythonPartitioner equality
2013-01-20 17:08:53 -08:00
Josh Rosen 9f211dd3f0 Fix PythonPartitioner equality; see SPARK-654.
PythonPartitioner did not take the Python-side partitioning function
into account when checking for equality, which might cause problems
in the future.
2013-01-20 15:41:42 -08:00
Josh Rosen 00d70cd660 Clean up setup code in PySpark checkpointing tests 2013-01-20 15:38:11 -08:00
Josh Rosen 5b6ea9e9a0 Update checkpointing API docs in Python/Java. 2013-01-20 15:31:41 -08:00
Josh Rosen d0ba80dc72 Add checkpointFile() and more tests to PySpark. 2013-01-20 13:59:45 -08:00
Josh Rosen 7ed1bf4b48 Add RDD checkpointing to Python API. 2013-01-20 13:19:19 -08:00
Matei Zaharia fe85a07511 Merge pull request #361 from mesos/streaming
Merge Streaming into master
2013-01-20 12:48:15 -08:00
Matei Zaharia 86057ec7c8 Merge branch 'master' into streaming
Conflicts:
	core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Josh Rosen 17035db159 Add __repr__ to Accumulator; fix bug in sc.accumulator 2013-01-20 11:58:57 -08:00
Josh Rosen 9f54d7e1f5 Merge pull request #387 from mateiz/python-accumulators
Add accumulators to PySpark
2013-01-20 11:00:36 -08:00
Matei Zaharia 2a8c2a6790 Minor formatting fixes 2013-01-20 10:24:53 -08:00
Matei Zaharia 5d473f050e Merge pull request #376 from MLnick/python-als
Python ALS example
2013-01-20 10:21:29 -08:00
Matei Zaharia 922c5ec069 Merge pull request #385 from pwendell/ec2-guide-fix
Clarifying log directory in EC2 guide
2013-01-20 10:05:38 -08:00
Patrick Wendell 5f74ead636 Changes based on Matei's comment 2013-01-20 08:59:20 -08:00
Tathagata Das 76ff962edc Merge pull request #380 from tdas/streaming
Merging pySpark to streaming
2013-01-20 03:54:46 -08:00
Tathagata Das 33bad85bb9 Fixed streaming testsuite bugs 2013-01-20 03:51:11 -08:00
Matei Zaharia ee5a07955c Fix Python guide to say accumulators are available 2013-01-20 02:11:58 -08:00
Matei Zaharia a23ed25f3c Add a class comment to Accumulator 2013-01-20 02:10:25 -08:00
Matei Zaharia 61b6382a35 Launch accumulator tests in run-tests 2013-01-20 01:59:07 -08:00
Matei Zaharia 8e7f098a2c Added accumulators to PySpark 2013-01-20 01:57:44 -08:00
Tathagata Das 4f8fe58b25 Merge branch 'mesos-streaming' into streaming
Conflicts:
	core/src/main/scala/spark/api/java/JavaRDDLike.scala
	core/src/main/scala/spark/api/java/JavaSparkContext.scala
	core/src/test/scala/spark/JavaAPISuite.java
2013-01-20 01:13:56 -08:00
Tathagata Das 214345ceac Fixed issue https://spark-project.atlassian.net/browse/STREAMING-29, along with updates to doc comments in SparkContext.checkpoint(). 2013-01-19 23:50:17 -08:00
Patrick Wendell ecdff861f7 Clarifying log directory in EC2 guide 2013-01-19 22:59:35 -08:00
Patrick Wendell 11bbe23140 Merge pull request #369 from pwendell/streaming-java-api
Java API For Spark Streaming
2013-01-17 22:39:30 -08:00
Patrick Wendell 12b72b3e73 NetworkWordCount example 2013-01-17 22:37:56 -08:00
Patrick Wendell c46dd2de78 Moving tests to appropriate directory 2013-01-17 21:43:17 -08:00
Patrick Wendell e0165bf714 Adding queueStream and some slight refactoring 2013-01-17 21:25:49 -08:00
Patrick Wendell 6fba7683c2 Small doc fix 2013-01-17 18:46:24 -08:00
Patrick Wendell ee0314c3b3 Merge branch 'streaming' into streaming-java-api 2013-01-17 18:43:00 -08:00
Patrick Wendell 70ba994d6d Import fixup 2013-01-17 18:41:59 -08:00
Patrick Wendell 2261e62ee5 Style cleanup 2013-01-17 18:41:59 -08:00
Patrick Wendell 82b8707c6b Checkpointing in Streaming java API 2013-01-17 18:41:58 -08:00
Patrick Wendell 61b877c688 Adding flatMap 2013-01-17 18:41:58 -08:00
Patrick Wendell d5570c7968 Adding checkpointing to Java API 2013-01-17 18:41:58 -08:00
Patrick Wendell 8e6cbbc6c7 Adding other updateState functions 2013-01-17 18:41:58 -08:00
Patrick Wendell 2a872335c5 Bug fix and test cleanup 2013-01-17 18:41:58 -08:00
Matei Zaharia 54c0f9f185 Fix code that assumed spark.local.dir is only a single directory 2013-01-17 17:40:55 -08:00
Matei Zaharia b534fd363f Merge pull request #382 from fanuo/master
HttpBroadcast server cache by default in spark.local.dir instead of java.io.tmpdir
2013-01-17 17:00:25 -08:00
Fernand Pajot 742bc841ad changed HttpBroadcast server cache to be in spark.local.dir instead of java.io.tmpdir 2013-01-17 16:56:11 -08:00