Commit graph

4551 commits

Author SHA1 Message Date
Mosharaf Chowdhury e663750488 Removed unused code.
Changes to match Spark coding style.
2013-10-17 00:19:50 -07:00
Kay Ousterhout 809f547633 Fixed unit tests 2013-10-16 23:16:12 -07:00
KarthikTunga 8537f19268 SPARK-627, Implementing --config arguments in the scripts 2013-10-16 23:00:33 -07:00
Reynold Xin 3e7df8f6c6 Added a number of very fast, memory-efficient data structures: BitSet, OpenHashSet, OpenHashMap, PrimitiveKeyOpenHashMap. 2013-10-16 22:58:52 -07:00
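As a rough, hedged sketch of the open-addressing idea behind an OpenHashSet-style structure (the class below is purely illustrative and is not the Spark API added in this commit):

```scala
// Illustrative sketch only: a tiny open-addressing Int set with linear probing.
// The real OpenHashSet added in this commit is more general and more optimized.
class SimpleOpenHashSet(initialCapacity: Int = 64) {
  private var capacity = initialCapacity              // kept as a power of two
  private var data = new Array[Int](capacity)
  private var occupied = new Array[Boolean](capacity)
  private var count = 0

  // Probe until we find the key or an empty slot.
  private def indexOf(k: Int): Int = {
    var pos = k & (capacity - 1)                      // cheap modulo for power-of-two sizes
    while (occupied(pos) && data(pos) != k) {
      pos = (pos + 1) & (capacity - 1)                // linear probing
    }
    pos
  }

  def contains(k: Int): Boolean = occupied(indexOf(k))

  def add(k: Int): Unit = {
    if (count * 2 >= capacity) grow()                 // keep load factor below 0.5
    val pos = indexOf(k)
    if (!occupied(pos)) {
      data(pos) = k
      occupied(pos) = true
      count += 1
    }
  }

  def size: Int = count

  private def grow(): Unit = {
    val oldData = data
    val oldOccupied = occupied
    capacity *= 2
    data = new Array[Int](capacity)
    occupied = new Array[Boolean](capacity)
    count = 0
    for (i <- oldData.indices if oldOccupied(i)) add(oldData(i))
  }
}
```

Compared to a boxed HashMap/HashSet, this layout stores primitives in flat arrays, which is where the memory-efficiency claim of such structures comes from.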
KarthikTunga ff4fb1f7ee SPARK-627, Implementing --config arguments in the scripts 2013-10-16 22:55:15 -07:00
KarthikTunga a32aa6b351 Implementing --config argument in the scripts 2013-10-16 22:51:09 -07:00
Mosharaf Chowdhury e96bd0068f BroadcastTest2 --> BroadcastTest 2013-10-16 21:33:33 -07:00
Mosharaf Chowdhury a8d0981832 Fixes for the new BlockId naming convention. 2013-10-16 21:33:33 -07:00
Mosharaf Chowdhury feb45d391f Default blockSize is 4MB.
BroadcastTest2 example added for testing broadcasts.
2013-10-16 21:33:33 -07:00
Mosharaf Chowdhury 6e5a60fab4 Removed unnecessary code, and added comment of memory-latency tradeoff. 2013-10-16 21:33:33 -07:00
Mosharaf Chowdhury 4602e2bf6e Torrent-ish broadcast based on BlockManager. 2013-10-16 21:33:33 -07:00
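A minimal sketch of the chunking step a torrent-like broadcast of this kind relies on; the 4 MB figure comes from the commit above, but the object and method names here are assumptions for illustration, not the actual implementation:

```scala
import java.nio.ByteBuffer

// Illustrative only: split a serialized broadcast value into fixed-size blocks
// that can then be stored and fetched individually through the block store.
object BroadcastChunking {
  val DefaultBlockSize: Int = 4 * 1024 * 1024  // 4 MB, matching the default above

  def chunk(serialized: Array[Byte], blockSize: Int = DefaultBlockSize): Array[ByteBuffer] = {
    val numBlocks = (serialized.length + blockSize - 1) / blockSize
    Array.tabulate(numBlocks) { i =>
      val from  = i * blockSize
      val until = math.min(from + blockSize, serialized.length)
      ByteBuffer.wrap(serialized, from, until - from)
    }
  }
}
```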
prabeesh 890f8fe439 Modify code to use the Spark Logging class 2013-10-17 10:00:40 +05:30
prabeesh ee4178f144 remove unused dependency 2013-10-17 09:57:48 +05:30
prabeesh 29245605bf remove unused dependency 2013-10-17 09:57:30 +05:30
Shivaram Venkataraman 0a4b76fcc2 Rename SBT target to assemble-deps. 2013-10-16 17:05:46 -07:00
Kay Ousterhout ec512583ab Removed TaskSchedulerListener interface.
The interface was used only by the DAG scheduler (so it wasn't necessary
to define the additional interface), and the naming makes it very
confusing when reading the code (because "listener" was used
to describe the DAG scheduler, rather than SparkListeners, which
implement a nearly-identical interface but serve a different
function).
2013-10-16 16:57:42 -07:00
Matei Zaharia f9973cae3a Merge pull request #65 from tgravescs/fixYarn
Fix yarn build

Fix the yarn build after renaming StandAloneX to CoarseGrainedX from pull request 34.
2013-10-16 15:58:41 -07:00
Shivaram Venkataraman 1dcded45e2 Exclude assembly jar from classpath if using deps 2013-10-16 13:43:41 -07:00
tgravescs cc7df2b3cc Fix yarn build 2013-10-16 10:09:16 -05:00
prabeesh 9a7575728d add maven dependencies for mqtt 2013-10-16 13:41:49 +05:30
prabeesh 7d36a117c1 add maven dependencies for mqtt 2013-10-16 13:41:26 +05:30
prabeesh 9eaf68fd40 added mqtt adapter wordcount example 2013-10-16 13:40:38 +05:30
prabeesh 06de3d516d added mqtt adapter library dependencies 2013-10-16 13:38:37 +05:30
prabeesh 2e48b23eae added mqtt adapter 2013-10-16 13:36:25 +05:30
prabeesh 742ada91e0 MQTTInputDStream for MQTT streaming adapter 2013-10-16 13:35:29 +05:30
Matei Zaharia 28e9c2abc0 Merge pull request #63 from pwendell/master
Fixing spark streaming example and a bug in examples build.

- Examples assembly included a log4j.properties which clobbered Spark's
- Example had an error where some classes weren't serializable
- Did some other clean-up in this example
2013-10-15 23:59:56 -07:00
Matei Zaharia 4e46fde818 Merge pull request #62 from harveyfeng/master
Make TaskContext's stageId publicly accessible.
2013-10-15 23:14:27 -07:00
Patrick Wendell 35befe07bb Fixing spark streaming example and a bug in examples build.
- Examples assembly included a log4j.properties which clobbered Spark's
- Example had an error where some classes weren't serializable
- Did some other clean-up in this example
2013-10-15 22:55:43 -07:00
Harvey Feng 65b46236e7 Proper formatting for SparkHadoopWriter class extensions. 2013-10-15 21:51:52 -07:00
Matei Zaharia b5346064d6 Merge pull request #8 from vchekan/checkpoint-ttl-restore
Serialize and restore spark.cleaner.ttl to savepoint

In accordance with the conversation on the spark-dev mailing list, preserve the spark.cleaner.ttl parameter when serializing the checkpoint.
2013-10-15 21:25:03 -07:00
Matei Zaharia 6dbd2208ff Merge pull request #34 from kayousterhout/rename
Renamed StandaloneX to CoarseGrainedX.

(as suggested by @rxin here https://github.com/apache/incubator-spark/pull/14)

The previous names were confusing because the components weren't just
used in Standalone mode.  The scheduler used for Standalone
mode is called SparkDeploySchedulerBackend, so referring to the base class
as StandaloneSchedulerBackend was misleading.
2013-10-15 19:02:57 -07:00
Matei Zaharia 983b83f24d Merge pull request #61 from kayousterhout/daemon_thread
Unified daemon thread pools

As requested by @mateiz in an earlier pull request, this refactors various daemon thread pools to use a set of methods in utils.scala, and also changes the thread-pool-creation methods in utils.scala to use named thread pools for improved debugging.
2013-10-15 19:02:46 -07:00
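A minimal sketch of what a named daemon thread pool helper of this kind might look like (the object and method names are assumptions for illustration, not the exact utils.scala API):

```scala
import java.util.concurrent.{Executors, ExecutorService, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger

// Sketch of a named daemon thread pool factory, in the spirit of the
// refactoring described above. Names are illustrative assumptions.
object DaemonThreadUtils {
  def namedDaemonThreadFactory(prefix: String): ThreadFactory = new ThreadFactory {
    private val counter = new AtomicInteger(0)
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"$prefix-${counter.incrementAndGet()}")
      t.setDaemon(true)  // daemon threads do not block JVM shutdown
      t
    }
  }

  def newDaemonFixedThreadPool(numThreads: Int, prefix: String): ExecutorService =
    Executors.newFixedThreadPool(numThreads, namedDaemonThreadFactory(prefix))
}
```

Naming each pool's threads makes thread dumps and profiler output much easier to attribute, and marking them as daemons keeps stray pools from blocking JVM shutdown.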
Harvey Feng c4c76e37a7 Fix line length > 100 chars in SparkHadoopWriter 2013-10-15 18:35:59 -07:00
Harvey Feng 5b8083fee5 Make TaskContext's stageId publicly accessible. 2013-10-15 18:06:37 -07:00
Kay Ousterhout f95a2be045 Fixed build error after merging in master 2013-10-15 14:51:37 -07:00
Kay Ousterhout acc7638f7c Merge remote branch 'upstream/master' into rename 2013-10-15 14:43:56 -07:00
Kay Ousterhout 707ad8cc4f Unified daemon thread pools 2013-10-15 14:23:43 -07:00
Matei Zaharia 3249e0e90d Merge pull request #59 from rxin/warning
Bump up logging level to warning for failed tasks.
2013-10-15 14:12:33 -07:00
Shivaram Venkataraman 051cd960d9 Merge branch 'master' of https://github.com/apache/incubator-spark into sbt-assembly-deps 2013-10-15 13:26:40 -07:00
Reynold Xin 678dec6680 Merge pull request #58 from hsaputra/update-pom-asf
Update pom.xml to use version 13 of the ASF parent pom

Update pom.xml to use version 13 of the ASF parent pom.
Add mailingList element to pom.xml.
2013-10-15 10:51:46 -07:00
KarthikTunga 6c6b146fc2 Merge branch 'master' of https://github.com/apache/incubator-spark
Updating local branch
2013-10-15 00:46:35 -07:00
KarthikTunga d2c86e7188 SPARK-627 - reading --config argument 2013-10-15 00:35:44 -07:00
Reynold Xin f41feb7b33 Bump up logging level to warning for failed tasks. 2013-10-14 23:35:32 -07:00
Henry Saputra 3fed3e2283 Update pom.xml to use version 13 of the ASF parent pom and add mailingLists element. 2013-10-14 23:10:54 -07:00
Patrick Wendell e33b1839e2 Merge pull request #29 from rxin/kill
Job killing

Moving https://github.com/mesos/spark/pull/935 here

The high-level idea is to have an "interrupted" field in TaskContext, and a task should check that flag to determine if its execution should continue. For convenience, I provide an InterruptibleIterator which wraps around a normal iterator but checks for the interrupted flag (a sketch of this pattern follows this entry). I also provide an InterruptibleRDD that wraps around an existing RDD.

As part of this pull request, I added an AsyncRDDActions class that provides a number of RDD actions that return a FutureJob (extending scala.concurrent.Future). The FutureJob can be used to kill the job execution, or to wait until the job finishes.

This is NOT ready for merging yet. Remaining TODOs:

1. Add unit tests
2. Add job killing functionality for local scheduler (current job killing functionality only works in cluster scheduler)

-------------

Update on Oct 10, 2013:

This is ready!

Related future work:
- Figure out how to handle the job triggered by RangePartitioner (this one is tough; might become future work)
- Java API
- Python API
2013-10-14 22:25:47 -07:00
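A minimal sketch of the interruptible-iterator idea described in this pull request (TaskContextLike is a stand-in for the real TaskContext, and the sketch is not the merged code):

```scala
// Sketch only: wrap an iterator so that each hasNext call checks an
// "interrupted" flag and aborts the task instead of computing more elements.
class TaskContextLike {
  @volatile var interrupted: Boolean = false  // set by the scheduler when the job is killed
}

class InterruptibleIterator[T](context: TaskContextLike, delegate: Iterator[T])
  extends Iterator[T] {

  override def hasNext: Boolean = {
    if (context.interrupted) {
      throw new InterruptedException("task has been interrupted")
    }
    delegate.hasNext
  }

  override def next(): T = delegate.next()
}
```

Checking the flag in hasNext keeps the per-element overhead small while still letting a long-running task notice the kill promptly.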
Reynold Xin 9cd8786e4a Merge branch 'master' of github.com:apache/incubator-spark into kill
Conflicts:
	core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
2013-10-14 21:51:30 -07:00
Reynold Xin 3b11f43e36 Merge pull request #57 from aarondav/bid
Refactor BlockId into an actual type

Converts all of our BlockId strings into actual BlockId types. Here are some advantages of doing this now:

+ Type safety
+ Code clarity - it's now obvious what the key of a shuffle or rdd block is, for instance. Additionally, appearing in tuple/map type signatures is a big readability bonus. A Seq[(String, BlockStatus)] is not very clear. Further, we can now use more Scala features, like matching on BlockId types (see the sketch after this entry).
+ Explicit usage - we can now formally tell where various BlockIds are being used (without doing string searches); this makes updating current BlockIds a much clearer, compiler-supported process.
  (I'm looking at you, shuffle file consolidation.)
+ It will only get harder to make this change as time goes on.

Downside is, of course, that this is a very invasive change touching a lot of different files, which will inevitably lead to merge conflicts for many.
2013-10-14 14:20:01 -07:00
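A hedged sketch of the typed-BlockId idea described above (the case class fields and name formats here are illustrative guesses, not the merged definitions):

```scala
// Sketch only: block identifiers as a sealed type hierarchy instead of raw strings.
sealed abstract class BlockId {
  def name: String                 // the old string form, still useful for file names and logs
  override def toString: String = name
}

case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId {
  def name: String = s"rdd_${rddId}_$splitIndex"
}

case class ShuffleBlockId(shuffleId: Int, mapId: Int, reduceId: Int) extends BlockId {
  def name: String = s"shuffle_${shuffleId}_${mapId}_$reduceId"
}

object BlockIdOps {
  // Pattern matching makes call sites explicit about which block kinds they handle,
  // instead of parsing strings.
  def isShuffle(id: BlockId): Boolean = id match {
    case _: ShuffleBlockId => true
    case _                 => false
  }
}
```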
Aaron Davidson 4a45019fb0 Address Matei's comments 2013-10-14 00:24:17 -07:00
Aaron Davidson da896115ec Change BlockId filename to name + rest of Patrick's comments 2013-10-13 11:15:02 -07:00
Aaron Davidson d60352283c Add unit test and address rest of Reynold's comments 2013-10-12 22:45:15 -07:00