Tathagata Das
dff53d1b94
Merge branch 'mesos-master' into streaming
2013-02-24 12:17:22 -08:00
Matei Zaharia
3b9f929467
Merge pull request #468 from haitaoyao/master
...
support customized java options for Master, Worker, Executor, and Repl
2013-02-23 23:38:15 -08:00
Stephen Haberman
37c7a71f9c
Add subtract to JavaRDD, JavaDoubleRDD, and JavaPairRDD.
2013-02-24 00:27:53 -06:00
Stephen Haberman
f442e7d83c
Update for split->partition rename.
2013-02-24 00:27:14 -06:00
Stephen Haberman
cec87a0653
Merge branch 'master' into subtract
2013-02-23 23:27:55 -06:00
Tathagata Das
d853aa9658
Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.
2013-02-23 17:42:26 -08:00
Patrick Wendell
931f439be9
Responding to code review
2013-02-23 15:40:41 -08:00
Patrick Wendell
f51b0f93f2
Adding Java-accessible methods to Vector.scala
...
This is needed for the Strata machine learning tutorial (and
also is generally helpful).
2013-02-23 13:26:59 -08:00
Matei Zaharia
d942d39072
Handle exceptions in RecordReader.close() better (suggested by Jim Donahue)
2013-02-23 11:19:07 -08:00
Matei Zaharia
c89824046a
Merge pull request #490 from woggling/conn-death
...
Detect when SendingConnections disconnect even if we aren't sending to them
2013-02-22 22:58:19 -08:00
Charles Reiss
50cf8c8b79
Add fault tolerance test that uses replicated RDDs.
2013-02-22 16:11:53 -08:00
Charles Reiss
c8a7886921
Detect when SendingConnections drop by trying to read them.
...
Comment fix
2013-02-22 16:11:52 -08:00
Matei Zaharia
d4d7993bf5
Several fixes to the work to log when no resources can be used by a job.
...
Fixed some of the messages as well as code style.
2013-02-22 15:51:37 -08:00
Matei Zaharia
f33662c133
Merge remote-tracking branch 'pwendell/starvation-check'
...
Also fixed a bug where master was offering executors on dead workers
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
2013-02-22 15:27:41 -08:00
Matei Zaharia
7341de0d48
Merge pull request #475 from JoshRosen/spark-668
...
Remove hack workaround for SPARK-668
2013-02-22 14:56:18 -08:00
Patrick Wendell
f8c3a03d55
SPARK-702: Replace Function --> JFunction in JavaAPI Suite.
...
In a few places the Scala (rather than Java) function class is used.
2013-02-22 12:54:15 -08:00
Imran Rashid
0f37b43b40
make the ShuffleFetcher responsible for collecting shuffle metrics, which gives us metrics for CoGroupedRDD and ShuffledRDD
2013-02-21 16:56:28 -08:00
Imran Rashid
9230617f23
add cleanup iterator
2013-02-21 16:55:14 -08:00
Imran Rashid
81bd07da26
sparkListeners should be a val
2013-02-21 15:21:45 -08:00
Imran Rashid
796e934d31
add some docs & some cleanup
2013-02-21 15:19:34 -08:00
Imran Rashid
394d3acc3e
store taskInfo & metrics together in a tuple
2013-02-21 15:19:34 -08:00
Imran Rashid
7960927cf4
get rid of a bunch of boilerplate; more formatting happens in Listener, not StageInfo
2013-02-21 15:19:34 -08:00
Imran Rashid
d0bfac3eed
taskInfo tracks if a task is run on a preferred host
2013-02-21 15:19:34 -08:00
Imran Rashid
6f62a57858
add runtime breakdowns
2013-02-21 15:19:34 -08:00
Imran Rashid
176cb20703
add task result size; better formatting for time interval distributions; cleanup distribution formatting
2013-02-21 15:19:33 -08:00
Imran Rashid
f2fcabf2ea
add timing around parts of executor & track result size
2013-02-21 15:19:33 -08:00
Imran Rashid
ff127cfcd3
Merge branch 'master' into stageInfo
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/storage/BlockManager.scala
2013-02-21 15:16:21 -08:00
Imran Rashid
69f9a7035f
fully revert change to addOnCompleteCallback -- missed this in e9f53ec
2013-02-21 15:07:46 -08:00
Imran Rashid
baab23abdf
TaskContext does not hold a reference to Task; instead, it has a shared instance of TaskMetrics with Task
2013-02-21 14:13:01 -08:00
haitao.yao
8215b95547
Merge branch 'mesos'
2013-02-21 10:07:24 +08:00
Tathagata Das
334ab92441
Fixed bug in CheckpointSuite
2013-02-20 10:26:36 -08:00
Tathagata Das
1cb725e417
Merge branch 'mesos-master' into streaming
2013-02-20 09:55:35 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Matei Zaharia
05bc02e80b
Merge pull request #482 from woggling/shutdown-exceptions
...
Don't call System.exit over uncaught exceptions from shutdown hooks
2013-02-19 20:56:15 -08:00
haitao.yao
6a3d44c673
Merge branch 'mesos'
2013-02-20 10:23:58 +08:00
Charles Reiss
092c631fa8
Pull detection of being in a shutdown hook into utility function.
2013-02-19 17:49:55 -08:00
Reynold Xin
130f704baf
Added a method to create PartitionPruningRDD.
2013-02-19 16:03:52 -08:00
Charles Reiss
d0588bd6d7
Catch/log errors deleting temp dirs
2013-02-19 13:04:06 -08:00
Charles Reiss
687581c3ec
Paranoid uncaught exception handling for exceptions during shutdown
2013-02-19 13:03:02 -08:00
haitao.yao
7c129388fb
Merge branch 'mesos'
2013-02-19 11:22:24 +08:00
Matei Zaharia
7151e1e4c8
Rename "jobs" to "applications" in the standalone cluster
2013-02-17 23:23:08 -08:00
Matei Zaharia
06e5e6627f
Renamed "splits" to "partitions"
2013-02-17 22:13:26 -08:00
Matei Zaharia
340cc54e47
Merge pull request #471 from stephenh/parallelrdd
...
Move ParallelCollection into spark.rdd package.
2013-02-16 16:39:15 -08:00
Matei Zaharia
3260b6120e
Merge pull request #470 from stephenh/morek
...
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 16:38:38 -08:00
Stephen Haberman
924f47dd11
Add RDD.subtract.
...
Instead of reusing the cogroup primitive, this adds a SubtractedRDD
that knows it only needs to keep rdd1's values (per split) in memory.
2013-02-16 13:38:42 -06:00
Stephen Haberman
e7713adb99
Move ParallelCollection into spark.rdd package.
2013-02-16 13:20:48 -06:00
Stephen Haberman
ae2234687d
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 13:10:31 -06:00
Stephen Haberman
4328873294
Add assertion about dependencies.
2013-02-16 01:16:40 -06:00
Stephen Haberman
c34b8ad2c5
Avoid a shuffle if combineByKey is passed the same partitioner.
2013-02-16 00:54:03 -06:00
Stephen Haberman
4281e579c2
Update more javadocs.
2013-02-16 00:45:03 -06:00
Stephen Haberman
6a2d957843
Tweak test names.
2013-02-16 00:33:49 -06:00
Stephen Haberman
37397106ce
Remove fileServerSuite.txt.
2013-02-16 00:31:07 -06:00
Stephen Haberman
6cd68c31cb
Update default.parallelism docs, have StandaloneSchedulerBackend use it.
...
Only brand new RDDs (e.g. parallelize and makeRDD) now use default
parallelism, everything else uses their largest parent's partitioner
or partition size.
2013-02-16 00:29:11 -06:00
haitao.yao
a9cfac347a
Merge branch 'mesos'
2013-02-16 10:11:28 +08:00
Imran Rashid
bffee929ab
Merge branch 'master' into stageInfo
...
Conflicts:
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/storage/BlockManager.scala
2013-02-15 10:35:04 -08:00
Imran Rashid
893bad9089
use appid instead of frameworkid; simplify stupid condition
2013-02-13 20:30:21 -08:00
Imran Rashid
8f18e7e863
include jobid in Executor commandline args
2013-02-13 13:05:13 -08:00
Matei Zaharia
fd7e414bd0
Merge pull request #464 from pwendell/java-type-fix
...
SPARK-694: All references to [K, V] in JavaDStreamLike should be changed to [K2, V2]
2013-02-11 19:19:05 -08:00
Matei Zaharia
bfeed4725d
Merge pull request #465 from pwendell/java-sort-fix
...
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 18:23:12 -08:00
Patrick Wendell
21df6ffc13
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 17:43:26 -08:00
Matei Zaharia
ea08537143
Fixed an exponential recursion that could happen with doCheckpoint due to lack of memoization
2013-02-11 13:23:50 -08:00
Josh Rosen
e9fb25426e
Remove hack workaround for SPARK-668.
...
Renaming the type parameters solves this problem (see SPARK-694).
I tried this fix earlier, but it didn't work because I didn't run
`sbt/sbt clean` first.
2013-02-11 11:19:20 -08:00
Patrick Wendell
f0b68c623c
Initial cut at replacing K, V in Java files
2013-02-11 10:03:37 -08:00
Imran Rashid
e9f53ec0ea
undo change to onCompleteCallbacks
2013-02-11 09:36:49 -08:00
Matei Zaharia
da8afbc77e
Some bug and formatting fixes to FT
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:43:38 -08:00
root
1b47fa2752
Detect hard crashes of workers using a heartbeat mechanism.
...
Also fixes some issues in the rest of the code with detecting workers this way.
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:28:28 -08:00
Matei Zaharia
8c66c49962
Tweak web UI so that people don't get confused about master URL format
...
Conflicts:
core/src/main/twirl/spark/deploy/master/index.scala.html
core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Imran Rashid
d9461b15d3
cleanup a bunch of imports
2013-02-10 21:41:40 -08:00
Tathagata Das
16baea62bc
Fixed bug in CheckpointRDD to prevent exception when the original RDD had zero splits.
2013-02-10 19:14:49 -08:00
Imran Rashid
383af599bb
SparkContext.addSparkListener; "std" listener in StatsReportListener
2013-02-10 14:19:37 -08:00
Imran Rashid
b7d9e24394
use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver
2013-02-10 14:18:52 -08:00
Stephen Haberman
680f42e6cd
Change defaultPartitioner to use upstream split size.
...
Previously it used the SparkContext.defaultParallelism, which occasionally
ended up being a very bad guess. Looking at upstream RDDs seems to make
better use of the context.
Also sorted the upstream RDDs by partition size first, as if we have
a hugely-partitioned RDD and tiny-partitioned RDD, it is unlikely
we want the resulting RDD to be tiny-partitioned.
2013-02-10 02:27:03 -06:00
Patrick Wendell
2ed791fd7f
Minor fixes
2013-02-09 22:00:38 -08:00
Patrick Wendell
1859c9f93c
Changing to use Timer based on code review
2013-02-09 21:55:17 -08:00
Matei Zaharia
ccb1ca4a23
Merge pull request #448 from squito/fetch_maxBytesInFlight
...
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia
f750daa510
Merge pull request #452 from stephenh/misc
...
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman
4619ee0787
Move JavaRDDLike.coalesce into the right places.
2013-02-09 20:05:42 -06:00
Stephen Haberman
921be76533
Use stubs instead of mocks for DAGSchedulerSuite.
2013-02-09 16:42:18 -06:00
Stephen Haberman
fb7599870f
Fix JavaRDDLike.coalesce return type.
2013-02-09 16:10:52 -06:00
Stephen Haberman
2a18cd826c
Add back return types.
2013-02-09 10:12:04 -06:00
Stephen Haberman
da52b16b38
Remove RDD.coalesce default arguments.
2013-02-09 10:11:54 -06:00
Imran Rashid
04e828f7c1
general fixes to Distribution, plus some tests
2013-02-08 19:07:36 -08:00
Mark Hamstra
b8863a79d3
Merge branch 'master' of https://github.com/mesos/spark into commutative
...
Conflicts:
core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra
934a53c8b6
Change docs on 'reduce' since the merging of local reduces no longer preserves ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Stephen Haberman
a9c8d53cfa
Clean up RDDs, mainly to use getSplits.
...
Also made sure clearDependencies() was calling super, to ensure
the getSplits/getDependencies vars in the RDD base class get
cleaned up.
2013-02-05 22:16:59 -06:00
Stephen Haberman
f4d43cb43e
Remove unneeded zipWithIndex.
...
Also rename r->rdd and remove unneeded extra type info.
2013-02-05 21:26:45 -06:00
Stephen Haberman
f2bc748013
Add RDD.coalesce.
2013-02-05 21:23:36 -06:00
Stephen Haberman
67df7f2fa2
Add private, minor formatting.
2013-02-05 21:08:21 -06:00
Imran Rashid
379564c7e0
setup plumbing to get task metrics; lots of unfinished parts, but basic flow in place
2013-02-05 18:30:21 -08:00
Matei Zaharia
9cfa068379
Merge pull request #450 from stephenh/inlinemergepair
...
Inline mergePair to look more like the narrow dep branch.
2013-02-05 18:28:44 -08:00
Stephen Haberman
870b2aaf5d
Merge branch 'master' into fixdeathpactexception
...
Conflicts:
core/src/main/scala/spark/deploy/worker/Worker.scala
2013-02-05 20:27:09 -06:00
Matei Zaharia
a4611d66f0
Merge pull request #449 from stephenh/longerdriversuite
...
Increase DriverSuite timeout.
2013-02-05 17:58:22 -08:00
Stephen Haberman
0e19093fd8
Handle Terminated to avoid endless DeathPactExceptions.
...
Credit to Roland Kuhn, Akka's tech lead, for pointing out this
very obvious fix, but StandaloneExecutorBackend.preStart's
catch block would never (ever) get hit, because all of the
operations in preStart are async.
So, the System.exit in the catch block was skipped, and instead
Akka was sending Terminated messages which, since we didn't
handle, it turned into DeathPactException, which started
a postRestart/preStart infinite loop.
2013-02-05 18:58:00 -06:00
Stephen Haberman
1ba3393ceb
Increase DriverSuite timeout.
2013-02-05 17:56:50 -06:00
Stephen Haberman
8bd0e888f3
Inline mergePair to look more like the narrow dep branch.
...
No functionality changes, I think this is just more consistent
given mergePair isn't called multiple times/recursive.
Also added a comment to explain the usual case of having two parent RDDs.
2013-02-05 17:50:25 -06:00
Imran Rashid
1704b124d8
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-05 14:33:52 -08:00
Imran Rashid
cfab1a3528
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-05 14:31:46 -08:00
Imran Rashid
696e4b2167
track remoteFetchTime
2013-02-05 14:29:16 -08:00
Imran Rashid
b29f9cc978
BlockManager.getMultiple returns a custom iterator, to enable tracking of shuffle performance
2013-02-05 14:00:44 -08:00
Imran Rashid
e319ac74c1
cogrouped RDD stores the amount of time taken to read shuffle data in each task
2013-02-05 10:18:16 -08:00
Imran Rashid
295b534398
task context keeps a handle on Task -- giant hack, temporary for tracking shuffle times & amount
2013-02-05 10:18:16 -08:00
Imran Rashid
9df7e2ae55
Shuffle Fetchers use a timed iterator
2013-02-05 10:18:16 -08:00
Imran Rashid
1ad77c4766
add TimedIterator
2013-02-05 10:18:15 -08:00
Imran Rashid
843084d69d
track total bytes written by ShuffleMapTasks
2013-02-05 10:18:15 -08:00
haitao.yao
f609182e5b
Merge branch 'mesos'
2013-02-05 14:09:45 +08:00
Imran Rashid
b430d2359d
Merge branch 'master' into stageInfo
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/scheduler/local/LocalScheduler.scala
2013-02-04 21:40:44 -08:00
Matei Zaharia
f6ec547ea7
Small fix to test for distinct
2013-02-04 13:14:54 -08:00
Matei Zaharia
aa4ee1e9e5
Fix failing test
2013-02-04 11:06:31 -08:00
Matei Zaharia
f7b4e428be
Merge pull request #445 from JoshRosen/pyspark_fixes
...
Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take()
2013-02-03 21:36:36 -08:00
haitao.yao
faa4d9e31f
Merge branch 'mesos'
2013-02-04 11:40:15 +08:00
Patrick Wendell
b14322956c
Starvation check in Standalone scheduler
2013-02-03 12:45:10 -08:00
Patrick Wendell
667860448a
Starvation check in ClusterScheduler
2013-02-03 12:45:04 -08:00
Matei Zaharia
3bfaf3ab1d
Merge pull request #379 from stephenh/sparkmem
...
Add spark.executor.memory to differentiate executor memory from spark-shell
2013-02-02 23:58:23 -08:00
Matei Zaharia
88ee6163a1
Merge pull request #422 from squito/blockmanager_info
...
RDDInfo available from SparkContext
2013-02-02 23:44:13 -08:00
Matei Zaharia
cd4ca93679
Merge pull request #436 from stephenh/removeextraloop
...
Once we find a split with no block, we don't have to look for more.
2013-02-02 23:39:28 -08:00
Matei Zaharia
d5daaab381
Merge pull request #442 from stephenh/fixsystemnames
...
Fix createActorSystem not actually using the systemName parameter.
2013-02-02 23:38:46 -08:00
Matei Zaharia
9163c3705d
Formatting
2013-02-02 23:34:47 -08:00
Josh Rosen
8fbd5380b7
Fetch fewer objects in PySpark's take() method.
2013-02-03 06:44:49 +00:00
Matei Zaharia
34a7bcdb3a
Formatting
2013-02-02 19:40:30 -08:00
Stephen Haberman
7aba123f0c
Further simplify checking for Nil.
2013-02-02 13:53:28 -06:00
Charles Reiss
6107957962
Merge remote-tracking branch 'base/master' into dag-sched-tests
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
2013-02-02 00:33:30 -08:00
Stephen Haberman
cae8a6795c
Fix dangling old variable names.
2013-02-02 02:15:39 -06:00
Stephen Haberman
696eec32c9
Move executorMemory up into SchedulerBackend.
2013-02-02 02:03:26 -06:00
Stephen Haberman
103c375ba0
Merge branch 'master' into sparkmem
2013-02-02 01:57:18 -06:00
Stephen Haberman
28e0cb9f31
Fix createActorSystem not actually using the systemName parameter.
...
This meant all system names were "spark", which worked, but didn't
lead to the most intuitive log output.
This fixes createActorSystem to use the passed system name, and
refactors Master/Worker to encapsulate their system/actor names
instead of having the clients guess at them.
Note that the driver system name, "spark", is left as is, and is
still repeated a few times, but that seems like a separate issue.
2013-02-02 01:11:37 -06:00
Charles Reiss
1fd5ee323d
Code review changes: add sc.stop; style of multiline comments; parens on procedure calls.
2013-02-01 22:33:38 -08:00
Matei Zaharia
ae26911ec0
Add back test for distinct without parens
2013-02-01 21:07:24 -08:00
Stephen Haberman
12c1eb4756
Reduce the amount of duplicate logging Akka does to stdout.
...
Given we have Akka logging go through SLF4j to log4j, we don't need
all the extra noise of Akka's stdout logger that is supposedly only
used during Akka init time but seems to continue logging lots of
noisy network events that we either don't care about or are in the
log4j logs anyway.
See:
http://doc.akka.io/docs/akka/2.0/general/configuration.html
# Log level for the very basic logger activated during AkkaApplication startup
# Options: ERROR, WARNING, INFO, DEBUG
# stdout-loglevel = "WARNING"
2013-02-01 21:21:44 -06:00
Matei Zaharia
8b3041c723
Reduced the memory usage of reduce and similar operations
...
These operations used to wait for all the results to be available in an
array on the driver program before merging them. They now merge values
incrementally as they arrive.
2013-02-01 15:38:42 -08:00
Matei Zaharia
4529876db0
Merge branch 'master' of github.com:mesos/spark
2013-02-01 14:07:38 -08:00
Matei Zaharia
9970926ede
formatting
2013-02-01 14:07:34 -08:00
Matei Zaharia
79c24abe4c
Merge pull request #432 from stephenh/moreprivacy
...
Add more private declarations.
2013-02-01 14:06:55 -08:00
Matei Zaharia
de340ddf0b
Merge pull request #437 from stephenh/cancelmetacleaner
...
Stop BlockManagers metadataCleaner.
2013-02-01 12:59:25 -08:00
Imran Rashid
c6190067ae
remove unneeded (and unused) filter on block info
2013-02-01 09:55:25 -08:00
Stephen Haberman
59c57e48df
Stop BlockManagers metadataCleaner.
2013-02-01 10:34:02 -06:00
Matei Zaharia
571af31304
Merge pull request #433 from rxin/master
...
Changed PartitionPruningRDD's split to make sure it returns the correct split index.
2013-02-01 00:32:41 -08:00
Imran Rashid
8a0a5ed533
track total partitions, in addition to cached partitions; use scala string formatting
2013-02-01 00:23:38 -08:00
Imran Rashid
f127f2ae76
fixup merge (master -> driver renaming)
2013-02-01 00:20:49 -08:00
Reynold Xin
f9af9cee6f
Moved PruneDependency into PartitionPruningRDD.scala.
2013-02-01 00:02:46 -08:00
haitao.yao
b57570fd12
Merge branch 'mesos'
2013-02-01 14:06:45 +08:00
Matei Zaharia
7e2e046e37
Merge pull request #434 from pwendell/python-exceptions
...
SPARK-673: Capture and re-throw Python exceptions
2013-01-31 21:58:26 -08:00
Patrick Wendell
39ab83e957
Small fix from last commit
2013-01-31 21:52:52 -08:00
Patrick Wendell
c33f0ef41a
Some style cleanup
2013-01-31 21:50:02 -08:00
Patrick Wendell
3446d5c8d6
SPARK-673: Capture and re-throw Python exceptions
...
This patch alters the Python <-> executor protocol to pass on
exception data when they occur in user Python code.
2013-01-31 18:06:11 -08:00
Reynold Xin
6289d9654e
Removed the TODO comment from PartitionPruningRDD.
2013-01-31 17:49:36 -08:00
Reynold Xin
5b0fc265c2
Changed PartitionPruningRDD's split to make sure it returns the correct split index.
2013-01-31 17:48:39 -08:00
Stephen Haberman
782187c210
Once we find a split with no block, we don't have to look for more.
2013-01-31 18:27:25 -06:00
Stephen Haberman
418e36caa8
Add more private declarations.
2013-01-31 17:18:33 -06:00
Mikhail Bautin
fe3eceab57
Remove activation of profiles by default
...
See the discussion at https://github.com/mesos/spark/pull/355 for why
default profile activation is a problem.
2013-01-31 13:30:41 -08:00
haitao.yao
3190483b98
bug fix for javadoc
2013-01-31 14:23:51 +08:00
Imran Rashid
02a6761589
Merge branch 'master' into blockmanager_info
...
Conflicts:
core/src/main/scala/spark/storage/BlockManagerMaster.scala
2013-01-30 18:52:35 -08:00
Imran Rashid
c1df24d085
rename Slaves --> Executor
2013-01-30 18:51:14 -08:00
Matei Zaharia
d12330bd2c
Merge pull request #426 from woggling/conn-manager-ips
...
Remember ConnectionManagerId used to initiate SendingConnections
2013-01-30 15:02:53 -08:00
Matei Zaharia
612a9fee71
Merge pull request #428 from woggling/mesos-exec-id
...
Make ExecutorIDs include SlaveIDs when running Mesos
2013-01-30 15:01:46 -08:00
Stephen Haberman
871476d506
Include message and exitStatus if available.
2013-01-30 16:56:46 -06:00
Charles Reiss
252845d304
Remove remnants of attempt to use slaveId-executorId in MesosExecutorBackend
2013-01-30 10:38:06 -08:00
Charles Reiss
f7de6978c1
Use Mesos ExecutorIDs to hold SlaveIDs. Then we can safely use the Mesos ExecutorID as a Spark ExecutorID.
2013-01-30 09:38:57 -08:00
Charles Reiss
7f51458774
Comment at top of DAGSchedulerSuite
2013-01-30 09:34:53 -08:00
Charles Reiss
9c0bae75ad
Change DAGSchedulerSuite to run DAGScheduler in the same Thread.
2013-01-30 09:22:07 -08:00
Charles Reiss
178b89204c
Refactor DAGScheduler more to allow testing without a separate thread.
2013-01-30 09:19:55 -08:00
Charles Reiss
4bf3d7ea12
Clear spark.master.port to cleanup for other tests
2013-01-29 19:05:58 -08:00
Charles Reiss
9eac7d01f0
Add DAGScheduler tests.
2013-01-29 18:55:43 -08:00
Charles Reiss
a3d14c0404
Refactoring to DAGScheduler to aid testing
2013-01-29 18:55:42 -08:00
Charles Reiss
16a0789e10
Remember ConnectionManagerId used to initiate SendingConnections.
...
This prevents ConnectionManager from getting confused if a machine
has multiple host names and the one getHostName() finds happens
not to be the one that was passed from, e.g., the BlockManagerMaster.
2013-01-29 18:13:59 -08:00
Matei Zaharia
d54b10b6ad
Merge remote-tracking branch 'stephenh/removefailedjob'
...
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
2013-01-29 18:12:29 -08:00
Matei Zaharia
ccb67ff2ca
Merge pull request #425 from stephenh/toDebugString
...
Add RDD.toDebugString.
2013-01-29 10:44:18 -08:00
Matei Zaharia
9ae11603b4
Merge pull request #415 from stephenh/driver
...
Replace old 'master' term with 'driver'.
2013-01-29 10:41:42 -08:00
Charles Reiss
a34096a76d
Add easymock to POMs
2013-01-29 10:04:33 -08:00
Imran Rashid
b92259ba57
Merge branch 'master' into blockmanager_info
2013-01-29 09:45:10 -08:00
Matei Zaharia
64ba6a8c2c
Simplify checkpointing code and RDD class a little:
...
- RDD's getDependencies and getSplits methods are now guaranteed to be
called only once, so subclasses can safely do computation in there
without worrying about caching the results.
- The management of a "splits_" variable that is cleared out when we
checkpoint an RDD is now done in the RDD class.
- A few of the RDD subclasses are simpler.
- CheckpointRDD's compute() method no longer assumes that it is given a
CheckpointRDDSplit -- it can work just as well on a split from the
original RDD, because it only looks at its index. This is important
because things like UnionRDD and ZippedRDD remember the parent's
splits as part of their own and wouldn't work on checkpointed parents.
- RDD.iterator can now reuse cached data if an RDD is computed before it
is checkpointed. It seems like it wouldn't do this before (it always
called iterator() on the CheckpointRDD, which read from HDFS).
2013-01-28 22:30:12 -08:00
Stephen Haberman
cbf72bffa5
Include name, if set, in RDD.toString().
2013-01-29 00:20:36 -06:00
Stephen Haberman
3cda14af3f
Add number of splits.
2013-01-29 00:12:31 -06:00
Matei Zaharia
a1ecec8d79
Merge branch 'master' of github.com:mesos/spark
2013-01-28 22:08:44 -08:00
Stephen Haberman
951cfd9ba2
Add JavaRDDLike.toDebugString().
2013-01-29 00:02:17 -06:00
Matei Zaharia
f6eb1f0825
Merge pull request #413 from pwendell/stage-logging
...
SPARK-658: Adding logging of stage duration
2013-01-28 22:01:52 -08:00
Stephen Haberman
b45857c965
Add RDD.toDebugString.
...
Original idea by Nathan Kronenfeld.
2013-01-28 23:56:56 -06:00
Patrick Wendell
7ee824e42e
Units from ms -> s
2013-01-28 21:48:32 -08:00
Stephen Haberman
13368818af
Merge branch 'master' into driver
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
core/src/main/scala/spark/deploy/LocalSparkCluster.scala
core/src/main/scala/spark/executor/StandaloneExecutorBackend.scala
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/ThreadingTest.scala
core/src/test/scala/spark/MapOutputTrackerSuite.scala
2013-01-28 23:30:24 -06:00
Matei Zaharia
dda2ce017c
Merge pull request #424 from pwendell/logging-cleanup
...
Some DEBUG-level log cleanup.
2013-01-28 21:18:54 -08:00
Patrick Wendell
1f9b486a8b
Some DEBUG-level log cleanup.
...
A few changes to make the DEBUG-level logs less
noisy and more readable.
- Moved a few very frequent messages to Trace
- Changed some BlockManager log messages to make them
more understandable
SPARK-666 #resolve
2013-01-28 20:29:35 -08:00
Imran Rashid
efff7bfb33
add long and float accumulatorparams
2013-01-28 20:23:11 -08:00
Imran Rashid
cec9c768c2
convenient name available in StageInfo
2013-01-28 20:09:41 -08:00
Imran Rashid
01d77f329f
expose stageInfo in SparkContext
2013-01-28 20:09:40 -08:00
Imran Rashid
38b83bc66b
can get task runtime summary from task info
2013-01-28 20:09:40 -08:00
Imran Rashid
b88daee916
simple util to summarize distributions
2013-01-28 20:09:40 -08:00
Imran Rashid
b14841455c
track task completion in DAGScheduler, and send a stageCompleted event with taskInfo to SparkListeners
2013-01-28 20:09:40 -08:00
Imran Rashid
0f22c4207f
better formatting for RDDInfo
2013-01-28 20:07:53 -08:00
Imran Rashid
a423ee546c
expose RDD & storage info directly via SparkContext
2013-01-28 20:07:53 -08:00
Patrick Wendell
501433f1d5
Making submission time a field
2013-01-28 10:45:57 -08:00
Patrick Wendell
c423be7d8e
Renaming stage finished function
2013-01-28 10:45:57 -08:00
Patrick Wendell
07f568e1bf
SPARK-658: Adding logging of stage duration
2013-01-28 10:45:57 -08:00
Matei Zaharia
286f8f876f
Change time unit in MetadataCleaner to seconds
2013-01-28 01:29:27 -08:00
Matei Zaharia
f03d9760fd
Clean up BlockManagerUI a little (make it not be an object, merge with Directives, and bind to a random port)
2013-01-27 23:56:14 -08:00
Matei Zaharia
909850729e
Rename more things from slave to executor
2013-01-27 23:17:20 -08:00
Matei Zaharia
44b4a0f88f
Track workers by executor ID instead of hostname to allow multiple executors per machine and remove the need for multiple IP addresses in unit tests.
2013-01-27 19:23:49 -08:00
Matei Zaharia
6ad8540b40
Merge pull request #401 from squito/blockmanager_ui
...
Blockmanager ui
2013-01-27 15:51:08 -08:00
Matei Zaharia
49f6472c0f
Merge pull request #418 from woggling/reregister-deadlock
...
Fix BlockManager reregistration deadlock; do BlockManager reregistration more asynchronously
2013-01-26 18:59:02 -08:00
Charles Reiss
58fc6b2bed
Handle duplicate registrations better.
2013-01-26 18:30:44 -08:00
Charles Reiss
ad4232b4da
Fix deadlock in BlockManager reregistration triggered by failed updates.
2013-01-26 18:30:38 -08:00
Josh Rosen
d49cf0e587
Fix JavaRDDLike.flatMap(PairFlatMapFunction) (SPARK-668).
...
This workaround is easier than rewriting JavaRDDLike in Java.
2013-01-26 16:13:18 -08:00