Stephen Haberman
418e36caa8
Add more private declarations.
2013-01-31 17:18:33 -06:00
Matei Zaharia
55327a283e
Merge pull request #430 from pwendell/pyspark-guide
...
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell
3f945e3b83
Make module help available in python shell.
...
Also, adds a line in doc explaining how to use.
2013-01-30 15:04:06 -08:00
Patrick Wendell
58a7d320d7
Include packaging and launching pyspark in guide.
...
It's nicer if all the commands you need are made explicit.
2013-01-30 15:04:02 -08:00
Matei Zaharia
d12330bd2c
Merge pull request #426 from woggling/conn-manager-ips
...
Remember ConnectionManagerId used to initiate SendingConnections
2013-01-30 15:02:53 -08:00
Matei Zaharia
612a9fee71
Merge pull request #428 from woggling/mesos-exec-id
...
Make ExecutorIDs include SlaveIDs when running Mesos
2013-01-30 15:01:46 -08:00
Matei Zaharia
dfb721b970
Merge pull request #429 from stephenh/includemessage
...
Include message and exitStatus if available.
2013-01-30 15:01:24 -08:00
Stephen Haberman
871476d506
Include message and exitStatus if available.
2013-01-30 16:56:46 -06:00
Charles Reiss
252845d304
Remove remnants of attempt to use slaveId-executorId in MesosExecutorBackend
2013-01-30 10:38:06 -08:00
Charles Reiss
f7de6978c1
Use Mesos ExecutorIDs to hold SlaveIDs. Then we can safely use the Mesos ExecutorID as a Spark ExecutorID.
2013-01-30 09:38:57 -08:00
Charles Reiss
16a0789e10
Remember ConnectionManagerId used to initiate SendingConnections.
...
This prevents ConnectionManager from getting confused if a machine
has multiple host names and the one getHostName() finds happens
not to be the one that was passed from, e.g., the BlockManagerMaster.
2013-01-29 18:13:59 -08:00
Matei Zaharia
d54b10b6ad
Merge remote-tracking branch 'stephenh/removefailedjob'
...
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
2013-01-29 18:12:29 -08:00
Matei Zaharia
ccb67ff2ca
Merge pull request #425 from stephenh/toDebugString
...
Add RDD.toDebugString.
2013-01-29 10:44:18 -08:00
Matei Zaharia
9ae11603b4
Merge pull request #415 from stephenh/driver
...
Replace old 'master' term with 'driver'.
2013-01-29 10:41:42 -08:00
Matei Zaharia
64ba6a8c2c
Simplify checkpointing code and RDD class a little:
...
- RDD's getDependencies and getSplits methods are now guaranteed to be
called only once, so subclasses can safely do computation in there
without worrying about caching the results.
- The management of a "splits_" variable that is cleared out when we
checkpoint an RDD is now done in the RDD class.
- A few of the RDD subclasses are simpler.
- CheckpointRDD's compute() method no longer assumes that it is given a
CheckpointRDDSplit -- it can work just as well on a split from the
original RDD, because it only looks at its index. This is important
because things like UnionRDD and ZippedRDD remember the parent's
splits as part of their own and wouldn't work on checkpointed parents.
- RDD.iterator can now reuse cached data if an RDD is computed before it
is checkpointed. It seems like it wouldn't do this before (it always
called iterator() on the CheckpointRDD, which read from HDFS).
2013-01-28 22:30:12 -08:00
Matei Zaharia
b29599e5cf
Fix code that depended on metadata cleaner interval being in minutes
2013-01-28 22:24:47 -08:00
Stephen Haberman
cbf72bffa5
Include name, if set, in RDD.toString().
2013-01-29 00:20:36 -06:00
Stephen Haberman
3cda14af3f
Add number of splits.
2013-01-29 00:12:31 -06:00
Matei Zaharia
a1ecec8d79
Merge branch 'master' of github.com:mesos/spark
2013-01-28 22:08:44 -08:00
Stephen Haberman
951cfd9ba2
Add JavaRDDLike.toDebugString().
2013-01-29 00:02:17 -06:00
Matei Zaharia
f6eb1f0825
Merge pull request #413 from pwendell/stage-logging
...
SPARK-658: Adding logging of stage duration
2013-01-28 22:01:52 -08:00
Stephen Haberman
b45857c965
Add RDD.toDebugString.
...
Original idea by Nathan Kronenfeld.
2013-01-28 23:56:56 -06:00
Patrick Wendell
7ee824e42e
Units from ms -> s
2013-01-28 21:48:32 -08:00
Stephen Haberman
13368818af
Merge branch 'master' into driver
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
core/src/main/scala/spark/deploy/LocalSparkCluster.scala
core/src/main/scala/spark/executor/StandaloneExecutorBackend.scala
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/ThreadingTest.scala
core/src/test/scala/spark/MapOutputTrackerSuite.scala
2013-01-28 23:30:24 -06:00
Matei Zaharia
dda2ce017c
Merge pull request #424 from pwendell/logging-cleanup
...
Some DEBUG-level log cleanup.
2013-01-28 21:18:54 -08:00
Matei Zaharia
8160f03ac4
Merge pull request #423 from squito/long_float_accums
...
add long and float accumulatorparams
2013-01-28 21:18:01 -08:00
Patrick Wendell
1f9b486a8b
Some DEBUG-level log cleanup.
...
A few changes to make the DEBUG-level logs less
noisy and more readable.
- Moved a few very frequent messages to Trace
- Changed some BlockManager log messages to make them
more understandable
SPARK-666 #resolve
2013-01-28 20:29:35 -08:00
Imran Rashid
efff7bfb33
add long and float accumulatorparams
2013-01-28 20:23:11 -08:00
Patrick Wendell
501433f1d5
Making submission time a field
2013-01-28 10:45:57 -08:00
Patrick Wendell
c423be7d8e
Renaming stage finished function
2013-01-28 10:45:57 -08:00
Patrick Wendell
07f568e1bf
SPARK-658: Adding logging of stage duration
2013-01-28 10:45:57 -08:00
Matei Zaharia
286f8f876f
Change time unit in MetadataCleaner to seconds
2013-01-28 01:29:27 -08:00
Matei Zaharia
f03d9760fd
Clean up BlockManagerUI a little (make it not be an object, merge with Directives, and bind to a random port)
2013-01-27 23:56:14 -08:00
Matei Zaharia
909850729e
Rename more things from slave to executor
2013-01-27 23:17:20 -08:00
Matei Zaharia
44b4a0f88f
Track workers by executor ID instead of hostname to allow multiple executors per machine and remove the need for multiple IP addresses in unit tests.
2013-01-27 19:23:49 -08:00
Matei Zaharia
b9e2d9efec
Merge pull request #419 from shivaram/ec2-ip-change
...
Detect whether we run on EC2 using ec2-metadata as well
2013-01-27 18:41:11 -08:00
Matei Zaharia
6ad8540b40
Merge pull request #401 from squito/blockmanager_ui
...
Blockmanager ui
2013-01-27 15:51:08 -08:00
Shivaram Venkataraman
717b221cca
Detect whether we run on EC2 using ec2-metadata as well
2013-01-26 23:03:11 -08:00
Matei Zaharia
49f6472c0f
Merge pull request #418 from woggling/reregister-deadlock
...
Fix BlockManager reregistration deadlock; do BlockManager reregistration more asynchronously
2013-01-26 18:59:02 -08:00
Charles Reiss
58fc6b2bed
Handle duplicate registrations better.
2013-01-26 18:30:44 -08:00
Charles Reiss
ad4232b4da
Fix deadlock in BlockManager reregistration triggered by failed updates.
2013-01-26 18:30:38 -08:00
Matei Zaharia
ec2dadb521
Merge pull request #417 from JoshRosen/spark-668
...
Fix JavaRDDLike.flatMap(PairFlatMapFunction) (SPARK-668).
2013-01-26 16:20:57 -08:00
Josh Rosen
d49cf0e587
Fix JavaRDDLike.flatMap(PairFlatMapFunction) (SPARK-668).
...
This workaround is easier than rewriting JavaRDDLike in Java.
2013-01-26 16:13:18 -08:00
Imran Rashid
49c05608f5
add metadatacleaner for persistentRdd map
2013-01-25 17:04:16 -08:00
Matei Zaharia
2435b7b5b7
Merge pull request #416 from stephenh/morefinally
...
Call executeOnCompleteCallbacks in more finally blocks.
2013-01-25 15:33:26 -08:00
Stephen Haberman
8efbda0b17
Call executeOnCompleteCallbacks in more finally blocks.
2013-01-25 14:55:33 -06:00
Imran Rashid
a1d9d1767d
fixup 1cadaa1, changed api of map
2013-01-25 10:05:26 -08:00
Imran Rashid
1cadaa164e
switch to TimeStampedHashMap for storing persistent Rdds
2013-01-25 09:30:21 -08:00
Imran Rashid
539491bbc3
code reformatting
2013-01-25 09:29:59 -08:00
Stephen Haberman
7dfb82a992
Replace old 'master' term with 'driver'.
2013-01-25 11:03:00 -06:00