jerryshao
f3dbe6b215
Fix removed block zero size log reporting
2013-08-30 09:39:01 +08:00
Patrick Wendell
abdbacf252
Merge pull request #871 from pwendell/expose-local
...
Expose `isLocal` in SparkContext.
2013-08-28 21:11:31 -07:00
Patrick Wendell
30d2421112
Make local variable public
2013-08-28 19:53:31 -07:00
Matei Zaharia
baa84e7e4c
Merge pull request #865 from tgravescs/fixtmpdir
...
Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves
aac1214ee4
Change Executor to only look at the env variable SPARK_YARN_MODE
2013-08-28 13:26:26 -05:00
Y.CORP.YAHOO.COM\tgraves
3f206bf0b5
Updated based on review comments.
2013-08-27 14:34:27 -05:00
Y.CORP.YAHOO.COM\tgraves
cf52a3cba6
Allow for Executors to have different directories than the Spark Master for Yarn
2013-08-27 11:00:21 -05:00
Reynold Xin
a77e0abb96
Added worker state to the cluster master JSON ui.
2013-08-26 11:21:03 -07:00
Reynold Xin
9db1e50344
Revert "Merge pull request #841 from rxin/json"
...
This reverts commit 1fb1b09928, reversing changes made to c69c48947d.
2013-08-26 11:05:14 -07:00
Matei Zaharia
c2d00f12e2
Merge pull request #832 from alig/coalesce
...
Coalesced RDD with locality
2013-08-22 10:13:03 -07:00
Mark Hamstra
5eea613ec0
Removed meaningless types
2013-08-20 16:49:18 -07:00
Ali Ghodsi
f20ed14e87
Merged in from upstream to use TaskLocation instead of strings
2013-08-20 16:21:43 -07:00
Ali Ghodsi
5cd21c4195
added curly braces to make the code more consistent
2013-08-20 16:16:05 -07:00
Ali Ghodsi
db4bc55bef
indent
2013-08-20 16:16:05 -07:00
Ali Ghodsi
7b123b3126
Simpler code
2013-08-20 16:16:05 -07:00
Ali Ghodsi
9192c358e4
simpler code
2013-08-20 16:16:05 -07:00
Ali Ghodsi
a75a64eade
Fixed almost all of Matei's feedback
2013-08-20 16:16:05 -07:00
Ali Ghodsi
f1c853d76d
fixed Matei's comments
2013-08-20 16:16:04 -07:00
Ali Ghodsi
890ea6ba79
making CoalescedRDDPartition public
2013-08-20 16:16:04 -07:00
Ali Ghodsi
b69e7166ba
Coalescer now uses current preferred locations for derived RDDs. Made run() in DAGScheduler thread safe and added a method to be able to ask it for preferred locations. Added a similar method that wraps the former inside SparkContext.
2013-08-20 16:16:04 -07:00
Ali Ghodsi
abcefb3858
fixed matei's comments
2013-08-20 16:13:37 -07:00
Ali Ghodsi
35537e6341
Made a function object that returns the coalesced groups
2013-08-20 16:13:37 -07:00
Ali Ghodsi
339598c080
several of Reynold's suggestions implemented
2013-08-20 16:13:37 -07:00
Ali Ghodsi
02d6464f2f
space removed
2013-08-20 16:13:37 -07:00
Ali Ghodsi
4f99be1ffd
use count rather than foreach
2013-08-20 16:13:37 -07:00
Ali Ghodsi
f67753cdfc
made preferredLocation a val of the surrounding case class
2013-08-20 16:13:37 -07:00
Ali Ghodsi
f24861b60a
Fix bug in tests
2013-08-20 16:13:36 -07:00
Ali Ghodsi
f6e47e8b51
Renamed split to partition
2013-08-20 16:13:36 -07:00
Ali Ghodsi
937f72feb8
word wrap before 100 chars per line
2013-08-20 16:13:36 -07:00
Ali Ghodsi
c4d59910b1
added goals inline as comment
2013-08-20 16:13:36 -07:00
Ali Ghodsi
7a2a33e32d
Large scale load and locality tests for the coalesced partitions added
2013-08-20 16:13:36 -07:00
Ali Ghodsi
66edf854aa
Bug, should compute slack wrt parent partition size, not number of bins
2013-08-20 16:13:36 -07:00
Ali Ghodsi
1ede102ba5
load balancing coalescer
2013-08-20 16:13:36 -07:00
Matei Zaharia
aa2b89d98d
Merge remote-tracking branch 'jey/hadoop-agnostic'
...
Conflicts:
core/src/main/scala/spark/PairRDDFunctions.scala
2013-08-20 10:14:15 -07:00
Mark Hamstra
1630fbf838
changeGeneration --> changeEpoch renaming
2013-08-20 00:17:16 -07:00
Mark Hamstra
ad18410427
Renamed 'priority' to 'jobId' and assorted minor changes
2013-08-20 00:07:04 -07:00
Matei Zaharia
8cae72e94e
Merge pull request #828 from mateiz/sched-improvements
...
Scheduler fixes and improvements
2013-08-19 23:40:04 -07:00
Matei Zaharia
efeb142981
Merge pull request #849 from mateiz/web-fixes
...
Small fixes to web UI
2013-08-19 19:23:50 -07:00
Matei Zaharia
abdc1f8bbb
Merge pull request #847 from rxin/rdd
...
Allow subclasses of Product2 in all key-value related classes
2013-08-19 18:30:56 -07:00
Matei Zaharia
498a26189b
Small fixes to web UI:
...
- Use SPARK_PUBLIC_DNS environment variable if set (for EC2)
- Use a non-ephemeral port (3030 instead of 33000) by default
- Updated test to use non-ephemeral port too
2013-08-19 18:17:49 -07:00
Reynold Xin
5054abd41b
Code review feedback. (added tests for cogroup and subtract; added more documentation on MutablePair)
2013-08-19 12:58:02 -07:00
Reynold Xin
71d705a66e
Made PairRDDFunctions take only Tuple2, but made the rest of the shuffle code path work with general Product2.
2013-08-19 00:40:43 -07:00
Reynold Xin
2a7b99c08b
Added the missing RDD files and cleaned up SparkContext.
2013-08-18 20:39:29 -07:00
Reynold Xin
82bf4c0339
Allow subclasses of Product2 in all key-value related classes (ShuffleDependency, PairRDDFunctions, etc).
2013-08-18 20:25:45 -07:00
Matei Zaharia
8ac3d1e263
Added unit tests for ClusterTaskSetManager, and fix a bug found with
...
resetting locality level after a non-local launch
2013-08-18 19:51:07 -07:00
Matei Zaharia
4004cf775d
Added some comments on threading in scheduler code
2013-08-18 19:51:07 -07:00
Matei Zaharia
2a4ed10210
Address some review comments:
...
- When a resourceOffers() call has multiple offers, force the TaskSets
to consider them in increasing order of locality levels so that they
get a chance to launch stuff locally across all offers
- Simplify ClusterScheduler.prioritizeContainers
- Add docs on the new configuration options
2013-08-18 19:51:07 -07:00
Matei Zaharia
222c897128
Comment cleanup (via Kay) and some debug messages
2013-08-18 19:51:07 -07:00
Matei Zaharia
cf39d45d14
More scheduling fixes:
...
- Added periodic revival of offers in StandaloneSchedulerBackend
- Replaced task scheduling aggression with multi-level delay scheduling
in ClusterTaskSetManager
- Fixed ZippedRDD preferred locations because they can't currently be
process-local
- Fixed some uses of hostPort
2013-08-18 19:51:07 -07:00
Matei Zaharia
90a04dab8d
Initial work towards scheduler refactoring:
...
- Replace use of hostPort vs host in Task.preferredLocations with a
TaskLocation class that contains either an executorId and a host or
just a host. This is part of a bigger effort to eliminate hostPort
based data structures and just use executorID, since the hostPort vs
host stuff is confusing (and not checkable with static typing, leading
to ugly debug code), and hostPorts are not provided by Mesos.
- Replaced most hostPort-based data structures and fields as above.
- Simplified ClusterTaskSetManager to deal with preferred locations in a
more concise way and generally be more concise.
- Updated the way ClusterTaskSetManager handles racks: instead of
enqueueing a task to a separate queue for all the hosts in the rack,
which would create lots of large queues, have one queue per rack name.
- Removed non-local fallback stuff in ClusterScheduler that tried to
launch less-local tasks on a node once the local ones were all
assigned. This change didn't work because many cluster schedulers send
offers for just one node at a time (even the standalone and YARN ones
do so as nodes join the cluster one by one). Thus, lots of non-local
tasks would be assigned even though a node with locality for them
would be able to receive tasks just a short time later.
- Renamed MapOutputTracker "generations" to "epochs".
2013-08-18 19:51:06 -07:00
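The TaskLocation idea described in commit 90a04dab8d above (a location is either an executor ID plus a host, or just a host) can be illustrated with a minimal Python sketch. This is not the Spark class itself: the field names and the `is_process_local` helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskLocation:
    """A preferred location: always a host, optionally one executor on it."""

    host: str
    # Assumption for this sketch: None means "any executor on this host".
    executor_id: Optional[str] = None

    def is_process_local(self) -> bool:
        # Hypothetical helper: true only when a specific executor is named.
        return self.executor_id is not None


host_only = TaskLocation("node1")
pinned = TaskLocation("node1", executor_id="exec-7")
```

Making the two cases one typed value, rather than a bare string that may or may not include a port, is what lets a type checker catch the host/hostPort mix-ups the commit message complains about.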
Matei Zaharia
8fa0747978
Merge pull request #840 from AndreSchumacher/zipegg
...
Implementing SPARK-878 for PySpark: adding zip and egg files to context ...
2013-08-18 17:02:54 -07:00
Reynold Xin
2c00ea3efc
Moved shuffle serializer setting from a constructor parameter to a setSerializer method in various RDDs that involve shuffle operations.
2013-08-17 21:43:29 -07:00
Reynold Xin
0e84fee76b
Removed the mapSideCombine option in partitionBy.
2013-08-17 21:13:41 -07:00
Reynold Xin
10af952a3d
Removed the mapSideCombine option in CoGroupedRDD.
2013-08-17 21:07:34 -07:00
Reynold Xin
5d050a3e1f
Removed the unused shuffleId in ShuffleDependency's constructor.
2013-08-16 23:23:16 -07:00
Matei Zaharia
e89ffc7b3c
Merge pull request #839 from jegonzal/zip_partitions
...
Currying RDD.zipPartitions
2013-08-16 14:02:34 -07:00
Joseph E. Gonzalez
53b2639a1e
Reversing the argument order in zipPartitions to enable stronger type inference.
2013-08-16 12:38:59 -07:00
Andre Schumacher
c7e348faec
Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path
2013-08-16 11:58:20 -07:00
Reynold Xin
c961c19b7b
Use the JSON formatter from Scala library and removed dependency on lift-json.
...
It made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
2013-08-15 18:23:01 -07:00
Reynold Xin
eddbf43b54
Revert "Merge pull request #834 from Daemoen/master"
...
This reverts commit 230ab2722e, reversing changes made to 659553b21d.
2013-08-15 17:49:37 -07:00
Reynold Xin
230ab2722e
Merge pull request #834 from Daemoen/master
...
Updated json output to allow for display of worker state
2013-08-15 17:45:17 -07:00
Patrick Wendell
659553b21d
Merge pull request #836 from pwendell/rename
...
Rename `memoryBytesToString` and `memoryMegabytesToString`
2013-08-15 16:56:31 -07:00
Jey Kottalam
a06a9d5c5f
Rename HadoopWriter to SparkHadoopWriter since it's outside of our package
2013-08-15 16:50:37 -07:00
Jey Kottalam
8f979edef5
Fix newTaskAttemptID to work under YARN
2013-08-15 16:50:37 -07:00
Jey Kottalam
e2d7656ca3
re-enable YARN support
2013-08-15 16:50:37 -07:00
Jey Kottalam
bd0bab47c9
SparkEnv isn't available this early, and not needed anyway
2013-08-15 16:50:37 -07:00
Jey Kottalam
4f43fd791a
make SparkHadoopUtil a member of SparkEnv
2013-08-15 16:50:37 -07:00
Jey Kottalam
43ebcb8484
rename HadoopMapRedUtil => SparkHadoopMapRedUtil, HadoopMapReduceUtil => SparkHadoopMapReduceUtil
2013-08-15 16:50:37 -07:00
Jey Kottalam
8b1c1520fc
add comment
2013-08-15 16:50:37 -07:00
Jey Kottalam
69c3bbf688
dynamically detect hadoop version
2013-08-15 16:50:37 -07:00
Jey Kottalam
f67b94ad4f
remove core/src/hadoop{1,2} dirs
2013-08-15 16:50:36 -07:00
Patrick Wendell
4c6ade1ad5
Rename memoryBytesToString and memoryMegabytesToString
...
These are used all over the place now and they are not specific to memory at all.
memoryBytesToString --> bytesToString
memoryMegabytesToString --> megabytesToString
2013-08-15 15:58:07 -07:00
Reynold Xin
1a51deae8a
More minor UI changes including code review feedback.
2013-08-15 14:34:07 -07:00
Daemoen
ad2e8b5126
Updated json output to allow for display of worker state
...
Ops teams need to ensure that the cluster is functional and performant. Having to scrape the html source for worker state won't work reliably, and will be slow. By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.
2013-08-15 12:19:14 -07:00
Reynold Xin
2d2a556bdf
Various UI improvements.
2013-08-14 23:23:09 -07:00
Reynold Xin
290e3e6e65
Renamed setCurrentJobDescription to setJobDescription.
2013-08-14 18:40:53 -07:00
Reynold Xin
3886b54933
A few small scheduler / job description changes.
...
1. Renamed SparkContext.addLocalProperty to setLocalProperty. And allow this function to unset a property.
2. Renamed SparkContext.setDescription to setCurrentJobDescription.
3. Throw an exception if the fair scheduler allocation file is invalid.
2013-08-14 17:19:42 -07:00
Matei Zaharia
839f2d4f3f
Merge pull request #822 from pwendell/ui-features
...
Adding GC Stats to TaskMetrics (and three small fixes)
2013-08-14 16:17:23 -07:00
Patrick Wendell
04ad78b09d
Style cleanup based on Matei feedback
2013-08-14 14:57:21 -07:00
Kay Ousterhout
a88aa5e6ed
Fixed 2 bugs in executor UI.
...
1) UI crashed if the executor UI was loaded before any tasks started.
2) The total tasks was incorrectly reported due to using string (rather
than int) arithmetic.
2013-08-13 23:44:58 -07:00
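The second bug in commit a88aa5e6ed above (string rather than int arithmetic) is a common class of error. The Python sketch below shows the general shape under the assumption that the counts arrive as text; the function names are hypothetical and the actual UI code is not reproduced here.

```python
def total_tasks_buggy(active: str, completed: str) -> str:
    # Both operands are strings, so "+" concatenates instead of adding.
    return active + completed


def total_tasks_fixed(active: str, completed: str) -> int:
    # Convert to integers first so "+" performs arithmetic.
    return int(active) + int(completed)
```

With 3 active and 4 completed tasks, the first version yields the text "34" while the second yields the number 7.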
Patrick Wendell
c223176388
Small style clean-up
2013-08-13 16:56:37 -07:00
Patrick Wendell
fab5cee111
Correcting terminology in RDD page
2013-08-13 16:25:55 -07:00
Patrick Wendell
024e5c5ce1
Correct sorting order for stages
2013-08-13 16:25:55 -07:00
Patrick Wendell
4e9f0c2df6
Capturing GC details in TaskMetrics
2013-08-13 16:25:55 -07:00
Patrick Wendell
f0382007dc
Bug fix for display of shuffle read/write metrics.
...
This fixes an error where empty cells are missing if a given task
has no shuffle read/write.
2013-08-13 16:25:55 -07:00
Matei Zaharia
d316af9c84
Merge pull request #821 from pwendell/print-launch-command
...
Print run command to stderr rather than stdout
2013-08-13 15:31:01 -07:00
Patrick Wendell
a7feb69ae8
Print run command to stderr rather than stdout
2013-08-13 15:07:03 -07:00
Kay Ousterhout
1beb843a6f
Reuse the set of failed states rather than creating a new object each time
2013-08-13 14:27:40 -07:00
Kay Ousterhout
c92dd627ca
Properly account for killed tasks.
...
The TaskState class's isFinished() method didn't return true for
KILLED tasks, which means some resources are never reclaimed
for tasks that are killed. This also made it inconsistent with the
isFinished() method used by CoarseMesosSchedulerBackend.
2013-08-13 12:40:15 -07:00
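The fix described in commit c92dd627ca above amounts to treating KILLED as a terminal state. A minimal Python sketch follows; the enum members and the `FINISHED_STATES` name are assumptions made for illustration, not a copy of the Spark source.

```python
from enum import Enum


class TaskState(Enum):
    # Illustrative state names; assumed, not taken from the source.
    LAUNCHING = 1
    RUNNING = 2
    FINISHED = 3
    FAILED = 4
    KILLED = 5
    LOST = 6


# Built once at import time and reused on every call.
FINISHED_STATES = frozenset(
    {TaskState.FINISHED, TaskState.FAILED, TaskState.KILLED, TaskState.LOST}
)


def is_finished(state: TaskState) -> bool:
    # KILLED must count as finished so its resources are reclaimed.
    return state in FINISHED_STATES
```

If KILLED were left out of the set, a killed task would look as if it were still running, and whatever it held would never be released.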
Patrick Wendell
ed6a1646e6
Slight change to pr-784
2013-08-13 09:29:40 -07:00
Patrick Wendell
a0133bfbad
Merge pull request #784 from jerryshao/dev-metrics-servlet
...
Add MetricsServlet for Spark metrics system
2013-08-13 09:28:18 -07:00
Matei Zaharia
65d0d91fba
Merge pull request #807 from JoshRosen/guava-optional
...
Change scala.Option to Guava Optional in Java APIs
2013-08-12 19:00:57 -07:00
Josh Rosen
cf08bb7a3e
Fix import organization.
2013-08-12 18:55:02 -07:00
jerryshao
09c7179e81
MetricsServlet code refactor according to comments
2013-08-12 13:23:23 +08:00
jerryshao
320e87e7ab
Add MetricsServlet for Spark metrics system
2013-08-12 13:23:23 +08:00
Reynold Xin
e5b9ed2833
Merge pull request #808 from pwendell/ui_compressed_bytes
...
Report compressed bytes read when calculating TaskMetrics
2013-08-11 17:22:47 -07:00
Patrick Wendell
3d8f281604
Report compressed bytes read when calculating TaskMetrics
2013-08-11 16:25:57 -07:00
Matei Zaharia
379648630b
Merge pull request #805 from woggle/hadoop-rdd-jobconf
...
Use new Configuration() instead of slower new JobConf() in SerializableWritable
2013-08-11 14:51:47 -07:00
Josh Rosen
d7f78b443b
Change scala.Option to Guava Optional in Java APIs.
2013-08-11 12:05:09 -07:00
Charles Reiss
6402b539d0
Use new Configuration() instead of new JobConf() for ObjectWritable.
...
JobConf's constructor loads default config files in some versions of
Hadoop, which is quite slow, and we only need the Configuration object
to pass the correct ClassLoader.
2013-08-10 21:31:05 -07:00
Matei Zaharia
71c63de22f
Merge pull request #795 from mridulm/master
...
Fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-10 10:21:20 -07:00
Matei Zaharia
d3277a0daf
Merge remote-tracking branch 'origin/pr/792'
...
Conflicts:
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/jobs/StagePage.scala
2013-08-10 10:18:50 -07:00
Patrick Wendell
d17eeb997d
Merge pull request #785 from anfeng/master
...
expose HDFS file system stats via Executor metrics
2013-08-10 09:02:27 -07:00
Kay Ousterhout
14d14f451a
Shortened names, as per Matei's suggestion
2013-08-10 07:50:27 -07:00
Matei Zaharia
cd247ba5bb
Merge pull request #786 from shivaram/mllib-java
...
Java fixes, tests and examples for ALS, KMeans
2013-08-09 20:41:13 -07:00
Kay Ousterhout
7810a76512
Only print event queue full error message once
2013-08-09 18:20:48 -07:00
Kay Ousterhout
44ca8629d8
Style fix: removing unnecessary return type
2013-08-09 17:22:50 -07:00
Kay Ousterhout
29b79714f9
Style fixes based on code review
2013-08-09 16:46:34 -07:00
Kay Ousterhout
81e1d4a7d1
Refactored SparkListener to process all events asynchronously.
...
This commit fixes issues where SparkListeners that take a while to
process events slow the DAGScheduler.
This commit also fixes a bug in the UI where if a user goes to a
web page of a stage that does not exist, they can create a memory
leak (granted, this is not an issue at small scale -- probably only
an issue if someone actively tried to DOS the UI).
2013-08-09 13:27:41 -07:00
Matei Zaharia
b09d4b79e8
Merge pull request #799 from woggle/sync-fix
...
Remove extra synchronization in ResultTask
2013-08-09 13:17:08 -07:00
Patrick Wendell
cc6b92e80e
Merge pull request #775 from pwendell/print-launch-command
...
Log the launch command for Spark daemons
2013-08-09 13:00:33 -07:00
Patrick Wendell
3970b580c2
Using quotes when printing out command
2013-08-09 11:53:32 -07:00
Charles Reiss
9dfc280f74
Remove extra synchronization in ResultTask
2013-08-09 11:09:02 -07:00
Matei Zaharia
f94fc75c3f
Merge pull request #788 from shane-huang/sparkjavaopts
...
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as ...
2013-08-09 10:04:03 -07:00
Mridul Muralidharan
c230ca3b4e
Change line size
2013-08-08 22:28:40 +05:30
Mridul Muralidharan
dc47084f4e
Attempt to fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-08 22:19:27 +05:30
Kay Ousterhout
88049a214d
Fixed 3 bugs that caused UI to crash (including SPARK-810).
...
One bug caused the UI to crash if you try to look at a job's status
before any of the tasks have finished.
The second bug was a concurrency issue where two different threads
(the scheduling thread and a UI thread) could be reading/updating
the data structures in JobProgressListener concurrently.
The third bug mis-used an Option, also causing the UI to crash
under certain conditions.
2013-08-07 23:09:25 -07:00
Patrick Wendell
b4321edf68
Reverting bootstrap change
2013-08-07 22:18:18 -07:00
Patrick Wendell
21392f2a73
Change I forgot to merge in
2013-08-07 21:45:32 -07:00
Patrick Wendell
706394b370
Bumping font size to 14px and fixing style issue in progress bars
2013-08-07 21:27:04 -07:00
Patrick Wendell
8c0d668468
Merge branch 'master' into bootstrap-design
...
Conflicts:
core/src/main/scala/spark/ui/UIUtils.scala
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/storage/RDDPage.scala
2013-08-07 21:06:03 -07:00
Kay Ousterhout
b88e26248e
Fixed issue in UI that limited scheduler throughput.
...
Removal of items from ArrayBuffers in the UI code was slow and
significantly impacted scheduler throughput. This commit
improves scheduler throughput by 5x.
2013-08-07 14:42:05 -07:00
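Commit b88e26248e above attributes the slowdown to removing items from array-backed buffers. The log does not say what replaced them, so the Python sketch below only illustrates the general cost difference, assuming a hash-based set as one possible alternative; the variable names are hypothetical.

```python
# Array-backed sequence: removing by value scans for the item,
# then shifts every later element down by one (linear time).
active_list = list(range(5))
active_list.remove(3)

# Hash-based set: removal is a hash lookup (constant time on average).
active_set = set(range(5))
active_set.discard(3)
```

When removals happen on every task event, the linear scan adds up quickly, which is consistent with the throughput impact the commit reports.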
shane-huang
cbc5107e36
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as default and let application env override default options if applicable
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-08-07 14:36:48 +08:00
Matei Zaharia
6b043a6f11
Merge pull request #724 from dlyubimov/SPARK-826
...
SPARK-826: fold(), reduce(), collect() always attempt to use java serialization
2013-08-06 22:31:02 -07:00
Matei Zaharia
7c4b7a53b1
Merge remote-tracking branch 'origin/pr/781'
...
Conflicts:
core/src/main/resources/spark/ui/static/webui.css
2013-08-06 17:19:49 -07:00
Karen Feng
908032e79b
Used saturated colors for progress bars
2013-08-06 16:52:21 -07:00
Karen Feng
8bc497fa10
Lightened color of progress bars
2013-08-06 16:33:05 -07:00
Karen Feng
ca1903ea63
Overlays progress text on top of bar
2013-08-06 15:45:42 -07:00
Matei Zaharia
df4d10d630
Merge pull request #779 from adatao/adatao-global-SparkEnv
...
[HOTFIX] Extend thread safety for SparkEnv.get()
2013-08-06 15:44:05 -07:00
Shivaram Venkataraman
471fbadd0c
Java examples, tests for KMeans and ALS
...
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
examples project.
Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Workaround a bug where using double[] from Java leads to class cast exception in
KMeans init
2013-08-06 15:43:46 -07:00
anfeng
dda2ac8b5d
reformat registerFileSystemStat()
2013-08-06 15:22:25 -07:00
Karen Feng
099528b6c4
Pre-sorts stage/env tables, changes text/link of stage summaries
2013-08-06 14:52:12 -07:00
Karen Feng
254a930730
Reverse sorts StageTable by submitted time
2013-08-06 14:18:38 -07:00
Karen Feng
5ed5b73026
Sorts first column of env tables
2013-08-06 13:59:53 -07:00
anfeng
0748c60817
expose HDFS file system stats via Executor metrics
2013-08-06 11:47:06 -07:00
Reynold Xin
d031f73679
Merge pull request #782 from WANdisco/master
...
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 22:33:00 -07:00
Matei Zaharia
1b63dea816
Merge pull request #769 from markhamstra/NegativeCores
...
SPARK-847 + SPARK-845: Zombie workers and negative cores
2013-08-05 22:21:26 -07:00
Alexander Pivovarov
a30866438b
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 21:48:43 -07:00
Matei Zaharia
8b277892c9
Merge pull request #774 from pwendell/job-description
...
Show user-defined job name in UI
2013-08-05 19:14:52 -07:00
Christopher Nguyen
b1bbbe699c
[HOTFIX] Mark lastSetSparkEnv @volatile in case it gets HotSpot-cached
...
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 17:22:27 -07:00
Mark Hamstra
35d8f5ee52
Moved handling of timed out workers within the Master actor
2013-08-05 13:13:56 -07:00
Mark Hamstra
37ccf9301a
milliseconds -> seconds in timeOutDeadWorkers logging
2013-08-05 13:13:56 -07:00
Mark Hamstra
cdd1af562e
Timeout zombie workers
2013-08-05 13:13:56 -07:00
Mikhail Bautin
e8bec8365f
Only reduce the number of cores once when removing an executor
2013-08-05 13:13:56 -07:00
Karen Feng
95025afdec
Made most small fixes for SPARK-849 except for table sort, task progress overlay
2013-08-05 13:04:56 -07:00
Bill Zhao
87134b3648
SPARK-850: give better console message
2013-08-05 11:55:35 -07:00
Christopher Nguyen
39e4fda76f
[HOTFIX] Extend thread safety for SparkEnv.get()
...
A ThreadLocal SparkEnv.env is facing various situations leading to
NullPointerExceptions, where SparkEnv.env set in one thread is not
gettable in another thread, but often assumed to be available.
See, e.g., https://groups.google.com/forum/#!topic/spark-developers/GLx8yunSj0A
This hotfixes SparkEnv.env to return either (a) the ThreadLocal
value if non-null, or (b) the previously set value in any thread.
This approach preserves SparkEnv.set() thread safety needed by
RDD.compute() and possibly other places. A refactoring that
parameterizes SparkEnv should be addressed subsequently.
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 02:09:54 -07:00
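The fallback described in commit 39e4fda76f above (return the thread-local value if set, otherwise the value most recently set by any thread) can be sketched in Python. This is an illustration of the pattern only, assuming hypothetical names `Env`, `_local` and `_last_set`; it is not the Scala code in SparkEnv.scala.

```python
import threading


class Env:
    _local = threading.local()  # per-thread value
    _last_set = None            # value most recently set by any thread

    @classmethod
    def set(cls, value):
        cls._local.value = value
        cls._last_set = value

    @classmethod
    def get(cls):
        # (a) prefer this thread's own value, (b) else fall back globally.
        value = getattr(cls._local, "value", None)
        return value if value is not None else cls._last_set


Env.set("env-main")

seen = []
worker = threading.Thread(target=lambda: seen.append(Env.get()))
worker.start()
worker.join()
```

The worker thread never called `set`, so it sees the value set by the main thread instead of nothing. A thread that does call `set` keeps its own value on later reads, which matches the thread safety the commit message says `set()` must preserve.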
Patrick Wendell
f3660d5ab8
Make output formatting consistent between bash/scala
2013-08-03 21:30:15 -07:00
Patrick Wendell
ad94fbb322
Log the launch command for Spark executors
2013-08-03 09:19:46 -07:00
Matei Zaharia
22abbc10d6
Merge pull request #772 from karenfeng/ui-843
...
Show app duration
2013-08-02 16:37:59 -07:00
Patrick Wendell
5b3784a79c
Show user-defined job name in UI
2013-08-02 15:47:41 -07:00
Karen Feng
b3ae5b25d5
Shows time the app has been running
2013-08-02 13:25:14 -07:00
Patrick Wendell
9d7dfd2d5a
Merge pull request #743 from pwendell/app-metrics
...
Add application metrics to standalone master
2013-08-01 17:41:58 -07:00
Patrick Wendell
f1d2ad550e
under_scores --> camelCase for config options
2013-08-01 15:26:26 -07:00
Patrick Wendell
12d9c82c9b
Small style fix
2013-08-01 15:25:52 -07:00
Patrick Wendell
37bc64a205
Adding application-level metrics.
...
This adds metrics for applications in the deploy Master.
2013-08-01 15:25:52 -07:00
Karen Feng
73692f3cb9
Unify, reduce body font size
2013-08-01 15:10:30 -07:00
Patrick Wendell
87fd321a5a
Minor refactoring and code cleanup
2013-08-01 15:02:31 -07:00
Patrick Wendell
b10199413a
Slight refactoring to SparkContext functions
2013-08-01 15:00:42 -07:00
Patrick Wendell
cfcd77b5da
Increasing inter job arrival
2013-08-01 15:00:42 -07:00
Patrick Wendell
5faac7f4f3
Minor style fixes
2013-08-01 15:00:42 -07:00
Patrick Wendell
5e7b38fbb3
Merge pull request #695 from xiajunluan/pool_ui
...
Enhance job ui in spark ui system with adding pool information
2013-08-01 14:59:33 -07:00
Karen Feng
47600e9579
Removed hr margin
2013-08-01 14:57:04 -07:00
Karen Feng
e648a62fc8
Inserted needed line break for log paging
2013-08-01 14:46:19 -07:00
Karen Feng
686d6266c4
Use nav pills instead of default
2013-08-01 14:41:49 -07:00
Karen Feng
86d372d17f
Removed line breaks
2013-08-01 14:37:21 -07:00
Karen Feng
99803d88b9
Reduced all header sizes
2013-08-01 14:18:33 -07:00
Karen Feng
d216d687ef
Reduced size of table text to compact
2013-08-01 13:27:23 -07:00
Karen Feng
5dae283996
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-08-01 11:28:28 -07:00
Matei Zaharia
0a96493ac6
Merge pull request #760 from karenfeng/heading-update
...
Clean up web UI page headers
2013-08-01 11:27:17 -07:00
Patrick Wendell
9177bea2b4
Removing extra imports
2013-08-01 10:42:50 -07:00
Patrick Wendell
3e4d5e5f8b
Merge branch 'master' into master-json
...
Conflicts:
core/src/main/scala/spark/deploy/master/ui/IndexPage.scala
2013-08-01 10:42:07 -07:00
Patrick Wendell
ffc034e4fb
Import cleanup
2013-08-01 10:39:56 -07:00
Andrew xia
d58502a156
fix bug of spark "SubmitStage" listener as unit test error
2013-08-01 23:21:41 +08:00
Andrew xia
3b5a11e765
change function name "setName" to "setProperties" as "setName" is also member of Thread class
2013-08-01 19:37:15 +08:00
Dmitriy Lyubimov
cb6be5bd7e
Merge remote-tracking branch 'mesos/master' into SPARK-826
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-31 22:09:22 -07:00
Dmitriy Lyubimov
28f1550f01
More elegant rewrite of the same.
2013-07-31 21:41:00 -07:00
Dmitriy Lyubimov
7c52ecc6a4
(1) added reduce test case.
...
(2) added nested streaming in ParallelCollectionRDD
(3) added kryo with fold test which still doesn't work
2013-07-31 19:27:30 -07:00
Matei Zaharia
3097d75d6f
Merge remote-tracking branch 'dlyubimov/SPARK-827'
...
Conflicts:
docs/configuration.md
2013-07-31 18:36:43 -07:00
Karen Feng
7c9c5ef6c6
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-31 16:39:26 -07:00
Karen Feng
02cde8efdf
Replaces theme with Bootswatch Spacelab theme
2013-07-31 16:34:07 -07:00
Karen Feng
09cd67bf98
Changed bootstrap colors, fixed logpaging buttons
2013-07-31 16:18:53 -07:00
Matei Zaharia
39c75f3033
Merge pull request #757 from BlackNiuza/result_task_generation
...
Bug fix: SPARK-837
2013-07-31 15:52:36 -07:00
Matei Zaharia
14bf2fe039
Merge pull request #749 from benh/spark-executor-uri
...
Added property 'spark.executor.uri' for launching on Mesos.
2013-07-31 14:18:16 -07:00
Benjamin Hindman
4692ea4892
Used 'uri.split('/').last' instead of 'new File(uri).getName()'.
2013-07-31 12:29:44 -07:00
Karen Feng
c453967f9a
Reduced size of heading
2013-07-31 11:57:50 -07:00
Matei Zaharia
a386ced2c6
Merge pull request #754 from rxin/compression
...
Compression codec change
2013-07-31 11:22:50 -07:00
Karen Feng
49e6344142
Removed master URL from job UI, reduced heading size of basic spark pages
2013-07-31 11:17:59 -07:00
Reynold Xin
c61843a69f
Changed other LZF uses to use the compression codec interface.
2013-07-31 10:32:13 -07:00
Patrick Wendell
89da9d94b3
Add JSON path to master index page
2013-07-31 09:47:53 -07:00
BlackNiuza
9a815de4bf
write and read generation in ResultTask
2013-08-01 00:36:47 +08:00
Roman Tkalenko
0c6553714a
Refactored Vector.apply(length, initializer) replacing excessive code with library method
...
(also removed unused variable ```ans``` as minor change)
2013-07-31 19:05:46 +03:00
Matei Zaharia
12553e5c55
Simplified nonNegativeMod to match previous version
2013-07-31 08:50:28 -07:00
Matei Zaharia
d4556f4207
Merge pull request #751 from cdshines/master
...
Cleaned Partitioner & PythonPartitioner source by taking out non-related logic to Utils
2013-07-31 08:48:14 -07:00
Andrew xia
5670c96f29
Merge branch 'master' into Pool_UI
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/scheduler/SparkListener.scala
core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/jobs/JobProgressUI.scala
2013-07-31 19:36:36 +08:00
cdshines
fefb03cbd7
Eliminated code duplication, refactored to pattern-matching style Partitioner and PythonPartitioner
2013-07-31 13:19:42 +03:00
Dmitriy Lyubimov
96664431cb
IDEA flipped JavaSerialized import at some point to a wrong class.
2013-07-30 23:10:09 -07:00
Dmitriy Lyubimov
c219fc94fd
Minor, style
2013-07-30 22:08:39 -07:00
Dmitriy Lyubimov
f4b4b8836e
reverting back to one-by-one serialization for parallelize()
2013-07-30 19:00:58 -07:00
jerryshao
bf9318091a
Add Apache license header to metrics system
2013-07-31 09:42:16 +08:00
Reynold Xin
98024eadc3
Renamed compressionOutputStream and compressionInputStream to compressedOutputStream and compressedInputStream.
2013-07-30 18:28:46 -07:00
Dmitriy Lyubimov
abada94ebf
removing default constructor (not Externalizable any more)
2013-07-30 18:04:02 -07:00
Dmitriy Lyubimov
943c6590c9
realigning "extends" back manually
2013-07-30 18:01:35 -07:00
Dmitriy Lyubimov
ca33b12e98
resetting wrap and continuation indent = 4
2013-07-30 17:51:44 -07:00
Reynold Xin
dae12fef9e
Updated the configuration option for Snappy block size to be consistent with the documentation.
2013-07-30 17:49:31 -07:00
Dmitriy Lyubimov
984b56155a
changing approaches for parallelize(): java serialization needs to avoid writing headers!
2013-07-30 17:36:59 -07:00
Reynold Xin
ad7e9d0d64
CompressionCodec cleanup. Moved it to spark.io package.
2013-07-30 17:11:54 -07:00
Dmitriy Lyubimov
ef9529a943
refactoring using writeByteBuffer() from Utils.
2013-07-30 16:24:23 -07:00
Dmitriy Lyubimov
43394b9a6d
fixing formatting
2013-07-30 16:13:41 -07:00
Reynold Xin
368c58eac5
Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compression
...
Conflicts:
project/SparkBuild.scala
2013-07-30 16:04:18 -07:00
Patrick Wendell
e87de037d6
Merge pull request #744 from karenfeng/bootstrap-update
...
Use Bootstrap progress bars in web UI
2013-07-30 15:00:08 -07:00
Karen Feng
26144c400f
Fixed wrap style
2013-07-30 12:40:41 -07:00
Karen Feng
218d7c4ed8
Fixed style, lowered height of progress bars
2013-07-30 12:39:17 -07:00
Karen Feng
f1cab31b73
Removed intermediate set for activeTasks, removed progress bar margin
2013-07-30 11:06:47 -07:00
Dmitriy Lyubimov
1bca91633e
+ bug fixes;
...
test added
Conflicts:
core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-30 11:04:11 -07:00
Benjamin Hindman
f6f46455eb
Added property 'spark.executor.uri' for launching on Mesos without
...
requiring Spark to be installed. Using 'make_distribution.sh' a user
can put a Spark distribution at a URI supported by Mesos (e.g.,
'hdfs://...') and then set that when launching their job. Also added
SPARK_EXECUTOR_URI for the REPL.
2013-07-29 23:32:52 -07:00
Josh Rosen
49be084ed3
Use File.pathSeparator instead of hardcoding ':'.
2013-07-29 22:08:57 -07:00
Josh Rosen
b95732632b
Do not inherit master's PYTHONPATH on workers.
...
This fixes SPARK-832, an issue where PySpark
would not work when the master and workers used
different SPARK_HOME paths.
This change may potentially break code that relied
on the master's PYTHONPATH being used on workers.
To have custom PYTHONPATH additions used on the
workers, users should set a custom PYTHONPATH in
spark-env.sh rather than setting it in the shell.
2013-07-29 22:08:57 -07:00
Andrew xia
5406013997
refactor code to fewer than 100 characters per line
2013-07-30 11:41:38 +08:00
Andrew xia
614ee16cc4
refactor job ui with pool information
2013-07-30 10:57:26 +08:00
Dmitriy Lyubimov
8e5cd041bb
initial externalization of ParallelCollectionRDD's split
2013-07-29 19:02:53 -07:00
Reynold Xin
81720e13fc
Moved all StandaloneClusterMessage's into StandaloneClusterMessages object.
2013-07-29 17:53:01 -07:00
Reynold Xin
23b5da14ed
Moved block manager messages into BlockManagerMessages object.
2013-07-29 17:42:05 -07:00
Reynold Xin
105f4d22e9
Removed Cache and SoftReferenceCache since they are no longer used.
2013-07-29 17:30:38 -07:00
Reynold Xin
17e62113d4
Moved DeployMessage's into its own DeployMessages object.
...
Also renamed MasterState to MasterStateResponse and WorkerState to WorkerStateResponse for clarity.
2013-07-29 17:14:44 -07:00
Karen Feng
87b821dc39
Fixed continuity of executorToTasksActive, changed color of progress bars
2013-07-29 16:50:51 -07:00
Karen Feng
c7b2788948
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
...
Conflicts:
core/src/main/scala/spark/ui/jobs/IndexPage.scala
2013-07-29 16:36:07 -07:00
Patrick Wendell
c99b674405
Merge pull request #735 from karenfeng/ui-807
...
Totals for shuffle data and CPU time
2013-07-29 16:32:55 -07:00
Karen Feng
2d6da9195a
Alphabetized imports
2013-07-29 15:50:52 -07:00
Karen Feng
478a2886d9
Added started tasks to progress bar
2013-07-29 14:51:07 -07:00
Karen Feng
e04a37a332
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-29 14:32:48 -07:00
Reynold Xin
fe7298b587
Merge pull request #741 from pwendell/usability
...
Fix two small usability issues
2013-07-29 14:01:00 -07:00
Karen Feng
43a2cc15c0
Use Bootstrap progress bars in web UI
2013-07-29 13:37:24 -07:00
Matei Zaharia
b9d6783f36
Optimize Python take() to not compute entire first partition
2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov
f5067abe85
changes per comments.
2013-07-27 23:08:00 -07:00
Karen Feng
077f2dad22
Fixed outdated bugs
2013-07-27 16:39:36 -07:00
Patrick Wendell
bcafb36c1e
Slight wording change
2013-07-27 16:03:50 -07:00
Patrick Wendell
8177165ac4
Log executor on finish
2013-07-27 16:02:06 -07:00
Patrick Wendell
c2223e6801
Improve catch scope and logging for client stop()
...
This does two things:
1. Catches the more general `TimeoutException`, since those can be thrown.
2. Logs at info level when a timeout is detected.
2013-07-27 16:02:06 -07:00
Karen Feng
5a93e3c58c
Cleaned up code based on pwendell's suggestions
2013-07-27 15:55:26 -07:00
Karen Feng
dcc4743a95
Moved val now to render
2013-07-27 12:52:53 -07:00
Karen Feng
1714693324
Current time called once with value now
2013-07-27 12:24:41 -07:00
Dmitriy Lyubimov
6a47cee721
style
2013-07-26 22:35:13 -07:00
Dmitriy Lyubimov
0c391feb73
Maximum task failures configurable
2013-07-26 22:34:43 -07:00
Karen Feng
bd4cc52e30
Made metrics Option instead of Some, fixed NullPointerException
2013-07-26 17:23:18 -07:00
Reynold Xin
cb366774c8
Merge pull request #738 from harsha2010/pruning
...
Fix bug in Partition Pruning.
2013-07-26 16:59:30 -07:00
harshars
392d7474fd
Code review
2013-07-26 15:23:15 -07:00
harshars
72cf7ec0e5
Indentation
2013-07-26 15:16:41 -07:00
harshars
822aac8f5a
Indentation
2013-07-26 15:10:32 -07:00
harshars
743fc4e7aa
Fix Bug in Partition Pruning, index of Pruned Partitions should inherit from parent
2013-07-26 14:35:17 -07:00
Karen Feng
3fbe9eaac0
Displays shuffle read/write only if exists, wraps if statements, trims old vals, grabs current time once
2013-07-26 11:51:38 -07:00
Karen Feng
22faeab261
Split Shuffle Activity overview column for read/write
2013-07-25 17:14:18 -07:00
Karen Feng
d4bbc8bd25
Shows totals for shuffle data and CPU time in Stage, homepage overviews including active time
2013-07-25 15:59:52 -07:00
Charles Reiss
a6de90c927
For standalone mode, get JAVA_HOME, SPARK_JAVA_OPTS, SPARK_LIBRARY_PATH from application env, not worker env
2013-07-25 12:42:30 -07:00
ryanlecompte
e56aa75de0
fix wrapping
2013-07-24 22:08:09 -07:00
ryanlecompte
8e0939f5a9
refactor Kryo serializer support to use chill/chill-java
2013-07-24 20:43:57 -07:00
Karen Feng
57009eef90
Fixed consistency of "success" status string
2013-07-24 13:43:09 -07:00
Karen Feng
4280e1768d
Removed finished status for task info, changed name of success case
2013-07-24 12:48:48 -07:00
Karen Feng
bd3931c874
Changed ifs with returns to if/else
2013-07-24 11:27:17 -07:00
Karen Feng
93c6015f82
Shows task status and running tasks on Stage Page: fixes SPARK-804 and 811
2013-07-24 10:53:02 -07:00
jerryshao
31ec72b243
Code refactor according to comments
2013-07-24 14:57:47 +08:00
jerryshao
8d1ef7f2df
Code style changes
2013-07-24 14:57:47 +08:00
Andrew xia
05637de842
Change class xxxInstrumentation to class xxxSource
2013-07-24 14:57:47 +08:00
Andrew xia
ed1a3bc206
continue to refactor code style and functions
2013-07-24 14:57:47 +08:00
jerryshao
5730193e0c
Fix some typos
2013-07-24 14:57:47 +08:00
jerryshao
a79f6077f0
Add Maven metrics library dependency and code changes
2013-07-24 14:57:47 +08:00
jerryshao
1daff54b2e
Change Executor MetricsSystem initialize code to SparkEnv
2013-07-24 14:57:47 +08:00
Andrew xia
5f8802c1fb
Register and init metricsSystem in SparkContext
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
Andrew xia
7d2eada451
Add metrics source of DAGScheduler and blockManager
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
jerryshao
e9ac88754d
Remove twice add Source bug and code clean
2013-07-24 14:57:47 +08:00
jerryshao
5ce5dc9fcd
Add default properties to deal with no configure file situation
2013-07-24 14:57:47 +08:00
jerryshao
871bc1687e
Add Executor instrumentation
2013-07-24 14:57:46 +08:00
jerryshao
7fb574bf66
Code clean and remarshal
2013-07-24 14:57:46 +08:00
Andrew xia
4d6dd67fa1
refactor metrics system
...
1.change source abstract class to support MetricRegistry
2.change master/work/jvm source class
2013-07-24 14:57:46 +08:00
jerryshao
03f9871116
MetricsSystem refactor
2013-07-24 14:57:46 +08:00
jerryshao
c3daad3f65
Update metric source support for instrumentation
2013-07-24 14:57:46 +08:00
jerryshao
9dec8c73e6
Add Master and Worker instrumentation support
2013-07-24 14:57:46 +08:00
jerryshao
503acd3a37
Build metrics system framework
2013-07-24 14:57:46 +08:00
Matei Zaharia
b011329040
Merge pull request #727 from rxin/scheduler
...
Scheduler code style cleanup.
2013-07-23 22:50:09 -07:00
Matei Zaharia
876125b997
Merge pull request #726 from rxin/spark-826
...
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure
2013-07-23 22:28:21 -07:00
Reynold Xin
3dae1df66f
Moved non-serializable closure catching exception from submitStage to submitMissingTasks
2013-07-23 20:29:07 -07:00
Reynold Xin
d33b8a2a0f
Added comments on task closure serialization.
2013-07-23 20:28:39 -07:00
Reynold Xin
85ab8114bc
Moved non-serializable closure catching exception from submitStage to submitMissingTasks
2013-07-23 20:25:58 -07:00
Matei Zaharia
6a31b7191d
Small bug fix
2013-07-23 16:20:24 -07:00
Matei Zaharia
2f1736c396
Merge pull request #725 from karenfeng/task-start
...
Creates task start events
2013-07-23 15:53:30 -07:00
Karen Feng
abc78cd331
Modifies instead of copies HashSets, fixes comment style
2013-07-23 15:47:16 -07:00
Karen Feng
383684daaa
Replaces Seq with HashSet, removes redundant import
2013-07-23 15:33:27 -07:00
Reynold Xin
f2422d4f29
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.
2013-07-23 15:30:20 -07:00
Reynold Xin
5ed38b4d1d
Scheduler code style cleanup.
2013-07-23 15:28:59 -07:00
Reynold Xin
101b8cc78a
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.
2013-07-23 15:28:20 -07:00
Karen Feng
9f2dbb2a7c
Adds/removes active tasks only once
2013-07-23 15:10:09 -07:00
Dmitriy Lyubimov
ef82ff8564
Merge branch 'master' into SPARK-826
...
Conflicts:
core/src/main/scala/spark/scheduler/local/LocalScheduler.scala
2013-07-23 13:43:00 -07:00
Karen Feng
0200801a55
Tracks task start events and shows number of active tasks on Executor UI
2013-07-23 13:35:43 -07:00
Dmitriy Lyubimov
310e73d566
style
2013-07-23 13:23:25 -07:00
Matei Zaharia
f369e0e51b
Merge pull request #720 from ooyala/2013-07/persistent-rdds-api
...
Add a public method getCachedRdds to SparkContext
2013-07-23 13:22:27 -07:00
Dmitriy Lyubimov
ac60d06381
Re-working in terms of changes to TaskSetManager. Verified with Standalone and Local mode.
2013-07-23 13:13:19 -07:00
Evan Chan
4830e22562
Rename method per rxin feedback
2013-07-23 09:50:13 -07:00
Evan Chan
2c2bfbe294
Add toMap method to TimeStampedHashMap and use it
2013-07-23 01:36:44 -07:00
Matei Zaharia
401aac8b18
Merge pull request #719 from karenfeng/ui-808
...
Creates Executors tab for Jobs UI
2013-07-22 16:57:16 -07:00
Karen Feng
872c97ad82
Split task columns, memory columns sort by numeric value
2013-07-22 16:54:37 -07:00
Karen Feng
2eea974795
Executors UI now calls executor ID from TaskInfo instead of TaskMetrics
2013-07-22 15:15:54 -07:00
Dmitriy Lyubimov
8ca0c31944
removing non-pertinent comment
2013-07-22 14:48:46 -07:00
Dmitriy Lyubimov
b4b230e606
Fixing for LocalScheduler with test, that much works ..
2013-07-22 14:42:47 -07:00
Karen Feng
85c4d7bf3b
Shows number of complete/total/failed tasks (bug: failed tasks assigned to null executor)
2013-07-22 14:35:47 -07:00
Josh Rosen
f649dabb4a
Fix bug: DoubleRDDFunctions.sampleStdev() computed non-sample stdev().
...
Update JavaDoubleRDD to add new methods and docs.
Fixes SPARK-825.
2013-07-22 13:21:48 -07:00
Karen Feng
8901f379c9
Fixed memory used/remaining/total bug
2013-07-22 09:58:03 -07:00
Karen Feng
636b19f833
Merge branch 'master' of https://github.com/mesos/spark into ui-808
2013-07-22 09:53:26 -07:00
Evan Chan
0337d88321
Add a public method getCachedRdds to SparkContext
2013-07-21 18:26:14 -07:00
Karen Feng
865dc63bac
Changed table format for executors
2013-07-19 15:57:01 -07:00
Karen Feng
81bb5dc640
Creates Executors tab for application with RDD block and memory/disk used, solves SPARK-808
2013-07-19 14:08:30 -07:00
Konstantin Boudnik
cfce9a6a36
Regression: default webui-port can't be set via command line "--webui-port" anymore
2013-07-19 14:00:58 -07:00
Liang-Chi Hsieh
aa6f83289b
A better fix for giving local jars under Yarn mode.
2013-07-19 22:25:28 +08:00
Liang-Chi Hsieh
a613628c50
Do not copy local jars given to SparkContext in yarn mode since the Context is not running on local. This bug causes failure when jars can not be found. Example codes (such as spark.examples.SparkPi) can not work without this fix under yarn mode.
2013-07-19 16:59:12 +08:00
Prashant Sharma
039087e1e3
Fixed formatting As per review comments on #709
2013-07-17 11:46:00 +05:30
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Matei Zaharia
b1f9f64743
Merge branch 'master' of github.com:mesos/spark
2013-07-16 11:01:53 -07:00
Matei Zaharia
5c388808a8
SPARK-814: Result stages should be named after action
2013-07-16 11:01:14 -07:00
Prashant Sharma
f89cc7ae3c
Fixed warning for type erasure
2013-07-16 14:59:24 +05:30
Prashant Sharma
50f3cd8890
Fixed warning enumerations
2013-07-16 14:39:46 +05:30
Prashant Sharma
55da6e9504
Fixed warning erasure -> runtimeClass
2013-07-16 14:37:08 +05:30
Prashant Sharma
ff14f38f3d
Fixed warning Throwables
2013-07-16 14:34:56 +05:30
Prashant Sharma
63addd93a8
Fixed warning ClassManifest -> ClassTag
2013-07-16 14:09:52 +05:30
Reynold Xin
69316603d6
Throw a more meaningful message when runJob is called to launch tasks on non-existent partitions.
2013-07-15 22:50:11 -07:00
Karen Feng
6dc7c9bfb1
Removed job UI column, linked description to job UI
2013-07-15 16:33:50 -07:00
Karen Feng
fbf5aa761e
Removed log message, added field in master UI to link to log UI
2013-07-15 15:50:03 -07:00
Karen Feng
eac381a957
Merge branch 'ui-802' of https://github.com/karenfeng/spark into ui-802
2013-07-15 15:48:44 -07:00
Karen Feng
3955711250
Added field to master UI with link to job UI
2013-07-15 15:47:21 -07:00
Karen Feng
0d78b6d9cd
Links to job UI from standalone deploy cluster web UI: fixes SPARK-802
2013-07-15 13:47:38 -07:00
Karen Feng
b2aaa1199e
Adds app name in HTML page titles on job web UI: fixes SPARK-806
2013-07-15 11:44:42 -07:00
Prashant Sharma
a1e56a43b3
Fixed compilation issues as Map is by default immutable.Map in scala-2.10
2013-07-15 11:28:18 +05:30
Prashant Sharma
a3494d405d
Merge branch 'master' of github.com:mesos/spark into scala-2.10
...
Conflicts:
core/src/main/scala/spark/Utils.scala
core/src/test/scala/spark/ui/UISuite.scala
project/SparkBuild.scala
run
2013-07-15 11:15:55 +05:30
Matei Zaharia
d47c16f78d
Add an option to disable reference tracking in Kryo
2013-07-15 01:55:54 +00:00
Matei Zaharia
10c05937bd
Merge pull request #699 from pwendell/ui-env
...
Add `Environment` tab to SparkUI.
2013-07-14 11:45:18 -07:00
Patrick Wendell
4883586838
Responding to Matei's review
2013-07-14 10:37:26 -07:00
Matei Zaharia
b91a218cea
Cosmetic fixes to web UI
2013-07-14 07:31:33 +00:00
Matei Zaharia
a44a7b1238
Determine Spark core classes better in getCallSite
2013-07-14 07:23:09 +00:00
root
e271fde10b
Fixed a delay scheduling bug in the YARN branch, found by Patrick
2013-07-14 06:24:29 +00:00
Patrick Wendell
ddb97f0fdf
Add `Environment` tab to SparkUI.
...
This adds a tab which displays system property and classpath information. This
can be useful in debugging various types of issues such as:
1. Extra/incorrect Hadoop jars being included in the classpath
2. Spark launching with a different JRE version than intended
3. Spark system properties not being set to intended values
4. User added jars that conflict with Spark jars
2013-07-13 16:14:40 -07:00
Matei Zaharia
77c69ae5a0
Merge pull request #697 from pwendell/block-locations
...
Show block locations in Web UI.
2013-07-12 23:05:21 -07:00
Matei Zaharia
5a7835c152
Merge pull request #691 from karenfeng/logpaging
...
Create log pages
2013-07-12 20:28:21 -07:00
Matei Zaharia
71ccca0cc1
Merge pull request #696 from woggle/executor-env
...
Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.sh
2013-07-12 20:25:06 -07:00
Matei Zaharia
90fc3f30cd
Merge pull request #692 from Reinvigorate/takeOrdered
...
adding takeOrdered() to RDD
2013-07-12 20:23:36 -07:00
Patrick Wendell
08150f19ab
Minor style fix
2013-07-12 19:32:35 -07:00
Patrick Wendell
6855338e14
Show block locations in Web UI.
...
This fixes SPARK-769. Support is added for enumerating the locations of blocks
in the UI. There is also some minor cleanup in StorageUtils.
2013-07-12 19:30:32 -07:00
Charles Reiss
531a7e5574
Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.
2013-07-12 12:58:25 -07:00
seanm
a1662326e9
comment adjustment to takeOrdered
2013-07-12 08:38:19 -07:00
Prashant Sharma
a220e11a07
Merge branch 'master' of github.com:mesos/spark into scala-2.10
2013-07-12 15:12:46 +05:30
Prashant Sharma
e86d5dbaad
Merge branch 'master' into master-merge
...
Conflicts:
README.md
core/pom.xml
core/src/main/scala/spark/deploy/JsonProtocol.scala
core/src/main/scala/spark/deploy/LocalSparkCluster.scala
core/src/main/scala/spark/deploy/master/Master.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
project/SparkBuild.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
2013-07-12 14:49:16 +05:30
Andrew xia
2080e25006
Enhance job ui in spark ui system with adding pool information
2013-07-12 14:25:18 +08:00
seanm
a2c915fba8
giving order to top and making tests more clear
2013-07-11 18:55:00 -07:00
Karen Feng
5c67ca0278
Remove "Bytes" in lieu of String notation
2013-07-11 17:31:59 -07:00
Karen Feng
6d054487bf
Replace default buffer value to 100 GB, changed buttons to use String notation, removed default buffer parameter in UI URLs
2013-07-11 17:12:17 -07:00
Karen Feng
a32784109d
Fixed links for "Back to Master"
2013-07-11 16:57:55 -07:00
Karen Feng
ece2388585
Removed logPageLength from logPage
2013-07-11 16:35:56 -07:00
Karen Feng
9ed036ccdb
Replaced logPageLength with byteLength to prevent buffer shrink bug
2013-07-11 16:33:53 -07:00
Karen Feng
fdc226a14c
Clarified start and end byte variable names
2013-07-11 15:36:43 -07:00
Karen Feng
5d5dbc39f6
getByteRange moved to WorkerWebUI, takes converted parameters, returns only start/end offset
2013-07-11 15:22:45 -07:00
Karen Feng
15fd11d657
Removed redundant calls to request by logPage
2013-07-11 15:01:50 -07:00
Karen Feng
11872888ca
Created getByteRange function for logs and log pages, removed lastNBytes function
2013-07-11 14:56:37 -07:00
Matei Zaharia
018d04c64e
Merge pull request #684 from woggle/mesos-classloader
...
Explicitly set class loader for MesosSchedulerDriver callbacks.
2013-07-11 12:48:37 -07:00
Karen Feng
e3a3fcf61b
Scrollbar on log pages appears automatically
2013-07-11 12:16:38 -07:00
Karen Feng
044d4577ec
Fixed capitalization of log page
2013-07-11 12:02:15 -07:00
Karen Feng
0ecc33f0c8
Added byte range, page title with log name, previous/next bytes buttons, initialization to end of log, large default buffer, buggy back to master link
2013-07-11 11:25:58 -07:00
Karen Feng
74bd3fc680
Added byte range on log pages
2013-07-10 15:44:28 -07:00
Karen Feng
24196c91f0
Changed buffer to 10,000 bytes, created scrollbar for fixed-height log
2013-07-10 15:27:52 -07:00
Karen Feng
f5f3b272f8
Fixed mixup of start/end, moved more import files
2013-07-10 14:52:29 -07:00
Karen Feng
0d4580360b
Fixed docstring of offsetBytes to match params and wrapped for 100+ character lines
2013-07-10 13:24:26 -07:00
Karen Feng
04263e4d46
Made some minor style changes
2013-07-10 13:15:42 -07:00
Karen Feng
cfb6447ac4
Fixed for nonexistent bytes, added unit tests, changed stdout-page to stdout
2013-07-10 11:47:57 -07:00
seanm
ee4ce2fc51
adding takeOrdered to java API
2013-07-10 10:46:04 -07:00
seanm
24705d0f46
adding takeOrdered() to RDD
2013-07-10 10:33:11 -07:00
Karen Feng
620a6974c6
Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map
2013-07-10 10:20:53 -07:00
Karen Feng
b6072b58bf
Fixes style, makes "std__-page" consistent, reads only parts of files
2013-07-09 17:25:10 -07:00
Karen Feng
13fc6f248c
Clean commit of log paging
2013-07-09 14:17:15 -07:00
Charles Reiss
e47253e0cc
Reset ClassLoader in MesosSchedulerBackend, too. (per review comments).
...
Also set ClassLoader for all mesos callbacks, not just statusUpdate,
registered.
2013-07-09 01:23:23 -07:00
Charles Reiss
8c1d1c98e0
Explicitly set class loader for MesosSchedulerDriver callbacks.
2013-07-08 12:25:46 -07:00
Shivaram Venkataraman
4af0d63cb1
Remove akka LogLevel fix as we no longer use spray
2013-07-07 10:42:43 -07:00
Shivaram Venkataraman
a948f06725
Suppress log messages in sbt test with two changes:
...
1. Set akka log level to ERROR before shutting down the actorSystem.
This avoids akka log messages (like Spray) from falling back to INFO
on the Stdout logger
2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that
stack trace thrown during shutdown is handled by SLF4J instead of stdout
2013-07-07 04:09:08 -07:00
Patrick Wendell
32b9d21a97
Fix occasional failure in UI listener.
...
If a task fails before the metrics are initialized, it remains possible
that the metrics field will be `None`. This patch accounts for that possibility
by keeping metrics as an `Option` at all times.
2013-07-06 16:40:02 -07:00
Matei Zaharia
1ffadb2d9e
Merge remote-tracking branch 'pwendell/ui-updates'
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia
94871e4703
Merge pull request #655 from tgravescs/master
...
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Matei Zaharia
3f918b33f8
Merge pull request #672 from holdenk/master
...
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-06 12:45:18 -07:00
Matei Zaharia
7ba7fa110b
Merge pull request #674 from liancheng/master
...
Bug fix: SPARK-789
2013-07-06 11:45:08 -07:00
BlackNiuza
44a2440039
Remove active job from idToActiveJob when job finished or aborted
2013-07-07 01:33:09 +08:00
Patrick Wendell
37abe84212
Tracking some task metrics even during failures.
2013-07-06 09:19:59 -07:00
Patrick Wendell
84b7fc54e6
Enforcing correct sort order for formatted strings
2013-07-05 17:21:08 -07:00
Matei Zaharia
652ea0f1d8
Allow RDD.takeSample to give samples bigger than the RDD
...
Before, when withReplacement was set to true, we would not get a sample
bigger than the RDD's count().
Conflicts:
core/src/main/scala/spark/RDD.scala
core/src/test/scala/spark/RDDSuite.scala
2013-07-05 11:15:13 -07:00
Matei Zaharia
6586c5e28b
Added a SparkContext accessor to RDD
2013-07-05 11:13:46 -07:00
jerryshao
e4ff544a8d
Clean StageToInfos periodically when spark.cleaner.ttl is enabled
2013-07-05 10:34:45 +08:00
Lian Cheng
c0c3155c3c
Bug fix: SPARK-789
...
https://spark-project.atlassian.net/browse/SPARK-789
2013-07-05 00:54:10 +08:00
Holden Karau
0f06d6217d
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-04 01:05:39 -07:00
Gavin Li
94238aae57
fix dependencies
2013-07-03 18:08:38 +00:00
Prashant Sharma
a5f1f6a907
Merge branch 'master' into master-merge
...
Conflicts:
core/pom.xml
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/RDDCheckpointData.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/api/python/PythonRDD.scala
core/src/main/scala/spark/deploy/client/Client.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/ZippedRDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/BlockManagerMasterActor.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
core/src/test/scala/spark/SizeEstimatorSuite.scala
pom.xml
project/SparkBuild.scala
repl/src/main/scala/spark/repl/SparkILoop.scala
repl/src/test/scala/spark/repl/ReplSuite.scala
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala
2013-07-03 11:43:26 +05:30
Gavin Li
96130c30d9
add compression codec trait and snappy compression
2013-07-03 05:49:04 +00:00
Y.CORP.YAHOO.COM\tgraves
923cf92900
Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of the JAVA_HOME environment
...
variable conditional on whether it is set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Patrick Wendell
39e2325675
Removing dead code
2013-07-02 16:28:40 -07:00
Patrick Wendell
8ca1cc1786
Adding truncation for log files
2013-07-02 16:10:50 -07:00
Patrick Wendell
9a42d04efa
Throw exception for missing resource
2013-07-01 14:43:13 -07:00
Patrick Wendell
1025d7d1ef
Package refactoring
2013-07-01 14:40:53 -07:00
Patrick Wendell
30b9034241
Fixing bug where logs aren't shown
2013-07-01 13:48:01 -07:00
Patrick Wendell
8688689387
Various formatting changes
2013-07-01 13:40:12 -07:00
Patrick Wendell
735c951a09
Adding test script
2013-07-01 09:33:22 -07:00
Patrick Wendell
5de326db7d
Print exception message
2013-07-01 09:19:45 -07:00
root
ec31e68d5d
Fixed PySpark perf regression by not using socket.makefile(), and improved
...
debuggability by letting "print" statements show up in the executor's stderr
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-07-01 06:26:31 +00:00
root
3296d132b6
Fix performance bug with new Python code not using buffered streams
2013-07-01 06:25:43 +00:00
Matei Zaharia
03d0b858c8
Made use of spark.executor.memory setting consistent and documented it
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Patrick Wendell
e721ff7e5a
Allowing details for failed stages
2013-06-29 11:26:30 -07:00
Patrick Wendell
473961d82e
Styling for progress bar
2013-06-29 08:38:04 -07:00
Patrick Wendell
249f0e54ba
Minor changes from Matei's review
2013-06-28 13:25:26 -07:00
Patrick Wendell
c537e869f3
Missing logo file
2013-06-27 22:02:03 -07:00
Patrick Wendell
62c2c6b856
Forcing Jetty to run as daemon
2013-06-27 21:47:22 -07:00
Patrick Wendell
a55190d314
Adding better tabs for UI headers.
2013-06-27 19:14:51 -07:00
Patrick Wendell
362d996c81
Handful of changes based on matei's review
...
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
Patrick Wendell
92a4c2a5f6
Fixing bug in local scheduler time recording
2013-06-27 12:33:06 -07:00
Stephen Haberman
d7011632d1
Wrap lines.
2013-06-26 12:35:57 -05:00
Patrick Wendell
ee692482a6
One more private class
2013-06-26 09:07:32 -07:00
Patrick Wendell
a59c15a37e
Adding config option for retained stages
2013-06-26 08:54:57 -07:00
Patrick Wendell
274193664a
Bumping timeouts
2013-06-26 08:51:28 -07:00
Patrick Wendell
b14ad509ba
Moving static ui package
2013-06-26 08:46:51 -07:00
Patrick Wendell
2cbaa0734b
Making all new classes package private
2013-06-26 08:44:55 -07:00
Stephen Haberman
d11025dc6a
Be cute with Option and getenv.
2013-06-26 09:53:35 -05:00
Matei Zaharia
6c8d1b2ca6
Fix computation of classpath when we launch java directly
...
The previous version assumed that a CLASSPATH environment variable was
set by the "run" script when launching the process that starts the
ExecutorRunner, but unfortunately this is not true in tests. Instead, we
factor the classpath calculation into an external script and call that.
NOTE: This includes a Windows version but hasn't yet been tested there.
2013-06-25 18:21:00 -04:00
Matei Zaharia
15b00914c5
Some fixes to the launch-java-directly change:
...
- Split SPARK_JAVA_OPTS into multiple command-line arguments if it
contains spaces; this splitting follows quoting rules in bash
- Add the Scala JARs to the classpath if they're not in the CLASSPATH
variable because the ExecutorRunner is launched with "scala" (this can
happen when using local-cluster URLs in spark-shell)
2013-06-25 17:17:27 -04:00
Matei Zaharia
7e0191c6ea
Merge remote-tracking branch 'cgrothaus/SPARK-698'
...
Conflicts:
run
2013-06-25 15:47:40 -04:00
Patrick Wendell
d66bd6f885
Adding another unit test to Web UI suite
2013-06-24 17:12:55 -07:00
Patrick Wendell
f7389330c3
Allowing for requested port on construction
2013-06-24 16:51:52 -07:00
Patrick Wendell
42157027f2
A few bug fixes and a unit test
2013-06-24 16:25:05 -07:00
Patrick Wendell
a4248138b4
Minor style cleanup
2013-06-24 14:22:28 -07:00
Patrick Wendell
b5e6e8bcc8
Cleaning up some code for Job Progress
2013-06-24 14:13:24 -07:00
Patrick Wendell
93e8ed85aa
Work around for initialization issue
2013-06-24 13:11:18 -07:00
Patrick Wendell
f6e64b5cd6
Updating based on changes to JobLogger (and one small change to JobLogger)
2013-06-24 12:40:41 -07:00
Matei Zaharia
78ffe164b3
Clone the zero value for each key in foldByKey
...
The old version reused the object within each task, leading to
overwriting of the object when a mutable type is used, which is expected
to be common in fold.
Conflicts:
core/src/test/scala/spark/ShuffleSuite.scala
2013-06-23 10:26:53 -07:00
Matei Zaharia
0e0f9d3069
Fix search path for REPL class loader to really find added JARs
2013-06-22 17:44:04 -07:00
Matei Zaharia
3e61beff7b
Merge pull request #648 from shivaram/netty-dbg
...
Shuffle fixes and cleanup
2013-06-22 16:22:47 -07:00
Patrick Wendell
7e9f1ed0de
Some cleanup of styling
2013-06-22 10:31:37 -07:00
Patrick Wendell
3b7ebdeeb8
Handling entirely failed stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
be6107ce44
Some tweaking with shared page header
2013-06-22 10:31:37 -07:00
Patrick Wendell
9a24d1a2d0
Using scala in XML imports
2013-06-22 10:31:37 -07:00
Patrick Wendell
f91e1c4822
Linking RDD information when available in stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
a86bb459e2
Showing shuffle status and purging old stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
3485e73376
Style cleanup
2013-06-22 10:31:37 -07:00
Patrick Wendell
dd696f3a3d
Some renaming and comments
2013-06-22 10:31:37 -07:00
Patrick Wendell
5c872e9ef5
Documentation and some refactoring
2013-06-22 10:31:37 -07:00
Patrick Wendell
17776323a6
More work on percentile data:
2013-06-22 10:31:37 -07:00
Patrick Wendell
dcf6a68177
Refactoring into different modules
2013-06-22 10:31:36 -07:00
Patrick Wendell
ce81c320ac
Adding helper function to make listing tables
2013-06-22 10:31:36 -07:00
Patrick Wendell
9fd5dc3ea9
Initial steps towards job progress UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
bc4a811c57
Stash
2013-06-22 10:31:36 -07:00
Patrick Wendell
77c53f7868
Refactoring UI packages
2013-06-22 10:31:36 -07:00
Patrick Wendell
8b5c7e71c4
Import cleanup
2013-06-22 10:31:36 -07:00
Patrick Wendell
32a45d01b1
Removing twirl files
2013-06-22 10:31:36 -07:00
Patrick Wendell
4e1f202481
Removing dead code
2013-06-22 10:31:36 -07:00
Patrick Wendell
d6fde4ffe4
Some JSON cleanup
2013-06-22 10:31:36 -07:00
Patrick Wendell
91ec5a1a04
Changing JSON protocol and removing spray code
2013-06-22 10:31:36 -07:00
Patrick Wendell
fc94576ece
Adding worker version of UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
ee73c09ac9
Some comments
2013-06-22 10:31:36 -07:00
Patrick Wendell
9161db5478
Cleaning up master web UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
e55cf0245f
Adding WebUI file
2013-06-22 10:31:35 -07:00
Patrick Wendell
f85fd7a793
Commenting unfinished part
2013-06-22 10:31:35 -07:00
Patrick Wendell
2c36a514aa
Spray refactoring for master web UI
2013-06-22 10:31:35 -07:00
Patrick Wendell
7e6977b6c5
Fix in storage status page
2013-06-22 10:31:35 -07:00
Patrick Wendell
950f83535a
Adding deterministic port
2013-06-22 10:31:35 -07:00
Patrick Wendell
7cd70dc2c1
Minor cleanup
2013-06-22 10:31:35 -07:00
Patrick Wendell
e66f570194
Completely hacked version of block manager UI in jetty
2013-06-22 10:31:35 -07:00
Patrick Wendell
60fbf7e461
Partially working checkpoint
2013-06-22 10:31:35 -07:00
Matei Zaharia
1ef5d0d2c9
Merge pull request #644 from shimingfei/joblogger
...
add Joblogger to Spark (on new Spark code)
2013-06-22 09:35:57 -07:00
Jey Kottalam
1ba3c17303
use parens when calling method with side-effects
2013-06-21 12:14:16 -04:00
Jey Kottalam
edb18ca928
Rename PythonWorker to PythonWorkerFactory
2013-06-21 12:14:16 -04:00
Jey Kottalam
62c4781400
Add tests and fixes for Python daemon shutdown
2013-06-21 12:14:16 -04:00
Jey Kottalam
c79a6078c3
Prefork Python worker processes
2013-06-21 12:14:16 -04:00
Jey Kottalam
40afe0d2a5
Add Python timing instrumentation
2013-06-21 12:14:16 -04:00
Mingfei
2fc794a6c7
small modify in DAGScheduler
2013-06-21 18:21:35 +08:00
Mingfei
4b9862ac9c
small format modification
2013-06-21 17:55:32 +08:00
Mingfei
aa7aa587be
some format modification
2013-06-21 17:48:41 +08:00
Mingfei
5240795154
edit according to comments
2013-06-21 17:38:23 +08:00
Matei Zaharia
71030ba3eb
Merge pull request #654 from lyogavin/enhance_pipe
...
fix typo and coding style in #638
2013-06-19 15:21:03 -07:00
Thomas Graves
bad51c7cb4
upmerge with latest mesos/spark master and fix hbase compile with hadoop2-yarn profile
2013-06-19 14:39:13 -05:00
Thomas Graves
75d78c7ac9
Add support for Spark on Yarn on a secure Hadoop cluster
2013-06-19 11:18:42 -05:00
Matei Zaharia
7902baddc7
Update ASM to version 4.0
2013-06-19 13:34:30 +02:00
Gavin Li
0a2a9bce1e
fix typo and coding style
2013-06-18 21:30:13 +00:00
jerryshao
1e9269c3ee
reduce ZippedPartitionsRDD's getPreferredLocations complexity
2013-06-18 09:49:06 +08:00
Matei Zaharia
db42451a52
Merge pull request #643 from adatao/master
...
Bug fix: Zero-length partitions result in NaN for overall mean & variance
2013-06-17 15:26:36 -07:00
Matei Zaharia
e82a2ffcc9
Merge pull request #653 from rxin/logging
...
SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory."
2013-06-17 15:13:15 -07:00
Matei Zaharia
ec193c7d89
Merge remote-tracking branch 'xiajunluan/xiajunluan'
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
2013-06-18 00:11:50 +02:00
Reynold Xin
be3c406edf
Fixed the typo pointed out by Matei.
2013-06-17 17:07:51 -04:00
Reynold Xin
1450296797
SPARK-781: Log the temp directory path when Spark says "Failed to create
...
temp directory".
2013-06-17 16:58:23 -04:00
Gavin Li
4508089fc3
refine comments and add sc.clean
2013-06-17 05:23:46 +00:00
Gavin Li
e6ae049283
Merge remote-tracking branch 'upstream1/master' into enhance_pipe
2013-06-16 22:53:39 +00:00
Gavin Li
fb6d733fa8
update according to comments
2013-06-16 22:32:55 +00:00
Matei Zaharia
f961aac8b2
Merge pull request #649 from ryanlecompte/master
...
Add top K method to RDD using a bounded priority queue
2013-06-15 00:53:41 -07:00
ryanlecompte
e8801d4490
use delegation for BoundedPriorityQueue, add Java API
2013-06-14 23:39:05 -07:00
Reynold Xin
2cc188fd54
SPARK-774: cogroup should also disable map side combine by default
2013-06-14 00:10:54 -07:00
Reynold Xin
6738178d0d
SPARK-772: groupByKey should disable map side combine.
2013-06-13 23:59:42 -07:00
ryanlecompte
93b3f5e535
drop unneeded ClassManifest implicit
2013-06-13 16:26:35 -07:00
ryanlecompte
44b8dbaede
use Iterator.single(elem) instead of Iterator(elem) for improved performance based on scaladocs
2013-06-13 16:23:15 -07:00
Shivaram Venkataraman
1d9f0df065
Fix some comments and style
2013-06-13 14:46:25 -07:00
Mingfei
967a6a699d
modify sparklister function interface according to comments
2013-06-13 14:36:07 +08:00
Shivaram Venkataraman
5da4287b1d
Merge branch 'netty-dbg' of github.com:shivaram/spark into netty-dbg
2013-06-12 16:38:37 -07:00
Shivaram Venkataraman
5e9a9317c5
Merge branch 'master' of git://github.com/mesos/spark into netty-dbg
2013-06-12 16:38:01 -07:00
ryanlecompte
db5bca08ff
add a new top K method to RDD using a bounded priority queue
2013-06-12 10:54:16 -07:00
Andrew xia
190ec61799
change code style and debug info
2013-06-10 15:27:02 +08:00
Patrick Wendell
ef14dc2e77
Adding Java-API version of compression codec
2013-06-09 18:09:46 -07:00
Patrick Wendell
df592192e7
Monads FTW
2013-06-09 18:09:24 -07:00
Patrick Wendell
d1bbcebae5
Adding compression to Hadoop save functions
2013-06-09 11:39:35 -07:00
Mingfei
ade822011d
not check return value of eventQueue.take
2013-06-08 16:26:45 +08:00
Matei Zaharia
5b5b5aedbf
Fixed a few test issues due to Akka 2.1, as well as SBT memory.
...
Unfortunately, in Akka 2.1, ActorSystem.awaitTermination hangs for
remote actors, and Akka also leaves a non-daemon Netty thread even when
run in daemon mode. Thus I had to comment out some of the calls to
awaitTermination, and we still have one failing test.
2013-06-08 01:09:24 -07:00
Mingfei
4fd86e0e10
delete test code for joblogger in SparkContext
2013-06-08 15:45:47 +08:00
Mingfei
362f0f93ac
Merge branch 'master' of https://github.com/mesos/spark
2013-06-08 15:20:13 +08:00
Mingfei
1a4d93c025
modify to pass job annotation by localProperties and use daemon thread to do joblogger's work
2013-06-08 14:23:39 +08:00
Matei Zaharia
b58a29295b
Small formatting and style fixes
2013-06-07 22:51:28 -07:00
Matei Zaharia
c8fc423bc2
Merge pull request #631 from jerryshao/master
...
Fix block manager UI display issue when enable spark.cleaner.ttl
2013-06-07 22:43:18 -07:00
Matei Zaharia
c9ca0a4a58
Small code style fix to SchedulingAlgorithm.scala
2013-06-07 22:40:44 -07:00
Matei Zaharia
1ae60bcb36
Merge pull request #634 from xiajunluan/master
...
[Spark-753] Fix ClusterSchedulSuite unit test failed
2013-06-07 22:39:06 -07:00
Shivaram Venkataraman
ac480fd977
Clean up variables and counters in BlockFetcherIterator
2013-06-06 16:34:27 -07:00
Gavin Li
e179ff8a32
update according to comments
2013-06-05 22:41:05 +00:00
Shivaram Venkataraman
cb2f5046ee
Pass in bufferSize to BufferedOutputStream
2013-06-05 15:09:02 -07:00
Shivaram Venkataraman
c851957fe4
Don't write zero block files with java serializer
2013-06-05 14:28:38 -07:00
Christopher Nguyen
9d35904357
In the current code, when both partitions happen to have zero-length, the return mean will be NaN.
...
Consequently, the result of mean after reducing over all partitions will also be NaN,
which is not correct if there are partitions with non-zero length. This patch fixes this issue.
2013-06-04 22:12:47 -07:00
Matei Zaharia
fff3728552
Merge pull request #640 from pwendell/timeout-update
...
Fixing bug in BlockManager timeout
2013-06-04 16:09:50 -07:00
Patrick Wendell
061fd3ae36
Fixing bug in BlockManager timeout
2013-06-04 19:02:44 -04:00
Matei Zaharia
f420d4f228
Merge pull request #639 from pwendell/timeout-update
...
Bump akka and blockmanager timeouts to 60 seconds
2013-06-04 15:25:58 -07:00
Patrick Wendell
8bd4e12104
Bump akka and blockmanager timeouts to 60 seconds
2013-06-04 18:14:24 -04:00
Shivaram Venkataraman
96943a1cc0
var to val
2013-06-03 12:29:38 -07:00
Shivaram Venkataraman
cd347f547a
Reuse the file object as it is valid after delete
2013-06-03 12:27:51 -07:00
Shivaram Venkataraman
a058b0acf3
Delete a file for a block if it already exists.
2013-06-03 12:10:00 -07:00
Andrew xia
606bb1b450
Fix schedulingAlgorithm bugs for unit test
2013-06-03 10:29:23 +08:00
Shivaram Venkataraman
038cfc1a9a
Make connect timeout configurable
2013-05-31 23:32:18 -07:00
Shivaram Venkataraman
91aca92249
Another round of Netty fixes.
...
1. Avoid race condition between stop and copier completion
2. Handle socket exceptions by reporting them and filling in a failed
FetchResult
2013-05-31 23:21:38 -07:00
Gavin Li
9f84315c05
enhance pipe to support what we can do in hadoop streaming
2013-06-01 00:26:10 +00:00
Reynold Xin
de1167bf2c
Incorporated Charles' feedback to put rdd metadata removal in
...
BlockManagerMasterActor.
2013-05-31 15:54:57 -07:00
Reynold Xin
ba5e544461
More block manager cleanup.
...
Implemented a removeRdd method in BlockManager, and use that to
implement RDD.unpersist. Previously, unpersist needs to send B akka
messages, where B = number of blocks. Now unpersist only needs to send W
akka messages, where W = the number of workers.
2013-05-31 01:48:16 -07:00
jerryshao
926f41cc52
fix block manager UI display issue when enable spark.cleaner.ttl
2013-05-31 09:32:52 +08:00
Reynold Xin
bed1b08169
Do not create symlink for local add file. Instead, copy the file.
...
This prevents Spark from changing the original file's permission, and
also allow add file to work on non-posix operating systems.
2013-05-30 16:21:49 -07:00
Shivaram Venkataraman
3b0cd17343
Merge branch 'master' of git://github.com/mesos/spark
...
Conflicts:
core/src/test/scala/spark/ShuffleSuite.scala
2013-05-30 14:36:24 -07:00
Andrew xia
c3db3ea554
1. Add unit test for local scheduler
...
2. Move localTaskSetManager to a new file
2013-05-30 20:49:40 +08:00
Andrew xia
ecceb101d3
implement FIFO and fair scheduler for spark local mode
2013-05-30 10:43:01 +08:00
Shivaram Venkataraman
19fd6d54c0
Also flush serializer in revertPartialWrites
2013-05-29 17:29:34 -07:00
Shivaram Venkataraman
618c8cae1e
Skip fetching zero-sized blocks in OIO.
...
Also unify splitLocalRemoteBlocks for netty/nio and add a test case
2013-05-29 13:18:54 -07:00
Matei Zaharia
6ed71390d9
Merge pull request #626 from stephenh/remove-add-if-no-port
...
Remove unused addIfNoPort.
2013-05-29 10:14:22 -07:00
Shivaram Venkataraman
b79b10a6d6
Flush serializer to fix zero-size kryo blocks bug.
...
Also convert the local-cluster test case to check for non-zero block sizes
2013-05-29 00:52:55 -07:00
Matei Zaharia
41d230ccb0
Merge pull request #611 from squito/classloader
...
Use default classloaders for akka & deserializing task results
2013-05-28 23:35:24 -07:00
Shivaram Venkataraman
fbc1ab3468
Couple of Netty fixes
...
a. Fix the port number by reading it from the bound channel
b. Fix the shutdown sequence to make sure we actually block on the channel
c. Fix the unit test to use two JVMs.
2013-05-28 16:27:16 -07:00
Stephen Haberman
4fe1fbdd51
Remove unused addIfNoPort.
2013-05-28 16:26:32 -05:00
Matei Zaharia
3db1e17baa
Merge pull request #620 from jerryshao/master
...
Fix CheckpointRDD java.io.FileNotFoundException when calling getPreferredLocations
2013-05-27 21:31:43 -07:00
Matei Zaharia
e8d4b6c296
Merge pull request #529 from xiajunluan/master
...
[SPARK-663]Implement Fair Scheduler in Spark Cluster Scheduler
2013-05-25 21:09:03 -07:00
Reynold Xin
26962c9340
Automatically configure Netty port. This makes unit tests using
...
local-cluster pass. Previously they were failing because Netty was
trying to bind to the same port for all processes.
Pair programmed with @shivaram.
2013-05-24 16:39:33 -07:00
Reynold Xin
6ea085169d
Fixed the bug that shuffle serializer is ignored by the new shuffle
...
block iterators for local blocks. Also added a unit test for that.
2013-05-24 14:08:37 -07:00
jerryshao
bd3ea8f2a6
fix CheckpointRDD getPreferredLocations java.io.FileNotFoundException
2013-05-24 14:26:19 +08:00
Charles Reiss
f350f14084
Use ARRAY_SAMPLE_SIZE constant instead of 100.0
2013-05-21 18:11:33 -07:00
Andrew xia
ecd6d75c6a
fix bug of unit tests
2013-05-21 06:49:23 +08:00
Reynold Xin
5912cc4967
Merge pull request #610 from JoshRosen/spark-747
...
Throw exception if TaskResult exceeds Akka frame size
2013-05-17 19:58:40 -07:00
Reynold Xin
8d78c5f89f
Changed the logging level from info to warning when addJar(null) is
...
called.
2013-05-17 18:51:35 -07:00
Andrew xia
3d4672eaa9
Merge branch 'master' into xiajunluan
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/scheduler/cluster/ClusterScheduler.scala
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
2013-05-18 07:28:03 +08:00
Andrew xia
d19753b9c7
expose TaskSetManager type to resourceOffer function in ClusterScheduler
2013-05-18 06:45:19 +08:00
Andrew xia
c6e2770bfe
Fix ClusterScheduler bug to avoid allocating tasks to same slave
2013-05-17 05:10:38 +08:00
Mridul Muralidharan
f0881f8d48
Hope this does not turn into a bike shed change
2013-05-17 01:58:50 +05:30
Mridul Muralidharan
feddd2530d
Filter out nulls - prevent NPE
2013-05-16 17:49:14 +05:30
Josh Rosen
b8e46b6074
Abort job if result exceeds Akka frame size; add test.
2013-05-16 01:57:57 -07:00
Matei Zaharia
2f576aba8f
Merge pull request #602 from rxin/shufflemerge
...
Manual merge & cleanup of Shane's Shuffle Performance Optimization
2013-05-15 18:06:24 -07:00
Reynold Xin
203d7b7c14
Merge pull request #593 from squito/driver_ui_link
...
Master UI has link to Application UI
2013-05-15 00:47:20 -07:00
Reynold Xin
f3491cb89b
Merge branch 'master' of github.com:mesos/spark into shufflemerge
...
Conflicts:
core/src/main/scala/spark/storage/BlockManager.scala
core/src/test/scala/spark/DistributedSuite.scala
project/SparkBuild.scala
2013-05-15 00:31:52 -07:00
Reynold Xin
f9d40a5848
Added a comment in JdbcRDD for example usage.
2013-05-14 23:29:57 -07:00
Reynold Xin
81ad2fa331
Merge branch 'jdbc' of github.com:koeninger/spark
...
Conflicts:
project/SparkBuild.scala
2013-05-14 23:12:00 -07:00
Imran Rashid
38d4b97c6d
use threads classloader when deserializing task results; classnotfoundexception includes classloader
2013-05-14 22:32:14 -07:00
Imran Rashid
d7d1da79d3
when akka starts, use akkas default classloader (current thread)
2013-05-14 22:32:09 -07:00
Matei Zaharia
016ac86830
Merge pull request #601 from rxin/emptyrdd-master
...
EmptyRDD (master branch 0.8)
2013-05-13 21:45:36 -07:00
Matei Zaharia
4b354e0a08
Merge pull request #589 from mridulm/master
...
Add support for instance local scheduling
2013-05-13 17:39:19 -07:00
Patrick Wendell
7f0833647b
Capturing class name
2013-05-12 07:54:03 -07:00
Patrick Wendell
72b9c4cb6e
Small fix
2013-05-11 23:53:50 -07:00
Patrick Wendell
1c15b85051
Removing import
2013-05-11 23:52:53 -07:00
Patrick Wendell
059ab88754
Changing technique to use same code path in all cases
2013-05-11 23:50:54 -07:00
Cody Koeninger
3da2305ed0
code cleanup per rxin comments
2013-05-11 23:59:07 -05:00
Josh Rosen
440719109e
Throw exception if task result exceeds Akka frame size.
...
This partially addresses SPARK-747.
2013-05-11 19:17:13 -07:00
Patrick Wendell
0345954530
SPARK-738: Spark should detect and squash nonserializable exceptions
2013-05-11 14:17:09 -07:00
Mark Hamstra
6e6b3e0d7e
Actually use the cleaned closure in foreachPartition
2013-05-10 13:02:34 -07:00
Imran Rashid
0ab818d508
fix linebreak
2013-05-09 00:38:59 -07:00
Reynold Xin
5d70ee4663
Cleaned up connection manager (moved many classes to their own files).
2013-05-07 22:42:15 -07:00
Reynold Xin
8388e8dd7a
Minor style fix in DiskStore...
2013-05-07 18:40:35 -07:00
Reynold Xin
547dcbe494
Cleaned up Scala files in network/netty from Shane's PR.
2013-05-07 18:39:33 -07:00
Reynold Xin
9e64396ca4
Cleaned up the Java files from Shane's PR.
2013-05-07 18:30:54 -07:00
Reynold Xin
0e5cc30868
Cleaned up BlockManager and BlockFetcherIterator from Shane's PR.
2013-05-07 18:18:24 -07:00
Reynold Xin
8b79485171
Moved BlockFetcherIterator to its own file.
2013-05-07 17:02:32 -07:00
Reynold Xin
90577ada69
Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge
...
Conflicts:
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/DiskStore.scala
project/SparkBuild.scala
2013-05-07 15:56:19 -07:00
Reynold Xin
0fd84965f6
Added EmptyRDD.
2013-05-06 15:40:34 -07:00
Imran Rashid
22a5063ae4
switch from separating appUI host & port to combining into just appUiUrl
2013-05-05 12:19:11 -07:00
Matei Zaharia
7af92f248b
Merge pull request #597 from JoshRosen/webui-fixes
...
Two minor bug fixes for Spark Web UI
2013-05-04 22:29:17 -07:00
Josh Rosen
42b1953c53
Fix SPARK-630: app details page shows finished executors as running.
2013-05-04 18:34:47 -07:00
Josh Rosen
c0688451a6
Fix wrong closing tags in web UI HTML.
2013-05-04 18:34:46 -07:00
Josh Rosen
d48e9fde01
Fix SPARK-629: weird number of cores in job details page.
2013-05-04 18:34:45 -07:00
Mridul Muralidharan
25198d7e9e
Merge branch 'master' of github.com:mridulm/spark
2013-05-04 20:45:56 +05:30
Mridul Muralidharan
5b011d18d7
Merge from master
2013-05-04 20:41:27 +05:30
Mridul Muralidharan
edb57c8331
Add support for instance local in getPreferredLocations of ZippedPartitionsBaseRDD. Add comments to both ZippedPartitionsBaseRDD and ZippedRDD to better describe the potential problem with the approach
2013-05-04 19:47:45 +05:30
Matei Zaharia
3bf2c868c3
Merge pull request #594 from shivaram/master
...
Add zip partitions to Java API
2013-05-03 18:27:30 -07:00
Shivaram Venkataraman
bb8a434f9d
Add zipPartitions to Java API.
2013-05-03 15:14:02 -07:00
Imran Rashid
6fae936088
applications (aka drivers) send their webUI address to master when registering so it can be displayed in the master web ui
2013-05-03 12:59:10 -07:00
Mridul Muralidharan
ea2a6f91d3
pull from master
2013-05-04 00:35:59 +05:30
Reynold Xin
93091f6936
Merge branch 'master' of github.com:mesos/spark into blockmanager
2013-05-03 01:02:32 -07:00
Reynold Xin
2bc895a829
Updated according to Matei's code review comment.
2013-05-03 01:02:16 -07:00
Mridul Muralidharan
11589c39d9
Fix ZippedRDD as part of Matei's suggestion
2013-05-03 12:23:30 +05:30
Matei Zaharia
6fe9d4e61e
Merge pull request #592 from woggling/localdir-fix
...
Don't accept generated local directory names that can't be created
2013-05-02 21:33:56 -07:00
Matei Zaharia
538ee755b4
Merge pull request #581 from jerryshao/master
...
fix [SPARK-740] block manager UI throws exception when enabling Spark Streaming
2013-05-02 09:01:42 -07:00
Charles Reiss
c847dd3da2
Don't accept generated temp directory names that can't be created successfully.
2013-05-01 23:19:10 -07:00
Reynold Xin
4a31877408
Added the unpersist api to JavaRDD.
2013-05-01 20:31:54 -07:00
Reynold Xin
98df9d2853
Added removeRdd function in BlockManager.
2013-05-01 20:17:09 -07:00
Mridul Muralidharan
dfde9ce9dd
comment out debug versions of checkHost, etc from Utils - which were used to test
2013-05-02 07:41:33 +05:30
Mridul Muralidharan
1b5aaeadc7
Integrate review comments 2
2013-05-02 07:30:06 +05:30
jerryshao
c047f0e3ad
filter out Spark streaming block RDD and sort RDDInfo with id
2013-05-02 09:48:32 +08:00
Mridul Muralidharan
609a817f52
Integrate review comments on pull request
2013-05-02 06:44:33 +05:30
Reynold Xin
204eb32e14
Changed the type of the persistentRdds hashmap back to
...
TimeStampedHashMap.
2013-05-01 16:14:58 -07:00
Reynold Xin
34637b97ec
Added SparkContext.cleanup back. Not sure why it was removed before ...
2013-05-01 16:12:37 -07:00
Reynold Xin
3227ec8edd
Cleaned up Ram's code. Moved SparkContext.remove to RDD.unpersist.
...
Also updated unit tests to make sure they are properly testing for
concurrency.
2013-05-01 16:07:44 -07:00
harshars
8481562731
Merged Ram's commit on removing RDDs.
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2013-05-01 14:42:17 -07:00
Mridul Muralidharan
27764a00f4
Fix some npe introduced accidentally
2013-05-01 20:56:05 +05:30
Mridul Muralidharan
d960e7e0f8
a) Add support for hyper local scheduling - specific to a host + port - before trying host local scheduling.
...
b) Add some fixes to test code to ensure it passes (and fixes some other issues).
c) Fix bug in task scheduling which incorrectly used availableCores instead of all cores on the node.
2013-05-01 20:24:00 +05:30
Prashant Sharma
dbe2887da7
Fixed deprecated method warning
2013-05-01 13:22:49 +05:30
Matei Zaharia
aa8fe1a209
Merge pull request #586 from mridulm/master
...
Pull request to address issues Reynold Xin reported
2013-04-30 22:30:18 -07:00
Reynold Xin
dd7bef3147
Two minor fixes according to Ryan LeCompte's review.
2013-04-30 15:02:32 -07:00
Reynold Xin
cea6174573
Merge branch 'master' of github.com:mesos/spark into blockmanager
...
Conflicts:
core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
2013-04-30 13:28:35 -07:00
Mridul Muralidharan
60cabb35cb
Add additional catch block for exception too
2013-05-01 01:17:14 +05:30
Mridul Muralidharan
3b748ced22
Be more aggressive and defensive in all uses of SelectionKey in select loop
2013-05-01 00:30:30 +05:30
Mridul Muralidharan
0f45477be1
Change indentation
2013-05-01 00:10:02 +05:30
Mridul Muralidharan
538614acfe
Be more aggressive and defensive in select also
2013-05-01 00:05:32 +05:30
Mridul Muralidharan
48854e1dbf
If key is not valid, close connection
2013-04-30 23:59:33 +05:30
Matei Zaharia
f708dda81e
Merge pull request #585 from pwendell/listener-perf
...
[Fix SPARK-742] Task Metrics should not employ per-record timing by default
2013-04-30 07:51:40 -07:00
Mridul Muralidharan
e46d547ccd
Fix issues reported by Reynold
2013-04-30 16:15:56 +05:30
Reynold Xin
1055785a83
Allow specifying the shuffle write file buffer size. The default buffer
...
size is 8KB in FastBufferedOutputStream, which is too small and would
cause a lot of disk seeks.
2013-04-29 23:33:56 -07:00
Reynold Xin
7007201201
Added a shuffle block manager so it is easier in the future to
...
consolidate shuffle output files.
2013-04-29 23:07:03 -07:00
Reynold Xin
d3586ef438
Merge branch 'blockmanager' of github.com:rxin/spark into blockmanager
...
Conflicts:
core/src/main/scala/spark/storage/DiskStore.scala
2013-04-29 15:44:18 -07:00
Patrick Wendell
016ce1fa9c
Using full package name for util
2013-04-29 12:02:27 -07:00
Patrick Wendell
540be6b154
Modified version of the fix which just removes all per-record tracking.
2013-04-29 11:32:07 -07:00
Patrick Wendell
224fbac061
Spark-742: TaskMetrics should not employ per-record timing.
...
This patch does three things:
1. Makes TimedIterator a trait with two implementations (one a no-op)
2. Makes the default behavior to use the no-op implementation
3. Removes DelegateBlockFetchTracker. This is just cleanup, but it seems like
the trait doesn't really reduce complexity in any way.
In the future we can add other implementations, e.g. ones which perform sampling.
2013-04-29 11:13:43 -07:00
Prashant Sharma
24bbf318b3
Fixed other warnings
2013-04-29 19:56:28 +05:30
Prashant Sharma
d3518f57cd
Fixed warning: erasure -> runtimeClass
2013-04-29 18:14:25 +05:30
Prashant Sharma
8f3ac240cb
Fixed Warning: ClassManifest -> ClassTag
2013-04-29 16:39:13 +05:30
Shivaram Venkataraman
604d3bf56c
Rename partition class and add scala doc
2013-04-28 16:31:07 -07:00
Shivaram Venkataraman
15acd49f07
Actually rename classes to ZippedPartitions*
...
(the previous commit only renamed the file)
2013-04-28 16:03:22 -07:00
Shivaram Venkataraman
6e84635ab9
Rename classes from MapZipped* to Zipped*
2013-04-28 15:58:40 -07:00
Shivaram Venkataraman
0cc6642b7c
Rename to zipPartitions and style changes
2013-04-28 05:11:03 -07:00
Shivaram Venkataraman
c9c4954d99
Add an interface to zip iterators of multiple RDDs
...
The current code supports 2, 3 or 4 arguments but can be extended
to more arguments if required.
2013-04-26 16:57:46 -07:00
Matei Zaharia
6e6b5204ea
Create an empty directory when checkpointing a 0-partition RDD (fixes a
...
test failure on Hadoop 2.0)
2013-04-25 00:42:37 -07:00
Reynold Xin
ba6ffa6a5f
Allow the specification of a shuffle serializer in the read path (for
...
local block reads).
2013-04-24 17:38:07 -07:00
Reynold Xin
aa618ed2a2
Allow changing the serializer on a per shuffle basis.
2013-04-24 14:52:49 -07:00
Prashant Sharma
ad88f083a6
scala 2.10 and master merge
2013-04-24 18:08:26 +05:30
Mridul Muralidharan
dd515ca3ee
Attempt at fixing merge conflict
2013-04-24 09:24:17 +05:30
Reynold Xin
31ce6c66d6
Added a BlockObjectWriter interface in block manager so ShuffleMapTask
...
doesn't need to build up an array buffer for each shuffle bucket.
2013-04-23 17:48:59 -07:00
koeninger
dfac0aa5c2
prevent mysql driver from pulling entire resultset into memory. explicitly close resultset and statement.
2013-04-22 21:12:52 -05:00
Prashant Sharma
185bb9525a
Manually merged scala-2.10 and master
2013-04-22 14:14:03 +05:30
Mridul Muralidharan
7acab3ab45
Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo
2013-04-22 08:01:13 +05:30
koeninger
b2a3f24dde
first attempt at an RDD to pull data from JDBC sources
2013-04-21 00:29:37 -05:00
Andrew xia
8436bd5d4a
remove TaskSetQueueManager and update code style
2013-04-19 02:17:22 +08:00
Andrew xia
e0603d7e8b
refactor the Schedulable interface and add unit test for SchedulingAlgorithm
2013-04-18 13:13:54 +08:00
Mridul Muralidharan
5ee2f5c483
Cache pattern, add (commented out) alternatives for check* apis
2013-04-17 23:13:34 +05:30
Mridul Muralidharan
f07961060d
Add a small note on spark.tasks.schedule.aggression
2013-04-17 23:13:02 +05:30
Mridul Muralidharan
02dffd2eb0
Ensure all ask/await block for spark.akka.askTimeout - so that it is controllable : instead of arbitrary timeouts spread across codebase. In our tests, we use 30 seconds, though default of 10 is maintained
2013-04-17 05:52:57 +05:30
Mridul Muralidharan
ad80f68eb5
remove spurious debug statements
2013-04-16 22:15:34 +05:30
Mridul Muralidharan
f7969f72ee
Fix exception when checkpoint path does not exist (no data in rdd which is being checkpointed for example)
2013-04-16 21:51:38 +05:30
Mridul Muralidharan
323ab8ff3b
Scala does not prevent variable shadowing ! Sick error due to it ...
2013-04-16 17:05:10 +05:30
shane-huang
b493f55a4f
fix a bug in netty Block Fetcher
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-04-16 10:01:01 +08:00
Mridul Muralidharan
59c380d69a
Fix npe
2013-04-16 03:29:38 +05:30
Mridul Muralidharan
dd2b64ec97
Fix bug with atomic update
2013-04-16 03:19:24 +05:30
Mridul Muralidharan
5540ab8243
Use hostname instead of hostport for executor, fix creation of workdir
2013-04-16 02:57:43 +05:30
Mridul Muralidharan
eb7e95e833
Commit job to persist files
2013-04-16 02:56:36 +05:30
Matei Zaharia
a64c107449
Make ShuffledRDD.prev transient
2013-04-15 16:41:51 -04:00
Mridul Muralidharan
19652a44be
Fix issue with FileSuite failing
2013-04-15 19:16:36 +05:30
Mridul Muralidharan
54b3d45b81
Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues
2013-04-15 18:26:50 +05:30
Mridul Muralidharan
d90d2af103
Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues
2013-04-15 18:12:11 +05:30
Matei Zaharia
c35d530bcf
Fix compile error
2013-04-13 12:43:12 -04:00
Andrew Ash
29d3440efb
Add details when BlockManager heartbeats time out
...
Makes it more clear what the threshold was for tuning spark.storage.blockManagerSlaveTimeoutMs
Before:
WARN "Removing BlockManager BlockManagerId(201304022120-1976232532-5050-27464-0, myhostname, 51337) with no recent heart beats
After:
WARN "Removing BlockManager BlockManagerId(201304022120-1976232532-5050-27464-0, myhostname, 51337) with no recent heart beats: 19216ms exceeds 15000ms
2013-04-11 01:54:02 -03:00
Andrew xia
2f883c515f
Continue to update code for Scala code style
...
1.refactor braces for "class" "if" "while" "for" "match"
2.make code lines less than 100
3.refactor class parameter and extends definition
2013-04-09 13:02:50 +08:00
Matei Zaharia
054feb6448
Fixed a bug with zip
2013-04-07 21:15:21 -04:00
Matei Zaharia
b5900d47b1
Fix compile warning
2013-04-07 20:55:42 -04:00
Matei Zaharia
6962d40b44
Fix deprecated warning
2013-04-07 20:27:33 -04:00
Mridul Muralidharan
6798a09df8
Add support for building against hadoop2-yarn : adding new maven profile for it
2013-04-07 17:47:38 +05:30
shane-huang
df47b40b76
Shuffle Performance fix: Use netty embedded OIO file server instead of ConnectionManager
...
Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages
change reference from io.Source to scala.io.Source to avoid looking into io.netty package
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-04-07 14:37:12 +08:00
Andrew xia
2b373dd07a
add properties default value null to fix sbt/sbt test errors
2013-04-02 12:11:14 +08:00
Mark Hamstra
e215f67923
Correct sense of 'filter out' in comment.
2013-03-31 08:00:13 -07:00
Mark Hamstra
8bcdc64005
Fixed broken filter in getWritableClass[T]
2013-03-30 22:09:52 -07:00
Matei Zaharia
9831bc1a09
Merge pull request #539 from cgrothaus/fix-webui-workdirpath
...
Bugfix: WorkerWebUI must respect workDirPath from Worker
2013-03-29 22:16:22 -07:00
Matei Zaharia
3cc8ab6e29
Merge pull request #541 from stephenh/shufflecoalesce
...
Add a shuffle parameter to coalesce.
2013-03-29 22:14:07 -07:00
Andrew xia
1a28f92711
change some typo and some spacing
2013-03-29 08:34:28 +08:00
Andrew xia
def3d1c84a
1.remove redundant spacing in source code
...
2.replace get/set functions with val and var definition
2013-03-29 08:20:35 +08:00
Holden Karau
f5df729b12
Explicitly catch all throwables (warning in 2.10)
2013-03-24 16:15:32 -07:00
Stephen Haberman
dd854d5b9f
Use Boolean in the Java API, and != for assert.
2013-03-23 11:49:45 -05:00
Stephen Haberman
4ca273edc4
Merge branch 'master' into shufflecoalesce
...
Conflicts:
core/src/test/scala/spark/RDDSuite.scala
2013-03-23 11:45:45 -05:00
Matei Zaharia
b8949cab88
Merge pull request #505 from stephenh/volatile
...
Make Executor fields volatile since they're read from the thread pool.
2013-03-23 07:19:34 -07:00
Matei Zaharia
fd53f2fc7b
Merge pull request #510 from markhamstra/WithThing
...
mapWith, flatMapWith and filterWith
2013-03-23 07:13:21 -07:00
Andrew xia
d1d9bdaabe
Just update typo and comments
2013-03-23 07:25:30 +08:00
Stephen Haberman
00170eb0b9
Fix are/our typo.
2013-03-22 12:59:08 -05:00
Stephen Haberman
1c67c7dfd1
Add a shuffle parameter to coalesce.
...
This is useful for when you want just 1 output file (part-00000) but
still want the upstream RDD to be computed in parallel.
2013-03-22 08:54:44 -05:00
Christoph Grothaus
445f387ef4
Bugfix: WorkerWebUI must respect workDirPath from Worker
2013-03-22 11:08:40 +01:00
Matei Zaharia
35588490cb
Merge pull request #538 from rxin/cogroup
...
Added mapSideCombine flag to CoGroupedRDD. Added unit test for CoGroupedRDD.
2013-03-20 19:27:47 -07:00
Stephen Haberman
4f4215311a
Merge branch 'master' into volatile
2013-03-20 15:37:10 -05:00
Matei Zaharia
b812e6b7bb
Merge pull request #526 from markhamstra/foldByKey
...
Add foldByKey
2013-03-20 11:21:02 -07:00
Reynold Xin
d48ee7e55e
Merge branch 'master' of github.com:mesos/spark into cogroup
2013-03-20 14:00:28 +08:00
Reynold Xin
00a11304fd
Added mapSideCombine flag to CoGroupedRDD. Added unit test for
...
CoGroupedRDD.
2013-03-20 13:49:51 +08:00
Matei Zaharia
945d1e720e
Merge pull request #536 from sasurfer/master
...
CoalescedRDD for many partitions
2013-03-19 21:59:06 -07:00
Matei Zaharia
1cbbe94ac1
Merge pull request #534 from stephenh/removetrycatch
...
Remove try/catch block that can't be hit.
2013-03-19 21:34:34 -07:00
Andrey Kouznetsov
bd167f83b0
call setConf from input format if it is Configurable
2013-03-19 17:15:15 +04:00
Giovanni Delussu
aceae029f7
CoalescedRDD changed to work with a big number of partitions both in the original and the new coalesced RDD.
...
The limitation was in the range that Scala.Int can represent.
2013-03-19 11:25:45 +01:00
Stephen Haberman
fb34967815
Remove try/catch block that can't be hit.
2013-03-18 01:55:50 -05:00
Mark Hamstra
ab33e27cc9
constructorOfA -> constructA in doc comments
2013-03-16 15:29:15 -07:00
Mark Hamstra
9784fc1fcd
fix wayward comma in doc comment
2013-03-16 15:25:02 -07:00
Mark Hamstra
32979b5e7d
whitespace
2013-03-16 13:36:46 -07:00
Mark Hamstra
ca9f81e8fc
refactor foldByKey to use combineByKey
2013-03-16 13:31:01 -07:00
Mark Hamstra
1fb192ef40
Merge branch 'master' of https://github.com/mesos/spark into foldByKey
2013-03-16 12:17:13 -07:00
Mark Hamstra
80fc8c82ed
_With[Matei]
2013-03-16 12:16:29 -07:00
Mark Hamstra
38454c4aed
Merge branch 'master' of https://github.com/mesos/spark into WithThing
2013-03-16 11:54:44 -07:00
Matei Zaharia
c1e9cdc49f
Merge pull request #525 from stephenh/subtractByKey
...
Add PairRDDFunctions.subtractByKey.
2013-03-16 11:47:45 -07:00
Mark Hamstra
ef75be3bf7
Merge branch 'master' of https://github.com/mesos/spark into foldByKey
2013-03-15 21:41:24 -07:00
Andrew xia
5892393140
refactor fair scheduler implementation
...
1.Change "pool" properties to be a member of ActiveJob
2.Abstract the Schedulable of Pool and TaskSetManager
3.Abstract the FIFO and FS comparator algorithm
4.Miscellaneous changing of class define and construction
2013-03-16 11:13:38 +08:00
Matei Zaharia
cdbfd1e196
Merge pull request #516 from squito/fix_local_metrics
...
Fix local metrics
2013-03-15 15:13:28 -07:00
Mark Hamstra
857010392b
Fuller implementation of foldByKey
2013-03-15 10:56:05 -07:00
Mark Hamstra
16a4ca4537
restrict V type of foldByKey in order to retain ClassManifest; added foldByKey to Java API and test
2013-03-14 13:58:37 -07:00
Mark Hamstra
b1422cbdd5
added foldByKey
2013-03-14 12:59:58 -07:00
Stephen Haberman
7786881f47
Fix tabs that snuck in.
2013-03-14 14:57:12 -05:00
Stephen Haberman
7d8bb4df3a
Allow subtractByKey's other argument to have a different value type.
2013-03-14 14:44:15 -05:00
Stephen Haberman
4632c45af1
Finished subtractByKeys.
2013-03-14 10:35:34 -05:00
Matei Zaharia
4032beba49
Merge pull request #521 from stephenh/earlyclose
...
Close the reader in HadoopRDD as soon as iteration ends.
2013-03-13 19:29:46 -07:00
Stephen Haberman
63fe225587
Simplify SubtractedRDD in preparation for subtractByKey.
2013-03-13 17:17:34 -05:00
Mark Hamstra
cd5b947cf6
Merge branch 'master' of https://github.com/mesos/spark into WithThing
2013-03-13 13:16:14 -07:00
Stephen Haberman
1a175d13b9
Add NextIterator.closeIfNeeded.
2013-03-13 10:17:39 -05:00
Stephen Haberman
8f00d23598
Remove NextIterator.close default implementation.
2013-03-12 12:30:10 -05:00
Harold Lim
0b64e5f1ac
Removed some commented code
2013-03-12 13:31:27 +08:00
Harold Lim
f5b1fecb9f
Cleaned up the code
2013-03-12 13:31:27 +08:00
Harold Lim
b5325182a3
Updated/Refactored the Fair Task Scheduler. It does not inherit ClusterScheduler anymore. Rather, ClusterScheduler internally uses TaskSetQueuesManager that handles the scheduling of taskset queues. This is the class that should be extended to support other scheduling policies
2013-03-12 13:31:27 +08:00
Harold Lim
54ed7c4af4
Changed the name of the system property to set the allocation xml
2013-03-12 13:31:27 +08:00
Harold Lim
c07087364b
Made changes to the SparkContext to have a DynamicVariable for setting local properties that can be passed down the stack. Added an implementation of the fair scheduler
2013-03-12 13:31:27 +08:00
Stephen Haberman
9e68f48625
More quickly call close in HadoopRDD.
...
This also refactors out the common "gotNext" iterator pattern into
a shared utility class.
2013-03-11 23:59:17 -05:00
Charles Reiss
769d399674
Send block sizes as longs.
2013-03-11 14:17:05 -07:00
Mark Hamstra
1289e7176b
refactored _With API and added foreachPartition
2013-03-10 22:27:13 -07:00
Mark Hamstra
b57df1f5e3
Merge branch 'master' of https://github.com/mesos/spark into WithThing
2013-03-10 16:56:31 -07:00
Matei Zaharia
91a9d093bd
Merge pull request #512 from patelh/fix-kryo-serializer
...
Fix reference bug in Kryo serializer, add test, update version
2013-03-10 15:48:23 -07:00
Matei Zaharia
557cfd0f4d
Merge pull request #515 from woggling/deploy-app-death
...
Notify standalone deploy client of application death.
2013-03-10 15:44:57 -07:00
Matei Zaharia
a59cc6060f
Merge remote-tracking branch 'stephenh/nomocks'
...
Conflicts:
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala
2013-03-10 13:39:10 -07:00
Imran Rashid
20f01a0a1b
enable task metrics in local mode, add tests
2013-03-09 21:17:31 -08:00
Imran Rashid
ec30188a2a
rename remoteFetchWaitTime to fetchWaitTime, since it also includes time from local fetches
2013-03-09 21:16:53 -08:00
Charles Reiss
b0983c5762
Notify standalone deploy client of application death.
...
Usually, this isn't necessary since the application will be removed
as a result of the deploy client disconnecting, but occasionally, the
standalone deploy master removes an application otherwise.
Also mark applications as FAILED instead of FINISHED when they are
killed as a result of their executors failing too many times.
2013-03-09 11:29:45 -08:00
Hiral Patel
664e5fd24b
Fix reference bug in Kryo serializer, add test, update version
2013-03-07 22:16:11 -08:00
Mark Hamstra
5ff0810b11
refactor mapWith, flatMapWith and filterWith to each use two parameter lists
2013-03-05 12:25:44 -08:00
Mark Hamstra
d046d8ad32
whitespace formatting
2013-03-05 00:48:13 -08:00
Mark Hamstra
9148b968cf
mapWith, flatMapWith and filterWith
2013-03-04 15:48:47 -08:00
Matei Zaharia
9f0dc829cb
Fix TaskMetrics not being serializable
2013-03-04 12:08:31 -08:00
Matei Zaharia
04fb81ffe5
Merge pull request #506 from rxin/spark-706
...
Fixed SPARK-706: Failures in block manager put leads to read task hanging.
2013-03-03 17:20:07 -08:00
Imran Rashid
0bd1d00c2a
minor cleanup based on feedback in review request
2013-03-03 16:46:45 -08:00
Imran Rashid
f1006b99ff
change CleanupIterator to CompletionIterator
2013-03-03 16:39:05 -08:00
Imran Rashid
8fef5b9c5f
refactoring of TaskMetrics
2013-03-03 16:34:04 -08:00
Imran Rashid
d36abdb053
Merge branch 'master' into stageInfo
2013-03-03 15:20:46 -08:00
Reynold Xin
44134e12bb
Fixed SPARK-706: Failures in block manager put leads to read task
...
hanging.
2013-02-28 15:14:59 -08:00
Stephen Haberman
6415c2bb60
Don't create the Executor until we have everything it needs.
2013-02-28 12:38:09 -06:00
Stephen Haberman
80eecd2cb1
Make Executor fields volatile since they're read from the thread pool.
2013-02-28 10:41:07 -06:00
Mosharaf Chowdhury
4ab387bcdb
Fixed master datastructure updates after removing an application; and a typo.
2013-02-27 13:52:44 -08:00
Matei Zaharia
ece3edfffa
Fix a problem with no hosts being counted as alive in the first job
2013-02-26 12:11:03 -08:00
Matei Zaharia
73697e2891
Fix overly large thread names in PySpark
2013-02-26 12:07:59 -08:00
Stephen Haberman
a65aa549ff
Override DAGScheduler.runLocally so we can remove the Thread.sleep.
2013-02-25 23:49:32 -06:00
Stephen Haberman
a4adeb255c
Merge branch 'master' into nomocks
...
Conflicts:
core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala
2013-02-25 23:48:52 -06:00
Tathagata Das
c02e064938
Fixed replication bug in BlockManager
2013-02-25 17:27:46 -08:00
Matei Zaharia
490f056cdd
Allow passing sparkHome and JARs to StreamingContext constructor
...
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
Matei Zaharia
568bdaf8ae
Set spark.deploy.spreadOut to true by default in 0.7 (improves locality)
2013-02-25 14:34:55 -08:00
Matei Zaharia
1ef58dadcc
Add a config property for Akka lifecycle event logging
2013-02-25 14:01:24 -08:00
Matei Zaharia
ceaec4a675
Merge pull request #498 from pwendell/shutup-akka
...
Disable remote lifecycle logging from Akka.
2013-02-25 12:31:24 -08:00
Patrick Wendell
85a85646d9
Disable remote lifecycle logging from Akka.
...
This changes the default setting to `off` for remote lifecycle events. When this is on, it is very chatty at the INFO level. It also prints out several ERROR messages sometimes when sc.stop() is called.
2013-02-25 12:25:43 -08:00
Imran Rashid
8f17387d97
remove bogus comment
2013-02-25 10:31:06 -08:00
Matei Zaharia
6ae9a22c3e
Get spark.default.parallelism on each call to defaultPartitioner,
...
instead of only once, in case the user changes it across Spark uses
2013-02-25 10:28:08 -08:00
Matei Zaharia
d6e6abece3
Merge pull request #459 from stephenh/bettersplits
...
Change defaultPartitioner to use upstream split size.
2013-02-25 09:22:04 -08:00
Stephen Haberman
c44ccf2862
Use default parallelism if it's set.
2013-02-24 23:54:03 -06:00
Stephen Haberman
44032bc476
Merge branch 'master' into bettersplits
...
Conflicts:
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/test/scala/spark/ShuffleSuite.scala
2013-02-24 22:08:14 -06:00
Christoph Grothaus
f39f2b7636
Incorporate feedback from mateiz:
...
- we do not need getEnvOrEmpty
- Instead of saving SPARK_NONDAEMON_JAVA_OPTS, it would be better to modify the scripts to use a different variable name for the JAVA_OPTS they do eventually use
2013-02-24 21:24:30 +01:00
Tathagata Das
dff53d1b94
Merge branch 'mesos-master' into streaming
2013-02-24 12:17:22 -08:00
Matei Zaharia
3b9f929467
Merge pull request #468 from haitaoyao/master
...
support customized java options for Master, Worker, Executor, and Repl
2013-02-23 23:38:15 -08:00
Stephen Haberman
37c7a71f9c
Add subtract to JavaRDD, JavaDoubleRDD, and JavaPairRDD.
2013-02-24 00:27:53 -06:00
Stephen Haberman
f442e7d83c
Update for split->partition rename.
2013-02-24 00:27:14 -06:00
Stephen Haberman
cec87a0653
Merge branch 'master' into subtract
2013-02-23 23:27:55 -06:00
Tathagata Das
d853aa9658
Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.
2013-02-23 17:42:26 -08:00
Patrick Wendell
931f439be9
Responding to code review
2013-02-23 15:40:41 -08:00
Patrick Wendell
f51b0f93f2
Adding Java-accessible methods to Vector.scala
...
This is needed for the Strata machine learning tutorial (and
also is generally helpful).
2013-02-23 13:26:59 -08:00
Matei Zaharia
d942d39072
Handle exceptions in RecordReader.close() better (suggested by Jim
...
Donahue)
2013-02-23 11:19:07 -08:00
Matei Zaharia
c89824046a
Merge pull request #490 from woggling/conn-death
...
Detect when SendingConnections disconnect even if we aren't sending to them
2013-02-22 22:58:19 -08:00
Charles Reiss
c8a7886921
Detect when SendingConnections drop by trying to read them.
...
Comment fix
2013-02-22 16:11:52 -08:00
Matei Zaharia
d4d7993bf5
Several fixes to the work to log when no resources can be used by a job.
...
Fixed some of the messages as well as code style.
2013-02-22 15:51:37 -08:00
Matei Zaharia
f33662c133
Merge remote-tracking branch 'pwendell/starvation-check'
...
Also fixed a bug where master was offering executors on dead workers
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
2013-02-22 15:27:41 -08:00
Matei Zaharia
7341de0d48
Merge pull request #475 from JoshRosen/spark-668
...
Remove hack workaround for SPARK-668
2013-02-22 14:56:18 -08:00
Patrick Wendell
f8c3a03d55
SPARK-702: Replace Function --> JFunction in JavaAPI Suite.
...
In a few places the Scala (rather than Java) function class is used.
2013-02-22 12:54:15 -08:00
Imran Rashid
0f37b43b40
make the ShuffleFetcher responsible for collecting shuffle metrics, which gives us metrics for CoGroupedRDD and ShuffledRDD
2013-02-21 16:56:28 -08:00
Imran Rashid
9230617f23
add cleanup iterator
2013-02-21 16:55:14 -08:00
Imran Rashid
81bd07da26
sparkListeners should be a val
2013-02-21 15:21:45 -08:00
Imran Rashid
796e934d31
add some docs & some cleanup
2013-02-21 15:19:34 -08:00
Imran Rashid
394d3acc3e
store taskInfo & metrics together in a tuple
2013-02-21 15:19:34 -08:00
Imran Rashid
7960927cf4
get rid of a bunch of boilerplate; more formatting happens in Listener, not StageInfo
2013-02-21 15:19:34 -08:00
Imran Rashid
d0bfac3eed
taskInfo tracks if a task is run on a preferred host
2013-02-21 15:19:34 -08:00
Imran Rashid
6f62a57858
add runtime breakdowns
2013-02-21 15:19:34 -08:00
Imran Rashid
176cb20703
add task result size; better formatting for time interval distributions; cleanup distribution formatting
2013-02-21 15:19:33 -08:00
Imran Rashid
f2fcabf2ea
add timing around parts of executor & track result size
2013-02-21 15:19:33 -08:00
Imran Rashid
ff127cfcd3
Merge branch 'master' into stageInfo
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/storage/BlockManager.scala
2013-02-21 15:16:21 -08:00
Imran Rashid
baab23abdf
TaskContext does not hold a reference to Task; instead, it has a shared instance of TaskMetrics with Task
2013-02-21 14:13:01 -08:00
haitao.yao
8215b95547
Merge branch 'mesos'
2013-02-21 10:07:24 +08:00
Christoph Grothaus
85a35c6840
Fix SPARK-698. From ExecutorRunner, launch java directly instead of via the run scripts.
2013-02-20 21:42:11 +01:00
Tathagata Das
334ab92441
Fixed bug in CheckpointSuite
2013-02-20 10:26:36 -08:00
Tathagata Das
1cb725e417
Merge branch 'mesos-master' into streaming
2013-02-20 09:55:35 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Matei Zaharia
05bc02e80b
Merge pull request #482 from woggling/shutdown-exceptions
...
Don't call System.exit over uncaught exceptions from shutdown hooks
2013-02-19 20:56:15 -08:00
haitao.yao
6a3d44c673
Merge branch 'mesos'
2013-02-20 10:23:58 +08:00
Charles Reiss
092c631fa8
Pull detection of being in a shutdown hook into utility function.
2013-02-19 17:49:55 -08:00
Reynold Xin
130f704baf
Added a method to create PartitionPruningRDD.
2013-02-19 16:03:52 -08:00
Charles Reiss
d0588bd6d7
Catch/log errors deleting temp dirs
2013-02-19 13:04:06 -08:00
Charles Reiss
687581c3ec
Paranoid uncaught exception handling for exceptions during shutdown
2013-02-19 13:03:02 -08:00
haitao.yao
7c129388fb
Merge branch 'mesos'
2013-02-19 11:22:24 +08:00
Matei Zaharia
7151e1e4c8
Rename "jobs" to "applications" in the standalone cluster
2013-02-17 23:23:08 -08:00
Matei Zaharia
06e5e6627f
Renamed "splits" to "partitions"
2013-02-17 22:13:26 -08:00
Matei Zaharia
340cc54e47
Merge pull request #471 from stephenh/parallelrdd
...
Move ParallelCollection into spark.rdd package.
2013-02-16 16:39:15 -08:00
Matei Zaharia
3260b6120e
Merge pull request #470 from stephenh/morek
...
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 16:38:38 -08:00
Stephen Haberman
924f47dd11
Add RDD.subtract.
...
Instead of reusing the cogroup primitive, this adds a SubtractedRDD
that knows it only needs to keep rdd1's values (per split) in memory.
2013-02-16 13:38:42 -06:00
Stephen Haberman
e7713adb99
Move ParallelCollection into spark.rdd package.
2013-02-16 13:20:48 -06:00
Stephen Haberman
ae2234687d
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 13:10:31 -06:00
Stephen Haberman
4328873294
Add assertion about dependencies.
2013-02-16 01:16:40 -06:00
Stephen Haberman
c34b8ad2c5
Avoid a shuffle if combineByKey is passed the same partitioner.
2013-02-16 00:54:03 -06:00
Stephen Haberman
4281e579c2
Update more javadocs.
2013-02-16 00:45:03 -06:00
Stephen Haberman
6cd68c31cb
Update default.parallelism docs, have StandaloneSchedulerBackend use it.
...
Only brand new RDDs (e.g. parallelize and makeRDD) now use default
parallelism, everything else uses their largest parent's partitioner
or partition size.
2013-02-16 00:29:11 -06:00
haitao.yao
a9cfac347a
Merge branch 'mesos'
2013-02-16 10:11:28 +08:00
Imran Rashid
bffee929ab
Merge branch 'master' into stageInfo
...
Conflicts:
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/storage/BlockManager.scala
2013-02-15 10:35:04 -08:00
Imran Rashid
893bad9089
use appid instead of frameworkid; simplify stupid condition
2013-02-13 20:30:21 -08:00
Imran Rashid
8f18e7e863
include jobid in Executor commandline args
2013-02-13 13:05:13 -08:00
Matei Zaharia
bfeed4725d
Merge pull request #465 from pwendell/java-sort-fix
...
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 18:23:12 -08:00
Patrick Wendell
21df6ffc13
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 17:43:26 -08:00
Matei Zaharia
ea08537143
Fixed an exponential recursion that could happen with doCheckpoint due
...
to lack of memoization
2013-02-11 13:23:50 -08:00
Josh Rosen
e9fb25426e
Remove hack workaround for SPARK-668.
...
Renaming the type parameters solves this problem (see SPARK-694).
I tried this fix earlier, but it didn't work because I didn't run
`sbt/sbt clean` first.
2013-02-11 11:19:20 -08:00
Imran Rashid
e9f53ec0ea
undo change to onCompleteCallbacks
2013-02-11 09:36:49 -08:00
Matei Zaharia
da8afbc77e
Some bug and formatting fixes to FT
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:43:38 -08:00
root
1b47fa2752
Detect hard crashes of workers using a heartbeat mechanism.
...
Also fixes some issues in the rest of the code with detecting workers this way.
Conflicts:
core/src/main/scala/spark/deploy/master/Master.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:28:28 -08:00
Matei Zaharia
8c66c49962
Tweak web UI so that people don't get confused about master URL format
...
Conflicts:
core/src/main/twirl/spark/deploy/master/index.scala.html
core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Imran Rashid
d9461b15d3
cleanup a bunch of imports
2013-02-10 21:41:40 -08:00
Tathagata Das
16baea62bc
Fixed bug in CheckpointRDD to prevent exception when the original RDD had zero splits.
2013-02-10 19:14:49 -08:00
Imran Rashid
383af599bb
SparkContext.addSparkListener; "std" listener in StatsReportListener
2013-02-10 14:19:37 -08:00
Imran Rashid
b7d9e24394
use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver
2013-02-10 14:18:52 -08:00
Stephen Haberman
680f42e6cd
Change defaultPartitioner to use upstream split size.
...
Previously it used the SparkContext.defaultParallelism, which occasionally
ended up being a very bad guess. Looking at upstream RDDs seems to make
better use of the context.
Also sorted the upstream RDDs by partition size first, as if we have
a hugely-partitioned RDD and tiny-partitioned RDD, it is unlikely
we want the resulting RDD to be tiny-partitioned.
2013-02-10 02:27:03 -06:00
Patrick Wendell
2ed791fd7f
Minor fixes
2013-02-09 22:00:38 -08:00
Patrick Wendell
1859c9f93c
Changing to use Timer based on code review
2013-02-09 21:55:17 -08:00
Matei Zaharia
ccb1ca4a23
Merge pull request #448 from squito/fetch_maxBytesInFlight
...
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia
f750daa510
Merge pull request #452 from stephenh/misc
...
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman
4619ee0787
Move JavaRDDLike.coalesce into the right places.
2013-02-09 20:05:42 -06:00
Stephen Haberman
921be76533
Use stubs instead of mocks for DAGSchedulerSuite.
2013-02-09 16:42:18 -06:00
Stephen Haberman
fb7599870f
Fix JavaRDDLike.coalesce return type.
2013-02-09 16:10:52 -06:00
Stephen Haberman
2a18cd826c
Add back return types.
2013-02-09 10:12:04 -06:00
Stephen Haberman
da52b16b38
Remove RDD.coalesce default arguments.
2013-02-09 10:11:54 -06:00
Imran Rashid
04e828f7c1
general fixes to Distribution, plus some tests
2013-02-08 19:07:36 -08:00
Mark Hamstra
b8863a79d3
Merge branch 'master' of https://github.com/mesos/spark into commutative
...
Conflicts:
core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra
934a53c8b6
Change docs on 'reduce' since the merging of local reduces no longer preserves
...
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Stephen Haberman
a9c8d53cfa
Clean up RDDs, mainly to use getSplits.
...
Also made sure clearDependencies() was calling super, to ensure
the getSplits/getDependencies vars in the RDD base class get
cleaned up.
2013-02-05 22:16:59 -06:00
Stephen Haberman
f4d43cb43e
Remove unneeded zipWithIndex.
...
Also rename r->rdd and remove unneeded extra type info.
2013-02-05 21:26:45 -06:00
Stephen Haberman
f2bc748013
Add RDD.coalesce.
2013-02-05 21:23:36 -06:00
Stephen Haberman
67df7f2fa2
Add private, minor formatting.
2013-02-05 21:08:21 -06:00