Matei Zaharia
e89ffc7b3c
Merge pull request #839 from jegonzal/zip_partitions
...
Currying RDD.zipPartitions
2013-08-16 14:02:34 -07:00
Joseph E. Gonzalez
53b2639a1e
Reversing the argument order in zipPartitions to enable stronger type inference.
2013-08-16 12:38:59 -07:00
Andre Schumacher
c7e348faec
Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path
2013-08-16 11:58:20 -07:00
Reynold Xin
c961c19b7b
Use the JSON formatter from the Scala library and remove the dependency on lift-json.
...
It made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
2013-08-15 18:23:01 -07:00
Reynold Xin
eddbf43b54
Revert "Merge pull request #834 from Daemoen/master"
...
This reverts commit 230ab2722e, reversing changes made to 659553b21d.
2013-08-15 17:49:37 -07:00
Reynold Xin
230ab2722e
Merge pull request #834 from Daemoen/master
...
Updated json output to allow for display of worker state
2013-08-15 17:45:17 -07:00
Patrick Wendell
659553b21d
Merge pull request #836 from pwendell/rename
...
Rename `memoryBytesToString` and `memoryMegabytesToString`
2013-08-15 16:56:31 -07:00
Jey Kottalam
a06a9d5c5f
Rename HadoopWriter to SparkHadoopWriter since it's outside of our package
2013-08-15 16:50:37 -07:00
Jey Kottalam
8f979edef5
Fix newTaskAttemptID to work under YARN
2013-08-15 16:50:37 -07:00
Jey Kottalam
e2d7656ca3
re-enable YARN support
2013-08-15 16:50:37 -07:00
Jey Kottalam
bd0bab47c9
SparkEnv isn't available this early, and not needed anyway
2013-08-15 16:50:37 -07:00
Jey Kottalam
4f43fd791a
make SparkHadoopUtil a member of SparkEnv
2013-08-15 16:50:37 -07:00
Jey Kottalam
43ebcb8484
rename HadoopMapRedUtil => SparkHadoopMapRedUtil, HadoopMapReduceUtil => SparkHadoopMapReduceUtil
2013-08-15 16:50:37 -07:00
Jey Kottalam
8b1c1520fc
add comment
2013-08-15 16:50:37 -07:00
Jey Kottalam
69c3bbf688
dynamically detect hadoop version
2013-08-15 16:50:37 -07:00
Jey Kottalam
f67b94ad4f
remove core/src/hadoop{1,2} dirs
2013-08-15 16:50:36 -07:00
Jey Kottalam
b877e20a33
move yarn to its own directory
2013-08-15 16:50:36 -07:00
Patrick Wendell
4c6ade1ad5
Rename memoryBytesToString and memoryMegabytesToString
...
These are used all over the place now and they are not specific to memory at all.
memoryBytesToString --> bytesToString
memoryMegabytesToString --> megabytesToString
2013-08-15 15:58:07 -07:00
Reynold Xin
1a51deae8a
More minor UI changes including code review feedback.
2013-08-15 14:34:07 -07:00
Daemoen
ad2e8b5126
Updated json output to allow for display of worker state
...
Ops teams need to ensure that the cluster is functional and performant. Having to scrape the html source for worker state won't work reliably, and will be slow. By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.
2013-08-15 12:19:14 -07:00
Reynold Xin
2d2a556bdf
Various UI improvements.
2013-08-14 23:23:09 -07:00
Reynold Xin
290e3e6e65
Renamed setCurrentJobDescription to setJobDescription.
2013-08-14 18:40:53 -07:00
Reynold Xin
3886b54933
A few small scheduler / job description changes.
...
1. Renamed SparkContext.addLocalProperty to setLocalProperty, and allowed this function to unset a property.
2. Renamed SparkContext.setDescription to setCurrentJobDescription.
3. Throw an exception if the fair scheduler allocation file is invalid.
2013-08-14 17:19:42 -07:00
Matei Zaharia
839f2d4f3f
Merge pull request #822 from pwendell/ui-features
...
Adding GC Stats to TaskMetrics (and three small fixes)
2013-08-14 16:17:23 -07:00
Patrick Wendell
04ad78b09d
Style cleanup based on Matei feedback
2013-08-14 14:57:21 -07:00
Kay Ousterhout
a88aa5e6ed
Fixed 2 bugs in executor UI.
...
1) UI crashed if the executor UI was loaded before any tasks started.
2) The total tasks was incorrectly reported due to using string (rather
than int) arithmetic.
2013-08-13 23:44:58 -07:00
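The second bug above (string rather than int arithmetic) is a common class of error. The log does not show the affected code, so the following is only an illustrative Python sketch; the function names and the three count parameters are hypothetical and are not taken from the Spark source.

```python
def total_tasks_buggy(active: str, failed: str, completed: str) -> str:
    # Hypothetical buggy version: "+" on strings concatenates the digits
    # instead of summing the counts.
    return active + failed + completed


def total_tasks_fixed(active: str, failed: str, completed: str) -> int:
    # Hypothetical fix: convert each count to an integer before adding.
    return int(active) + int(failed) + int(completed)
```

With the counts "2", "0" and "15", the buggy version yields the string "2015", while the fixed version yields the integer 17.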
Patrick Wendell
c223176388
Small style clean-up
2013-08-13 16:56:37 -07:00
Patrick Wendell
fab5cee111
Correcting terminology in RDD page
2013-08-13 16:25:55 -07:00
Patrick Wendell
024e5c5ce1
Correct sorting order for stages
2013-08-13 16:25:55 -07:00
Patrick Wendell
4e9f0c2df6
Capturing GC details in TaskMetrics
2013-08-13 16:25:55 -07:00
Patrick Wendell
f0382007dc
Bug fix for display of shuffle read/write metrics.
...
This fixes an error where empty cells are missing if a given task
has no shuffle read/write.
2013-08-13 16:25:55 -07:00
Matei Zaharia
d316af9c84
Merge pull request #821 from pwendell/print-launch-command
...
Print run command to stderr rather than stdout
2013-08-13 15:31:01 -07:00
Patrick Wendell
a7feb69ae8
Print run command to stderr rather than stdout
2013-08-13 15:07:03 -07:00
Kay Ousterhout
1beb843a6f
Reuse the set of failed states rather than creating a new object each time
2013-08-13 14:27:40 -07:00
Kay Ousterhout
c92dd627ca
Properly account for killed tasks.
...
The TaskState class's isFinished() method didn't return true for
KILLED tasks, which means some resources are never reclaimed
for tasks that are killed. This also made it inconsistent with the
isFinished() method used by CoarseMesosSchedulerBackend.
2013-08-13 12:40:15 -07:00
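The idea behind the fix described above can be sketched as follows. This is a minimal Python illustration, not the Scala code from the commit: the state names and the helper `is_finished` are assumptions made for the example, and the real TaskState class may define its states differently.

```python
from enum import Enum


class TaskState(Enum):
    # Assumed set of task states, for illustration only.
    LAUNCHING = "LAUNCHING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"
    LOST = "LOST"


# Terminal states, built once at import time. Including KILLED here is the
# point of the fix: a killed task no longer holds on to its resources.
FINISHED_STATES = frozenset(
    {TaskState.FINISHED, TaskState.FAILED, TaskState.KILLED, TaskState.LOST}
)


def is_finished(state: TaskState) -> bool:
    # A task is finished when it has reached any terminal state.
    return state in FINISHED_STATES
```

Keeping the terminal states in one shared set means every caller agrees on which states count as finished.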
Patrick Wendell
ed6a1646e6
Slight change to pr-784
2013-08-13 09:29:40 -07:00
Patrick Wendell
a0133bfbad
Merge pull request #784 from jerryshao/dev-metrics-servlet
...
Add MetricsServlet for Spark metrics system
2013-08-13 09:28:18 -07:00
Matei Zaharia
65d0d91fba
Merge pull request #807 from JoshRosen/guava-optional
...
Change scala.Option to Guava Optional in Java APIs
2013-08-12 19:00:57 -07:00
Josh Rosen
cf08bb7a3e
Fix import organization.
2013-08-12 18:55:02 -07:00
jerryshao
09c7179e81
MetricsServlet code refactor according to comments
2013-08-12 13:23:23 +08:00
jerryshao
320e87e7ab
Add MetricsServlet for Spark metrics system
2013-08-12 13:23:23 +08:00
Reynold Xin
e5b9ed2833
Merge pull request #808 from pwendell/ui_compressed_bytes
...
Report compressed bytes read when calculating TaskMetrics
2013-08-11 17:22:47 -07:00
Patrick Wendell
3d8f281604
Report compressed bytes read when calculating TaskMetrics
2013-08-11 16:25:57 -07:00
Matei Zaharia
379648630b
Merge pull request #805 from woggle/hadoop-rdd-jobconf
...
Use new Configuration() instead of slower new JobConf() in SerializableWritable
2013-08-11 14:51:47 -07:00
Josh Rosen
d7f78b443b
Change scala.Option to Guava Optional in Java APIs.
2013-08-11 12:05:09 -07:00
Charles Reiss
6402b539d0
Use new Configuration() instead of new JobConf() for ObjectWritable.
...
JobConf's constructor loads default config files in some versions of
Hadoop, which is quite slow, and we only need the Configuration object
to pass the correct ClassLoader.
2013-08-10 21:31:05 -07:00
Matei Zaharia
71c63de22f
Merge pull request #795 from mridulm/master
...
Fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-10 10:21:20 -07:00
Matei Zaharia
d3277a0daf
Merge remote-tracking branch 'origin/pr/792'
...
Conflicts:
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/jobs/StagePage.scala
2013-08-10 10:18:50 -07:00
Patrick Wendell
d17eeb997d
Merge pull request #785 from anfeng/master
...
expose HDFS file system stats via Executor metrics
2013-08-10 09:02:27 -07:00
Kay Ousterhout
14d14f451a
Shortened names, as per Matei's suggestion
2013-08-10 07:50:27 -07:00
Matei Zaharia
cd247ba5bb
Merge pull request #786 from shivaram/mllib-java
...
Java fixes, tests and examples for ALS, KMeans
2013-08-09 20:41:13 -07:00
Kay Ousterhout
7810a76512
Only print event queue full error message once
2013-08-09 18:20:48 -07:00
Kay Ousterhout
44ca8629d8
Style fix: removing unnecessary return type
2013-08-09 17:22:50 -07:00
Kay Ousterhout
29b79714f9
Style fixes based on code review
2013-08-09 16:46:34 -07:00
Kay Ousterhout
81e1d4a7d1
Refactored SparkListener to process all events asynchronously.
...
This commit fixes issues where SparkListeners that take a while to
process events slow the DAGScheduler.
This commit also fixes a bug in the UI where if a user goes to a
web page of a stage that does not exist, they can create a memory
leak (granted, this is not an issue at small scale -- probably only
an issue if someone actively tried to DOS the UI).
2013-08-09 13:27:41 -07:00
Matei Zaharia
b09d4b79e8
Merge pull request #799 from woggle/sync-fix
...
Remove extra synchronization in ResultTask
2013-08-09 13:17:08 -07:00
Patrick Wendell
cc6b92e80e
Merge pull request #775 from pwendell/print-launch-command
...
Log the launch command for Spark daemons
2013-08-09 13:00:33 -07:00
Patrick Wendell
3970b580c2
Using quotes when printing out command
2013-08-09 11:53:32 -07:00
Charles Reiss
9dfc280f74
Remove extra synchronization in ResultTask
2013-08-09 11:09:02 -07:00
Matei Zaharia
f94fc75c3f
Merge pull request #788 from shane-huang/sparkjavaopts
...
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as ...
2013-08-09 10:04:03 -07:00
Matei Zaharia
d1e1c1b24d
Add test for Kryo with WrappedArray (which was failing in Chill 0.3.0)
2013-08-08 13:34:11 -07:00
Mridul Muralidharan
c230ca3b4e
Change line size
2013-08-08 22:28:40 +05:30
Mridul Muralidharan
dc47084f4e
Attempt to fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-08 22:19:27 +05:30
Kay Ousterhout
88049a214d
Fixed 3 bugs that caused UI to crash (including SPARK-810).
...
One bug caused the UI to crash if you try to look at a job's status
before any of the tasks have finished.
The second bug was a concurrency issue where two different threads
(the scheduling thread and a UI thread) could be reading/updating
the data structures in JobProgressListener concurrently.
The third bug mis-used an Option, also causing the UI to crash
under certain conditions.
2013-08-07 23:09:25 -07:00
Patrick Wendell
b4321edf68
Reverting bootstrap change
2013-08-07 22:18:18 -07:00
Patrick Wendell
21392f2a73
Change I forgot to merge in
2013-08-07 21:45:32 -07:00
Patrick Wendell
706394b370
Bumping font size to 14px and fixing style issue in progress bars
2013-08-07 21:27:04 -07:00
Patrick Wendell
8c0d668468
Merge branch 'master' into bootstrap-design
...
Conflicts:
core/src/main/scala/spark/ui/UIUtils.scala
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/storage/RDDPage.scala
2013-08-07 21:06:03 -07:00
Kay Ousterhout
b88e26248e
Fixed issue in UI that limited scheduler throughput.
...
Removal of items from ArrayBuffers in the UI code was slow and
significantly impacted scheduler throughput. This commit
improves scheduler throughput by 5x.
2013-08-07 14:42:05 -07:00
shane-huang
cbc5107e36
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as default and let application env override default options if applicable
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-08-07 14:36:48 +08:00
Matei Zaharia
6b043a6f11
Merge pull request #724 from dlyubimov/SPARK-826
...
SPARK-826: fold(), reduce(), collect() always attempt to use java serialization
2013-08-06 22:31:02 -07:00
Matei Zaharia
7c4b7a53b1
Merge remote-tracking branch 'origin/pr/781'
...
Conflicts:
core/src/main/resources/spark/ui/static/webui.css
2013-08-06 17:19:49 -07:00
Karen Feng
908032e79b
Used saturated colors for progress bars
2013-08-06 16:52:21 -07:00
Karen Feng
8bc497fa10
Lightened color of progress bars
2013-08-06 16:33:05 -07:00
Karen Feng
ca1903ea63
Overlays progress text on top of bar
2013-08-06 15:45:42 -07:00
Matei Zaharia
df4d10d630
Merge pull request #779 from adatao/adatao-global-SparkEnv
...
[HOTFIX] Extend thread safety for SparkEnv.get()
2013-08-06 15:44:05 -07:00
Shivaram Venkataraman
471fbadd0c
Java examples, tests for KMeans and ALS
...
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
examples project.
Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Work around a bug where using double[] from Java leads to class cast exception in
KMeans init
2013-08-06 15:43:46 -07:00
anfeng
dda2ac8b5d
reformat registerFileSystemStat()
2013-08-06 15:22:25 -07:00
Karen Feng
099528b6c4
Pre-sorts stage/env tables, changes text/link of stage summaries
2013-08-06 14:52:12 -07:00
Karen Feng
254a930730
Reverse sorts StageTable by submitted time
2013-08-06 14:18:38 -07:00
Karen Feng
5ed5b73026
Sorts first column of env tables
2013-08-06 13:59:53 -07:00
anfeng
0748c60817
expose HDFS file system stats via Executor metrics
2013-08-06 11:47:06 -07:00
Reynold Xin
d031f73679
Merge pull request #782 from WANdisco/master
...
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 22:33:00 -07:00
Matei Zaharia
1b63dea816
Merge pull request #769 from markhamstra/NegativeCores
...
SPARK-847 + SPARK-845: Zombie workers and negative cores
2013-08-05 22:21:26 -07:00
Alexander Pivovarov
a30866438b
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 21:48:43 -07:00
Matei Zaharia
8b277892c9
Merge pull request #774 from pwendell/job-description
...
Show user-defined job name in UI
2013-08-05 19:14:52 -07:00
Christopher Nguyen
b1bbbe699c
[HOTFIX] Mark lastSetSparkEnv @volatile in case it gets HotSpot-cached
...
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 17:22:27 -07:00
Mark Hamstra
35d8f5ee52
Moved handling of timed out workers within the Master actor
2013-08-05 13:13:56 -07:00
Mark Hamstra
37ccf9301a
milliseconds -> seconds in timeOutDeadWorkers logging
2013-08-05 13:13:56 -07:00
Mark Hamstra
cdd1af562e
Timeout zombie workers
2013-08-05 13:13:56 -07:00
Mikhail Bautin
e8bec8365f
Only reduce the number of cores once when removing an executor
2013-08-05 13:13:56 -07:00
Karen Feng
95025afdec
Made most small fixes for SPARK-849 except for table sort, task progress overlay
2013-08-05 13:04:56 -07:00
Bill Zhao
87134b3648
SPARK-850: give better console message
2013-08-05 11:55:35 -07:00
Christopher Nguyen
39e4fda76f
[HOTFIX] Extend thread safety for SparkEnv.get()
...
A ThreadLocal SparkEnv.env is facing various situations leading to
NullPointerExceptions, where SparkEnv.env set in one thread is not
gettable in another thread, but often assumed to be available.
See, e.g., https://groups.google.com/forum/#!topic/spark-developers/GLx8yunSj0A
This hotfixes SparkEnv.env to return either (a) the ThreadLocal
value if non-null, or (b) the previously set value in any thread.
This approach preserves SparkEnv.set() thread safety needed by
RDD.compute() and possibly other places. A refactoring that
parameterizes SparkEnv should be addressed subsequently.
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 02:09:54 -07:00
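The lookup rule described in the hotfix above (thread-local value if non-null, otherwise the value most recently set by any thread) can be sketched in Python as follows. This is an illustration of the pattern only, not the Scala implementation: `set_env`, `get_env`, `_local` and `_last_set` are hypothetical names chosen for the example.

```python
import threading
from typing import Optional

# Per-thread storage, playing the role of the ThreadLocal in the message.
_local = threading.local()

# Most recent value set by any thread, used as the fallback.
_last_set: Optional[object] = None


def set_env(env: object) -> None:
    # Record the value for the current thread and as the global fallback.
    global _last_set
    _local.env = env
    _last_set = env


def get_env() -> Optional[object]:
    # (a) Prefer the current thread's own value if it has one;
    # (b) otherwise fall back to the value last set by any thread.
    env = getattr(_local, "env", None)
    return env if env is not None else _last_set
```

A thread that never called `set_env` now sees the last value set elsewhere instead of nothing, while a thread that did call `set_env` keeps seeing its own value.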
Patrick Wendell
f3660d5ab8
Make output formatting consistent between bash/scala
2013-08-03 21:30:15 -07:00
Patrick Wendell
ad94fbb322
Log the launch command for Spark executors
2013-08-03 09:19:46 -07:00
Matei Zaharia
22abbc10d6
Merge pull request #772 from karenfeng/ui-843
...
Show app duration
2013-08-02 16:37:59 -07:00
Patrick Wendell
5b3784a79c
Show user-defined job name in UI
2013-08-02 15:47:41 -07:00
Karen Feng
b3ae5b25d5
Shows time the app has been running
2013-08-02 13:25:14 -07:00
Patrick Wendell
9d7dfd2d5a
Merge pull request #743 from pwendell/app-metrics
...
Add application metrics to standalone master
2013-08-01 17:41:58 -07:00