Ali Ghodsi
7b123b3126
Simpler code
2013-08-20 16:16:05 -07:00
Ali Ghodsi
9192c358e4
simpler code
2013-08-20 16:16:05 -07:00
Ali Ghodsi
a75a64eade
Fixed almost all of Matei's feedback
2013-08-20 16:16:05 -07:00
Ali Ghodsi
f1c853d76d
fixed Matei's comments
2013-08-20 16:16:04 -07:00
Ali Ghodsi
890ea6ba79
making CoalescedRDDPartition public
2013-08-20 16:16:04 -07:00
Ali Ghodsi
b69e7166ba
Coalescer now uses current preferred locations for derived RDDs. Made run() in DAGScheduler thread safe and added a method to be able to ask it for preferred locations. Added a similar method that wraps the former inside SparkContext.
2013-08-20 16:16:04 -07:00
Ali Ghodsi
abcefb3858
fixed matei's comments
2013-08-20 16:13:37 -07:00
Ali Ghodsi
35537e6341
Made a function object that returns the coalesced groups
2013-08-20 16:13:37 -07:00
Ali Ghodsi
339598c080
several of Reynold's suggestions implemented
2013-08-20 16:13:37 -07:00
Ali Ghodsi
02d6464f2f
space removed
2013-08-20 16:13:37 -07:00
Ali Ghodsi
4f99be1ffd
use count rather than foreach
2013-08-20 16:13:37 -07:00
Ali Ghodsi
f67753cdfc
made preferredLocation a val of the surrounding case class
2013-08-20 16:13:37 -07:00
Ali Ghodsi
f24861b60a
Fix bug in tests
2013-08-20 16:13:36 -07:00
Ali Ghodsi
f6e47e8b51
Renamed split to partition
2013-08-20 16:13:36 -07:00
Ali Ghodsi
937f72feb8
word wrap before 100 chars per line
2013-08-20 16:13:36 -07:00
Ali Ghodsi
c4d59910b1
added goals inline as comment
2013-08-20 16:13:36 -07:00
Ali Ghodsi
7a2a33e32d
Large scale load and locality tests for the coalesced partitions added
2013-08-20 16:13:36 -07:00
Ali Ghodsi
66edf854aa
Bug, should compute slack wrt parent partition size, not number of bins
2013-08-20 16:13:36 -07:00
Ali Ghodsi
1ede102ba5
load balancing coalescer
2013-08-20 16:13:36 -07:00
Matei Zaharia
aa2b89d98d
Merge remote-tracking branch 'jey/hadoop-agnostic'
...
Conflicts:
core/src/main/scala/spark/PairRDDFunctions.scala
2013-08-20 10:14:15 -07:00
Mark Hamstra
1630fbf838
changeGeneration --> changeEpoch renaming
2013-08-20 00:17:16 -07:00
Mark Hamstra
ad18410427
Renamed 'priority' to 'jobId' and assorted minor changes
2013-08-20 00:07:04 -07:00
Matei Zaharia
8cae72e94e
Merge pull request #828 from mateiz/sched-improvements
...
Scheduler fixes and improvements
2013-08-19 23:40:04 -07:00
Matei Zaharia
efeb142981
Merge pull request #849 from mateiz/web-fixes
...
Small fixes to web UI
2013-08-19 19:23:50 -07:00
Matei Zaharia
abdc1f8bbb
Merge pull request #847 from rxin/rdd
...
Allow subclasses of Product2 in all key-value related classes
2013-08-19 18:30:56 -07:00
Matei Zaharia
498a26189b
Small fixes to web UI:
...
- Use SPARK_PUBLIC_DNS environment variable if set (for EC2)
- Use a non-ephemeral port (3030 instead of 33000) by default
- Updated test to use non-ephemeral port too
2013-08-19 18:17:49 -07:00
Reynold Xin
5054abd41b
Code review feedback. (added tests for cogroup and subtract; added more documentation on MutablePair)
2013-08-19 12:58:02 -07:00
Reynold Xin
71d705a66e
Made PairRDDFunctions take only Tuple2, but made the rest of the shuffle code path work with general Product2.
2013-08-19 00:40:43 -07:00
Reynold Xin
2a7b99c08b
Added the missing RDD files and cleaned up SparkContext.
2013-08-18 20:39:29 -07:00
Reynold Xin
82bf4c0339
Allow subclasses of Product2 in all key-value related classes (ShuffleDependency, PairRDDFunctions, etc).
2013-08-18 20:25:45 -07:00
Matei Zaharia
8ac3d1e263
Added unit tests for ClusterTaskSetManager, and fix a bug found with
...
resetting locality level after a non-local launch
2013-08-18 19:51:07 -07:00
Matei Zaharia
4004cf775d
Added some comments on threading in scheduler code
2013-08-18 19:51:07 -07:00
Matei Zaharia
2a4ed10210
Address some review comments:
...
- When a resourceOffers() call has multiple offers, force the TaskSets
to consider them in increasing order of locality levels so that they
get a chance to launch stuff locally across all offers
- Simplify ClusterScheduler.prioritizeContainers
- Add docs on the new configuration options
2013-08-18 19:51:07 -07:00
Matei Zaharia
222c897128
Comment cleanup (via Kay) and some debug messages
2013-08-18 19:51:07 -07:00
Matei Zaharia
cf39d45d14
More scheduling fixes:
...
- Added periodic revival of offers in StandaloneSchedulerBackend
- Replaced task scheduling aggression with multi-level delay scheduling
in ClusterTaskSetManager
- Fixed ZippedRDD preferred locations because they can't currently be
process-local
- Fixed some uses of hostPort
2013-08-18 19:51:07 -07:00
Matei Zaharia
90a04dab8d
Initial work towards scheduler refactoring:
...
- Replace use of hostPort vs host in Task.preferredLocations with a
TaskLocation class that contains either an executorId and a host or
just a host. This is part of a bigger effort to eliminate hostPort
based data structures and just use executorID, since the hostPort vs
host stuff is confusing (and not checkable with static typing, leading
to ugly debug code), and hostPorts are not provided by Mesos.
- Replaced most hostPort-based data structures and fields as above.
- Simplified ClusterTaskSetManager to deal with preferred locations in a
more concise way and generally be more concise.
- Updated the way ClusterTaskSetManager handles racks: instead of
enqueueing a task to a separate queue for all the hosts in the rack,
which would create lots of large queues, have one queue per rack name.
- Removed non-local fallback stuff in ClusterScheduler that tried to
launch less-local tasks on a node once the local ones were all
assigned. This change didn't work because many cluster schedulers send
offers for just one node at a time (even the standalone and YARN ones
do so as nodes join the cluster one by one). Thus, lots of non-local
tasks would be assigned even though a node with locality for them
would be able to receive tasks just a short time later.
- Renamed MapOutputTracker "generations" to "epochs".
2013-08-18 19:51:06 -07:00
Matei Zaharia
8fa0747978
Merge pull request #840 from AndreSchumacher/zipegg
...
Implementing SPARK-878 for PySpark: adding zip and egg files to context ...
2013-08-18 17:02:54 -07:00
Reynold Xin
2c00ea3efc
Moved shuffle serializer setting from a constructor parameter to a setSerializer method in various RDDs that involve shuffle operations.
2013-08-17 21:43:29 -07:00
Reynold Xin
0e84fee76b
Removed the mapSideCombine option in partitionBy.
2013-08-17 21:13:41 -07:00
Reynold Xin
10af952a3d
Removed the mapSideCombine option in CoGroupedRDD.
2013-08-17 21:07:34 -07:00
Reynold Xin
5d050a3e1f
Removed the unused shuffleId in ShuffleDependency's constructor.
2013-08-16 23:23:16 -07:00
Matei Zaharia
e89ffc7b3c
Merge pull request #839 from jegonzal/zip_partitions
...
Currying RDD.zipPartitions
2013-08-16 14:02:34 -07:00
Joseph E. Gonzalez
53b2639a1e
Reversing the argument order in zipPartitions to enable stronger type inference.
2013-08-16 12:38:59 -07:00
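A minimal sketch of why argument order can matter for type inference, assuming Scala's usual behavior of resolving type parameters one parameter list at a time, left to right. `zipWith` is a hypothetical stand-in written for illustration; it is not the signature of RDD.zipPartitions.

```scala
// Hypothetical helper, not Spark's API: data comes first, the function second.
object CurryingSketch {
  // By the time the compiler reaches `f`, A and B are already fixed by the
  // first parameter list, so the lambda's parameter types need no annotation.
  def zipWith[A, B, C](left: Seq[A], right: Seq[B])(f: (A, B) => C): Seq[C] =
    left.zip(right).map { case (a, b) => f(a, b) }

  // `_ + _` type-checks because A = Int and B = Int are known before `f` is examined.
  val sums: Seq[Int] = zipWith(Seq(1, 2, 3), Seq(10, 20, 30))(_ + _)
}
```

With the function in the first position, the element types would not yet be known when the lambda is type-checked, so callers would typically have to annotate its parameters.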
Andre Schumacher
c7e348faec
Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path
2013-08-16 11:58:20 -07:00
Reynold Xin
c961c19b7b
Use the JSON formatter from Scala library and removed dependency on lift-json.
...
It made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
2013-08-15 18:23:01 -07:00
Reynold Xin
eddbf43b54
Revert "Merge pull request #834 from Daemoen/master"
...
This reverts commit 230ab2722e
, reversing
changes made to 659553b21d
.
2013-08-15 17:49:37 -07:00
Reynold Xin
230ab2722e
Merge pull request #834 from Daemoen/master
...
Updated json output to allow for display of worker state
2013-08-15 17:45:17 -07:00
Patrick Wendell
659553b21d
Merge pull request #836 from pwendell/rename
...
Rename `memoryBytesToString` and `memoryMegabytesToString`
2013-08-15 16:56:31 -07:00
Jey Kottalam
a06a9d5c5f
Rename HadoopWriter to SparkHadoopWriter since it's outside of our package
2013-08-15 16:50:37 -07:00
Jey Kottalam
8f979edef5
Fix newTaskAttemptID to work under YARN
2013-08-15 16:50:37 -07:00
Jey Kottalam
e2d7656ca3
re-enable YARN support
2013-08-15 16:50:37 -07:00
Jey Kottalam
bd0bab47c9
SparkEnv isn't available this early, and not needed anyway
2013-08-15 16:50:37 -07:00
Jey Kottalam
4f43fd791a
make SparkHadoopUtil a member of SparkEnv
2013-08-15 16:50:37 -07:00
Jey Kottalam
43ebcb8484
rename HadoopMapRedUtil => SparkHadoopMapRedUtil, HadoopMapReduceUtil => SparkHadoopMapReduceUtil
2013-08-15 16:50:37 -07:00
Jey Kottalam
8b1c1520fc
add comment
2013-08-15 16:50:37 -07:00
Jey Kottalam
69c3bbf688
dynamically detect hadoop version
2013-08-15 16:50:37 -07:00
Jey Kottalam
f67b94ad4f
remove core/src/hadoop{1,2} dirs
2013-08-15 16:50:36 -07:00
Patrick Wendell
4c6ade1ad5
Rename memoryBytesToString
and memoryMegabytesToString
...
These are used all over the place now and they are not specific to memory at all.
memoryBytesToString --> bytesToString
memoryMegabytesToString --> megabytesToString
2013-08-15 15:58:07 -07:00
Reynold Xin
1a51deae8a
More minor UI changes including code review feedback.
2013-08-15 14:34:07 -07:00
Daemoen
ad2e8b5126
Updated json output to allow for display of worker state
...
Ops teams need to ensure that the cluster is functional and performant. Having to scrape the html source for worker state won't work reliably, and will be slow. By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.
2013-08-15 12:19:14 -07:00
Reynold Xin
2d2a556bdf
Various UI improvements.
2013-08-14 23:23:09 -07:00
Reynold Xin
290e3e6e65
Renamed setCurrentJobDescription to setJobDescription.
2013-08-14 18:40:53 -07:00
Reynold Xin
3886b54933
A few small scheduler / job description changes.
...
1. Renamed SparkContext.addLocalProperty to setLocalProperty. And allow this function to unset a property.
2. Renamed SparkContext.setDescription to setCurrentJobDescription.
3. Throw an exception if the fair scheduler allocation file is invalid.
2013-08-14 17:19:42 -07:00
Matei Zaharia
839f2d4f3f
Merge pull request #822 from pwendell/ui-features
...
Adding GC Stats to TaskMetrics (and three small fixes)
2013-08-14 16:17:23 -07:00
Patrick Wendell
04ad78b09d
Style cleanup based on Matei feedback
2013-08-14 14:57:21 -07:00
Kay Ousterhout
a88aa5e6ed
Fixed 2 bugs in executor UI.
...
1) UI crashed if the executor UI was loaded before any tasks started.
2) The total tasks was incorrectly reported due to using string (rather
than int) arithmetic.
2013-08-13 23:44:58 -07:00
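The second bug above (string rather than int arithmetic) can be illustrated with a small sketch. The object name and the counts are hypothetical; the actual UI code is not shown in this log.

```scala
// Hypothetical values: task counts that have already been rendered as text.
object TaskTotals {
  val active = "3"
  val failed = "1"
  val complete = "4"

  // `+` on strings concatenates, so this is the text "314", not a sum.
  val wrongTotal: String = active + failed + complete

  // Converting to Int first gives the arithmetic total.
  val rightTotal: Int = Seq(active, failed, complete).map(_.toInt).sum
}
```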
Patrick Wendell
c223176388
Small style clean-up
2013-08-13 16:56:37 -07:00
Patrick Wendell
fab5cee111
Correcting terminology in RDD page
2013-08-13 16:25:55 -07:00
Patrick Wendell
024e5c5ce1
Correct sorting order for stages
2013-08-13 16:25:55 -07:00
Patrick Wendell
4e9f0c2df6
Capturing GC details in TaskMetrics
2013-08-13 16:25:55 -07:00
Patrick Wendell
f0382007dc
Bug fix for display of shuffle read/write metrics.
...
This fixes an error where empty cells are missing if a given task
has no shuffle read/write.
2013-08-13 16:25:55 -07:00
Matei Zaharia
d316af9c84
Merge pull request #821 from pwendell/print-launch-command
...
Print run command to stderr rather than stdout
2013-08-13 15:31:01 -07:00
Patrick Wendell
a7feb69ae8
Print run command to stderr rather than stdout
2013-08-13 15:07:03 -07:00
Kay Ousterhout
1beb843a6f
Reuse the set of failed states rather than creating a new object each time
2013-08-13 14:27:40 -07:00
Kay Ousterhout
c92dd627ca
Properly account for killed tasks.
...
The TaskState class's isFinished() method didn't return true for
KILLED tasks, which means some resources are never reclaimed
for tasks that are killed. This also made it inconsistent with the
isFinished() method used by CoarseMesosSchedulerBackend.
2013-08-13 12:40:15 -07:00
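A rough sketch of the idea in this commit and the follow-up above it (reusing one set of finished states): treat KILLED as a terminal state and keep the set in a single shared value. The enumeration below is an assumption written for illustration, not a copy of Spark's TaskState.

```scala
// Illustrative enumeration; state names are assumed, not taken from the source.
object TaskStateSketch extends Enumeration {
  val LAUNCHING, RUNNING, FINISHED, FAILED, KILLED, LOST = Value

  // Built once and reused, instead of allocating a new set on every call.
  val FINISHED_STATES: Set[Value] = Set(FINISHED, FAILED, KILLED, LOST)

  // KILLED counts as finished, so resources held by a killed task can be reclaimed.
  def isFinished(state: Value): Boolean = FINISHED_STATES.contains(state)
}
```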
Patrick Wendell
ed6a1646e6
Slight change to pr-784
2013-08-13 09:29:40 -07:00
Patrick Wendell
a0133bfbad
Merge pull request #784 from jerryshao/dev-metrics-servlet
...
Add MetricsServlet for Spark metrics system
2013-08-13 09:28:18 -07:00
Matei Zaharia
65d0d91fba
Merge pull request #807 from JoshRosen/guava-optional
...
Change scala.Option to Guava Optional in Java APIs
2013-08-12 19:00:57 -07:00
Josh Rosen
cf08bb7a3e
Fix import organization.
2013-08-12 18:55:02 -07:00
jerryshao
09c7179e81
MetricsServlet code refactor according to comments
2013-08-12 13:23:23 +08:00
jerryshao
320e87e7ab
Add MetricsServlet for Spark metrics system
2013-08-12 13:23:23 +08:00
Reynold Xin
e5b9ed2833
Merge pull request #808 from pwendell/ui_compressed_bytes
...
Report compressed bytes read when calculating TaskMetrics
2013-08-11 17:22:47 -07:00
Patrick Wendell
3d8f281604
Report compressed bytes read when calculating TaskMetrics
2013-08-11 16:25:57 -07:00
Matei Zaharia
379648630b
Merge pull request #805 from woggle/hadoop-rdd-jobconf
...
Use new Configuration() instead of slower new JobConf() in SerializableWritable
2013-08-11 14:51:47 -07:00
Josh Rosen
d7f78b443b
Change scala.Option to Guava Optional in Java APIs.
2013-08-11 12:05:09 -07:00
Charles Reiss
6402b539d0
Use new Configuration() instead of new JobConf() for ObjectWritable.
...
JobConf's constructor loads default config files in some versions of
Hadoop, which is quite slow, and we only need the Configuration object
to pass the correct ClassLoader.
2013-08-10 21:31:05 -07:00
Matei Zaharia
71c63de22f
Merge pull request #795 from mridulm/master
...
Fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-10 10:21:20 -07:00
Matei Zaharia
d3277a0daf
Merge remote-tracking branch 'origin/pr/792'
...
Conflicts:
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/jobs/StagePage.scala
2013-08-10 10:18:50 -07:00
Patrick Wendell
d17eeb997d
Merge pull request #785 from anfeng/master
...
expose HDFS file system stats via Executor metrics
2013-08-10 09:02:27 -07:00
Kay Ousterhout
14d14f451a
Shortened names, as per Matei's suggestion
2013-08-10 07:50:27 -07:00
Matei Zaharia
cd247ba5bb
Merge pull request #786 from shivaram/mllib-java
...
Java fixes, tests and examples for ALS, KMeans
2013-08-09 20:41:13 -07:00
Kay Ousterhout
7810a76512
Only print event queue full error message once
2013-08-09 18:20:48 -07:00
Kay Ousterhout
44ca8629d8
Style fix: removing unnecessary return type
2013-08-09 17:22:50 -07:00
Kay Ousterhout
29b79714f9
Style fixes based on code review
2013-08-09 16:46:34 -07:00
Kay Ousterhout
81e1d4a7d1
Refactored SparkListener to process all events asynchronously.
...
This commit fixes issues where SparkListeners that take a while to
process events slow the DAGScheduler.
This commit also fixes a bug in the UI where if a user goes to a
web page of a stage that does not exist, they can create a memory
leak (granted, this is not an issue at small scale -- probably only
an issue if someone actively tried to DOS the UI).
2013-08-09 13:27:41 -07:00
Matei Zaharia
b09d4b79e8
Merge pull request #799 from woggle/sync-fix
...
Remove extra synchronization in ResultTask
2013-08-09 13:17:08 -07:00
Patrick Wendell
cc6b92e80e
Merge pull request #775 from pwendell/print-launch-command
...
Log the launch command for Spark daemons
2013-08-09 13:00:33 -07:00
Patrick Wendell
3970b580c2
Using quotes when printing out command
2013-08-09 11:53:32 -07:00
Charles Reiss
9dfc280f74
Remove extra synchronization in ResultTask
2013-08-09 11:09:02 -07:00
Matei Zaharia
f94fc75c3f
Merge pull request #788 from shane-huang/sparkjavaopts
...
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as ...
2013-08-09 10:04:03 -07:00
Mridul Muralidharan
c230ca3b4e
Change line size
2013-08-08 22:28:40 +05:30
Mridul Muralidharan
dc47084f4e
Attempt to fix bug reported in PR 791 : a race condition in ConnectionManager and Connection
2013-08-08 22:19:27 +05:30
Kay Ousterhout
88049a214d
Fixed 3 bugs that caused UI to crash (including SPARK-810).
...
One bug caused the UI to crash if you try to look at a job's status
before any of the tasks have finished.
The second bug was a concurrency issue where two different threads
(the scheduling thread and a UI thread) could be reading/updating
the data structures in JobProgressListener concurrently.
The third bug mis-used an Option, also causing the UI to crash
under certain conditions.
2013-08-07 23:09:25 -07:00
Patrick Wendell
b4321edf68
Reverting bootstrap change
2013-08-07 22:18:18 -07:00
Patrick Wendell
21392f2a73
Change I forgot to merge in
2013-08-07 21:45:32 -07:00
Patrick Wendell
706394b370
Bumping font size to 14px and fixing style issue in progress bars
2013-08-07 21:27:04 -07:00
Patrick Wendell
8c0d668468
Merge branch 'master' into bootstrap-design
...
Conflicts:
core/src/main/scala/spark/ui/UIUtils.scala
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/storage/RDDPage.scala
2013-08-07 21:06:03 -07:00
Kay Ousterhout
b88e26248e
Fixed issue in UI that limited scheduler throughput.
...
Removal of items from ArrayBuffers in the UI code was slow and
significantly impacted scheduler throughput. This commit
improves scheduler throughput by 5x.
2013-08-07 14:42:05 -07:00
shane-huang
cbc5107e36
For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as default and let application env override default options if applicable
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-08-07 14:36:48 +08:00
Matei Zaharia
6b043a6f11
Merge pull request #724 from dlyubimov/SPARK-826
...
SPARK-826: fold(), reduce(), collect() always attempt to use java serialization
2013-08-06 22:31:02 -07:00
Matei Zaharia
7c4b7a53b1
Merge remote-tracking branch 'origin/pr/781'
...
Conflicts:
core/src/main/resources/spark/ui/static/webui.css
2013-08-06 17:19:49 -07:00
Karen Feng
908032e79b
Used saturated colors for progress bars
2013-08-06 16:52:21 -07:00
Karen Feng
8bc497fa10
Lightened color of progress bars
2013-08-06 16:33:05 -07:00
Karen Feng
ca1903ea63
Overlays progress text on top of bar
2013-08-06 15:45:42 -07:00
Matei Zaharia
df4d10d630
Merge pull request #779 from adatao/adatao-global-SparkEnv
...
[HOTFIX] Extend thread safety for SparkEnv.get()
2013-08-06 15:44:05 -07:00
Shivaram Venkataraman
471fbadd0c
Java examples, tests for KMeans and ALS
...
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
examples project.
Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Workaround a bug where using double[] from Java leads to class cast exception in
KMeans init
2013-08-06 15:43:46 -07:00
anfeng
dda2ac8b5d
reformat registerFileSystemStat()
2013-08-06 15:22:25 -07:00
Karen Feng
099528b6c4
Pre-sorts stage/env tables, changes text/link of stage summaries
2013-08-06 14:52:12 -07:00
Karen Feng
254a930730
Reverse sorts StageTable by submitted time
2013-08-06 14:18:38 -07:00
Karen Feng
5ed5b73026
Sorts first column of env tables
2013-08-06 13:59:53 -07:00
anfeng
0748c60817
expose HDFS file system stats via Executor metrics
2013-08-06 11:47:06 -07:00
Reynold Xin
d031f73679
Merge pull request #782 from WANdisco/master
...
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 22:33:00 -07:00
Matei Zaharia
1b63dea816
Merge pull request #769 from markhamstra/NegativeCores
...
SPARK-847 + SPARK-845: Zombie workers and negative cores
2013-08-05 22:21:26 -07:00
Alexander Pivovarov
a30866438b
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 21:48:43 -07:00
Matei Zaharia
8b277892c9
Merge pull request #774 from pwendell/job-description
...
Show user-defined job name in UI
2013-08-05 19:14:52 -07:00
Christopher Nguyen
b1bbbe699c
[HOTFIX] Mark lastSetSparkEnv @volatile in case it gets HotSpot-cached
...
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 17:22:27 -07:00
Mark Hamstra
35d8f5ee52
Moved handling of timed out workers within the Master actor
2013-08-05 13:13:56 -07:00
Mark Hamstra
37ccf9301a
milliseconds -> seconds in timeOutDeadWorkers logging
2013-08-05 13:13:56 -07:00
Mark Hamstra
cdd1af562e
Timeout zombie workers
2013-08-05 13:13:56 -07:00
Mikhail Bautin
e8bec8365f
Only reduce the number of cores once when removing an executor
2013-08-05 13:13:56 -07:00
Karen Feng
95025afdec
Made most small fixes for SPARK-849 except for table sort, task progress overlay
2013-08-05 13:04:56 -07:00
Bill Zhao
87134b3648
SPARK-850: give better console message
2013-08-05 11:55:35 -07:00
Christopher Nguyen
39e4fda76f
[HOTFIX] Extend thread safety for SparkEnv.get()
...
A ThreadLocal SparkEnv.env is facing various situations leading to
NullPointerExceptions, where SparkEnv.env set in one thread is not
gettable in another thread, but often assumed to be available.
See, e.g., https://groups.google.com/forum/#!topic/spark-developers/GLx8yunSj0A
This hotfixes SparkEnv.env to return either (a) the ThreadLocal
value if non-null, or (b) the previously set value in any thread.
This approach preserves SparkEnv.set() thread safety needed by
RDD.compute() and possibly other places. A refactoring that
parameterizes SparkEnv should be addressed subsequently.
On branch adatao-global-SparkEnv
Changes to be committed:
modified: core/src/main/scala/spark/SparkEnv.scala
2013-08-05 02:09:54 -07:00
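The fallback described in this hotfix might look roughly like the sketch below. `EnvSketch` and its String payload are hypothetical stand-ins, not Spark's SparkEnv; the @volatile marker reflects the follow-up commit in this log that marks lastSetSparkEnv @volatile.

```scala
// Hypothetical stand-in for the pattern; not the actual SparkEnv code.
object EnvSketch {
  private val threadEnv = new ThreadLocal[String]

  // @volatile so that a value written by one thread is visible to other threads.
  @volatile private var lastSetEnv: String = null

  def set(e: String): Unit = {
    lastSetEnv = e
    threadEnv.set(e)
  }

  // (a) the thread-local value if non-null, otherwise
  // (b) the value most recently set by any thread.
  def get: String = Option(threadEnv.get()).getOrElse(lastSetEnv)
}
```

A thread that never called set still sees a usable value, while a thread that did call set keeps its own.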
Patrick Wendell
f3660d5ab8
Make output formatting consistent between bash/scala
2013-08-03 21:30:15 -07:00
Patrick Wendell
ad94fbb322
Log the launch command for Spark executors
2013-08-03 09:19:46 -07:00
Matei Zaharia
22abbc10d6
Merge pull request #772 from karenfeng/ui-843
...
Show app duration
2013-08-02 16:37:59 -07:00
Patrick Wendell
5b3784a79c
Show user-defined job name in UI
2013-08-02 15:47:41 -07:00
Karen Feng
b3ae5b25d5
Shows time the app has been running
2013-08-02 13:25:14 -07:00
Patrick Wendell
9d7dfd2d5a
Merge pull request #743 from pwendell/app-metrics
...
Add application metrics to standalone master
2013-08-01 17:41:58 -07:00
Patrick Wendell
f1d2ad550e
under_scores --> camelCase for config options
2013-08-01 15:26:26 -07:00
Patrick Wendell
12d9c82c9b
Small style fix
2013-08-01 15:25:52 -07:00
Patrick Wendell
37bc64a205
Adding application-level metrics.
...
This adds metrics for applications in the deploy Master.
2013-08-01 15:25:52 -07:00
Karen Feng
73692f3cb9
Unify, reduce body font size
2013-08-01 15:10:30 -07:00
Patrick Wendell
87fd321a5a
Minor refactoring and code cleanup
2013-08-01 15:02:31 -07:00
Patrick Wendell
b10199413a
Slight refactoring to SparkContext functions
2013-08-01 15:00:42 -07:00
Patrick Wendell
cfcd77b5da
Increasing inter job arrival
2013-08-01 15:00:42 -07:00
Patrick Wendell
5faac7f4f3
Minor style fixes
2013-08-01 15:00:42 -07:00
Patrick Wendell
5e7b38fbb3
Merge pull request #695 from xiajunluan/pool_ui
...
Enhance job ui in spark ui system with adding pool information
2013-08-01 14:59:33 -07:00
Karen Feng
47600e9579
Removed hr margin
2013-08-01 14:57:04 -07:00
Karen Feng
e648a62fc8
Inserted needed line break for log paging
2013-08-01 14:46:19 -07:00
Karen Feng
686d6266c4
Use nav pills instead of default
2013-08-01 14:41:49 -07:00
Karen Feng
86d372d17f
Removed line breaks
2013-08-01 14:37:21 -07:00
Karen Feng
99803d88b9
Reduced all header sizes
2013-08-01 14:18:33 -07:00
Karen Feng
d216d687ef
Reduced size of table text to compact
2013-08-01 13:27:23 -07:00
Karen Feng
5dae283996
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-08-01 11:28:28 -07:00
Matei Zaharia
0a96493ac6
Merge pull request #760 from karenfeng/heading-update
...
Clean up web UI page headers
2013-08-01 11:27:17 -07:00
Patrick Wendell
9177bea2b4
Removing extra imports
2013-08-01 10:42:50 -07:00
Patrick Wendell
3e4d5e5f8b
Merge branch 'master' into master-json
...
Conflicts:
core/src/main/scala/spark/deploy/master/ui/IndexPage.scala
2013-08-01 10:42:07 -07:00
Patrick Wendell
ffc034e4fb
Import cleanup
2013-08-01 10:39:56 -07:00
Andrew xia
d58502a156
fix bug of spark "SubmitStage" listener as unit test error
2013-08-01 23:21:41 +08:00
Andrew xia
3b5a11e765
change function name "setName" to "setProperties" as "setName" is also member of Thread class
2013-08-01 19:37:15 +08:00
Dmitriy Lyubimov
cb6be5bd7e
Merge remote-tracking branch 'mesos/master' into SPARK-826
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-31 22:09:22 -07:00
Dmitriy Lyubimov
28f1550f01
More elegant rewrite of the same.
2013-07-31 21:41:00 -07:00
Dmitriy Lyubimov
7c52ecc6a4
(1) added reduce test case.
...
(2) added nested streaming in ParallelCollectionRDD
(3) added kryo with fold test which still doesn't work
2013-07-31 19:27:30 -07:00
Matei Zaharia
3097d75d6f
Merge remote-tracking branch 'dlyubimov/SPARK-827'
...
Conflicts:
docs/configuration.md
2013-07-31 18:36:43 -07:00
Karen Feng
7c9c5ef6c6
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-31 16:39:26 -07:00
Karen Feng
02cde8efdf
Replaces theme with Bootswatch Spacelab theme
2013-07-31 16:34:07 -07:00
Karen Feng
09cd67bf98
Changed bootstrap colors, fixed logpaging buttons
2013-07-31 16:18:53 -07:00
Matei Zaharia
39c75f3033
Merge pull request #757 from BlackNiuza/result_task_generation
...
Bug fix: SPARK-837
2013-07-31 15:52:36 -07:00
Matei Zaharia
14bf2fe039
Merge pull request #749 from benh/spark-executor-uri
...
Added property 'spark.executor.uri' for launching on Mesos.
2013-07-31 14:18:16 -07:00
Benjamin Hindman
4692ea4892
Used 'uri.split('/').last' instead of 'new File(uri).getName()'.
2013-07-31 12:29:44 -07:00
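For illustration only, the string-based approach named in this commit extracts the last path segment of a URI without going through java.io.File. The URI below is a made-up example, not one from the source.

```scala
// Hypothetical URI; only the last '/'-separated segment is kept.
object UriNameSketch {
  def baseName(uri: String): String = uri.split('/').last

  val example: String = baseName("hdfs://namenode/dist/spark.tgz")
}
```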
Karen Feng
c453967f9a
Reduced size of heading
2013-07-31 11:57:50 -07:00
Matei Zaharia
a386ced2c6
Merge pull request #754 from rxin/compression
...
Compression codec change
2013-07-31 11:22:50 -07:00
Karen Feng
49e6344142
Removed master URL from job UI, reduced heading size of basic spark pages
2013-07-31 11:17:59 -07:00
Reynold Xin
c61843a69f
Changed other LZF uses to use the compression codec interface.
2013-07-31 10:32:13 -07:00
Patrick Wendell
89da9d94b3
Add JSON path to master index page
2013-07-31 09:47:53 -07:00
BlackNiuza
9a815de4bf
write and read generation in ResultTask
2013-08-01 00:36:47 +08:00
Roman Tkalenko
0c6553714a
Refactored Vector.apply(length, initializer) replacing excessive code with library method
...
(also removed unused variable ```ans``` as minor change)
2013-07-31 19:05:46 +03:00
Matei Zaharia
12553e5c55
Simplified nonNegativeMod to match previous version
2013-07-31 08:50:28 -07:00
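A minimal sketch of a non-negative modulo, assuming mod is positive. The JVM's % operator can return a negative remainder for a negative left operand, which matters when a value such as a hash code is mapped to an index. This is written for illustration and may differ from the version in the repository.

```scala
// Illustrative version; assumes mod > 0.
object ModSketch {
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod                    // may be negative when x is negative
    rawMod + (if (rawMod < 0) mod else 0)   // shift negative remainders into [0, mod)
  }
}
```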
Matei Zaharia
d4556f4207
Merge pull request #751 from cdshines/master
...
Cleaned Partitioner & PythonPartitioner source by taking out non-related logic to Utils
2013-07-31 08:48:14 -07:00
Andrew xia
5670c96f29
Merge branch 'master' into Pool_UI
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/scheduler/SparkListener.scala
core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
core/src/main/scala/spark/ui/jobs/IndexPage.scala
core/src/main/scala/spark/ui/jobs/JobProgressUI.scala
2013-07-31 19:36:36 +08:00
cdshines
fefb03cbd7
Eliminated code duplication, refactored to pattern-matching style Partitioner and PythonPartitioner
2013-07-31 13:19:42 +03:00
Dmitriy Lyubimov
96664431cb
IDEA flipped JavaSerialized import at some point to a wrong class.
2013-07-30 23:10:09 -07:00
Dmitriy Lyubimov
c219fc94fd
Minor, style
2013-07-30 22:08:39 -07:00
Dmitriy Lyubimov
f4b4b8836e
reverting back to one-by-one serialization for parallelize()
2013-07-30 19:00:58 -07:00
jerryshao
bf9318091a
Add Apache license header to metrics system
2013-07-31 09:42:16 +08:00
Reynold Xin
98024eadc3
Renamed compressionOutputStream and compressionInputStream to compressedOutputStream and compressedInputStream.
2013-07-30 18:28:46 -07:00
Dmitriy Lyubimov
abada94ebf
removing default constructor (not Externalizable any more)
2013-07-30 18:04:02 -07:00
Dmitriy Lyubimov
943c6590c9
realigning "extends" back manually
2013-07-30 18:01:35 -07:00
Dmitriy Lyubimov
ca33b12e98
resetting wrap and continuation indent = 4
2013-07-30 17:51:44 -07:00
Reynold Xin
dae12fef9e
Updated the configuration option for Snappy block size to be consistent with the documentation.
2013-07-30 17:49:31 -07:00
Dmitriy Lyubimov
984b56155a
changing approaches for parallelize(): java serialization needs to avoid writing headers!
2013-07-30 17:36:59 -07:00
Reynold Xin
ad7e9d0d64
CompressionCodec cleanup. Moved it to spark.io package.
2013-07-30 17:11:54 -07:00
Dmitriy Lyubimov
ef9529a943
refactoring using writeByteBuffer() from Utils.
2013-07-30 16:24:23 -07:00
Dmitriy Lyubimov
43394b9a6d
fixing formatting
2013-07-30 16:13:41 -07:00
Reynold Xin
368c58eac5
Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compression
...
Conflicts:
project/SparkBuild.scala
2013-07-30 16:04:18 -07:00
Patrick Wendell
e87de037d6
Merge pull request #744 from karenfeng/bootstrap-update
...
Use Bootstrap progress bars in web UI
2013-07-30 15:00:08 -07:00
Karen Feng
26144c400f
Fixed wrap style
2013-07-30 12:40:41 -07:00
Karen Feng
218d7c4ed8
Fixed style, lowered height of progress bars
2013-07-30 12:39:17 -07:00
Karen Feng
f1cab31b73
Removed intermediate set for activeTasks, removed progress bar margin
2013-07-30 11:06:47 -07:00
Dmitriy Lyubimov
1bca91633e
+ bug fixes;
...
test added
Conflicts:
core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-30 11:04:11 -07:00
Benjamin Hindman
f6f46455eb
Added property 'spark.executor.uri' for launching on Mesos without
...
requiring Spark to be installed. Using 'make_distribution.sh' a user
can put a Spark distribution at a URI supported by Mesos (e.g.,
'hdfs://...') and then set that when launching their job. Also added
SPARK_EXECUTOR_URI for the REPL.
2013-07-29 23:32:52 -07:00
Josh Rosen
49be084ed3
Use File.pathSeparator instead of hardcoding ':'.
2013-07-29 22:08:57 -07:00
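As a small illustration of the change above, java.io.File.pathSeparator gives the platform's path-list separator (":" on Unix-like systems, ";" on Windows). The helper name is hypothetical.

```scala
import java.io.File

// Hypothetical helper: joins path entries with the platform separator
// instead of a hardcoded ':'.
object PathSketch {
  def joinPath(entries: Seq[String]): String =
    entries.mkString(File.pathSeparator)
}
```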
Josh Rosen
b95732632b
Do not inherit master's PYTHONPATH on workers.
...
This fixes SPARK-832, an issue where PySpark
would not work when the master and workers used
different SPARK_HOME paths.
This change may potentially break code that relied
on the master's PYTHONPATH being used on workers.
To have custom PYTHONPATH additions used on the
workers, users should set a custom PYTHONPATH in
spark-env.sh rather than setting it in the shell.
2013-07-29 22:08:57 -07:00
Andrew xia
5406013997
refactor codes less than 100 character per line
2013-07-30 11:41:38 +08:00
Andrew xia
614ee16cc4
refactor job ui with pool information
2013-07-30 10:57:26 +08:00
Dmitriy Lyubimov
8e5cd041bb
initial externalization of ParallelCollectionRDD's split
2013-07-29 19:02:53 -07:00
Reynold Xin
81720e13fc
Moved all StandaloneClusterMessage's into StandaloneClusterMessages object.
2013-07-29 17:53:01 -07:00
Reynold Xin
23b5da14ed
Moved block manager messages into BlockManagerMessages object.
2013-07-29 17:42:05 -07:00
Reynold Xin
105f4d22e9
Removed Cache and SoftReferenceCache since they are no longer used.
2013-07-29 17:30:38 -07:00
Reynold Xin
17e62113d4
Moved DeployMessage's into its own DeployMessages object.
...
Also renamed MasterState to MasterStateResponse and WorkerState to WorkerStateResponse for clarity.
2013-07-29 17:14:44 -07:00
Karen Feng
87b821dc39
Fixed continuity of executorToTasksActive, changed color of progress bars
2013-07-29 16:50:51 -07:00
Karen Feng
c7b2788948
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
...
Conflicts:
core/src/main/scala/spark/ui/jobs/IndexPage.scala
2013-07-29 16:36:07 -07:00
Patrick Wendell
c99b674405
Merge pull request #735 from karenfeng/ui-807
...
Totals for shuffle data and CPU time
2013-07-29 16:32:55 -07:00
Karen Feng
2d6da9195a
Alphabetized imports
2013-07-29 15:50:52 -07:00
Karen Feng
478a2886d9
Added started tasks to progress bar
2013-07-29 14:51:07 -07:00
Karen Feng
e04a37a332
Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-29 14:32:48 -07:00
Reynold Xin
fe7298b587
Merge pull request #741 from pwendell/usability
...
Fix two small usability issues
2013-07-29 14:01:00 -07:00
Karen Feng
43a2cc15c0
Use Bootstrap progress bars in web UI
2013-07-29 13:37:24 -07:00
Matei Zaharia
b9d6783f36
Optimize Python take() to not compute entire first partition
2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov
f5067abe85
changes per comments.
2013-07-27 23:08:00 -07:00
Karen Feng
077f2dad22
Fixed outdated bugs
2013-07-27 16:39:36 -07:00
Patrick Wendell
bcafb36c1e
Slight wording change
2013-07-27 16:03:50 -07:00
Patrick Wendell
8177165ac4
Log executor on finish
2013-07-27 16:02:06 -07:00
Patrick Wendell
c2223e6801
Improve catch scope and logging for client stop()
...
This does two things:
1. Catches the more general `TimeoutException`, since those can be thrown.
2. Logs at info level when a timeout is detected.
2013-07-27 16:02:06 -07:00
Karen Feng
5a93e3c58c
Cleaned up code based on pwendell's suggestions
2013-07-27 15:55:26 -07:00
Karen Feng
dcc4743a95
Moved val now to render
2013-07-27 12:52:53 -07:00
Karen Feng
1714693324
Current time called once with value now
2013-07-27 12:24:41 -07:00
Dmitriy Lyubimov
6a47cee721
style
2013-07-26 22:35:13 -07:00
Dmitriy Lyubimov
0c391feb73
Maximum task failures configurable
2013-07-26 22:34:43 -07:00
Karen Feng
bd4cc52e30
Made metrics Option instead of Some, fixed NullPointerException
2013-07-26 17:23:18 -07:00
Reynold Xin
cb366774c8
Merge pull request #738 from harsha2010/pruning
...
Fix bug in Partition Pruning.
2013-07-26 16:59:30 -07:00
harshars
392d7474fd
Code review
2013-07-26 15:23:15 -07:00
harshars
72cf7ec0e5
Indentation
2013-07-26 15:16:41 -07:00
harshars
822aac8f5a
Indentation
2013-07-26 15:10:32 -07:00
harshars
743fc4e7aa
Fix Bug in Partition Pruning, index of Pruned Partitions should inherit from parent
2013-07-26 14:35:17 -07:00
Karen Feng
3fbe9eaac0
Displays shuffle read/write only if exists, wraps if statements, trims old vals, grabs current time once
2013-07-26 11:51:38 -07:00
Karen Feng
22faeab261
Split Shuffle Activity overview column for read/write
2013-07-25 17:14:18 -07:00
Karen Feng
d4bbc8bd25
Shows totals for shuffle data and CPU time in Stage, homepage overviews including active time
2013-07-25 15:59:52 -07:00
Charles Reiss
a6de90c927
For standalone mode, get JAVA_HOME, SPARK_JAVA_OPTS, SPARK_LIBRARY_PATH from application env, not worker env
2013-07-25 12:42:30 -07:00
ryanlecompte
e56aa75de0
fix wrapping
2013-07-24 22:08:09 -07:00
ryanlecompte
8e0939f5a9
refactor Kryo serializer support to use chill/chill-java
2013-07-24 20:43:57 -07:00
Karen Feng
57009eef90
Fixed consistency of "success" status string
2013-07-24 13:43:09 -07:00
Karen Feng
4280e1768d
Removed finished status for task info, changed name of success case
2013-07-24 12:48:48 -07:00
Karen Feng
bd3931c874
Changed ifs with returns to if/else
2013-07-24 11:27:17 -07:00
Karen Feng
93c6015f82
Shows task status and running tasks on Stage Page: fixes SPARK-804 and 811
2013-07-24 10:53:02 -07:00
jerryshao
31ec72b243
Code refactor according to comments
2013-07-24 14:57:47 +08:00
jerryshao
8d1ef7f2df
Code style changes
2013-07-24 14:57:47 +08:00
Andrew xia
05637de842
Change class xxxInstrumentation to class xxxSource
2013-07-24 14:57:47 +08:00
Andrew xia
ed1a3bc206
continue to refactor code style and functions
2013-07-24 14:57:47 +08:00
jerryshao
5730193e0c
Fix some typos
2013-07-24 14:57:47 +08:00
jerryshao
a79f6077f0
Add Maven metrics library dependency and code changes
2013-07-24 14:57:47 +08:00
jerryshao
1daff54b2e
Change Executor MetricsSystem initialize code to SparkEnv
2013-07-24 14:57:47 +08:00
Andrew xia
5f8802c1fb
Register and init metricsSystem in SparkContext
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
Andrew xia
7d2eada451
Add metrics source of DAGScheduler and blockManager
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
jerryshao
e9ac88754d
Fix bug where a Source was added twice, and clean up code
2013-07-24 14:57:47 +08:00
jerryshao
5ce5dc9fcd
Add default properties to deal with no configure file situation
2013-07-24 14:57:47 +08:00
jerryshao
871bc1687e
Add Executor instrumentation
2013-07-24 14:57:46 +08:00
jerryshao
7fb574bf66
Code clean and remarshal
2013-07-24 14:57:46 +08:00
Andrew xia
4d6dd67fa1
refactor metrics system
...
1. change source abstract class to support MetricRegistry
2. change master/worker/jvm source class
2013-07-24 14:57:46 +08:00
jerryshao
03f9871116
MetricsSystem refactor
2013-07-24 14:57:46 +08:00
jerryshao
c3daad3f65
Update metric source support for instrumentation
2013-07-24 14:57:46 +08:00
jerryshao
9dec8c73e6
Add Master and Worker instrumentation support
2013-07-24 14:57:46 +08:00
jerryshao
503acd3a37
Build metrics system framework
2013-07-24 14:57:46 +08:00
Matei Zaharia
b011329040
Merge pull request #727 from rxin/scheduler
...
Scheduler code style cleanup.
2013-07-23 22:50:09 -07:00
Matei Zaharia
876125b997
Merge pull request #726 from rxin/spark-826
...
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure
2013-07-23 22:28:21 -07:00
Reynold Xin
3dae1df66f
Moved non-serializable closure catching exception from submitStage to submitMissingTasks
2013-07-23 20:29:07 -07:00
Reynold Xin
d33b8a2a0f
Added comments on task closure serialization.
2013-07-23 20:28:39 -07:00
Reynold Xin
85ab8114bc
Moved non-serializable closure catching exception from submitStage to submitMissingTasks
2013-07-23 20:25:58 -07:00
Matei Zaharia
6a31b7191d
Small bug fix
2013-07-23 16:20:24 -07:00
Matei Zaharia
2f1736c396
Merge pull request #725 from karenfeng/task-start
...
Creates task start events
2013-07-23 15:53:30 -07:00
Karen Feng
abc78cd331
Modifies instead of copies HashSets, fixes comment style
2013-07-23 15:47:16 -07:00
Karen Feng
383684daaa
Replaces Seq with HashSet, removes redundant import
2013-07-23 15:33:27 -07:00
Reynold Xin
f2422d4f29
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.
2013-07-23 15:30:20 -07:00
Reynold Xin
5ed38b4d1d
Scheduler code style cleanup.
2013-07-23 15:28:59 -07:00
Reynold Xin
101b8cc78a
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.
2013-07-23 15:28:20 -07:00
Karen Feng
9f2dbb2a7c
Adds/removes active tasks only once
2013-07-23 15:10:09 -07:00
Dmitriy Lyubimov
ef82ff8564
Merge branch 'master' into SPARK-826
...
Conflicts:
core/src/main/scala/spark/scheduler/local/LocalScheduler.scala
2013-07-23 13:43:00 -07:00
Karen Feng
0200801a55
Tracks task start events and shows number of active tasks on Executor UI
2013-07-23 13:35:43 -07:00
Dmitriy Lyubimov
310e73d566
style
2013-07-23 13:23:25 -07:00
Matei Zaharia
f369e0e51b
Merge pull request #720 from ooyala/2013-07/persistent-rdds-api
...
Add a public method getCachedRdds to SparkContext
2013-07-23 13:22:27 -07:00
Dmitriy Lyubimov
ac60d06381
Re-working in terms of changes to TaskSetManager. Verified with Standalone and Local mode.
2013-07-23 13:13:19 -07:00
Evan Chan
4830e22562
Rename method per rxin feedback
2013-07-23 09:50:13 -07:00
Evan Chan
2c2bfbe294
Add toMap method to TimeStampedHashMap and use it
2013-07-23 01:36:44 -07:00
Matei Zaharia
401aac8b18
Merge pull request #719 from karenfeng/ui-808
...
Creates Executors tab for Jobs UI
2013-07-22 16:57:16 -07:00
Karen Feng
872c97ad82
Split task columns, memory columns sort by numeric value
2013-07-22 16:54:37 -07:00
Karen Feng
2eea974795
Executors UI now calls executor ID from TaskInfo instead of TaskMetrics
2013-07-22 15:15:54 -07:00
Dmitriy Lyubimov
8ca0c31944
removing non-pertinent comment
2013-07-22 14:48:46 -07:00
Dmitriy Lyubimov
b4b230e606
Fixing for LocalScheduler with test, that much works ..
2013-07-22 14:42:47 -07:00
Karen Feng
85c4d7bf3b
Shows number of complete/total/failed tasks (bug: failed tasks assigned to null executor)
2013-07-22 14:35:47 -07:00
Josh Rosen
f649dabb4a
Fix bug: DoubleRDDFunctions.sampleStdev() computed non-sample stdev().
...
Update JavaDoubleRDD to add new methods and docs.
Fixes SPARK-825.
2013-07-22 13:21:48 -07:00
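The distinction behind SPARK-825 can be shown outside Spark. The following is a minimal sketch in plain Python, not Spark's code: the population standard deviation divides the sum of squared deviations by n, while the sample standard deviation divides by n - 1. The function names `population_stdev` and `sample_stdev` are hypothetical and chosen only for this illustration.

```python
import math


def population_stdev(xs):
    """Square root of the sum of squared deviations divided by n."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sample_stdev(xs):
    """Same sum, divided by n - 1 (Bessel's correction); requires n >= 2."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(population_stdev(data))  # divides by 8
print(sample_stdev(data))      # divides by 7, so the result is larger
```

For any data set with at least two distinct values, the sample figure is strictly larger than the population figure, which is how a mix-up between the two shows up in results.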
Karen Feng
8901f379c9
Fixed memory used/remaining/total bug
2013-07-22 09:58:03 -07:00
Karen Feng
636b19f833
Merge branch 'master' of https://github.com/mesos/spark into ui-808
2013-07-22 09:53:26 -07:00
Evan Chan
0337d88321
Add a public method getCachedRdds to SparkContext
2013-07-21 18:26:14 -07:00
Karen Feng
865dc63bac
Changed table format for executors
2013-07-19 15:57:01 -07:00
Karen Feng
81bb5dc640
Creates Executors tab for application with RDD block and memory/disk used, solves SPARK-808
2013-07-19 14:08:30 -07:00
Konstantin Boudnik
cfce9a6a36
Regression: default webui-port can't be set via command line "--webui-port" anymore
2013-07-19 14:00:58 -07:00
Liang-Chi Hsieh
aa6f83289b
A better fix for giving local jars under Yarn mode.
2013-07-19 22:25:28 +08:00
Liang-Chi Hsieh
a613628c50
Do not copy local jars given to SparkContext in yarn mode, since the Context is not running locally. This bug causes a failure when the jars cannot be found. Example code (such as spark.examples.SparkPi) cannot work under yarn mode without this fix.
2013-07-19 16:59:12 +08:00
Prashant Sharma
039087e1e3
Fixed formatting As per review comments on #709
2013-07-17 11:46:00 +05:30
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Matei Zaharia
b1f9f64743
Merge branch 'master' of github.com:mesos/spark
2013-07-16 11:01:53 -07:00
Matei Zaharia
5c388808a8
SPARK-814: Result stages should be named after action
2013-07-16 11:01:14 -07:00
Prashant Sharma
f89cc7ae3c
Fixed warning for type erasure
2013-07-16 14:59:24 +05:30
Prashant Sharma
50f3cd8890
Fixed warning enumerations
2013-07-16 14:39:46 +05:30
Prashant Sharma
55da6e9504
Fixed warning erasure -> runtimeClass
2013-07-16 14:37:08 +05:30
Prashant Sharma
ff14f38f3d
Fixed warning Throwables
2013-07-16 14:34:56 +05:30
Prashant Sharma
63addd93a8
Fixed warning ClassManifest -> ClassTag
2013-07-16 14:09:52 +05:30
Reynold Xin
69316603d6
Throw a more meaningful message when runJob is called to launch tasks on non-existent partitions.
2013-07-15 22:50:11 -07:00
Karen Feng
6dc7c9bfb1
Removed job UI column, linked description to job UI
2013-07-15 16:33:50 -07:00
Karen Feng
fbf5aa761e
Removed log message, added field in master UI to link to log UI
2013-07-15 15:50:03 -07:00
Karen Feng
eac381a957
Merge branch 'ui-802' of https://github.com/karenfeng/spark into ui-802
2013-07-15 15:48:44 -07:00
Karen Feng
3955711250
Added field to master UI with link to job UI
2013-07-15 15:47:21 -07:00
Karen Feng
0d78b6d9cd
Links to job UI from standalone deploy cluster web UI: fixes SPARK-802
2013-07-15 13:47:38 -07:00
Karen Feng
b2aaa1199e
Adds app name in HTML page titles on job web UI: fixes SPARK-806
2013-07-15 11:44:42 -07:00
Prashant Sharma
a1e56a43b3
Fixed compilation issues as Map is by default immutable.Map in scala-2.10
2013-07-15 11:28:18 +05:30
Prashant Sharma
a3494d405d
Merge branch 'master' of github.com:mesos/spark into scala-2.10
...
Conflicts:
core/src/main/scala/spark/Utils.scala
core/src/test/scala/spark/ui/UISuite.scala
project/SparkBuild.scala
run
2013-07-15 11:15:55 +05:30
Matei Zaharia
d47c16f78d
Add an option to disable reference tracking in Kryo
2013-07-15 01:55:54 +00:00
Matei Zaharia
10c05937bd
Merge pull request #699 from pwendell/ui-env
...
Add `Environment` tab to SparkUI.
2013-07-14 11:45:18 -07:00
Patrick Wendell
4883586838
Responding to Matei's review
2013-07-14 10:37:26 -07:00
Matei Zaharia
b91a218cea
Cosmetic fixes to web UI
2013-07-14 07:31:33 +00:00
Matei Zaharia
a44a7b1238
Determine Spark core classes better in getCallSite
2013-07-14 07:23:09 +00:00
root
e271fde10b
Fixed a delay scheduling bug in the YARN branch, found by Patrick
2013-07-14 06:24:29 +00:00
Patrick Wendell
ddb97f0fdf
Add `Environment` tab to SparkUI.
...
This adds a tab which displays system property and classpath information. This
can be useful in debugging various types of issues such as:
1. Extra/incorrect Hadoop jars being included in the classpath
2. Spark launching with a different JRE version than intended
3. Spark system properties not being set to intended values
4. User added jars that conflict with Spark jars
2013-07-13 16:14:40 -07:00
Matei Zaharia
77c69ae5a0
Merge pull request #697 from pwendell/block-locations
...
Show block locations in Web UI.
2013-07-12 23:05:21 -07:00
Matei Zaharia
5a7835c152
Merge pull request #691 from karenfeng/logpaging
...
Create log pages
2013-07-12 20:28:21 -07:00
Matei Zaharia
71ccca0cc1
Merge pull request #696 from woggle/executor-env
...
Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.sh
2013-07-12 20:25:06 -07:00
Matei Zaharia
90fc3f30cd
Merge pull request #692 from Reinvigorate/takeOrdered
...
adding takeOrdered() to RDD
2013-07-12 20:23:36 -07:00
Patrick Wendell
08150f19ab
Minor style fix
2013-07-12 19:32:35 -07:00
Patrick Wendell
6855338e14
Show block locations in Web UI.
...
This fixes SPARK-769. Support is added for enumerating the locations of blocks
in the UI. There is also some minor cleanup in StorageUtils.
2013-07-12 19:30:32 -07:00
Charles Reiss
531a7e5574
Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.
2013-07-12 12:58:25 -07:00
seanm
a1662326e9
comment adjustment to takeOrdered
2013-07-12 08:38:19 -07:00
Prashant Sharma
a220e11a07
Merge branch 'master' of github.com:mesos/spark into scala-2.10
2013-07-12 15:12:46 +05:30
Prashant Sharma
e86d5dbaad
Merge branch 'master' into master-merge
...
Conflicts:
README.md
core/pom.xml
core/src/main/scala/spark/deploy/JsonProtocol.scala
core/src/main/scala/spark/deploy/LocalSparkCluster.scala
core/src/main/scala/spark/deploy/master/Master.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
project/SparkBuild.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
2013-07-12 14:49:16 +05:30
Andrew xia
2080e25006
Enhance job ui in spark ui system with adding pool information
2013-07-12 14:25:18 +08:00
seanm
a2c915fba8
giving order to top and making tests more clear
2013-07-11 18:55:00 -07:00
Karen Feng
5c67ca0278
Remove "Bytes" in lieu of String notation
2013-07-11 17:31:59 -07:00
Karen Feng
6d054487bf
Replace default buffer value to 100 GB, changed buttons to use String notation, removed default buffer parameter in UI URLs
2013-07-11 17:12:17 -07:00
Karen Feng
a32784109d
Fixed links for "Back to Master"
2013-07-11 16:57:55 -07:00
Karen Feng
ece2388585
Removed logPageLength from logPage
2013-07-11 16:35:56 -07:00
Karen Feng
9ed036ccdb
Replaced logPageLength with byteLength to prevent buffer shrink bug
2013-07-11 16:33:53 -07:00
Karen Feng
fdc226a14c
Clarified start and end byte variable names
2013-07-11 15:36:43 -07:00
Karen Feng
5d5dbc39f6
getByteRange moved to WorkerWebUI, takes converted parameters, returns only start/end offset
2013-07-11 15:22:45 -07:00
Karen Feng
15fd11d657
Removed redundant calls to request by logPage
2013-07-11 15:01:50 -07:00
Karen Feng
11872888ca
Created getByteRange function for logs and log pages, removed lastNBytes function
2013-07-11 14:56:37 -07:00
Matei Zaharia
018d04c64e
Merge pull request #684 from woggle/mesos-classloader
...
Explicitly set class loader for MesosSchedulerDriver callbacks.
2013-07-11 12:48:37 -07:00
Karen Feng
e3a3fcf61b
Scrollbar on log pages appears automatically
2013-07-11 12:16:38 -07:00
Karen Feng
044d4577ec
Fixed capitalization of log page
2013-07-11 12:02:15 -07:00
Karen Feng
0ecc33f0c8
Added byte range, page title with log name, previous/next bytes buttons, initialization to end of log, large default buffer, buggy back to master link
2013-07-11 11:25:58 -07:00
Karen Feng
74bd3fc680
Added byte range on log pages
2013-07-10 15:44:28 -07:00
Karen Feng
24196c91f0
Changed buffer to 10,000 bytes, created scrollbar for fixed-height log
2013-07-10 15:27:52 -07:00
Karen Feng
f5f3b272f8
Fixed mixup of start/end, moved more import files
2013-07-10 14:52:29 -07:00
Karen Feng
0d4580360b
Fixed docstring of offsetBytes to match params and wrapped for 100+ character lines
2013-07-10 13:24:26 -07:00
Karen Feng
04263e4d46
Made some minor style changes
2013-07-10 13:15:42 -07:00
Karen Feng
cfb6447ac4
Fixed for nonexistent bytes, added unit tests, changed stdout-page to stdout
2013-07-10 11:47:57 -07:00
seanm
ee4ce2fc51
adding takeOrdered to java API
2013-07-10 10:46:04 -07:00
seanm
24705d0f46
adding takeOrdered() to RDD
2013-07-10 10:33:11 -07:00
Karen Feng
620a6974c6
Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map
2013-07-10 10:20:53 -07:00
Karen Feng
b6072b58bf
Fixes style, makes "std__-page" consistent, reads only parts of files
2013-07-09 17:25:10 -07:00
Karen Feng
13fc6f248c
Clean commit of log paging
2013-07-09 14:17:15 -07:00
Charles Reiss
e47253e0cc
Reset ClassLoader in MesosSchedulerBackend, too. (per review comments).
...
Also set ClassLoader for all mesos callbacks, not just statusUpdate,
registered.
2013-07-09 01:23:23 -07:00
Charles Reiss
8c1d1c98e0
Explicitly set class loader for MesosSchedulerDriver callbacks.
2013-07-08 12:25:46 -07:00
Shivaram Venkataraman
4af0d63cb1
Remove akka LogLevel fix as we no longer use spray
2013-07-07 10:42:43 -07:00
Shivaram Venkataraman
a948f06725
Suppress log messages in sbt test with two changes:
...
1. Set akka log level to ERROR before shutting down the actorSystem.
This avoids akka log messages (like Spray) from falling back to INFO
on the Stdout logger
2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that
stack trace thrown during shutdown is handled by SLF4J instead of stdout
2013-07-07 04:09:08 -07:00
Patrick Wendell
32b9d21a97
Fix occasional failure in UI listener.
...
If a task fails before the metrics are initialized, it remains possible
that the metrics field will be `None`. This patch accounts for that possibility
by keeping metrics as an `Option` at all times.
2013-07-06 16:40:02 -07:00
Matei Zaharia
1ffadb2d9e
Merge remote-tracking branch 'pwendell/ui-updates'
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia
94871e4703
Merge pull request #655 from tgravescs/master
...
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Matei Zaharia
3f918b33f8
Merge pull request #672 from holdenk/master
...
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-06 12:45:18 -07:00
Matei Zaharia
7ba7fa110b
Merge pull request #674 from liancheng/master
...
Bug fix: SPARK-789
2013-07-06 11:45:08 -07:00
BlackNiuza
44a2440039
Remove active job from idToActiveJob when job finished or aborted
2013-07-07 01:33:09 +08:00
Patrick Wendell
37abe84212
Tracking some task metrics even during failures.
2013-07-06 09:19:59 -07:00
Patrick Wendell
84b7fc54e6
Enforcing correct sort order for formatted strings
2013-07-05 17:21:08 -07:00
Matei Zaharia
652ea0f1d8
Allow RDD.takeSample to give samples bigger than the RDD
...
Before, when withReplacement was set to true, we would not get a sample
bigger than the RDD's count().
Conflicts:
core/src/main/scala/spark/RDD.scala
core/src/test/scala/spark/RDDSuite.scala
2013-07-05 11:15:13 -07:00
Matei Zaharia
6586c5e28b
Added a SparkContext accessor to RDD
2013-07-05 11:13:46 -07:00
jerryshao
e4ff544a8d
Clean StageToInfos periodically when spark.cleaner.ttl is enabled
2013-07-05 10:34:45 +08:00
Lian Cheng
c0c3155c3c
Bug fix: SPARK-789
...
https://spark-project.atlassian.net/browse/SPARK-789
2013-07-05 00:54:10 +08:00
Holden Karau
0f06d6217d
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-04 01:05:39 -07:00
Gavin Li
94238aae57
fix dependencies
2013-07-03 18:08:38 +00:00
Prashant Sharma
a5f1f6a907
Merge branch 'master' into master-merge
...
Conflicts:
core/pom.xml
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/RDDCheckpointData.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/api/python/PythonRDD.scala
core/src/main/scala/spark/deploy/client/Client.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/ZippedRDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/BlockManagerMasterActor.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
core/src/test/scala/spark/SizeEstimatorSuite.scala
pom.xml
project/SparkBuild.scala
repl/src/main/scala/spark/repl/SparkILoop.scala
repl/src/test/scala/spark/repl/ReplSuite.scala
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala
2013-07-03 11:43:26 +05:30
Gavin Li
96130c30d9
add compression codec trait and snappy compression
2013-07-03 05:49:04 +00:00
Y.CORP.YAHOO.COM\tgraves
923cf92900
Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of the JAVA_HOME environment
...
variable conditional on whether it is set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Patrick Wendell
39e2325675
Removing dead code
2013-07-02 16:28:40 -07:00
Patrick Wendell
8ca1cc1786
Adding truncation for log files
2013-07-02 16:10:50 -07:00
Patrick Wendell
9a42d04efa
Throw exception for missing resource
2013-07-01 14:43:13 -07:00
Patrick Wendell
1025d7d1ef
Package refactoring
2013-07-01 14:40:53 -07:00
Patrick Wendell
30b9034241
Fixing bug where logs aren't shown
2013-07-01 13:48:01 -07:00
Patrick Wendell
8688689387
Various formatting changes
2013-07-01 13:40:12 -07:00
Patrick Wendell
735c951a09
Adding test script
2013-07-01 09:33:22 -07:00
Patrick Wendell
5de326db7d
Print exception message
2013-07-01 09:19:45 -07:00
root
ec31e68d5d
Fixed PySpark perf regression by not using socket.makefile(), and improved
...
debuggability by letting "print" statements show up in the executor's stderr
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-07-01 06:26:31 +00:00
root
3296d132b6
Fix performance bug with new Python code not using buffered streams
2013-07-01 06:25:43 +00:00
Matei Zaharia
03d0b858c8
Made use of spark.executor.memory setting consistent and documented it
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Patrick Wendell
e721ff7e5a
Allowing details for failed stages
2013-06-29 11:26:30 -07:00
Patrick Wendell
473961d82e
Styling for progress bar
2013-06-29 08:38:04 -07:00
Patrick Wendell
249f0e54ba
Minor changes from Matei's review
2013-06-28 13:25:26 -07:00
Patrick Wendell
c537e869f3
Missing logo file
2013-06-27 22:02:03 -07:00
Patrick Wendell
62c2c6b856
Forcing Jetty to run as daemon
2013-06-27 21:47:22 -07:00
Patrick Wendell
a55190d314
Adding better tabs for UI headers.
2013-06-27 19:14:51 -07:00
Patrick Wendell
362d996c81
Handful of changes based on matei's review
...
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
Patrick Wendell
92a4c2a5f6
Fixing bug in local scheduler time recording
2013-06-27 12:33:06 -07:00
Stephen Haberman
d7011632d1
Wrap lines.
2013-06-26 12:35:57 -05:00
Patrick Wendell
ee692482a6
One more private class
2013-06-26 09:07:32 -07:00
Patrick Wendell
a59c15a37e
Adding config option for retained stages
2013-06-26 08:54:57 -07:00
Patrick Wendell
274193664a
Bumping timeouts
2013-06-26 08:51:28 -07:00
Patrick Wendell
b14ad509ba
Moving static ui package
2013-06-26 08:46:51 -07:00
Patrick Wendell
2cbaa0734b
Making all new classes package private
2013-06-26 08:44:55 -07:00
Stephen Haberman
d11025dc6a
Be cute with Option and getenv.
2013-06-26 09:53:35 -05:00
Matei Zaharia
6c8d1b2ca6
Fix computation of classpath when we launch java directly
...
The previous version assumed that a CLASSPATH environment variable was
set by the "run" script when launching the process that starts the
ExecutorRunner, but unfortunately this is not true in tests. Instead, we
factor the classpath calculation into an external script and call that.
NOTE: This includes a Windows version but hasn't yet been tested there.
2013-06-25 18:21:00 -04:00
Matei Zaharia
15b00914c5
Some fixes to the launch-java-directly change:
...
- Split SPARK_JAVA_OPTS into multiple command-line arguments if it
contains spaces; this splitting follows quoting rules in bash
- Add the Scala JARs to the classpath if they're not in the CLASSPATH
variable because the ExecutorRunner is launched with "scala" (this can
happen when using local-cluster URLs in spark-shell)
2013-06-25 17:17:27 -04:00
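The first item above, splitting an options string into separate arguments while honouring shell quoting, can be sketched as follows. This is an illustration in Python using the standard library's `shlex.split`, which follows POSIX shell quoting rules and so only approximates bash; it is not the code in this commit. The helper name `split_java_opts` and the sample option values are hypothetical.

```python
import shlex


def split_java_opts(opts):
    """Split an options string into argv-style tokens.

    Quoted sections stay together as one argument and the quotes
    themselves are removed, as a POSIX shell would do.
    """
    return shlex.split(opts)


# Hypothetical value for illustration only.
args = split_java_opts('-Dexample.name="a b" -Xmx2g')
print(args)
```

A plain split on whitespace would break the quoted value into two arguments and leave the quote characters in place, which is the failure mode that quote-aware splitting avoids.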
Matei Zaharia
7e0191c6ea
Merge remote-tracking branch 'cgrothaus/SPARK-698'
...
Conflicts:
run
2013-06-25 15:47:40 -04:00
Patrick Wendell
d66bd6f885
Adding another unit test to Web UI suite
2013-06-24 17:12:55 -07:00
Patrick Wendell
f7389330c3
Allowing for requested port on construction
2013-06-24 16:51:52 -07:00
Patrick Wendell
42157027f2
A few bug fixes and a unit test
2013-06-24 16:25:05 -07:00
Patrick Wendell
a4248138b4
Minor style cleanup
2013-06-24 14:22:28 -07:00
Patrick Wendell
b5e6e8bcc8
Cleaning up some code for Job Progress
2013-06-24 14:13:24 -07:00
Patrick Wendell
93e8ed85aa
Workaround for initialization issue
2013-06-24 13:11:18 -07:00
Patrick Wendell
f6e64b5cd6
Updating based on changes to JobLogger (and one small change to JobLogger)
2013-06-24 12:40:41 -07:00
Matei Zaharia
78ffe164b3
Clone the zero value for each key in foldByKey
...
The old version reused the object within each task, leading to
overwriting of the object when a mutable type is used, which is expected
to be common in fold.
Conflicts:
core/src/test/scala/spark/ShuffleSuite.scala
2013-06-23 10:26:53 -07:00
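The aliasing problem described in this commit can be reproduced without Spark. The sketch below is plain Python, not Spark's implementation: `fold_by_key` is a hypothetical single-machine stand-in for foldByKey, and the `clone_zero` flag exists only to contrast the two behaviours.

```python
import copy


def fold_by_key(pairs, zero, op, clone_zero=True):
    """Fold the values of each key with `op`, starting from `zero`."""
    acc = {}
    for key, value in pairs:
        if key not in acc:
            # Cloning gives every key its own starting object.
            acc[key] = copy.deepcopy(zero) if clone_zero else zero
        acc[key] = op(acc[key], value)
    return acc


def append(buf, v):
    buf.append(v)  # mutates the accumulator in place
    return buf


pairs = [("a", 1), ("b", 2), ("a", 3)]
shared = fold_by_key(pairs, [], append, clone_zero=False)
cloned = fold_by_key(pairs, [], append, clone_zero=True)
print(shared)  # both keys refer to the same list
print(cloned)  # each key has its own list
```

Without the clone, every key ends up pointing at one shared list that has absorbed all values; with the clone, each key accumulates only its own values.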
Matei Zaharia
0e0f9d3069
Fix search path for REPL class loader to really find added JARs
2013-06-22 17:44:04 -07:00
Matei Zaharia
3e61beff7b
Merge pull request #648 from shivaram/netty-dbg
...
Shuffle fixes and cleanup
2013-06-22 16:22:47 -07:00
Patrick Wendell
7e9f1ed0de
Some cleanup of styling
2013-06-22 10:31:37 -07:00
Patrick Wendell
3b7ebdeeb8
Handling entirely failed stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
be6107ce44
Some tweaking with shared page header
2013-06-22 10:31:37 -07:00
Patrick Wendell
9a24d1a2d0
Using scala in XML imports
2013-06-22 10:31:37 -07:00
Patrick Wendell
f91e1c4822
Linking RDD information when available in stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
a86bb459e2
Showing shuffle status and purging old stages
2013-06-22 10:31:37 -07:00
Patrick Wendell
3485e73376
Style cleanup
2013-06-22 10:31:37 -07:00
Patrick Wendell
dd696f3a3d
Some renaming and comments
2013-06-22 10:31:37 -07:00
Patrick Wendell
5c872e9ef5
Documentation and some refactoring
2013-06-22 10:31:37 -07:00
Patrick Wendell
17776323a6
More work on percentile data:
2013-06-22 10:31:37 -07:00
Patrick Wendell
dcf6a68177
Refactoring into different modules
2013-06-22 10:31:36 -07:00
Patrick Wendell
ce81c320ac
Adding helper function to make listing tables
2013-06-22 10:31:36 -07:00
Patrick Wendell
9fd5dc3ea9
Initial steps towards job progress UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
bc4a811c57
Stash
2013-06-22 10:31:36 -07:00
Patrick Wendell
77c53f7868
Refactoring UI packages
2013-06-22 10:31:36 -07:00
Patrick Wendell
8b5c7e71c4
Import cleanup
2013-06-22 10:31:36 -07:00
Patrick Wendell
32a45d01b1
Removing twirl files
2013-06-22 10:31:36 -07:00
Patrick Wendell
4e1f202481
Removing dead code
2013-06-22 10:31:36 -07:00
Patrick Wendell
d6fde4ffe4
Some JSON cleanup
2013-06-22 10:31:36 -07:00
Patrick Wendell
91ec5a1a04
Changing JSON protocol and removing spray code
2013-06-22 10:31:36 -07:00
Patrick Wendell
fc94576ece
Adding worker version of UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
ee73c09ac9
Some comments
2013-06-22 10:31:36 -07:00
Patrick Wendell
9161db5478
Cleaning up master web UI
2013-06-22 10:31:36 -07:00
Patrick Wendell
e55cf0245f
Adding WebUI file
2013-06-22 10:31:35 -07:00
Patrick Wendell
f85fd7a793
Commenting unfinished part
2013-06-22 10:31:35 -07:00
Patrick Wendell
2c36a514aa
Spray refactoring for master web UI
2013-06-22 10:31:35 -07:00
Patrick Wendell
7e6977b6c5
Fix in storage status page
2013-06-22 10:31:35 -07:00
Patrick Wendell
950f83535a
Adding deterministic port
2013-06-22 10:31:35 -07:00
Patrick Wendell
7cd70dc2c1
Minor cleanup
2013-06-22 10:31:35 -07:00
Patrick Wendell
e66f570194
Completely hacked version of block manager UI in jetty
2013-06-22 10:31:35 -07:00
Patrick Wendell
60fbf7e461
Partially working checkpoint
2013-06-22 10:31:35 -07:00
Matei Zaharia
1ef5d0d2c9
Merge pull request #644 from shimingfei/joblogger
...
add Joblogger to Spark (on new Spark code)
2013-06-22 09:35:57 -07:00
Jey Kottalam
1ba3c17303
use parens when calling method with side-effects
2013-06-21 12:14:16 -04:00
Jey Kottalam
edb18ca928
Rename PythonWorker to PythonWorkerFactory
2013-06-21 12:14:16 -04:00
Jey Kottalam
62c4781400
Add tests and fixes for Python daemon shutdown
2013-06-21 12:14:16 -04:00
Jey Kottalam
c79a6078c3
Prefork Python worker processes
2013-06-21 12:14:16 -04:00
Jey Kottalam
40afe0d2a5
Add Python timing instrumentation
2013-06-21 12:14:16 -04:00
Mingfei
2fc794a6c7
small modification in DAGScheduler
2013-06-21 18:21:35 +08:00
Mingfei
4b9862ac9c
small format modification
2013-06-21 17:55:32 +08:00
Mingfei
aa7aa587be
some format modification
2013-06-21 17:48:41 +08:00
Mingfei
5240795154
edit according to comments
2013-06-21 17:38:23 +08:00
Matei Zaharia
71030ba3eb
Merge pull request #654 from lyogavin/enhance_pipe
...
fix typo and coding style in #638
2013-06-19 15:21:03 -07:00
Thomas Graves
bad51c7cb4
upmerge with latest mesos/spark master and fix hbase compile with hadoop2-yarn profile
2013-06-19 14:39:13 -05:00
Thomas Graves
75d78c7ac9
Add support for Spark on Yarn on a secure Hadoop cluster
2013-06-19 11:18:42 -05:00
Matei Zaharia
7902baddc7
Update ASM to version 4.0
2013-06-19 13:34:30 +02:00
Gavin Li
0a2a9bce1e
fix typo and coding style
2013-06-18 21:30:13 +00:00
jerryshao
1e9269c3ee
reduce ZippedPartitionsRDD's getPreferredLocations complexity
2013-06-18 09:49:06 +08:00
Matei Zaharia
db42451a52
Merge pull request #643 from adatao/master
...
Bug fix: Zero-length partitions result in NaN for overall mean & variance
2013-06-17 15:26:36 -07:00
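One way such a NaN can arise is when per-partition statistics are merged and an empty partition contributes a zero count to a division. The sketch below is an assumption about the general technique, not the code in this pull request: it merges `(count, mean, m2)` triples with the standard pairwise update for mean and sum of squared deviations, and skips empty sides. The name `merge_stats` is hypothetical. On the JVM, a floating-point 0.0 / 0 yields NaN, whereas Python raises `ZeroDivisionError`; the guard avoids both.

```python
def merge_stats(a, b):
    """Merge two (count, mean, m2) triples, where m2 is the sum of
    squared deviations from the mean. Empty sides are skipped so
    that no division by a zero count takes place."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    if n_b == 0:
        return a
    if n_a == 0:
        return b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)


empty = (0, 0.0, 0.0)
left = (2, 1.5, 0.5)   # statistics of [1, 2]
right = (2, 3.5, 0.5)  # statistics of [3, 4]
total = merge_stats(merge_stats(empty, left), right)
print(total)
```

The population variance is then `m2 / count`, which stays well defined as long as at least one partition is non-empty.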
Matei Zaharia
e82a2ffcc9
Merge pull request #653 from rxin/logging
...
SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory."
2013-06-17 15:13:15 -07:00
Matei Zaharia
ec193c7d89
Merge remote-tracking branch 'xiajunluan/xiajunluan'
...
Conflicts:
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
2013-06-18 00:11:50 +02:00
Reynold Xin
be3c406edf
Fixed the typo pointed out by Matei.
2013-06-17 17:07:51 -04:00
Reynold Xin
1450296797
SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory".
2013-06-17 16:58:23 -04:00
Gavin Li
4508089fc3
refine comments and add sc.clean
2013-06-17 05:23:46 +00:00
Gavin Li
e6ae049283
Merge remote-tracking branch 'upstream1/master' into enhance_pipe
2013-06-16 22:53:39 +00:00
Gavin Li
fb6d733fa8
update according to comments
2013-06-16 22:32:55 +00:00
Matei Zaharia
f961aac8b2
Merge pull request #649 from ryanlecompte/master
...
Add top K method to RDD using a bounded priority queue
2013-06-15 00:53:41 -07:00
ryanlecompte
e8801d4490
use delegation for BoundedPriorityQueue, add Java API
2013-06-14 23:39:05 -07:00
Reynold Xin
2cc188fd54
SPARK-774: cogroup should also disable map side combine by default
2013-06-14 00:10:54 -07:00
Reynold Xin
6738178d0d
SPARK-772: groupByKey should disable map side combine.
2013-06-13 23:59:42 -07:00
ryanlecompte
93b3f5e535
drop unneeded ClassManifest implicit
2013-06-13 16:26:35 -07:00
ryanlecompte
44b8dbaede
use Iterator.single(elem) instead of Iterator(elem) for improved performance based on scaladocs
2013-06-13 16:23:15 -07:00
Shivaram Venkataraman
1d9f0df065
Fix some comments and style
2013-06-13 14:46:25 -07:00
Mingfei
967a6a699d
modify sparklister function interface according to comments
2013-06-13 14:36:07 +08:00
Shivaram Venkataraman
5da4287b1d
Merge branch 'netty-dbg' of github.com:shivaram/spark into netty-dbg
2013-06-12 16:38:37 -07:00
Shivaram Venkataraman
5e9a9317c5
Merge branch 'master' of git://github.com/mesos/spark into netty-dbg
2013-06-12 16:38:01 -07:00
ryanlecompte
db5bca08ff
add a new top K method to RDD using a bounded priority queue
2013-06-12 10:54:16 -07:00
Andrew xia
190ec61799
change code style and debug info
2013-06-10 15:27:02 +08:00
Patrick Wendell
ef14dc2e77
Adding Java-API version of compression codec
2013-06-09 18:09:46 -07:00
Patrick Wendell
df592192e7
Monads FTW
2013-06-09 18:09:24 -07:00
Patrick Wendell
d1bbcebae5
Adding compression to Hadoop save functions
2013-06-09 11:39:35 -07:00
Mingfei
ade822011d
do not check return value of eventQueue.take
2013-06-08 16:26:45 +08:00
Matei Zaharia
5b5b5aedbf
Fixed a few test issues due to Akka 2.1, as well as SBT memory.
...
Unfortunately, in Akka 2.1, ActorSystem.awaitTermination hangs for
remote actors, and Akka also leaves a non-daemon Netty thread even when
run in daemon mode. Thus I had to comment out some of the calls to
awaitTermination, and we still have one failing test.
2013-06-08 01:09:24 -07:00
Mingfei
4fd86e0e10
delete test code for joblogger in SparkContext
2013-06-08 15:45:47 +08:00
Mingfei
362f0f93ac
Merge branch 'master' of https://github.com/mesos/spark
2013-06-08 15:20:13 +08:00
Mingfei
1a4d93c025
modify to pass job annotation by localProperties and use daemon thread to do joblogger's work
2013-06-08 14:23:39 +08:00
Matei Zaharia
b58a29295b
Small formatting and style fixes
2013-06-07 22:51:28 -07:00
Matei Zaharia
c8fc423bc2
Merge pull request #631 from jerryshao/master
...
Fix block manager UI display issue when spark.cleaner.ttl is enabled
2013-06-07 22:43:18 -07:00
Matei Zaharia
c9ca0a4a58
Small code style fix to SchedulingAlgorithm.scala
2013-06-07 22:40:44 -07:00
Matei Zaharia
1ae60bcb36
Merge pull request #634 from xiajunluan/master
...
[Spark-753] Fix ClusterSchedulerSuite unit test failure
2013-06-07 22:39:06 -07:00
Shivaram Venkataraman
ac480fd977
Clean up variables and counters in BlockFetcherIterator
2013-06-06 16:34:27 -07:00
Gavin Li
e179ff8a32
update according to comments
2013-06-05 22:41:05 +00:00
Shivaram Venkataraman
cb2f5046ee
Pass in bufferSize to BufferedOutputStream
2013-06-05 15:09:02 -07:00
Shivaram Venkataraman
c851957fe4
Don't write zero block files with java serializer
2013-06-05 14:28:38 -07:00
Christopher Nguyen
9d35904357
In the current code, when both partitions happen to have zero length, the returned mean will be NaN.
...
Consequently, the result of mean after reducing over all partitions will also be NaN,
which is not correct if there are partitions with non-zero length. This patch fixes this issue.
2013-06-04 22:12:47 -07:00
Matei Zaharia
fff3728552
Merge pull request #640 from pwendell/timeout-update
...
Fixing bug in BlockManager timeout
2013-06-04 16:09:50 -07:00
Patrick Wendell
061fd3ae36
Fixing bug in BlockManager timeout
2013-06-04 19:02:44 -04:00
Matei Zaharia
f420d4f228
Merge pull request #639 from pwendell/timeout-update
...
Bump akka and blockmanager timeouts to 60 seconds
2013-06-04 15:25:58 -07:00
Patrick Wendell
8bd4e12104
Bump akka and blockmanager timeouts to 60 seconds
2013-06-04 18:14:24 -04:00
Shivaram Venkataraman
96943a1cc0
var to val
2013-06-03 12:29:38 -07:00
Shivaram Venkataraman
cd347f547a
Reuse the file object as it is valid after delete
2013-06-03 12:27:51 -07:00
Shivaram Venkataraman
a058b0acf3
Delete a file for a block if it already exists.
2013-06-03 12:10:00 -07:00
Andrew xia
606bb1b450
Fix schedulingAlgorithm bugs for unit test
2013-06-03 10:29:23 +08:00
Shivaram Venkataraman
038cfc1a9a
Make connect timeout configurable
2013-05-31 23:32:18 -07:00
Shivaram Venkataraman
91aca92249
Another round of Netty fixes.
...
1. Avoid race condition between stop and copier completion
2. Handle socket exceptions by reporting them and filling in a failed
FetchResult
2013-05-31 23:21:38 -07:00
Gavin Li
9f84315c05
enhance pipe to support what we can do in hadoop streaming
2013-06-01 00:26:10 +00:00
Reynold Xin
de1167bf2c
Incorporated Charles' feedback to put rdd metadata removal in BlockManagerMasterActor.
2013-05-31 15:54:57 -07:00
Reynold Xin
ba5e544461
More block manager cleanup.
...
Implemented a removeRdd method in BlockManager, and used that to
implement RDD.unpersist. Previously, unpersist needed to send B Akka
messages, where B = the number of blocks. Now unpersist only needs to send W
Akka messages, where W = the number of workers.
2013-05-31 01:48:16 -07:00
jerryshao
926f41cc52
fix block manager UI display issue when spark.cleaner.ttl is enabled
2013-05-31 09:32:52 +08:00
Reynold Xin
bed1b08169
Do not create symlink for local add file. Instead, copy the file.
...
This prevents Spark from changing the original file's permissions, and
also allows add file to work on non-POSIX operating systems.
2013-05-30 16:21:49 -07:00
Shivaram Venkataraman
3b0cd17343
Merge branch 'master' of git://github.com/mesos/spark
...
Conflicts:
core/src/test/scala/spark/ShuffleSuite.scala
2013-05-30 14:36:24 -07:00
Andrew xia
c3db3ea554
1. Add unit test for local scheduler
...
2. Move localTaskSetManager to a new file
2013-05-30 20:49:40 +08:00
Andrew xia
ecceb101d3
implement FIFO and fair scheduler for spark local mode
2013-05-30 10:43:01 +08:00
Shivaram Venkataraman
19fd6d54c0
Also flush serializer in revertPartialWrites
2013-05-29 17:29:34 -07:00
Shivaram Venkataraman
618c8cae1e
Skip fetching zero-sized blocks in OIO.
...
Also unify splitLocalRemoteBlocks for netty/nio and add a test case
2013-05-29 13:18:54 -07:00
Matei Zaharia
6ed71390d9
Merge pull request #626 from stephenh/remove-add-if-no-port
...
Remove unused addIfNoPort.
2013-05-29 10:14:22 -07:00
Shivaram Venkataraman
b79b10a6d6
Flush serializer to fix zero-size kryo blocks bug.
...
Also convert the local-cluster test case to check for non-zero block sizes
2013-05-29 00:52:55 -07:00
Matei Zaharia
41d230ccb0
Merge pull request #611 from squito/classloader
...
Use default classloaders for akka & deserializing task results
2013-05-28 23:35:24 -07:00
Shivaram Venkataraman
fbc1ab3468
Couple of Netty fixes
...
a. Fix the port number by reading it from the bound channel
b. Fix the shutdown sequence to make sure we actually block on the channel
c. Fix the unit test to use two JVMs.
2013-05-28 16:27:16 -07:00
Stephen Haberman
4fe1fbdd51
Remove unused addIfNoPort.
2013-05-28 16:26:32 -05:00
Matei Zaharia
3db1e17baa
Merge pull request #620 from jerryshao/master
...
Fix CheckpointRDD java.io.FileNotFoundException when calling getPreferredLocations
2013-05-27 21:31:43 -07:00
Matei Zaharia
e8d4b6c296
Merge pull request #529 from xiajunluan/master
...
[SPARK-663]Implement Fair Scheduler in Spark Cluster Scheduler
2013-05-25 21:09:03 -07:00
Reynold Xin
26962c9340
Automatically configure Netty port. This makes unit tests using local-cluster pass.
Previously they were failing because Netty was trying to bind to the same port for all processes.
Pair programmed with @shivaram.
2013-05-24 16:39:33 -07:00
Reynold Xin
6ea085169d
Fixed the bug that shuffle serializer is ignored by the new shuffle block iterators for local blocks. Also added a unit test for that.
2013-05-24 14:08:37 -07:00
jerryshao
bd3ea8f2a6
fix CheckpointRDD getPreferredLocations java.io.FileNotFoundException
2013-05-24 14:26:19 +08:00
Charles Reiss
f350f14084
Use ARRAY_SAMPLE_SIZE constant instead of 100.0
2013-05-21 18:11:33 -07:00
Andrew xia
ecd6d75c6a
fix bug of unit tests
2013-05-21 06:49:23 +08:00
Reynold Xin
5912cc4967
Merge pull request #610 from JoshRosen/spark-747
...
Throw exception if TaskResult exceeds Akka frame size
2013-05-17 19:58:40 -07:00
Reynold Xin
8d78c5f89f
Changed the logging level from info to warning when addJar(null) is called.
2013-05-17 18:51:35 -07:00
Andrew xia
3d4672eaa9
Merge branch 'master' into xiajunluan
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/scheduler/cluster/ClusterScheduler.scala
core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
2013-05-18 07:28:03 +08:00
Andrew xia
d19753b9c7
expose TaskSetManager type to resourceOffer function in ClusterScheduler
2013-05-18 06:45:19 +08:00
Andrew xia
c6e2770bfe
Fix ClusterScheduler bug to avoid allocating tasks to same slave
2013-05-17 05:10:38 +08:00
Mridul Muralidharan
f0881f8d48
Hope this does not turn into a bike shed change
2013-05-17 01:58:50 +05:30
Mridul Muralidharan
feddd2530d
Filter out nulls - prevent NPE
2013-05-16 17:49:14 +05:30
Josh Rosen
b8e46b6074
Abort job if result exceeds Akka frame size; add test.
2013-05-16 01:57:57 -07:00
Matei Zaharia
2f576aba8f
Merge pull request #602 from rxin/shufflemerge
...
Manual merge & cleanup of Shane's Shuffle Performance Optimization
2013-05-15 18:06:24 -07:00
Reynold Xin
203d7b7c14
Merge pull request #593 from squito/driver_ui_link
...
Master UI has link to Application UI
2013-05-15 00:47:20 -07:00
Reynold Xin
f3491cb89b
Merge branch 'master' of github.com:mesos/spark into shufflemerge
...
Conflicts:
core/src/main/scala/spark/storage/BlockManager.scala
core/src/test/scala/spark/DistributedSuite.scala
project/SparkBuild.scala
2013-05-15 00:31:52 -07:00
Reynold Xin
f9d40a5848
Added a comment in JdbcRDD for example usage.
2013-05-14 23:29:57 -07:00
Reynold Xin
81ad2fa331
Merge branch 'jdbc' of github.com:koeninger/spark
...
Conflicts:
project/SparkBuild.scala
2013-05-14 23:12:00 -07:00
Imran Rashid
38d4b97c6d
use thread's classloader when deserializing task results; ClassNotFoundException includes classloader
2013-05-14 22:32:14 -07:00