Josh Rosen
6a78e88237
Minor cleanup and optimizations in Java API.
...
- Add override keywords.
- Cache RDDs and counts in TC example.
- Clean up JavaRDDLike's abstract methods.
2012-07-24 09:47:00 -07:00
Denny
4f4a34c025
Stylistic changes
...
Conflicts:
core/src/test/scala/spark/MesosSchedulerSuite.scala
2012-07-23 16:32:20 -07:00
Matei Zaharia
600e99728d
Fix a bug where an input path was added to a Hadoop job configuration twice
2012-07-23 16:16:19 -07:00
Josh Rosen
042dcbde33
Add type annotations to Java API methods.
...
Add missing Scala Map to java.util.Map conversions.
2012-07-22 17:35:29 -07:00
Josh Rosen
e23938c3be
Use mapValues() in JavaPairRDD.cogroupResultToJava().
2012-07-22 15:10:01 -07:00
Josh Rosen
01dce3f569
Add Java API
...
Add distinct() method to RDD.
Fix bug in DoubleRDDFunctions.
2012-07-18 17:34:29 -07:00
Matei Zaharia
628bb5ca7f
Allow null keys in Spark's reduce and group by
2012-07-12 18:36:02 -07:00
Matei Zaharia
e2a67a8024
Fixes to coarse-grained Mesos scheduler in dealing with failed nodes
2012-07-12 18:21:52 -07:00
Matei Zaharia
be622cf867
Formatting
2012-07-11 17:31:44 -07:00
Matei Zaharia
e8ae77df24
Added more methods for loading/saving with new Hadoop API
2012-07-11 17:31:33 -07:00
Matei Zaharia
0a47284003
More work to allow Spark to run on the standalone deploy cluster.
2012-07-08 14:00:04 -07:00
Matei Zaharia
1aa63f775b
Added back coarse-grained Mesos scheduler based on StandaloneScheduler.
2012-07-08 10:52:13 -07:00
Matei Zaharia
c5cc10cda3
More work on standalone scheduler
2012-07-06 20:17:44 -07:00
Matei Zaharia
909b325243
Further refactoring, and start of a standalone scheduler backend
2012-07-06 17:56:44 -07:00
Matei Zaharia
4e2fe0bdaf
Miscellaneous bug fixes
2012-07-06 16:33:40 -07:00
Matei Zaharia
e72afdb817
Some refactoring to make cluster scheduler pluggable.
2012-07-06 15:23:26 -07:00
Matei Zaharia
5d1a887bed
Further updates to run processes on cluster.
2012-07-01 17:13:31 -07:00
Matei Zaharia
51c46eaca0
More work on standalone deploy system.
2012-07-01 01:05:59 -07:00
Matei Zaharia
a6eb9fda61
Detect connection and disconnection of slaves
2012-06-30 17:46:56 -07:00
Matei Zaharia
408b5a1332
More work on deploy code (adding Worker class)
2012-06-30 16:45:57 -07:00
Matei Zaharia
2fb6e7d71e
Initial framework to get a master and web UI up.
2012-06-30 14:45:55 -07:00
Matei Zaharia
c53670b9bf
Various code style fixes, mostly from IntelliJ IDEA
2012-06-29 18:47:12 -07:00
Matei Zaharia
c6be4ffbf9
Fixes to CoarseMesosScheduler
2012-06-29 16:18:51 -07:00
Matei Zaharia
3a58efa5a5
Allow binding to a free port and change Akka logging to use SLF4J. Also
...
fixes various bugs in the previous code when running on Mesos.
2012-06-29 16:02:21 -07:00
Matei Zaharia
3920189932
Upgraded to Akka 2 and fixed test execution (which was still parallel
...
across projects).
2012-06-28 23:51:28 -07:00
root
6ad3e1f1b4
Various fixes when running on Mesos
2012-06-20 06:48:26 +00:00
Tathagata Das
40536e3668
Fixed nasty corner case bug in ByteBufferInputStream. Could not add a test case for this as I could not figure out how to deterministically reproduce the bug in a short testcase.
2012-06-17 13:28:41 -07:00
Matei Zaharia
2893b30550
Various fixes to get unit tests running. In particular, shut down
...
ConnectionManager and DAGScheduler properly, plus a fix to
LocalScheduler that was not merged in from 0.5 and was actually caught
by one of the tests.
2012-06-17 00:28:45 -07:00
Matei Zaharia
b3eeac55b8
Fixed HttpBroadcast to work with this branch's Serializer.
2012-06-15 23:54:38 -07:00
Matei Zaharia
f58da6164e
Merge branch 'master' into dev
2012-06-15 23:47:11 -07:00
Tathagata Das
5f54bdf98b
Added shutdown for akka to SparkContext.stop(). Helps a little, but many testsuites still fail.
2012-06-13 20:49:00 -04:00
Tathagata Das
c6156da9e2
Multiple bug fixes to pass the testsuites ShuffleSuite and BlockManagerSuite.
2012-06-13 16:26:49 -04:00
Matei Zaharia
879bc0bece
Merge branch 'master' into mesos-0.9
2012-06-09 16:24:16 -07:00
Matei Zaharia
4b05798c06
Further bug fix to HttpBroadcast
2012-06-09 16:24:03 -07:00
Matei Zaharia
587a16a7ef
Merge branch 'master' into mesos-0.9
2012-06-09 16:17:07 -07:00
Matei Zaharia
8ed662862e
Bug fix to HttpBroadcast
2012-06-09 16:16:55 -07:00
Matei Zaharia
2fd9f994ae
Merge branch 'master' into mesos-0.9
2012-06-09 15:58:35 -07:00
Matei Zaharia
e75b1b5cb4
Change the default broadcast implementation to a simple HTTP-based
...
broadcast. Fixes #139.
2012-06-09 15:58:07 -07:00
Matei Zaharia
a96558caa3
Performance improvements to shuffle operations: in particular, preserve
...
RDD partitioning in more cases where it's possible, and use iterators
instead of materializing collections when doing joins.
2012-06-09 14:44:18 -07:00
Matei Zaharia
63051dd2bc
Merge in engine improvements from the Spark Streaming project, developed
...
jointly with Tathagata Das and Haoyuan Li. This commit imports the changes
and ports them to Mesos 0.9, but does not yet pass unit tests due to
various classes not supporting a graceful stop() yet.
2012-06-07 12:45:38 -07:00
Matei Zaharia
7e1c97fc4b
Merge branch 'master' into mesos-0.9
2012-06-06 16:48:59 -07:00
Matei Zaharia
048276799a
Commit task outputs to Hadoop-supported storage systems in parallel on the
...
cluster instead of on the master. Fixes #110.
2012-06-06 16:46:53 -07:00
Matei Zaharia
6888bc7191
Merge branch 'master' into mesos-0.9
2012-06-06 16:14:19 -07:00
Matei Zaharia
6ae2746d1e
Handle arrays that contain the same element many times better in
...
SizeEstimator. Also added a test for SizeEstimator. Fixes #136.
2012-06-06 16:13:02 -07:00
Matei Zaharia
dbc3c86ae3
Merge branch 'master' into mesos-0.9
...
Conflicts:
core/src/main/scala/spark/Executor.scala
2012-06-03 17:44:04 -07:00
Matei Zaharia
e141f644ca
Merge pull request #132 from Benky/rb-first-iteration
...
Little refactoring and unit tests for CacheTrackerActor
2012-05-26 13:15:06 -07:00
Richard Benkovsky
ae64920337
MesosScheduler refactoring
2012-05-22 11:04:54 +02:00
Richard Benkovsky
3a1bcd4028
Added tests for CacheTrackerActor
2012-05-22 11:04:54 +02:00
Richard Benkovsky
8f2f736d53
Little refactoring
2012-05-22 11:04:54 +02:00
Richard Benkovsky
f162fc2beb
Formatting fixed
2012-05-22 09:45:38 +02:00
Richard Benkovsky
565245871f
BoundedMemoryCache.put fails when estimated size of 'value' is larger than cache capacity
2012-05-20 22:13:35 +02:00
Richard Benkovsky
822a4be37d
Utils.memoryBytesToString fixed
2012-05-19 15:13:20 +02:00
Reynold Xin
d0c6e9f639
Made some RDD dependencies transient to reduce the amount of data needed
...
to be serialized in closure serialization. This can significantly reduce
the task setup time in Shark when the query involves a large number of
(Hive) partitions.
2012-05-16 14:16:55 -07:00
Reynold Xin
16461e2eda
Updated Cache's put method to use a case class for response. Previously
...
it was pretty ugly that put() should return -1 for failures.
2012-05-15 00:31:52 -07:00
Reynold Xin
019e48833f
Added the capacity to report cache usage status back to the cache
...
tracker. This is essential for building a dashboard to see the status of
caches on all slaves.
2012-05-14 18:39:04 -07:00
Matei Zaharia
f48742683a
Made caches dataset-aware so that they won't cyclically evict partitions
...
from the same dataset.
2012-05-06 20:14:40 -07:00
Matei Zaharia
bd2ab635a7
Fixed the way the JAR server is created after finding issue at Twitter
2012-05-05 20:05:15 -07:00
Matei Zaharia
32a4f4623c
Merge pull request #129 from mesos/rxin
...
Force serialize/deserialize task results in local execution mode.
2012-04-24 16:18:39 -07:00
Reynold Xin
9821cd4d42
Force serialize/deserialize task results in local execution mode.
2012-04-24 14:55:28 -07:00
Antonio
3e48818993
Removed commented-out System.exit call
2012-04-23 11:42:58 -07:00
Antonio
39d99168dc
Added exception handling instead of just exiting in LocalScheduler for tasks that throw exceptions
2012-04-20 14:46:43 -07:00
Reynold Xin
e601b3b9e5
Added the ability to set environment variables in piped RDD.
2012-04-17 16:40:56 -07:00
Matei Zaharia
3b745176e0
Bug fix to pluggable closure serialization change
2012-04-12 17:53:02 +00:00
Matei Zaharia
112655f032
Merge pull request #121 from rxin/kryo-closure
...
Added an option (spark.closure.serializer) to specify the serializer for closures.
2012-04-10 14:21:02 -07:00
Reynold Xin
d295ccb43c
Added a closureSerializer field in SparkEnv and use it to serialize
...
tasks.
2012-04-10 13:29:46 -07:00
Reynold Xin
968f75f6af
Added an option (spark.closure.serializer) to specify the serializer for
...
closures. This enables using Kryo as the closure serializer.
2012-04-09 21:59:56 -07:00
Matei Zaharia
a69c0738d1
Merge branch 'master' into mesos-0.9
2012-04-08 23:41:36 -07:00
Matei Zaharia
a633974143
Merge branch 'master' of github.com:mesos/spark
2012-04-08 23:41:25 -07:00
Matei Zaharia
0229d5390f
Merge branch 'master' into mesos-0.9
2012-04-08 23:39:37 -07:00
Matei Zaharia
d401e1b3e8
Fix a possible deadlock in MesosScheduler
2012-04-08 23:38:49 -07:00
Ankur Dave
7be1c7b331
Report entry dropping in BoundedMemoryCache
2012-04-06 15:49:32 -07:00
Matei Zaharia
a8bb324ed9
Merge branch 'master' into mesos-0.9
2012-04-05 14:53:22 -07:00
Matei Zaharia
816d4e5840
Pass local IP address instead of hostname in spark.master.host. Fixes #117.
2012-04-05 14:53:17 -07:00
Matei Zaharia
335a6036ad
Converted some tabs to spaces
2012-04-05 11:58:01 -07:00
Matei Zaharia
8c95a85438
Use Runtime.maxMemory instead of Runtime.totalMemory in
...
BoundedMemoryCache, in case the JVM was not started with its initial
heap size equaling its maximum one (-Xms == -Xmx).
2012-03-30 13:39:35 -04:00
Matei Zaharia
03d5b3b48d
Use Runtime.maxMemory instead of Runtime.totalMemory in
...
BoundedMemoryCache, in case the JVM was not started with its initial
heap size equaling its maximum one (-Xms == -Xmx).
2012-03-30 13:38:19 -04:00
Matei Zaharia
dfa3b6b544
Fixes to work with the very latest Mesos 0.9 API
2012-03-29 22:12:35 -04:00
Matei Zaharia
4d52cc6738
Merge branch 'master' into mesos-0.9
2012-03-29 21:29:39 -04:00
Reynold Xin
42dcdbcb2f
Removed the extra spaces in OrderedRDDFunctions and SortedRDD.
2012-03-29 15:21:57 -07:00
Matei Zaharia
08cda89e8a
Further fixes to how Mesos is found and used
2012-03-17 13:39:14 -07:00
Matei Zaharia
3c3fdf6eca
Merge branch 'master' into mesos-0.9
2012-03-17 13:09:21 -07:00
Matei Zaharia
c7af538ac1
Some fixes to sorting for when the RDD has fewer elements than the
...
number of partitions we ask to partition it into. Also, removed a test
that was taking way too long to run.
2012-03-17 13:08:36 -07:00
Matei Zaharia
a099a63a8a
Initial work to make Spark compile with Mesos 0.9 and Hadoop 1.0
2012-03-17 12:31:34 -07:00
Matei Zaharia
a5e2b6a6bd
Merge pull request #112 from cengle/master
...
Changed HadoopRDD to get key and value containers from the RecordReader instead of through reflection
2012-03-06 13:38:32 -08:00
Matei Zaharia
97eee50825
Fixes a nasty bug that could happen when tasks fail, because calling
...
wait() with a timeout of 0 on a Java object means "wait forever".
2012-03-01 13:43:17 -08:00
Cliff Engle
dd68cb6099
Get key and value container from RecordReader
2012-02-29 16:33:23 -08:00
Matei Zaharia
1e10df0a46
Merge pull request #111 from alupher/master
...
Adding sorting to RDDs
2012-02-24 15:50:14 -08:00
Matei Zaharia
aa04f87cd2
Added support for parallel execution of jobs in DAGScheduler.
2012-02-19 22:50:23 -08:00
Antonio
620798161b
Added fixes to sorting
2012-02-13 00:07:39 -08:00
Matei Zaharia
2587ce1690
Fixed a deadlock that occurred with MesosScheduler due to an earlier
...
synchronization change
2012-02-11 21:22:45 -08:00
Antonio
e93f622665
Added sorting by key for pair RDDs
2012-02-11 00:56:28 -08:00
Matei Zaharia
98f008b721
Formatting fixes
2012-02-10 10:52:03 -08:00
Matei Zaharia
7660a8b12f
Merge branch 'formatting'
...
Conflicts:
core/src/main/scala/spark/DAGScheduler.scala
core/src/main/scala/spark/SimpleShuffleFetcher.scala
core/src/main/scala/spark/SparkContext.scala
2012-02-10 10:42:14 -08:00
haoyuan
194c42ab79
Code format.
2012-02-10 08:19:53 -08:00
Matei Zaharia
8f5ed51234
Delete Spark's temporary directories when the JVM exits.
2012-02-09 22:58:24 -08:00
Matei Zaharia
c0a0df3285
Made the default cache BoundedMemoryCache, and reduced its default size
2012-02-09 22:32:02 -08:00
Matei Zaharia
0e93891d3d
Replaced LocalFileShuffle with a non-singleton ShuffleManager class
...
and made DAGScheduler automatically set SparkEnv.
2012-02-09 22:14:56 -08:00
haoyuan
445e0bb1b5
Format the code a bit more.
2012-02-09 15:50:26 -08:00
haoyuan
651932e703
Format the code per the coding style agreed on by Matei/TD/Haoyuan
2012-02-09 13:26:23 -08:00
Matei Zaharia
e02dc83a5b
IO optimizations
2012-02-06 20:40:39 -08:00