Commit graph

220 commits

Author SHA1 Message Date
Matei Zaharia f48742683a Made caches dataset-aware so that they won't cyclically evict partitions
from the same dataset.
2012-05-06 20:14:40 -07:00
Matei Zaharia 32a4f4623c Merge pull request #129 from mesos/rxin
Force serialize/deserialize task results in local execution mode.
2012-04-24 16:18:39 -07:00
Reynold Xin 9821cd4d42 Force serialize/deserialize task results in local execution mode. 2012-04-24 14:55:28 -07:00
Antonio 3e48818993 Removed commented-out System.exit call 2012-04-23 11:42:58 -07:00
Antonio 39d99168dc Added exception handling instead of just exiting in LocalScheduler for tasks that throw exceptions 2012-04-20 14:46:43 -07:00
Reynold Xin e601b3b9e5 Added the ability to set environmental variables in piped rdd. 2012-04-17 16:40:56 -07:00
Matei Zaharia 3b745176e0 Bug fix to pluggable closure serialization change 2012-04-12 17:53:02 +00:00
Matei Zaharia 112655f032 Merge pull request #121 from rxin/kryo-closure
Added an option (spark.closure.serializer) to specify the serializer for closures.
2012-04-10 14:21:02 -07:00
Reynold Xin d295ccb43c Added a closureSerializer field in SparkEnv and use it to serialize
tasks.
2012-04-10 13:29:46 -07:00
Reynold Xin 968f75f6af Added an option (spark.closure.serializer) to specify the serializer for
closures. This enables using Kryo as the closure serializer.
2012-04-09 21:59:56 -07:00
Matei Zaharia a633974143 Merge branch 'master' of github.com:mesos/spark 2012-04-08 23:41:25 -07:00
Matei Zaharia d401e1b3e8 Fix a possible deadlock in MesosScheduler 2012-04-08 23:38:49 -07:00
Ankur Dave 7be1c7b331 Report entry dropping in BoundedMemoryCache 2012-04-06 15:49:32 -07:00
Matei Zaharia 816d4e5840 Pass local IP address instead of hostname in spark.master.host. Fixes #117. 2012-04-05 14:53:17 -07:00
Matei Zaharia 335a6036ad Converted some tabs to spaces 2012-04-05 11:58:01 -07:00
Matei Zaharia 8c95a85438 Use Runtime.maxMemory instead of Runtime.totalMemory in
BoundedMemoryCache, in case the JVM was not started with its initial
heap size equaling its maximum one (-Xms == -Xmx).
2012-03-30 13:39:35 -04:00
Reynold Xin 42dcdbcb2f Removed the extra spaces in OrderedRDDFunctions and SortedRDD. 2012-03-29 15:21:57 -07:00
Matei Zaharia c7af538ac1 Some fixes to sorting for when the RDD has fewer elements than the
number of partitions we ask to partition it into. Also, removed a test
that was taking way too long to run.
2012-03-17 13:08:36 -07:00
Matei Zaharia a5e2b6a6bd Merge pull request #112 from cengle/master
Changed HadoopRDD to get key and value containers from the RecordReader instead of through reflection
2012-03-06 13:38:32 -08:00
Matei Zaharia 97eee50825 Fixes a nasty bug that could happen when tasks fail, because calling
wait() with a timeout of 0 on a Java object means "wait forever".
2012-03-01 13:43:17 -08:00
Cliff Engle dd68cb6099 Get key and value container from RecordReader 2012-02-29 16:33:23 -08:00
Matei Zaharia 1e10df0a46 Merge pull request #111 from alupher/master
Adding sorting to RDDs
2012-02-24 15:50:14 -08:00
Matei Zaharia aa04f87cd2 Added support for parallel execution of jobs in DAGScheduler. 2012-02-19 22:50:23 -08:00
Antonio 620798161b Added fixes to sorting 2012-02-13 00:07:39 -08:00
Matei Zaharia 2587ce1690 Fixed a deadlock that occurred with MesosScheduler due to an earlier
synchronization change
2012-02-11 21:22:45 -08:00
Antonio e93f622665 Added sorting by key for pair RDDs 2012-02-11 00:56:28 -08:00
Matei Zaharia 98f008b721 Formatting fixes 2012-02-10 10:52:03 -08:00
Matei Zaharia 7660a8b12f Merge branch 'formatting'
Conflicts:
	core/src/main/scala/spark/DAGScheduler.scala
	core/src/main/scala/spark/SimpleShuffleFetcher.scala
	core/src/main/scala/spark/SparkContext.scala
2012-02-10 10:42:14 -08:00
haoyuan 194c42ab79 Code format. 2012-02-10 08:19:53 -08:00
Matei Zaharia 8f5ed51234 Delete Spark's temporary directories when the JVM exits. 2012-02-09 22:58:24 -08:00
Matei Zaharia c0a0df3285 Made the default cache BoundedMemoryCache, and reduced its default size 2012-02-09 22:32:02 -08:00
Matei Zaharia 0e93891d3d Replaced LocalFileShuffle with a non-singleton ShuffleManager class
and made DAGScheduler automatically set SparkEnv.
2012-02-09 22:14:56 -08:00
haoyuan 445e0bb1b5 Format the code a bit more. 2012-02-09 15:50:26 -08:00
haoyuan 651932e703 Format the code as coding style agreed by Matei/TD/Haoyuan 2012-02-09 13:26:23 -08:00
Matei Zaharia e02dc83a5b IO optimizations 2012-02-06 20:40:39 -08:00
Matei Zaharia c40e766368 Use java.util.HashMap in shuffles 2012-02-06 19:20:25 -08:00
Matei Zaharia b267175ab5 Synchronization fix in case SparkContext is used from multiple threads. 2012-02-06 14:28:18 -08:00
Hiral Patel b47952342e Add register immutable map to kryo serializer 2012-01-26 15:24:20 -08:00
Matei Zaharia fabcc82528 Merge pull request #103 from edisontung/master
Made improvements to takeSample. Also changed SparkLocalKMeans to SparkKMeans
2012-01-13 19:20:03 -08:00
Matei Zaharia fd5581a0d3 Fixed a failure recovery bug and added some tests for fault recovery. 2012-01-13 19:17:27 -08:00
Edison Tung 1ecc221f84 Fixed bugs
I've fixed the bugs detailed in the diff. One of the bugs was already
fixed on the local file (forgot to commit).
2012-01-09 11:59:52 -08:00
Matei Zaharia e269f6f7ea Register RDDs with the MapOutputTracker even if they have no partitions.
Fixes #105.
2012-01-05 15:59:20 -05:00
Matei Zaharia 3034fc0d91 Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16' 2011-12-14 18:19:43 +01:00
Matei Zaharia 6a650cbbdf Make Spark port default to 7077 so that it's not an ephemeral port that might be taken 2011-12-14 18:18:22 +01:00
Matei Zaharia 735843a049 Merge remote-tracking branch 'origin/charles-newhadoop' 2011-12-02 21:59:30 -08:00
Charles Reiss 66f05f383e Add new Hadoop API reading support. 2011-12-01 14:02:10 -08:00
Charles Reiss 02d43e6986 Add new Hadoop API writing support. 2011-12-01 14:01:28 -08:00
Edison Tung 42f8847a21 Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD 2011-12-01 13:43:25 -08:00
Edison Tung de01b6deaa Fixed bug in RDD
Math.min takes 2 args, not 1. This was not committed earlier for some
reason
2011-12-01 13:34:37 -08:00
Matei Zaharia 22b8fcf632 Added fold() and aggregate() operations that reuse an object to
merge results into rather than requiring a new object allocation
for each element merged. Fixes #95.
2011-11-30 11:37:47 -08:00