Commit graph

462 commits

Author SHA1 Message Date
Matei Zaharia d71a358c46 Fixed a test that was getting extremely lucky before, and increased the
number of samples used for sorting
2012-09-26 00:25:34 -07:00
Matei Zaharia 051785c7e6 Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
  across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Matei Zaharia 4d3339a3ec Merge pull request #217 from rxin/dev
Added a method to RDD to expose the ClassManifest.
2012-09-24 23:52:32 -07:00
Reynold Xin 7a4cd92861 Renamed RDD.manifest to RDD.elementClassManifest 2012-09-24 23:42:33 -07:00
Matei Zaharia 296e24b440 Merge pull request #218 from rnpandya/dev
Scripts to start Spark under windows
2012-09-24 21:10:31 -07:00
Reynold Xin 348bcbca1f Added a method to RDD to expose the ClassManifest. 2012-09-24 16:56:27 -07:00
Ravi Pandya 39215357af Windows command scripts for sbt and run 2012-09-24 15:43:19 -07:00
Matei Zaharia 6eeb379cf8 Fix some test issues 2012-09-24 15:39:58 -07:00
Matei Zaharia f855e4fad2 Merge pull request #208 from rxin/dev
Separated ShuffledRDD into multiple classes.
2012-09-24 12:32:01 -07:00
root 107a5ca879 Make default number of parallel fetches slightly smaller since it doesn't seem to hurt performance much and it will cause slightly less GC. 2012-09-23 06:06:12 +00:00
root e41cab04ca Avoid creating an extra buffer when saving a stream of values as DISK_ONLY 2012-09-23 05:56:44 +00:00
Denny afb7ccc838 HTTP File server fixes. 2012-09-21 10:58:13 -07:00
root 6d28dde370 Rename our toIterator method into asIterator to prevent confusion with the
Scala collection one, which often *copies* a collection.
2012-09-21 06:02:55 +00:00
root a642051ade Fixed a performance bug in BlockManager that was creating garbage when
returning deserialized, in-memory RDDs.
2012-09-21 05:42:21 +00:00
root 8feb5caacd Fixed an issue with ordering of classloader setup that was causing Java deserializer to break 2012-09-21 05:13:19 +00:00
Reynold Xin 6b5980da79 Set a limited number of retry in standalone deploy mode. 2012-09-19 15:41:56 -07:00
Reynold Xin 397d3816e1 Separated ShuffledRDD into multiple classes: RepartitionShuffledRDD,
ShuffledSortedRDD, and ShuffledAggregatedRDD.
2012-09-19 12:31:45 -07:00
Denny ca64d16a2d When a file is downloaded, make it executable. That's necessary for scripts (e.g. in Shark) 2012-09-17 10:08:37 -07:00
Matei Zaharia 840cbcf849 Change default serializer to Java.. it had accidentally become Kryo. 2012-09-13 17:19:26 -07:00
Matei Zaharia b4dfa25c8a Store shuffle map outputs as DISK_ONLY 2012-09-12 16:05:57 -07:00
Matei Zaharia 2d761e3353 Ported performance and FT improvements from latest streaming work 2012-09-12 14:54:40 -07:00
Matei Zaharia 9b4cd1648b Fix bugs with Connection's shutdown callback failing to get its address 2012-09-12 14:54:14 -07:00
Matei Zaharia 9199775d41 Wait for Akka to really shut down in SparkEnv.stop() 2012-09-12 14:50:37 -07:00
Denny 5e4076e3f2 Merge branch 'dev' into feature/fileserver
Conflicts:
	core/src/main/scala/spark/SparkContext.scala
2012-09-11 16:57:17 -07:00
Denny 77873d2c8e Formatting 2012-09-11 16:51:46 -07:00
Denny 24b9b37314 Subclass URLClassLoader instead of using reflection 2012-09-11 16:51:08 -07:00
Denny 31c53e917d Use stageId as index for fileSet caches. 2012-09-11 16:10:45 -07:00
Matei Zaharia 943df48348 Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-11 16:00:37 -07:00
Matei Zaharia 6d7f907e73 Manually merge pull request #175 by Imran Rashid 2012-09-11 16:00:06 -07:00
Reynold Xin 7af7c79ce5 Updated the logError call from the previous commit to conform to
logError API.
2012-09-11 14:32:24 -07:00
Reynold Xin 38b9119c96 Log entire exception (including stack trace) in BlockManagerWorker. 2012-09-11 11:31:35 -07:00
Denny 4d3471dd07 Fix serialization bugs and added local cluster tests 2012-09-10 15:39:58 -07:00
Denny b864c36a30 Dynamically adding jar files and caching fileSets. 2012-09-10 12:49:09 -07:00
Denny f275fb07da General FileServer
A general fileserver for both JARs and regular files.
2012-09-10 12:48:59 -07:00
Matei Zaharia a13780670d Added a unit test for local-cluster mode and simplified some of the code involved in that 2012-09-10 12:48:58 -07:00
Denny f2ac55840c Add shutdown hook to Executor Runner and execute code to shutdown local cluster in Scheduler Backend 2012-09-10 12:48:58 -07:00
Denny 9ead8ab14e Set SPARK_LAUNCH_WITH_SCALA=0 in Executor Runner 2012-09-10 12:48:58 -07:00
Denny 8bb3c73977 Renamed spark-cluster to spark-local. 2012-09-10 12:48:58 -07:00
Denny a367c20f49 Fix wrong counting 2012-09-10 12:48:57 -07:00
Denny 93fe331e6d Delete old DeployUtils. 2012-09-10 12:48:57 -07:00
Denny cf074f9c96 Renamed class. 2012-09-10 12:48:57 -07:00
Denny 3749f94184 Start a standalone cluster locally. 2012-09-10 12:48:57 -07:00
Matei Zaharia 995982b3c9 Added a unit test for local-cluster mode and simplified some of the code involved in that 2012-09-07 17:08:36 -07:00
Matei Zaharia 8d2fcc2832 Merge pull request #189 from dennybritz/feature/localcluster
Simulating a Spark standalone cluster locally
2012-09-07 15:43:43 -07:00
Denny 7ff9311add Add shutdown hook to Executor Runner and execute code to shutdown local cluster in Scheduler Backend 2012-09-07 14:09:12 -07:00
Denny 4e7b264cf7 Set SPARK_LAUNCH_WITH_SCALA=0 in Executor Runner 2012-09-07 11:39:44 -07:00
root c2da64409a Randomize the order of block fetches in getMultiple 2012-09-06 23:16:26 +00:00
Denny 886183e591 Renamed spark-cluster to spark-local. 2012-09-05 17:10:54 -07:00
Reynold Xin c308fbcb79 Removed cache add/remove log messages from CacheTracker.
Added log messages on BlockManagerMaster to reflect block add/remove.
Also did some minor cleanup of storage package code.
2012-09-05 15:59:48 -07:00
Denny babbca0a2f Fix wrong counting 2012-09-04 22:04:18 -07:00