ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Denny	22dde6e020	Start a standalone cluster locally.	2012-09-04 20:56:30 -07:00
Matei Zaharia	a842c63044	Minor formatting fixes	2012-09-03 16:24:00 -07:00
Harvey	3076b038f4	Start fetching a remote block when a received remote block has been passed to the reduce function	2012-09-01 12:01:35 -07:00
Matei Zaharia	389fb4cc54	End runJob() with a SparkException when a task fails too many times in one of the cluster schedulers.	2012-08-31 17:47:43 -07:00
Matei Zaharia	a480dec6b2	Deserialize multi-get results in the caller's thread. This fixes an issue with shared buffers in the KryoSerializer.	2012-08-30 20:01:06 -07:00
Reynold Xin	a8a2a08a1a	Added a test for testing map-side combine on/off switch.	2012-08-30 12:34:28 -07:00
Reynold Xin	5945bcdcc5	Added a new flag in Aggregator to indicate applying map side combiners.	2012-08-29 23:32:08 -07:00
Reynold Xin	c68e820b2a	Merge branch 'dev' of github.com:mesos/spark into dev	2012-08-29 23:01:19 -07:00
Reynold Xin	940869dfda	Disable running combiners on map tasks when mergeCombiners function is not specified by the user.	2012-08-29 23:00:02 -07:00
Matei Zaharia	bf2e9cb08e	Fault tolerance and block store fixes discovered through streaming tests.	2012-08-27 23:07:50 -07:00
Reynold Xin	3a6a95dc24	Removed the deserialization cache for ShuffleMapTask because it was causing concurrency problems (some variables in Shark get set to null). The cost of task deserialization on slaves is trivial compared with the execution time of the task anyway.	2012-08-27 22:33:15 -07:00
Matei Zaharia	2c16ae36d7	Set log level in tests to WARN	2012-08-23 20:38:14 -07:00
Matei Zaharia	deedb9e7b7	Fix further issues with tests and broadcast. The broadcast fix is to store values as MEMORY_ONLY_DESER instead of MEMORY_ONLY, which will save substantial time on serialization.	2012-08-23 20:31:49 -07:00
Matei Zaharia	59b831b9d1	Fixed test failures due to broadcast not stopping correctly	2012-08-23 19:59:55 -07:00
Matei Zaharia	7310a6f499	Merge pull request #147 from mosharaf/dev Broadcast refactoring/cleaning up	2012-08-23 19:38:28 -07:00
Matei Zaharia	25a6a39e6d	Added other SparkContext constructors to JavaSparkContext	2012-08-19 18:59:16 -07:00
Shivaram Venkataraman	0f4fbb057b	Change BlockManagerSuite test cases to use a deterministic size estimator and update the results to match the new estimates	2012-08-13 13:32:23 -07:00
Shivaram Venkataraman	22ba3a3f77	Add test-cases for 32-bit and no-compressed oops scenarios.	2012-08-13 13:32:10 -07:00
Shivaram Venkataraman	1f68c4b03b	Update test cases to match the new size estimates. Uses 64-bit and compressed oops setting to get deterministic results	2012-08-13 13:31:54 -07:00
Shivaram Venkataraman	1ea269110c	Move object size and pointer size initialization into a function to enable unit-testing	2012-08-13 13:31:45 -07:00
Shivaram Venkataraman	44661df9cc	If spark.test.useCompressedOops is set, use that to infer compressed oops setting. This is useful to get a deterministic test case	2012-08-13 13:31:39 -07:00
Shivaram Venkataraman	0dd8fe73ba	Use HotSpotDiagnosticMXBean to get if CompressedOops are in use or not	2012-08-13 13:31:29 -07:00
Shivaram Venkataraman	80104ce1da	Add link to Java wiki which specifies what changes with compressed oops	2012-08-13 13:31:21 -07:00
Shivaram Venkataraman	00ab5490b3	Changes to make size estimator more accurate. Fixes object size, pointer size according to architecture and also aligns objects and arrays when computing instance sizes. Verified using Eclipse Memory Analysis Tool (MAT)	2012-08-13 13:31:11 -07:00
Matei Zaharia	6ae3c375a9	Renamed apply() to call() in Java API and allowed it to throw Exceptions	2012-08-12 23:10:19 +02:00
Matei Zaharia	0141879c40	Use Promises instead of having a Future wait on a thread in ConnectionManager.	2012-08-12 22:16:32 +02:00
Matei Zaharia	845a870242	Return remotely fetched blocks in a pipelined fashion from BlockManager	2012-08-12 20:01:38 +02:00
Matei Zaharia	e17ed9a21d	Switch to Akka futures in connection manager. It's still not good because each Future ends up waiting on a lock, but it seems to work better than Scala Actors, and more importantly it allows us to use onComplete and other listeners on futures.	2012-08-12 19:40:37 +02:00
Matei Zaharia	ad8a7612a4	Changed multi-get method in BlockManager to return an iterator	2012-08-12 19:18:01 +02:00
Matei Zaharia	3c94e5c188	Merge pull request #168 from shivaram/dev Use JavaConversion to get a scala iterator	2012-08-10 00:57:33 -07:00
Matei Zaharia	e463e7a333	Merge pull request #167 from JoshRosen/piped-rdd-fixes Detect non-zero exit status from PipedRDD process	2012-08-10 00:56:42 -07:00
Josh Rosen	59c22fb444	Print exit status in PipedRDD failure exception.	2012-08-10 00:33:56 -07:00
Shivaram Venkataraman	1803cce692	Use an implicit conversion to get the scala iterator	2012-08-08 14:31:04 -07:00
Shivaram Venkataraman	674fcf56bf	Use JavaConversion to get a scala iterator	2012-08-08 14:10:23 -07:00
Shivaram Venkataraman	f4aaec7a48	Avoid a copy in ShuffleMapTask by creating an iterator that will be used by the block manager.	2012-08-08 00:47:02 -07:00
Mosharaf Chowdhury	d821dd3ccc	BroadcastManager is a class now (replaced Broadcast object)	2012-08-05 01:10:51 -07:00
Mosharaf Chowdhury	b4804119f9	Merge remote-tracking branch 'upstream/dev' into dev	2012-08-04 20:42:12 -07:00
Matei Zaharia	88b016db2a	Merge pull request #160 from dennybritz/clusterscripts Standalone cluster scripts	2012-08-04 17:45:20 -07:00
Mosharaf Chowdhury	1b0534af8f	Merge branch 'dev' into bc-bm	2012-08-04 00:30:08 -07:00
Mosharaf Chowdhury	d11b457e67	Merge remote-tracking branch 'upstream/dev' into dev	2012-08-04 00:28:10 -07:00
Mosharaf Chowdhury	24b7eb872c	Bug fixed. Broadcast now works with BlockManager.	2012-08-04 00:27:28 -07:00
Shivaram Venkataraman	ce3444d2cb	Fix testcheckpoint to reuse spark context defined in the class	2012-08-03 18:52:26 -07:00
Matei Zaharia	62898b631f	Made range partition balance tests more aggressive. This is because we pull out such a large sample (10x the number of partitions) that we should expect pretty good balance. The tests are also deterministic so there's no worry about them failing irreproducibly.	2012-08-03 16:46:48 -04:00
Matei Zaharia	6601a6212b	Added a unit test for cross-partition balancing in sort, and changes to RangePartitioner to make it pass. It turns out that the first partition was always kind of small due to how we picked partition boundaries.	2012-08-03 16:40:45 -04:00
Harvey	1170de3757	Fix for partitioning when sorting in descending order	2012-08-03 16:40:38 -04:00
Paul Cavallaro	d05c0f97ca	Logging Throwables in Info and Debug Logging Throwables in logInfo and logDebug instead of swallowing them. Conflicts: core/src/main/scala/spark/Logging.scala	2012-08-03 16:40:21 -04:00
Denny	0008994044	merged dev branch	2012-08-02 16:00:33 -07:00
Denny	53008c2d8a	Settings variables and bugfix for stop script.	2012-08-02 15:59:39 -07:00
Matei Zaharia	71a958b0b7	Merge branch 'dev' of github.com:mesos/spark into dev Conflicts: project/SparkBuild.scala	2012-08-02 17:23:13 -04:00
Denny	7312a5c30f	Use spray's implicit Marshaller for Futures.	2012-08-02 14:11:27 -07:00
Denny	ba7e30fb5e	Mostly stylistic changes.	2012-08-02 13:55:09 -07:00
Shivaram Venkataraman	1a07bb9ba4	Avoid an extra partition copy by passing an iterator to blockManager.put	2012-08-02 12:22:33 -07:00
Shivaram Venkataraman	6790908b11	Use maxMemory to better estimate memory available for BlockManager cache	2012-08-02 12:05:05 -07:00
Denny	863c31b7c1	Moved resources into static folder	2012-08-02 09:48:36 -07:00
Denny	0ee44c225e	Spark standalone mode cluster scripts. Heavily inspired by Hadoop cluster scripts ;-)	2012-08-01 20:38:52 -07:00
Denny	6c670c37dd	Webui improvements.	2012-08-01 19:47:57 -07:00
Denny	1b29e90a79	merge dev branch	2012-08-01 14:06:09 -07:00
Denny	011220fa55	Compact job page.	2012-08-01 11:26:45 -07:00
Denny	7a295fee96	Spark WebUI Implementation.	2012-08-01 11:01:09 -07:00
Mosharaf Chowdhury	f23395e8c5	Merge remote-tracking branch 'upstream/dev' into dev	2012-07-30 19:39:49 -07:00
Matei Zaharia	3ee2530c0c	Merge branch 'block-manager-fix' into dev	2012-07-30 13:58:46 -07:00
Matei Zaharia	400221f851	Merge branch 'dev' of git://github.com/tdas/spark into dev	2012-07-30 13:54:57 -07:00
Matei Zaharia	ed1b0f8388	Made BlockManagerMaster no longer be a singleton. Also cleaned up a few formatting things throughout block manager code.	2012-07-30 13:53:47 -07:00
Matei Zaharia	f471c82558	Various reorganization and formatting fixes	2012-07-30 11:24:01 -07:00
Mosharaf Chowdhury	5932a87cac	Merge remote-tracking branch 'upstream/dev' into dev	2012-07-29 18:20:45 -07:00
Matei Zaharia	d7f089323a	Fixed AccumulatorSuite to clean up SparkContext with BeforeAndAfter	2012-07-28 20:25:42 -07:00
Imran Rashid	f7149c5e46	tasks cannot access value of accumulator	2012-07-28 20:16:17 -07:00
Imran Rashid	244cbbe33a	one more minor cleanup to scaladoc	2012-07-28 20:16:10 -07:00
Imran Rashid	3b392c67db	fix up scaladoc, naming of type parameters	2012-07-28 20:16:01 -07:00
Imran Rashid	f1face1ea9	rename addToAccum to addAccumulator	2012-07-28 20:16:01 -07:00
Imran Rashid	2d666b9d76	add some functionality to Vector, delete copy in AccumulatorSuite	2012-07-28 20:15:51 -07:00
Imran Rashid	edc6972f8e	move Vector class into core and spark.util package	2012-07-28 20:15:42 -07:00
Imran Rashid	83659af11c	Accumulator now inherits from Accumulable, which simplifies a bunch of other things (e.g., no +:=) Conflicts: core/src/main/scala/spark/Accumulators.scala	2012-07-28 20:13:51 -07:00
Imran Rashid	79d58ed20a	improve scaladoc	2012-07-28 20:12:41 -07:00
Imran Rashid	ae07f3864c	add Accumulatable, add corresponding docs & tests for accumulators	2012-07-28 20:12:41 -07:00
Matei Zaharia	f6f917bd00	Add a sleep to prevent a failing test. The BlockManager's put seems to be slightly asynchronous, which can cause it to fail this test by not removing stuff from the cache before we put the next value. We should probably change the semantics of put() in this case but it's hard right now. It will also be hard for asynchronously replicated puts.	2012-07-27 16:59:36 -07:00
Matei Zaharia	c0c78d2119	Renamed test more descriptively	2012-07-27 16:28:18 -07:00
Matei Zaharia	dee8ff1b9d	Added a second version of union() without varargs.	2012-07-27 16:27:52 -07:00
Tathagata Das	cf429699e1	Updated the new checkpoint RDD to remember partitioning of the original RDD.	2012-07-27 23:16:37 +00:00
Mosharaf Chowdhury	b5be936d7c	Broadcasts using BlockManager instead of BoundedMemoryCache	2012-07-27 15:38:46 -07:00
Mosharaf Chowdhury	1f19fbb8db	Merge remote-tracking branch 'upstream/dev' into dev Conflicts: core/src/main/scala/spark/broadcast/Broadcast.scala	2012-07-27 15:18:23 -07:00
Matei Zaharia	b51d733a57	Fixed Java union methods having same erasure. Changed union() methods on lists to take a separate "first element" argument in order to differentiate them to the compiler, because Java 7 considered it an error to have them all take Lists parameterized with different types.	2012-07-27 12:23:27 -07:00
Tathagata Das	3e271c3b61	Merge branch 'dev' of github.com:tdas/spark into dev	2012-07-27 12:01:04 -07:00
Tathagata Das	024905f682	Added BlockRDD and a first-cut version of checkpoint() to RDD class.	2012-07-27 12:00:49 -07:00
Tathagata Das	d1eee44a03	Fixed more stuff in BoundedMemoryCache.	2012-07-27 18:33:32 +00:00
Tathagata Das	d1b7f41671	Fixed bug in BoundedMemoryCache.	2012-07-27 09:00:45 -07:00
Tathagata Das	435d129bec	Fixed bugs in block dropping code of MemoryStore and changed synchronized HashMap to ConcurrentHashMap in BlockManager.	2012-07-27 10:02:26 +00:00
Tathagata Das	0426769f89	Modified the block dropping code for better performance.	2012-07-26 20:53:45 -07:00
Matei Zaharia	5c5aa2ff81	Merge pull request #153 from JoshRosen/new-java-api Java API	2012-07-26 17:20:52 -07:00
Josh Rosen	c5e2810dc7	Add persist(), splits(), glom(), and mapPartitions() to Java API.	2012-07-26 12:46:47 -07:00
Josh Rosen	bf61c10072	Detect non-zero exit status from PipedRDD process.	2012-07-26 11:32:59 -07:00
Josh Rosen	6a78e88237	Minor cleanup and optimizations in Java API. - Add override keywords. - Cache RDDs and counts in TC example. - Clean up JavaRDDLike's abstract methods.	2012-07-24 09:47:00 -07:00
Denny	4f4a34c025	Stylistic changes Conflicts: core/src/test/scala/spark/MesosSchedulerSuite.scala	2012-07-23 16:32:20 -07:00
Denny	866e6949df	Always destroy SparkContext in after block for the unit tests. Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2012-07-23 16:29:17 -07:00
Matei Zaharia	600e99728d	Fix a bug where an input path was added to a Hadoop job configuration twice	2012-07-23 16:16:19 -07:00
Josh Rosen	042dcbde33	Add type annotations to Java API methods. Add missing Scala Map to java.util.Map conversions.	2012-07-22 17:35:29 -07:00
Josh Rosen	e23938c3be	Use mapValues() in JavaPairRDD.cogroupResultToJava().	2012-07-22 15:10:01 -07:00
Josh Rosen	01dce3f569	Add Java API Add distinct() method to RDD. Fix bug in DoubleRDDFunctions.	2012-07-18 17:34:29 -07:00
Mosharaf Chowdhury	85cd9979f2	Fix for isLocal	2012-07-13 01:13:14 -07:00
Mosharaf Chowdhury	1c83fd4b66	Merged with Upstream dev	2012-07-13 01:08:28 -07:00

460 commits