ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
jerryshao	e4ff544a8d	Clean StageToInfos periodically when spark.cleaner.ttl is enabled	2013-07-05 10:34:45 +08:00
Lian Cheng	c0c3155c3c	Bug fix: SPARK-789 https://spark-project.atlassian.net/browse/SPARK-789	2013-07-05 00:54:10 +08:00
Holden Karau	0f06d6217d	s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning	2013-07-04 01:05:39 -07:00
Gavin Li	94238aae57	fix dependencies	2013-07-03 18:08:38 +00:00
Gavin Li	96130c30d9	add compression codec trait and snappy compression	2013-07-03 05:49:04 +00:00
Y.CORP.YAHOO.COM\tgraves	923cf92900	Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of JAVA_HOME environment variable conditional on if it's set, and created addCredentials in each of the SparkHadoopUtil classes to only add the credentials when the profile is hadoop2-yarn.	2013-07-02 21:18:59 -05:00
Patrick Wendell	39e2325675	Removing dead code	2013-07-02 16:28:40 -07:00
Patrick Wendell	8ca1cc1786	Adding truncation for log files	2013-07-02 16:10:50 -07:00
Patrick Wendell	9a42d04efa	Throw exception for missing resource	2013-07-01 14:43:13 -07:00
Patrick Wendell	1025d7d1ef	Package refactoring	2013-07-01 14:40:53 -07:00
Patrick Wendell	30b9034241	Fixing bug where logs aren't shown	2013-07-01 13:48:01 -07:00
Patrick Wendell	8688689387	Various formatting changes	2013-07-01 13:40:12 -07:00
Patrick Wendell	735c951a09	Adding test script	2013-07-01 09:33:22 -07:00
Patrick Wendell	5de326db7d	Print exception message	2013-07-01 09:19:45 -07:00
root	ec31e68d5d	Fixed PySpark perf regression by not using socket.makefile(), and improved debuggability by letting "print" statements show up in the executor's stderr Conflicts: core/src/main/scala/spark/api/python/PythonRDD.scala	2013-07-01 06:26:31 +00:00
root	3296d132b6	Fix performance bug with new Python code not using buffered streams	2013-07-01 06:25:43 +00:00
Matei Zaharia	03d0b858c8	Made use of spark.executor.memory setting consistent and documented it Conflicts: core/src/main/scala/spark/SparkContext.scala	2013-06-30 15:46:46 -07:00
Patrick Wendell	e721ff7e5a	Allowing details for failed stages	2013-06-29 11:26:30 -07:00
Patrick Wendell	473961d82e	Styling for progress bar	2013-06-29 08:38:04 -07:00
Patrick Wendell	249f0e54ba	Minor changes from Matei's review	2013-06-28 13:25:26 -07:00
Patrick Wendell	62c2c6b856	Forcing Jetty to run as daemon	2013-06-27 21:47:22 -07:00
Patrick Wendell	a55190d314	Adding better tabs for UI headers.	2013-06-27 19:14:51 -07:00
Patrick Wendell	362d996c81	Handful of changes based on matei's review - Avoid exception when no tasks have finished for a stage - Adding DOCTYPE so css renders properly - Adding progress slider	2013-06-27 19:14:28 -07:00
Patrick Wendell	92a4c2a5f6	Fixing bug in local scheduler time recording	2013-06-27 12:33:06 -07:00
Stephen Haberman	d7011632d1	Wrap lines.	2013-06-26 12:35:57 -05:00
Patrick Wendell	ee692482a6	One more private class	2013-06-26 09:07:32 -07:00
Patrick Wendell	a59c15a37e	Adding config option for retained stages	2013-06-26 08:54:57 -07:00
Patrick Wendell	274193664a	Bumping timeouts	2013-06-26 08:51:28 -07:00
Patrick Wendell	b14ad509ba	Moving static ui package	2013-06-26 08:46:51 -07:00
Patrick Wendell	2cbaa0734b	Making all new classes package private	2013-06-26 08:44:55 -07:00
Stephen Haberman	d11025dc6a	Be cute with Option and getenv.	2013-06-26 09:53:35 -05:00
Matei Zaharia	6c8d1b2ca6	Fix computation of classpath when we launch java directly The previous version assumed that a CLASSPATH environment variable was set by the "run" script when launching the process that starts the ExecutorRunner, but unfortunately this is not true in tests. Instead, we factor the classpath calculation into an external script and call that. NOTE: This includes a Windows version but hasn't yet been tested there.	2013-06-25 18:21:00 -04:00
Matei Zaharia	15b00914c5	Some fixes to the launch-java-directly change: - Split SPARK_JAVA_OPTS into multiple command-line arguments if it contains spaces; this splitting follows quoting rules in bash - Add the Scala JARs to the classpath if they're not in the CLASSPATH variable because the ExecutorRunner is launched with "scala" (this can happen when using local-cluster URLs in spark-shell)	2013-06-25 17:17:27 -04:00
Matei Zaharia	7e0191c6ea	Merge remote-tracking branch 'cgrothaus/SPARK-698' Conflicts: run	2013-06-25 15:47:40 -04:00
Patrick Wendell	d66bd6f885	Adding another unit test to Web UI suite	2013-06-24 17:12:55 -07:00
Patrick Wendell	f7389330c3	Allowing for requested port on construction	2013-06-24 16:51:52 -07:00
Patrick Wendell	42157027f2	A few bug fixes and a unit test	2013-06-24 16:25:05 -07:00
Patrick Wendell	a4248138b4	Minor style cleanup	2013-06-24 14:22:28 -07:00
Patrick Wendell	b5e6e8bcc8	Cleaning up some code for Job Progress	2013-06-24 14:13:24 -07:00
Patrick Wendell	93e8ed85aa	Work around for initialization issue	2013-06-24 13:11:18 -07:00
Patrick Wendell	f6e64b5cd6	Updating based on changes to JobLogger (and one small change to JobLogger)	2013-06-24 12:40:41 -07:00
Matei Zaharia	78ffe164b3	Clone the zero value for each key in foldByKey The old version reused the object within each task, leading to overwriting of the object when a mutable type is used, which is expected to be common in fold. Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2013-06-23 10:26:53 -07:00
Matei Zaharia	0e0f9d3069	Fix search path for REPL class loader to really find added JARs	2013-06-22 17:44:04 -07:00
Matei Zaharia	3e61beff7b	Merge pull request #648 from shivaram/netty-dbg Shuffle fixes and cleanup	2013-06-22 16:22:47 -07:00
Patrick Wendell	7e9f1ed0de	Some cleanup of styling	2013-06-22 10:31:37 -07:00
Patrick Wendell	3b7ebdeeb8	Handling entirely failed stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	be6107ce44	Some tweaking with shared page header	2013-06-22 10:31:37 -07:00
Patrick Wendell	9a24d1a2d0	Using scala in XML imports	2013-06-22 10:31:37 -07:00
Patrick Wendell	f91e1c4822	Linking RDD information when available in stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	a86bb459e2	Showing shuffle status and purging old stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	3485e73376	Style cleanup	2013-06-22 10:31:37 -07:00
Patrick Wendell	dd696f3a3d	Some renaming and comments	2013-06-22 10:31:37 -07:00
Patrick Wendell	5c872e9ef5	Documentation and some refactoring	2013-06-22 10:31:37 -07:00
Patrick Wendell	17776323a6	More work on percentile data:	2013-06-22 10:31:37 -07:00
Patrick Wendell	dcf6a68177	Refactoring into different modules	2013-06-22 10:31:36 -07:00
Patrick Wendell	ce81c320ac	Adding helper function to make listing tables	2013-06-22 10:31:36 -07:00
Patrick Wendell	9fd5dc3ea9	Initial steps towards job progress UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	bc4a811c57	Stash	2013-06-22 10:31:36 -07:00
Patrick Wendell	77c53f7868	Refactoring UI packages	2013-06-22 10:31:36 -07:00
Patrick Wendell	8b5c7e71c4	Import cleanup	2013-06-22 10:31:36 -07:00
Patrick Wendell	32a45d01b1	Removing twirl files	2013-06-22 10:31:36 -07:00
Patrick Wendell	4e1f202481	Removing dead code	2013-06-22 10:31:36 -07:00
Patrick Wendell	d6fde4ffe4	Some JSON cleanup	2013-06-22 10:31:36 -07:00
Patrick Wendell	91ec5a1a04	Changing JSON protocol and removing spray code	2013-06-22 10:31:36 -07:00
Patrick Wendell	fc94576ece	Adding worker version of UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	ee73c09ac9	Some comments	2013-06-22 10:31:36 -07:00
Patrick Wendell	9161db5478	Cleaning up master web UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	e55cf0245f	Adding WebUI file	2013-06-22 10:31:35 -07:00
Patrick Wendell	f85fd7a793	Commenting unfinished part	2013-06-22 10:31:35 -07:00
Patrick Wendell	2c36a514aa	Spray refactoring for master web UI	2013-06-22 10:31:35 -07:00
Patrick Wendell	7e6977b6c5	Fix in storage status page	2013-06-22 10:31:35 -07:00
Patrick Wendell	950f83535a	Adding deterministic port	2013-06-22 10:31:35 -07:00
Patrick Wendell	7cd70dc2c1	Minor cleanup	2013-06-22 10:31:35 -07:00
Patrick Wendell	e66f570194	Completely hacked version of block manager UI in jetty	2013-06-22 10:31:35 -07:00
Patrick Wendell	60fbf7e461	Partially working checkpoint	2013-06-22 10:31:35 -07:00
Matei Zaharia	1ef5d0d2c9	Merge pull request #644 from shimingfei/joblogger add Joblogger to Spark (on new Spark code)	2013-06-22 09:35:57 -07:00
Jey Kottalam	1ba3c17303	use parens when calling method with side-effects	2013-06-21 12:14:16 -04:00
Jey Kottalam	edb18ca928	Rename PythonWorker to PythonWorkerFactory	2013-06-21 12:14:16 -04:00
Jey Kottalam	62c4781400	Add tests and fixes for Python daemon shutdown	2013-06-21 12:14:16 -04:00
Jey Kottalam	c79a6078c3	Prefork Python worker processes	2013-06-21 12:14:16 -04:00
Jey Kottalam	40afe0d2a5	Add Python timing instrumentation	2013-06-21 12:14:16 -04:00
Mingfei	2fc794a6c7	small modify in DAGScheduler	2013-06-21 18:21:35 +08:00
Mingfei	4b9862ac9c	small format modification	2013-06-21 17:55:32 +08:00
Mingfei	aa7aa587be	some format modification	2013-06-21 17:48:41 +08:00
Mingfei	5240795154	edit according to comments	2013-06-21 17:38:23 +08:00
Matei Zaharia	71030ba3eb	Merge pull request #654 from lyogavin/enhance_pipe fix typo and coding style in #638	2013-06-19 15:21:03 -07:00
Thomas Graves	bad51c7cb4	upmerge with latest mesos/spark master and fix hbase compile with hadoop2-yarn profile	2013-06-19 14:39:13 -05:00
Thomas Graves	75d78c7ac9	Add support for Spark on Yarn on a secure Hadoop cluster	2013-06-19 11:18:42 -05:00
Matei Zaharia	7902baddc7	Update ASM to version 4.0	2013-06-19 13:34:30 +02:00
Gavin Li	0a2a9bce1e	fix typo and coding style	2013-06-18 21:30:13 +00:00
jerryshao	1e9269c3ee	reduce ZippedPartitionsRDD's getPreferredLocations complexity	2013-06-18 09:49:06 +08:00
Matei Zaharia	db42451a52	Merge pull request #643 from adatao/master Bug fix: Zero-length partitions result in NaN for overall mean & variance	2013-06-17 15:26:36 -07:00
Matei Zaharia	e82a2ffcc9	Merge pull request #653 from rxin/logging SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory."	2013-06-17 15:13:15 -07:00
Matei Zaharia	ec193c7d89	Merge remote-tracking branch 'xiajunluan/xiajunluan' Conflicts: core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala	2013-06-18 00:11:50 +02:00
Reynold Xin	be3c406edf	Fixed the typo pointed out by Matei.	2013-06-17 17:07:51 -04:00
Reynold Xin	1450296797	SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory".	2013-06-17 16:58:23 -04:00
Gavin Li	4508089fc3	refine comments and add sc.clean	2013-06-17 05:23:46 +00:00
Gavin Li	e6ae049283	Merge remote-tracking branch 'upstream1/master' into enhance_pipe	2013-06-16 22:53:39 +00:00
Gavin Li	fb6d733fa8	update according to comments	2013-06-16 22:32:55 +00:00
Matei Zaharia	f961aac8b2	Merge pull request #649 from ryanlecompte/master Add top K method to RDD using a bounded priority queue	2013-06-15 00:53:41 -07:00
ryanlecompte	e8801d4490	use delegation for BoundedPriorityQueue, add Java API	2013-06-14 23:39:05 -07:00
Reynold Xin	2cc188fd54	SPARK-774: cogroup should also disable map side combine by default	2013-06-14 00:10:54 -07:00
Reynold Xin	6738178d0d	SPARK-772: groupByKey should disable map side combine.	2013-06-13 23:59:42 -07:00
ryanlecompte	93b3f5e535	drop unneeded ClassManifest implicit	2013-06-13 16:26:35 -07:00
ryanlecompte	44b8dbaede	use Iterator.single(elem) instead of Iterator(elem) for improved performance based on scaladocs	2013-06-13 16:23:15 -07:00
Shivaram Venkataraman	1d9f0df065	Fix some comments and style	2013-06-13 14:46:25 -07:00
Mingfei	967a6a699d	modify sparklister function interface according to comments	2013-06-13 14:36:07 +08:00
Shivaram Venkataraman	5da4287b1d	Merge branch 'netty-dbg' of github.com:shivaram/spark into netty-dbg	2013-06-12 16:38:37 -07:00
Shivaram Venkataraman	5e9a9317c5	Merge branch 'master' of git://github.com/mesos/spark into netty-dbg	2013-06-12 16:38:01 -07:00
ryanlecompte	db5bca08ff	add a new top K method to RDD using a bounded priority queue	2013-06-12 10:54:16 -07:00
Andrew xia	190ec61799	change code style and debug info	2013-06-10 15:27:02 +08:00
Patrick Wendell	ef14dc2e77	Adding Java-API version of compression codec	2013-06-09 18:09:46 -07:00
Patrick Wendell	df592192e7	Monads FTW	2013-06-09 18:09:24 -07:00
Patrick Wendell	d1bbcebae5	Adding compression to Hadoop save functions	2013-06-09 11:39:35 -07:00
Mingfei	ade822011d	not check return value of eventQueue.take	2013-06-08 16:26:45 +08:00
Mingfei	4fd86e0e10	delete test code for joblogger in SparkContext	2013-06-08 15:45:47 +08:00
Mingfei	362f0f93ac	Merge branch 'master' of https://github.com/mesos/spark	2013-06-08 15:20:13 +08:00
Mingfei	1a4d93c025	modify to pass job annotation by localProperties and use daemon thread to do joblogger's work	2013-06-08 14:23:39 +08:00
Matei Zaharia	b58a29295b	Small formatting and style fixes	2013-06-07 22:51:28 -07:00
Matei Zaharia	c8fc423bc2	Merge pull request #631 from jerryshao/master Fix block manager UI display issue when enable spark.cleaner.ttl	2013-06-07 22:43:18 -07:00
Matei Zaharia	c9ca0a4a58	Small code style fix to SchedulingAlgorithm.scala	2013-06-07 22:40:44 -07:00
Matei Zaharia	1ae60bcb36	Merge pull request #634 from xiajunluan/master [Spark-753] Fix ClusterSchedulSuite unit test failed	2013-06-07 22:39:06 -07:00
Shivaram Venkataraman	ac480fd977	Clean up variables and counters in BlockFetcherIterator	2013-06-06 16:34:27 -07:00
Gavin Li	e179ff8a32	update according to comments	2013-06-05 22:41:05 +00:00
Shivaram Venkataraman	cb2f5046ee	Pass in bufferSize to BufferedOutputStream	2013-06-05 15:09:02 -07:00
Shivaram Venkataraman	c851957fe4	Don't write zero block files with java serializer	2013-06-05 14:28:38 -07:00
Christopher Nguyen	9d35904357	In the current code, when both partitions happen to have zero-length, the return mean will be NaN. Consequently, the result of mean after reducing over all partitions will also be NaN, which is not correct if there are partitions with non-zero length. This patch fixes this issue.	2013-06-04 22:12:47 -07:00
Matei Zaharia	fff3728552	Merge pull request #640 from pwendell/timeout-update Fixing bug in BlockManager timeout	2013-06-04 16:09:50 -07:00
Patrick Wendell	061fd3ae36	Fixing bug in BlockManager timeout	2013-06-04 19:02:44 -04:00
Matei Zaharia	f420d4f228	Merge pull request #639 from pwendell/timeout-update Bump akka and blockmanager timeouts to 60 seconds	2013-06-04 15:25:58 -07:00
Patrick Wendell	8bd4e12104	Bump akka and blockmanager timeouts to 60 seconds	2013-06-04 18:14:24 -04:00
Shivaram Venkataraman	96943a1cc0	var to val	2013-06-03 12:29:38 -07:00
Shivaram Venkataraman	cd347f547a	Reuse the file object as it is valid after delete	2013-06-03 12:27:51 -07:00
Shivaram Venkataraman	a058b0acf3	Delete a file for a block if it already exists.	2013-06-03 12:10:00 -07:00
Andrew xia	606bb1b450	Fix schedulingAlgorithm bugs for unit test	2013-06-03 10:29:23 +08:00
Shivaram Venkataraman	038cfc1a9a	Make connect timeout configurable	2013-05-31 23:32:18 -07:00
Shivaram Venkataraman	91aca92249	Another round of Netty fixes. 1. Avoid race condition between stop and copier completion 2. Handle socket exceptions by reporting them and filling in a failed FetchResult	2013-05-31 23:21:38 -07:00
Gavin Li	9f84315c05	enhance pipe to support what we can do in hadoop streaming	2013-06-01 00:26:10 +00:00
Reynold Xin	de1167bf2c	Incorporated Charles' feedback to put rdd metadata removal in BlockManagerMasterActor.	2013-05-31 15:54:57 -07:00
Reynold Xin	ba5e544461	More block manager cleanup. Implemented a removeRdd method in BlockManager, and use that to implement RDD.unpersist. Previously, unpersist needs to send B akka messages, where B = number of blocks. Now unpersist only needs to send W akka messages, where W = the number of workers.	2013-05-31 01:48:16 -07:00
jerryshao	926f41cc52	fix block manager UI display issue when enable spark.cleaner.ttl	2013-05-31 09:32:52 +08:00
Reynold Xin	bed1b08169	Do not create symlink for local add file. Instead, copy the file. This prevents Spark from changing the original file's permission, and also allow add file to work on non-posix operating systems.	2013-05-30 16:21:49 -07:00
Shivaram Venkataraman	3b0cd17343	Merge branch 'master' of git://github.com/mesos/spark Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2013-05-30 14:36:24 -07:00
Andrew xia	c3db3ea554	1. Add unit test for local scheduler 2. Move localTaskSetManager to a new file	2013-05-30 20:49:40 +08:00
Andrew xia	ecceb101d3	implement FIFO and fair scheduler for spark local mode	2013-05-30 10:43:01 +08:00
Shivaram Venkataraman	19fd6d54c0	Also flush serializer in revertPartialWrites	2013-05-29 17:29:34 -07:00
Shivaram Venkataraman	618c8cae1e	Skip fetching zero-sized blocks in OIO. Also unify splitLocalRemoteBlocks for netty/nio and add a test case	2013-05-29 13:18:54 -07:00
Matei Zaharia	6ed71390d9	Merge pull request #626 from stephenh/remove-add-if-no-port Remove unused addIfNoPort.	2013-05-29 10:14:22 -07:00
Shivaram Venkataraman	b79b10a6d6	Flush serializer to fix zero-size kryo blocks bug. Also convert the local-cluster test case to check for non-zero block sizes	2013-05-29 00:52:55 -07:00
Matei Zaharia	41d230ccb0	Merge pull request #611 from squito/classloader Use default classloaders for akka & deserializing task results	2013-05-28 23:35:24 -07:00
Stephen Haberman	4fe1fbdd51	Remove unused addIfNoPort.	2013-05-28 16:26:32 -05:00
Matei Zaharia	3db1e17baa	Merge pull request #620 from jerryshao/master Fix CheckpointRDD java.io.FileNotFoundException when calling getPreferredLocations	2013-05-27 21:31:43 -07:00
Matei Zaharia	e8d4b6c296	Merge pull request #529 from xiajunluan/master [SPARK-663]Implement Fair Scheduler in Spark Cluster Scheduler	2013-05-25 21:09:03 -07:00
Reynold Xin	26962c9340	Automatically configure Netty port. This makes unit tests using local-cluster pass. Previously they were failing because Netty was trying to bind to the same port for all processes. Pair programmed with @shivaram.	2013-05-24 16:39:33 -07:00
Reynold Xin	6ea085169d	Fixed the bug that shuffle serializer is ignored by the new shuffle block iterators for local blocks. Also added a unit test for that.	2013-05-24 14:08:37 -07:00
jerryshao	bd3ea8f2a6	fix CheckpointRDD getPreferredLocations java.io.FileNotFoundException	2013-05-24 14:26:19 +08:00
Charles Reiss	f350f14084	Use ARRAY_SAMPLE_SIZE constant instead of 100.0	2013-05-21 18:11:33 -07:00
Andrew xia	ecd6d75c6a	fix bug of unit tests	2013-05-21 06:49:23 +08:00
Reynold Xin	5912cc4967	Merge pull request #610 from JoshRosen/spark-747 Throw exception if TaskResult exceeds Akka frame size	2013-05-17 19:58:40 -07:00
Reynold Xin	8d78c5f89f	Changed the logging level from info to warning when addJar(null) is called.	2013-05-17 18:51:35 -07:00
Andrew xia	3d4672eaa9	Merge branch 'master' into xiajunluan Conflicts: core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/scheduler/cluster/ClusterScheduler.scala core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala	2013-05-18 07:28:03 +08:00
Andrew xia	d19753b9c7	expose TaskSetManager type to resourceOffer function in ClusterScheduler	2013-05-18 06:45:19 +08:00
Andrew xia	c6e2770bfe	Fix ClusterScheduler bug to avoid allocating tasks to same slave	2013-05-17 05:10:38 +08:00
Mridul Muralidharan	f0881f8d48	Hope this does not turn into a bike shed change	2013-05-17 01:58:50 +05:30
Mridul Muralidharan	feddd2530d	Filter out nulls - prevent NPE	2013-05-16 17:49:14 +05:30
Josh Rosen	b8e46b6074	Abort job if result exceeds Akka frame size; add test.	2013-05-16 01:57:57 -07:00
Matei Zaharia	2f576aba8f	Merge pull request #602 from rxin/shufflemerge Manual merge & cleanup of Shane's Shuffle Performance Optimization	2013-05-15 18:06:24 -07:00
Reynold Xin	203d7b7c14	Merge pull request #593 from squito/driver_ui_link Master UI has link to Application UI	2013-05-15 00:47:20 -07:00
Reynold Xin	f3491cb89b	Merge branch 'master' of github.com:mesos/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/test/scala/spark/DistributedSuite.scala project/SparkBuild.scala	2013-05-15 00:31:52 -07:00
Reynold Xin	f9d40a5848	Added a comment in JdbcRDD for example usage.	2013-05-14 23:29:57 -07:00
Reynold Xin	81ad2fa331	Merge branch 'jdbc' of github.com:koeninger/spark Conflicts: project/SparkBuild.scala	2013-05-14 23:12:00 -07:00
Imran Rashid	38d4b97c6d	use threads classloader when deserializing task results; classnotfoundexception includes classloader	2013-05-14 22:32:14 -07:00
Imran Rashid	d7d1da79d3	when akka starts, use akkas default classloader (current thread)	2013-05-14 22:32:09 -07:00
Matei Zaharia	016ac86830	Merge pull request #601 from rxin/emptyrdd-master EmptyRDD (master branch 0.8)	2013-05-13 21:45:36 -07:00
Matei Zaharia	4b354e0a08	Merge pull request #589 from mridulm/master Add support for instance local scheduling	2013-05-13 17:39:19 -07:00
Patrick Wendell	7f0833647b	Capturing class name	2013-05-12 07:54:03 -07:00
Patrick Wendell	72b9c4cb6e	Small fix	2013-05-11 23:53:50 -07:00
Patrick Wendell	1c15b85051	Removing import	2013-05-11 23:52:53 -07:00
Patrick Wendell	059ab88754	Changing technique to use same code path in all cases	2013-05-11 23:50:54 -07:00
Cody Koeninger	3da2305ed0	code cleanup per rxin comments	2013-05-11 23:59:07 -05:00
Josh Rosen	440719109e	Throw exception if task result exceeds Akka frame size. This partially addresses SPARK-747.	2013-05-11 19:17:13 -07:00
Patrick Wendell	0345954530	SPARK-738: Spark should detect and squash nonserializable exceptions	2013-05-11 14:17:09 -07:00
Mark Hamstra	6e6b3e0d7e	Actually use the cleaned closure in foreachPartition	2013-05-10 13:02:34 -07:00
Imran Rashid	0ab818d508	fix linebreak	2013-05-09 00:38:59 -07:00
Reynold Xin	5d70ee4663	Cleaned up connection manager (moved many classes to their own files).	2013-05-07 22:42:15 -07:00
Reynold Xin	8388e8dd7a	Minor style fix in DiskStore...	2013-05-07 18:40:35 -07:00
Reynold Xin	547dcbe494	Cleaned up Scala files in network/netty from Shane's PR.	2013-05-07 18:39:33 -07:00
Reynold Xin	9e64396ca4	Cleaned up the Java files from Shane's PR.	2013-05-07 18:30:54 -07:00
Reynold Xin	0e5cc30868	Cleaned up BlockManager and BlockFetcherIterator from Shane's PR.	2013-05-07 18:18:24 -07:00
Reynold Xin	8b79485171	Moved BlockFetcherIterator to its own file.	2013-05-07 17:02:32 -07:00
Reynold Xin	90577ada69	Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/DiskStore.scala project/SparkBuild.scala	2013-05-07 15:56:19 -07:00
Reynold Xin	0fd84965f6	Added EmptyRDD.	2013-05-06 15:40:34 -07:00
Imran Rashid	22a5063ae4	switch from separating appUI host & port to combining into just appUiUrl	2013-05-05 12:19:11 -07:00
Matei Zaharia	7af92f248b	Merge pull request #597 from JoshRosen/webui-fixes Two minor bug fixes for Spark Web UI	2013-05-04 22:29:17 -07:00
Josh Rosen	42b1953c53	Fix SPARK-630: app details page shows finished executors as running.	2013-05-04 18:34:47 -07:00
Josh Rosen	d48e9fde01	Fix SPARK-629: weird number of cores in job details page.	2013-05-04 18:34:45 -07:00
Mridul Muralidharan	25198d7e9e	Merge branch 'master' of github.com:mridulm/spark	2013-05-04 20:45:56 +05:30
Mridul Muralidharan	5b011d18d7	Merge from master	2013-05-04 20:41:27 +05:30
Mridul Muralidharan	edb57c8331	Add support for instance local in getPreferredLocations of ZippedPartitionsBaseRDD. Add comments to both ZippedPartitionsBaseRDD and ZippedRDD to better describe the potential problem with the approach	2013-05-04 19:47:45 +05:30
Matei Zaharia	3bf2c868c3	Merge pull request #594 from shivaram/master Add zip partitions to Java API	2013-05-03 18:27:30 -07:00
Shivaram Venkataraman	bb8a434f9d	Add zipPartitions to Java API.	2013-05-03 15:14:02 -07:00
Imran Rashid	6fae936088	applications (aka drivers) send their webUI address to master when registering so it can be displayed in the master web ui	2013-05-03 12:59:10 -07:00
Mridul Muralidharan	ea2a6f91d3	pull from master	2013-05-04 00:35:59 +05:30
Reynold Xin	93091f6936	Merge branch 'master' of github.com:mesos/spark into blockmanager	2013-05-03 01:02:32 -07:00
Reynold Xin	2bc895a829	Updated according to Matei's code review comment.	2013-05-03 01:02:16 -07:00
Mridul Muralidharan	11589c39d9	Fix ZippedRDD as part Matei's suggestion	2013-05-03 12:23:30 +05:30
Matei Zaharia	6fe9d4e61e	Merge pull request #592 from woggling/localdir-fix Don't accept generated local directory names that can't be created	2013-05-02 21:33:56 -07:00
Matei Zaharia	538ee755b4	Merge pull request #581 from jerryshao/master fix [SPARK-740] block manage UI throws exception when enabling Spark Streaming	2013-05-02 09:01:42 -07:00
Charles Reiss	c847dd3da2	Don't accept generated temp directory names that can't be created successfully.	2013-05-01 23:19:10 -07:00
Reynold Xin	4a31877408	Added the unpersist api to JavaRDD.	2013-05-01 20:31:54 -07:00
Reynold Xin	98df9d2853	Added removeRdd function in BlockManager.	2013-05-01 20:17:09 -07:00
Mridul Muralidharan	dfde9ce9dd	comment out debug versions of checkHost, etc from Utils - which were used to test	2013-05-02 07:41:33 +05:30
Mridul Muralidharan	1b5aaeadc7	Integrate review comments 2	2013-05-02 07:30:06 +05:30
jerryshao	c047f0e3ad	filter out Spark streaming block RDD and sort RDDInfo with id	2013-05-02 09:48:32 +08:00
Mridul Muralidharan	609a817f52	Integrate review comments on pull request	2013-05-02 06:44:33 +05:30
Reynold Xin	204eb32e14	Changed the type of the persistentRdds hashmap back to TimeStampedHashMap.	2013-05-01 16:14:58 -07:00
Reynold Xin	34637b97ec	Added SparkContext.cleanup back. Not sure why it was removed before ...	2013-05-01 16:12:37 -07:00
Reynold Xin	3227ec8edd	Cleaned up Ram's code. Moved SparkContext.remove to RDD.unpersist. Also updated unit tests to make sure they are properly testing for concurrency.	2013-05-01 16:07:44 -07:00
harshars	8481562731	Merged Ram's commit on removing RDDs. Conflicts: core/src/main/scala/spark/SparkContext.scala	2013-05-01 14:42:17 -07:00
Mridul Muralidharan	27764a00f4	Fix some npe introduced accidentally	2013-05-01 20:56:05 +05:30
Mridul Muralidharan	d960e7e0f8	a) Add support for hyper local scheduling - specific to a host + port - before trying host local scheduling. b) Add some fixes to test code to ensure it passes (and fixes some other issues). c) Fix bug in task scheduling which incorrectly used availableCores instead of all cores on the node.	2013-05-01 20:24:00 +05:30
Matei Zaharia	aa8fe1a209	Merge pull request #586 from mridulm/master Pull request to address issues Reynold Xin reported	2013-04-30 22:30:18 -07:00
Reynold Xin	dd7bef3147	Two minor fixes according to Ryan LeCompte's review.	2013-04-30 15:02:32 -07:00
Reynold Xin	cea6174573	Merge branch 'master' of github.com:mesos/spark into blockmanager Conflicts: core/src/main/scala/spark/BlockStoreShuffleFetcher.scala	2013-04-30 13:28:35 -07:00
Mridul Muralidharan	60cabb35cb	Add addition catch block for exception too	2013-05-01 01:17:14 +05:30
Mridul Muralidharan	3b748ced22	Be more aggressive and defensive in all uses of SelectionKey in select loop	2013-05-01 00:30:30 +05:30
Mridul Muralidharan	0f45477be1	Change indentation	2013-05-01 00:10:02 +05:30
Mridul Muralidharan	538614acfe	Be more aggressive and defensive in select also	2013-05-01 00:05:32 +05:30
Mridul Muralidharan	48854e1dbf	If key is not valid, close connection	2013-04-30 23:59:33 +05:30
Matei Zaharia	f708dda81e	Merge pull request #585 from pwendell/listener-perf [Fix SPARK-742] Task Metrics should not employ per-record timing by default	2013-04-30 07:51:40 -07:00
Mridul Muralidharan	e46d547ccd	Fix issues reported by Reynold	2013-04-30 16:15:56 +05:30
Reynold Xin	1055785a83	Allow specifying the shuffle write file buffer size. The default buffer size is 8KB in FastBufferedOutputStream, which is too small and would cause a lot of disk seeks.	2013-04-29 23:33:56 -07:00
Reynold Xin	7007201201	Added a shuffle block manager so it is easier in the future to consolidate shuffle output files.	2013-04-29 23:07:03 -07:00
Reynold Xin	d3586ef438	Merge branch 'blockmanager' of github.com:rxin/spark into blockmanager Conflicts: core/src/main/scala/spark/storage/DiskStore.scala	2013-04-29 15:44:18 -07:00
Patrick Wendell	016ce1fa9c	Using full package name for util	2013-04-29 12:02:27 -07:00
Patrick Wendell	540be6b154	Modified version of the fix which just removes all per-record tracking.	2013-04-29 11:32:07 -07:00
Patrick Wendell	224fbac061	Spark-742: TaskMetrics should not employ per-record timing. This patch does three things: 1. Makes TimedIterator a trait with two implementations (one a no-op) 2. Makes the default behavior to use the no-op implementation 3. Removes DelegateBlockFetchTracker. This is just cleanup, but it seems like the trait doesn't really reduce complexity in any way. In the future we can add other implementations, e.g. ones which perform sampling.	2013-04-29 11:13:43 -07:00
Shivaram Venkataraman	604d3bf56c	Rename partition class and add scala doc	2013-04-28 16:31:07 -07:00
Shivaram Venkataraman	15acd49f07	Actually rename classes to ZippedPartitions* (the previous commit only renamed the file)	2013-04-28 16:03:22 -07:00
Shivaram Venkataraman	6e84635ab9	Rename classes from MapZipped* to Zipped*	2013-04-28 15:58:40 -07:00
Shivaram Venkataraman	0cc6642b7c	Rename to zipPartitions and style changes	2013-04-28 05:11:03 -07:00
Shivaram Venkataraman	c9c4954d99	Add an interface to zip iterators of multiple RDDs The current code supports 2, 3 or 4 arguments but can be extended to more arguments if required.	2013-04-26 16:57:46 -07:00
Matei Zaharia	6e6b5204ea	Create an empty directory when checkpointing a 0-partition RDD (fixes a test failure on Hadoop 2.0)	2013-04-25 00:42:37 -07:00
Reynold Xin	ba6ffa6a5f	Allow the specification of a shuffle serializer in the read path (for local block reads).	2013-04-24 17:38:07 -07:00
Reynold Xin	aa618ed2a2	Allow changing the serializer on a per shuffle basis.	2013-04-24 14:52:49 -07:00
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
Reynold Xin	31ce6c66d6	Added a BlockObjectWriter interface in block manager so ShuffleMapTask doesn't need to build up an array buffer for each shuffle bucket.	2013-04-23 17:48:59 -07:00
koeninger	dfac0aa5c2	prevent mysql driver from pulling entire resultset into memory. explicitly close resultset and statement.	2013-04-22 21:12:52 -05:00
Mridul Muralidharan	7acab3ab45	Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo	2013-04-22 08:01:13 +05:30
koeninger	b2a3f24dde	first attempt at an RDD to pull data from JDBC sources	2013-04-21 00:29:37 -05:00

1629 commits