ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Andrew xia	7d2eada451	Add metrics source of DAGScheduler and blockManager Conflicts: core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/SparkEnv.scala	2013-07-24 14:57:47 +08:00
jerryshao	e9ac88754d	Remove twice add Source bug and code clean	2013-07-24 14:57:47 +08:00
jerryshao	e080588f73	Add metrics system unit test	2013-07-24 14:57:47 +08:00
jerryshao	5ce5dc9fcd	Add default properties to deal with no configure file situation	2013-07-24 14:57:47 +08:00
jerryshao	871bc1687e	Add Executor instrumentation	2013-07-24 14:57:46 +08:00
jerryshao	7fb574bf66	Code clean and remarshal	2013-07-24 14:57:46 +08:00
Andrew xia	4d6dd67fa1	refactor metrics system 1.change source abstract class to support MetricRegistry 2.change master/work/jvm source class	2013-07-24 14:57:46 +08:00
jerryshao	03f9871116	MetricsSystem refactor	2013-07-24 14:57:46 +08:00
jerryshao	c3daad3f65	Update metric source support for instrumentation	2013-07-24 14:57:46 +08:00
jerryshao	9dec8c73e6	Add Master and Worker instrumentation support	2013-07-24 14:57:46 +08:00
jerryshao	503acd3a37	Build metrics system framework	2013-07-24 14:57:46 +08:00
Matei Zaharia	b011329040	Merge pull request #727 from rxin/scheduler Scheduler code style cleanup.	2013-07-23 22:50:09 -07:00
Matei Zaharia	876125b997	Merge pull request #726 from rxin/spark-826 SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure	2013-07-23 22:28:21 -07:00
Reynold Xin	3dae1df66f	Moved non-serializable closure catching exception from submitStage to submitMissingTasks	2013-07-23 20:29:07 -07:00
Reynold Xin	d33b8a2a0f	Added comments on task closure serialization.	2013-07-23 20:28:39 -07:00
Reynold Xin	85ab8114bc	Moved non-serializable closure catching exception from submitStage to submitMissingTasks	2013-07-23 20:25:58 -07:00
Matei Zaharia	6a31b7191d	Small bug fix	2013-07-23 16:20:24 -07:00
Matei Zaharia	2f1736c396	Merge pull request #725 from karenfeng/task-start Creates task start events	2013-07-23 15:53:30 -07:00
Karen Feng	abc78cd331	Modifies instead of copies HashSets, fixes comment style	2013-07-23 15:47:16 -07:00
Karen Feng	383684daaa	Replaces Seq with HashSet, removes redundant import	2013-07-23 15:33:27 -07:00
Reynold Xin	f2422d4f29	SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.	2013-07-23 15:30:20 -07:00
Reynold Xin	5ed38b4d1d	Scheduler code style cleanup.	2013-07-23 15:28:59 -07:00
Reynold Xin	101b8cc78a	SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure.	2013-07-23 15:28:20 -07:00
Dmitriy Lyubimov	72bac09c42	Leaking spark context in the test	2013-07-23 15:19:07 -07:00
Karen Feng	9f2dbb2a7c	Adds/removes active tasks only once	2013-07-23 15:10:09 -07:00
Dmitriy Lyubimov	ef82ff8564	Merge branch 'master' into SPARK-826 Conflicts: core/src/main/scala/spark/scheduler/local/LocalScheduler.scala	2013-07-23 13:43:00 -07:00
Karen Feng	0200801a55	Tracks task start events and shows number of active tasks on Executor UI	2013-07-23 13:35:43 -07:00
Dmitriy Lyubimov	310e73d566	style	2013-07-23 13:23:25 -07:00
Matei Zaharia	f369e0e51b	Merge pull request #720 from ooyala/2013-07/persistent-rdds-api Add a public method getCachedRdds to SparkContext	2013-07-23 13:22:27 -07:00
Dmitriy Lyubimov	ac60d06381	Re-working in terms of changes to TaskSetManager. Verified with Standalone and Local mode.	2013-07-23 13:13:19 -07:00
Evan Chan	efd6418c1b	Move getPersistentRDDs testing to a new Suite	2013-07-23 10:40:41 -07:00
Evan Chan	4830e22562	Rename method per rxin feedback	2013-07-23 09:50:13 -07:00
Evan Chan	2c2bfbe294	Add toMap method to TimeStampedHashMap and use it	2013-07-23 01:36:44 -07:00
Matei Zaharia	401aac8b18	Merge pull request #719 from karenfeng/ui-808 Creates Executors tab for Jobs UI	2013-07-22 16:57:16 -07:00
Karen Feng	872c97ad82	Split task columns, memory columns sort by numeric value	2013-07-22 16:54:37 -07:00
Matei Zaharia	ea1cfabfdd	Merge branch 'master' of github.com:mesos/spark	2013-07-22 16:22:02 -07:00
Matei Zaharia	8e38e77232	Fix a test that was using an outdated config setting	2013-07-22 16:05:32 -07:00
Karen Feng	2eea974795	Executors UI now calls executor ID from TaskInfo instead of TaskMetrics	2013-07-22 15:15:54 -07:00
Dmitriy Lyubimov	8ca0c31944	removing non-pertinent comment	2013-07-22 14:48:46 -07:00
Dmitriy Lyubimov	b4b230e606	Fixing for LocalScheduler with test, that much works ..	2013-07-22 14:42:47 -07:00
Karen Feng	85c4d7bf3b	Shows number of complete/total/failed tasks (bug: failed tasks assigned to null executor)	2013-07-22 14:35:47 -07:00
Josh Rosen	f649dabb4a	Fix bug: DoubleRDDFunctions.sampleStdev() computed non-sample stdev(). Update JavaDoubleRDD to add new methods and docs. Fixes SPARK-825.	2013-07-22 13:21:48 -07:00
Karen Feng	8901f379c9	Fixed memory used/remaining/total bug	2013-07-22 09:58:03 -07:00
Karen Feng	636b19f833	Merge branch 'master' of https://github.com/mesos/spark into ui-808	2013-07-22 09:53:26 -07:00
Evan Chan	0337d88321	Add a public method getCachedRdds to SparkContext	2013-07-21 18:26:14 -07:00
Karen Feng	865dc63bac	Changed table format for executors	2013-07-19 15:57:01 -07:00
Karen Feng	81bb5dc640	Creates Executors tab for application with RDD block and memory/disk used, solves SPARK-808	2013-07-19 14:08:30 -07:00
Konstantin Boudnik	cfce9a6a36	Regression: default webui-port can't be set via command line "--webui-port" anymore	2013-07-19 14:00:58 -07:00
Liang-Chi Hsieh	4530e8a9bf	fix typo.	2013-07-20 00:04:25 +08:00
Liang-Chi Hsieh	aa6f83289b	A better fix for giving local jars under Yarn mode.	2013-07-19 22:25:28 +08:00
Liang-Chi Hsieh	a613628c50	Do not copy local jars given to SparkContext in yarn mode since the Context is not running on local. This bug causes failure when jars can not be found. Example codes (such as spark.examples.SparkPi) can not work without this fix under yarn mode.	2013-07-19 16:59:12 +08:00
Matei Zaharia	af3c9d5042	Add Apache license headers and LICENSE and NOTICE files	2013-07-16 17:21:33 -07:00
Matei Zaharia	b1f9f64743	Merge branch 'master' of github.com:mesos/spark	2013-07-16 11:01:53 -07:00
Matei Zaharia	5c388808a8	SPARK-814: Result stages should be named after action	2013-07-16 11:01:14 -07:00
Matei Zaharia	f347cc3f65	Fix deprecation warning and style issues	2013-07-16 10:53:30 -07:00
Reynold Xin	69316603d6	Throw a more meaningful message when runJob is called to launch tasks on non-existent partitions.	2013-07-15 22:50:11 -07:00
Karen Feng	6dc7c9bfb1	Removed job UI column, linked description to job UI	2013-07-15 16:33:50 -07:00
Karen Feng	fbf5aa761e	Removed log message, added field in master UI to link to log UI	2013-07-15 15:50:03 -07:00
Karen Feng	eac381a957	Merge branch 'ui-802' of https://github.com/karenfeng/spark into ui-802	2013-07-15 15:48:44 -07:00
Karen Feng	3955711250	Added field to master UI with link to job UI	2013-07-15 15:47:21 -07:00
Karen Feng	0d78b6d9cd	Links to job UI from standalone deploy cluster web UI: fixes SPARK-802	2013-07-15 13:47:38 -07:00
Karen Feng	b2aaa1199e	Adds app name in HTML page titles on job web UI: fixes SPARK-806	2013-07-15 11:44:42 -07:00
Matei Zaharia	d47c16f78d	Add an option to disable reference tracking in Kryo	2013-07-15 01:55:54 +00:00
Matei Zaharia	c7877d5e16	Merge pull request #689 from BlackNiuza/application_status Bug fix: SPARK-796	2013-07-14 12:58:13 -07:00
Matei Zaharia	10c05937bd	Merge pull request #699 from pwendell/ui-env Add `Environment` tab to SparkUI.	2013-07-14 11:45:18 -07:00
Patrick Wendell	4883586838	Responding to Matei's review	2013-07-14 10:37:26 -07:00
BlackNiuza	00556a94c9	add spaces before curly braces and after for if conditions	2013-07-14 17:04:53 +08:00
Matei Zaharia	b91a218cea	Cosmetic fixes to web UI	2013-07-14 07:31:33 +00:00
Matei Zaharia	a44a7b1238	Determine Spark core classes better in getCallSite	2013-07-14 07:23:09 +00:00
root	e271fde10b	Fixed a delay scheduling bug in the YARN branch, found by Patrick	2013-07-14 06:24:29 +00:00
Patrick Wendell	ddb97f0fdf	Add `Environment` tab to SparkUI. This adds a tab which displays system property and classpath information. This can be useful in debugging various types of issues such as: 1. Extra/incorrect Hadoop jars being included in the classpath 2. Spark launching with a different JRE version than intended 3. Spark system properties not being set to intended values 4. User added jars that conflict with Spark jars	2013-07-13 16:14:40 -07:00
Matei Zaharia	77c69ae5a0	Merge pull request #697 from pwendell/block-locations Show block locations in Web UI.	2013-07-12 23:05:21 -07:00
Matei Zaharia	5a7835c152	Merge pull request #691 from karenfeng/logpaging Create log pages	2013-07-12 20:28:21 -07:00
Matei Zaharia	71ccca0cc1	Merge pull request #696 from woggle/executor-env Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.sh	2013-07-12 20:25:06 -07:00
Matei Zaharia	90fc3f30cd	Merge pull request #692 from Reinvigorate/takeOrdered adding takeOrdered() to RDD	2013-07-12 20:23:36 -07:00
Patrick Wendell	08150f19ab	Minor style fix	2013-07-12 19:32:35 -07:00
Patrick Wendell	6855338e14	Show block locations in Web UI. This fixes SPARK-769. Support is added for enumerating the locations of blocks in the UI. There is also some minor cleanup in StorageUtils.	2013-07-12 19:30:32 -07:00
Karen Feng	73984b96a8	Removed unit test of nonexistent function Utils.lastNBytes	2013-07-12 14:26:56 -07:00
Charles Reiss	531a7e5574	Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.	2013-07-12 12:58:25 -07:00
seanm	a1662326e9	comment adjustment to takeOrdered	2013-07-12 08:38:19 -07:00
Andrew xia	2080e25006	Enhance job ui in spark ui system with adding pool information	2013-07-12 14:25:18 +08:00
seanm	a2c915fba8	giving order to top and making tests more clear	2013-07-11 18:55:00 -07:00
Karen Feng	5c67ca0278	Remove "Bytes" in lieu of String notation	2013-07-11 17:31:59 -07:00
Karen Feng	6d054487bf	Replace default buffer value to 100 GB, changed buttons to use String notation, removed default buffer parameter in UI URLs	2013-07-11 17:12:17 -07:00
Karen Feng	a32784109d	Fixed links for "Back to Master"	2013-07-11 16:57:55 -07:00
Karen Feng	ece2388585	Removed logPageLength from logPage	2013-07-11 16:35:56 -07:00
Karen Feng	9ed036ccdb	Replaced logPageLength with byteLength to prevent buffer shrink bug	2013-07-11 16:33:53 -07:00
Karen Feng	fdc226a14c	Clarified start and end byte variable names	2013-07-11 15:36:43 -07:00
Karen Feng	5d5dbc39f6	getByteRange moved to WorkerWebUI, takes converted parameters, returns only start/end offset	2013-07-11 15:22:45 -07:00
Karen Feng	15fd11d657	Removed redundant calls to request by logPage	2013-07-11 15:01:50 -07:00
Karen Feng	11872888ca	Created getByteRange function for logs and log pages, removed lastNBytes function	2013-07-11 14:56:37 -07:00
Matei Zaharia	018d04c64e	Merge pull request #684 from woggle/mesos-classloader Explicitly set class loader for MesosSchedulerDriver callbacks.	2013-07-11 12:48:37 -07:00
Karen Feng	e3a3fcf61b	Scrollbar on log pages appear automatically	2013-07-11 12:16:38 -07:00
Karen Feng	044d4577ec	Fixed capitalization of log page	2013-07-11 12:02:15 -07:00
Karen Feng	0ecc33f0c8	Added byte range, page title with log name, previous/next bytes buttons, initialization to end of log, large default buffer, buggy back to master link	2013-07-11 11:25:58 -07:00
Karen Feng	74bd3fc680	Added byte range on log pages	2013-07-10 15:44:28 -07:00
Karen Feng	24196c91f0	Changed buffer to 10,000 bytes, created scrollbar for fixed-height log	2013-07-10 15:27:52 -07:00
Karen Feng	f5f3b272f8	Fixed mixup of start/end, moved more import files	2013-07-10 14:52:29 -07:00
Karen Feng	dbe948d9a2	Moved appropriate import files from UISuite to UtilsSuite	2013-07-10 14:15:41 -07:00
Karen Feng	5f8a20b4a8	Moved unit tests for Utils from UISuite to UtilsSuite	2013-07-10 13:53:39 -07:00
Karen Feng	0d4580360b	Fixed docstring of offsetBytes to match params and wrapped for 100+ character lines	2013-07-10 13:24:26 -07:00
Karen Feng	04263e4d46	Made some minor style changes	2013-07-10 13:15:42 -07:00
Karen Feng	cfb6447ac4	Fixed for nonexistent bytes, added unit tests, changed stdout-page to stdout	2013-07-10 11:47:57 -07:00
seanm	ee4ce2fc51	adding takeOrdered to java API	2013-07-10 10:46:04 -07:00
seanm	24705d0f46	adding takeOrdered() to RDD	2013-07-10 10:33:11 -07:00
Karen Feng	620a6974c6	Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map	2013-07-10 10:20:53 -07:00
BlackNiuza	ce18b50d5f	set SUCCEEDED for all master in shutdown hook	2013-07-10 19:11:43 +08:00
Karen Feng	b6072b58bf	Fixes style, makes "std__-page" consistent, reads only parts of files	2013-07-09 17:25:10 -07:00
Karen Feng	13fc6f248c	Clean commit of log paging	2013-07-09 14:17:15 -07:00
BlackNiuza	aaa7b081df	according to mridulm's comments to adjust the code	2013-07-09 20:03:01 +08:00
Charles Reiss	e47253e0cc	Reset ClassLoader in MesosSchedulerBackend, too. (per review comments). Also set ClassLoader for all mesos callbacks, not just statusUpdate, registered.	2013-07-09 01:23:23 -07:00
BlackNiuza	c1d44be805	Bug fix: SPARK-796	2013-07-09 15:18:28 +08:00
Matei Zaharia	7dcda9ae74	Merge pull request #688 from markhamstra/scalaDependencies Fixed SPARK-795 with explicit dependencies	2013-07-08 23:24:23 -07:00
Mark Hamstra	0b39d66f3f	pom cleanup	2013-07-08 16:07:09 -07:00
Mark Hamstra	afdaf430bd	Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems	2013-07-08 15:40:50 -07:00
Charles Reiss	8c1d1c98e0	Explicitly set class loader for MesosSchedulerDriver callbacks.	2013-07-08 12:25:46 -07:00
Shivaram Venkataraman	4af0d63cb1	Remove akka LogLevel fix as we no longer use spray	2013-07-07 10:42:43 -07:00
Shivaram Venkataraman	d362d0f411	Ignore stderr when calling cat on a non-existing file	2013-07-07 04:09:46 -07:00
Shivaram Venkataraman	7d6d9e6ab2	Set DriverSuite log level to WARN	2013-07-07 04:09:15 -07:00
Shivaram Venkataraman	a948f06725	Suppress log messages in sbt test with two changes: 1. Set akka log level to ERROR before shutting down the actorSystem. This avoids akka log messages (like Spray) from falling back to INFO on the Stdout logger 2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that stack trace thrown during shutdown is handled by SLF4J instead of stdout	2013-07-07 04:09:08 -07:00
Patrick Wendell	32b9d21a97	Fix occasional failure in UI listener. If a task fails before the metrics are initialized, it remains possible that the metrics field will be `None`. This patch accounts for that possibility by keeping metrics as an `Option` at all times.	2013-07-06 16:40:02 -07:00
Matei Zaharia	1ffadb2d9e	Merge remote-tracking branch 'pwendell/ui-updates' Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala core/src/main/scala/spark/util/AkkaUtils.scala pom.xml	2013-07-06 15:51:41 -07:00
Matei Zaharia	94871e4703	Merge pull request #655 from tgravescs/master Add support for running Spark on Yarn on a secure Hadoop Cluster	2013-07-06 15:26:19 -07:00
Matei Zaharia	3f918b33f8	Merge pull request #672 from holdenk/master s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning	2013-07-06 12:45:18 -07:00
Matei Zaharia	2a36e5449b	Merge pull request #673 from xiajunluan/master Add config template file for fair scheduler feature	2013-07-06 12:43:21 -07:00
Matei Zaharia	7ba7fa110b	Merge pull request #674 from liancheng/master Bug fix: SPARK-789	2013-07-06 11:45:08 -07:00
BlackNiuza	44a2440039	Remove active job from idToActiveJob when job finished or aborted	2013-07-07 01:33:09 +08:00
Patrick Wendell	37abe84212	Tracking some task metrics even during failures.	2013-07-06 09:19:59 -07:00
Patrick Wendell	84b7fc54e6	Enforcing correct sort order for formatted strings	2013-07-05 17:21:08 -07:00
Matei Zaharia	399bd65ef5	Fixed compile error due to merge	2013-07-05 11:27:06 -07:00
Matei Zaharia	652ea0f1d8	Allow RDD.takeSample to give samples bigger than the RDD Before, when withReplacement was set to true, we would not get a sample bigger than the RDD's count(). Conflicts: core/src/main/scala/spark/RDD.scala core/src/test/scala/spark/RDDSuite.scala	2013-07-05 11:15:13 -07:00
Matei Zaharia	6586c5e28b	Added a SparkContext accessor to RDD	2013-07-05 11:13:46 -07:00
jerryshao	e4ff544a8d	Clean StageToInfos periodically when spark.cleaner.ttl is enabled	2013-07-05 10:34:45 +08:00
Lian Cheng	c0c3155c3c	Bug fix: SPARK-789 https://spark-project.atlassian.net/browse/SPARK-789	2013-07-05 00:54:10 +08:00
Andrew xia	6ccfb73ca9	Add fair scheduler config template file	2013-07-04 19:19:44 +08:00
Holden Karau	0f06d6217d	s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning	2013-07-04 01:05:39 -07:00
Gavin Li	94238aae57	fix dependencies	2013-07-03 18:08:38 +00:00
Gavin Li	96130c30d9	add compression codec trait and snappy compression	2013-07-03 05:49:04 +00:00
$Y.CORP.YAHOO.COM\tgraves$ Y.CORP.YAHOO.COM\tgraves	923cf92900	Rework from pull request. Removed --user option from Spark on Yarn Client, made the user of JAVA_HOME environment variable conditional on if its set, and created addCredentials in each of the SparkHadoopUtil classes to only add the credentials when the profile is hadoop2-yarn.	2013-07-02 21:18:59 -05:00
Patrick Wendell	39e2325675	Removing dead code	2013-07-02 16:28:40 -07:00
Patrick Wendell	8ca1cc1786	Adding truncation for log files	2013-07-02 16:10:50 -07:00
Patrick Wendell	9a42d04efa	Throw exception for missing resource	2013-07-01 14:43:13 -07:00
Patrick Wendell	1025d7d1ef	Package refactoring	2013-07-01 14:40:53 -07:00
Patrick Wendell	30b9034241	Fixing bug where logs aren't shown	2013-07-01 13:48:01 -07:00
Patrick Wendell	8688689387	Various formatting changes	2013-07-01 13:40:12 -07:00
Patrick Wendell	735c951a09	Adding test script	2013-07-01 09:33:22 -07:00
Patrick Wendell	5de326db7d	Print exception message	2013-07-01 09:19:45 -07:00
root	ec31e68d5d	Fixed PySpark perf regression by not using socket.makefile(), and improved debuggability by letting "print" statements show up in the executor's stderr Conflicts: core/src/main/scala/spark/api/python/PythonRDD.scala	2013-07-01 06:26:31 +00:00
root	3296d132b6	Fix performance bug with new Python code not using buffered streams	2013-07-01 06:25:43 +00:00
Matei Zaharia	03d0b858c8	Made use of spark.executor.memory setting consistent and documented it Conflicts: core/src/main/scala/spark/SparkContext.scala	2013-06-30 15:46:46 -07:00
Patrick Wendell	e721ff7e5a	Allowing details for failed stages	2013-06-29 11:26:30 -07:00
Patrick Wendell	473961d82e	Styling for progress bar	2013-06-29 08:38:04 -07:00
Patrick Wendell	249f0e54ba	Minor changes from Matei's review	2013-06-28 13:25:26 -07:00
Matei Zaharia	50ca17635a	Merge pull request #664 from pwendell/test-fix Removing incorrect test statement	2013-06-27 22:24:52 -07:00
Patrick Wendell	c537e869f3	Missing logo file	2013-06-27 22:02:03 -07:00
Patrick Wendell	c767e74370	Removing incorrect test statement	2013-06-27 21:48:58 -07:00
Patrick Wendell	62c2c6b856	Forcing Jetty to run as daemon	2013-06-27 21:47:22 -07:00
Patrick Wendell	a55190d314	Adding better tabs for UI headers.	2013-06-27 19:14:51 -07:00
Patrick Wendell	362d996c81	Handful of changes based on matei's review - Avoid exception when no tasks have finished for a stage - Adding DOCTYPE so css renders properly - Adding progress slider	2013-06-27 19:14:28 -07:00
Patrick Wendell	92a4c2a5f6	Fixing bug in local scheduler time recording	2013-06-27 12:33:06 -07:00
Stephen Haberman	d7011632d1	Wrap lines.	2013-06-26 12:35:57 -05:00
Patrick Wendell	ee692482a6	One more private class	2013-06-26 09:07:32 -07:00
Patrick Wendell	a59c15a37e	Adding config option for retained stages	2013-06-26 08:54:57 -07:00
Patrick Wendell	274193664a	Bumping timeouts	2013-06-26 08:51:28 -07:00
Patrick Wendell	b14ad509ba	Moving static ui package	2013-06-26 08:46:51 -07:00
Patrick Wendell	2cbaa0734b	Making all new classes package private	2013-06-26 08:44:55 -07:00
Stephen Haberman	d11025dc6a	Be cute with Option and getenv.	2013-06-26 09:53:35 -05:00
Matei Zaharia	9f0d913295	Refactored tests to share SparkContexts in some of them Creating these seems to take a while and clutters the output with Akka stuff, so it would be nice to share them.	2013-06-25 19:18:30 -04:00
Matei Zaharia	6c8d1b2ca6	Fix computation of classpath when we launch java directly The previous version assumed that a CLASSPATH environment variable was set by the "run" script when launching the process that starts the ExecutorRunner, but unfortunately this is not true in tests. Instead, we factor the classpath calculation into an external script and call that. NOTE: This includes a Windows version but hasn't yet been tested there.	2013-06-25 18:21:00 -04:00
Matei Zaharia	15b00914c5	Some fixes to the launch-java-directly change: - Split SPARK_JAVA_OPTS into multiple command-line arguments if it contains spaces; this splitting follows quoting rules in bash - Add the Scala JARs to the classpath if they're not in the CLASSPATH variable because the ExecutorRunner is launched with "scala" (this can happen when using local-cluster URLs in spark-shell)	2013-06-25 17:17:27 -04:00
Matei Zaharia	7680ce0bd6	Fixed deprecated use of expect in SizeEstimatorSuite	2013-06-25 16:11:44 -04:00
Matei Zaharia	7e0191c6ea	Merge remote-tracking branch 'cgrothaus/SPARK-698' Conflicts: run	2013-06-25 15:47:40 -04:00
Patrick Wendell	d66bd6f885	Adding another unit test to Web UI suite	2013-06-24 17:12:55 -07:00
Patrick Wendell	f7389330c3	Allowing for requested port on construction	2013-06-24 16:51:52 -07:00
Patrick Wendell	42157027f2	A few bug fixes and a unit test	2013-06-24 16:25:05 -07:00
Patrick Wendell	a4248138b4	Minor style cleanup	2013-06-24 14:22:28 -07:00
Patrick Wendell	b5e6e8bcc8	Cleaning up some code for Job Progress	2013-06-24 14:13:24 -07:00
Patrick Wendell	93e8ed85aa	Work around for initialization issue	2013-06-24 13:11:18 -07:00
Patrick Wendell	f6e64b5cd6	Updating based on changes to JobLogger (and one small change to JobLogger)	2013-06-24 12:40:41 -07:00
Matei Zaharia	78ffe164b3	Clone the zero value for each key in foldByKey The old version reused the object within each task, leading to overwriting of the object when a mutable type is used, which is expected to be common in fold. Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2013-06-23 10:26:53 -07:00
Matei Zaharia	0e0f9d3069	Fix search path for REPL class loader to really find added JARs	2013-06-22 17:44:04 -07:00
Matei Zaharia	3e61beff7b	Merge pull request #648 from shivaram/netty-dbg Shuffle fixes and cleanup	2013-06-22 16:22:47 -07:00
Patrick Wendell	7e9f1ed0de	Some cleanup of styling	2013-06-22 10:31:37 -07:00
Patrick Wendell	3b7ebdeeb8	Handling entirely failed stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	be6107ce44	Some tweaking with shared page header	2013-06-22 10:31:37 -07:00
Patrick Wendell	9a24d1a2d0	Using scala in XML imports	2013-06-22 10:31:37 -07:00
Patrick Wendell	f91e1c4822	Linking RDD information when available in stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	a86bb459e2	Showing shuffle status and purging old stages	2013-06-22 10:31:37 -07:00
Patrick Wendell	3485e73376	Style cleanup	2013-06-22 10:31:37 -07:00
Patrick Wendell	dd696f3a3d	Some renaming and comments	2013-06-22 10:31:37 -07:00
Patrick Wendell	5c872e9ef5	Documentation and some refactoring	2013-06-22 10:31:37 -07:00
Patrick Wendell	17776323a6	More work on percentile data:	2013-06-22 10:31:37 -07:00
Patrick Wendell	dcf6a68177	Refactoring into different modules	2013-06-22 10:31:36 -07:00
Patrick Wendell	ce81c320ac	Adding helper function to make listing tables	2013-06-22 10:31:36 -07:00
Patrick Wendell	9fd5dc3ea9	Initial steps towards job progress UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	bc4a811c57	Stash	2013-06-22 10:31:36 -07:00
Patrick Wendell	77c53f7868	Refactoring UI packages	2013-06-22 10:31:36 -07:00
Patrick Wendell	8b5c7e71c4	Import cleanup	2013-06-22 10:31:36 -07:00
Patrick Wendell	32a45d01b1	Removing twirl files	2013-06-22 10:31:36 -07:00
Patrick Wendell	17f145f3bc	Updating Maven build	2013-06-22 10:31:36 -07:00
Patrick Wendell	4e1f202481	Removing dead code	2013-06-22 10:31:36 -07:00
Patrick Wendell	d6fde4ffe4	Some JSON cleanup	2013-06-22 10:31:36 -07:00
Patrick Wendell	91ec5a1a04	Changing JSON protocol and removing spray code	2013-06-22 10:31:36 -07:00
Patrick Wendell	fc94576ece	Adding worker version of UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	ee73c09ac9	Some comments	2013-06-22 10:31:36 -07:00
Patrick Wendell	9161db5478	Cleaning up master web UI	2013-06-22 10:31:36 -07:00
Patrick Wendell	e55cf0245f	Adding WebUI file	2013-06-22 10:31:35 -07:00
Patrick Wendell	f85fd7a793	Commenting unfinished part	2013-06-22 10:31:35 -07:00
Patrick Wendell	2c36a514aa	Spray refactoring for master web UI	2013-06-22 10:31:35 -07:00
Patrick Wendell	7e6977b6c5	Fix in storage status page	2013-06-22 10:31:35 -07:00
Patrick Wendell	950f83535a	Adding deterministic port	2013-06-22 10:31:35 -07:00
Patrick Wendell	7cd70dc2c1	Minor cleanup	2013-06-22 10:31:35 -07:00
Patrick Wendell	e66f570194	Completely hacked version of block manager UI in jetty	2013-06-22 10:31:35 -07:00
Patrick Wendell	60fbf7e461	Partially working checkpoint	2013-06-22 10:31:35 -07:00
Matei Zaharia	1ef5d0d2c9	Merge pull request #644 from shimingfei/joblogger add Joblogger to Spark (on new Spark code)	2013-06-22 09:35:57 -07:00
Jey Kottalam	1ba3c17303	use parens when calling method with side-effects	2013-06-21 12:14:16 -04:00
Jey Kottalam	edb18ca928	Rename PythonWorker to PythonWorkerFactory	2013-06-21 12:14:16 -04:00
Jey Kottalam	62c4781400	Add tests and fixes for Python daemon shutdown	2013-06-21 12:14:16 -04:00
Jey Kottalam	c79a6078c3	Prefork Python worker processes	2013-06-21 12:14:16 -04:00
Jey Kottalam	40afe0d2a5	Add Python timing instrumentation	2013-06-21 12:14:16 -04:00
Mingfei	2fc794a6c7	small modify in DAGScheduler	2013-06-21 18:21:35 +08:00
Mingfei	4b9862ac9c	small format modification	2013-06-21 17:55:32 +08:00
Mingfei	aa7aa587be	some format modification	2013-06-21 17:48:41 +08:00
Mingfei	5240795154	edit according to comments	2013-06-21 17:38:23 +08:00
Matei Zaharia	71030ba3eb	Merge pull request #654 from lyogavin/enhance_pipe fix typo and coding style in #638	2013-06-19 15:21:03 -07:00
Thomas Graves	bad51c7cb4	upmerge with latest mesos/spark master and fix hbase compile with hadoop2-yarn profile	2013-06-19 14:39:13 -05:00
Thomas Graves	75d78c7ac9	Add support for Spark on Yarn on a secure Hadoop cluster	2013-06-19 11:18:42 -05:00
Matei Zaharia	7902baddc7	Update ASM to version 4.0	2013-06-19 13:34:30 +02:00
Gavin Li	0a2a9bce1e	fix typo and coding style	2013-06-18 21:30:13 +00:00
jerryshao	1e9269c3ee	reduce ZippedPartitionsRDD's getPreferredLocations complexity	2013-06-18 09:49:06 +08:00
Matei Zaharia	db42451a52	Merge pull request #643 from adatao/master Bug fix: Zero-length partitions result in NaN for overall mean & variance	2013-06-17 15:26:36 -07:00
Matei Zaharia	e82a2ffcc9	Merge pull request #653 from rxin/logging SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory."	2013-06-17 15:13:15 -07:00
Matei Zaharia	ec193c7d89	Merge remote-tracking branch 'xiajunluan/xiajunluan' Conflicts: core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala	2013-06-18 00:11:50 +02:00
Reynold Xin	be3c406edf	Fixed the typo pointed out by Matei.	2013-06-17 17:07:51 -04:00
Reynold Xin	1450296797	SPARK-781: Log the temp directory path when Spark says "Failed to create temp directory".	2013-06-17 16:58:23 -04:00
Gavin Li	4508089fc3	refine comments and add sc.clean	2013-06-17 05:23:46 +00:00
Gavin Li	e6ae049283	Merge remote-tracking branch 'upstream1/master' into enhance_pipe	2013-06-16 22:53:39 +00:00
Gavin Li	fb6d733fa8	update according to comments	2013-06-16 22:32:55 +00:00
Christopher Nguyen	f91195cc15	Import just scala.math.abs rather than scala.math._	2013-06-16 01:29:53 -07:00
Christopher Nguyen	5c886194e4	Move zero-length partition testing from JavaAPISuite.java to PartitioningSuite.scala	2013-06-16 01:23:48 -07:00
Christopher Nguyen	479442a9b9	Add zeroLengthPartitions() test to make sure, e.g., StatCounter.scala can handle empty partitions without incorrectly returning NaN	2013-06-15 17:35:55 -07:00
Matei Zaharia	f961aac8b2	Merge pull request #649 from ryanlecompte/master Add top K method to RDD using a bounded priority queue	2013-06-15 00:53:41 -07:00
ryanlecompte	e8801d4490	use delegation for BoundedPriorityQueue, add Java API	2013-06-14 23:39:05 -07:00
Andrew xia	53add598f2	Update LocalSchedulerSuite to avoid using sleep for task launch	2013-06-15 01:46:13 +08:00
Reynold Xin	2cc188fd54	SPARK-774: cogroup should also disable map side combine by default	2013-06-14 00:10:54 -07:00
Reynold Xin	6738178d0d	SPARK-772: groupByKey should disable map side combine.	2013-06-13 23:59:42 -07:00
ryanlecompte	93b3f5e535	drop unneeded ClassManifest implicit	2013-06-13 16:26:35 -07:00
ryanlecompte	44b8dbaede	use Iterator.single(elem) instead of Iterator(elem) for improved performance based on scaladocs	2013-06-13 16:23:15 -07:00
Shivaram Venkataraman	1d9f0df065	Fix some comments and style	2013-06-13 14:46:25 -07:00
Mingfei	967a6a699d	modify sparklister function interface according to comments	2013-06-13 14:36:07 +08:00
Shivaram Venkataraman	5da4287b1d	Merge branch 'netty-dbg' of github.com:shivaram/spark into netty-dbg	2013-06-12 16:38:37 -07:00
Shivaram Venkataraman	5e9a9317c5	Merge branch 'master' of git://github.com/mesos/spark into netty-dbg	2013-06-12 16:38:01 -07:00
ryanlecompte	db5bca08ff	add a new top K method to RDD using a bounded priority queue	2013-06-12 10:54:16 -07:00
Patrick Wendell	fd6148c8b2	Removing print statement	2013-06-10 10:27:25 -07:00
Andrew xia	190ec61799	change code style and debug info	2013-06-10 15:27:02 +08:00
Patrick Wendell	ef14dc2e77	Adding Java-API version of compression codec	2013-06-09 18:09:46 -07:00
Patrick Wendell	df592192e7	Monads FTW	2013-06-09 18:09:24 -07:00
Patrick Wendell	083a3485ab	Clean extra whitespace	2013-06-09 11:49:33 -07:00
Patrick Wendell	d1bbcebae5	Adding compression to Hadoop save functions	2013-06-09 11:39:35 -07:00
Mingfei	ade822011d	not check return value of eventQueue.take	2013-06-08 16:26:45 +08:00
Mingfei	4fd86e0e10	delete test code for joblogger in SparkContext	2013-06-08 15:45:47 +08:00
Mingfei	362f0f93ac	Merge branch 'master' of https://github.com/mesos/spark	2013-06-08 15:20:13 +08:00
Mingfei	1a4d93c025	modify to pass job annotation by localProperties and use daemon thread to do joblogger's work	2013-06-08 14:23:39 +08:00
Matei Zaharia	b58a29295b	Small formatting and style fixes	2013-06-07 22:51:28 -07:00
Matei Zaharia	c8fc423bc2	Merge pull request #631 from jerryshao/master Fix block manager UI display issue when enable spark.cleaner.ttl	2013-06-07 22:43:18 -07:00
Matei Zaharia	c9ca0a4a58	Small code style fix to SchedulingAlgorithm.scala	2013-06-07 22:40:44 -07:00
Matei Zaharia	1ae60bcb36	Merge pull request #634 from xiajunluan/master [Spark-753] Fix ClusterSchedulSuite unit test failed	2013-06-07 22:39:06 -07:00
Shivaram Venkataraman	ac480fd977	Clean up variables and counters in BlockFetcherIterator	2013-06-06 16:34:27 -07:00
Gavin Li	e179ff8a32	update according to comments	2013-06-05 22:41:05 +00:00
Shivaram Venkataraman	cb2f5046ee	Pass in bufferSize to BufferedOutputStream	2013-06-05 15:09:02 -07:00
Shivaram Venkataraman	c851957fe4	Don't write zero block files with java serializer	2013-06-05 14:28:38 -07:00
Christopher Nguyen	9d35904357	In the current code, when both partitions happen to have zero-length, the return mean will be NaN. Consequently, the result of mean after reducing over all partitions will also be NaN, which is not correct if there are partitions with non-zero length. This patch fixes this issue.	2013-06-04 22:12:47 -07:00
Matei Zaharia	fff3728552	Merge pull request #640 from pwendell/timeout-update Fixing bug in BlockManager timeout	2013-06-04 16:09:50 -07:00
Patrick Wendell	061fd3ae36	Fixing bug in BlockManager timeout	2013-06-04 19:02:44 -04:00
Matei Zaharia	f420d4f228	Merge pull request #639 from pwendell/timeout-update Bump akka and blockmanager timeouts to 60 seconds	2013-06-04 15:25:58 -07:00
Patrick Wendell	8bd4e12104	Bump akka and blockmanager timeouts to 60 seconds	2013-06-04 18:14:24 -04:00
Shivaram Venkataraman	96943a1cc0	var to val	2013-06-03 12:29:38 -07:00
Shivaram Venkataraman	cd347f547a	Reuse the file object as it is valid after delete	2013-06-03 12:27:51 -07:00
Shivaram Venkataraman	a058b0acf3	Delete a file for a block if it already exists.	2013-06-03 12:10:00 -07:00
Andrew xia	606bb1b450	Fix schedulingAlgorithm bugs for unit test	2013-06-03 10:29:23 +08:00
Gavin Li	4a9913d66a	add ut for pipe enhancement	2013-06-02 23:21:09 +00:00
Shivaram Venkataraman	038cfc1a9a	Make connect timeout configurable	2013-05-31 23:32:18 -07:00
Shivaram Venkataraman	91aca92249	Another round of Netty fixes. 1. Avoid race condition between stop and copier completion 2. Handle socket exceptions by reporting them and filling in a failed FetchResult	2013-05-31 23:21:38 -07:00
Gavin Li	9f84315c05	enhance pipe to support what we can do in hadoop streaming	2013-06-01 00:26:10 +00:00
Reynold Xin	de1167bf2c	Incorporated Charles' feedback to put rdd metadata removal in BlockManagerMasterActor.	2013-05-31 15:54:57 -07:00
Reynold Xin	ba5e544461	More block manager cleanup. Implemented a removeRdd method in BlockManager, and use that to implement RDD.unpersist. Previously, unpersist needs to send B akka messages, where B = number of blocks. Now unpersist only needs to send W akka messages, where W = the number of workers.	2013-05-31 01:48:16 -07:00
jerryshao	926f41cc52	fix block manager UI display issue when enable spark.cleaner.ttl	2013-05-31 09:32:52 +08:00
Reynold Xin	f6ad3781b1	Fixed the flaky unpersist test in RDDSuite.	2013-05-30 16:28:08 -07:00
Reynold Xin	bed1b08169	Do not create symlink for local add file. Instead, copy the file. This prevents Spark from changing the original file's permission, and also allow add file to work on non-posix operating systems.	2013-05-30 16:21:49 -07:00
Shivaram Venkataraman	3b0cd17343	Merge branch 'master' of git://github.com/mesos/spark Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2013-05-30 14:36:24 -07:00
Andrew xia	c3db3ea554	1. Add unit test for local scheduler 2. Move localTaskSetManager to a new file	2013-05-30 20:49:40 +08:00
Andrew xia	ecceb101d3	implement FIFO and fair scheduler for spark local mode	2013-05-30 10:43:01 +08:00
Shivaram Venkataraman	19fd6d54c0	Also flush serializer in revertPartialWrites	2013-05-29 17:29:34 -07:00
Shivaram Venkataraman	618c8cae1e	Skip fetching zero-sized blocks in OIO. Also unify splitLocalRemoteBlocks for netty/nio and add a test case	2013-05-29 13:18:54 -07:00
Matei Zaharia	6ed71390d9	Merge pull request #626 from stephenh/remove-add-if-no-port Remove unused addIfNoPort.	2013-05-29 10:14:22 -07:00
Shivaram Venkataraman	b79b10a6d6	Flush serializer to fix zero-size kryo blocks bug. Also convert the local-cluster test case to check for non-zero block sizes	2013-05-29 00:52:55 -07:00
Matei Zaharia	41d230ccb0	Merge pull request #611 from squito/classloader Use default classloaders for akka & deserializing task results	2013-05-28 23:35:24 -07:00
Shivaram Venkataraman	fbc1ab3468	Couple of Netty fixes a. Fix the port number by reading it from the bound channel b. Fix the shutdown sequence to make sure we actually block on the channel c. Fix the unit test to use two JVMs.	2013-05-28 16:27:16 -07:00
Stephen Haberman	4fe1fbdd51	Remove unused addIfNoPort.	2013-05-28 16:26:32 -05:00
Matei Zaharia	3db1e17baa	Merge pull request #620 from jerryshao/master Fix CheckpointRDD java.io.FileNotFoundException when calling getPreferredLocations	2013-05-27 21:31:43 -07:00
Matei Zaharia	e8d4b6c296	Merge pull request #529 from xiajunluan/master [SPARK-663]Implement Fair Scheduler in Spark Cluster Scheduler	2013-05-25 21:09:03 -07:00
Reynold Xin	6bbbe01287	Fixed a stupid mistake that NonJavaSerializableClass was made Java serializable.	2013-05-24 16:51:45 -07:00
Reynold Xin	26962c9340	Automatically configure Netty port. This makes unit tests using local-cluster pass. Previously they were failing because Netty was trying to bind to the same port for all processes. Pair programmed with @shivaram.	2013-05-24 16:39:33 -07:00
Reynold Xin	6ea085169d	Fixed the bug that shuffle serializer is ignored by the new shuffle block iterators for local blocks. Also added a unit test for that.	2013-05-24 14:08:37 -07:00
jerryshao	bd3ea8f2a6	fix CheckpointRDD getPreferredLocations java.io.FileNotFoundException	2013-05-24 14:26:19 +08:00
Matei Zaharia	a2b0a7975c	Merge pull request #619 from woggling/adjust-sampling Use ARRAY_SAMPLE_SIZE constant instead of hard-coded 100.0 in SizeEstimator	2013-05-21 18:16:20 -07:00
Charles Reiss	f350f14084	Use ARRAY_SAMPLE_SIZE constant instead of 100.0	2013-05-21 18:11:33 -07:00
Charles Reiss	786c97b87c	DistributedSuite: remove dead test code	2013-05-21 11:35:49 -07:00
Andrew xia	ecd6d75c6a	fix bug of unit tests	2013-05-21 06:49:23 +08:00
Reynold Xin	5912cc4967	Merge pull request #610 from JoshRosen/spark-747 Throw exception if TaskResult exceeds Akka frame size	2013-05-17 19:58:40 -07:00
Reynold Xin	8d78c5f89f	Changed the logging level from info to warning when addJar(null) is called.	2013-05-17 18:51:35 -07:00
Reynold Xin	6729c2ead8	Merge branch 'master' of github.com:mesos/spark	2013-05-17 17:58:06 -07:00
Andrew xia	3d4672eaa9	Merge branch 'master' into xiajunluan Conflicts: core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/scheduler/cluster/ClusterScheduler.scala core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala	2013-05-18 07:28:03 +08:00
Andrew xia	d19753b9c7	expose TaskSetManager type to resourceOffer function in ClusterScheduler	2013-05-18 06:45:19 +08:00
Reynold Xin	61cf176238	Added dependency on netty-all in Maven.	2013-05-16 14:31:26 -07:00
Andrew xia	c6e2770bfe	Fix ClusterScheduler bug to avoid allocating tasks to same slave	2013-05-17 05:10:38 +08:00
Mridul Muralidharan	f0881f8d48	Hope this does not turn into a bike shed change	2013-05-17 01:58:50 +05:30
Mridul Muralidharan	feddd2530d	Filter out nulls - prevent NPE	2013-05-16 17:49:14 +05:30
Josh Rosen	b8e46b6074	Abort job if result exceeds Akka frame size; add test.	2013-05-16 01:57:57 -07:00
Matei Zaharia	2f576aba8f	Merge pull request #602 from rxin/shufflemerge Manual merge & cleanup of Shane's Shuffle Performance Optimization	2013-05-15 18:06:24 -07:00
Reynold Xin	203d7b7c14	Merge pull request #593 from squito/driver_ui_link Master UI has link to Application UI	2013-05-15 00:47:20 -07:00
Reynold Xin	f3491cb89b	Merge branch 'master' of github.com:mesos/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/test/scala/spark/DistributedSuite.scala project/SparkBuild.scala	2013-05-15 00:31:52 -07:00
Reynold Xin	f9d40a5848	Added a comment in JdbcRDD for example usage.	2013-05-14 23:29:57 -07:00
Reynold Xin	404f9ff617	Added derby dependency to Maven pom files for the JDBC Java test.	2013-05-14 23:28:34 -07:00
Reynold Xin	81ad2fa331	Merge branch 'jdbc' of github.com:koeninger/spark Conflicts: project/SparkBuild.scala	2013-05-14 23:12:00 -07:00
Imran Rashid	38d4b97c6d	use threads classloader when deserializing task results; classnotfoundexception includes classloader	2013-05-14 22:32:14 -07:00
Imran Rashid	d7d1da79d3	when akka starts, use akkas default classloader (current thread)	2013-05-14 22:32:09 -07:00
Cody Koeninger	b16c4896f6	add test for JdbcRDD using embedded derby, per rxin suggestion	2013-05-14 23:44:04 -05:00
Matei Zaharia	016ac86830	Merge pull request #601 from rxin/emptyrdd-master EmptyRDD (master branch 0.8)	2013-05-13 21:45:36 -07:00
Matei Zaharia	4b354e0a08	Merge pull request #589 from mridulm/master Add support for instance local scheduling	2013-05-13 17:39:19 -07:00
Patrick Wendell	7f0833647b	Capturing class name	2013-05-12 07:54:03 -07:00
Patrick Wendell	72b9c4cb6e	Small fix	2013-05-11 23:53:50 -07:00
Patrick Wendell	1c15b85051	Removing import	2013-05-11 23:52:53 -07:00
Patrick Wendell	059ab88754	Changing technique to use same code path in all cases	2013-05-11 23:50:54 -07:00
Cody Koeninger	3da2305ed0	code cleanup per rxin comments	2013-05-11 23:59:07 -05:00
Josh Rosen	440719109e	Throw exception if task result exceeds Akka frame size. This partially addresses SPARK-747.	2013-05-11 19:17:13 -07:00
Patrick Wendell	a5c28bb888	Removing unnecessary map	2013-05-11 14:20:39 -07:00
Patrick Wendell	0345954530	SPARK-738: Spark should detect and squash nonserializable exceptions	2013-05-11 14:17:09 -07:00
Mark Hamstra	6e6b3e0d7e	Actually use the cleaned closure in foreachPartition	2013-05-10 13:02:34 -07:00
Mridul Muralidharan	b05c9d22d7	Remove explicit hardcoding of yarn-standalone as args(0) if it is missing.	2013-05-09 18:49:12 +05:30
Imran Rashid	0ab818d508	fix linebreak	2013-05-09 00:38:59 -07:00
Reynold Xin	9cafacf32d	Added test for Netty suite.	2013-05-07 22:42:37 -07:00
Reynold Xin	5d70ee4663	Cleaned up connection manager (moved many classes to their own files).	2013-05-07 22:42:15 -07:00
Reynold Xin	8388e8dd7a	Minor style fix in DiskStore...	2013-05-07 18:40:35 -07:00
Reynold Xin	547dcbe494	Cleaned up Scala files in network/netty from Shane's PR.	2013-05-07 18:39:33 -07:00
Reynold Xin	9e64396ca4	Cleaned up the Java files from Shane's PR.	2013-05-07 18:30:54 -07:00
Reynold Xin	0e5cc30868	Cleaned up BlockManager and BlockFetcherIterator from Shane's PR.	2013-05-07 18:18:24 -07:00
Reynold Xin	8b79485171	Moved BlockFetcherIterator to its own file.	2013-05-07 17:02:32 -07:00
Reynold Xin	90577ada69	Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/DiskStore.scala project/SparkBuild.scala	2013-05-07 15:56:19 -07:00
Jey Kottalam	aacca1b8a8	Update Maven build to Scala 2.9.3	2013-05-07 14:39:44 -07:00
Reynold Xin	64d4d2b036	Added tests for joins, cogroups, and unions for EmptyRDD.	2013-05-06 16:30:46 -07:00
Reynold Xin	0fd84965f6	Added EmptyRDD.	2013-05-06 15:40:34 -07:00
Imran Rashid	22a5063ae4	switch from separating appUI host & port to combining into just appUiUrl	2013-05-05 12:19:11 -07:00
Matei Zaharia	7af92f248b	Merge pull request #597 from JoshRosen/webui-fixes Two minor bug fixes for Spark Web UI	2013-05-04 22:29:17 -07:00
Reynold Xin	0a2bed356b	Fixed flaky unpersist test in DistributedSuite.	2013-05-04 21:50:08 -07:00
Reynold Xin	62a077cd08	Merge branch 'unpersist-test' of github.com:shivaram/spark into blockmanager	2013-05-04 21:49:50 -07:00
Josh Rosen	42b1953c53	Fix SPARK-630: app details page shows finished executors as running.	2013-05-04 18:34:47 -07:00
Josh Rosen	c0688451a6	Fix wrong closing tags in web UI HTML.	2013-05-04 18:34:46 -07:00
Josh Rosen	d48e9fde01	Fix SPARK-629: weird number of cores in job details page.	2013-05-04 18:34:45 -07:00
Mridul Muralidharan	25198d7e9e	Merge branch 'master' of github.com:mridulm/spark	2013-05-04 20:45:56 +05:30
Mridul Muralidharan	5b011d18d7	Merge from master	2013-05-04 20:41:27 +05:30
Mridul Muralidharan	edb57c8331	Add support for instance local in getPreferredLocations of ZippedPartitionsBaseRDD. Add comments to both ZippedPartitionsBaseRDD and ZippedRDD to better describe the potential problem with the approach	2013-05-04 19:47:45 +05:30
Matei Zaharia	3bf2c868c3	Merge pull request #594 from shivaram/master Add zip partitions to Java API	2013-05-03 18:27:30 -07:00
Shivaram Venkataraman	2274ad0786	Fix flaky test by changing catch and adding sleep	2013-05-03 16:35:35 -07:00
Shivaram Venkataraman	bb8a434f9d	Add zipPartitions to Java API.	2013-05-03 15:14:02 -07:00
Imran Rashid	6fae936088	applications (aka drivers) send their webUI address to master when registering so it can be displayed in the master web ui	2013-05-03 12:59:10 -07:00
Mridul Muralidharan	ea2a6f91d3	pull from master	2013-05-04 00:35:59 +05:30
Reynold Xin	93091f6936	Merge branch 'master' of github.com:mesos/spark into blockmanager	2013-05-03 01:02:32 -07:00
Reynold Xin	2bc895a829	Updated according to Matei's code review comment.	2013-05-03 01:02:16 -07:00
Mridul Muralidharan	11589c39d9	Fix ZippedRDD as part Matei's suggestion	2013-05-03 12:23:30 +05:30
Matei Zaharia	6fe9d4e61e	Merge pull request #592 from woggling/localdir-fix Don't accept generated local directory names that can't be created	2013-05-02 21:33:56 -07:00
Matei Zaharia	538ee755b4	Merge pull request #581 from jerryshao/master fix [SPARK-740] block manage UI throws exception when enabling Spark Streaming	2013-05-02 09:01:42 -07:00
Charles Reiss	c847dd3da2	Don't accept generated temp directory names that can't be created successfully.	2013-05-01 23:19:10 -07:00
Reynold Xin	4a31877408	Added the unpersist api to JavaRDD.	2013-05-01 20:31:54 -07:00
Reynold Xin	98df9d2853	Added removeRdd function in BlockManager.	2013-05-01 20:17:09 -07:00
Mridul Muralidharan	dfde9ce9dd	comment out debug versions of checkHost, etc from Utils - which were used to test	2013-05-02 07:41:33 +05:30
Mridul Muralidharan	1b5aaeadc7	Integrate review comments 2	2013-05-02 07:30:06 +05:30
jerryshao	c047f0e3ad	filter out Spark streaming block RDD and sort RDDInfo with id	2013-05-02 09:48:32 +08:00
Mridul Muralidharan	609a817f52	Integrate review comments on pull request	2013-05-02 06:44:33 +05:30
Reynold Xin	204eb32e14	Changed the type of the persistentRdds hashmap back to TimeStampedHashMap.	2013-05-01 16:14:58 -07:00
Reynold Xin	34637b97ec	Added SparkContext.cleanup back. Not sure why it was removed before ...	2013-05-01 16:12:37 -07:00
Reynold Xin	3227ec8edd	Cleaned up Ram's code. Moved SparkContext.remove to RDD.unpersist. Also updated unit tests to make sure they are properly testing for concurrency.	2013-05-01 16:07:44 -07:00
harshars	8481562731	Merged Ram's commit on removing RDDs. Conflicts: core/src/main/scala/spark/SparkContext.scala	2013-05-01 14:42:17 -07:00
Mridul Muralidharan	27764a00f4	Fix some npe introduced accidentally	2013-05-01 20:56:05 +05:30
Mridul Muralidharan	d960e7e0f8	a) Add support for hyper local scheduling - specific to a host + port - before trying host local scheduling. b) Add some fixes to test code to ensure it passes (and fixes some other issues). c) Fix bug in task scheduling which incorrectly used availableCores instead of all cores on the node.	2013-05-01 20:24:00 +05:30
Matei Zaharia	aa8fe1a209	Merge pull request #586 from mridulm/master Pull request to address issues Reynold Xin reported	2013-04-30 22:30:18 -07:00
Reynold Xin	dd7bef3147	Two minor fixes according to Ryan LeCompte's review.	2013-04-30 15:02:32 -07:00
Reynold Xin	cea6174573	Merge branch 'master' of github.com:mesos/spark into blockmanager Conflicts: core/src/main/scala/spark/BlockStoreShuffleFetcher.scala	2013-04-30 13:28:35 -07:00
Mridul Muralidharan	60cabb35cb	Add addition catch block for exception too	2013-05-01 01:17:14 +05:30
Mridul Muralidharan	3b748ced22	Be more aggressive and defensive in all uses of SelectionKey in select loop	2013-05-01 00:30:30 +05:30
Mridul Muralidharan	0f45477be1	Change indentation	2013-05-01 00:10:02 +05:30
Mridul Muralidharan	538614acfe	Be more aggressive and defensive in select also	2013-05-01 00:05:32 +05:30
Mridul Muralidharan	48854e1dbf	If key is not valid, close connection	2013-04-30 23:59:33 +05:30
Matei Zaharia	f708dda81e	Merge pull request #585 from pwendell/listener-perf [Fix SPARK-742] Task Metrics should not employ per-record timing by default	2013-04-30 07:51:40 -07:00
Mridul Muralidharan	e46d547ccd	Fix issues reported by Reynold	2013-04-30 16:15:56 +05:30
Reynold Xin	1055785a83	Allow specifying the shuffle write file buffer size. The default buffer size is 8KB in FastBufferedOutputStream, which is too small and would cause a lot of disk seeks.	2013-04-29 23:33:56 -07:00
Reynold Xin	7007201201	Added a shuffle block manager so it is easier in the future to consolidate shuffle output files.	2013-04-29 23:07:03 -07:00
Reynold Xin	d3586ef438	Merge branch 'blockmanager' of github.com:rxin/spark into blockmanager Conflicts: core/src/main/scala/spark/storage/DiskStore.scala	2013-04-29 15:44:18 -07:00
Patrick Wendell	016ce1fa9c	Using full package name for util	2013-04-29 12:02:27 -07:00
Patrick Wendell	540be6b154	Modified version of the fix which just removes all per-record tracking.	2013-04-29 11:32:07 -07:00
Patrick Wendell	224fbac061	Spark-742: TaskMetrics should not employ per-record timing. This patch does three things: 1. Makes TimedIterator a trait with two implementations (one a no-op) 2. Makes the default behavior to use the no-op implementation 3. Removes DelegateBlockFetchTracker. This is just cleanup, but it seems like the trait doesn't really reduce complexity in any way. In the future we can add other implementations, e.g. ones which perform sampling.	2013-04-29 11:13:43 -07:00
Matei Zaharia	0f45347c7b	More unit test fixes	2013-04-28 22:29:27 -07:00
Matei Zaharia	bce4089f22	Fix BlockManagerSuite to deal with clearing spark.hostPort	2013-04-28 22:23:48 -07:00
Matei Zaharia	68c07ea198	Merge pull request #582 from shivaram/master Add zip partitions interface	2013-04-28 20:19:33 -07:00
Shivaram Venkataraman	604d3bf56c	Rename partition class and add scala doc	2013-04-28 16:31:07 -07:00
Shivaram Venkataraman	15acd49f07	Actually rename classes to ZippedPartitions* (the previous commit only renamed the file)	2013-04-28 16:03:22 -07:00
Shivaram Venkataraman	6e84635ab9	Rename classes from MapZipped* to Zipped*	2013-04-28 15:58:40 -07:00
Mridul Muralidharan	afee902443	Attempt to fix streaming test failures after yarn branch merge	2013-04-28 22:26:45 +05:30
Shivaram Venkataraman	0cc6642b7c	Rename to zipPartitions and style changes	2013-04-28 05:11:03 -07:00
Shivaram Venkataraman	c9c4954d99	Add an interface to zip iterators of multiple RDDs The current code supports 2, 3 or 4 arguments but can be extended to more arguments if required.	2013-04-26 16:57:46 -07:00
Matei Zaharia	6e6b5204ea	Create an empty directory when checkpointing a 0-partition RDD (fixes a test failure on Hadoop 2.0)	2013-04-25 00:42:37 -07:00
Reynold Xin	ba6ffa6a5f	Allow the specification of a shuffle serializer in the read path (for local block reads).	2013-04-24 17:38:07 -07:00
Reynold Xin	aa618ed2a2	Allow changing the serializer on a per shuffle basis.	2013-04-24 14:52:49 -07:00
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
Reynold Xin	31ce6c66d6	Added a BlockObjectWriter interface in block manager so ShuffleMapTask doesn't need to build up an array buffer for each shuffle bucket.	2013-04-23 17:48:59 -07:00
Mridul Muralidharan	8faf5c51c3	Patch from Thomas Graves to improve the YARN Client, and move to more production ready hadoop yarn branch	2013-04-24 02:31:57 +05:30
koeninger	dfac0aa5c2	prevent mysql driver from pulling entire resultset into memory. explicitly close resultset and statement.	2013-04-22 21:12:52 -05:00
Mridul Muralidharan	7acab3ab45	Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo	2013-04-22 08:01:13 +05:30
koeninger	b2a3f24dde	first attempt at an RDD to pull data from JDBC sources	2013-04-21 00:29:37 -05:00
Mridul Muralidharan	ac2e8e8720	Add some basic documentation	2013-04-19 00:13:19 +05:30
Andrew xia	8436bd5d4a	remove TaskSetQueueManager and update code style	2013-04-19 02:17:22 +08:00
Andrew xia	e0603d7e8b	refactor the Schedulable interface and add unit test for SchedulingAlgorithm	2013-04-18 13:13:54 +08:00
Mridul Muralidharan	5ee2f5c483	Cache pattern, add (commented out) alternatives for check* apis	2013-04-17 23:13:34 +05:30
Mridul Muralidharan	f07961060d	Add a small note on spark.tasks.schedule.aggression	2013-04-17 23:13:02 +05:30
Mridul Muralidharan	02dffd2eb0	Ensure all ask/await block for spark.akka.askTimeout - so that it is controllable : instead of arbitrary timeouts spread across codebase. In our tests, we use 30 seconds, though default of 10 is maintained	2013-04-17 05:52:57 +05:30
Mridul Muralidharan	a402b23bcd	Fudge order of classpath - so that our jars take precedence over what is in CLASSPATH variable. Sounds logical, hope there is no issue cos of it	2013-04-17 05:52:00 +05:30
Mridul Muralidharan	bcdde331c3	Move from master to driver	2013-04-17 04:12:18 +05:30
Mridul Muralidharan	ad80f68eb5	remove spurious debug statements	2013-04-16 22:15:34 +05:30
Mridul Muralidharan	f7969f72ee	Fix exception when checkpoint path does not exist (no data in rdd which is being checkpointed for example)	2013-04-16 21:51:38 +05:30
Mridul Muralidharan	323ab8ff3b	Scala does not prevent variable shadowing ! Sick error due to it ...	2013-04-16 17:05:10 +05:30
shane-huang	b493f55a4f	fix a bug in netty Block Fetcher Signed-off-by: shane-huang <shengsheng.huang@intel.com>	2013-04-16 10:01:01 +08:00
Mridul Muralidharan	59c380d69a	Fix npe	2013-04-16 03:29:38 +05:30
Mridul Muralidharan	dd2b64ec97	Fix bug with atomic update	2013-04-16 03:19:24 +05:30
Mridul Muralidharan	5540ab8243	Use hostname instead of hostport for executor, fix creation of workdir	2013-04-16 02:57:43 +05:30
Mridul Muralidharan	eb7e95e833	Commit job to persist files	2013-04-16 02:56:36 +05:30
Matei Zaharia	a64c107449	Make ShuffledRDD.prev transient	2013-04-15 16:41:51 -04:00
Mridul Muralidharan	19652a44be	Fix issue with FileSuite failing	2013-04-15 19:16:36 +05:30
Mridul Muralidharan	54b3d45b81	Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues	2013-04-15 18:26:50 +05:30
Mridul Muralidharan	d90d2af103	Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues	2013-04-15 18:12:11 +05:30
Matei Zaharia	c35d530bcf	Fix compile error	2013-04-13 12:43:12 -04:00
Andrew Ash	29d3440efb	Add details when BlockManager heartbeats time out Makes it more clear what the threshold was for tuning spark.storage.blockManagerSlaveTimeoutMs Before: WARN "Removing BlockManager BlockManagerId(201304022120-1976232532-5050-27464-0, myhostname, 51337) with no recent heart beats After: WARN "Removing BlockManager BlockManagerId(201304022120-1976232532-5050-27464-0, myhostname, 51337) with no recent heart beats: 19216ms exceeds 15000ms	2013-04-11 01:54:02 -03:00
Andrew xia	2f883c515f	Continue to update codes for scala code style 1.refactor braces for "class" "if" "while" "for" "match" 2.make code lines less than 100 3.refactor class parameter and extends definition	2013-04-09 13:02:50 +08:00
Matei Zaharia	65caa8f711	Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0' Conflicts: docs/_config.yml project/SparkBuild.scala	2013-04-08 12:43:17 -04:00
Matei Zaharia	054feb6448	Fixed a bug with zip	2013-04-07 21:15:21 -04:00
Matei Zaharia	b5900d47b1	Fix compile warning	2013-04-07 20:55:42 -04:00
Matei Zaharia	6962d40b44	Fix deprecated warning	2013-04-07 20:27:33 -04:00
Mridul Muralidharan	6798a09df8	Add support for building against hadoop2-yarn : adding new maven profile for it	2013-04-07 17:47:38 +05:30
shane-huang	df47b40b76	Shuffle Performance fix: Use netty embedded OIO file server instead of ConnectionManager Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages change reference from io.Source to scala.io.Source to avoid looking into io.netty package Signed-off-by: shane-huang <shengsheng.huang@intel.com>	2013-04-07 14:37:12 +08:00
Andrew xia	2b373dd07a	add properties default value null to fix sbt/sbt test errors	2013-04-02 12:11:14 +08:00
Mark Hamstra	e215f67923	Correct sense of 'filter out' in comment.	2013-03-31 08:00:13 -07:00
Mark Hamstra	8bcdc64005	Fixed broken filter in getWritableClass[T]	2013-03-30 22:09:52 -07:00
Matei Zaharia	9831bc1a09	Merge pull request #539 from cgrothaus/fix-webui-workdirpath Bugfix: WorkerWebUI must respect workDirPath from Worker	2013-03-29 22:16:22 -07:00
Matei Zaharia	3cc8ab6e29	Merge pull request #541 from stephenh/shufflecoalesce Add a shuffle parameter to coalesce.	2013-03-29 22:14:07 -07:00
Andrew xia	1a28f92711	change some typo and some spacing	2013-03-29 08:34:28 +08:00
Andrew xia	def3d1c84a	1.remove redundant spacing in source code 2.replace get/set functions with val and var definition	2013-03-29 08:20:35 +08:00
Jey Kottalam	bc8ba222ff	Bump development version to 0.8.0	2013-03-28 15:42:01 -07:00
Holden Karau	f5df729b12	Explicitly catch all throwables (warning in 2.10)	2013-03-24 16:15:32 -07:00
Stephen Haberman	dd854d5b9f	Use Boolean in the Java API, and != for assert.	2013-03-23 11:49:45 -05:00
Stephen Haberman	4ca273edc4	Merge branch 'master' into shufflecoalesce Conflicts: core/src/test/scala/spark/RDDSuite.scala	2013-03-23 11:45:45 -05:00
Matei Zaharia	b8949cab88	Merge pull request #505 from stephenh/volatile Make Executor fields volatile since they're read from the thread pool.	2013-03-23 07:19:34 -07:00
Matei Zaharia	fd53f2fc7b	Merge pull request #510 from markhamstra/WithThing mapWith, flatMapWith and filterWith	2013-03-23 07:13:21 -07:00
Andrew xia	d1d9bdaabe	Just update typo and comments	2013-03-23 07:25:30 +08:00
Stephen Haberman	00170eb0b9	Fix are/our typo.	2013-03-22 12:59:08 -05:00
Stephen Haberman	1c67c7dfd1	Add a shuffle parameter to coalesce. This is useful for when you want just 1 output file (part-00000) but still up the upstream RDD to be computed in parallel.	2013-03-22 08:54:44 -05:00
Christoph Grothaus	445f387ef4	Bugfix: WorkerWebUI must respect workDirPath from Worker	2013-03-22 11:08:40 +01:00
Matei Zaharia	35588490cb	Merge pull request #538 from rxin/cogroup Added mapSideCombine flag to CoGroupedRDD. Added unit test for CoGroupedRDD.	2013-03-20 19:27:47 -07:00
Stephen Haberman	4f4215311a	Merge branch 'master' into volatile	2013-03-20 15:37:10 -05:00
Matei Zaharia	b812e6b7bb	Merge pull request #526 from markhamstra/foldByKey Add foldByKey	2013-03-20 11:21:02 -07:00
Reynold Xin	d48ee7e55e	Merge branch 'master' of github.com:mesos/spark into cogroup	2013-03-20 14:00:28 +08:00
Reynold Xin	00a11304fd	Added mapSideCombine flag to CoGroupedRDD. Added unit test for CoGroupedRDD.	2013-03-20 13:49:51 +08:00
Matei Zaharia	945d1e720e	Merge pull request #536 from sasurfer/master CoalescedRDD for many partitions	2013-03-19 21:59:06 -07:00
Matei Zaharia	1cbbe94ac1	Merge pull request #534 from stephenh/removetrycatch Remove try/catch block that can't be hit.	2013-03-19 21:34:34 -07:00
Andrey Kouznetsov	bd167f83b0	call setConf from input format if it is Configurable	2013-03-19 17:15:15 +04:00
Giovanni Delussu	aceae029f7	CoalescedRDD changed to work with a big number of partitions both in the original and the new coalesced RDD. The limitation was in the range that Scala.Int can represent.	2013-03-19 11:25:45 +01:00
Stephen Haberman	fb34967815	Remove try/catch block that can't be hit.	2013-03-18 01:55:50 -05:00
Mark Hamstra	ab33e27cc9	constructorOfA -> constructA in doc comments	2013-03-16 15:29:15 -07:00
Mark Hamstra	9784fc1fcd	fix wayward comma in doc comment	2013-03-16 15:25:02 -07:00
Mark Hamstra	32979b5e7d	whitespace	2013-03-16 13:36:46 -07:00
Mark Hamstra	ca9f81e8fc	refactor foldByKey to use combineByKey	2013-03-16 13:31:01 -07:00
Mark Hamstra	1fb192ef40	Merge branch 'master' of https://github.com/mesos/spark into foldByKey	2013-03-16 12:17:13 -07:00
Mark Hamstra	80fc8c82ed	_With[Matei]	2013-03-16 12:16:29 -07:00
Mark Hamstra	38454c4aed	Merge branch 'master' of https://github.com/mesos/spark into WithThing	2013-03-16 11:54:44 -07:00
Matei Zaharia	c1e9cdc49f	Merge pull request #525 from stephenh/subtractByKey Add PairRDDFunctions.subtractByKey.	2013-03-16 11:47:45 -07:00
Mark Hamstra	ef75be3bf7	Merge branch 'master' of https://github.com/mesos/spark into foldByKey	2013-03-15 21:41:24 -07:00
Andrew xia	5892393140	refactor fair scheduler implementation 1.Change "pool" properties to be the member of ActiveJob 2.Abstract the Schedulable of Pool and TaskSetManager 3.Abstract the FIFO and FS comparator algorithm 4.Miscellaneous changing of class define and construction	2013-03-16 11:13:38 +08:00
Matei Zaharia	cdbfd1e196	Merge pull request #516 from squito/fix_local_metrics Fix local metrics	2013-03-15 15:13:28 -07:00
Mikhail Bautin	7fd2708eda	Add a log4j compile dependency to fix build in IntelliJ Also rename parent project to spark-parent (otherwise it shows up as "parent" in IntelliJ, which is very confusing).	2013-03-15 11:41:51 -07:00
Mark Hamstra	1a4070477d	whitespace cleanup	2013-03-15 11:28:28 -07:00
Mark Hamstra	857010392b	Fuller implementation of foldByKey	2013-03-15 10:56:05 -07:00
Mark Hamstra	16a4ca4537	restrict V type of foldByKey in order to retain ClassManifest; added foldByKey to Java API and test	2013-03-14 13:58:37 -07:00
Mark Hamstra	b1422cbdd5	added foldByKey	2013-03-14 12:59:58 -07:00
Stephen Haberman	7786881f47	Fix tabs that snuck in.	2013-03-14 14:57:12 -05:00
Stephen Haberman	7d8bb4df3a	Allow subtractByKey's other argument to have a different value type.	2013-03-14 14:44:15 -05:00
Stephen Haberman	4632c45af1	Finished subtractByKeys.	2013-03-14 10:35:34 -05:00
Matei Zaharia	4032beba49	Merge pull request #521 from stephenh/earlyclose Close the reader in HadoopRDD as soon as iteration end.	2013-03-13 19:29:46 -07:00
Stephen Haberman	63fe225587	Simplify SubtractedRDD in preparation from subtractByKey.	2013-03-13 17:17:34 -05:00
Mark Hamstra	cd5b947cf6	Merge branch 'master' of https://github.com/mesos/spark into WithThing	2013-03-13 13:16:14 -07:00
Stephen Haberman	e7f1a69c6b	Add a test for NextIterator.	2013-03-13 10:46:33 -05:00
Stephen Haberman	1a175d13b9	Add NextIterator.closeIfNeeded.	2013-03-13 10:17:39 -05:00
Stephen Haberman	8f00d23598	Remove NextIterator.close default implementation.	2013-03-12 12:30:10 -05:00
Harold Lim	0b64e5f1ac	Removed some commented code	2013-03-12 13:31:27 +08:00
Harold Lim	f5b1fecb9f	Cleaned up the code	2013-03-12 13:31:27 +08:00
Harold Lim	b5325182a3	Updated/Refactored the Fair Task Scheduler. It does not inherit ClusterScheduler anymore. Rather, ClusterScheduler internally uses TaskSetQueuesManager that handles the scheduling of taskset queues. This is the class that should be extended to support other scheduling policies	2013-03-12 13:31:27 +08:00
Harold Lim	54ed7c4af4	Changed the name of the system property to set the allocation xml	2013-03-12 13:31:27 +08:00
Harold Lim	c07087364b	Made changes to the SparkContext to have a DynamicVariable for setting local properties that can be passed down the stack. Added an implementation of the fair scheduler	2013-03-12 13:31:27 +08:00
Stephen Haberman	9e68f48625	More quickly call close in HadoopRDD. This also refactors out the common "gotNext" iterator pattern into a shared utility class.	2013-03-11 23:59:17 -05:00
Charles Reiss	769d399674	Send block sizes as longs.	2013-03-11 14:17:05 -07:00
Mark Hamstra	562893bea3	deleted excess curly braces	2013-03-10 22:43:08 -07:00
Imran Rashid	8a11ac3dc7	increase sleep time	2013-03-10 22:31:44 -07:00
Imran Rashid	9f97f2f9d8	add a small wait to one task to make sure some task runtime really is non-zero	2013-03-10 22:30:18 -07:00
Mark Hamstra	1289e7176b	refactored _With API and added foreachPartition	2013-03-10 22:27:13 -07:00
Mark Hamstra	b57df1f5e3	Merge branch 'master' of https://github.com/mesos/spark into WithThing	2013-03-10 16:56:31 -07:00
Matei Zaharia	2e1bbc4e7e	Merge remote-tracking branch 'woggling/dag-sched-driver-port' Conflicts: core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala	2013-03-10 16:52:54 -07:00
Matei Zaharia	91a9d093bd	Merge pull request #512 from patelh/fix-kryo-serializer Fix reference bug in Kryo serializer, add test, update version	2013-03-10 15:48:23 -07:00
Matei Zaharia	557cfd0f4d	Merge pull request #515 from woggling/deploy-app-death Notify standalone deploy client of application death.	2013-03-10 15:44:57 -07:00
Matei Zaharia	a59cc6060f	Merge remote-tracking branch 'stephenh/nomocks' Conflicts: core/src/main/scala/spark/storage/BlockManagerMaster.scala core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala	2013-03-10 13:39:10 -07:00
Imran Rashid	20f01a0a1b	enable task metrics in local mode, add tests	2013-03-09 21:17:31 -08:00
Imran Rashid	ec30188a2a	rename remoteFetchWaitTime to fetchWaitTime, since it also includes time from local fetches	2013-03-09 21:16:53 -08:00
Charles Reiss	b0983c5762	Notify standalone deploy client of application death. Usually, this isn't necessary since the application will be removed as a result of the deploy client disconnecting, but occasionally, the standalone deploy master removes an application otherwise. Also mark applications as FAILED instead of FINISHED when they are killed as a result of their executors failing too many times.	2013-03-09 11:29:45 -08:00
Charles Reiss	d0216cb38b	Prevent DAGSchedulerSuite from corrupting driver.port. Use the LocalSparkContext abstraction to properly manage clearing spark.driver.port.	2013-03-09 10:49:02 -08:00
Hiral Patel	664e5fd24b	Fix reference bug in Kryo serializer, add test, update version	2013-03-07 22:16:11 -08:00
Mark Hamstra	5ff0810b11	refactor mapWith, flatMapWith and filterWith to each use two parameter lists	2013-03-05 12:25:44 -08:00
Mark Hamstra	d046d8ad32	whitespace formatting	2013-03-05 00:48:13 -08:00
Mark Hamstra	9148b968cf	mapWith, flatMapWith and filterWith	2013-03-04 15:48:47 -08:00
Matei Zaharia	9f0dc829cb	Fix TaskMetrics not being serializable	2013-03-04 12:08:31 -08:00
Matei Zaharia	04fb81ffe5	Merge pull request #506 from rxin/spark-706 Fixed SPARK-706: Failures in block manager put leads to read task hanging.	2013-03-03 17:20:07 -08:00
Imran Rashid	0bd1d00c2a	minor cleanup based on feedback in review request	2013-03-03 16:46:45 -08:00
Imran Rashid	f1006b99ff	change CleanupIterator to CompletionIterator	2013-03-03 16:39:05 -08:00
Imran Rashid	8fef5b9c5f	refactoring of TaskMetrics	2013-03-03 16:34:04 -08:00
Imran Rashid	d36abdb053	Merge branch 'master' into stageInfo	2013-03-03 15:20:46 -08:00
Matei Zaharia	6bfc7cad6b	Merge pull request #504 from mosharaf/master Worker address was getting removed when removing an app.	2013-03-02 22:14:49 -08:00
Mark Hamstra	8b06b359da	bump version to 0.7.1-SNAPSHOT in the subproject poms to keep the maven build building.	2013-02-28 23:34:34 -08:00
Reynold Xin	44134e12bb	Fixed SPARK-706: Failures in block manager put leads to read task hanging.	2013-02-28 15:14:59 -08:00
Stephen Haberman	6415c2bb60	Don't create the Executor until we have everything it needs.	2013-02-28 12:38:09 -06:00
Stephen Haberman	80eecd2cb1	Make Executor fields volatile since they're read from the thread pool.	2013-02-28 10:41:07 -06:00
Mosharaf Chowdhury	4ab387bcdb	Fixed master datastructure updates after removing an application; and a typo.	2013-02-27 13:52:44 -08:00
Matei Zaharia	ece3edfffa	Fix a problem with no hosts being counted as alive in the first job	2013-02-26 12:11:03 -08:00
Matei Zaharia	73697e2891	Fix overly large thread names in PySpark	2013-02-26 12:07:59 -08:00
Stephen Haberman	db957e5bd7	Fix MapOutputTrackerSuite.	2013-02-26 01:38:50 -06:00
Stephen Haberman	a65aa549ff	Override DAGScheduler.runLocally so we can remove the Thread.sleep.	2013-02-25 23:49:32 -06:00
Stephen Haberman	a4adeb255c	Merge branch 'master' into nomocks Conflicts: core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala	2013-02-25 23:48:52 -06:00
Tathagata Das	c02e064938	Fixed replication bug in BlockManager	2013-02-25 17:27:46 -08:00
Matei Zaharia	490f056cdd	Allow passing sparkHome and JARs to StreamingContext constructor Also warns if spark.cleaner.ttl is not set in the version where you pass your own SparkContext.	2013-02-25 15:13:30 -08:00
Matei Zaharia	568bdaf8ae	Set spark.deploy.spreadOut to true by default in 0.7 (improves locality)	2013-02-25 14:34:55 -08:00
Matei Zaharia	1ef58dadcc	Add a config property for Akka lifecycle event logging	2013-02-25 14:01:24 -08:00
Matei Zaharia	ceaec4a675	Merge pull request #498 from pwendell/shutup-akka Disable remote lifecycle logging from Akka.	2013-02-25 12:31:24 -08:00
Patrick Wendell	85a85646d9	Disable remote lifecycle logging from Akka. This changes the default setting to `off` for remote lifecycle events. When this is on, it is very chatty at the INFO level. It also prints out several ERROR messages sometimes when sc.stop() is called.	2013-02-25 12:25:43 -08:00
Imran Rashid	8f17387d97	remove bogus comment	2013-02-25 10:31:06 -08:00
Matei Zaharia	6ae9a22c3e	Get spark.default.paralellism on each call to defaultPartitioner, instead of only once, in case the user changes it across Spark uses	2013-02-25 10:28:08 -08:00
Matei Zaharia	d6e6abece3	Merge pull request #459 from stephenh/bettersplits Change defaultPartitioner to use upstream split size.	2013-02-25 09:22:04 -08:00
Stephen Haberman	c44ccf2862	Use default parallelism if its set.	2013-02-24 23:54:03 -06:00
Stephen Haberman	44032bc476	Merge branch 'master' into bettersplits Conflicts: core/src/main/scala/spark/RDD.scala core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala core/src/test/scala/spark/ShuffleSuite.scala	2013-02-24 22:08:14 -06:00
Christoph Grothaus	f39f2b7636	Incorporate feedback from mateiz: - we do not need getEnvOrEmpty - Instead of saving SPARK_NONDAEMON_JAVA_OPTS, it would be better to modify the scripts to use a different variable name for the JAVA_OPTS they do eventually use	2013-02-24 21:24:30 +01:00
Tathagata Das	dff53d1b94	Merge branch 'mesos-master' into streaming	2013-02-24 12:17:22 -08:00
Matei Zaharia	3b9f929467	Merge pull request #468 from haitaoyao/master support customized java options for Master, Worker, Executor, and Repl	2013-02-23 23:38:15 -08:00
Stephen Haberman	37c7a71f9c	Add subtract to JavaRDD, JavaDoubleRDD, and JavaPairRDD.	2013-02-24 00:27:53 -06:00
Stephen Haberman	f442e7d83c	Update for split->partition rename.	2013-02-24 00:27:14 -06:00
Stephen Haberman	cec87a0653	Merge branch 'master' into subtract	2013-02-23 23:27:55 -06:00
Tathagata Das	d853aa9658	Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.	2013-02-23 17:42:26 -08:00
Patrick Wendell	931f439be9	Responding to code review	2013-02-23 15:40:41 -08:00
Patrick Wendell	f51b0f93f2	Adding Java-accessible methods to Vector.scala This is needed for the Strata machine learning tutorial (and also is generally helpful).	2013-02-23 13:26:59 -08:00
Matei Zaharia	d942d39072	Handle exceptions in RecordReader.close() better (suggested by Jim Donahue)	2013-02-23 11:19:07 -08:00
Matei Zaharia	c89824046a	Merge pull request #490 from woggling/conn-death Detect when SendingConnections disconnect even if we aren't sending to them	2013-02-22 22:58:19 -08:00
Charles Reiss	50cf8c8b79	Add fault tolerance test that uses replicated RDDs.	2013-02-22 16:11:53 -08:00
Charles Reiss	c8a7886921	Detect when SendingConnections drop by trying to read them. Comment fix	2013-02-22 16:11:52 -08:00
Matei Zaharia	d4d7993bf5	Several fixes to the work to log when no resources can be used by a job. Fixed some of the messages as well as code style.	2013-02-22 15:51:37 -08:00
Matei Zaharia	f33662c133	Merge remote-tracking branch 'pwendell/starvation-check' Also fixed a bug where master was offering executors on dead workers Conflicts: core/src/main/scala/spark/deploy/master/Master.scala	2013-02-22 15:27:41 -08:00
Matei Zaharia	7341de0d48	Merge pull request #475 from JoshRosen/spark-668 Remove hack workaround for SPARK-668	2013-02-22 14:56:18 -08:00
Patrick Wendell	f8c3a03d55	SPARK-702: Replace Function --> JFunction in JavaAPI Suite. In a few places the Scala (rather than Java) function class is used.	2013-02-22 12:54:15 -08:00
Imran Rashid	0f37b43b40	make the ShuffleFetcher responsible for collecting shuffle metrics, which gives us metrics for CoGroupedRDD and ShuffledRDD	2013-02-21 16:56:28 -08:00
Imran Rashid	9230617f23	add cleanup iterator	2013-02-21 16:55:14 -08:00
Imran Rashid	81bd07da26	sparkListeners should be a val	2013-02-21 15:21:45 -08:00
Imran Rashid	796e934d31	add some docs & some cleanup	2013-02-21 15:19:34 -08:00
Imran Rashid	394d3acc3e	store taskInfo & metrics together in a tuple	2013-02-21 15:19:34 -08:00
Imran Rashid	7960927cf4	get rid of a bunch of boilerplate; more formatting happens in Listener, not StageInfo	2013-02-21 15:19:34 -08:00
Imran Rashid	d0bfac3eed	taskInfo tracks if a task is run on a preferred host	2013-02-21 15:19:34 -08:00
Imran Rashid	6f62a57858	add runtime breakdowns	2013-02-21 15:19:34 -08:00
Imran Rashid	176cb20703	add task result size; better formatting for time interval distributions; cleanup distribution formatting	2013-02-21 15:19:33 -08:00
Imran Rashid	f2fcabf2ea	add timing around parts of executor & track result size	2013-02-21 15:19:33 -08:00
Imran Rashid	ff127cfcd3	Merge branch 'master' into stageInfo Conflicts: core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/storage/BlockManager.scala	2013-02-21 15:16:21 -08:00
Imran Rashid	69f9a7035f	fully revert change to addOnCompleteCallback -- missed this in `e9f53ec`	2013-02-21 15:07:46 -08:00
Imran Rashid	baab23abdf	TaskContext does not hold a reference to Task; instead, it has a shared instance of TaskMetrics with Task	2013-02-21 14:13:01 -08:00
haitao.yao	8215b95547	Merge branch 'mesos'	2013-02-21 10:07:24 +08:00
Christoph Grothaus	85a35c6840	Fix SPARK-698. From ExecutorRunner, launch java directly instead via the run scripts.	2013-02-20 21:42:11 +01:00
Tathagata Das	334ab92441	Fixed bug in CheckpointSuite	2013-02-20 10:26:36 -08:00
Tathagata Das	1cb725e417	Merge branch 'mesos-master' into streaming	2013-02-20 09:55:35 -08:00
Tathagata Das	fb9956256d	Merge branch 'mesos-master' into streaming Conflicts: core/src/main/scala/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala	2013-02-20 09:01:29 -08:00
Matei Zaharia	05bc02e80b	Merge pull request #482 from woggling/shutdown-exceptions Don't call System.exit over uncaught exceptions from shutdown hooks	2013-02-19 20:56:15 -08:00
haitao.yao	6a3d44c673	Merge branch 'mesos'	2013-02-20 10:23:58 +08:00
Charles Reiss	092c631fa8	Pull detection of being in a shutdown hook into utility function.	2013-02-19 17:49:55 -08:00
Reynold Xin	130f704baf	Added a method to create PartitionPruningRDD.	2013-02-19 16:03:52 -08:00
Charles Reiss	d0588bd6d7	Catch/log errors deleting temp dirs	2013-02-19 13:04:06 -08:00
Charles Reiss	687581c3ec	Paranoid uncaught exception handling for exceptions during shutdown	2013-02-19 13:03:02 -08:00
haitao.yao	7c129388fb	Merge branch 'mesos'	2013-02-19 11:22:24 +08:00
Matei Zaharia	7151e1e4c8	Rename "jobs" to "applications" in the standalone cluster	2013-02-17 23:23:08 -08:00
Matei Zaharia	06e5e6627f	Renamed "splits" to "partitions"	2013-02-17 22:13:26 -08:00
Matei Zaharia	340cc54e47	Merge pull request #471 from stephenh/parallelrdd Move ParallelCollection into spark.rdd package.	2013-02-16 16:39:15 -08:00
Matei Zaharia	3260b6120e	Merge pull request #470 from stephenh/morek Make CoGroupedRDDs explicitly have the same key type.	2013-02-16 16:38:38 -08:00
Stephen Haberman	924f47dd11	Add RDD.subtract. Instead of reusing the cogroup primitive, this adds a SubtractedRDD that knows it only needs to keep rdd1's values (per split) in memory.	2013-02-16 13:38:42 -06:00
Stephen Haberman	e7713adb99	Move ParallelCollection into spark.rdd package.	2013-02-16 13:20:48 -06:00

2263 commits