ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	224fbac061	Spark-742: TaskMetrics should not employ per-record timing. This patch does three things: 1. Makes TimedIterator a trait with two implementations (one a no-op) 2. Makes the default behavior to use the no-op implementation 3. Removes DelegateBlockFetchTracker. This is just cleanup, but it seems like the trait doesn't really reduce complexity in any way. In the future we can add other implementations, e.g. ones which perform sampling.	2013-04-29 11:13:43 -07:00
Matei Zaharia	0f45347c7b	More unit test fixes	2013-04-28 22:29:27 -07:00
Matei Zaharia	bce4089f22	Fix BlockManagerSuite to deal with clearing spark.hostPort	2013-04-28 22:23:48 -07:00
Matei Zaharia	68c07ea198	Merge pull request #582 from shivaram/master Add zip partitions interface	2013-04-28 20:19:33 -07:00
Shivaram Venkataraman	604d3bf56c	Rename partition class and add scala doc	2013-04-28 16:31:07 -07:00
Shivaram Venkataraman	15acd49f07	Actually rename classes to ZippedPartitions* (the previous commit only renamed the file)	2013-04-28 16:03:22 -07:00
Shivaram Venkataraman	6e84635ab9	Rename classes from MapZipped* to Zipped*	2013-04-28 15:58:40 -07:00
Matei Zaharia	f6ee9a8728	Merge pull request #583 from mridulm/master Fix issues with streaming test cases after yarn branch merge	2013-04-28 15:36:04 -07:00
Mridul Muralidharan	430c531464	Remove debug statements	2013-04-29 00:24:30 +05:30
Mridul Muralidharan	3a89a76b87	Make log message more descriptive to aid in debugging	2013-04-29 00:04:12 +05:30
Mridul Muralidharan	9bd439502e	Remove spurious commit	2013-04-28 23:09:08 +05:30
Mridul Muralidharan	7fa6978a1e	Allow CheckpointWriter pending tasks to finish	2013-04-28 23:08:10 +05:30
Mridul Muralidharan	00c7a37604	Merge branch 'master' of github.com:mridulm/spark	2013-04-28 22:44:34 +05:30
Mridul Muralidharan	afee902443	Attempt to fix streaming test failures after yarn branch merge	2013-04-28 22:26:45 +05:30
Shivaram Venkataraman	0cc6642b7c	Rename to zipPartitions and style changes	2013-04-28 05:11:03 -07:00
Shivaram Venkataraman	c9c4954d99	Add an interface to zip iterators of multiple RDDs The current code supports 2, 3 or 4 arguments but can be extended to more arguments if required.	2013-04-26 16:57:46 -07:00
Matei Zaharia	1f20ef2567	Merge branch 'master' of github.com:mesos/spark	2013-04-25 20:03:13 -07:00
Matei Zaharia	1b169f190c	Exclude old versions of Netty, which had a different Maven organization	2013-04-25 19:52:12 -07:00
Matei Zaharia	cf54b824ff	Merge pull request #580 from pwendell/quickstart SPARK-739 Have quickstart standalone job use README	2013-04-25 11:45:58 -07:00
Patrick Wendell	a72134a6ac	SPARK-739 Have quickstart standalone job use README	2013-04-25 10:39:28 -07:00
Matei Zaharia	6e6b5204ea	Create an empty directory when checkpointing a 0-partition RDD (fixes a test failure on Hadoop 2.0)	2013-04-25 00:42:37 -07:00
Matei Zaharia	eef9ea1993	Update unit test memory to 2 GB	2013-04-25 00:42:29 -07:00
Matei Zaharia	01d9ba5038	Add back line removed during YARN merge	2013-04-25 00:11:27 -07:00
Reynold Xin	ba6ffa6a5f	Allow the specification of a shuffle serializer in the read path (for local block reads).	2013-04-24 17:38:07 -07:00
Reynold Xin	aa618ed2a2	Allow changing the serializer on a per shuffle basis.	2013-04-24 14:52:49 -07:00
Matei Zaharia	118a6c76f5	Merge pull request #575 from mridulm/master Manual merge of yarn branch to trunk	2013-04-24 08:42:30 -07:00
Mridul Muralidharan	3b594a4e3b	Do not add signature files - results in validation errors when using assembled file	2013-04-24 10:18:25 +05:30
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
Mridul Muralidharan	d09db1c051	concurrentRestrictions fails for this PR - but works for master, probably some version change	2013-04-24 09:15:29 +05:30
Mridul Muralidharan	adcda84f96	Pull latest SparkBuild.scala from master and merge conflicts	2013-04-24 08:57:25 +05:30
Reynold Xin	31ce6c66d6	Added a BlockObjectWriter interface in block manager so ShuffleMapTask doesn't need to build up an array buffer for each shuffle bucket.	2013-04-23 17:48:59 -07:00
Mridul Muralidharan	5b85c715c8	Revert back to 2.0.2-alpha : 0.23.7 has protocol changes which break against cloudera	2013-04-24 02:57:51 +05:30
Mridul Muralidharan	8faf5c51c3	Patch from Thomas Graves to improve the YARN Client, and move to more production ready hadoop yarn branch	2013-04-24 02:31:57 +05:30
Mridul Muralidharan	b11058f42c	Ensure that maven package adds yarn jars as part of shaded jar for hadoop2-yarn profile	2013-04-23 22:48:32 +05:30
koeninger	dfac0aa5c2	prevent mysql driver from pulling entire resultset into memory. explicitly close resultset and statement.	2013-04-22 21:12:52 -05:00
Mridul Muralidharan	7acab3ab45	Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo	2013-04-22 08:01:13 +05:30
koeninger	b2a3f24dde	first attempt at an RDD to pull data from JDBC sources	2013-04-21 00:29:37 -05:00
Matei Zaharia	17e076de80	Turn on forking in test JVMs to reduce the pressure on perm gen and code cache sizes due to having 2 instances of the Scala compiler and a bunch of classloaders.	2013-04-18 22:25:57 -07:00
Mridul Muralidharan	ac2e8e8720	Add some basic documentation	2013-04-19 00:13:19 +05:30
Mridul Muralidharan	5ee2f5c483	Cache pattern, add (commented out) alternatives for check* apis	2013-04-17 23:13:34 +05:30
Mridul Muralidharan	f07961060d	Add a small note on spark.tasks.schedule.aggression	2013-04-17 23:13:02 +05:30
Matei Zaharia	5d8a71c484	Merge pull request #570 from jey/increase-codecache-size Increase ReservedCodeCacheSize for sbt	2013-04-16 19:48:02 -07:00
Mridul Muralidharan	5d891534fd	Move back to 2.0.2-alpha, since 2.0.3-alpha is not available in cloudera yet. Also, add netty dependency explicitly to prevent resolving to older 2.3x version. Additionally, comment out retrievePattern to ensure correct netty is picked up	2013-04-17 05:54:43 +05:30
Mridul Muralidharan	46779b4745	Move back to 2.0.2-alpha, since 2.0.3-alpha is not available in cloudera yet	2013-04-17 05:53:28 +05:30
Mridul Muralidharan	02dffd2eb0	Ensure all ask/await block for spark.akka.askTimeout - so that it is controllable : instead of arbitrary timeouts spread across codebase. In our tests, we use 30 seconds, though default of 10 is maintained	2013-04-17 05:52:57 +05:30
Mridul Muralidharan	a402b23bcd	Fudge order of classpath - so that our jars take precedence over what is in CLASSPATH variable. Sounds logical, hope there is no issue cos of it	2013-04-17 05:52:00 +05:30
Mridul Muralidharan	bcdde331c3	Move from master to driver	2013-04-17 04:12:18 +05:30
Jey Kottalam	6bfe4bf3eb	Increase ReservedCodeCacheSize for sbt	2013-04-16 09:50:59 -07:00
Mridul Muralidharan	ad80f68eb5	remove spurious debug statements	2013-04-16 22:15:34 +05:30
Mridul Muralidharan	f7969f72ee	Fix exception when checkpoint path does not exist (no data in rdd which is being checkpointed for example)	2013-04-16 21:51:38 +05:30


2748 commits