ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	b4e382c210	Adding sc name in metrics source	2013-09-08 16:06:49 -07:00
Patrick Wendell	8026537597	Fixing package name in template conf	2013-09-08 16:06:32 -07:00
Matei Zaharia	0b957997ad	Merge pull request #908 from pwendell/master: Fix target JVM version in scala build	2013-09-08 15:30:16 -07:00
Patrick Wendell	27bd74c8ad	Fix target JVM version in scala build	2013-09-08 14:37:45 -07:00
Matei Zaharia	5a587fb98d	Updated cluster diagram to show caches	2013-09-08 13:51:57 -07:00
Patrick Wendell	c190b48bf5	Adding more docs and some code cleanup	2013-09-08 13:46:28 -07:00
Stephen Haberman	df5fd35273	Add better docs for coalesce. Include the useful tip that if shuffle=true, coalesce can actually increase the number of partitions. This makes coalesce more like a generic `RDD.repartition` operation. (Ideally this `RDD.repartition` could automatically choose either a coalesce or a shuffle if numPartitions was either less than or greater than, respectively, the current number of partitions.)	2013-09-08 15:39:04 -05:00
Matei Zaharia	af8ffdb73c	Review comments	2013-09-08 13:36:50 -07:00
Matei Zaharia	04cfb3aa9d	Merge pull request #898 from ilikerps/660: SPARK-660: Add StorageLevel support in Python	2013-09-08 10:33:20 -07:00
Patrick Wendell	8de8ee5d3c	Ganglia sink	2013-09-08 10:08:18 -07:00
Matei Zaharia	c0d375107f	Some tweaks to CDH/HDP doc	2013-09-08 00:44:41 -07:00
Aaron Davidson	a3868544be	Whoopsy daisy	2013-09-08 00:30:47 -07:00
Matei Zaharia	f261d2a60f	Added cluster overview doc, made logo higher-resolution, and added more details on monitoring	2013-09-08 00:29:11 -07:00
Matei Zaharia	651a96adf7	More fair scheduler docs and property names. Also changed uses of "job" terminology to "application" when they referred to an entire Spark program, to avoid confusion.	2013-09-08 00:29:11 -07:00
Matei Zaharia	98fb69822c	Work in progress: Add job scheduling docs; Rename some fair scheduler properties; Organize intro page better; Link to Apache wiki for "contributing to Spark"	2013-09-08 00:29:11 -07:00
Matei Zaharia	38488aca8a	Merge pull request #900 from pwendell/cdh-docs: Provide docs to describe running on CDH/HDP cluster.	2013-09-08 00:28:53 -07:00
Patrick Wendell	a8e376ec0f	Merge pull request #904 from pwendell/master: Adding Apache license to two files	2013-09-07 21:16:01 -07:00
Patrick Wendell	6d2198643c	Adding Apache license to two files	2013-09-07 20:46:58 -07:00
Aaron Davidson	c1cc8c4da2	Export StorageLevel and refactor	2013-09-07 14:41:31 -07:00
Patrick Wendell	22b982d2bc	File rename	2013-09-07 14:38:54 -07:00
Matei Zaharia	cfde85e395	Merge pull request #901 from ooyala/2013-09/0.8-doc-changes: 0.8 Doc changes for make-distribution.sh	2013-09-07 13:53:08 -07:00
Matei Zaharia	4a7813a247	Merge pull request #903 from rxin/resulttask: Fixed the bug that ResultTask was not properly deserializing outputId.	2013-09-07 13:52:24 -07:00
Patrick Wendell	61c4762d45	Changes based on feedback	2013-09-07 11:55:10 -07:00
Aaron Davidson	8001687af5	Remove reflection, hard-code StorageLevels. The sc.StorageLevel -> StorageLevel pathway is a bit janky, but otherwise the shell would have to call a private method of SparkContext. Having StorageLevel available in sc also doesn't seem like the end of the world. There may be a better solution, though. As for creating the StorageLevel object itself, this seems to be the best way in Python 2 for creating singleton, enum-like objects: http://stackoverflow.com/questions/36932/how-can-i-represent-an-enum-in-python	2013-09-07 09:34:07 -07:00
Evan Chan	be1ee28ca6	CR feedback from Matei	2013-09-07 08:56:24 -07:00
Matei Zaharia	afe46ba36e	Merge pull request #892 from jey/fix-yarn-assembly: YARN build fixes	2013-09-07 07:28:51 -07:00
Reynold Xin	210eae26f4	Fixed the bug that ResultTask was not properly deserializing outputId.	2013-09-07 21:59:47 +08:00
Aaron Davidson	b8a0b6ea5e	Memoize StorageLevels read from JVM	2013-09-06 15:36:04 -07:00
Patrick Wendell	2eebeff5eb	Merge pull request #897 from pwendell/master: Docs describing Spark monitoring and instrumentation	2013-09-06 15:25:22 -07:00
Jey Kottalam	b98572c70a	Generate new SSH key for the cluster, make "--identity-file" optional	2013-09-06 14:51:47 -07:00
Jey Kottalam	6919a28d51	Construct shell commands as sequences for safety and composability	2013-09-06 14:28:26 -07:00
Evan Chan	ff1dbf2106	Add references to make-distribution.sh	2013-09-06 14:20:44 -07:00
Evan Chan	88d53f0dff	"launch" scripts is more accurate terminology	2013-09-06 14:03:44 -07:00
Evan Chan	5a18b854a7	Easier way to start the master	2013-09-06 13:59:43 -07:00
Evan Chan	76d5d2d3c5	Add notes about starting spark-shell	2013-09-06 13:53:00 -07:00
Patrick Wendell	a2a0cf9d68	Docs describing Spark monitoring and instrumentation	2013-09-06 13:52:57 -07:00
Patrick Wendell	e653a9d891	Provide docs to describe running on CDH/HDP cluster. This doc consolidates information relevant to CDH/HDP users in a single place.	2013-09-06 13:49:57 -07:00
Jey Kottalam	30a32c8335	Minor YARN build cleanups	2013-09-06 11:31:16 -07:00
Jey Kottalam	70661246fd	Fix YARN assembly generation under Maven	2013-09-06 11:31:16 -07:00
Jey Kottalam	35ed09f1d1	Clarify YARN example	2013-09-06 11:31:16 -07:00
Reynold Xin	1e15feb5a3	Hot fix to resolve the compilation error caused by SPARK-821.	2013-09-06 22:44:05 +08:00
Nick Pentreath	737f01a1ef	Adding algorithm for implicit feedback data to ALS	2013-09-06 14:45:05 +02:00
Patrick Wendell	ddcb9d310a	Merge pull request #895 from ilikerps/821: SPARK-821: Don't cache results when action run locally on driver	2013-09-05 23:54:09 -07:00
Aaron Davidson	a63d4c7dc2	SPARK-660: Add StorageLevel support in Python. It uses reflection... I am not proud of that fact, but it at least ensures compatibility (sans refactoring of the StorageLevel stuff).	2013-09-05 23:36:27 -07:00
Aaron Davidson	3a04e76c89	Reynold's second round of comments	2013-09-05 21:43:26 -07:00
Ameet Talwalkar	d52edfa753	updated content	2013-09-05 21:06:50 -07:00
Matei Zaharia	699c331f2f	Merge pull request #891 from xiajunluan/SPARK-864: [SPARK-864] DAGScheduler Exception if we delete Worker and StandaloneExecutorBackend then add Worker	2013-09-05 20:21:53 -07:00
Aaron Davidson	4f2236a1c5	Add unit test and address comments	2013-09-05 18:06:30 -07:00
Aaron Davidson	1418d18af4	SPARK-821: Don't cache results when action run locally on driver. Caching the results of local actions (e.g., rdd.first()) causes the driver to store entire partitions in its own memory, which may be highly constrained. This patch simply makes the CacheManager avoid caching the result of all locally-run computations.	2013-09-05 15:34:42 -07:00
Andrew xia	7c15e3c5de	Fix bug SPARK-864	2013-09-05 15:56:11 +08:00
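
Commit 6919a28d51 in the log above describes constructing shell commands as sequences for safety and composability. The following is a minimal Python sketch of that general idea, not the code from that commit. The helper names `ssh_command` and `run_local`, the `root` default user, and the particular ssh option are assumptions made for illustration.

```python
import shlex
import subprocess
import sys


def ssh_command(host, remote_args, identity_file=None, user="root"):
    """Return an ssh invocation as a list of arguments (hypothetical helper)."""
    cmd = ["ssh", "-o", "StrictHostKeyChecking=no"]
    if identity_file is not None:
        # Optional pieces compose by list concatenation, with no string splicing.
        cmd += ["-i", identity_file]
    # The remote side is still parsed by a shell, so quote each remote argument.
    remote = " ".join(shlex.quote(arg) for arg in remote_args)
    return cmd + ["%s@%s" % (user, host), remote]


def run_local(args):
    """Run a local command given as a list; no local shell parses the arguments."""
    return subprocess.check_output(args).decode().strip()
```

Because each argument is its own list element, a value containing spaces or shell metacharacters (for example a key path such as `my key.pem`) reaches the child process as a single argument instead of being split or interpreted.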
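
Commit 8001687af5 in the log above mentions hard-coding StorageLevels as singleton, enum-like objects in Python 2. A rough sketch of that pattern is shown here, assuming a plain class with instances attached as class attributes. The field names and the flag values are illustrative assumptions and are not taken from the commit or from Spark's actual StorageLevel definition.

```python
class StorageLevel(object):
    """Enum-like holder of storage flags (illustrative fields only)."""

    def __init__(self, name, use_disk, use_memory, replication=1):
        self.name = name
        self.use_disk = use_disk
        self.use_memory = use_memory
        self.replication = replication

    def __repr__(self):
        return "StorageLevel.%s" % self.name


# Hard-coded instances attached as class attributes, so callers refer to
# StorageLevel.MEMORY_ONLY rather than constructing new objects.
StorageLevel.DISK_ONLY = StorageLevel("DISK_ONLY", True, False)
StorageLevel.MEMORY_ONLY = StorageLevel("MEMORY_ONLY", False, True)
StorageLevel.MEMORY_AND_DISK = StorageLevel("MEMORY_AND_DISK", True, True)
StorageLevel.MEMORY_ONLY_2 = StorageLevel("MEMORY_ONLY_2", False, True, 2)
```

Each level is created once at import time, so identity comparison (`level is StorageLevel.MEMORY_ONLY`) works the way it would for an enum member.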

4207 commits