ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	6079721fa1	Update build version in master	2013-09-24 11:41:51 -07:00
Matei Zaharia	0a8cc30921	Move some classes to more appropriate packages: RDD, RDDFunctions -> org.apache.spark.rdd; Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util; JavaSerializer, KryoSerializer -> org.apache.spark.serializer	2013-09-01 14:13:16 -07:00
Matei Zaharia	5701eb92c7	Fix some URLs	2013-09-01 14:13:16 -07:00
Matei Zaharia	46eecd110a	Initial work to rename package to org.apache.spark	2013-09-01 14:13:13 -07:00
Matei Zaharia	53cd50c069	Change build and run instructions to use assemblies. This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them. As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.	2013-08-29 21:19:04 -07:00
Jey Kottalam	23f4622aff	Remove redundant dependencies from POMs	2013-08-18 18:53:57 -07:00
Jey Kottalam	ad580b94d5	Maven build now also works with YARN	2013-08-16 13:50:12 -07:00
Jey Kottalam	9dd15fe700	Don't mark hadoop-client as 'provided'	2013-08-16 13:50:12 -07:00
Jey Kottalam	11b42a84db	Maven build now works with CDH hadoop-2.0.0-mr1	2013-08-16 13:50:12 -07:00
Jey Kottalam	353fab2440	Initial changes to make Maven build agnostic of hadoop version	2013-08-16 13:50:12 -07:00
Matei Zaharia	af3c9d5042	Add Apache license headers and LICENSE and NOTICE files	2013-07-16 17:21:33 -07:00
Mridul Muralidharan	afee902443	Attempt to fix streaming test failures after yarn branch merge	2013-04-28 22:26:45 +05:30
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
Matei Zaharia	adba773fab	Fix passing of superstep in Bagel to avoid seeing new values of the superstep value upon recomputation, and set the default storage level in Bagel to MEMORY_AND_DISK	2013-04-08 17:34:38 -04:00
Mridul Muralidharan	6798a09df8	Add support for building against hadoop2-yarn : adding new maven profile for it	2013-04-07 17:47:38 +05:30
Jey Kottalam	bc8ba222ff	Bump development version to 0.8.0	2013-03-28 15:42:01 -07:00
Mikhail Bautin	7fd2708eda	Add a log4j compile dependency to fix build in IntelliJ. Also rename parent project to spark-parent (otherwise it shows up as "parent" in IntelliJ, which is very confusing).	2013-03-15 11:41:51 -07:00
Nick Pentreath	8dd943fc3e	Fix doc style	2013-03-11 09:14:00 +02:00
Nick Pentreath	d35c5a5176	Adding test for non-default persistence level	2013-03-09 12:52:16 +02:00
Nick Pentreath	1e981d8b26	Added choice of persistence level to Bagel. Also added documentation.	2013-03-09 12:40:48 +02:00
Mark Hamstra	8b06b359da	bump version to 0.7.1-SNAPSHOT in the subproject poms to keep the maven build building.	2013-02-28 23:34:34 -08:00
Matei Zaharia	06e5e6627f	Renamed "splits" to "partitions"	2013-02-17 22:13:26 -08:00
Matei Zaharia	582d31dff9	Formatting fixes	2013-02-11 13:24:54 -08:00
Matei Zaharia	ea08537143	Fixed an exponential recursion that could happen with doCheckpoint due to lack of memoization	2013-02-11 13:23:50 -08:00
Mikhail Bautin	fe3eceab57	Remove activation of profiles by default. See the discussion at https://github.com/mesos/spark/pull/355 for why default profile activation is a problem.	2013-01-31 13:30:41 -08:00
Stephen Haberman	7dfb82a992	Replace old 'master' term with 'driver'.	2013-01-25 11:03:00 -06:00
Tathagata Das	cd1521cfdb	Merge branch 'master' into streaming. Conflicts: core/src/main/scala/spark/rdd/CoGroupedRDD.scala, core/src/main/scala/spark/rdd/FilteredRDD.scala, docs/_layouts/global.html, docs/index.md, run	2013-01-15 12:08:51 -08:00
Shivaram Venkataraman	bbc56d85ed	Rename environment variable for hadoop profiles to hadoopVersion	2013-01-12 15:24:13 -08:00
Shivaram Venkataraman	9262522306	Activate hadoop2 profile in pom.xml with -Dhadoop=2	2013-01-10 22:07:34 -08:00
Shivaram Venkataraman	f7adb382ac	Activate hadoop1 if property hadoop is missing. hadoop2 can be activated now by using -Dhadoop -Phadoop2.	2013-01-08 03:19:43 -08:00
Shivaram Venkataraman	4bbe07e5ec	Activate hadoop1 profile by default for maven builds	2013-01-07 17:46:22 -08:00
Tathagata Das	4719e6d8fe	Changed locations for unit test logs.	2013-01-07 16:06:07 -08:00
Thomas Dudziak	02d64f9662	Mark hadoop dependencies provided in all library artifacts	2012-12-10 21:27:54 -08:00
Matei Zaharia	ccff0a089a	Use the same output directories that SBT had in subprojects. This will make it easier to make the "run" script work with a Maven build.	2012-12-10 10:58:56 -08:00
Thomas Dudziak	3b643e86bc	Updated versions in the pom.xml files to match current master	2012-11-27 17:50:42 -08:00
Thomas Dudziak	69297c64be	Addressed code review comments	2012-11-27 15:45:16 -08:00
Thomas Dudziak	811a32257b	Added maven and debian build files	2012-11-20 16:19:51 -08:00
Matei Zaharia	4be12d97ec	Some doc fixes, including showing version number in nav bar again	2012-10-13 19:05:11 -07:00
Matei Zaharia	b4067cbad4	More doc updates, and moved Serializer to a subpackage.	2012-10-12 18:19:21 -07:00
Matei Zaharia	eca570f66a	Removed the need to sleep in tests due to waiting for Akka to shut down	2012-10-07 00:17:59 -07:00
Matei Zaharia	74a9244255	Write all unit test output to a file	2012-10-01 15:07:42 -07:00
Matei Zaharia	0121a26bd1	Changed the way tasks' dependency files are sent to workers so that custom serializers or Kryo registrators can be loaded.	2012-09-28 16:14:05 -07:00
Matei Zaharia	2c16ae36d7	Set log level in tests to WARN	2012-08-23 20:38:14 -07:00
Matei Zaharia	deedb9e7b7	Fix further issues with tests and broadcast. The broadcast fix is to store values as MEMORY_ONLY_DESER instead of MEMORY_ONLY, which will save substantial time on serialization.	2012-08-23 20:31:49 -07:00
Denny	4f4a34c025	Stylistic changes. Conflicts: core/src/test/scala/spark/MesosSchedulerSuite.scala	2012-07-23 16:32:20 -07:00
Denny	866e6949df	Always destroy SparkContext in after block for the unit tests. Conflicts: core/src/test/scala/spark/ShuffleSuite.scala	2012-07-23 16:29:17 -07:00
Matei Zaharia	f58da6164e	Merge branch 'master' into dev	2012-06-15 23:47:11 -07:00
Matei Zaharia	a96558caa3	Performance improvements to shuffle operations: in particular, preserve RDD partitioning in more cases where it's possible, and use iterators instead of materializing collections when doing joins.	2012-06-09 14:44:18 -07:00
Matei Zaharia	63051dd2bc	Merge in engine improvements from the Spark Streaming project, developed jointly with Tathagata Das and Haoyuan Li. This commit imports the changes and ports them to Mesos 0.9, but does not yet pass unit tests due to various classes not supporting a graceful stop() yet.	2012-06-07 12:45:38 -07:00
Reynold Xin	968f75f6af	Added an option (spark.closure.serializer) to specify the serializer for closures. This enables using Kryo as the closure serializer.	2012-04-09 21:59:56 -07:00
Ankur Dave	c5be7d2b22	Update Bagel unit tests to reflect API change	2011-11-08 19:56:44 +00:00
Ankur Dave	ab3889f627	Implement standalone WikipediaPageRank with custom serializer	2011-10-09 16:53:10 -07:00
Ankur Dave	cbdc01eecd	Update WikipediaPageRank to reflect Bagel API changes	2011-10-09 16:19:34 -07:00
Ankur Dave	6d707f6b63	Remove ShortestPath for now	2011-10-09 16:19:34 -07:00
Ankur Dave	0028caf3a4	Simplify and genericize type parameters in Bagel	2011-10-09 15:58:39 -07:00
Ismael Juma	0fba22b3d2	Fix issue #65: Change @serializable to extends Serializable in 2.9 branch. Note that we use scala.Serializable introduced in Scala 2.9 instead of java.io.Serializable. Also, case classes inherit from scala.Serializable by default.	2011-08-02 10:16:33 +01:00
Matei Zaharia	969644df8e	Cleaned up a few issues to do with default parallelism levels. Also renamed HadoopFileWriter to HadoopWriter (since it's not only for files) and fixed a bug for lookup().	2011-07-14 12:40:56 -04:00
Matei Zaharia	4db50e26c7	Fixed unit tests by making them clean up the SparkContext after use and thus clean up the various singletons (RDDCache, MapOutputTracker, etc). This isn't perfect yet (ideally we shouldn't use singleton objects at all) but we can fix that later.	2011-05-13 12:03:58 -07:00
Ankur Dave	f40a0898a7	Rename bagel to spark.bagel and Pregel to Bagel	2011-05-09 15:23:21 -07:00
Ankur Dave	c1104058c6	Move shortest path and PageRank to bagel.examples	2011-05-03 18:53:58 -07:00
Ankur Dave	563c5e717c	Refactor and add aggregator support. Refactored out the agg() and comp() methods from Pregel.run. Defined an implicit conversion to allow applications that don't use aggregators to avoid including a null argument for the result of the aggregator in the compute function.	2011-05-03 15:40:45 -07:00
Ankur Dave	c18fa3ebc6	Package combiner functions into a trait	2011-05-03 15:40:41 -07:00
Ankur Dave	1c8ca0ebe1	Add Bagel test suite. Note: This test suite currently fails for the same reason that the Spark Core test suite fails: Spark currently seems to have a bug where any test after the first one fails.	2011-05-03 15:40:31 -07:00
Ankur Dave	c5b3ea755f	Clean up Bagel source and interface	2011-05-03 15:40:01 -07:00
Ankur Dave	19122af787	Update ShortestPath to work with controllable partitioning	2011-05-03 15:39:39 -07:00
Ankur Dave	62ef620354	Clean up Pregel.run, add logging	2011-05-03 15:38:01 -07:00
Ankur Dave	c0736f6f68	Add Bagel, an implementation of Pregel on Spark	2011-05-03 15:37:08 -07:00

117 commits