1621 commits

Author SHA1 Message Date
Matei Zaharia 2e914d9983 Formatting 2013-01-10 19:13:08 -08:00
Matei Zaharia 3548c9c0c8 Merge branch 'master' of github.com:mesos/spark 2013-01-10 19:06:40 -08:00
Matei Zaharia 6d1c230281 Merge pull request #357 from tysonjh/master
JSON support added to WebUI
2013-01-10 19:06:07 -08:00
Matei Zaharia aa1c8c602d Merge pull request #358 from stephenh/shorterclasspath
Retrieve jars to a flat directory so * can be used for the classpath.
2013-01-10 18:04:47 -08:00
Matei Zaharia 248995c535 Merge pull request #356 from shane-huang/master
Fix an issue in ConnectionManager where sendMessage may create too many unnecessary connections
2013-01-10 17:52:23 -08:00
Reynold Xin bd336f5f40 Changed CoGroupRDD's hash map from Scala to Java. 2013-01-10 17:13:04 -08:00
Josh Rosen 49c74ba2af Change PYSPARK_PYTHON_EXEC to PYSPARK_PYTHON. 2013-01-10 08:10:59 -08:00
shane-huang 9930a95d21 Modified Patch according to comments 2013-01-10 20:09:55 +08:00
Josh Rosen d55f2b9882 Use take() instead of takeSample() in PySpark kmeans example.
This is a temporary change until we port takeSample().
2013-01-09 21:21:23 -08:00
Josh Rosen 1a64432ba5 Indicate success/failure in PySpark test script. 2013-01-09 20:30:36 -08:00
Tyson 549ee388a1 Removed io.spray spray-json dependency as it is not needed. 2013-01-09 15:12:23 -05:00
Tyson bf9d9946f9 Query parameter reformatted to be more extensible and routing more robust 2013-01-09 11:29:58 -05:00
Tyson 0da2ff102e Added url query parameter json and handler 2013-01-09 10:40:48 -05:00
Tyson 269fe018c7 JSON object definitions 2013-01-09 10:40:43 -05:00
Tyson 6e8c8f61c4 Added the spray implicit marshaller library
Added the io.spray JSON library
2013-01-09 10:40:33 -05:00
Matei Zaharia 9cc764f523 Code style 2013-01-08 22:29:57 -08:00
Matei Zaharia 14972141f9 Merge pull request #344 from mbautin/log_preferred_hosts
Log preferred hosts
2013-01-08 22:26:34 -08:00
Matei Zaharia d0bae072ea Merge pull request #353 from stephenh/tupleBy
Add RDD.tupleBy.
2013-01-08 22:24:03 -08:00
Josh Rosen b57dd0f160 Add mapPartitionsWithSplit() to PySpark. 2013-01-08 16:05:02 -08:00
Stephen Haberman c3f1675f9c Retrieve jars to a flat directory so * can be used for the classpath. 2013-01-08 14:44:33 -06:00
Stephen Haberman 8ac0f35be4 Add JavaRDDLike.keyBy. 2013-01-08 09:57:45 -06:00
Stephen Haberman 4ee6b22775 Merge branch 'master' into tupleBy
Conflicts:
	core/src/test/scala/spark/RDDSuite.scala
2013-01-08 09:10:10 -06:00
shane-huang e4cb72da8a Fix an issue in ConnectionManager where sendingMessage may create too many unnecessary SendingConnections. 2013-01-08 22:40:58 +08:00
Shivaram Venkataraman f7adb382ac Activate hadoop1 if property hadoop is missing. hadoop2 can be activated now
by using -Dhadoop -Phadoop2.
2013-01-08 03:19:43 -08:00
Mikhail Bautin 4725b0f643 Fixing if/else coding style for preferred hosts logging 2013-01-07 20:09:26 -08:00
Mikhail Bautin c41042c816 Log preferred hosts 2013-01-07 20:06:09 -08:00
Shivaram Venkataraman 4bbe07e5ec Activate hadoop1 profile by default for maven builds 2013-01-07 17:46:22 -08:00
Matei Zaharia a37adfa67b Merge pull request #354 from shivaram/ibm-jdk-fixes
Fixes to build and test spark on IBM JVM
2013-01-07 17:37:03 -08:00
Shivaram Venkataraman b1336e2fe4 Update expected size of strings to match our dummy string class 2013-01-07 17:00:32 -08:00
Shivaram Venkataraman fb3d4d5e85 Make default hadoop version 1.0.3 in pom.xml 2013-01-07 16:46:06 -08:00
Shivaram Venkataraman 55c66d365f Use a dummy string class in Size Estimator tests to make it resistant to jdk
versions
2013-01-07 15:58:00 -08:00
Shivaram Venkataraman 77d751731c Remove unused BoundedMemoryCache file and associated test case. 2013-01-07 15:57:46 -08:00
Shivaram Venkataraman aed368a970 Update Hadoop dependency to 1.0.3 as 0.20 has Sun specific dependencies. Also
fix SequenceFileRDDFunctions to pick the right type conversion across Hadoop
versions
2013-01-07 15:57:33 -08:00
Shivaram Venkataraman f8d579a0c0 Remove dependencies on sun jvm classes. Instead use reflection to infer
HotSpot options and total physical memory size
2013-01-07 15:57:18 -08:00
Matei Zaharia 1941d9602d Merge branch 'master' of github.com:mesos/spark 2013-01-07 16:50:39 -05:00
Matei Zaharia 9c32f300fb Add Accumulable.setValue for easier use in Java 2013-01-07 16:50:23 -05:00
Stephen Haberman 8dc06069fe Rename RDD.tupleBy to keyBy. 2013-01-06 15:21:45 -06:00
Matei Zaharia 8fd3a70c18 Add PairRDD.keys() and values() to Java API 2013-01-05 22:46:45 -05:00
Matei Zaharia b1663752c6 Merge pull request #351 from stephenh/values
Add PairRDDFunctions.keys and values.
2013-01-05 19:15:54 -08:00
Matei Zaharia 0982572519 Add methods called just 'accumulator' for int/double in Java API 2013-01-05 22:11:28 -05:00
Matei Zaharia 86af64b0a6 Fix Accumulators in Java, and add a test for them 2013-01-05 20:55:17 -05:00
Stephen Haberman 1fdb6946b5 Add RDD.tupleBy. 2013-01-05 13:07:59 -06:00
Stephen Haberman 6a0db3b449 Fix typo. 2013-01-05 12:56:17 -06:00
Matei Zaharia 7ab9f09140 Merge pull request #352 from stephenh/collect
Add RDD.collect(PartialFunction).
2013-01-05 10:17:20 -08:00
Stephen Haberman f4e6b9361f Add RDD.collect(PartialFunction). 2013-01-05 12:14:08 -06:00
Stephen Haberman 8d57c78c83 Add PairRDDFunctions.keys and values. 2013-01-05 12:04:01 -06:00
Josh Rosen 33beba3965 Change PySpark RDD.take() to not call iterator(). 2013-01-03 14:52:21 -08:00
Josh Rosen ce9f1bbe20 Add pyspark script to replace the other scripts.
Expand the PySpark programming guide.
2013-01-01 21:25:49 -08:00
Josh Rosen b58340dbd9 Rename top-level 'pyspark' directory to 'python' 2013-01-01 15:05:00 -08:00
Josh Rosen 170e451fbd Minor documentation and style fixes for PySpark. 2013-01-01 13:52:14 -08:00