Matei Zaharia
fbb3fc4143
Merge pull request #346 from JoshRosen/python-api
Python API (PySpark)
2013-01-12 23:49:36 -08:00
Tyson
1731f1fed4
Added an optional format parameter for individual job queries and optimized the jobId query
2013-01-11 15:01:43 -05:00
Tyson
c063e8777e
Added implicit json writers for JobDescription and ExecutorRunner
2013-01-11 14:57:38 -05:00
Matei Zaharia
2e914d9983
Formatting
2013-01-10 19:13:08 -08:00
Matei Zaharia
3548c9c0c8
Merge branch 'master' of github.com:mesos/spark
2013-01-10 19:06:40 -08:00
Matei Zaharia
6d1c230281
Merge pull request #357 from tysonjh/master
JSON support added to WebUI
2013-01-10 19:06:07 -08:00
Matei Zaharia
248995c535
Merge pull request #356 from shane-huang/master
Fix an issue in ConnectionManager where sendMessage may create too many unnecessary connections
2013-01-10 17:52:23 -08:00
shane-huang
9930a95d21
Modified Patch according to comments
2013-01-10 20:09:55 +08:00
Tyson
549ee388a1
Removed io.spray spray-json dependency as it is not needed.
2013-01-09 15:12:23 -05:00
Tyson
bf9d9946f9
Query parameter reformatted to be more extensible and routing more robust
2013-01-09 11:29:58 -05:00
Tyson
0da2ff102e
Added url query parameter json and handler
2013-01-09 10:40:48 -05:00
Tyson
269fe018c7
JSON object definitions
2013-01-09 10:40:43 -05:00
Matei Zaharia
9cc764f523
Code style
2013-01-08 22:29:57 -08:00
Matei Zaharia
14972141f9
Merge pull request #344 from mbautin/log_preferred_hosts
Log preferred hosts
2013-01-08 22:26:34 -08:00
Josh Rosen
b57dd0f160
Add mapPartitionsWithSplit() to PySpark.
2013-01-08 16:05:02 -08:00
Stephen Haberman
8ac0f35be4
Add JavaRDDLike.keyBy.
2013-01-08 09:57:45 -06:00
Stephen Haberman
4ee6b22775
Merge branch 'master' into tupleBy
Conflicts:
core/src/test/scala/spark/RDDSuite.scala
2013-01-08 09:10:10 -06:00
shane-huang
e4cb72da8a
Fix an issue in ConnectionManager where sendingMessage may create too many unnecessary SendingConnections.
2013-01-08 22:40:58 +08:00
Mikhail Bautin
4725b0f643
Fixing if/else coding style for preferred hosts logging
2013-01-07 20:09:26 -08:00
Mikhail Bautin
c41042c816
Log preferred hosts
2013-01-07 20:06:09 -08:00
Shivaram Venkataraman
77d751731c
Remove unused BoundedMemoryCache file and associated test case.
2013-01-07 15:57:46 -08:00
Shivaram Venkataraman
aed368a970
Update Hadoop dependency to 1.0.3 as 0.20 has Sun specific dependencies. Also fix SequenceFileRDDFunctions to pick the right type conversion across Hadoop versions
2013-01-07 15:57:33 -08:00
Shivaram Venkataraman
f8d579a0c0
Remove dependencies on sun jvm classes. Instead use reflection to infer HotSpot options and total physical memory size
2013-01-07 15:57:18 -08:00
Matei Zaharia
1941d9602d
Merge branch 'master' of github.com:mesos/spark
2013-01-07 16:50:39 -05:00
Matei Zaharia
9c32f300fb
Add Accumulable.setValue for easier use in Java
2013-01-07 16:50:23 -05:00
Stephen Haberman
8dc06069fe
Rename RDD.tupleBy to keyBy.
2013-01-06 15:21:45 -06:00
Matei Zaharia
8fd3a70c18
Add PairRDD.keys() and values() to Java API
2013-01-05 22:46:45 -05:00
Matei Zaharia
b1663752c6
Merge pull request #351 from stephenh/values
Add PairRDDFunctions.keys and values.
2013-01-05 19:15:54 -08:00
Matei Zaharia
0982572519
Add methods called just 'accumulator' for int/double in Java API
2013-01-05 22:11:28 -05:00
Matei Zaharia
86af64b0a6
Fix Accumulators in Java, and add a test for them
2013-01-05 20:55:17 -05:00
Stephen Haberman
1fdb6946b5
Add RDD.tupleBy.
2013-01-05 13:07:59 -06:00
Stephen Haberman
f4e6b9361f
Add RDD.collect(PartialFunction).
2013-01-05 12:14:08 -06:00
Stephen Haberman
8d57c78c83
Add PairRDDFunctions.keys and values.
2013-01-05 12:04:01 -06:00
Josh Rosen
33beba3965
Change PySpark RDD.take() to not call iterator().
2013-01-03 14:52:21 -08:00
Josh Rosen
b58340dbd9
Rename top-level 'pyspark' directory to 'python'
2013-01-01 15:05:00 -08:00
Josh Rosen
170e451fbd
Minor documentation and style fixes for PySpark.
2013-01-01 13:52:14 -08:00
Matei Zaharia
55809fbc6d
Merge pull request #349 from woggling/cache-finally
Avoid stalls when computation of cached RDD throws exception
2013-01-01 08:21:33 -08:00
Charles Reiss
58072a7340
Remove some dead comments
2013-01-01 08:07:44 -08:00
Charles Reiss
feadaf72f4
Mark key as not loading in CacheTracker even when compute() fails
2013-01-01 07:57:20 -08:00
Josh Rosen
f803953998
Raise exception when hashing Java arrays (SPARK-597)
2012-12-31 20:20:11 -08:00
Josh Rosen
59195c68ec
Update PySpark for compatibility with TaskContext.
2012-12-29 16:01:03 -08:00
Josh Rosen
c5cee53f20
Merge remote-tracking branch 'origin/master' into python-api
Conflicts:
docs/quick-start.md
2012-12-29 16:00:51 -08:00
Josh Rosen
7ec3595de2
Fix bug (introduced by batching) in PySpark take()
2012-12-28 22:21:16 -08:00
Josh Rosen
397e67103c
Change Utils.fetchFile() warning to SparkException.
2012-12-28 17:37:13 -08:00
Josh Rosen
d64fa72d2e
Add addFile() and addJar() to JavaSparkContext.
2012-12-28 17:00:57 -08:00
Josh Rosen
bd237d4a9d
Add synchronization to LocalScheduler.updateDependencies().
2012-12-28 17:00:57 -08:00
Josh Rosen
f1bf4f0385
Skip deletion of files in clearFiles().
This fixes an issue where Spark could delete
original files in the current working directory
that were added to the job using addFile().
There was also the potential for addFile() to
overwrite local files, which is addressed by
changing Utils.fetchFile() to log a warning
instead of overwriting a file with new contents.
This is a short-term fix; a better long-term
solution would be to remove the dependence on
storing files in the current working directory,
since we can't change the cwd from Java.
2012-12-28 17:00:57 -08:00
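The overwrite guard described in the commit above (log a warning instead of replacing an existing local file with new contents) can be sketched roughly as follows. This is a hypothetical Python illustration, not Spark's Scala `Utils.fetchFile()`; the name `fetch_file_no_overwrite`, its signature, and its return value are assumptions made for the example.

```python
import filecmp
import logging
import os
import shutil

log = logging.getLogger("fetch_sketch")


def fetch_file_no_overwrite(src_path, target_dir):
    """Copy src_path into target_dir unless a file of that name already exists.

    Hypothetical helper (not part of Spark): returns True if the file was
    copied, False if an existing file was left untouched.
    """
    target = os.path.join(target_dir, os.path.basename(src_path))
    if os.path.exists(target):
        # Compare contents byte by byte (shallow=False), not just metadata.
        if not filecmp.cmp(src_path, target, shallow=False):
            log.warning("File %s exists and differs from %s; not overwriting",
                        target, src_path)
        return False
    shutil.copyfile(src_path, target)
    return True
```

A stricter variant would raise an exception instead of logging when the contents differ.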
Josh Rosen
fbadb1cda5
Mark api.python classes as private; echo Java output to stderr.
2012-12-28 09:06:11 -08:00
Josh Rosen
1dca0c5180
Remove debug output from PythonPartitioner.
2012-12-26 18:23:06 -08:00
Josh Rosen
4608902fb8
Use filesystem to collect RDDs in PySpark.
Passing large volumes of data through Py4J seems
to be slow. It appears to be faster to write the
data to the local filesystem and read it back from
Python.
2012-12-24 17:20:10 -08:00
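The idea in the commit above, handing results over through a local file rather than through the Py4J bridge, can be sketched as below. This is a minimal, self-contained illustration under stated assumptions: in PySpark the file would be written on the JVM side, whereas here a Python function (`dump_records`) stands in for the producer, and all function names and the use of `pickle` framing are hypothetical choices made for the example.

```python
import os
import pickle
import tempfile


def dump_records(records, path):
    """Stand-in for the producer side: write each record as one pickle frame."""
    with open(path, "wb") as f:
        for record in records:
            pickle.dump(record, f)


def load_records(path):
    """Consumer side: yield pickled records until the file is exhausted."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def collect_via_file(records):
    """Round-trip records through a temporary file and return them as a list."""
    fd, path = tempfile.mkstemp(suffix=".pickle")
    os.close(fd)
    try:
        dump_records(records, path)
        return list(load_records(path))
    finally:
        # Always clean up the temporary file, even if reading fails.
        os.remove(path)
```

The sketch only shows the data path; it makes no claim about how much faster this is than Py4J in practice.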