Matei Zaharia
1e137a5a21
Merge pull request #846 from rxin/rdd
...
Two minor RDD refactoring
2013-08-17 22:22:32 -07:00
Reynold Xin
2c00ea3efc
Moved shuffle serializer setting from a constructor parameter to a setSerializer method in various RDDs that involve shuffle operations.
2013-08-17 21:43:29 -07:00
Reynold Xin
0e84fee76b
Removed the mapSideCombine option in partitionBy.
2013-08-17 21:13:41 -07:00
Reynold Xin
10af952a3d
Removed the mapSideCombine option in CoGroupedRDD.
2013-08-17 21:07:34 -07:00
Reynold Xin
5d050a3e1f
Removed the unused shuffleId in ShuffleDependency's constructor.
2013-08-16 23:23:16 -07:00
Matei Zaharia
e89ffc7b3c
Merge pull request #839 from jegonzal/zip_partitions
...
Currying RDD.zipPartitions
2013-08-16 14:02:34 -07:00
Jey Kottalam
67b593607c
Rename YARN build flag to SPARK_WITH_YARN
2013-08-16 14:00:05 -07:00
Jey Kottalam
b1d99744a8
Fix SBT build under Hadoop 0.23.x
2013-08-16 13:50:12 -07:00
Jey Kottalam
c1e547bb7f
Updates to repl and example POMs to match SBT build
2013-08-16 13:50:12 -07:00
Jey Kottalam
ad580b94d5
Maven build now also works with YARN
2013-08-16 13:50:12 -07:00
Jey Kottalam
741ecd56fe
Forgot to remove a few references to ${classifier}
2013-08-16 13:50:12 -07:00
Jey Kottalam
9dd15fe700
Don't mark hadoop-client as 'provided'
2013-08-16 13:50:12 -07:00
Jey Kottalam
11b42a84db
Maven build now works with CDH hadoop-2.0.0-mr1
2013-08-16 13:50:12 -07:00
Jey Kottalam
353fab2440
Initial changes to make Maven build agnostic of hadoop version
2013-08-16 13:50:12 -07:00
Jey Kottalam
8add2d7a59
Fix repl/assembly when YARN enabled
2013-08-16 13:50:12 -07:00
Jey Kottalam
3f98eff63a
Allow make-distribution.sh to specify Hadoop version used
2013-08-16 13:50:09 -07:00
Joseph E. Gonzalez
53b2639a1e
Reversing the argument order in zipPartitions to enable stronger type inference.
2013-08-16 12:38:59 -07:00
Andre Schumacher
c7e348faec
Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path
2013-08-16 11:58:20 -07:00
Holden Karau
8fc40818d7
Fix
2013-08-15 23:08:48 -07:00
Reynold Xin
1fb1b09928
Merge pull request #841 from rxin/json
...
Use the JSON formatter from Scala library and removed dependency on lift-json.
2013-08-15 22:15:05 -07:00
Matei Zaharia
c69c48947d
Merge pull request #843 from Reinvigorate/bug-879
...
fixing typo in conf/slaves
2013-08-15 20:55:09 -07:00
seanm
a5193a8fac
fixing typo
2013-08-15 20:52:58 -06:00
Reynold Xin
c961c19b7b
Use the JSON formatter from Scala library and removed dependency on lift-json.
...
It made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
2013-08-15 18:23:01 -07:00
Reynold Xin
eddbf43b54
Revert "Merge pull request #834 from Daemoen/master"
...
This reverts commit 230ab2722e, reversing changes made to 659553b21d.
2013-08-15 17:49:37 -07:00
Reynold Xin
230ab2722e
Merge pull request #834 from Daemoen/master
...
Updated json output to allow for display of worker state
2013-08-15 17:45:17 -07:00
Patrick Wendell
659553b21d
Merge pull request #836 from pwendell/rename
...
Rename `memoryBytesToString` and `memoryMegabytesToString`
2013-08-15 16:56:31 -07:00
Jey Kottalam
a0f0848463
Update default version of Hadoop to 1.2.1
2013-08-15 16:50:37 -07:00
Jey Kottalam
a06a9d5c5f
Rename HadoopWriter to SparkHadoopWriter since it's outside of our package
2013-08-15 16:50:37 -07:00
Jey Kottalam
8f979edef5
Fix newTaskAttemptID to work under YARN
2013-08-15 16:50:37 -07:00
Jey Kottalam
14b6bcdf93
update YARN docs
2013-08-15 16:50:37 -07:00
Jey Kottalam
8bb0bd11ce
YARN ApplicationMaster shouldn't wait forever
2013-08-15 16:50:37 -07:00
Jey Kottalam
e2d7656ca3
re-enable YARN support
2013-08-15 16:50:37 -07:00
Jey Kottalam
bd0bab47c9
SparkEnv isn't available this early, and not needed anyway
2013-08-15 16:50:37 -07:00
Jey Kottalam
4f43fd791a
make SparkHadoopUtil a member of SparkEnv
2013-08-15 16:50:37 -07:00
Jey Kottalam
43ebcb8484
rename HadoopMapRedUtil => SparkHadoopMapRedUtil, HadoopMapReduceUtil => SparkHadoopMapReduceUtil
2013-08-15 16:50:37 -07:00
Jey Kottalam
cb4ef19214
yarn support
2013-08-15 16:50:37 -07:00
Jey Kottalam
8b1c1520fc
add comment
2013-08-15 16:50:37 -07:00
Jey Kottalam
5d0785b4e5
remove hadoop-yarn's org/apache/...
2013-08-15 16:50:37 -07:00
Jey Kottalam
273b499b9a
yarn sbt
2013-08-15 16:50:37 -07:00
Jey Kottalam
69c3bbf688
dynamically detect hadoop version
2013-08-15 16:50:37 -07:00
Jey Kottalam
f67b94ad4f
remove core/src/hadoop{1,2} dirs
2013-08-15 16:50:36 -07:00
Jey Kottalam
b877e20a33
move yarn to its own directory
2013-08-15 16:50:36 -07:00
Matei Zaharia
28369ff773
Merge pull request #829 from JoshRosen/pyspark-unit-tests-python-2.6
...
Fix PySpark unit tests on Python 2.6
2013-08-15 16:44:02 -07:00
Joseph E. Gonzalez
327a4db9f7
changing caching behavior on indexedrdds
2013-08-15 16:36:26 -07:00
Patrick Wendell
4c6ade1ad5
Rename memoryBytesToString and memoryMegabytesToString
...
These are used all over the place now and they are not specific to memory at all.
memoryBytesToString --> bytesToString
memoryMegabytesToString --> megabytesToString
2013-08-15 15:58:07 -07:00
Reynold Xin
1a13460cb0
Merge pull request #833 from rxin/ui
...
Various UI improvements.
2013-08-15 15:50:44 -07:00
Reynold Xin
1a51deae8a
More minor UI changes including code review feedback.
2013-08-15 14:34:07 -07:00
Joseph E. Gonzalez
3bb6e019d4
adding better error handling when indexing an RDD
2013-08-15 14:29:48 -07:00
Joseph E. Gonzalez
61281756f2
IndexedRDD passes all PairRDD Function tests
2013-08-15 14:20:59 -07:00
Daemoen
ad2e8b5126
Updated json output to allow for display of worker state
...
Ops teams need to ensure that the cluster is functional and performant. Having to scrape the html source for worker state won't work reliably, and will be slow. By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.
2013-08-15 12:19:14 -07:00