Matei Zaharia
8d81358a05
Provide more memory for tests
2013-08-29 21:19:06 -07:00
Matei Zaharia
ab0e625d9e
Fix PySpark for assembly run and include it in dist
2013-08-29 21:19:06 -07:00
Matei Zaharia
53cd50c069
Change build and run instructions to use assemblies
...
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
2013-08-29 21:19:04 -07:00
jerryshao
f3dbe6b215
Fix removed block zero size log reporting
2013-08-30 09:39:01 +08:00
Patrick Wendell
abdbacf252
Merge pull request #871 from pwendell/expose-local
...
Expose `isLocal` in SparkContext.
2013-08-28 21:11:31 -07:00
Matei Zaharia
afcade3ca8
Merge pull request #873 from pwendell/master
...
Hot fix for command runner
2013-08-28 20:15:40 -07:00
Patrick Wendell
1798e69e71
Adding extra args
2013-08-28 19:56:46 -07:00
Patrick Wendell
30d2421112
Make local variable public
2013-08-28 19:53:31 -07:00
Patrick Wendell
2fc9a028f2
Hot fix for command runner
2013-08-28 19:03:06 -07:00
Andre Schumacher
a511c5379e
RDD sample() and takeSample() prototypes for PySpark
2013-08-28 16:46:13 -07:00
Josh Rosen
742c44eae6
Don't send SIGINT to Py4J gateway subprocess.
...
This addresses SPARK-885, a usability issue where PySpark's
Java gateway process would be killed if the user hit ctrl-c.
Note that SIGINT still won't cancel the running s
This fix is based on http://stackoverflow.com/questions/5045771
2013-08-28 16:39:44 -07:00
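A minimal sketch of the technique that commit 742c44eae6 describes: starting a child process that ignores SIGINT, so that ctrl-c in the parent shell does not kill it. This is not the code from the commit. The helper names `ignore_sigint` and `launch_ignoring_sigint` are hypothetical, and the approach assumes a POSIX system, because `preexec_fn` is not supported on Windows.

```python
import signal
import subprocess
import sys


def ignore_sigint():
    # Runs in the child after fork() and before exec(). An ignored signal
    # stays ignored across exec(), so the launched program never sees SIGINT.
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def launch_ignoring_sigint(command):
    # Hypothetical helper: start `command` with SIGINT ignored in the child.
    return subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        preexec_fn=ignore_sigint,
    )


if __name__ == "__main__":
    # The child reports whether its SIGINT disposition is "ignore".
    child_code = (
        "import signal; "
        "print(signal.getsignal(signal.SIGINT) == signal.SIG_IGN)"
    )
    proc = launch_ignoring_sigint([sys.executable, "-c", child_code])
    out, _ = proc.communicate()
    print(out.decode().strip())
```

The parent keeps its own SIGINT handling, so ctrl-c still raises `KeyboardInterrupt` there; only the child is shielded.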
Andre Schumacher
457bcd3343
PySpark: implementing subtractByKey(), subtract() and keyBy()
2013-08-28 16:14:22 -07:00
Joseph E. Gonzalez
3325083552
Removing commented out test code
2013-08-28 14:57:12 -07:00
Matei Zaharia
baa84e7e4c
Merge pull request #865 from tgravescs/fixtmpdir
...
Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves
aac1214ee4
Change Executor to only look at the env variable SPARK_YARN_MODE
2013-08-28 13:26:26 -05:00
Matei Zaharia
cd043cf922
Merge pull request #867 from tgravescs/yarnenvconfigs
...
Spark on Yarn allow users to specify environment variables
2013-08-27 19:50:32 -07:00
Joseph E. Gonzalez
766b6fd380
Fixing IndexedRDD unit tests.
2013-08-27 18:54:26 -07:00
Joseph E. Gonzalez
9afd0e2375
Merging upstream changes.
2013-08-27 18:26:54 -07:00
Joseph E. Gonzalez
93503a7054
Allowing RDD to select its implementation of PairRDDFunctions
2013-08-27 18:16:19 -07:00
Y.CORP.YAHOO.COM\tgraves
3f206bf0b5
Updated based on review comments.
2013-08-27 14:34:27 -05:00
Y.CORP.YAHOO.COM\tgraves
cf52a3cba6
Allow Executors to have different directories than the Spark Master for Yarn
2013-08-27 11:00:21 -05:00
Matei Zaharia
898da7e422
Merge pull request #859 from ianbuss/sbt_opts
...
Pass SBT_OPTS environment through to sbt_launcher
2013-08-26 20:40:49 -07:00
Y.CORP.YAHOO.COM\tgraves
63dc635de6
fix typos
2013-08-26 17:06:20 -05:00
Y.CORP.YAHOO.COM\tgraves
c9464c74a1
Add ability for user to specify environment variables
2013-08-26 16:44:27 -05:00
Y.CORP.YAHOO.COM\tgraves
6dd64e8bb2
Update docs and remove old reference to --user option
2013-08-26 14:29:24 -05:00
shivaram
17bafeab39
Merge pull request #864 from rxin/json1
...
Revert json library change
2013-08-26 11:59:32 -07:00
Y.CORP.YAHOO.COM\tgraves
dfb4c697bc
Throw exception if the yarn local dirs aren't set
2013-08-26 13:57:01 -05:00
Reynold Xin
a77e0abb96
Added worker state to the cluster master JSON ui.
2013-08-26 11:21:03 -07:00
Reynold Xin
9db1e50344
Revert "Merge pull request #841 from rxin/json"
...
This reverts commit 1fb1b09928, reversing
changes made to c69c48947d.
2013-08-26 11:05:14 -07:00
Y.CORP.YAHOO.COM\tgraves
c0b4095ee8
Change to use Yarn appropriate directories rather than /tmp or the user specified spark.local.dir
2013-08-26 12:48:37 -05:00
Shivaram Venkataraman
dc06b52879
Add an option to turn off data validation, test it.
...
Also moves addIntercept to have default true to make it similar
to validateData option
2013-08-25 23:14:35 -07:00
Shivaram Venkataraman
b8c50a0642
Center & scale variables in Ridge, Lasso.
...
Also add a unit test that checks if ridge regression lowers
cross-validation error.
2013-08-25 22:24:27 -07:00
Patrick Wendell
f9fc5c160a
Merge pull request #603 from pwendell/ec2-updates
...
Several Improvements to EC2 Scripts
2013-08-24 15:19:56 -07:00
Patrick Wendell
2cfe52ef55
Version bump for ec2 docs
2013-08-24 15:16:53 -07:00
Patrick Wendell
4879685910
Merge remote-tracking branch 'mesos/master' into ec2-updates
2013-08-24 14:50:58 -07:00
Matei Zaharia
d282c1ebbb
Merge pull request #860 from jey/sbt-ide-fixes
...
Fix IDE project generation under SBT
2013-08-23 11:20:20 -07:00
Jey Kottalam
a9db1b7b6e
Upgrade SBT IDE project generators
2013-08-23 10:27:18 -07:00
Jey Kottalam
b7f9e6374a
Fix SBT generation of IDE project files
2013-08-23 10:26:37 -07:00
Ian Buss
d7f18e3d27
Pass SBT_OPTS environment through to sbt_launcher
2013-08-23 09:50:23 +01:00
Matei Zaharia
5a6ac12840
Merge pull request #701 from ScrapCodes/documentation-suggestions
...
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma
2bc348e92c
Linking custom receiver guide
2013-08-23 09:44:02 +05:30
Prashant Sharma
3049415e24
Corrections in documentation comment
2013-08-23 09:40:28 +05:30
Prashant Sharma
39a1d58da4
Improved documentation for spark custom receiver
2013-08-23 09:38:50 +05:30
Matei Zaharia
215c13dd41
Fix code style and a nondeterministic RDD issue in ALS
2013-08-22 16:13:46 -07:00
Matei Zaharia
46ea0c1b47
Merge pull request #814 from holdenk/master
...
Create fewer instances of the random class during ALS initialization.
2013-08-22 15:57:28 -07:00
Matei Zaharia
9ac3d62cac
Merge pull request #856 from jey/sbt-fix-hadoop-0.23.9
...
Re-add removed dependency to fix build under Hadoop 0.23.9
2013-08-22 15:51:10 -07:00
Jey Kottalam
281b6c5f28
Re-add removed dependency on 'commons-daemon'
...
Fixes SBT build under Hadoop 0.23.9 and 2.0.4
2013-08-22 15:45:45 -07:00
Matei Zaharia
ae8ba83ef2
Merge pull request #855 from jey/update-build-docs
...
Update build docs
2013-08-22 10:14:54 -07:00
Matei Zaharia
8a36fd09dd
Merge pull request #854 from markhamstra/pomUpdate
...
Synced sbt and maven builds to use the same dependencies, etc.
2013-08-22 10:13:35 -07:00
Matei Zaharia
c2d00f12e2
Merge pull request #832 from alig/coalesce
...
Coalesced RDD with locality
2013-08-22 10:13:03 -07:00