Matei Zaharia
23762efda2
New hardware provisioning doc, and updates to menus
2013-08-30 10:16:26 -07:00
Matei Zaharia
1b0f69c623
Change docs color theme for 0.8
2013-08-30 10:15:58 -07:00
Reynold Xin
9e17e456d2
Merge pull request #875 from shivaram/build-fix
...
Fix broken build by removing addIntercept
2013-08-30 00:22:53 -07:00
Shivaram Venkataraman
adc700582b
Fix broken build by removing addIntercept
2013-08-30 00:16:32 -07:00
Evan Sparks
016787de32
Merge pull request #863 from shivaram/etrain-ridge
...
Adding linear regression and refactoring Ridge regression to use SGD
2013-08-29 22:15:14 -07:00
Evan Sparks
852d810787
Merge pull request #819 from shivaram/sgd-cleanup
...
Change SVM to use {0,1} labels
2013-08-29 22:13:15 -07:00
Matei Zaharia
ca71620950
Merge pull request #857 from mateiz/assembly
...
Change build and run instructions to use assemblies
2013-08-29 21:51:14 -07:00
Reynold Xin
1528776628
Merge pull request #874 from jerryshao/fix-report-bug
...
Fix removed block zero size log reporting
2013-08-29 21:30:47 -07:00
Matei Zaharia
e11bc18294
Update Maven docs
2013-08-29 21:19:07 -07:00
Matei Zaharia
d8a4008685
Fix path to assembly in make-distribution.sh
2013-08-29 21:19:07 -07:00
Matei Zaharia
2de756ff19
Update some build instructions because only sbt assembly and mvn package are now needed
2013-08-29 21:19:06 -07:00
Matei Zaharia
666d93c294
Update Maven build to create assemblies expected by new scripts
...
This includes the following changes:
- The "assembly" package now builds in Maven by default, and creates an
assembly containing both hadoop-client and Spark, unlike the old
BigTop distribution assembly that skipped hadoop-client
- There is now a bigtop-dist package to build the old BigTop assembly
- The repl-bin package is no longer built by default since the scripts
don't rely on it; instead it can be enabled with -Prepl-bin
- Py4J is now included in the assembly/lib folder as a local Maven repo,
so that the Maven package can link to it
- run-example now adds the original Spark classpath as well because the
Maven examples assembly lists spark-core and such as provided
- The various Maven projects add a spark-yarn dependency correctly
2013-08-29 21:19:06 -07:00
Matei Zaharia
d7dec938e5
Don't use SPARK_LAUNCH_WITH_SCALA in pyspark
2013-08-29 21:19:06 -07:00
Matei Zaharia
3ff105f87d
Find assembly correctly in pyspark
2013-08-29 21:19:06 -07:00
Matei Zaharia
aab345c463
Fix finding of assembly JAR, as well as some pointers to ./run
2013-08-29 21:19:06 -07:00
Matei Zaharia
8d81358a05
Provide more memory for tests
2013-08-29 21:19:06 -07:00
Matei Zaharia
ab0e625d9e
Fix PySpark for assembly run and include it in dist
2013-08-29 21:19:06 -07:00
Matei Zaharia
53cd50c069
Change build and run instructions to use assemblies
...
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
2013-08-29 21:19:04 -07:00
jerryshao
f3dbe6b215
Fix removed block zero size log reporting
2013-08-30 09:39:01 +08:00
Patrick Wendell
abdbacf252
Merge pull request #871 from pwendell/expose-local
...
Expose `isLocal` in SparkContext.
2013-08-28 21:11:31 -07:00
Matei Zaharia
afcade3ca8
Merge pull request #873 from pwendell/master
...
Hot fix for command runner
2013-08-28 20:15:40 -07:00
Patrick Wendell
1798e69e71
Adding extra args
2013-08-28 19:56:46 -07:00
Patrick Wendell
30d2421112
Make local variable public
2013-08-28 19:53:31 -07:00
Patrick Wendell
2fc9a028f2
Hot fix for command runner
2013-08-28 19:03:06 -07:00
Andre Schumacher
a511c5379e
RDD sample() and takeSample() prototypes for PySpark
2013-08-28 16:46:13 -07:00
Josh Rosen
742c44eae6
Don't send SIGINT to Py4J gateway subprocess.
...
This addresses SPARK-885, a usability issue where PySpark's
Java gateway process would be killed if the user hit ctrl-c.
Note that SIGINT still won't cancel the running s
This fix is based on http://stackoverflow.com/questions/5045771
2013-08-28 16:39:44 -07:00
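The SIGINT handling described in the commit above can be illustrated with a small Python sketch. This is not the PySpark code from the commit; it only shows the general technique from the linked Stack Overflow question, assuming a POSIX system. The helper names `ignore_sigint` and `launch_ignoring_sigint` are hypothetical.

```python
import signal
import subprocess
import sys


def ignore_sigint():
    # Runs in the child process after fork and before exec (POSIX only).
    # A signal set to "ignore" stays ignored across exec, so the launched
    # program does not die when the user presses ctrl-c in the parent.
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def launch_ignoring_sigint(command):
    # Hypothetical helper: start `command` as a subprocess that ignores SIGINT.
    return subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        preexec_fn=ignore_sigint,
    )


if __name__ == "__main__":
    # The child reports whether SIGINT is ignored in its own process.
    child_code = "import signal; print(signal.getsignal(signal.SIGINT) == signal.SIG_IGN)"
    proc = launch_ignoring_sigint([sys.executable, "-c", child_code])
    out, _ = proc.communicate()
    print(out.decode().strip())
```

The parent process keeps its own SIGINT handler, so ctrl-c still reaches the interactive shell; only the child is shielded.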
Andre Schumacher
457bcd3343
PySpark: implementing subtractByKey(), subtract() and keyBy()
2013-08-28 16:14:22 -07:00
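As a rough illustration of what the three operations named in the commit above compute, here is a sketch on plain Python lists. It is not the PySpark implementation and does not use Spark; the function names are hypothetical stand-ins, and elements are assumed to be hashable. Unlike a distributed RDD, a list preserves element order.

```python
def key_by(items, f):
    # Pair every element with the key computed by f: x -> (f(x), x).
    return [(f(x), x) for x in items]


def subtract_by_key(pairs, other_pairs):
    # Keep the (key, value) pairs whose key does not appear in other_pairs.
    other_keys = {k for k, _ in other_pairs}
    return [(k, v) for k, v in pairs if k not in other_keys]


def subtract(items, other_items):
    # Keep the elements that do not appear in other_items at all.
    other = set(other_items)
    return [x for x in items if x not in other]


if __name__ == "__main__":
    x = [("a", 1), ("b", 4), ("b", 5), ("a", 2)]
    y = [("a", 3), ("c", None)]
    print(subtract_by_key(x, y))
    print(subtract(x, y))
    print(key_by(range(3), lambda n: n * n))
```

Note the difference between the two subtractions: `subtract_by_key` compares keys only, while `subtract` compares whole elements.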
Joseph E. Gonzalez
3325083552
Removing commented out test code
2013-08-28 14:57:12 -07:00
Matei Zaharia
baa84e7e4c
Merge pull request #865 from tgravescs/fixtmpdir
...
Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves
aac1214ee4
Change Executor to only look at the env variable SPARK_YARN_MODE
2013-08-28 13:26:26 -05:00
Matei Zaharia
cd043cf922
Merge pull request #867 from tgravescs/yarnenvconfigs
...
Spark on Yarn allow users to specify environment variables
2013-08-27 19:50:32 -07:00
Joseph E. Gonzalez
766b6fd380
Fixing IndexedRDD unit tests.
2013-08-27 18:54:26 -07:00
Joseph E. Gonzalez
9afd0e2375
Merging upstream changes.
2013-08-27 18:26:54 -07:00
Joseph E. Gonzalez
93503a7054
Allowing RDD to select its implementation of PairRDDFunctions
2013-08-27 18:16:19 -07:00
Y.CORP.YAHOO.COM\tgraves
3f206bf0b5
Updated based on review comments.
2013-08-27 14:34:27 -05:00
Y.CORP.YAHOO.COM\tgraves
cf52a3cba6
Allow for Executors to have different directories than the Spark Master for Yarn
2013-08-27 11:00:21 -05:00
Matei Zaharia
898da7e422
Merge pull request #859 from ianbuss/sbt_opts
...
Pass SBT_OPTS environment through to sbt_launcher
2013-08-26 20:40:49 -07:00
Y.CORP.YAHOO.COM\tgraves
63dc635de6
fix typos
2013-08-26 17:06:20 -05:00
Y.CORP.YAHOO.COM\tgraves
c9464c74a1
Add ability for user to specify environment variables
2013-08-26 16:44:27 -05:00
Y.CORP.YAHOO.COM\tgraves
6dd64e8bb2
Update docs and remove old reference to --user option
2013-08-26 14:29:24 -05:00
shivaram
17bafeab39
Merge pull request #864 from rxin/json1
...
Revert json library change
2013-08-26 11:59:32 -07:00
Y.CORP.YAHOO.COM\tgraves
dfb4c697bc
Throw exception if the yarn local dirs aren't set
2013-08-26 13:57:01 -05:00
Reynold Xin
a77e0abb96
Added worker state to the cluster master JSON ui.
2013-08-26 11:21:03 -07:00
Reynold Xin
9db1e50344
Revert "Merge pull request #841 from rxin/json"
...
This reverts commit 1fb1b09928, reversing changes made to c69c48947d.
2013-08-26 11:05:14 -07:00
Y.CORP.YAHOO.COM\tgraves
c0b4095ee8
Change to use Yarn appropriate directories rather than /tmp or the user specified spark.local.dir
2013-08-26 12:48:37 -05:00
Shivaram Venkataraman
dc06b52879
Add an option to turn off data validation, test it.
...
Also moves addIntercept to have default true to make it similar
to validateData option
2013-08-25 23:14:35 -07:00
Shivaram Venkataraman
b8c50a0642
Center & scale variables in Ridge, Lasso.
...
Also add a unit test that checks if ridge regression lowers
cross-validation error.
2013-08-25 22:24:27 -07:00
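Centering and scaling, as named in the commit above, generally means subtracting each feature's mean and dividing by its standard deviation before fitting. The sketch below shows that idea in plain Python. It is not the code from the commit; the function name `center_and_scale`, the use of the population standard deviation, and the guard for constant columns are all assumptions made for illustration.

```python
import math


def center_and_scale(rows):
    # rows: a non-empty list of equal-length feature vectors (lists of floats).
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        variance = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # Assumption: a constant column gets a divisor of 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(variance) or 1.0)
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]
    # The means and divisors are returned so the same transform can be
    # applied to new data or undone on the fitted weights.
    return scaled, means, stds


if __name__ == "__main__":
    scaled, means, stds = center_and_scale([[1.0, 10.0], [3.0, 10.0]])
    print(scaled, means, stds)
```

Putting features on a common scale matters for penalized models such as ridge and lasso because the penalty treats all coefficients alike.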
Patrick Wendell
f9fc5c160a
Merge pull request #603 from pwendell/ec2-updates
...
Several Improvements to EC2 Scripts
2013-08-24 15:19:56 -07:00
Patrick Wendell
2cfe52ef55
Version bump for ec2 docs
2013-08-24 15:16:53 -07:00
Patrick Wendell
4879685910
Merge remote-tracking branch 'mesos/master' into ec2-updates
2013-08-24 14:50:58 -07:00