Reynold Xin
f5812d0354
Added mapPartitionsWithSplit to the programming guide.
2012-09-29 01:31:36 -07:00
Matei Zaharia
2f11e3c285
Merge pull request #227 from JoshRosen/fix/distinct_numsplits
Allow controlling number of splits in distinct().
2012-09-28 23:57:24 -07:00
Josh Rosen
8654165e69
Use null as dummy value in distinct().
2012-09-28 23:55:17 -07:00
Josh Rosen
37c199bbb0
Allow controlling number of splits in distinct().
2012-09-28 23:44:19 -07:00
Matei Zaharia
56dcad5936
Don't create a Cache in SparkEnv because we don't use it
2012-09-28 23:40:56 -07:00
Matei Zaharia
1d44644f4f
Logging tweaks
2012-09-28 23:28:16 -07:00
Matei Zaharia
815d6bd69a
Renamed subdirs option
2012-09-28 19:02:41 -07:00
Matei Zaharia
e54e1d7043
Made subdirs per local dir configurable, and reduced lock usage a bit
2012-09-28 19:00:50 -07:00
Matei Zaharia
ae8c7d6cfa
Made disk store use multiple directories, deleted ShuffleManager
2012-09-28 18:28:13 -07:00
Matei Zaharia
3d7267999d
Print and track user call sites in more places in Spark
2012-09-28 17:42:00 -07:00
Matei Zaharia
9f6efbf06a
Merge pull request #225 from pwendell/dev
Log message which records RDD origin
2012-09-28 16:28:07 -07:00
Matei Zaharia
0121a26bd1
Changed the way tasks' dependency files are sent to workers so that
custom serializers or Kryo registrators can be loaded.
2012-09-28 16:14:05 -07:00
Patrick Wendell
9fc78f8f29
Fixing some whitespace issues
2012-09-28 16:05:50 -07:00
Patrick Wendell
bc909c2903
Changes based on Matei's comments
2012-09-28 16:04:36 -07:00
Patrick Wendell
c387e40fb1
Log message which records RDD origin
This adds tracking to determine the "origin" of an RDD. Origin is defined by
the boundary between the user's code and the Spark code, during an RDD's
instantiation. It is meant to help users understand where a Spark RDD is
coming from in their code.
This patch also logs origin data when stages are submitted to the scheduler.
Finally, it adds a new log message to fix an inconsistency in the way that
dependent stages (those missing parents) and independent stages (those
without) are logged during submission.
2012-09-28 15:51:46 -07:00
Matei Zaharia
915ab970b7
Make error reporting less scary if we can't look up UseCompressedOops
2012-09-28 14:52:37 -07:00
Matei Zaharia
2a8bfbca00
Fixed a bug where isLocal was set to false when using local[K]
2012-09-28 14:50:54 -07:00
Matei Zaharia
4a138403ef
Fix a bug in JAR fetcher that made it always fetch the JAR
2012-09-27 21:32:06 -07:00
Matei Zaharia
009b0e37e7
Added an option to compress blocks in the block store
2012-09-27 18:45:44 -07:00
Matei Zaharia
7bcb08cef5
Renamed storage levels to something cleaner; fixes #223.
2012-09-27 17:50:59 -07:00
Matei Zaharia
0850d641af
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-26 23:53:48 -07:00
Matei Zaharia
bf18e0994e
Minor typos
2012-09-26 23:53:38 -07:00
Matei Zaharia
a4093f7563
Minor doc fixes
2012-09-26 23:22:15 -07:00
Matei Zaharia
920fab23c3
Merge pull request #222 from rxin/dev
Added MapPartitionsWithSplitRDD.
2012-09-26 23:16:45 -07:00
Matei Zaharia
ea05fc130b
Updates to standalone cluster, web UI and deploy docs.
2012-09-26 22:54:39 -07:00
Matei Zaharia
1ef4f0fbd2
Allow controlling number of splits in sortByKey.
2012-09-26 19:18:47 -07:00
Matei Zaharia
874a9fd407
More updates to docs, including tuning guide
2012-09-26 19:17:58 -07:00
Reynold Xin
1ad1331a34
Added MapPartitionsWithSplitRDD.
2012-09-26 17:11:28 -07:00
Matei Zaharia
ee71fa49c1
Look for Kryo registrator using context class loader
2012-09-26 14:15:16 -07:00
Matei Zaharia
a417cd4d9d
Look for Kryo registrator using context class loader
2012-09-26 14:14:17 -07:00
Matei Zaharia
58eb44acbb
Doc tweaks
2012-09-26 00:32:59 -07:00
Matei Zaharia
d71a358c46
Fixed a test that was getting extremely lucky before, and increased the
number of samples used for sorting
2012-09-26 00:25:34 -07:00
Matei Zaharia
d51d5e0582
Doc fixes
2012-09-25 23:59:04 -07:00
Matei Zaharia
c5754bb939
Fixes to Java guide
2012-09-25 23:51:04 -07:00
Matei Zaharia
f1246cc7c1
Various enhancements to the programming guide and HTML/CSS
2012-09-25 23:26:56 -07:00
Matei Zaharia
051785c7e6
Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Matei Zaharia
56c90485fd
More updates to documentation
2012-09-25 19:31:07 -07:00
Matei Zaharia
1821bf1f1f
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-25 15:46:27 -07:00
Matei Zaharia
e47e11720f
Documentation updates
2012-09-25 15:46:18 -07:00
Matei Zaharia
635348b185
Merge pull request #220 from andyk/doc
One commit that makes nav dropdowns show on hover
2012-09-25 15:45:49 -07:00
Andy Konwinski
098351b735
Makes nav menu dropdowns show on hover instead of on click.
2012-09-25 15:42:50 -07:00
Andy Konwinski
8d30fe616e
Merge remote-tracking branch 'public-spark/dev' into doc
2012-09-25 15:19:57 -07:00
Matei Zaharia
30362a21e7
Update license info on deploy scripts
2012-09-25 14:43:47 -07:00
Matei Zaharia
60bce9574f
Update Jekyll plugin to look at Scala 2.9.2 docs
2012-09-25 14:31:53 -07:00
Matei Zaharia
aa50d5b9a2
Merge pull request #219 from rnpandya/dev
Add spark-shell.cmd
2012-09-25 09:58:35 -07:00
Ravi Pandya
36349a2fb6
Add spark-shell.cmd
2012-09-25 07:26:29 -07:00
Matei Zaharia
4d3339a3ec
Merge pull request #217 from rxin/dev
Added a method to RDD to expose the ClassManifest.
2012-09-24 23:52:32 -07:00
Reynold Xin
7a4cd92861
Renamed RDD.manifest to RDD.elementClassManifest
2012-09-24 23:42:33 -07:00
Matei Zaharia
296e24b440
Merge pull request #218 from rnpandya/dev
Scripts to start Spark under windows
2012-09-24 21:10:31 -07:00
Reynold Xin
348bcbca1f
Added a method to RDD to expose the ClassManifest.
2012-09-24 16:56:27 -07:00