Matei Zaharia
e2c68642c6
Miscellaneous fixes from code review.
...
Also replaced SparkConf.getOrElse with just a "get" that takes a default
value, and added getInt, getLong, etc. to simplify code that uses this
later on.
2014-01-01 22:03:39 -05:00
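The SparkConf change in the commit above (a "get" that takes a default value, plus typed getters) can be illustrated with a small sketch. This is not Spark's SparkConf; the class `Conf`, its methods `get_int` and `get_bool`, and the keys used are all hypothetical names chosen only to show the pattern.

```python
class Conf:
    """Hypothetical string-keyed configuration store (not Spark's SparkConf)."""

    def __init__(self, settings=None):
        # Values are stored as strings, as they would be if read from properties.
        self._settings = {k: str(v) for k, v in (settings or {}).items()}

    def set(self, key, value):
        self._settings[key] = str(value)
        return self  # allow chaining

    def get(self, key, default=None):
        """Return the stored string, or the default if the key is unset."""
        if key in self._settings:
            return self._settings[key]
        if default is None:
            raise KeyError(key)
        return default

    def get_int(self, key, default):
        # Typed getter: parse the stored string, falling back to the default.
        return int(self.get(key, str(default)))

    def get_bool(self, key, default):
        return self.get(key, str(default)).strip().lower() == "true"


conf = Conf().set("app.retries", 4).set("app.verbose", "true")
retries = conf.get_int("app.retries", 1)    # stored value is parsed
timeout = conf.get_int("app.timeout", 30)   # unset key falls back to default
```

With typed getters, callers no longer repeat the "look up, fall back, convert" sequence at every use site.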
Matei Zaharia
45ff8f413d
Merge remote-tracking branch 'apache/master' into conf2
...
Conflicts:
core/src/main/scala/org/apache/spark/SparkContext.scala
core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala
2014-01-01 21:25:00 -05:00
Reza Zadeh
7c04b3134a
Merge remote-tracking branch 'upstream/master' into sparsesvd
2014-01-01 18:12:35 -08:00
Patrick Wendell
c1d928a897
Merge pull request #312 from pwendell/log4j-fix-2
...
SPARK-1008: Logging improvements
1. Adds a default log4j file that gets loaded if users haven't specified a log4j file.
2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings
after building with SBT (and I've seen similar warnings on the mailing list).
2014-01-01 17:03:48 -08:00
Patrick Wendell
f8d245bdfc
Merge remote-tracking branch 'apache-github/master' into log4j-fix-2
...
Conflicts:
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
2014-01-01 16:10:51 -08:00
Reynold Xin
dc9cb83e86
Merge pull request #126 from jegonzal/FixingPersist
...
Fixing Persist Behavior
2014-01-01 13:28:34 -08:00
Andrew Or
92c304fd03
Simplify ExternalAppendOnlyMap on the assumption that the mergeCombiners function is specified
2014-01-01 11:42:33 -08:00
Matei Zaharia
0e5b2adb5c
Merge remote-tracking branch 'apache/master' into conf2
...
Conflicts:
project/SparkBuild.scala
2014-01-01 13:28:54 -05:00
Lian, Cheng
dd6033e685
Aggregated all sample points to driver without any shuffle
2014-01-02 01:38:24 +08:00
Reynold Xin
9a0ff721c9
Merge pull request #314 from witgo/master
...
restore core/pom.xml file modification
2013-12-31 21:50:24 -08:00
Joseph E. Gonzalez
2f2524fd11
Addressing issue in compute where compute is invoked instead of iterator on the parent RDD.
2013-12-31 21:37:51 -08:00
Andrew Or
3bc9e391a3
Merge branch 'master' of github.com:andrewor14/incubator-spark
2013-12-31 20:02:12 -08:00
Andrew Or
83dfa16664
Address Patrick's and Reynold's comments
2013-12-31 20:02:05 -08:00
liguoqiang
b5d0b3b0f7
restore core/pom.xml file modification
2014-01-01 11:30:08 +08:00
Reynold Xin
8b8e70ebde
Merge pull request #73 from falaki/ApproximateDistinctCount
...
Approximate distinct count
Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count the number of distinct elements and the number of distinct values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.
2013-12-31 17:48:24 -08:00
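For readers unfamiliar with the HyperLogLog technique mentioned in the commit above, here is a minimal, self-contained sketch of the idea. It is not the stream-lib implementation and not Spark's API; the class name `HyperLogLog`, the parameter `p`, and the choice of SHA-1 as the hash are assumptions made for illustration. The parameter `p` plays the role of the accuracy/memory knob: the sketch keeps 2^p registers, and the relative error shrinks roughly as 1.04 / sqrt(2^p).

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog estimator (assumes p >= 7)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take a 64-bit hash of the item's string form.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


hll = HyperLogLog(p=12)
for i in range(10000):
    hll.add(i)
    hll.add(i)   # duplicates do not change the estimate
estimate = hll.count()
```

A per-key variant in the spirit of countApproxDistinctByKey would keep one such sketch per key and merge sketches by taking the element-wise maximum of their registers.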
Aaron Davidson
08302b113a
Rename IntermediateBlockId to TempBlockId
2013-12-31 17:44:15 -08:00
Patrick Wendell
37c43c9dd1
Adding outer checkout when initializing logging
2013-12-31 17:36:56 -08:00
Andrew Or
8bbe08b21e
Merge branch 'master' of github.com:andrewor14/incubator-spark
2013-12-31 17:26:26 -08:00
Andrew Or
53d8d36684
Add support and test for null keys in ExternalAppendOnlyMap
...
Also add safeguard against use of destructively sorted AppendOnlyMap
2013-12-31 17:19:02 -08:00
Hossein Falaki
bee445c927
Made the code more compact and readable
2013-12-31 16:58:18 -08:00
Hossein Falaki
acb0323053
minor improvements
2013-12-31 15:34:26 -08:00
Matei Zaharia
42bcfb2bb2
Fix two compile errors introduced in merge
2013-12-31 18:26:23 -05:00
Matei Zaharia
ba9338f104
Merge remote-tracking branch 'apache/master' into conf2
...
Conflicts:
core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
2013-12-31 18:23:14 -05:00
Joseph E. Gonzalez
3d93d73396
Fixing the persist behavior of the Vertex and Edge RDDs to persist the current RDD and not the parent.
2013-12-31 15:19:01 -08:00
Patrick Wendell
63b411dd86
Merge pull request #238 from ngbinh/upgradeNetty
...
upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final
the changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy
2013-12-31 14:31:28 -08:00
Joey
32d6ae9d9c
Merge pull request #120 from ankurdave/subgraph-reuses-view
...
Reuse VTableReplicated in GraphImpl.subgraph
2013-12-31 13:51:07 -08:00
Andrew Or
3ce22df954
Add warning message for spilling
2013-12-31 11:33:10 -08:00
Andrew Or
94ddc91d06
Address Aaron's and Jerry's comments
2013-12-31 10:50:08 -08:00
Patrick Wendell
55b7e2fdff
Merge pull request #289 from tdas/filestream-fix
...
Bug fixes for file input stream and checkpointing
- Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (for example, listing files failed while a background thread was deleting them).
- Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS compatible store requiring special configuration.
- Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten.
- Fixed bug where setting checkpoint directory as a relative local path caused the checkpointing to fail.
2013-12-31 10:12:51 -08:00
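The checkpoint-directory behavior described in the commit above (always create a unique subdirectory under the user-specified directory, and cope with relative local paths) can be sketched as follows. This is only an illustration on the local filesystem, not Spark's code: the function name `make_checkpoint_dir` is hypothetical, and using a random UUID for the subdirectory name and resolving the relative path to an absolute one are assumptions.

```python
import os
import tempfile
import uuid


def make_checkpoint_dir(base_dir):
    """Create and return a fresh, uniquely named subdirectory of base_dir."""
    # Resolve relative paths up front so later working-directory changes
    # cannot alter where checkpoints are written.
    base = os.path.abspath(base_dir)
    path = os.path.join(base, str(uuid.uuid4()))
    # makedirs also creates base if it does not exist yet; because the name
    # is unique, files from an earlier run are never overwritten.
    os.makedirs(path)
    return path


with tempfile.TemporaryDirectory() as tmp:
    first = make_checkpoint_dir(tmp)
    second = make_checkpoint_dir(tmp)   # a different directory each call
```

Because each call yields a new directory, there is no need for a flag such as 'useExisting' to decide whether an existing directory may be reused.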
Jianping J Wang
6e50df6255
Update SvdppSuite.scala
2013-12-31 22:02:16 +08:00
Jianping J Wang
61e6671f5a
fix test bug
2013-12-31 22:01:02 +08:00
Tathagata Das
fcd17a1e8e
Fixed comments and long lines based on comments on PR 289.
2013-12-31 02:01:45 -08:00
Jianping J Wang
12c26d7fb9
Update Svdpp.scala
2013-12-31 17:18:15 +08:00
Jianping J Wang
ab7b8ce13e
Update Svdpp.scala
2013-12-31 17:11:20 +08:00
Jianping J Wang
4a30f69b25
update svdpp test
2013-12-31 17:00:59 +08:00
Jianping J Wang
779c66ae4e
refactor and fix bugs
2013-12-31 16:59:05 +08:00
Tathagata Das
977bcc36d4
Merge branch 'apache-master' into project-refactor
2013-12-31 00:43:38 -08:00
Tathagata Das
87b915f221
Removed extra empty lines.
2013-12-31 00:42:10 -08:00
Tathagata Das
3ab297adaa
Removed unnecessary comments.
2013-12-31 00:38:19 -08:00
Tathagata Das
97630849ff
Added pom.xml for external projects and removed unnecessary dependencies and repositories from other poms and sbt.
2013-12-31 00:28:57 -08:00
Patrick Wendell
4abb0c57ab
Tiny typo fix
2013-12-31 00:05:03 -08:00
Patrick Wendell
4d009dcac6
Removing use in test
2013-12-31 00:01:44 -08:00
Patrick Wendell
3c254f2eec
Minor fixes
2013-12-30 23:55:33 -08:00
Aaron Davidson
375d11743c
Add new line at end of file
2013-12-30 23:42:37 -08:00
Patrick Wendell
18181e6c41
Removing initLogging entirely
2013-12-30 23:39:47 -08:00
Aaron Davidson
daa7792ad6
Refactor SamplingSizeTracker into SizeTrackingAppendOnlyMap
2013-12-30 23:39:02 -08:00
Hossein Falaki
d6cded7155
Added Java unit tests for countApproxDistinct and countApproxDistinctByKey
2013-12-30 19:32:05 -08:00
Hossein Falaki
c3073b6cf2
Added Java API for countApproxDistinct
2013-12-30 19:31:06 -08:00
Hossein Falaki
ed06500d30
Added Java API for countApproxDistinctByKey
2013-12-30 19:30:42 -08:00
Hossein Falaki
b75d7c98bc
Added stream 2.5.1 jar dependency
2013-12-30 19:29:17 -08:00