287 commits

Author SHA1 Message Date
Jey Kottalam 69c3bbf688 dynamically detect hadoop version 2013-08-15 16:50:37 -07:00
Matei Zaharia d9588183fa Update to Mesos 0.12.1 2013-08-13 18:51:35 -07:00
jerryshao 320e87e7ab Add MetricsServlet for Spark metrics system 2013-08-12 13:23:23 +08:00
Matei Zaharia dce5e47435 Merge pull request #800 from dlyubimov/HBASE_VERSION
Pull HBASE_VERSION in the head of sbt build
2013-08-09 21:53:45 -07:00
Matei Zaharia cd247ba5bb Merge pull request #786 from shivaram/mllib-java
Java fixes, tests and examples for ALS, KMeans
2013-08-09 20:41:13 -07:00
Dmitriy Lyubimov 27f674f82b fewer words 2013-08-09 13:54:41 -07:00
Dmitriy Lyubimov ae95b57469 Pull HBASE_VERSION in the head of sbt build 2013-08-09 12:45:18 -07:00
Matei Zaharia 5a4003c1ac Update to Chill 0.3.1 2013-08-08 13:30:27 -07:00
Shivaram Venkataraman 471fbadd0c Java examples, tests for KMeans and ALS
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
  easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
  called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
  examples project.

Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Workaround a bug where using double[] from Java leads to class cast exception in
  KMeans init
2013-08-06 15:43:46 -07:00
Matei Zaharia e466a55a6b Revert Mesos version to 0.9 since the 0.12 artifact has target Java 7 2013-08-01 15:45:21 -07:00
Matei Zaharia b2b86c2575 Merge pull request #753 from shivaram/glm-refactor
Build changes for ML lib
2013-07-31 15:51:39 -07:00
Matei Zaharia 14bf2fe039 Merge pull request #749 from benh/spark-executor-uri
Added property 'spark.executor.uri' for launching on Mesos.
2013-07-31 14:18:16 -07:00
Shivaram Venkataraman 15fd0d619d Add mllib, bagel to repl dependencies
Also don't build an assembly jar for them
2013-07-30 18:31:11 -07:00
Reynold Xin 3b1ced83fb Exclude older version of Snappy in streaming and examples. 2013-07-30 17:25:36 -07:00
Reynold Xin 368c58eac5 Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compression
Conflicts:
	project/SparkBuild.scala
2013-07-30 16:04:18 -07:00
Shivaram Venkataraman 48851d4dd9 Add bagel, mllib to SBT assembly.
Also add jblas dependency to mllib pom.xml
2013-07-30 14:03:15 -07:00
Benjamin Hindman f6f46455eb Added property 'spark.executor.uri' for launching on Mesos without
requiring Spark to be installed. Using 'make_distribution.sh' a user
can put a Spark distribution at a URI supported by Mesos (e.g.,
'hdfs://...') and then set that when launching their job. Also added
SPARK_EXECUTOR_URI for the REPL.
2013-07-29 23:32:52 -07:00
ryanlecompte 8e0939f5a9 refactor Kryo serializer support to use chill/chill-java 2013-07-24 20:43:57 -07:00
jerryshao 5730193e0c Fix some typos 2013-07-24 14:57:47 +08:00
jerryshao 576528f0f9 Add dependency of Codahale's metrics library 2013-07-24 14:57:46 +08:00
Josh Rosen c83680434b Add JavaAPICompletenessChecker.
This is used to find methods in the Scala API that
need to be ported to the Java API.  To use it:

  ./run spark.tools.JavaAPICompletenessChecker
Conflicts:
	project/SparkBuild.scala
	run
	run2.cmd
2013-07-22 16:11:49 -07:00
Liang-Chi Hsieh d1738d72ba also exclude asm for hadoop2. hadoop1 looks like no need to do that too. 2013-07-20 00:37:24 +08:00
Liang-Chi Hsieh 3aad452653 fix a bug in build process that pulls in two versions of ASM. 2013-07-19 02:29:46 +08:00
Matei Zaharia cad48edb70 Merge pull request #708 from ScrapCodes/dependencies-upgrade
Dependency upgrade Akka 2.0.3 -> 2.0.5
2013-07-16 21:41:28 -07:00
Matei Zaharia af3c9d5042 Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
Prashant Sharma 2748e73eb9 Dependency upgrade Akka 2.0.3 -> 2.0.5 2013-07-16 16:08:46 +05:30
Matei Zaharia 668b0dc6a7 Merge branch 'master' of github.com:mesos/spark 2013-07-13 19:10:46 -07:00
Matei Zaharia cd28d9c147 Merge remote-tracking branch 'origin/pr/662'
Conflicts:
	bin/compute-classpath.sh
2013-07-13 19:10:00 -07:00
seanm c4d5b01e44 changing com.google.code.findbugs maven coordinates 2013-07-13 14:56:23 -06:00
Matei Zaharia 3cc6818f13 Merge pull request #668 from shimingfei/guava-14.0.1
update guava version from 11.0.1 to 14.0.1
2013-07-06 19:51:20 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 43b24635ee Renamed ML package to MLlib and added it to classpath 2013-07-05 11:38:53 -07:00
Matei Zaharia 05be233ce2 Removed dependency on Apache Commons Math 2013-07-05 11:13:46 -07:00
Reynold Xin 6a9a9a364c Minor clean up of the RidgeRegression code. I am not even sure why I did
this :s.
2013-07-05 11:13:45 -07:00
Matei Zaharia 729e463f64 Import RidgeRegression example
Conflicts:
	run
2013-07-05 11:13:41 -07:00
Gavin Li 94238aae57 fix dependencies 2013-07-03 18:08:38 +00:00
Mingfei 04567a1771 update guava version from 11.0.1 to 14.0.1 2013-07-03 17:43:37 +08:00
Matei Zaharia 5cfcd3c336 Remove Twitter4J specific repo since it's in Maven central 2013-06-29 15:37:27 -07:00
Evan Chan 1107b4d55b Merge branch 'master' into 2013-06/assembly-jar-deploy
Conflicts:
	run

Previous changes that I made to run and set-dev-classpath.sh instead
have been folded into compute-classpath.sh
2013-06-28 17:18:35 -07:00
Matei Zaharia 32370da4e4 Don't use forward slash in exclusion for JAR signature files 2013-06-25 22:08:19 -04:00
Evan Chan d2f46ac680 Merge branch 'master' into 2013-06/assembly-jar-deploy
Conflicts:
	run
2013-06-25 14:50:16 -07:00
Tathagata Das c89af0a7f9 Merge branch 'master' into streaming
Conflicts:
	.gitignore
2013-06-24 23:57:47 -07:00
Patrick Wendell 91ec5a1a04 Changing JSON protocol and removing spray code 2013-06-22 10:31:36 -07:00
Matei Zaharia b350f34703 Increase memory for tests to prevent a crash on JDK 7 2013-06-22 07:48:20 -07:00
Evan Chan 071ff7efa1 Enable building a fat jar for the Spark REPL 2013-06-20 17:53:23 -07:00
Matei Zaharia ae7a5da6b3 Fix some dependency issues in SBT build (same will be needed for Maven):
- Exclude a version of ASM 3.x that comes from HBase
- Don't use a special ASF repo for HBase
- Update SLF4J version
- Add sbt-dependency-graph plugin so we can easily find dependency trees
2013-06-20 18:44:46 +02:00
Matei Zaharia 7902baddc7 Update ASM to version 4.0 2013-06-19 13:34:30 +02:00
Matei Zaharia dbfab49d2a Merge remote-tracking branch 'milliondreams/casdemo'
Conflicts:
	project/SparkBuild.scala
2013-06-18 14:55:31 +02:00
Matei Zaharia 73f4c7d2d1 Merge pull request #605 from esjewett/SPARK-699
Add hBase example (retry of pull request #596)
2013-06-18 04:21:17 -07:00
Matei Zaharia 2ab311f4ce Removed second version of junit test plugin from plugins.sbt 2013-06-18 00:40:25 +02:00
Christopher Nguyen 479442a9b9 Add zeroLengthPartitions() test to make sure, e.g., StatCounter.scala can handle empty partitions without incorrectly returning NaN 2013-06-15 17:35:55 -07:00
Rohit Rai 6d8423fd1b Adding deps to examples/pom.xml
Fixing exclusion in examples deps in SparkBuild.scala
2013-06-02 13:03:45 +05:30
Rohit Rai 3be7bdcefd Adding example to make Spark RDD from Cassandra 2013-06-01 19:32:17 +05:30
Reynold Xin f742435f18 Removed the duplicated netty dependency in SBT build file. 2013-05-16 14:31:03 -07:00
Reynold Xin f3491cb89b Merge branch 'master' of github.com:mesos/spark into shufflemerge
Conflicts:
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/test/scala/spark/DistributedSuite.scala
	project/SparkBuild.scala
2013-05-15 00:31:52 -07:00
Reynold Xin 81ad2fa331 Merge branch 'jdbc' of github.com:koeninger/spark
Conflicts:
	project/SparkBuild.scala
2013-05-14 23:12:00 -07:00
Cody Koeninger b16c4896f6 add test for JdbcRDD using embedded derby, per rxin suggestion 2013-05-14 23:44:04 -05:00
Ethan Jewett ee6f6aa6cd Add hBase example 2013-05-09 18:33:38 -05:00
Reynold Xin 012c9e5ab0 Revert "Merge pull request #596 from esjewett/master" because the
dependency on hbase introduces netty-3.2.2 which conflicts with
netty-3.5.3 already in Spark. This caused multiple test failures.

This reverts commit 0f1b7a06e1, reversing
changes made to aacca1b8a8.
2013-05-09 14:20:01 -07:00
Reynold Xin 90577ada69 Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge
Conflicts:
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/DiskStore.scala
	project/SparkBuild.scala
2013-05-07 15:56:19 -07:00
Ethan Jewett 02e8cfa617 HBase example 2013-05-04 12:31:30 -05:00
Jey Kottalam 207afe4088 Remove spark-repl's extraneous dependency on spark-streaming 2013-05-01 16:57:31 -07:00
Matei Zaharia f1f92c88eb Build against Hadoop 1 by default 2013-04-29 17:08:45 -07:00
Matei Zaharia 1b169f190c Exclude old versions of Netty, which had a different Maven organization 2013-04-25 19:52:12 -07:00
Matei Zaharia eef9ea1993 Update unit test memory to 2 GB 2013-04-25 00:42:29 -07:00
Matei Zaharia 01d9ba5038 Add back line removed during YARN merge 2013-04-25 00:11:27 -07:00
Mridul Muralidharan 3b594a4e3b Do not add signature files - results in validation errors when using assembled file 2013-04-24 10:18:25 +05:30
Mridul Muralidharan dd515ca3ee Attempt at fixing merge conflict 2013-04-24 09:24:17 +05:30
Mridul Muralidharan adcda84f96 Pull latest SparkBuild.scala from master and merge conflicts 2013-04-24 08:57:25 +05:30
Mridul Muralidharan 5b85c715c8 Revert back to 2.0.2-alpha : 0.23.7 has protocol changes which break against cloudera 2013-04-24 02:57:51 +05:30
Mridul Muralidharan 8faf5c51c3 Patch from Thomas Graves to improve the YARN Client, and move to more production ready hadoop yarn branch 2013-04-24 02:31:57 +05:30
Mridul Muralidharan 7acab3ab45 Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo 2013-04-22 08:01:13 +05:30
Matei Zaharia 17e076de80 Turn on forking in test JVMs to reduce the pressure on perm gen and code
cache sizes due to having 2 instances of the Scala compiler and a bunch
of classloaders.
2013-04-18 22:25:57 -07:00
Mridul Muralidharan 5d891534fd Move back to 2.0.2-alpha, since 2.0.3-alpha is not available in cloudera yet. Also, add netty dependency explicitly to prevent resolving to older 2.3x version. Additionally, comment out retrievePattern to ensure correct netty is picked up 2013-04-17 05:54:43 +05:30
Matei Zaharia ec5e553b41 Merge pull request #558 from ash211/patch-jackson-conflict
Don't pull in old versions of Jackson via hadoop-core
2013-04-14 08:20:13 -07:00
Matei Zaharia ed336e0d44 Fix tests from different projects running in parallel in SBT 0.12 2013-04-11 22:29:37 -04:00
Andrew Ash 18bd41d1a3 Don't pull in old versions of Jackson via hadoop-core 2013-04-09 14:44:47 -04:00
Matei Zaharia 65caa8f711 Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
Conflicts:
	docs/_config.yml
	project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Matei Zaharia 1cb3eb9762 Merge remote-tracking branch 'kalpit/master'
Conflicts:
	project/SparkBuild.scala
2013-04-07 20:54:18 -04:00
Matei Zaharia b362df39ea Merge pull request #552 from MLnick/master
Bumping version for Twitter Algebird to latest
2013-04-07 17:17:52 -07:00
Mridul Muralidharan 6798a09df8 Add support for building against hadoop2-yarn : adding new maven profile for it 2013-04-07 17:47:38 +05:30
shane-huang df47b40b76 Shuffle Performance fix: Use netty embedded OIO file server instead of ConnectionManager
Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages
change reference from io.Source to scala.io.Source to avoid looking into io.netty package

Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-04-07 14:37:12 +08:00
Andy Konwinski 5555811bd5 Update build to Scala 2.9.3 2013-04-04 13:26:45 -07:00
Nick Pentreath 0f54344fd8 Bumping Algebird version in examples now that it supports JDK 1.6 2013-04-03 13:15:34 +02:00
Jey Kottalam bc8ba222ff Bump development version to 0.8.0 2013-03-28 15:42:01 -07:00
kalpit f0164e5047 upgraded sbt version, sbt plugins and some library dependencies to latest stable version 2013-03-26 17:49:29 -07:00
Holden Karau 8456d673e2 Re-enable deprecation warnings since there are only two 2013-03-24 17:30:23 -07:00
Holden Karau e104a76016 Makes the syntax highlighting on the build file not broken in emacs. 2013-03-24 16:16:05 -07:00
seanm 42822cf95d changing streaming resolver for akka 2013-03-13 11:40:42 -06:00
seanm 4aa1205202 adding typesafe repo to streaming resolvers so that akka-zeromq is found 2013-03-11 12:37:29 -06:00
Hiral Patel 664e5fd24b Fix reference bug in Kryo serializer, add test, update version 2013-03-07 22:16:11 -08:00
Matei Zaharia db9b90fdbd Change version to 0.7.1-SNAPSHOT for development branch 2013-02-27 09:15:26 -08:00
Matei Zaharia 7e67c626ee Change version number to 0.7.0 2013-02-25 20:30:47 -08:00
Matei Zaharia 6494cab19d Update Hadoop dependency to 1.0.4 2013-02-25 15:38:21 -08:00
Prashant Sharma 254acb1666 Moving akka dependency resolver to shared. 2013-02-25 13:37:07 +05:30
Tathagata Das 5ab37be983 Fixed class paths and dependencies based on Matei's comments. 2013-02-24 16:24:52 -08:00
Tathagata Das f282bc4960 Changed Algebird from 0.1.9 to 0.1.8 2013-02-24 12:44:12 -08:00
Tathagata Das 24c0cd6168 Fixed resolver for akka-zeromq 2013-02-22 18:23:29 -08:00
Tathagata Das cfa65ebff1 Merge pull request #480 from MLnick/streaming-eg-algebird
[Streaming] Examples using Twitter's Algebird library
2013-02-22 12:29:04 -08:00
Nick Pentreath 718474b9c6 Bumping Algebird to 0.1.9 2013-02-21 12:11:31 +02:00