ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Joseph E. Gonzalez	1856b37e9d	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	2013-10-18 12:21:19 -07:00
Patrick Wendell	35befe07bb	Fixing spark streaming example and a bug in examples build. - Examples assembly included a log4j.properties which clobbered Spark's - Example had an error where some classes weren't serializable - Did some other clean-up in this example	2013-10-15 22:55:43 -07:00
Joseph E. Gonzalez	ef7c369092	merged with upstream changes	2013-10-14 22:56:42 -07:00
Neal Wiggins	67d4a31f87	Remove unnecessary mutable imports	2013-10-11 09:47:27 -07:00
Joseph E. Gonzalez	8b59fb72c4	Merging latest changes from spark main branch	2013-09-17 20:56:12 -07:00
Matei Zaharia	12b2f1f9c9	Add missing license headers found with RAT	2013-09-02 12:23:03 -07:00
Matei Zaharia	0a8cc30921	Move some classes to more appropriate packages: * RDD, RDDFunctions -> org.apache.spark.rdd Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util * JavaSerializer, KryoSerializer -> org.apache.spark.serializer	2013-09-01 14:13:16 -07:00
Matei Zaharia	46eecd110a	Initial work to rename package to org.apache.spark	2013-09-01 14:13:13 -07:00
Matei Zaharia	aab345c463	Fix finding of assembly JAR, as well as some pointers to ./run	2013-08-29 21:19:06 -07:00
Matei Zaharia	53cd50c069	Change build and run instructions to use assemblies This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them. As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.	2013-08-29 21:19:04 -07:00
Jey Kottalam	4f43fd791a	make SparkHadoopUtil a member of SparkEnv	2013-08-15 16:50:37 -07:00
Evan Sparks	ff9ebfabb4	Merge pull request #762 from shivaram/sgd-cleanup Refactor SGD options into a new class.	2013-08-11 10:52:55 -07:00
Alexander Pivovarov	2d97cc46af	Fixed path to JavaALS.java and JavaKMeans.java, fixed hadoop2-yarn profile	2013-08-10 23:04:50 -07:00
Matei Zaharia	4c4f769187	Optimize Scala PageRank to use reduceByKey	2013-08-10 18:09:54 -07:00
Matei Zaharia	06e4f2a8f2	Merge pull request #789 from MLnick/master Adding Scala version of PageRank example	2013-08-10 18:06:23 -07:00
Matei Zaharia	cd247ba5bb	Merge pull request #786 from shivaram/mllib-java Java fixes, tests and examples for ALS, KMeans	2013-08-09 20:41:13 -07:00
Matei Zaharia	06303a62e5	Optimize JavaPageRank to use reduceByKey instead of groupByKey	2013-08-08 18:50:00 -07:00
Shivaram Venkataraman	2812e72200	Add setters for optimizer, gradient in SGD. Also remove java-specific constructor for LabeledPoint.	2013-08-08 16:24:31 -07:00
Shivaram Venkataraman	e1a209f791	Remove Java-specific constructor for Rating. The scala constructor works for native type java types. Modify examples to match this.	2013-08-08 14:36:02 -07:00
Nick Pentreath	c4eea875ac	Style changes as per Matei's comments	2013-08-08 12:40:37 +02:00
Nick Pentreath	cce758b893	Adding Scala version of PageRank example	2013-08-07 16:38:52 +02:00
Shivaram Venkataraman	338b7a7455	Merge branch 'master' of git://github.com/mesos/spark into sgd-cleanup Conflicts: mllib/src/main/scala/spark/mllib/util/MLUtils.scala	2013-08-06 21:21:55 -07:00
Shivaram Venkataraman	7db69d56f2	Refactor GLM algorithms and add Java tests This change adds Java examples and unit tests for all GLM algorithms to make sure the MLLib interface works from Java. Changes include - Introduce LabeledPoint and avoid using Doubles in train arguments - Rename train to run in class methods - Make the optimizer a member variable of GLM to make sure the builder pattern works	2013-08-06 17:23:22 -07:00
Shivaram Venkataraman	471fbadd0c	Java examples, tests for KMeans and ALS - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java - Renames class methods from `train` to `run` to enable static methods to be called from Java. - Add unit tests which check if both static / class methods can be called. - Also add examples which port the main() function in ALS, KMeans to the examples project. Couple of minor changes to existing code: - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init	2013-08-06 15:43:46 -07:00
stayhf	882baee489	Got rid of unnecessary map function	2013-08-06 21:34:39 +00:00
stayhf	326a7a82e0	changes as reviewer requested	2013-08-06 21:03:24 +00:00
stayhf	98fd62605d	Updated code with reviewer's suggestions	2013-08-05 00:30:28 +00:00
stayhf	a682637301	Simple PageRank algorithm implementation in Java for SPARK-760	2013-08-03 06:01:16 +00:00
Matei Zaharia	af3c9d5042	Add Apache license headers and LICENSE and NOTICE files	2013-07-16 17:21:33 -07:00
Matei Zaharia	ccfe953a4d	Merge pull request #577 from skumargithub/master Example of cumulative counting using updateStateByKey	2013-06-29 17:57:53 -07:00
Matei Zaharia	1667158544	Merge remote-tracking branch 'mrpotes/master'	2013-06-29 14:36:09 -07:00
James Phillpotts	176193b1e8	Fix usage and parameter extraction	2013-06-25 23:06:15 +01:00
James Phillpotts	366572edca	Include a default OAuth implementation, and update examples and JavaStreamingContext	2013-06-25 22:59:34 +01:00
Tathagata Das	c89af0a7f9	Merge branch 'master' into streaming Conflicts: .gitignore	2013-06-24 23:57:47 -07:00
Matei Zaharia	dbfab49d2a	Merge remote-tracking branch 'milliondreams/casdemo' Conflicts: project/SparkBuild.scala	2013-06-18 14:55:31 +02:00
Rohit Rai	b5b12823fa	Fixing the style as per feedback	2013-06-13 14:05:46 +05:30
Rohit Rai	b104c7f5c7	Example to write the output to cassandra	2013-06-03 15:15:52 +05:30
Rohit Rai	56c64c4033	A better way to read column value if you are sure the column exists in every row.	2013-06-03 12:48:35 +05:30
Rohit Rai	81c2adc15c	Removing infix call	2013-06-02 12:51:15 +05:30
Rohit Rai	3be7bdcefd	Adding example to make Spark RDD from Cassandra	2013-06-01 19:32:17 +05:30
Ethan Jewett	ee6f6aa6cd	Add hBase example	2013-05-09 18:33:38 -05:00
Reynold Xin	012c9e5ab0	Revert "Merge pull request #596 from esjewett/master" because the dependency on hbase introduces netty-3.2.2 which conflicts with netty-3.5.3 already in Spark. This caused multiple test failures. This reverts commit `0f1b7a06e1`, reversing changes made to `aacca1b8a8`.	2013-05-09 14:20:01 -07:00
Ethan Jewett	a3d5f92210	Switch to using SparkContext method to create RDD	2013-05-07 11:43:06 -05:00
unknown	cbf6a5ee1e	Removed unused code, clarified intent of the program, batch size to 1 second	2013-05-06 08:05:45 -06:00
Ethan Jewett	7cff7e7897	Fix indents and mention other configuration options	2013-05-04 14:56:55 -05:00
Ethan Jewett	9290f16430	Remove unnecessary column family config	2013-05-04 12:39:14 -05:00
Ethan Jewett	02e8cfa617	HBase example	2013-05-04 12:31:30 -05:00
unknown	1d54401d7e	Modified as per TD's suggestions	2013-04-30 23:01:32 -06:00
Mridul Muralidharan	dd515ca3ee	Attempt at fixing merge conflict	2013-04-24 09:24:17 +05:30
unknown	0dc1e2d60f	Examaple of cumulative counting using updateStateByKey	2013-04-22 09:22:45 -06:00

146 commits