ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Thomas Graves	7edbea41b4	SPARK-1189: Add Security to Spark - Akka, Http, ConnectionManager, UI use servlets resubmit pull request. was https://github.com/apache/incubator-spark/pull/332. Author: Thomas Graves <tgraves@apache.org> Closes #33 from tgravescs/security-branch-0.9-with-client-rebase and squashes the following commits: dfe3918 [Thomas Graves] Fix merge conflict since startUserClass now using runAsUser 05eebed [Thomas Graves] Fix dependency lost in upmerge d1040ec [Thomas Graves] Fix up various imports 05ff5e0 [Thomas Graves] Fix up imports after upmerging to master ac046b3 [Thomas Graves] Merge remote-tracking branch 'upstream/master' into security-branch-0.9-with-client-rebase 13733e1 [Thomas Graves] Pass securityManager and SparkConf around where we can. Switch to use sparkConf for reading config wherever possible. Added ConnectionManagerSuite unit tests. 4a57acc [Thomas Graves] Change UI createHandler routines to createServlet since they now return servlets 2f77147 [Thomas Graves] Rework from comments 50dd9f2 [Thomas Graves] fix header in SecurityManager ecbfb65 [Thomas Graves] Fix spacing and formatting b514bec [Thomas Graves] Fix reference to config ed3d1c1 [Thomas Graves] Add security.md 6f7ddf3 [Thomas Graves] Convert SaslClient and SaslServer to scala, change spark.authenticate.ui to spark.ui.acls.enable, and fix up various other things from review comments 2d9e23e [Thomas Graves] Merge remote-tracking branch 'upstream/master' into security-branch-0.9-with-client-rebase_rework 5721c5a [Thomas Graves] update AkkaUtilsSuite test for the actorSelection changes, fix typos based on comments, and remove extra lines I missed in rebase from AkkaUtils f351763 [Thomas Graves] Add Security to Spark - Akka, Http, ConnectionManager, UI to use servlets	2014-03-06 18:27:50 -06:00
Prashant Sharma	181ec50307	[java8API] SPARK-964 Investigate the potential for using JDK 8 lambda expressions for the Java/Scala APIs Author: Prashant Sharma <prashant.s@imaginea.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #17 from ScrapCodes/java8-lambdas and squashes the following commits: 95850e6 [Patrick Wendell] Some doc improvements and build changes to the Java 8 patch. 85a954e [Prashant Sharma] Nit. import orderings. 673f7ac [Prashant Sharma] Added support for -java-home as well 80a13e8 [Prashant Sharma] Used fake class tag syntax 26eb3f6 [Prashant Sharma] Patrick's comments on PR. 35d8d79 [Prashant Sharma] Specified java 8 building in the docs 31d4cd6 [Prashant Sharma] Maven build to support -Pjava8-tests flag. 4ab87d3 [Prashant Sharma] Review feedback on the pr c33dc2c [Prashant Sharma] SPARK-964, Java 8 API Support.	2014-03-03 22:31:30 -08:00
Prashant Sharma	919bd7f669	Merge pull request #567 from ScrapCodes/style2. SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2 Continuation of PR #557 With this all scala style errors are fixed across the code base !! The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too. Author: Prashant Sharma <prashant.s@imaginea.com> Closes #567 and squashes the following commits: 3b1ec30 [Prashant Sharma] scala style fixes	2014-02-09 22:17:52 -08:00
Patrick Wendell	b69f8b2a01	Merge pull request #557 from ScrapCodes/style. Closes #557 . SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Author: Patrick Wendell <pwendell@gmail.com> Author: Prashant Sharma <scrapcodes@gmail.com> == Merge branch commits == commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4 Author: Prashant Sharma <scrapcodes@gmail.com> Date: Sun Feb 9 17:39:07 2014 +0530 scala style fixes commit f91709887a8e0b608c5c2b282db19b8a44d53a43 Author: Patrick Wendell <pwendell@gmail.com> Date: Fri Jan 24 11:22:53 2014 -0800 Adding scalastyle snapshot	2014-02-09 10:09:19 -08:00
Stevo Slavić	f7fd80d9a7	Merge pull request #540 from sslavic/patch-3. Closes #540 . Fix line end character stripping for Windows LogQuery Spark example would produce unwanted result when run on Windows platform because of different, platform specific trailing line end characters (not only \n but \r too). This fix makes use of Scala's standard library string functions to properly strip all trailing line end characters, letting Scala handle the platform specific stuff. Author: Stevo Slavić <sslavic@gmail.com> == Merge branch commits == commit 1e43ba0ea773cc005cf0aef78b6c1755f8e88b27 Author: Stevo Slavić <sslavic@gmail.com> Date: Wed Feb 5 14:48:29 2014 +0100 Fix line end character stripping for Windows LogQuery Spark example would produce unwanted result when run on Windows platform because of different, platform specific trailing line end characters (not only \n but \r too). This fix makes use of Scala's standard library string functions to properly strip all trailing line end characters, letting Scala handle the platform specific stuff.	2014-02-05 10:29:45 -08:00
Henry Saputra	0386f42e38	Merge pull request #529 from hsaputra/cleanup_right_arrowop_scala Change the ⇒ character (maybe from scalariform) to => in Scala code for style consistency Looks like there are some ⇒ Unicode character (maybe from scalariform) in Scala code. This PR is to change it to => to get some consistency on the Scala code. If we want to use ⇒ as default we could use sbt plugin scalariform to make sure all Scala code has ⇒ instead of => And remove unused imports found in TwitterInputDStream.scala while I was there =) Author: Henry Saputra <hsaputra@apache.org> == Merge branch commits == commit 29c1771d346dff901b0b778f764e6b4409900234 Author: Henry Saputra <hsaputra@apache.org> Date: Sat Feb 1 22:05:16 2014 -0800 Change the ⇒ character (maybe from scalariform) to => in Scala code for style consistency.	2014-02-02 21:51:17 -08:00
Patrick Wendell	a1238bb5fc	Merge pull request #492 from skicavs/master fixed job name and usage information for the JavaSparkPi example	2014-01-22 14:32:59 -08:00
Matei Zaharia	d009b17d13	Merge pull request #315 from rezazadeh/sparsesvd Sparse SVD # Singular Value Decomposition Given an m x n matrix A, compute matrices U, S, V such that A = U * S * V^T. There is no restriction on m, but we require n^2 doubles to fit in memory. Further, n should be less than m. The decomposition is computed by first computing A^TA = V S^2 V^T, computing svd locally on that (since n x n is small), from which we recover S and V. Then we compute U via easy matrix multiplication as U = A * V * S^-1. Only singular vectors associated with the largest k singular values are returned. If there are k such values, then the dimensions of the return will be: * S is k x k and diagonal, holding the singular values on diagonal. * U is m x k and satisfies U^TU = eye(k). * V is n x k and satisfies V^TV = eye(k). All input and output is expected in sparse matrix format, 0-indexed as tuples of the form ((i,j),value) all in RDDs. # Testing Tests included. They test: - Decomposition promise (A = USV^T) - For small matrices, output is compared to that of jblas - Rank 1 matrix test included - Full Rank matrix test included - Middle-rank matrix forced via k included # Example Usage import org.apache.spark.SparkContext import org.apache.spark.mllib.linalg.SVD import org.apache.spark.mllib.linalg.SparseMatrix import org.apache.spark.mllib.linalg.MatrixEntry // Load and parse the data file val data = sc.textFile("mllib/data/als/test.data").map { line => val parts = line.split(',') MatrixEntry(parts(0).toInt, parts(1).toInt, parts(2).toDouble) } val m = 4 val n = 4 // recover top 1 singular vector val decomposed = SVD.sparseSVD(SparseMatrix(data, m, n), 1) println("singular values = " + decomposed.S.data.toArray.mkString) # Documentation Added to docs/mllib-guide.md	2014-01-22 14:01:30 -08:00
Kevin Mader	36f9a64ec9	fixed job name and usage information for the JavaSparkPi example	2014-01-22 15:58:23 +01:00
Tathagata Das	2e95174c45	Added StreamingContext.awaitTermination to streaming examples.	2014-01-20 20:25:04 -08:00
Reza Zadeh	caf97a25a2	Merge remote-tracking branch 'upstream/master' into sparsesvd	2014-01-17 14:34:03 -08:00
Reza Zadeh	4e96757793	make example 0-indexed	2014-01-17 14:33:03 -08:00
Reza Zadeh	d28bf41827	changes from PR	2014-01-17 13:39:40 -08:00
Tathagata Das	11e6534d92	Updated java API docs for streaming, along with very minor changes in the code examples.	2014-01-16 14:44:02 -08:00
Patrick Wendell	23034798d7	Add missing header files	2014-01-14 01:17:13 -08:00
Tathagata Das	f8e239e058	Merge remote-tracking branch 'apache/master' into filestream-fix Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala	2014-01-13 23:57:27 -08:00
Reza Zadeh	845e568fad	Merge remote-tracking branch 'upstream/master' into sparsesvd	2014-01-13 23:52:34 -08:00
Tathagata Das	4e497db8f3	Removed StreamingContext.registerInputStream and registerOutputStream - they were useless as InputDStream has been made to register itself. Also made DStream.register() private[streaming] - not useful to expose the confusing function. Updated a lot of documentation.	2014-01-13 23:23:46 -08:00
Reynold Xin	e2d25d2dfe	Merge branch 'master' into graphx	2014-01-13 16:21:26 -08:00
Patrick Wendell	b93f9d42f2	Merge pull request #400 from tdas/dstream-move Moved DStream and PairDStream to org.apache.spark.streaming.dstream Similar to the package location of `org.apache.spark.rdd.RDD`, `DStream` has been moved from `org.apache.spark.streaming.DStream` to `org.apache.spark.streaming.dstream.DStream`. I know that the package name is a little long, but I think it's better to keep it consistent with Spark's structure. Also fixed persistence of windowed DStream. The RDDs generated by windowed DStream are essentially unions of underlying RDDs, and persisting these union RDDs would store numerous copies of the underlying data. Instead, setting the persistence level on the windowed DStream is made to set the persistence level of the underlying DStream.	2014-01-13 12:18:05 -08:00
Ankur Dave	8ca9773974	Add LiveJournalPageRank example	2014-01-13 12:17:58 -08:00
Tathagata Das	777c181d2f	Merge remote-tracking branch 'apache/master' into dstream-move Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala	2014-01-12 21:59:51 -08:00
Patrick Wendell	0ab505a29e	Merge pull request #395 from hsaputra/remove_simpleredundantreturn_scala Remove simple redundant return statements for Scala methods/functions Remove simple redundant return statements for Scala methods/functions: -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized -) Add small changes to making var to val if possible and remove () for simple get This hopefully makes the review simpler =) Pass compile and tests.	2014-01-12 21:31:04 -08:00
Patrick Wendell	f4d77f8cb8	Rename DStream.foreach to DStream.foreachRDD `foreachRDD` makes it clear that the granularity of this operator is per-RDD. As it stands, `foreach` is inconsistent with `map`, `filter`, and the other DStream operators which get pushed down to individual records within each RDD.	2014-01-12 17:21:00 -08:00
Henry Saputra	91a563608e	Merge branch 'master' into remove_simpleredundantreturn_scala	2014-01-12 10:34:13 -08:00
Henry Saputra	93a65e5fde	Remove simple redundant return statement for Scala methods/functions: -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized	2014-01-12 10:30:04 -08:00
Reza Zadeh	f324d53555	Merge remote-tracking branch 'upstream/master' into sparsesvd	2014-01-11 13:27:15 -08:00
Reza Zadeh	1afdeaeb2f	add dimension parameters to example	2014-01-10 21:30:54 -08:00
Tathagata Das	4f39e79c23	Merge remote-tracking branch 'apache/master' into driver-test Conflicts: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala	2014-01-10 15:47:01 -08:00
Tathagata Das	e4bb845238	Updated docs based on Patrick's comments in PR 383.	2014-01-10 12:17:09 -08:00
Reza Zadeh	21c8a54c08	Merge remote-tracking branch 'upstream/master' into sparsesvd Conflicts: docs/mllib-guide.md	2014-01-09 22:45:32 -08:00
Reza Zadeh	cf5bd4ab2e	fix example	2014-01-09 22:39:41 -08:00
Patrick Wendell	997c830e0b	Merge pull request #363 from pwendell/streaming-logs Set default logging to WARN for Spark streaming examples. This programmatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.	2014-01-09 22:22:20 -08:00
Patrick Wendell	7b748b83a1	Minor clean-up	2014-01-09 20:42:48 -08:00
Tathagata Das	f1d206c6b4	Merge branch 'standalone-driver' into driver-test Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala examples/src/main/java/org/apache/spark/streaming/examples/JavaNetworkWordCount.java streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2014-01-09 15:06:24 -08:00
Tathagata Das	6f713e2a3e	Changed the way StreamingContext finds and reads checkpoint files, and added JavaStreamingContext.getOrCreate.	2014-01-09 13:42:04 -08:00
Ankur Dave	3b2e22e2c3	Revert changes to examples/.../PageRankUtils.scala Reverts to 04d83fc37f9eef89c20331c85291a0a169f75e6d:examples/src/main/scala/org/apache/spark/examples/bagel/PageRankUtils.scala.	2014-01-09 13:27:40 -08:00
Patrick Wendell	35f80da21a	Set default logging to WARN for Spark streaming examples. This programmatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.	2014-01-09 10:42:58 -08:00
Ankur Dave	91227566bc	Merge remote-tracking branch 'spark-upstream/master' into HEAD Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala pom.xml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala	2014-01-08 21:19:08 -08:00
Patrick Wendell	bc81ce040d	Merge remote-tracking branch 'apache-github/master' into standalone-driver Conflicts: core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala pom.xml	2014-01-08 00:38:31 -08:00
Patrick Wendell	c0f0155eca	Merge pull request #313 from tdas/project-refactor Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc. At a high level, these are the following changes. 1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules. 2. To use the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._` . For Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`. 3. Each external project has its own scala and java unit tests. Note the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in the SparkBuild.scala . In the streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see comment inside the pom.xml for more information). 4. Jars of the external projects have been added to examples project but not to the assembly project. 5. In some files, imports have been rearranged to conform to the Spark coding guidelines.	2014-01-07 22:21:52 -08:00
Reynold Xin	15d9534501	Merge pull request #318 from srowen/master Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance Sorry if this is too abrupt or not a welcome set of changes, but thought I'd see if I could contribute a little. I'm a Java developer and just getting seriously into Spark. So I thought I'd suggest a number of small changes to the couple Java parts of the code to make it a little tighter, more standard and even a bit faster. Feel free to take all, some or none of this. Happy to explain any of it.	2014-01-07 08:10:02 -08:00
Tathagata Das	aa99f226a6	Removed XYZFunctions and added XYZUtils as a common Scala and Java interface for creating XYZ streams.	2014-01-07 01:56:15 -08:00
Sean Owen	4b92a20232	Issue #318 : minor style updates per review from Reynold Xin	2014-01-07 09:38:45 +00:00
prabeesh	a91f14cfdc	spark -> org.apache.spark	2014-01-07 12:21:20 +05:30
Patrick Wendell	c0498f9265	Merge remote-tracking branch 'apache-github/master' into standalone-driver Conflicts: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala core/src/main/scala/org/apache/spark/deploy/master/Master.scala core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala	2014-01-06 17:29:21 -08:00
Sean Owen	7379b2915f	Merge remote-tracking branch 'upstream/master'	2014-01-06 15:13:16 +00:00
Tathagata Das	3b4c4c7f4d	Merge remote-tracking branch 'apache/master' into project-refactor Conflicts: examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala	2014-01-06 03:05:52 -08:00
Tathagata Das	d0fd3b9ad2	Changed JavaStreamingContextWith* to Function in streaming.api.java.** package. Also fixed packages of Flume and MQTT tests.	2014-01-06 01:47:53 -08:00
Patrick Wendell	79f52809c8	Removing SPARK_EXAMPLES_JAR in the code	2014-01-05 11:49:42 -08:00
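
Note on the sparse SVD commit (d009b17d13): the message states the Gram-matrix identities without the intermediate step. A short derivation, assuming the reduced decomposition with orthonormal columns in U and V and strictly positive retained singular values (the subscript k for the truncated factors is notation introduced here, not taken from the commit), could be written as:

```latex
\begin{align*}
A &= U S V^{T}, \qquad U^{T}U = I, \quad V^{T}V = I, \quad S \text{ diagonal} \\
A^{T}A &= (U S V^{T})^{T}(U S V^{T}) = V S\,(U^{T}U)\,S V^{T} = V S^{2} V^{T} \\
A V S^{-1} &= U S\,(V^{T}V)\,S^{-1} = U \\
\text{top } k:\quad U_k &= A V_k S_k^{-1}, \qquad
U_k \in \mathbb{R}^{m \times k},\; S_k \in \mathbb{R}^{k \times k},\; V_k \in \mathbb{R}^{n \times k}
\end{align*}
```

The second line is why only the n x n matrix A^TA needs to be decomposed locally: its eigenvectors give V and its eigenvalues give the squared singular values. The third line recovers U with a single multiplication against A, which is well defined only for nonzero singular values.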


236 commits