ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Xiangrui Meng	4cafc63524	[SPARK-7584] [MLLIB] User guide for VectorAssembler This PR adds a section in the user guide for `VectorAssembler` with code examples in Python/Java/Scala. It also adds a unit test in Java. jkbradley Author: Xiangrui Meng <meng@databricks.com> Closes #6556 from mengxr/SPARK-7584 and squashes the following commits: 11313f6 [Xiangrui Meng] simplify Java example 0cd47f3 [Xiangrui Meng] update user guide fd36292 [Xiangrui Meng] update Java unit test ce61ca0 [Xiangrui Meng] add Java unit test for VectorAssembler e399942 [Xiangrui Meng] scala/python example code (cherry picked from commit `90c606925e`) Signed-off-by: Xiangrui Meng <meng@databricks.com>	2015-06-01 15:05:21 -07:00
Davies Liu	d023300f4e	[SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests Increase the duration and timeout in streaming python tests. Author: Davies Liu <davies@databricks.com> Closes #6239 from davies/flaky_tests and squashes the following commits: d6aee8f [Davies Liu] fix window tests 26317f7 [Davies Liu] Merge branch 'master' of github.com:apache/spark into flaky_tests 7947db6 [Davies Liu] fix streaming flaky tests (cherry picked from commit `b7ab0299b0`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-06-01 14:40:40 -07:00
Nishkam Ravi	2f41cf3e29	[DOC] Minor modification to Streaming docs with regards to parallel data receiving pwendell tdas Author: Nishkam Ravi <nravi@cloudera.com> Author: nishkamravi2 <nishkamravi@gmail.com> Author: nravi <nravi@c1704.halxg.cloudera.com> Closes #6544 from nishkamravi2/master_nravi and squashes the following commits: 46e8c03 [Nishkam Ravi] Slight modification to streaming docs (cherry picked from commit `e7c7e51f2e`) Signed-off-by: Sean Owen <sowen@cloudera.com>	2015-06-01 21:37:40 +01:00
Davies Liu	78a6723e87	[SPARK-7978] [SQL] [PYSPARK] DecimalType should not be singleton Author: Davies Liu <davies@databricks.com> Closes #6532 from davies/decimal and squashes the following commits: c7fcbce [Davies Liu] Update tests.py 1425359 [Davies Liu] DecimalType should not be singleton (cherry picked from commit `91777a1c3a`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 19:56:03 -07:00
Josh Rosen	df0bf71ee0	[HOTFIX] Remove trailing whitespace to fix Scalastyle checks `866652c903` enabled this check.	2015-05-31 16:34:20 -07:00
Sun Rui	f1d4e7e311	[SPARK-7227] [SPARKR] Support fillna / dropna in R DataFrame. Author: Sun Rui <rui.sun@intel.com> Closes #6183 from sun-rui/SPARK-7227 and squashes the following commits: dd6f5b3 [Sun Rui] Rename readEnv() back to readMap(). Add alias na.omit() for dropna(). 41cf725 [Sun Rui] [SPARK-7227][SPARKR] Support fillna / dropna in R DataFrame. (cherry picked from commit `46576ab303`) Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>	2015-05-31 15:02:16 -07:00
Reynold Xin	bab0fab68f	[SPARK-3850] Turn style checker on for trailing whitespaces. Author: Reynold Xin <rxin@databricks.com> Closes #6541 from rxin/trailing-whitespace-on and squashes the following commits: f72ebe4 [Reynold Xin] [SPARK-3850] Turn style checker on for trailing whitespaces. (cherry picked from commit `866652c903`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 14:23:48 -07:00
Yuhao Yang	4d5ce46772	[SPARK-7949] [MLLIB] [DOC] update document with some missing save/load add save load for examples: KMeansModel PowerIterationClusteringModel Word2VecModel IsotonicRegressionModel Author: Yuhao Yang <hhbyyh@gmail.com> Closes #6498 from hhbyyh/docSaveLoad and squashes the following commits: 7f9f06d [Yuhao Yang] add missing imports c604cad [Yuhao Yang] Merge remote-tracking branch 'upstream/master' into docSaveLoad 1dd77cc [Yuhao Yang] update document with some missing save/load (cherry picked from commit `0674700303`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-31 11:52:04 -07:00
Reynold Xin	70cf9c3495	[SPARK-3850] Trim trailing spaces for MLlib. Author: Reynold Xin <rxin@databricks.com> Closes #6534 from rxin/whitespace-mllib and squashes the following commits: 38926e3 [Reynold Xin] [SPARK-3850] Trim trailing spaces for MLlib. (cherry picked from commit `e1067d0ad1`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 11:35:46 -07:00
zsxwing	8a72bc9170	[MINOR] Add license for dagre-d3 and graphlib-dot Add license for dagre-d3 and graphlib-dot Author: zsxwing <zsxwing@gmail.com> Closes #6539 from zsxwing/LICENSE and squashes the following commits: 82b0475 [zsxwing] Add license for dagre-d3 and graphlib-dot (cherry picked from commit `d1d2def2f5`) Signed-off-by: Andrew Or <andrew@databricks.com>	2015-05-31 11:18:20 -07:00
Reynold Xin	01f38f75d9	[SPARK-7979] Enforce structural type checker. Author: Reynold Xin <rxin@databricks.com> Closes #6536 from rxin/structural-type-checker and squashes the following commits: f833151 [Reynold Xin] Fixed compilation. 633f9a1 [Reynold Xin] Fixed typo. d1fa804 [Reynold Xin] [SPARK-7979] Enforce structural type checker. (cherry picked from commit `4b5f12bac9`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 01:40:57 -07:00
Reynold Xin	a1904fa79e	[SPARK-3850] Trim trailing spaces for SQL. Author: Reynold Xin <rxin@databricks.com> Closes #6535 from rxin/whitespace-sql and squashes the following commits: de50316 [Reynold Xin] [SPARK-3850] Trim trailing spaces for SQL. (cherry picked from commit `63a50be13d`) Signed-off-by: Reynold Xin <rxin@databricks.com> Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala	2015-05-31 00:52:02 -07:00
Reynold Xin	f63eab950b	[SPARK-3850] Trim trailing spaces for examples/streaming/yarn. Author: Reynold Xin <rxin@databricks.com> Closes #6530 from rxin/trim-whitespace-1 and squashes the following commits: 7b7b3a0 [Reynold Xin] Reset again. dc14597 [Reynold Xin] Reset scalastyle. cd556c4 [Reynold Xin] YARN, Kinesis, Flume. 4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming. (cherry picked from commit `564bc11e98`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 00:48:29 -07:00
Reynold Xin	a7c217166b	[SPARK-3850] Trim trailing spaces for core. Author: Reynold Xin <rxin@databricks.com> Closes #6533 from rxin/whitespace-2 and squashes the following commits: 038314c [Reynold Xin] [SPARK-3850] Trim trailing spaces for core. (cherry picked from commit `74fdc97c72`) Signed-off-by: Reynold Xin <rxin@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala core/src/test/scala/org/apache/spark/serializer/KryoSerializerSuite.scala	2015-05-31 00:17:47 -07:00
Reynold Xin	2016927f70	[SPARK-7975] Add style checker to disallow overriding equals covariantly. Author: Reynold Xin <rxin@databricks.com> This patch had conflicts when merged, resolved by Committer: Reynold Xin <rxin@databricks.com> Closes #6527 from rxin/covariant-equals and squashes the following commits: e7d7784 [Reynold Xin] [SPARK-7975] Enforce CovariantEqualsChecker (cherry picked from commit `7896e99b2a`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 00:06:02 -07:00
Cheng Lian	0d093d6e78	[SQL] [MINOR] Adds @deprecated Scaladoc entry for SchemaRDD Author: Cheng Lian <lian@databricks.com> Closes #6529 from liancheng/schemardd-deprecation-fix and squashes the following commits: 49765c2 [Cheng Lian] Adds @deprecated Scaladoc entry for SchemaRDD (cherry picked from commit `8764dccebd`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 23:49:47 -07:00
Reynold Xin	adfc9d1fa0	[SPARK-7976] Add style checker to disallow overriding finalize. Author: Reynold Xin <rxin@databricks.com> Closes #6528 from rxin/style-finalizer and squashes the following commits: a2211ca [Reynold Xin] [SPARK-7976] Enable NoFinalizeChecker. (cherry picked from commit `084fef76e9`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 23:36:37 -07:00
Reynold Xin	5e268d3956	Update documentation for the new DataFrame reader/writer interface. Author: Reynold Xin <rxin@databricks.com> Closes #6522 from rxin/sql-doc-1.4 and squashes the following commits: c227be7 [Reynold Xin] Updated link. 040b6d7 [Reynold Xin] Update documentation for the new DataFrame reader/writer interface. (cherry picked from commit `00a7137900`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 20:10:08 -07:00
Reynold Xin	e74ea78276	[SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods Scala deprecated annotation actually doesn't show up in JavaDoc. Author: Reynold Xin <rxin@databricks.com> Closes #6523 from rxin/df-deprecated-javadoc and squashes the following commits: 26da2b2 [Reynold Xin] [SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods. (cherry picked from commit `c63e1a742b`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 19:51:58 -07:00
Reynold Xin	dc58e688ab	[SQL] Tighten up visibility for JavaDoc. I went through all the JavaDocs and tightened up visibility. Author: Reynold Xin <rxin@databricks.com> Closes #6526 from rxin/sql-1.4-visibility-for-docs and squashes the following commits: bc37d1e [Reynold Xin] Tighten up visibility for JavaDoc. (cherry picked from commit `14b314dc2c`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 19:51:17 -07:00
Xiangrui Meng	a60b8bf329	[SPARK-5610] [DOC] update genjavadocSettings to use the patched version of genjavadoc This PR updates `genjavadocSettings` to use a patched version of `genjavadoc-plugin` that hides package private classes/methods/interfaces in the generated Java API doc. The patch can be found at: https://github.com/typesafehub/genjavadoc/compare/master...mengxr:spark-1.4. It wasn't merged into the main repo because there exist corner cases where a package private Scala class has to be a Java public class in order to compile. This doesn't seem to apply to the Spark codebase. So we release a patched version under `org.spark-project` and use it in the Spark build. brkyvz is publishing the artifacts to Maven Central. More people need to audit the generated APIs and make sure we don't have false negatives. Currently listed classes under `org.apache.spark.rdd`: ![screen shot 2015-05-29 at 12 48 52 pm](https://cloud.githubusercontent.com/assets/829644/7891396/28fb9daa-0601-11e5-8ed8-4e9522d25a71.png) After this PR: ![screen shot 2015-05-29 at 12 48 23 pm](https://cloud.githubusercontent.com/assets/829644/7891408/408e210e-0601-11e5-975c-ff0a02eb5c91.png) cc: pwendell rxin srowen Author: Xiangrui Meng <meng@databricks.com> Closes #6506 from mengxr/SPARK-5610 and squashes the following commits: 489c785 [Xiangrui Meng] update genjavadocSettings to use the patched version of genjavadoc (cherry picked from commit `2b258e1c07`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 17:22:31 -07:00
Mike Dusenberry	df56309b04	[SPARK-7920] [MLLIB] Make MLlib ChiSqSelector Serializable (& Fix Related Documentation Example). The MLlib ChiSqSelector class is not serializable, and so the example in the ChiSqSelector documentation fails. Also, that example is missing the import of ChiSqSelector. This PR makes ChiSqSelector extend Serializable in MLlib, and adds the ChiSqSelector import statement to the associated example in the documentation. Author: Mike Dusenberry <dusenberrymw@gmail.com> Closes #6462 from dusenberrymw/Make_ChiSqSelector_Serializable_and_Fix_Related_Docs_Example and squashes the following commits: 9cb2f94 [Mike Dusenberry] Make MLlib ChiSqSelector Serializable. d9003bf [Mike Dusenberry] Add missing import in MLlib ChiSqSelector Docs Scala example. (cherry picked from commit `1281a35188`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-30 16:51:08 -07:00
Yanbo Liang	2790bb0354	[SPARK-7918] [MLLIB] MLlib Python doc parity check for evaluation and feature Check the MLlib Python evaluation and feature docs and make them as complete as the Scala docs. Author: Yanbo Liang <ybliang8@gmail.com> Closes #6461 from yanboliang/spark-7918 and squashes the following commits: 940e3f1 [Yanbo Liang] truncate too long line and remove extra sparse a80ae58 [Yanbo Liang] MLlib Python doc parity check for evaluation and feature (cherry picked from commit `1617363fbb`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-30 16:24:26 -07:00
Reynold Xin	6d7cf5382d	Updated SQL programming guide's Hive connectivity section. (cherry picked from commit `7716a5a1ec`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 14:58:11 -07:00
Cheng Lian	b2b7601471	[SPARK-7849] [SQL] [Docs] Updates SQL programming guide for 1.4 Author: Cheng Lian <lian@databricks.com> Closes #6520 from liancheng/spark-7849 and squashes the following commits: 705264b [Cheng Lian] Updates SQL programming guide for 1.4 (cherry picked from commit `6e3f0c7810`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-30 12:16:16 -07:00
Taka Shinagawa	e7ba3ea86b	[DOCS] [MINOR] Update for the Hadoop versions table with hadoop-2.6 Updated the doc for the hadoop-2.6 profile, which is new to Spark 1.4 Author: Taka Shinagawa <taka.epsilon@gmail.com> Closes #6450 from mrt/docfix2 and squashes the following commits: db1c43b [Taka Shinagawa] Updated the hadoop versions for hadoop-2.6 profile 323710e [Taka Shinagawa] The hadoop-2.6 profile is added to the Hadoop versions table (cherry picked from commit `3ab71eb9d5`) Signed-off-by: Sean Owen <sowen@cloudera.com>	2015-05-30 08:26:06 -04:00
Sean Owen	d6e9eade64	[SPARK-7890] [DOCS] Document that Spark 2.11 now supports Kafka Remove caveat about Kafka / JDBC not being supported for Scala 2.11 Author: Sean Owen <sowen@cloudera.com> Closes #6470 from srowen/SPARK-7890 and squashes the following commits: 4652634 [Sean Owen] One more rewording 7b7f3c8 [Sean Owen] Restore note about JDBC component 126744d [Sean Owen] Remove caveat about Kafka / JDBC not being supported for Scala 2.11 (cherry picked from commit `8c8de3ed86`) Signed-off-by: Sean Owen <sowen@cloudera.com>	2015-05-30 07:59:43 -04:00
Octavian Geagla	2c45009dad	[SPARK-7459] [MLLIB] ElementwiseProduct Java example Author: Octavian Geagla <ogeagla@gmail.com> Closes #6008 from ogeagla/elementwise-prod-doc and squashes the following commits: 72e6dc0 [Octavian Geagla] [SPARK-7459] [MLLIB] Java example import. cf2afbd [Octavian Geagla] [SPARK-7459] [MLLIB] Update description of example. b66431b [Octavian Geagla] [SPARK-7459] [MLLIB] Add override annotation to java example, make scala example use same data as java. 6b26b03 [Octavian Geagla] [SPARK-7459] [MLLIB] Fix line which is too long. 79af020 [Octavian Geagla] [SPARK-7459] [MLLIB] Actually don't use Java 8. 9d5b31a [Octavian Geagla] [SPARK-7459] [MLLIB] Don't use Java 8 4f0c92f [Octavian Geagla] [SPARK-7459] [MLLIB] ElementwiseProduct Java example. (cherry picked from commit `e3a4374833`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-30 00:00:49 -07:00
Timothy Chen	8938a74893	[SPARK-7962] [MESOS] Fix master url parsing in rest submission client. Only parse standalone master url when master url starts with spark:// Author: Timothy Chen <tnachen@gmail.com> Closes #6517 from tnachen/fix_mesos_client and squashes the following commits: 61a1198 [Timothy Chen] Fix master url parsing in rest submission client. (cherry picked from commit `78657d53d7`) Signed-off-by: Andrew Or <andrew@databricks.com>	2015-05-29 23:56:27 -07:00
Octavian Geagla	11a4b30d1e	[SPARK-7576] [MLLIB] Add spark.ml user guide doc/example for ElementwiseProduct Author: Octavian Geagla <ogeagla@gmail.com> Closes #6501 from ogeagla/ml-guide-elemwiseprod and squashes the following commits: 4ad93d5 [Octavian Geagla] [SPARK-7576] [MLLIB] Incorporate code review feedback. f7be7ad [Octavian Geagla] [SPARK-7576] [MLLIB] Add spark.ml user guide doc/example for ElementwiseProduct. (cherry picked from commit `da2112aef2`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-29 23:55:29 -07:00
Burak Yavuz	1513cffa35	[SPARK-7957] Preserve partitioning when using randomSplit cc JoshRosen Thanks for noticing this! Author: Burak Yavuz <brkyvz@gmail.com> Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits: 497465d [Burak Yavuz] addressed code review 293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit (cherry picked from commit `7ed06c3992`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-29 22:19:23 -07:00
Taka Shinagawa	400e6dbce2	[DOCS][Tiny] Added a missing dash(-) in docs/configuration.md The first line had only two dashes (--) instead of three(---). Because of this missing dash(-), 'jekyll build' command was not converting configuration.md to _site/configuration.html Author: Taka Shinagawa <taka.epsilon@gmail.com> Closes #6513 from mrt/docfix3 and squashes the following commits: c470e2c [Taka Shinagawa] Added a missing dash(-) preventing jekyll from converting configuration.md to html format (cherry picked from commit `3792d25836`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-29 20:35:26 -07:00
Ram Sriharsha	9a88be1833	[SPARK-6013] [ML] Add more Python ML examples for spark.ml Author: Ram Sriharsha <rsriharsha@hw11853.local> Closes #6443 from harsha2010/SPARK-6013 and squashes the following commits: 732506e [Ram Sriharsha] Code Review Feedback 121c211 [Ram Sriharsha] python style fix 5f9b8c3 [Ram Sriharsha] python style fixes 925ca86 [Ram Sriharsha] Simple Params Example 8b372b1 [Ram Sriharsha] GBT Example 965ec14 [Ram Sriharsha] Random Forest Example (cherry picked from commit `dbf8ff38de`) Signed-off-by: Joseph K. Bradley <joseph@databricks.com>	2015-05-29 15:22:38 -07:00
Shivaram Venkataraman	2bd4460548	[SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init cc davies Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6507 from shivaram/sparkr-init and squashes the following commits: 6fdd169 [Shivaram Venkataraman] Create SparkContext in sparkRSQL init (cherry picked from commit `5fb97dca9b`) Signed-off-by: Davies Liu <davies@databricks.com>	2015-05-29 15:08:50 -07:00
Shivaram Venkataraman	cf4122e4d4	[SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide This PR adds a new SparkR programming guide at the top-level. This will be useful for R users as our APIs don't directly match the Scala/Python APIs and as we need to explain SparkR without using RDDs as examples etc. cc rxin davies pwendell cc cafreeman -- Would be great if you could also take a look at this ! Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #6490 from shivaram/sparkr-guide and squashes the following commits: d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries 408dce5 [Shivaram Venkataraman] Fix link dbb86e3 [Shivaram Venkataraman] Fix minor typo 9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in example d09703c [Shivaram Venkataraman] Fix default argument in read.df ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update write.df, read.df to handle defaults better (cherry picked from commit `5f48e5c33b`) Signed-off-by: Davies Liu <davies@databricks.com>	2015-05-29 14:12:18 -07:00
Reynold Xin	f40605f064	[SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker. … Author: Reynold Xin <rxin@databricks.com> Closes #6491 from rxin/more-whitespace and squashes the following commits: f6e63dc [Reynold Xin] [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker. (cherry picked from commit `94f62a4979`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-29 13:39:02 -07:00
Patrick Wendell	e549874c33	Preparing development version 1.4.0-SNAPSHOT	2015-05-29 13:07:07 -07:00
Patrick Wendell	dd109a8746	Preparing Spark release v1.4.0-rc3	2015-05-29 13:06:59 -07:00
Patrick Wendell	18811ca20b	Revert "[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior" This reverts commit `645e611644`.	2015-05-29 13:03:52 -07:00
Patrick Wendell	c68abaa34e	Preparing development version 1.4.0-SNAPSHOT	2015-05-29 12:15:18 -07:00
Patrick Wendell	fb60503ff2	Preparing Spark release v1.4.0-rc3	2015-05-29 12:15:13 -07:00
MechCoder	4be701aa50	[SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans Author: MechCoder <manojkumarsivaraj334@gmail.com> Closes #6497 from MechCoder/spark-7946 and squashes the following commits: 2fdd0a3 [MechCoder] Add non-regression test 8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans (cherry picked from commit `6181937f31`) Signed-off-by: Xiangrui Meng <meng@databricks.com>	2015-05-29 11:36:48 -07:00
Cheng Lian	645e611644	[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior The `HiveThriftServer2Test` relies on proper logging behavior to assert whether the Thrift server daemon process is started successfully. However, some other jar files listed in the classpath may potentially contain an unexpected Log4J configuration file which overrides the logging behavior. This PR writes a temporary `log4j.properties` and prepends it to the driver classpath before starting the testing Thrift server process to ensure proper logging behavior. cc andrewor14 yhuai Author: Cheng Lian <lian@databricks.com> Closes #6493 from liancheng/override-log4j and squashes the following commits: c489e0e [Cheng Lian] Fixes minor Scala styling issue b46ef0d [Cheng Lian] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior (cherry picked from commit `4782e13040`) Signed-off-by: Andrew Or <andrew@databricks.com>	2015-05-29 11:11:47 -07:00
Reynold Xin	62df047a36	HOTFIX: Scala style checker for DataTypeSuite.scala.	2015-05-29 11:06:33 -07:00
Cheng Lian	caea7a618d	[SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext() When starting `HiveThriftServer2` via `startWithContext`, the property `spark.sql.hive.version` isn't set. This causes the Simba ODBC driver 1.0.8.1006 to behave differently and fail simple queries. The Hive2 JDBC driver works fine in this case. Also, when starting the server with `start-thriftserver.sh`, both the Hive2 JDBC driver and the Simba ODBC driver work fine. Please refer to [SPARK-7950] [1] for details. [1]: https://issues.apache.org/jira/browse/SPARK-7950 Author: Cheng Lian <lian@databricks.com> Closes #6500 from liancheng/odbc-bugfix and squashes the following commits: 051e3a3 [Cheng Lian] Fixes import order 3a97376 [Cheng Lian] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext() (cherry picked from commit `e7b6177557`) Signed-off-by: Yin Huai <yhuai@databricks.com>	2015-05-29 10:43:44 -07:00
Reynold Xin	23bd05fff7	HOTFIX: Scala style checker failure due to a missing space in TachyonBlockManager.scala.	2015-05-29 09:37:46 -07:00
Tim Ellison	459c3d22e0	[SPARK-7756] [CORE] Use testing cipher suites common to Oracle and IBM security providers Add alias names for supported cipher suites to the sample SSL configuration. The IBM JSSE provider reports its cipher suite with an SSL_ prefix, but accepts TLS_ prefixed suite names as an alias. However, Jetty filters the requested ciphers based on the provider's reported supported suites, so the TLS_ versions are never passed through to JSSE causing an SSL handshake failure. Author: Tim Ellison <t.p.ellison@gmail.com> Closes #6282 from tellison/SSLFailure and squashes the following commits: 8de8a3e [Tim Ellison] Update SecurityManagerSuite with new expected suite names 96158b2 [Tim Ellison] Update the sample configs to use ciphers that are common to both the Oracle and IBM security providers. 705421b [Tim Ellison] Merge branch 'master' of github.com:tellison/spark into SSLFailure 68b9425 [Tim Ellison] Merge branch 'master' of https://github.com/apache/spark into SSLFailure b0c35f6 [Tim Ellison] [CORE] Add aliases used for cipher suites in IBM provider (cherry picked from commit `bf46580708`) Signed-off-by: Sean Owen <sowen@cloudera.com>	2015-05-29 05:15:00 -04:00
Xiangrui Meng	509a7cafcc	[SPARK-7912] [SPARK-7921] [MLLIB] Update OneHotEncoder to handle ML attributes and change includeFirst to dropLast This PR contains two major changes to `OneHotEncoder`: 1. more robust handling of ML attributes. If the input attribute is unknown, we look at the values to get the max category index 2. change `includeFirst` to `dropLast` and leave the default as `true`. There are a couple of benefits: a. consistent with other tutorials of one-hot encoding (or dummy coding) (e.g., http://www.ats.ucla.edu/stat/mult_pkg/faq/general/dummy.htm) b. keep the indices unmodified in the output vector. If we drop the first, all indices will be shifted by 1. c. If users use `StringIndexer`, the last element is the least frequent one. Sorry for including two changes in one PR! I'll update the user guide in another PR. jkbradley sryza Author: Xiangrui Meng <meng@databricks.com> Closes #6466 from mengxr/SPARK-7912 and squashes the following commits: a280dca [Xiangrui Meng] fix tests d8f234d [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-7912 171b276 [Xiangrui Meng] mention the difference between our impl vs sklearn's 00dfd96 [Xiangrui Meng] update OneHotEncoder in Python 208ddad [Xiangrui Meng] update OneHotEncoder to handle ML attributes and change includeFirst to dropLast (cherry picked from commit `23452be944`) Signed-off-by: Xiangrui Meng <meng@databricks.com>	2015-05-29 00:51:24 -07:00
Patrick Wendell	6bf5a42084	Preparing development version 1.4.0-SNAPSHOT	2015-05-28 23:40:27 -07:00
Patrick Wendell	f2796816be	Preparing Spark release v1.4.0-rc3	2015-05-28 23:40:22 -07:00


11372 commits