ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Marcelo Vanzin	11e21cc17a	[SPARK-28187][BUILD] Add support for hadoop-cloud to the PR builder. Closes #24987 from vanzin/SPARK-28187. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-06-27 15:59:05 -07:00
Hyukjin Kwon	1d36b892ab	[SPARK-7721][INFRA][FOLLOW-UP] Remove cloned coverage repo after posting HTMLs ## What changes were proposed in this pull request? This PR proposes to remove the cloned `pyspark-coverage-site` repo. It doesn't look like a problem in the PR builder, but somehow it's problematic in `spark-master-test-sbt-hadoop-2.7`. ## How was this patch tested? Jenkins. Closes #23729 from HyukjinKwon/followup-coverage. Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org> Co-authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-06-25 09:18:32 +09:00
Sean Owen	67042e90e7	[MINOR][BUILD] Exclude pyspark-coverage-site/ dir from RAT ## What changes were proposed in this pull request? Looks like a directory `pyspark-coverage-site/` is now (?) generated and fails RAT checks. It should just be excluded. See: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/6029/console ## How was this patch tested? N/A Closes #24950 from srowen/pysparkcoveragesite. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-06-24 14:07:41 -05:00
Dongjoon Hyun	ea0e119f84	[SPARK-28111][BUILD] Upgrade `xbean-asm7-shaded` to 4.14 ## What changes were proposed in this pull request? This PR aims to update `xbean-asm7-shaded` to bring [XBEAN-318](https://issues.apache.org/jira/browse/XBEAN-318) which is helpful to log the class definition reading failures. - https://issues.apache.org/jira/projects/XBEAN/versions/12345220 ## How was this patch tested? Pass the Jenkins. Closes #24914 from dongjoon-hyun/SPARK-28111. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-20 07:59:59 -07:00
Sean Owen	15462e1a8f	[SPARK-28004][UI] Update jquery to 3.4.1 ## What changes were proposed in this pull request? We're using an old-ish jQuery, 1.12.4, and should probably update for Spark 3 to keep up in general, but also to keep up with CVEs. In fact, we know of at least one resolved in only 3.4.0+ (https://nvd.nist.gov/vuln/detail/CVE-2019-11358). They may not affect Spark, but, if the update isn't painful, maybe worthwhile in order to make future 3.x updates easier. jQuery 1 -> 2 doesn't sound like a breaking change, as 2.0 is supposed to maintain compatibility with 1.9+ (https://blog.jquery.com/2013/04/18/jquery-2-0-released/) 2 -> 3 has breaking changes: https://jquery.com/upgrade-guide/3.0/. It's hard to evaluate each one, but the most likely area for problems is in ajax(). However, our usage of jQuery (and plugins) is pretty simple. Update jquery to 3.4.1; update jquery blockUI and mustache to latest ## How was this patch tested? Manual testing of docs build (except R docs), worker/master UI, spark application UI. Note: this really doesn't guarantee it works, as our tests can't test javascript, and this is merely anecdotal testing, although I clicked about every link I could find. There's a risk this breaks a minor part of the UI; it does seem to work fine in the main. Closes #24843 from srowen/SPARK-28004. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-14 22:19:20 -07:00
Dongjoon Hyun	fd8240d10c	[SPARK-28051][INFRA] Exposing JIRA issue component types at GitHub PRs ## What changes were proposed in this pull request? This PR aims to expose JIRA issue component types at GitHub PRs. ## How was this patch tested? Manual. ``` $ export GITHUB_OAUTH_KEY=... $ export JIRA_PASSWORD=... $ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark' $ dev/github_jira_sync.py ``` Please note that the existing script will raise the following exceptions if your repo has less than 100 PRs. This will be handled at #24874 . ``` Traceback (most recent call last): File "dev/github_jira_sync.py", line 139, in <module> jira_prs = get_jira_prs() File "dev/github_jira_sync.py", line 83, in get_jira_prs link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0] IndexError: list index out of range ``` That is beyond the scope of this PR. Closes #24871 from dongjoon-hyun/SPARK-28051. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-14 20:36:45 -07:00
Dongjoon Hyun	7533cccc5d	[SPARK-28053][INFRA] Handle a corner case where there is no `Link` header ## What changes were proposed in this pull request? Currently, `github_jira_sync.py` assumes that a `Link` header is always present. However, it fails when the number of open PRs is less than 100 (the default page size). This does not happen in Apache Spark, but we had better fix it because it happens during the review process for the `github_jira_sync.py` script. ``` Traceback (most recent call last): File "dev/github_jira_sync.py", line 139, in <module> jira_prs = get_jira_prs() File "dev/github_jira_sync.py", line 83, in get_jira_prs link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0] IndexError: list index out of range ``` ## How was this patch tested? Manually check with another repo which has a small number of open PRs (< 100). ``` $ export JIRA_PASSWORD=... $ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark' $ dev/github_jira_sync.py ``` Closes #24874 from dongjoon-hyun/SPARK-28053. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-06-14 16:33:34 +09:00
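The corner case in SPARK-28053 above can be illustrated with a minimal sketch. It assumes the response headers arrive as a list of raw header lines, as `page.info().headers` in the traceback suggests. The helper names `find_link_header` and `has_next_page` are hypothetical and are not taken from the actual patch.

```python
def find_link_header(raw_headers):
    """Return the first raw header line starting with "Link", or None.

    Indexing the filtered list with [0] raises IndexError when no such
    header exists; this hypothetical helper returns None instead.
    """
    matches = [h for h in raw_headers if h.startswith("Link")]
    return matches[0] if matches else None


def has_next_page(raw_headers):
    """Report whether the (optional) Link header advertises a next page."""
    link = find_link_header(raw_headers)
    return link is not None and 'rel="next"' in link
```

With a guard like this, a repository whose open PRs fit on a single page is treated as having no further pages rather than raising an exception.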
Dongjoon Hyun	e5d95117e4	[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py` ## What changes were proposed in this pull request? This is a second try of #24824. Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0. BEFORE (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests WARNING: '--force' is deprecated and ignored. ... ======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end WARNING: '--force' is deprecated and ignored. ``` AFTER (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests ... 
======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end ``` ## How was this patch tested? Manually check the Jenkins logs. Closes #24833 from dongjoon-hyun/SPARK-FORCE-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-10 18:40:46 -07:00
Dongjoon Hyun	742f805177	Revert "[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py`" This reverts commit `354ec254c5`.	2019-06-09 08:33:21 -07:00
Martin Junghanns	709387d660	[SPARK-27300][GRAPH] Add Spark Graph modules and dependencies ## What changes were proposed in this pull request? This PR introduces the necessary Maven modules for the new [Spark Graph](https://issues.apache.org/jira/browse/SPARK-25994) feature for Spark 3.0. * `spark-graph` is a parent module that users depend on to get all graph functionalities (Cypher and Graph Algorithms) * `spark-graph-api` defines the [Property Graph API](https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI) that is being shared between Cypher and Algorithms * `spark-cypher` contains a Cypher query engine implementation Both `spark-graph-api` and `spark-cypher` depend on Spark SQL. Note that the Maven module for Graph Algorithms is not part of this PR and will be introduced in https://issues.apache.org/jira/browse/SPARK-27302 A PoC for a running Cypher implementation can be found in this WIP PR https://github.com/apache/spark/pull/24297 ## How was this patch tested? Pass the Jenkins with all profiles and manually build and check the following. ``` $ ls assembly/target/scala-2.12/jars/spark-cypher* assembly/target/scala-2.12/jars/spark-cypher_2.12-3.0.0-SNAPSHOT.jar $ ls assembly/target/scala-2.12/jars/spark-graph* | grep -v graphx assembly/target/scala-2.12/jars/spark-graph-api_2.12-3.0.0-SNAPSHOT.jar assembly/target/scala-2.12/jars/spark-graph_2.12-3.0.0-SNAPSHOT.jar ``` Closes #24490 from s1ck/SPARK-27300. Lead-authored-by: Martin Junghanns <martin.junghanns@neotechnology.com> Co-authored-by: Max Kießling <max@kopfueber.org> Co-authored-by: Martin Junghanns <martin.junghanns@neo4j.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-09 00:26:26 -07:00
Dongjoon Hyun	354ec254c5	[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py` ## What changes were proposed in this pull request? Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0. BEFORE (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests WARNING: '--force' is deprecated and ignored. ... ======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end WARNING: '--force' is deprecated and ignored. ``` AFTER (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests ... 
======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end ``` ## How was this patch tested? Manually check the Jenkins logs. Closes #24824 from dongjoon-hyun/SPARK-27979. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-08 08:17:12 -07:00
Yuming Wang	3f102a8229	[SPARK-27749][SQL] hadoop-3.2 support hive-thriftserver ## What changes were proposed in this pull request? This PR mainly makes the following changes to make `hadoop-3.2` support `sql/hive-thriftserver`: 1. Upgrade [`TCLIService.thrift`](https://github.com/apache/hive/blob/rel/release-2.3.5/service-rpc/if/TCLIService.thrift) and related code to Hive 2.3.5 because of [HIVE-12442](https://issues.apache.org/jira/browse/HIVE-12442) (note that we only migrate code without adding features, such as [HIVE-4924](https://issues.apache.org/jira/browse/HIVE-4924) and [HIVE-15473](https://issues.apache.org/jira/browse/HIVE-15473)). 2. Use slf4j as logging facade because of [HIVE-12237](https://issues.apache.org/jira/browse/HIVE-12237). 3. Port [HIVE-13169](https://issues.apache.org/jira/browse/HIVE-13169) to be compatible with Hive 2.3. ## How was this patch tested? Existing tests. Closes #24628 from wangyum/SPARK-27749. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2019-06-05 08:40:05 -07:00
Izek Greenfield	c647f9011c	[SPARK-27862][BUILD] Move to json4s 3.6.6 ## What changes were proposed in this pull request? Move to json4s version 3.6.6 Add scala-xml 1.2.0 ## How was this patch tested? Pass the Jenkins Closes #24736 from igreenfield/master. Authored-by: Izek Greenfield <igreenfield@axiomsl.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-30 19:42:56 -05:00
Fokko Driesprong	bd87323003	[SPARK-27757][CORE] Bump Jackson to 2.9.9 ## What changes were proposed in this pull request? This fixes CVE-2019-12086 on Databind: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.9 ## How was this patch tested? Existing tests Closes #24646 from Fokko/SPARK-27757. Authored-by: Fokko Driesprong <fokko@apache.org> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-30 09:35:20 -05:00
HyukjinKwon	90b6cda9af	[SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0) ## What changes were proposed in this pull request? R 3.6.0 is released 2019-04-26. This PR targets to change R version from 3.5.1 to 3.6.0 in AppVeyor. This PR sets `R_REMOTES_NO_ERRORS_FROM_WARNINGS` to `true` to avoid the warnings below: ``` Error in strptime(xx, f, tz = tz) : (converted from warning) unable to identify current timezone 'C': please set environment variable 'TZ' Error in i.p(...) : (converted from warning) installation of package 'praise' had non-zero exit status Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p Execution halted ``` ## How was this patch tested? AppVeyor Closes #24716 from HyukjinKwon/SPARK-27848. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-05-28 14:42:03 +09:00
Sean Owen	6c5827c723	[SPARK-27794][R][DOCS] Use https URL for CRAN repo ## What changes were proposed in this pull request? Use https URL for CRAN repo (and for a Scala download in a Dockerfile) ## How was this patch tested? Existing tests. Closes #24664 from srowen/SPARK-27794. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-22 14:28:21 -07:00
Sean Owen	eed6de1a65	[MINOR][DOCS] Tighten up some key links to the project and download pages to use HTTPS ## What changes were proposed in this pull request? Tighten up some key links to the project and download pages to use HTTPS ## How was this patch tested? N/A Closes #24665 from srowen/HTTPSURLs. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-21 10:56:42 -07:00
HyukjinKwon	b7bf4fd123	[SPARK-27402][INFRA][FOLLOW-UP] Exclude 'hive-thriftserver' in modules to test for hadoop3.2 for now ## What changes were proposed in this pull request? This PR excludes 'hive-thriftserver' in modules to test for hadoop3.2 for now as well ## How was this patch tested? Manually tested via `run-tests.py` Closes #24644 from HyukjinKwon/SPARK-27402. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-20 07:53:19 -07:00
Dongjoon Hyun	141a3bfc8d	[SPARK-27755][BUILD] Update zstd-jni to 1.4.0-1 ## What changes were proposed in this pull request? This PR aims to update `zstd-jni` library to `1.4.0-1` which improves the `level 1 compression speed` performance by 6% in most scenarios. The following is the full release note. - https://github.com/facebook/zstd/releases/tag/v1.4.0 ## How was this patch tested? Pass the Jenkins. Closes #24632 from dongjoon-hyun/SPARK-27755. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-17 08:34:45 -07:00
Kazuaki Ishizaki	9e0d8c6ce2	[SPARK-27752][CORE] Upgrade lz4-java from 1.5.1 to 1.6.0 ## What changes were proposed in this pull request? This PR upgrades lz4-java from 1.5.1 to 1.6.0. Lz4-java is available at https://github.com/lz4/lz4-java. Changes from 1.5.1: - Upgraded LZ4 to 1.9.1. Updated the JNI bindings, except for the one for Linux/i386. Decompression speed is improved on amd64. - Deprecated use of LZ4FastDecompressor of a native instance because the corresponding C API function is deprecated. See the release note of LZ4 1.9.0 for details. Updated javadoc accordingly. - Changed the module name from org.lz4.lz4-java to org.lz4.java to avoid using - in the module name. (severn-everett, Oliver Eikemeier, Rei Odaira) - Enabled build with Java 11. Note that the distribution is still built with Java 7. (Rei Odaira) ## How was this patch tested? Existing tests. Closes #24629 from kiszk/SPARK-27752. Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-16 20:45:13 -07:00
Yuming Wang	f3ddd6f9da	[SPARK-27402][SQL][TEST-HADOOP3.2][TEST-MAVEN] Fix hadoop-3.2 test issue(except the hive-thriftserver module) ## What changes were proposed in this pull request? This pr fix hadoop-3.2 test issues(except the `hive-thriftserver` module): 1. Add `hive.metastore.schema.verification` and `datanucleus.schema.autoCreateAll` to HiveConf. 2. hadoop-3.2 support access the Hive metastore from 0.12 to 2.2 After [SPARK-27176](https://issues.apache.org/jira/browse/SPARK-27176) and this PR, we upgraded the built-in Hive to 2.3 when enabling the Hadoop 3.2+ profile. This upgrade fixes the following issues: - [HIVE-6727](https://issues.apache.org/jira/browse/HIVE-6727): Table level stats for external tables are set incorrectly. - [HIVE-15653](https://issues.apache.org/jira/browse/HIVE-15653): Some ALTER TABLE commands drop table stats. - [SPARK-12014](https://issues.apache.org/jira/browse/SPARK-12014): Spark SQL query containing semicolon is broken in Beeline. - [SPARK-25193](https://issues.apache.org/jira/browse/SPARK-25193): insert overwrite doesn't throw exception when drop old data fails. - [SPARK-25919](https://issues.apache.org/jira/browse/SPARK-25919): Date value corrupts when tables are "ParquetHiveSerDe" formatted and target table is Partitioned. - [SPARK-26332](https://issues.apache.org/jira/browse/SPARK-26332): Spark sql write orc table on viewFS throws exception. - [SPARK-26437](https://issues.apache.org/jira/browse/SPARK-26437): Decimal data becomes bigint to query, unable to query. ## How was this patch tested? This pr test Spark’s Hadoop 3.2 profile on jenkins and #24591 test Spark’s Hadoop 2.7 profile on jenkins This PR close #24591 Closes #24391 from wangyum/SPARK-27402. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2019-05-13 10:35:26 -07:00
Dongjoon Hyun	375cfa3d89	[SPARK-27467][BUILD] Upgrade Maven to 3.6.1 ## What changes were proposed in this pull request? This PR aims to upgrade Maven to 3.6.1 to bring JDK9+ related patches like [MNG-6506](https://issues.apache.org/jira/browse/MNG-6506). For the full release note, please see the following. - https://maven.apache.org/docs/3.6.1/release-notes.html This was committed and reverted due to AppVeyor failure. It turns out that the root cause is a `PATH` issue. With the updated AppVeyor script, it passed. https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/builds/24273412 ## How was this patch tested? Pass the Jenkins and AppVeyor. Closes #24481 from dongjoon-hyun/SPARK-R. Lead-authored-by: Dongjoon Hyun <dhyun@apple.com> Co-authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-02 20:01:17 -07:00
Yuming Wang	875e7e1d97	[SPARK-27620][BUILD] Upgrade jetty to 9.4.18.v20190429 ## What changes were proposed in this pull request? This PR upgrades Jetty to [9.4.18.v20190429](https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.18.v20190429) because of [CVE-2019-10247](https://nvd.nist.gov/vuln/detail/CVE-2019-10247). ## How was this patch tested? Existing tests. Closes #24513 from wangyum/SPARK-27620. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-05-03 09:25:54 +09:00
Yuming Wang	3ecafb0e14	[SPARK-27601][BUILD] Upgrade stream-lib to 2.9.6 ## What changes were proposed in this pull request? [stream-lib 2.9.6](https://github.com/addthis/stream-lib/commits/v2.9.6) include several improvements: ![image](https://user-images.githubusercontent.com/5399861/56938062-7eb77580-6b32-11e9-8c36-711ab943d657.png) ## How was this patch tested? N/A Closes #24492 from wangyum/SPARK-27601. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-02 15:21:57 -05:00
Cheng Lian	b73744a147	[SPARK-27611][BUILD] Exclude jakarta.activation:jakarta.activation-api from org.glassfish.jaxb:jaxb-runtime:2.3.2 PR #23890 introduced `org.glassfish.jaxb:jaxb-runtime:2.3.2` as a runtime dependency. As an unexpected side effect, `jakarta.activation:jakarta.activation-api:1.2.1` was also pulled in as a transitive dependency. As a result, for the Maven build, both of the following two jars can be found under `assembly/target/scala-2.12/jars/`: ``` activation-1.1.1.jar jakarta.activation-api-1.2.1.jar ``` This PR excludes the Jakarta one. Manually built Spark using Maven and checked files under `assembly/target/scala-2.12/jars/`. After this change, only `activation-1.1.1.jar` is there. Closes #24507 from liancheng/spark-27611. Authored-by: Cheng Lian <lian@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-01 20:12:17 -07:00
HyukjinKwon	d8db7db50b	Revert "[SPARK-27467][FOLLOW-UP][BUILD] Upgrade Maven to 3.6.1 in AppVeyor and Doc" This reverts commit `bde30bc57c`.	2019-04-28 11:03:15 +09:00
Yuming Wang	bde30bc57c	[SPARK-27467][FOLLOW-UP][BUILD] Upgrade Maven to 3.6.1 in AppVeyor and Doc ## What changes were proposed in this pull request? Update the `docs/building-spark.md`. Otherwise: ``` mvn package -DskipTests=true ... [INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (enforce-versions) spark-parent_2.12 --- [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.6.0 is not in the allowed range 3.6.1. ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M2:enforce (enforce-versions) on project spark-parent_2.12: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1] [ERROR] ... ``` ## How was this patch tested? Just test that `https://archive.apache.org/dist/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.zip` is available. Closes #24477 from wangyum/SPARK-27467. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-27 09:09:47 -07:00
Yuming Wang	fe99305101	[SPARK-27556][BUILD] Exclude com.zaxxer:HikariCP-java7 from hadoop-yarn-server-web-proxy ## What changes were proposed in this pull request? There are two HikariCP packages on the classpath when building with `-Phive -Pyarn -Phadoop-3.2`. The HikariCP dependency tree: ``` [INFO] | +- org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.2.0:compile [INFO] | | \- org.apache.hadoop:hadoop-yarn-server-common:jar:3.2.0:compile [INFO] | | +- org.apache.hadoop:hadoop-yarn-registry:jar:3.2.0:compile [INFO] | | | \- commons-daemon:commons-daemon:jar:1.0.13:compile [INFO] | | +- org.apache.geronimo.specs:geronimo-jcache_1.0_spec:jar:1.0-alpha-1:compile [INFO] | | +- org.ehcache:ehcache:jar:3.3.1:compile [INFO] | | +- com.zaxxer:HikariCP-java7:jar:2.4.12:compile ``` ``` [INFO] +- org.apache.hive:hive-metastore:jar:2.3.4:compile [INFO] | +- javolution:javolution:jar:5.5.1:compile [INFO] | +- com.google.protobuf:protobuf-java:jar:2.5.0:compile [INFO] | +- com.jolbox:bonecp:jar:0.8.0.RELEASE:compile [INFO] | +- com.zaxxer:HikariCP:jar:2.5.1:compile ``` This PR excludes `com.zaxxer:HikariCP-java7` from `hadoop-yarn-server-web-proxy`. ## How was this patch tested? Manual tests. Closes #24450 from wangyum/SPARK-27556. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-04-26 12:15:39 -05:00
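One way to spot the kind of duplication described in SPARK-27556 above is to scan `mvn dependency:tree` output for every artifact under a given group ID. The following is a rough sketch, not part of the commit: `artifacts_in_group` is a hypothetical helper, and the regex assumes `group:artifact:jar:version:scope` coordinates like those in the trees quoted in the commit message.

```python
import re

# Matches coordinates such as "com.zaxxer:HikariCP:jar:2.5.1:compile".
# Assumption: only "jar"-typed coordinates without a classifier are of interest.
COORD = re.compile(r"([\w.\-]+):([\w.\-]+):jar:([\w.\-]+):(\w+)")


def artifacts_in_group(tree_lines, group_id):
    """Return sorted (artifact, version) pairs found under group_id."""
    found = set()
    for line in tree_lines:
        match = COORD.search(line)
        if match and match.group(1) == group_id:
            found.add((match.group(2), match.group(3)))
    return sorted(found)
```

If the returned list for `com.zaxxer` holds more than one entry, more than one artifact from that group is on the classpath, which is the situation the exclusion addresses.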
Yuming Wang	f82ed5e8e0	[MINOR][TEST] Remove out-dated hive version in run-tests.py ## What changes were proposed in this pull request? ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark (w/Hive 1.2.1) using SBT with these arguments: -Phadoop-3.2 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos test:package streaming-kinesis-asl-assembly/assembly ``` `(w/Hive 1.2.1)` is incorrect when testing hadoop-3.2; it should be `(w/Hive 2.3.4)`. This PR removes `(w/Hive 1.2.1)` in run-tests.py. ## How was this patch tested? N/A Closes #24451 from wangyum/run-tests-invalid-info. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-24 21:22:15 -07:00
Yuming Wang	777b4502b2	[SPARK-27176][FOLLOW-UP][SQL] Upgrade Hive parquet to 1.10.1 for hadoop-3.2 ## What changes were proposed in this pull request? When we compile and test Hadoop 3.2, we hit the following two issues: 1. JobSummaryLevel is not a member of object org.apache.parquet.hadoop.ParquetOutputFormat. Fixed by [PARQUET-381](https://issues.apache.org/jira/browse/PARQUET-381)(Parquet 1.9.0) 2. java.lang.NoSuchFieldError: BROTLI at org.apache.parquet.hadoop.metadata.CompressionCodecName.<clinit>(CompressionCodecName.java:31). Fixed by [PARQUET-1143](https://issues.apache.org/jira/browse/PARQUET-1143)(Parquet 1.10.0) The reason is that the `parquet-hadoop-bundle-1.8.1.jar` conflicts with Parquet 1.10.1. I think it would be safe to upgrade Hive's parquet to 1.10.1 to work around this issue. This is what Hive did when upgrading Parquet 1.8.1 to 1.10.0: [HIVE-17000](https://issues.apache.org/jira/browse/HIVE-17000) and [HIVE-19464](https://issues.apache.org/jira/browse/HIVE-19464). We can see that all changes are related to vectors, and vectors are disabled by default: see [HIVE-14826](https://issues.apache.org/jira/browse/HIVE-14826) and [HiveConf.java#L2723](https://github.com/apache/hive/blob/rel/release-2.3.4/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2723). This PR removes [parquet-hadoop-bundle-1.8.1.jar](https://github.com/apache/parquet-mr/tree/master/parquet-hadoop-bundle), so Hive serde will use [parquet-common-1.10.1.jar, parquet-column-1.10.1.jar and parquet-hadoop-1.10.1.jar](https://github.com/apache/spark/blob/master/dev/deps/spark-deps-hadoop-3.2#L185-L189). ## How was this patch tested? 1. manual tests 2. [upgrade Hive Parquet to 1.10.1 and run Hadoop 3.2 test on jenkins](https://github.com/apache/spark/pull/24044#commits-pushed-0c3f962) Closes #24346 from wangyum/SPARK-27176. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2019-04-19 08:59:08 -07:00
shane knapp	e1ece6a319	[SPARK-25079][PYTHON] update python3 executable to 3.6.x ## What changes were proposed in this pull request? have jenkins test against python3.6 (instead of 3.4). ## How was this patch tested? extensive testing on both the centos and ubuntu jenkins workers. NOTE: this will need to be backported to all active branches. Closes #24266 from shaneknapp/updating-python3-executable. Authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-04-19 10:03:50 +09:00
Dongjoon Hyun	f93460dae9	[SPARK-27493][BUILD] Upgrade ASM to 7.1 ## What changes were proposed in this pull request? [SPARK-25946](https://issues.apache.org/jira/browse/SPARK-25946) upgraded ASM to 7.0 to support JDK11. This PR aims to update ASM to 7.1 to bring the bug fixes. - https://asm.ow2.io/versions.html - https://issues.apache.org/jira/browse/XBEAN-316 ## How was this patch tested? Pass the Jenkins. Closes #24395 from dongjoon-hyun/SPARK-27493. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-04-18 13:36:52 +09:00
Dongjoon Hyun	a8f20c95ab	[SPARK-27452][BUILD] Update zstd-jni to 1.3.8-9 ## What changes were proposed in this pull request? This PR aims to update `zstd-jni` from 1.3.2-2 to 1.3.8-9 to be aligned with the latest Zstd 1.3.8 in Apache Spark 3.0.0. Currently, Apache Spark is aligned with the old Zstd used in the first PR and there are many bugfix and improvement updates in `zstd-jni` until now. - https://github.com/facebook/zstd/releases/tag/v1.3.8 - https://github.com/facebook/zstd/releases/tag/v1.3.7 - https://github.com/facebook/zstd/releases/tag/v1.3.6 - https://github.com/facebook/zstd/releases/tag/v1.3.4 - https://github.com/facebook/zstd/releases/tag/v1.3.3 ## How was this patch tested? Pass the Jenkins with the existing tests. Closes #24364 from dongjoon-hyun/SPARK-ZSTD. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-16 08:54:16 -07:00
Sean Owen	8718367e2e	[SPARK-27470][PYSPARK] Update pyrolite to 4.23 ## What changes were proposed in this pull request? Update pyrolite to 4.23 to pick up bug and security fixes. ## How was this patch tested? Existing tests. Closes #24381 from srowen/SPARK-27470. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-04-16 19:41:40 +09:00
Sean Owen	a4cf1a4f4e	[SPARK-27469][CORE] Update Commons BeanUtils to 1.9.3 ## What changes were proposed in this pull request? Unify commons-beanutils deps to latest 1.9.3. This resolves the version inconsistency in Hadoop 2.7's build and also picks up security and bug fixes. ## How was this patch tested? Existing tests. Closes #24378 from srowen/SPARK-27469. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-15 19:18:37 -07:00
Dongjoon Hyun	0881f648cf	[SPARK-27451][BUILD] Upgrade lz4-java to 1.5.1 ## What changes were proposed in this pull request? This PR upgrades `lz4-java` to 1.5.1 in order to get a patch for avoiding racing with GC. - https://github.com/lz4/lz4-java/blob/master/CHANGES.md#151 ## How was this patch tested? Pass the Jenkins with the existing tests. Closes #24363 from dongjoon-hyun/SPARK-LZ4. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-12 19:21:43 -07:00
Yuming Wang	33f3c48cac	[SPARK-27176][SQL] Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4 ## What changes were proposed in this pull request? This PR mainly contains: 1. Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4. 2. Resolve compatibility issues between Hive 1.2.1 and Hive 2.3.4 in the `sql/hive` module. ## How was this patch tested? jenkins test hadoop-2.7 manual test hadoop-3: ```shell build/sbt clean package -Phadoop-3.2 -Phive export SPARK_PREPEND_CLASSES=true # rm -rf metastore_db cat <<EOF > test_hadoop3.scala spark.range(10).write.saveAsTable("test_hadoop3") spark.table("test_hadoop3").show EOF bin/spark-shell --conf spark.hadoop.hive.metastore.schema.verification=false --conf spark.hadoop.datanucleus.schema.autoCreateAll=true -i test_hadoop3.scala ``` Closes #23788 from wangyum/SPARK-23710-hadoop3. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2019-04-08 08:42:21 -07:00
Sean Owen	23bde44797	[SPARK-27358][UI] Update jquery to 1.12.x to pick up security fixes ## What changes were proposed in this pull request? Update jquery -> 1.12.4, datatables -> 1.10.18, mustache -> 2.3.12. Add missing mustache license ## How was this patch tested? I manually tested the UI locally with the javascript console open and didn't observe any problems or JS errors. The only 'risky' change seems to be mustache, but on reading its release notes, don't think the changes from 0.8.1 to 2.x would affect Spark's simple usage. Closes #24288 from srowen/SPARK-27358. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-04-05 12:54:01 -05:00
LantaoJin	69dd44af19	[SPARK-27216][CORE] Upgrade RoaringBitmap to 0.7.45 to fix Kryo unsafe ser/dser issue ## What changes were proposed in this pull request? HighlyCompressedMapStatus uses RoaringBitmap to record the empty blocks. However, RoaringBitmap cannot be serialized or deserialized with the unsafe KryoSerializer. This is a bug in RoaringBitmap 0.5.11 that is fixed in the latest version. This is an update of #24157. ## How was this patch tested? Added a unit test. Closes #24264 from LantaoJin/SPARK-27216. Lead-authored-by: LantaoJin <jinlantao@gmail.com> Co-authored-by: Lantao Jin <jinlantao@gmail.com> Signed-off-by: Imran Rashid <irashid@cloudera.com>	2019-04-03 20:09:50 -05:00
Yuming Wang	13c5c1fb4b	[SPARK-27180][BUILD][YARN] Fix testing issues with yarn module in Hadoop-3 ## What changes were proposed in this pull request? Fix testing issues with `yarn` module in Hadoop-3: 1. Upgrade jersey-1 to `1.19` to fix ```Cause: java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer```. 2. Copy `ServerSocketUtil` from hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java to fix ```java.lang.NoClassDefFoundError: org/apache/hadoop/net/ServerSocketUtil```. 3. Adapt `SessionHandler` from jetty-9.3.25.v20180904/jetty-server/src/main/java/org/eclipse/jetty/server/session/SessionHandler.java to fix ```java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager```. ## How was this patch tested? manual tests: ```shell build/sbt yarn/test -Pyarn build/sbt yarn/test -Phadoop-3.2 -Pyarn build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite -pl resource-managers/yarn test -Pyarn build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite -pl resource-managers/yarn test -Pyarn -Phadoop-3.2 ``` Closes #24115 from wangyum/hadoop3-yarn. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-04-02 15:38:26 -05:00
Yuming Wang	f799e34962	[MINOR][BUILD] Upgrade apache-rat to 0.13 ## What changes were proposed in this pull request? This PR upgrades `apache-rat` to 0.13. Issues fixed by 0.13: https://issues.apache.org/jira/issues/?jql=project%20%3D%20RAT%20AND%20fixVersion%20%3D%200.13 ## How was this patch tested? manual tests Closes #24262 from wangyum/apache-rat. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2019-04-01 16:44:42 +09:00
Sean Owen	754f820035	[SPARK-26918][DOCS] All .md should have ASF license header ## What changes were proposed in this pull request? Add AL2 license to metadata of all .md files. This seemed to be the tidiest way as it will get ignored by .md renderers and other tools. Attempts to write them as markdown comments revealed that there is no such standard thing. ## How was this patch tested? Doc build Closes #24243 from srowen/SPARK-26918. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-03-30 19:49:45 -05:00
Sean Owen	2ec650d843	[SPARK-27267][CORE] Update snappy to avoid error when decompressing empty serialized data ## What changes were proposed in this pull request? (See JIRA for problem statement) Update snappy 1.1.7.1 -> 1.1.7.3 to pick up an empty-stream and Java 9 fix. There appear to be no other changes of consequence: https://github.com/xerial/snappy-java/blob/master/Milestone.md ## How was this patch tested? Existing tests Closes #24242 from srowen/SPARK-27267. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-03-30 02:41:24 -05:00
Hyukjin Kwon	0e16a6f5b0	[SPARK-27277][INFRA] Recover from setting fix version failure in merge script ## What changes were proposed in this pull request? I have hit this case a few times before: ``` Enter comma-separated fix version(s) [3.0.0]: 3.0,0 Restoring head pointer to master git checkout master Already on 'master' git branch Traceback (most recent call last): File "./dev/merge_spark_pr_jira.py", line 537, in <module> main() File "./dev/merge_spark_pr_jira.py", line 523, in main resolve_jira_issues(title, merged_refs, jira_comment) File "./dev/merge_spark_pr_jira.py", line 359, in resolve_jira_issues resolve_jira_issue(merge_branches, comment, jira_id) File "./dev/merge_spark_pr_jira.py", line 302, in resolve_jira_issue jira_fix_versions = map(lambda v: get_version_json(v), fix_versions) File "./dev/merge_spark_pr_jira.py", line 302, in <lambda> jira_fix_versions = map(lambda v: get_version_json(v), fix_versions) File "./dev/merge_spark_pr_jira.py", line 300, in get_version_json return filter(lambda v: v.name == version_str, versions)[0].raw IndexError: list index out of range ``` I mistyped the fix version (there is a comma in `3.0,0`), which ended the loop in the merge script. Not a big deal, but it has bugged me a few times. I hit it again today and decided to fix it. This PR proposes to recover from wrongly set fix versions. ## How was this patch tested? I manually copied and pasted the relevant code and tested it separately in both Python 2 and Python 3. 
Positive cases: ``` Enter comma-separated fix version(s) [3.0.0]: # blank test (to use default) ['3.0.0'] ``` ``` Enter comma-separated fix version(s) [3.0.0,2.4.2]: # multiple default versions ['3.0.0', '2.4.2'] ``` ``` Enter comma-separated fix version(s) [3.0.0]: 2.4.1 # valid version ['2.4.1'] ``` ``` Enter comma-separated fix version(s) [3.0.0]: 3.0.0,2.4.2 # multiple valid versions ['3.0.0', '2.4.2'] ``` Keyboard interrupt(Ctrl + c): ``` Enter comma-separated fix version(s) [3.0.0]: ^CTraceback (most recent call last): # keyboard interrupt File "test_merge_script.py", line 45, in <module> test() File "test_merge_script.py", line 26, in test fix_versions = input("Enter comma-separated fix version(s) [%s]: " % default_fix_versions) KeyboardInterrupt ``` Wrongly typed versions (recovered): ``` Enter comma-separated fix version(s) [3.0.0]: 3.1 Specified version(s) [3.1] not found in the available versions, try again (or leave blank and fix manually). Enter comma-separated fix version(s) [3.0.0]: 123 Specified version(s) [123] not found in the available versions, try again (or leave blank and fix manually). Enter comma-separated fix version(s) [3.0.0]: 3.0,0 Specified version(s) [3.0, 0] not found in the available versions, try again (or leave blank and fix manually). Enter comma-separated fix version(s) [3.0.0]: damn Specified version(s) [damn] not found in the available versions, try again (or leave blank and fix manually). Enter comma-separated fix version(s) [3.0.0]: 3.0.0,2.5.2 # one invalid versions in multiple versions Specified version(s) [3.0.0, 2.5.2] not found in the available versions, try again (or leave blank and fix manually). 
``` Arbitrary exceptions in fix version parsing (recovered) ``` Enter comma-separated fix version(s) [3.0.0]: Traceback (most recent call last): File "tmp.py", line 11, in <module> raise Exception("arbitrary exception") Exception: arbitrary exception Error setting fix version(s), try again (or leave blank and fix manually) Enter comma-separated fix version(s) [3.0.0]: Traceback (most recent call last): File "tmp.py", line 10, in <module> raise Exception("arbitrary exception") Exception: arbitrary exception Error setting fix version(s), try again (or leave blank and fix manually) Enter comma-separated fix version(s) [3.0.0]: ``` Closes #24213 from HyukjinKwon/merge_script_fix_version. Authored-by: Hyukjin Kwon <gurwls223@apache.org> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2019-03-26 21:14:07 +09:00
Sean Owen	8bc304f97e	[SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 ## What changes were proposed in this pull request? Remove Scala 2.11 support in build files and docs, and in various parts of code that accommodated 2.11. See some targeted comments below. ## How was this patch tested? Existing tests. Closes #23098 from srowen/SPARK-26132. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-03-25 10:46:42 -05:00
Yuming Wang	9c0af746e5	[SPARK-27175][BUILD] Upgrade hadoop-3 to 3.2.0 ## What changes were proposed in this pull request? This PR upgrades `hadoop-3` to `3.2.0` to work around [HADOOP-16086](https://issues.apache.org/jira/browse/HADOOP-16086). Otherwise some test cases will throw IllegalArgumentException: ```java 02:44:34.707 ERROR org.apache.hadoop.hive.ql.exec.Task: Job Submission failed with exception 'java.io.IOException(Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.)' java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses. at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:116) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:109) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:102) at org.apache.hadoop.mapred.JobClient.init(JobClient.java:475) at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:454) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:369) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$runHive$1(HiveClientImpl.scala:730) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:283) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:221) at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:220) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:266) at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:719) at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:709) at org.apache.spark.sql.hive.StatisticsSuite.createNonPartitionedTable(StatisticsSuite.scala:719) at org.apache.spark.sql.hive.StatisticsSuite.$anonfun$testAlterTableProperties$2(StatisticsSuite.scala:822) ``` ## How was this patch tested? manual tests Closes #24106 from wangyum/SPARK-27175. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-03-16 19:42:05 -05:00
Dongjoon Hyun	f26a1f3d37	[SPARK-27165][SPARK-27107][BUILD][SQL] Upgrade Apache ORC to 1.5.5 ## What changes were proposed in this pull request? This PR aims to update Apache ORC dependency to fix [SPARK-27107](https://issues.apache.org/jira/browse/SPARK-27107) . ``` [ORC-452] Support converting MAP column from JSON to ORC Improvement [ORC-447] Change the docker scripts to keep a persistent m2 cache [ORC-463] Add `version` command [ORC-475] ORC reader should lazily get filesystem [ORC-476] Make SearchAgument kryo buffer size configurable ``` ## How was this patch tested? Pass the Jenkins with the existing tests. Closes #24096 from dongjoon-hyun/SPARK-27165. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-03-14 20:14:31 -07:00
Yuming Wang	f0b6245ea4	[SPARK-27158][BUILD] dev/mima and dev/scalastyle support dynamic profiles ## What changes were proposed in this pull request? `dev/mima` and `dev/scalastyle` now read their profiles dynamically from `modules.py`. ## How was this patch tested? manual tests Closes #24089 from wangyum/SPARK-27158. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2019-03-15 08:20:42 +09:00
Jiaxin Shan	2d0b7cfe44	[SPARK-26742][K8S] Update Kubernetes-Client version to 4.1.2 ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/23814 was reverted because of a Jenkins integration test failure. After the minikube upgrade, Kubernetes client SDK 4.1.2 works with Kubernetes v1.13, so we can bring this change back. Reference: [Bump Kubernetes Client Version to 4.1.2](https://issues.apache.org/jira/browse/SPARK-26742) [Original PR against master](https://github.com/apache/spark/pull/23814) [Kubernetes client upgrade for Spark 2.4](https://github.com/apache/spark/pull/23993) ## How was this patch tested? Unit Tests: ``` All tests passed. [INFO] ------------------------------------------------------------------------ [INFO] Reactor Summary for Spark Project Parent POM 3.0.0-SNAPSHOT: [INFO] [INFO] Spark Project Parent POM ........................... SUCCESS [ 2.343 s] [INFO] Spark Project Tags ................................. SUCCESS [ 2.039 s] [INFO] Spark Project Sketch ............................... SUCCESS [ 12.714 s] [INFO] Spark Project Local DB ............................. SUCCESS [ 2.185 s] [INFO] Spark Project Networking ........................... SUCCESS [ 38.154 s] [INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 7.989 s] [INFO] Spark Project Unsafe ............................... SUCCESS [ 2.297 s] [INFO] Spark Project Launcher ............................. SUCCESS [ 2.813 s] [INFO] Spark Project Core ................................. SUCCESS [38:03 min] [INFO] Spark Project ML Local Library ..................... SUCCESS [ 3.848 s] [INFO] Spark Project GraphX ............................... SUCCESS [ 56.084 s] [INFO] Spark Project Streaming ............................ 
SUCCESS [04:58 min] [INFO] Spark Project Catalyst ............................. SUCCESS [06:39 min] [INFO] Spark Project SQL .................................. SUCCESS [37:12 min] [INFO] Spark Project ML Library ........................... SUCCESS [18:59 min] [INFO] Spark Project Tools ................................ SUCCESS [ 0.767 s] [INFO] Spark Project Hive ................................. SUCCESS [33:45 min] [INFO] Spark Project REPL ................................. SUCCESS [01:14 min] [INFO] Spark Project Assembly ............................. SUCCESS [ 1.444 s] [INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [01:12 min] [INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 6.719 s] [INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [07:00 min] [INFO] Spark Project Examples ............................. SUCCESS [ 21.805 s] [INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 0.906 s] [INFO] Spark Avro ......................................... SUCCESS [ 50.486 s] [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 02:32 h [INFO] Finished at: 2019-03-07T08:39:34Z [INFO] ------------------------------------------------------------------------ ``` Closes #24002 from Jeffwan/update_k8s_sdk_master. Authored-by: Jiaxin Shan <seedjeffwan@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-13 15:04:27 -07:00
DB Tsai	2b9ad2516e	[MINOR][BUILD] Add Scala 2.12 profile back for branch-2.4 build Closes #24074 from dbtsai/scala-2.12. Authored-by: DB Tsai <d_tsai@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-03-12 20:08:52 -07:00


702 commits