spark-instrumented-optimizer/dev/deps
Kent Yao 4d47ac4b4b [SPARK-33705][SQL][TEST] Fix HiveThriftHttpServerSuite flakiness
### What changes were proposed in this pull request?
TO FIX flaky tests:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132345/testReport/
```
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.JDBC query execution
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.Checks Hive version
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.SPARK-24829 Checks cast as float
```

The root cause here is a jar conflict issue.
`NewCookie.isHttpOnly` is not defined in the `jsr311-api.jar` which conflicts
The transitive artifact `jsr311-api.jar` of `hadoop-client` is excluded at the maven side. See https://issues.apache.org/jira/browse/SPARK-27179.

The Jenkins PR builder and Github Action use `SBT` as the compiler tool.

First, the exclusion rule from maven is not followed by sbt, so I was able to see `jsr311-api.jar` from maven cache to be added to the classpath directly. **This seems to be a  bug of `sbt-pom-reader` plugin but I'm not that sure.**

Then I added an `ExcludeRule` for the `hive-thriftserver` module at the SBT side and did see the `jsr311-api.jar` gone, but the CI jobs still failed with the same error.

I added a trace log in ThriftHttpServlet

```s
ERROR ThriftHttpServlet: !!!!!!!!! Suspect???????? --->
file:/home/jenkins/workspace/SparkPullRequestBuilder/assembly/target/scala-2.12/jars/jsr311-api-1.1.1.jar
```
And the log pointed out that the assembly phase copied it to `assembly/target/scala-2.12/jars/` which will be added to the classpath too. With the help of SBT `dependencyTree` tool, I saw the `jsr311-api` again as a transitive of `jersery-core` from `yarn` module with a `test` scope. So **This seems to be another bug from the SBT side of the `sbt-assembly` plugin.**  It copied a test scope transitive artifact to the assembly output.

In this PR, I defined some rules in SparkBuild.scala to bypass the potential bugs from the SBT side.

First, exclude the `jsr311` from all over the project and then add it back separately to the YARN module for SBT.

Additionally, the HiveThriftServerSuites was reflected for reducing flakiness too, but not related to the bugs I have found so far.

### Why are the changes needed?

fix test here

### Does this PR introduce _any_ user-facing change?

NO
### How was this patch tested?

passing jenkins and ga

Closes #30643 from yaooqinn/HiveThriftHttpServerSuite.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2020-12-14 05:14:38 +00:00
..
spark-deps-hadoop-2.7-hive-2.3 [SPARK-33705][SQL][TEST] Fix HiveThriftHttpServerSuite flakiness 2020-12-14 05:14:38 +00:00
spark-deps-hadoop-3.2-hive-2.3 [SPARK-33705][SQL][TEST] Fix HiveThriftHttpServerSuite flakiness 2020-12-14 05:14:38 +00:00