spark-instrumented-optimizer/sql/core
jiake 5c67a9a7fa [SPARK-26316][SPARK-21052] Revert hash join metrics in that causes performance degradation
## What changes were proposed in this pull request?
The wrong implementation in the hash join metrics in [spark 21052](https://issues.apache.org/jira/browse/SPARK-21052) caused significant performance degradation in TPC-DS. And the result is [here](https://docs.google.com/spreadsheets/d/18a5BdOlmm8euTaRodyeWum9yu92mbWWu6JbhGXtr7yE/edit#gid=0) in TPC-DS 1TB scale. So we currently partial revert 21052.
**Cluster info:**

  | Master Node | Worker Nodes
-- | -- | --
Node | 1x | 4x
Processor | Intel(R) Xeon(R) Platinum 8170 CPU  2.10GHz | Intel(R) Xeon(R) Platinum 8180 CPU  2.50GHz
Memory | 192 GB | 384 GB
Storage Main | 8 x 960G SSD | 8 x 960G SSD
Network | 10Gbe |  
Role | CM Management NameNodeSecondary NameNodeResource ManagerHive Metastore Server | DataNodeNodeManager
OS Version | CentOS 7.2 | CentOS 7.2
Hadoop | Apache Hadoop 2.7.5 | Apache Hadoop 2.7.5
Hive | Apache Hive 2.2.0 |  
Spark | Apache Spark 2.1.0  & Apache Spark2.3.0 |  
JDK  version | 1.8.0_112 | 1.8.0_112

**Related parameters setting:**

Component | Parameter | Value
-- | -- | --
Yarn Resource Manager | yarn.scheduler.maximum-allocation-mb | 120GB
  | yarn.scheduler.minimum-allocation-mb | 1GB
  | yarn.scheduler.maximum-allocation-vcores | 121
  | Yarn.resourcemanager.scheduler.class | Fair Scheduler
Yarn Node Manager | yarn.nodemanager.resource.memory-mb | 120GB
  | yarn.nodemanager.resource.cpu-vcores | 121
Spark | spark.executor.memory | 110GB
  | spark.executor.cores | 50

## How was this patch tested?
N/A

Closes #23269 from JkSelf/partial-revert-21052.

Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2018-12-11 21:23:27 +08:00
..
benchmarks [SPARK-25964][SQL][MINOR] Revise OrcReadBenchmark/DataSourceReadBenchmark case names and execution instructions 2018-11-08 10:08:14 -08:00
src [SPARK-26316][SPARK-21052] Revert hash join metrics in that causes performance degradation 2018-12-11 21:23:27 +08:00
pom.xml [SPARK-25956] Make Scala 2.12 as default Scala version in Spark 3.0 2018-11-14 16:22:23 -08:00