spark-instrumented-optimizer

History

Angers be4a55220a [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first.

## What changes were proposed in this pull request?

ISSUE: https://issues.apache.org/jira/browse/SPARK-28106

When we use `add jar` in SQL, there are three steps:

- add the jar to HiveClient's classloader
- HiveClientImpl.runHiveSQL("ADD JAR" + PATH)
- SessionStateBuilder.addJar

The second step seems to have no impact on the whole process, since even if it fails we can still execute. The first step adds the jar path to HiveClient's ClassLoader, so the jar can then be used in HiveClientImpl. The third step adds the jar path to SparkContext. For a local file path, it calls RpcServer's FileServer to add the jar to the Env, so passing a wrong path causes an error. If you pass an HDFS or VIEWFS path, however, the path is not checked and is simply added to the jar path map. When the next TaskSetManager sends out a Task, this path is carried in the TaskDescription. The Executor then calls updateDependencies, which checks all jar and file paths in the TaskDescription, and the error happens as shown below:

![image](https://user-images.githubusercontent.com/46485123/59817635-4a527f80-9353-11e9-9e08-9407b2b54023.png)

## How was this patch tested?

Existing unit tests.
Environment test.

Closes #24909 from AngersZhuuuu/SPARK-28106.

Lead-authored-by: Angers <angers.zhu@gamil.com>
Co-authored-by: 朱夷 <zhuyi01@corp.netease.com>
Signed-off-by: jerryshao <jerryshao@tencent.com>

2019-07-16 15:29:05 +08:00
..
benchmarks	[SPARK-27070] Improve performance of DefaultPartitionCoalescer	2019-03-17 11:47:14 -05:00
src	[SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first.	2019-07-16 15:29:05 +08:00
pom.xml	[SPARK-28381][PYSPARK] Upgraded version of Pyrolite to 4.30	2019-07-15 12:29:58 +09:00