spark-instrumented-optimizer/sql/catalyst
Josh Rosen 604aa1b045 [SPARK-27786][SQL] Fix Sha1, Md5, and Base64 codegen when commons-codec is shaded
## What changes were proposed in this pull request?

When running a custom build of Spark which shades `commons-codec`, the `Sha1` expression generates code which fails to compile:

```
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 47, Column 93: A method named "sha1Hex" is not declared in any enclosing class nor any supertype, nor through a static import
```

This is caused by an interaction between Spark's code generator and the shading: the current codegen template includes the string `org.apache.commons.codec.digest.DigestUtils.sha1Hex` as part of a larger string literal, preventing JarJarLinks from being able to replace the class name with the shaded class's name. As a result, the generated code still references the original unshaded class name name, triggering an error in case the original unshaded dependency isn't on the path.

This problem impacts the `Sha1`, `Md5`, and `Base64` expressions.

To fix this problem and allow for proper shading, this PR updates the codegen templates to replace the hardcoded class names with `${classof[<name>].getName}` calls.

## How was this patch tested?

Existing tests.

To ensure that I found all occurrences of this problem, I used IntelliJ's "Find in Path" to search for lines matching the regex `^(?!import|package).*(org|com|net|io)\.(?!apache\.spark)` and then filtered matches to inspect only non-test "Usage in string constants" cases. This isn't _perfect_ but I think it'll catch most cases.

Closes #24655 from JoshRosen/fix-shaded-apache-commons.

Authored-by: Josh Rosen <rosenville@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-05-21 21:18:34 +08:00
..
benchmarks [SPARK-25657][SQL][TEST] Refactor HashBenchmark to use main method 2018-10-07 09:49:37 -07:00
src [SPARK-27786][SQL] Fix Sha1, Md5, and Base64 codegen when commons-codec is shaded 2019-05-21 21:18:34 +08:00
pom.xml [SPARK-27618][SQL][FOLLOW-UP] Unnecessary access to externalCatalog 2019-05-01 20:09:46 -07:00