e653484710
## What changes were proposed in this pull request? The current error message of USING join is quite confusing, for example: ``` scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1") df1: org.apache.spark.sql.DataFrame = [c1: int] scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2") df2: org.apache.spark.sql.DataFrame = [c2: int] scala> df1.join(df2, usingColumn = "c1") org.apache.spark.sql.AnalysisException: using columns ['c1] can not be resolved given input columns: [c1, c2] ;; 'Join UsingJoin(Inner,List('c1)) :- Project [value#1 AS c1#3] : +- LocalRelation [value#1] +- Project [value#7 AS c2#9] +- LocalRelation [value#7] ``` after this PR, it becomes: ``` scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1") df1: org.apache.spark.sql.DataFrame = [c1: int] scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2") df2: org.apache.spark.sql.DataFrame = [c2: int] scala> df1.join(df2, usingColumn = "c1") org.apache.spark.sql.AnalysisException: USING column `c1` can not be resolved with the right join side, the right output is: [c2]; ``` ## How was this patch tested? updated tests Author: Wenchen Fan <wenchen@databricks.com> Closes #16100 from cloud-fan/natural. |
||
---|---|---|
.. | ||
benchmarks | ||
src | ||
pom.xml |