[SPARK-29978][SQL][TESTS] Check json_tuple does not truncate results
### What changes were proposed in this pull request?

I propose to add a test from commit a936522113 for 2.4. I extended the test with a few more lengths of the requested field to cover more code branches in Jackson Core. In particular, [the optimization](5eb8973f87/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala (L473-L476)) calls Jackson's method 42b8b56684/src/main/java/com/fasterxml/jackson/core/json/UTF8JsonGenerator.java (L742-L746), where the internal buffer size is **8000**. In this way:
- 2000 checks that 2000+2000+2000 < 8000
- 2800 comes from the 2.4 commit; it covers the specific case 42b8b56684/src/main/java/com/fasterxml/jackson/core/json/UTF8JsonGenerator.java (L746)
- 8000-1, 8000 and 8000+1 are sizes around the size of the internal buffer
- 65535 tests an exceptionally large field

### Why are the changes needed?

To be sure that the current implementation and future versions of Spark do not have the bug fixed in 2.4.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

By running `JsonFunctionsSuite`.

Closes #26613 from MaxGekk/json_tuple-test.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
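As a rough illustration (not part of the PR), the sketch below shows how each tested field length relates to the 8000-byte internal buffer of Jackson's `UTF8JsonGenerator`; the object and method names here are hypothetical, and the buffer size is taken from the Jackson source referenced above.

```scala
// Illustrative sketch only: classify each tested field length relative to
// Jackson's assumed 8000-byte internal output buffer.
object BufferBoundarySketch {
  val bufferSize = 8000 // assumed from the referenced UTF8JsonGenerator source

  def relation(len: Int): String =
    if (len < bufferSize) "fits in a single buffer"
    else if (len == bufferSize) "exactly fills the buffer"
    else "forces at least one buffer flush"

  def main(args: Array[String]): Unit = {
    Seq(2000, 2800, 8000 - 1, 8000, 8000 + 1, 65535)
      .foreach(len => println(s"$len: ${relation(len)}"))
  }
}
```

The lengths straddling the boundary (8000-1, 8000, 8000+1) are the interesting cases, since a truncation bug would surface exactly where the generator has to flush its buffer.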
This commit is contained in:
parent 06e203b856 · commit e6b157cf70
```diff
@@ -644,4 +644,15 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSession {
         to_json(struct($"t"), Map("timestampFormat" -> "yyyy-MM-dd HH:mm:ss.SSSSSS")))
       checkAnswer(df, Row(s"""{"t":"$s"}"""))
     }
+
+  test("json_tuple - do not truncate results") {
+    Seq(2000, 2800, 8000 - 1, 8000, 8000 + 1, 65535).foreach { len =>
+      val str = Array.tabulate(len)(_ => "a").mkString
+      val json_tuple_result = Seq(s"""{"test":"$str"}""").toDF("json")
+        .withColumn("result", json_tuple('json, "test"))
+        .select('result)
+        .as[String].head.length
+      assert(json_tuple_result === len)
+    }
+  }
 }
```