142df4834b
## What changes were proposed in this pull request? Currently, Spark `describe` supports `StringType`. However, `describe()` returns a dataset for only all numeric columns. This PR aims to include `StringType` columns in `describe()`, `describe` without argument. **Background** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe("age", "name").show() +-------+------------------+-------+ |summary| age| name| +-------+------------------+-------+ | count| 2| 3| | mean| 24.5| null| | stddev|7.7781745930520225| null| | min| 19| Andy| | max| 30|Michael| +-------+------------------+-------+ ``` **Before** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe().show() +-------+------------------+ |summary| age| +-------+------------------+ | count| 2| | mean| 24.5| | stddev|7.7781745930520225| | min| 19| | max| 30| +-------+------------------+ ``` **After** ```scala scala> spark.read.json("examples/src/main/resources/people.json").describe().show() +-------+------------------+-------+ |summary| age| name| +-------+------------------+-------+ | count| 2| 3| | mean| 24.5| null| | stddev|7.7781745930520225| null| | min| 19| Andy| | max| 30|Michael| +-------+------------------+-------+ ``` ## How was this patch tested? Pass the Jenkins with a update testcase. Author: Dongjoon Hyun <dongjoon@apache.org> Closes #14095 from dongjoon-hyun/SPARK-16429. |
||
---|---|---|
.. | ||
jarTest.R | ||
packageInAJarTest.R | ||
test_binary_function.R | ||
test_binaryFile.R | ||
test_broadcast.R | ||
test_client.R | ||
test_context.R | ||
test_includeJAR.R | ||
test_includePackage.R | ||
test_mllib.R | ||
test_parallelize_collect.R | ||
test_rdd.R | ||
test_Serde.R | ||
test_shuffle.R | ||
test_sparkSQL.R | ||
test_take.R | ||
test_textFile.R | ||
test_utils.R | ||
test_Windows.R |