## What changes were proposed in this pull request?
General rule for deciding whether to skip tests. Skip if the tests:
- are RDD tests
- could run long or are complicated (streaming, hivecontext)
- cover error conditions
- are unlikely to change or break
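A minimal sketch of how such a rule could look in a test file, assuming the skips are expressed with testthat's `skip_on_cran()`. The test names and bodies are illustrative placeholders, not tests from this patch:

```r
library(testthat)

# Hypothetical long-running test. skip_on_cran() skips it unless the
# NOT_CRAN environment variable is set to "true".
test_that("long-running streaming scenario (illustrative)", {
  skip_on_cran()
  result <- sum(seq_len(1000))  # stand-in for an expensive Spark job
  expect_equal(result, 500500)
})

# Cheap core test: no skip call, so it always runs.
test_that("basic sanity check (illustrative)", {
  expect_equal(1 + 1, 2)
})
```

With this pattern, the expensive test still runs in development environments that set `NOT_CRAN=true`, while a CRAN-style check skips it.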
## How was this patch tested?
unit tests, `R CMD check --as-cran`, `R CMD check`
Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #17817 from felixcheung/rskiptest.
## What changes were proposed in this pull request?
Port Tweedie GLM #16344 to SparkR
felixcheung yanboliang
## How was this patch tested?
new test in SparkR
Author: actuaryzhang <actuaryzhang10@gmail.com>
Closes #16729 from actuaryzhang/sparkRTweedie.
## What changes were proposed in this pull request?
The `coefficients` component in the model summary should be a 'matrix', but its underlying structure is actually a list. This affects several models; 'AFTSurvivalRegressionModel' is the exception and already has the correct implementation. The fix is to `unlist` the coefficients returned from `callJMethod` before converting them to a matrix. An example illustrates the issue:
```
data(iris)
df <- createDataFrame(iris)
model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian")
s <- summary(model)
> str(s$coefficients)
List of 8
$ : num 6.53
$ : num -0.223
$ : num 0.479
$ : num 0.155
$ : num 13.6
$ : num -1.44
$ : num 0
$ : num 0.152
- attr(*, "dim")= int [1:2] 2 4
- attr(*, "dimnames")=List of 2
..$ : chr [1:2] "(Intercept)" "Sepal_Width"
..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
> s$coefficients[, 2]
$`(Intercept)`
[1] 0.4788963
$Sepal_Width
[1] 0.1550809
```
This shows that the underlying structure of coefficients is still `list`.
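The same behavior can be reproduced in plain R without Spark. This is a minimal sketch, assuming the values returned from `callJMethod` arrive as an R list. The variables `raw`, `broken`, and `fixed` are illustrative names, not identifiers from the patch:

```r
# Hypothetical stand-in for the list of values returned by callJMethod.
raw <- list(6.53, -0.223, 0.479, 0.155, 13.6, -1.44, 0, 0.152)
dims <- list(c("(Intercept)", "Sepal_Width"),
             c("Estimate", "Std. Error", "t value", "Pr(>|t|)"))

# matrix() on a list keeps the list type and only adds dim attributes.
broken <- matrix(raw, nrow = 2, ncol = 4, dimnames = dims)
is.list(broken)     # TRUE
is.numeric(broken)  # FALSE

# Unlisting first yields a genuine numeric matrix.
fixed <- matrix(unlist(raw), nrow = 2, ncol = 4, dimnames = dims)
is.numeric(fixed)   # TRUE
fixed[, 2]          # named numeric vector of standard errors
```

Column indexing on `fixed` returns a named numeric vector rather than a list, which is what code consuming a coefficient matrix usually expects.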
felixcheung wangmiao1981
Author: actuaryzhang <actuaryzhang10@gmail.com>
Closes #16730 from actuaryzhang/sparkRCoef.
## What changes were proposed in this pull request?
R's list of `family` options is longer than what Spark supports.
## How was this patch tested?
manual
Author: Felix Cheung <felixcheung_m@hotmail.com>
Closes #16511 from felixcheung/rdocglmfamily.
## What changes were proposed in this pull request?
SparkR's `mllib.R` keeps growing as we add more ML wrappers. I'd like to split it into multiple files to make it easier to maintain:
* mllib_classification.R
* mllib_clustering.R
* mllib_recommendation.R
* mllib_regression.R
* mllib_stat.R
* mllib_tree.R
* mllib_utils.R
Note: This is a reorganization only, with no actual code change.
## How was this patch tested?
Existing tests.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #16312 from yanboliang/spark-18862.