[SPARK-12199][DOC] Follow-up: Refine example code in ml-features.md
https://issues.apache.org/jira/browse/SPARK-12199

Follow-up PR of SPARK-11551. Fix some errors in ml-features.md

mengxr

Author: Xusen Yin <yinxusen@gmail.com>

Closes #10193 from yinxusen/SPARK-12199.
parent 03138b67d3
commit 98b212d36b
@@ -63,7 +63,7 @@ the [IDF Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.IDF) for mor
 `Word2VecModel`. The model maps each word to a unique fixed-size vector. The `Word2VecModel`
 transforms each document into a vector using the average of all words in the document; this vector
 can then be used for as features for prediction, document similarity calculations, etc.
-Please refer to the [MLlib user guide on Word2Vec](mllib-feature-extraction.html#Word2Vec) for more
+Please refer to the [MLlib user guide on Word2Vec](mllib-feature-extraction.html#word2Vec) for more
 details.
 
 In the following code segment, we start with a set of documents, each of which is represented as a sequence of words. For each document, we transform it into a feature vector. This feature vector could then be passed to a learning algorithm.
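For readers of this hunk, the averaging step described in the context lines can be sketched in plain Python. This is only an illustration, not Spark's implementation: the toy embeddings, the helper name `document_vector`, and the handling of out-of-vocabulary words (skipped here, with an all-zero vector when nothing is known) are assumptions of the sketch.

```python
# Hypothetical toy embeddings; a trained Word2VecModel would learn these from data.
word_vectors = {
    "spark": [1.0, 0.0, 2.0],
    "is": [0.0, 1.0, 0.0],
    "fast": [2.0, 2.0, 1.0],
}


def document_vector(words, vectors, size=3):
    """Average the vectors of the known words in a document."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        # Assumption of this sketch: an all-zero vector when no word is known.
        return [0.0] * size
    # zip(*known) groups the i-th component of every word vector together.
    return [sum(component) / len(known) for component in zip(*known)]


doc = ["spark", "is", "fast"]
print(document_vector(doc, word_vectors))
```

The resulting fixed-size vector is what would be handed to a downstream learning algorithm as the document's features.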
@@ -411,7 +411,7 @@ for more details on the API.
 Refer to the [DCT Java docs](api/java/org/apache/spark/ml/feature/DCT.html)
 for more details on the API.
 
-{% include_example java/org/apache/spark/examples/ml/JavaDCTExample.java %}}
+{% include_example java/org/apache/spark/examples/ml/JavaDCTExample.java %}
 </div>
 </div>
 
@@ -669,7 +669,7 @@ for more details on the API.
 The following example demonstrates how to load a dataset in libsvm format and then normalize each row to have unit $L^2$ norm and unit $L^\infty$ norm.
 
 <div class="codetabs">
-<div data-lang="scala">
+<div data-lang="scala" markdown="1">
 
 Refer to the [Normalizer Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Normalizer)
 for more details on the API.
@@ -677,7 +677,7 @@ for more details on the API.
 {% include_example scala/org/apache/spark/examples/ml/NormalizerExample.scala %}
 </div>
 
-<div data-lang="java">
+<div data-lang="java" markdown="1">
 
 Refer to the [Normalizer Java docs](api/java/org/apache/spark/ml/feature/Normalizer.html)
 for more details on the API.
@@ -685,7 +685,7 @@ for more details on the API.
 {% include_example java/org/apache/spark/examples/ml/JavaNormalizerExample.java %}
 </div>
 
-<div data-lang="python">
+<div data-lang="python" markdown="1">
 
 Refer to the [Normalizer Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.Normalizer)
 for more details on the API.
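The Normalizer hunks above only touch the tab markup, so as a reminder of what the documented example computes, here is a minimal plain-Python sketch of scaling a row to unit $L^2$ and unit $L^\infty$ norm. The helper name `normalize`, the sample row, and the choice to leave an all-zero row unchanged are assumptions of the sketch, not taken from the Spark sources.

```python
def normalize(row, p):
    """Scale a row so that its L^p norm equals 1 (p may be float('inf'))."""
    if p == float("inf"):
        # L^infinity norm: the largest absolute component.
        norm = max(abs(x) for x in row)
    else:
        norm = sum(abs(x) ** p for x in row) ** (1.0 / p)
    if norm == 0.0:
        # Assumption of this sketch: an all-zero row is returned unchanged.
        return list(row)
    return [x / norm for x in row]


row = [3.0, -4.0]  # illustrative values
l2_row = normalize(row, 2)
linf_row = normalize(row, float("inf"))
print(l2_row, linf_row)
```

After normalization the $L^2$ version has Euclidean length 1, while the $L^\infty$ version has its largest absolute component equal to 1.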
@@ -709,7 +709,7 @@ Note that if the standard deviation of a feature is zero, it will return default
 The following example demonstrates how to load a dataset in libsvm format and then normalize each feature to have unit standard deviation.
 
 <div class="codetabs">
-<div data-lang="scala">
+<div data-lang="scala" markdown="1">
 
 Refer to the [StandardScaler Scala docs](api/scala/index.html#org.apache.spark.ml.feature.StandardScaler)
 for more details on the API.
@@ -717,7 +717,7 @@ for more details on the API.
 {% include_example scala/org/apache/spark/examples/ml/StandardScalerExample.scala %}
 </div>
 
-<div data-lang="java">
+<div data-lang="java" markdown="1">
 
 Refer to the [StandardScaler Java docs](api/java/org/apache/spark/ml/feature/StandardScaler.html)
 for more details on the API.
@@ -725,7 +725,7 @@ for more details on the API.
 {% include_example java/org/apache/spark/examples/ml/JavaStandardScalerExample.java %}
 </div>
 
-<div data-lang="python">
+<div data-lang="python" markdown="1">
 
 Refer to the [StandardScaler Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.StandardScaler)
 for more details on the API.
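To make "unit standard deviation" in the StandardScaler context lines concrete, the following plain-Python sketch divides each feature column by its standard deviation. Several things here are assumptions of the sketch rather than statements about Spark: the helper name `scale_to_unit_std`, the use of the sample standard deviation from `statistics.stdev`, the sample rows, and the value emitted for a zero-variance feature (the note in the hunk header is truncated, so the sketch simply picks `0.0`).

```python
import statistics


def scale_to_unit_std(rows):
    """Divide every feature (column) by its standard deviation."""
    columns = list(zip(*rows))
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stds = [statistics.stdev(col) for col in columns]
    scaled = []
    for row in rows:
        # Assumption of this sketch: a zero-variance feature maps to 0.0.
        scaled.append([x / s if s != 0.0 else 0.0 for x, s in zip(row, stds)])
    return scaled


rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]  # illustrative values
scaled_rows = scale_to_unit_std(rows)
print(scaled_rows)
```

The first column already has standard deviation 1 and is left as is; the second column is constant, which exercises the zero-variance branch.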
@@ -788,7 +788,7 @@ More details can be found in the API docs for [Bucketizer](api/scala/index.html#
 The following example demonstrates how to bucketize a column of `Double`s into another index-wised column.
 
 <div class="codetabs">
-<div data-lang="scala">
+<div data-lang="scala" markdown="1">
 
 Refer to the [Bucketizer Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Bucketizer)
 for more details on the API.
@@ -796,7 +796,7 @@ for more details on the API.
 {% include_example scala/org/apache/spark/examples/ml/BucketizerExample.scala %}
 </div>
 
-<div data-lang="java">
+<div data-lang="java" markdown="1">
 
 Refer to the [Bucketizer Java docs](api/java/org/apache/spark/ml/feature/Bucketizer.html)
 for more details on the API.
@@ -804,7 +804,7 @@ for more details on the API.
 {% include_example java/org/apache/spark/examples/ml/JavaBucketizerExample.java %}
 </div>
 
-<div data-lang="python">
+<div data-lang="python" markdown="1">
 
 Refer to the [Bucketizer Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.Bucketizer)
 for more details on the API.
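The Bucketizer context line describes turning a column of `Double`s into a column of bucket indices. A minimal plain-Python sketch of that lookup follows; it is not Spark's code. The helper name `bucketize`, the split points, the sample values, and the treatment of the top boundary (the last bucket is taken to include its upper bound) are assumptions of the sketch.

```python
from bisect import bisect_right


def bucketize(value, splits):
    """Return the index i of the bucket [splits[i], splits[i + 1]) holding value."""
    if value < splits[0] or value > splits[-1]:
        raise ValueError(f"{value} is outside the range covered by the splits")
    if value == splits[-1]:
        # Assumption of this sketch: the last bucket includes its upper bound.
        return len(splits) - 2
    # bisect_right counts the splits that are <= value; the bucket is one less.
    return bisect_right(splits, value) - 1


# Illustrative split points and data, chosen for this sketch.
splits = [float("-inf"), -0.5, 0.0, 0.5, float("inf")]
data = [-0.5, -0.3, 0.0, 0.2]
bucket_indices = [bucketize(x, splits) for x in data]
print(bucket_indices)
```

Using infinite outer splits means every finite input falls into some bucket, so the `ValueError` branch only triggers when the outer splits are finite.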
@@ -59,7 +59,7 @@ public class JavaBinarizerExample {
 DataFrame binarizedDataFrame = binarizer.transform(continuousDataFrame);
 DataFrame binarizedFeatures = binarizedDataFrame.select("binarized_feature");
 for (Row r : binarizedFeatures.collect()) {
-Double binarized_value = r.getDouble(0);
+Double binarized_value = r.getDouble(0);
 System.out.println(binarized_value);
 }
 // $example off$
@@ -30,9 +30,9 @@ if __name__ == "__main__":
 
 # $example on$
 df = sqlContext\
-.createDataFrame([(Vectors.dense([-2.0, 2.3]), ),
-(Vectors.dense([0.0, 0.0]), ),
-(Vectors.dense([0.6, -1.1]), )],
+.createDataFrame([(Vectors.dense([-2.0, 2.3]),),
+(Vectors.dense([0.0, 0.0]),),
+(Vectors.dense([0.6, -1.1]),)],
 ["features"])
 px = PolynomialExpansion(degree=2, inputCol="features", outputCol="polyFeatures")
 polyDF = px.transform(df)
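The last hunk is a whitespace-only cleanup of the Python example, which applies `PolynomialExpansion(degree=2, ...)` to vectors such as `[-2.0, 2.3]`. As a rough illustration of what a degree-2 expansion contains, the plain-Python sketch below enumerates every monomial of total degree 1 through `degree`. The helper name `polynomial_expansion` is hypothetical, and the ordering of the output terms is an assumption of the sketch; it is not claimed to match the ordering Spark produces.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_expansion(features, degree):
    """Return all monomials of the features with total degree 1..degree."""
    expanded = []
    for d in range(1, degree + 1):
        # Each multiset of d features corresponds to one monomial of degree d.
        for combo in combinations_with_replacement(features, d):
            expanded.append(prod(combo))
    return expanded


expanded = polynomial_expansion([-2.0, 2.3], 2)
print(expanded)
```

For two input features and degree 2 this yields five terms: x, y, x², xy, and y².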