[MINOR][DOCS] Updates to the Accumulator example in the programming guide. Fixed typos, AccumulatorV2 in Java
## What changes were proposed in this pull request?

This pull request contains updates to Scala and Java Accumulator code snippets in the programming guide.

- For Scala, the pull request fixes the signature of the 'add()' method in the custom Accumulator, which contained two params (as the old AccumulatorParam) instead of one (as in AccumulatorV2).
- The Java example was updated to use the AccumulatorV2 class since AccumulatorParam is marked as deprecated.
- Scala and Java examples are more consistent now.

## How was this patch tested?

This patch was tested manually by building the docs locally.

![image](https://cloud.githubusercontent.com/assets/6235869/20652099/77d98d18-b4f3-11e6-8565-a995fe8cf8e5.png)

Author: aokolnychyi <okolnychyyanton@gmail.com>

Closes #16024 from aokolnychyi/fixed_accumulator_example.
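For anyone who wants to check the one-argument `add()` shape described in the message without a Spark build, here is a minimal self-contained sketch. It is not the code from this patch: `SimpleAccumulator` is a hypothetical stand-in for `org.apache.spark.util.AccumulatorV2` (the real abstract class also requires `isZero` and `copy`, omitted here), and the internals of `MyVector`, including the `size` argument to `createZeroVector`, are assumptions made for illustration. The sketch uses `extends`, since AccumulatorV2 is an abstract class rather than an interface.

```java
import java.util.Arrays;

// Hypothetical stand-in for org.apache.spark.util.AccumulatorV2, reduced to
// reset/add/merge/value so the sketch compiles without Spark on the classpath.
abstract class SimpleAccumulator<IN, OUT> {
    abstract void reset();
    abstract void add(IN v);
    abstract void merge(SimpleAccumulator<IN, OUT> other);
    abstract OUT value();
}

// Hypothetical vector type; the docs leave MyVector unspecified.
final class MyVector {
    final double[] data;

    MyVector(double... data) {
        this.data = data.clone();
    }

    static MyVector createZeroVector(int size) {
        return new MyVector(new double[size]);
    }

    // Sets every component back to zero.
    void reset() {
        Arrays.fill(data, 0.0);
    }

    // Element-wise in-place addition; assumes both vectors have the same length.
    void add(MyVector other) {
        for (int i = 0; i < data.length; i++) {
            data[i] += other.data[i];
        }
    }
}

class VectorAccumulator extends SimpleAccumulator<MyVector, MyVector> {
    private final MyVector myVector;

    VectorAccumulator(int size) {
        this.myVector = MyVector.createZeroVector(size);
    }

    @Override
    void reset() {
        myVector.reset();
    }

    // One parameter, as in AccumulatorV2; the old AccumulatorParam style took two.
    @Override
    void add(MyVector v) {
        myVector.add(v);
    }

    // Folds another accumulator's partial result into this one.
    @Override
    void merge(SimpleAccumulator<MyVector, MyVector> other) {
        myVector.add(other.value());
    }

    @Override
    MyVector value() {
        return myVector;
    }
}

class AccumulatorSketch {
    public static void main(String[] args) {
        VectorAccumulator a = new VectorAccumulator(2);
        a.add(new MyVector(1.0, 2.0));

        VectorAccumulator b = new VectorAccumulator(2);
        b.add(new MyVector(3.0, 4.0));

        a.merge(b);
        System.out.println(Arrays.toString(a.value().data)); // prints [4.0, 6.0]
    }
}
```

With the real class, the accumulator would additionally be registered with the Spark context before use, as the Java snippet in the diff does via `jsc.sc().register(...)`.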
parent f830bb9170
commit f045d9dade
@@ -1378,18 +1378,23 @@ res2: Long = 10
 While this code used the built-in support for accumulators of type Long, programmers can also
 create their own types by subclassing [AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
-The AccumulatorV2 abstract class has several methods which need to override:
-`reset` for resetting the accumulator to zero, and `add` for add anothor value into the accumulator, `merge` for merging another same-type accumulator into this one. Other methods need to override can refer to scala API document. For example, supposing we had a `MyVector` class
+The AccumulatorV2 abstract class has several methods which one has to override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods that must be overridden
+are contained in the [API documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight scala %}
-object VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
-  val vec_ : MyVector = MyVector.createZeroVector
-  def reset(): MyVector = {
-    vec_.reset()
+class VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
+
+  private val myVector: MyVector = MyVector.createZeroVector
+
+  def reset(): Unit = {
+    myVector.reset()
   }
-  def add(v1: MyVector, v2: MyVector): MyVector = {
-    vec_.add(v2)
+
+  def add(v: MyVector): Unit = {
+    myVector.add(v)
   }
   ...
 }

@@ -1424,29 +1429,36 @@ accum.value();
 // returns 10
 {% endhighlight %}
 
-Programmers can also create their own types by subclassing
-[AccumulatorParam](api/java/index.html?org/apache/spark/AccumulatorParam.html).
-The AccumulatorParam interface has two methods: `zero` for providing a "zero value" for your data
-type, and `addInPlace` for adding two values together. For example, supposing we had a `Vector` class
+While this code used the built-in support for accumulators of type Long, programmers can also
+create their own types by subclassing [AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
+The AccumulatorV2 abstract class has several methods which one has to override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods that must be overridden
+are contained in the [API documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight java %}
-class VectorAccumulatorParam implements AccumulatorParam<Vector> {
-  public Vector zero(Vector initialValue) {
-    return Vector.zeros(initialValue.size());
+class VectorAccumulatorV2 implements AccumulatorV2<MyVector, MyVector> {
+
+  private MyVector myVector = MyVector.createZeroVector();
+
+  public void reset() {
+    myVector.reset();
   }
-  public Vector addInPlace(Vector v1, Vector v2) {
-    v1.addInPlace(v2); return v1;
+
+  public void add(MyVector v) {
+    myVector.add(v);
   }
+  ...
 }
 
 // Then, create an Accumulator of this type:
-Accumulator<Vector> vecAccum = sc.accumulator(new Vector(...), new VectorAccumulatorParam());
+VectorAccumulatorV2 myVectorAcc = new VectorAccumulatorV2();
+// Then, register it into spark context:
+jsc.sc().register(myVectorAcc, "MyVectorAcc1");
 {% endhighlight %}
 
-In Java, Spark also supports the more general [Accumulable](api/java/index.html?org/apache/spark/Accumulable.html)
-interface to accumulate data where the resulting type is not the same as the elements added (e.g. build
-a list by collecting together elements).
+Note that, when programmers define their own type of AccumulatorV2, the resulting type can be different than that of the elements added.
 
 </div>

|