spark-instrumented-optimizer/repl
Ergin Seyfe 8a538c97b5 [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset
## What changes were proposed in this pull request?
Like [Dataset.scala](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L156), `KeyValueGroupedDataset` should mark `queryExecution` as transient.

As mentioned in the Jira ticket, without the transient marker we saw serialization errors such as:

```
Caused by: java.io.NotSerializableException: org.apache.spark.sql.execution.QueryExecution
Serialization stack:
        - object not serializable (class: org.apache.spark.sql.execution.QueryExecution, value: ==
```
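
To illustrate why a transient marker avoids this kind of error, here is a minimal sketch that uses plain Java serialization and no Spark dependency. The classes `Plan`, `GroupedWithoutTransient`, and `GroupedWithTransient` are hypothetical stand-ins invented for this example; they are not Spark's classes, and the actual patch may differ in detail.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable member such as QueryExecution.
class Plan(val description: String)

// Keeps the plan as an ordinary field, so Java serialization tries to write it.
class GroupedWithoutTransient(val plan: Plan) extends Serializable

// Marks the field @transient, so Java serialization skips it.
class GroupedWithTransient(@transient val plan: Plan) extends Serializable

object TransientSketch {
  // Returns true if Java serialization of `obj` succeeds.
  def canSerialize(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val plan = new Plan("example")
    println(canSerialize(new GroupedWithoutTransient(plan))) // false
    println(canSerialize(new GroupedWithTransient(plan)))    // true
  }
}
```

The trade-off is that a transient field is `null` after deserialization, so this approach only suits members that are not needed on the receiving side.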

## How was this patch tested?

Ran the query specified in the Jira ticket before and after the change:
```
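// Run in spark-shell, where `spark`, `sc`, and spark.implicits._ are already in scope.
val a = spark.createDataFrame(sc.parallelize(Seq((1, 2), (3, 4)))).as[(Int, Int)]
val grouped = a.groupByKey { x: (Int, Int) => x._1 }
val mappedGroups = grouped.mapGroups((k, x) => (k, 1))
val yyy = sc.broadcast(1)
// The closure reads the broadcast variable and returns 1 for every row.
val last = mappedGroups.rdd.map { xx =>
  val simpley = yyy.value
  1
}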
```

Author: Ergin Seyfe <eseyfe@fb.com>

Closes #15706 from seyfe/keyvaluegrouped_serialization.
2016-11-01 11:18:42 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| scala-2.10/src | [SPARK-15487][WEB UI] Spark Master UI to reverse proxy Application and Workers UI | 2016-09-08 17:20:20 -07:00 |
| scala-2.11/src | [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset | 2016-11-01 11:18:42 -07:00 |
| src | [SPARK-16736][CORE][SQL] purge superfluous fs calls | 2016-08-17 11:43:01 -07:00 |
| pom.xml | [SPARK-16770][BUILD] Fix JLine dependency management and version (Sca… | 2016-08-03 17:07:10 -07:00 |