spark-instrumented-optimizer

Marcelo Vanzin 6d16b9885d [SPARK-24552][CORE][SQL] Use task ID instead of attempt number for writes.

This passes the unique task attempt id instead of attempt number to v2 data sources because attempt number is reused when stages are retried. When attempt numbers are reused, sources that track data by partition id and attempt number may incorrectly clean up data because the same attempt number can be both committed and aborted.

For v1 / Hadoop writes, generate a unique ID based on available attempt numbers to avoid a similar problem.

Closes #21558

Author: Marcelo Vanzin <vanzin@cloudera.com>
Author: Ryan Blue <blue@apache.org>

Closes #21606 from vanzin/SPARK-24552.2.

2018-06-25 16:54:57 -07:00
src	[SPARK-24552][CORE][SQL] Use task ID instead of attempt number for writes.	2018-06-25 16:54:57 -07:00
pom.xml	[PYSPARK] Update py4j to version 0.10.7.	2018-05-09 10:47:35 -07:00
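
The commit message above describes a collision that is easier to see with concrete values. The following is a minimal sketch in plain Python, not Spark code: the `TaskContext` dataclass, its field names, and the sample numbers are hypothetical stand-ins, and the bit-packing in `combined_attempt_id` is only one assumed way to "generate a unique ID based on available attempt numbers" (it assumes both numbers stay below 2**16).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContext:
    """Hypothetical stand-in for the identifying fields of one task attempt."""
    partition_id: int
    stage_attempt_number: int  # goes up each time the stage is retried
    attempt_number: int        # restarts from 0 inside each stage attempt
    task_attempt_id: int       # unique across the whole application


def combined_attempt_id(ctx: TaskContext) -> int:
    """Pack both attempt numbers into one integer.

    Assumed scheme for illustration; requires each number to be < 2**16.
    """
    return (ctx.stage_attempt_number << 16) | ctx.attempt_number


# Same partition, written once in the original stage and once in a retried stage.
first = TaskContext(partition_id=3, stage_attempt_number=0,
                    attempt_number=0, task_attempt_id=17)
retry = TaskContext(partition_id=3, stage_attempt_number=1,
                    attempt_number=0, task_attempt_id=42)

# Tracking by (partition id, attempt number): the two attempts share a key,
# so aborting one could clean up data committed by the other.
old_key_first = (first.partition_id, first.attempt_number)
old_key_retry = (retry.partition_id, retry.attempt_number)

# Tracking by unique task attempt id, or by the packed id, keeps them apart.
new_key_first = (first.partition_id, combined_attempt_id(first))
new_key_retry = (retry.partition_id, combined_attempt_id(retry))

print(old_key_first == old_key_retry)
print(new_key_first == new_key_retry)
```

With these sample values the old keys compare equal while the new keys do not, which is the situation the commit is meant to avoid.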
