spark-instrumented-optimizer


src	[SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing	2016-10-10 10:55:57 -05:00
pom.xml	[SPARK-17639][BUILD] Add jce.jar to buildclasspath when building.	2016-09-22 21:35:25 -07:00

Dhruve Ashar 4bafacaa5f [SPARK-17417][CORE] Fix # of partitions for Reliable RDD checkpointing

## What changes were proposed in this pull request?
Currently, the number of partition files is limited to 10000 (`%05d` format). If there are more than 10000 part files, recreating the RDD fails because the files are sorted by name as strings rather than by partition number. More details can be found in the JIRA description [here](https://issues.apache.org/jira/browse/SPARK-17417).
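
As a rough illustration of the string-sorting problem described above (plain Python, not Spark's checkpointing code): with a fixed-width zero-padded name, lexicographic order matches numeric order only while every index fits in the padding width. The `part-` prefix and the helper names `part_file_name` and `index_of` are assumptions made for this sketch; the description only mentions "part files" and the `%05d` format.

```python
# Illustration only: plain Python, not Spark's checkpointing code.
# The "part-" prefix is an assumption; the description only mentions
# "part files" and the %05d format.

def part_file_name(index: int) -> str:
    """Build a zero-padded part-file name using the %05d format."""
    return "part-%05d" % index


def index_of(name: str) -> int:
    """Recover the numeric partition index from a part-file name."""
    return int(name[len("part-"):])


indices = [2, 99999, 100000, 100001]
names = [part_file_name(i) for i in indices]

# Sorting by string compares character by character, so a six-digit
# name such as "part-100000" sorts before "part-99999".
by_string = sorted(names)

# Sorting by the parsed index restores partition order.
by_index = sorted(names, key=index_of)

print(by_string)
print(by_index)
```

Sorting on the parsed index rather than on the raw file name is one way to keep the order stable regardless of how many digits an index needs; whether the patch does exactly this is not stated in the description.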

## How was this patch tested?
I tested this patch by checkpointing an RDD, manually renaming the part files to the old format, and then accessing the RDD. It was successfully recreated from the old format. I also verified loading a sample Parquet file, saving it in multiple formats (CSV, JSON, Text, Parquet, ORC), and reading it back successfully from the saved files. I couldn't launch the unit tests from my local box, so I will wait for the Jenkins output.

Author: Dhruve Ashar <dhruveashar@gmail.com>

Closes #15370 from dhruve/bug/SPARK-17417.

2016-10-10 10:55:57 -05:00
