spark-instrumented-optimizer


Sital Kedia 9c15d079df [SPARK-15074][SHUFFLE] Cache shuffle index file to speedup shuffle fetch	2016-08-04 14:54:38 -07:00
..
src	[SPARK-15074][SHUFFLE] Cache shuffle index file to speedup shuffle fetch	2016-08-04 14:54:38 -07:00
pom.xml	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00

Sital Kedia 9c15d079df [SPARK-15074][SHUFFLE] Cache shuffle index file to speedup shuffle fetch

## What changes were proposed in this pull request?

Shuffle fetch on large intermediate datasets is slow because the shuffle service opens and closes the index file for each shuffle fetch. This change introduces a cache for the index information so that we can avoid accessing the index files for each block fetch.
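
The caching idea can be illustrated with a small, self-contained sketch. This is not the code from the patch: the class name `ShuffleIndexCacheSketch`, its methods, and the LRU eviction policy are hypothetical, and the sketch assumes the index file is a flat sequence of big-endian 8-byte offsets, one per partition boundary.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an index-information cache; not the patch itself. */
class ShuffleIndexCacheSketch {
    private final Map<File, long[]> cache;
    private int loads = 0; // how many times an index file was actually read

    ShuffleIndexCacheSketch(final int maxEntries) {
        // accessOrder = true makes the map evict the least recently used entry.
        this.cache = new LinkedHashMap<File, long[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<File, long[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns {offset, length} of the block for one reduce partition. */
    synchronized long[] blockRange(File indexFile, int reduceId) throws IOException {
        long[] offsets = cache.get(indexFile);
        if (offsets == null) {
            // Cache miss: read the whole index file once and keep its offsets.
            offsets = readOffsets(indexFile);
            cache.put(indexFile, offsets);
        }
        long start = offsets[reduceId];
        long end = offsets[reduceId + 1];
        return new long[] {start, end - start};
    }

    synchronized int loadCount() {
        return loads;
    }

    // Assumed layout: consecutive big-endian longs (ByteBuffer's default order).
    private long[] readOffsets(File indexFile) throws IOException {
        loads++;
        byte[] bytes = Files.readAllBytes(indexFile.toPath());
        long[] offsets = new long[bytes.length / 8];
        ByteBuffer.wrap(bytes).asLongBuffer().get(offsets);
        return offsets;
    }
}
```

With a cache like this, repeated block fetches against the same index file touch the file system only on the first request; later requests are served from memory until the entry is evicted.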

## How was this patch tested?

Tested by running a job on the cluster; the shuffle read time was reduced by 50%.

Author: Sital Kedia <skedia@fb.com>

Closes #12944 from sitalkedia/shuffle_service.

2016-08-04 14:54:38 -07:00
