spark-instrumented-optimizer/streaming
Prashant Sharma 3ae4f07de0 [SPARK-17159][STREAM] Significant speed up for running spark streaming against Object store.
## What changes were proposed in this pull request?

Original work by Steve Loughran.
Based on #17745.

This is a minimal patch of changes to FileInputDStream to reduce File status requests when querying files. Each call to file status is 3+ http calls to object store. This patch eliminates the need for it, by using FileStatus objects.

This is a minor optimisation when working with filesystems, but significant when working with object stores.

## How was this patch tested?

Tests included. Existing tests pass.

Closes #22339 from ScrapCodes/PR_17745.

Lead-authored-by: Prashant Sharma <prashant@apache.org>
Co-authored-by: Steve Loughran <stevel@hortonworks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-10-05 02:22:06 +01:00
..
src [SPARK-17159][STREAM] Significant speed up for running spark streaming against Object store. 2018-10-05 02:22:06 +01:00
pom.xml [SPARK-25592] Setting version to 3.0.0-SNAPSHOT 2018-10-02 08:48:24 -07:00