6e27cb630d
This change reorders the replicas returned by HadoopRDD#getPreferredLocations so that replicas cached by HDFS are at the start of the list. This requires Hadoop 2.5 or higher; previous versions of Hadoop do not expose the information needed to determine whether a replica is cached. Author: Colin Patrick Mccabe <cmccabe@cloudera.com> Closes #1486 from cmccabe/SPARK-1767 and squashes the following commits: 338d4f8 [Colin Patrick Mccabe] SPARK-1767: Prefer HDFS-cached replicas when scheduling data-local tasks |
||
---|---|---|
.. | ||
project | ||
spark-style/src/main/scala/org/apache/spark/scalastyle | ||
build.properties | ||
MimaBuild.scala | ||
MimaExcludes.scala | ||
plugins.sbt | ||
SparkBuild.scala |