SPARK-3318: Documentation update in addFile on how to use SparkFiles.get

When calling SparkFiles.get we need to pass the file name rather than the path that was given to addFile.
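
As a minimal sketch of the intended usage (the local path /tmp/data.txt, the app name, and the file name data.txt are made up for illustration):

    import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

    val sc = new SparkContext(new SparkConf().setAppName("addFile-example").setMaster("local[2]"))

    // Ship a local file to every node that runs tasks for this job.
    sc.addFile("/tmp/data.txt")

    // Inside tasks, resolve the downloaded copy by its file name,
    // not by the path that was passed to addFile.
    val firstLines = sc.parallelize(1 to 4)
      .map(_ => scala.io.Source.fromFile(SparkFiles.get("data.txt")).getLines().next())
      .collect()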

Author: Holden Karau <holden@pigscanfly.ca>

Closes #2210 from holdenk/SPARK-3318-documentation-for-addfiles-should-say-to-use-file-not-path and squashes the following commits:

a25d27a [Holden Karau] Update the JavaSparkContext addFile method to be clear about using fileName with SparkFiles as well
0ebcb05 [Holden Karau] Documentation update in addFile on how to use SparkFiles.get to specify filename rather than path
Authored by Holden Karau on 2014-08-30 16:58:17 -07:00; committed by Matei Zaharia
parent b6cf134817
commit ba78383bac
3 changed files with 4 additions and 5 deletions

core/src/main/scala/org/apache/spark/SparkContext.scala

@@ -796,7 +796,7 @@ class SparkContext(config: SparkConf) extends Logging {
* Add a file to be downloaded with this Spark job on every node.
* The `path` passed can be either a local file, a file in HDFS (or other Hadoop-supported
* filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs,
- * use `SparkFiles.get(path)` to find its download location.
+ * use `SparkFiles.get(fileName)` to find its download location.
*/
def addFile(path: String) {
val uri = new URI(path)
@@ -1619,4 +1619,3 @@ private[spark] class WritableConverter[T](
val writableClass: ClassTag[T] => Class[_ <: Writable],
val convert: Writable => T)
extends Serializable
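
The same rule holds when the argument to addFile is a remote URI: the downloaded copy on each node is looked up by the file's name, not by the URI it was added from. A rough sketch, reusing the sc from the example above (the HDFS path is made up for illustration):

    // Driver: ship a file that lives on HDFS to every node.
    sc.addFile("hdfs:///datasets/lookup.csv")

    // Resolve the local copy by file name only.
    val localPath = org.apache.spark.SparkFiles.get("lookup.csv")  // not the full hdfs:// URI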

core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala

@@ -545,7 +545,7 @@ class JavaSparkContext(val sc: SparkContext) extends JavaSparkContextVarargsWork
* Add a file to be downloaded with this Spark job on every node.
* The `path` passed can be either a local file, a file in HDFS (or other Hadoop-supported
* filesystems), or an HTTP, HTTPS or FTP URI. To access the file in Spark jobs,
- * use `SparkFiles.get(path)` to find its download location.
+ * use `SparkFiles.get(fileName)` to find its download location.
*/
def addFile(path: String) {
sc.addFile(path)

python/pyspark/context.py

@@ -606,8 +606,8 @@ class SparkContext(object):
FTP URI.
To access the file in Spark jobs, use
- L{SparkFiles.get(path)<pyspark.files.SparkFiles.get>} to find its
- download location.
+ L{SparkFiles.get(fileName)<pyspark.files.SparkFiles.get>} with the
+ filename to find its download location.
>>> from pyspark import SparkFiles
>>> path = os.path.join(tempdir, "test.txt")