From ac712e48af3068672e629cec7766caae3cd77c37 Mon Sep 17 00:00:00 2001
From: Reynold Xin
Date: Thu, 30 Jan 2014 09:33:18 -0800
Subject: [PATCH] Merge pull request #524 from rxin/doc

Added spark.shuffle.file.buffer.kb to configuration doc.

Author: Reynold Xin

== Merge branch commits ==

commit 0eea1d761ff772ff89be234e1e28035d54e5a7de
Author: Reynold Xin
Date:   Wed Jan 29 14:40:48 2014 -0800

    Added spark.shuffle.file.buffer.kb to configuration doc.
---
 docs/configuration.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/configuration.md b/docs/configuration.md
index 4bb5371cc2..1f9fa70566 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -398,6 +398,14 @@ Apart from these, the following properties are also available, and may be useful
     If set to "true", consolidates intermediate files created during a shuffle. Creating fewer files can improve filesystem performance for shuffles with large numbers of reduce tasks. It is recommended to set this to "true" when using ext4 or xfs filesystems. On ext3, this option might degrade performance on machines with many (>8) cores due to filesystem limitations.
   </td>
 </tr>
+<tr>
+  <td>spark.shuffle.file.buffer.kb</td>
+  <td>100</td>
+  <td>
+    Size of the in-memory buffer for each shuffle file output stream, in kilobytes. These buffers
+    reduce the number of disk seeks and system calls made in creating intermediate shuffle files.
+  </td>
+</tr>
 <tr>
   <td>spark.shuffle.spill</td>
   <td>true</td>
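
The new property text says that a larger in-memory buffer reduces the number of system calls made while writing shuffle files. The sketch below illustrates that general effect only; it is not Spark's implementation. It assumes Python's `io.BufferedWriter` as a stand-in for a buffered output stream, and `CountingSink` and `count_underlying_writes` are hypothetical names introduced for this example.

```python
import io


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a file: counts the write calls that reach it."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, b):
        # Each call here models one write system call on a real file.
        self.calls += 1
        self.bytes_written += len(b)
        return len(b)


def count_underlying_writes(buffer_kb, records=10_000, record_size=64):
    """Write many small records through a buffer of buffer_kb kilobytes.

    Returns (number of writes that reached the sink, total bytes written).
    """
    sink = CountingSink()
    writer = io.BufferedWriter(sink, buffer_size=buffer_kb * 1024)
    record = b"x" * record_size
    for _ in range(records):
        writer.write(record)
    writer.flush()  # push any bytes still held in the buffer
    return sink.calls, sink.bytes_written


if __name__ == "__main__":
    for kb in (1, 100):
        calls, total = count_underlying_writes(kb)
        print(f"{kb} KB buffer: {calls} underlying writes, {total} bytes")
```

Both runs write the same number of bytes, but the 100 KB buffer reaches the sink far less often than the 1 KB buffer. The cost is memory: each open output stream holds its own buffer, so the total grows with the number of streams open at once.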