spark-instrumented-optimizer/python/pyspark
Reynold Xin 4309262ec9 [SPARK-9700] Pick default page size more intelligently.
Previously, we used 64MB as the default page size, which was way too big for a lot of Spark applications (especially on a single node).

This patch changes it so that the default page size, if unset by the user, is determined by the number of cores available and the total execution memory available.
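
To make the idea concrete, here is a minimal sketch of how such a default might be derived from the core count and the available execution memory. It is not the code from this patch: the function name `default_page_size`, the 1MB floor, the safety factor of 8, and the power-of-two rounding are all assumptions made for illustration. Only the 64MB figure (used here as a ceiling) comes from the commit message above.

```python
import os

# All constants below are illustrative assumptions, except that 64MB is the
# previous default named in the commit message (used here as an upper bound).
MIN_PAGE_SIZE = 1 << 20      # assumed 1MB floor
MAX_PAGE_SIZE = 64 << 20     # 64MB ceiling
SAFETY_FACTOR = 8            # assumed headroom so each core can hold several pages


def next_power_of_2(n):
    """Return the smallest power of two that is >= n (1 for n <= 1)."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def default_page_size(execution_memory_bytes, num_cores=0, user_page_size=None):
    """Hypothetical helper: pick a page size in bytes.

    A value set explicitly by the user always wins. Otherwise the size scales
    with the execution memory available per core and is clamped to
    [MIN_PAGE_SIZE, MAX_PAGE_SIZE].
    """
    if user_page_size is not None:
        return user_page_size
    # Fall back to the machine's CPU count when no core count is given.
    cores = num_cores if num_cores > 0 else (os.cpu_count() or 1)
    per_core = execution_memory_bytes // cores // SAFETY_FACTOR
    return min(MAX_PAGE_SIZE, max(MIN_PAGE_SIZE, next_power_of_2(per_core)))
```

Under these assumed constants, a single-node setup with 512MB of execution memory and 4 cores would get a 16MB page instead of 64MB, while a large executor would still be capped at 64MB.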

Author: Reynold Xin <rxin@databricks.com>

Closes #8012 from rxin/pagesize and squashes the following commits:

16f4756 [Reynold Xin] Fixed failing test.
5afd570 [Reynold Xin] private...
0d5fb98 [Reynold Xin] Update default value.
674a6cd [Reynold Xin] Address review feedback.
dc00e05 [Reynold Xin] Merge with master.
73ebdb6 [Reynold Xin] [SPARK-9700] Pick default page size more intelligently.
2015-08-06 23:18:29 -07:00
ml [SPARK-9533] [PYSPARK] [ML] Add missing methods in Word2Vec ML 2015-08-06 10:09:58 -07:00
mllib [SPARK-6486] [MLLIB] [PYTHON] Add BlockMatrix to PySpark. 2015-08-05 07:40:50 -07:00
sql [SPARK-9691] [SQL] PySpark SQL rand function treats seed 0 as no seed 2015-08-06 17:03:14 -07:00
streaming [SPARK-8564] [STREAMING] Add the Python API for Kinesis 2015-07-31 12:09:48 -07:00
__init__.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
accumulators.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
broadcast.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
cloudpickle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
conf.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
context.py [SPARK-9144] Remove DAGScheduler.runLocallyWithinThread and spark.localExecution.enabled 2015-07-22 21:04:04 -07:00
daemon.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
java_gateway.py [SPARK-9700] Pick default page size more intelligently. 2015-08-06 23:18:29 -07:00
join.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
profiler.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
rdd.py [SPARK-9144] Remove DAGScheduler.runLocallyWithinThread and spark.localExecution.enabled 2015-07-22 21:04:04 -07:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-3074] [PySpark] support groupByKey() with single huge key 2015-04-09 17:07:23 -07:00
serializers.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
shell.py [SPARK-9270] [PYSPARK] allow --name option in pyspark 2015-07-24 11:56:55 -07:00
shuffle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
statcounter.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
status.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-9244] Increase some memory defaults 2015-07-22 15:28:09 -07:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-6216] [PYSPARK] check python version of worker with driver 2015-05-18 12:55:13 -07:00