spark-instrumented-optimizer

Tathagata Das 90b11439b3 [SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming	2016-05-31 15:57:01 -07:00

## What changes were proposed in this pull request?

Currently structured streaming only supports append output mode. This PR adds the following.

- Added support for Complete output mode in the internal state store, analyzer and planner.
- Added public API in Scala and Python for users to specify output mode
- Added checks for unsupported combinations of output mode and DF operations
  - Plans with no aggregation should support only Append mode
  - Plans with aggregation should support only Update and Complete modes
  - Default output mode is Append mode (Question: should we change this to automatically set to Complete mode when there is aggregation?)
- Added support for Complete output mode in Memory Sink. So Memory Sink internally supports append and complete, update. But from public API only Complete and Append output modes are supported.

## How was this patch tested?

Unit tests in various test suites

- StreamingAggregationSuite: tests for complete mode
- MemorySinkSuite: tests for checking behavior in Append and Complete modes.
- UnsupportedOperationSuite: tests for checking unsupported combinations of DF ops and output modes
- DataFrameReaderWriterSuite: tests for checking that output mode cannot be called on static DFs
- Python doc test and existing unit tests modified to call write.outputMode.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #13286 from tdas/complete-mode.
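
The compatibility rules listed in the commit message above (no aggregation: Append only; aggregation: Update or Complete only) can be sketched in plain Python. This is only an illustration of the stated rules, not the code from the patch: the names `OutputMode` and `check_output_mode`, and the `has_aggregation` flag, are hypothetical and do not use or mirror any Spark API.

```python
from enum import Enum


class OutputMode(Enum):
    """Hypothetical stand-in for the three modes named in the commit message."""
    APPEND = "append"
    UPDATE = "update"
    COMPLETE = "complete"


def check_output_mode(has_aggregation: bool, mode: OutputMode) -> None:
    """Raise ValueError for a combination the commit message calls unsupported.

    Assumption: the plan is summarized by one boolean, `has_aggregation`.
    A real analyzer would inspect the logical plan instead.
    """
    if has_aggregation:
        # Plans with aggregation: only Update and Complete are allowed.
        allowed = {OutputMode.UPDATE, OutputMode.COMPLETE}
    else:
        # Plans with no aggregation: only Append is allowed.
        allowed = {OutputMode.APPEND}

    if mode not in allowed:
        kind = "with" if has_aggregation else "without"
        raise ValueError(
            f"Output mode {mode.value!r} is not supported for plans {kind} aggregation"
        )


# Supported combinations pass silently.
check_output_mode(has_aggregation=False, mode=OutputMode.APPEND)
check_output_mode(has_aggregation=True, mode=OutputMode.COMPLETE)
```

Note that the sketch covers only the plan-level rule. According to the commit message, the public API additionally exposes only Append and Complete, so Update would be rejected earlier for users even though it is valid internally.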
__init__.py	[SPARK-14945][PYTHON] SparkSession Python API	2016-04-28 10:55:48 -07:00
catalog.py	[SPARK-15464][ML][MLLIB][SQL][TESTS] Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code	2016-05-23 18:14:48 -07:00
column.py	[SPARK-15464][ML][MLLIB][SQL][TESTS] Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code	2016-05-23 18:14:48 -07:00
conf.py	[SPARK-15464][ML][MLLIB][SQL][TESTS] Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code	2016-05-23 18:14:48 -07:00
context.py	[SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified	2016-05-19 21:53:26 -07:00
dataframe.py	[SPARK-15392][SQL] fix default value of size estimation of logical plan	2016-05-19 12:12:42 -07:00
functions.py	[MINOR] Fix Typos 'a -> an'	2016-05-26 22:39:14 -07:00
group.py	[SPARK-15464][ML][MLLIB][SQL][TESTS] Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code	2016-05-23 18:14:48 -07:00
readwriter.py	[SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming	2016-05-31 15:57:01 -07:00
session.py	[SPARK-15520][SQL] Also set sparkContext confs when using SparkSession builder in pyspark	2016-05-26 12:05:47 -07:00
streaming.py	[SPARK-14896][SQL] Deprecate HiveContext in python	2016-05-04 17:39:30 -07:00
tests.py	[SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structure Streaming	2016-05-31 15:57:01 -07:00
types.py	[SPARK-15342] [SQL] [PYSPARK] PySpark test for non ascii column name does not actually test with unicode column name	2016-05-18 11:18:33 -07:00
utils.py	[SPARK-14603][SQL][FOLLOWUP] Verification of Metadata Operations by Session Catalog	2016-05-19 11:46:11 -07:00
window.py	[SPARK-14058][PYTHON] Incorrect docstring in Window.order	2016-03-21 23:52:33 -07:00