spark-instrumented-optimizer/sql
Jungtaek Lim (HeartSaVioR) bd0a04baab [SPARK-26949][SS] Prevent 'purge' to remove needed batch files in CompactibleFileStreamLog
## What changes were proposed in this pull request?

This patch proposes making `purge` in `CompactibleFileStreamLog` to throw `UnsupportedOperationException` to prevent purging necessary batch files, as well as adding javadoc to document its behavior. Actually it would only break when latest compaction batch is requested to be purged, but caller wouldn't be aware of this so safer to just prevent it.

## How was this patch tested?

Added UT.

Closes #23850 from HeartSaVioR/SPARK-26949.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-06-14 20:34:18 -07:00
..
catalyst [SPARK-28057][SQL] Add method clone in catalyst TreeNode 2019-06-15 00:40:55 +02:00
core [SPARK-26949][SS] Prevent 'purge' to remove needed batch files in CompactibleFileStreamLog 2019-06-14 20:34:18 -07:00
hive [SPARK-27578][SQL] Support INTERVAL ... HOUR TO SECOND syntax 2019-06-13 10:12:55 -07:00
hive-thriftserver [SPARK-27899][SQL] Make HiveMetastoreClient.getTableObjectsByName available in ExternalCatalog/SessionCatalog API 2019-06-11 15:32:59 +08:00
create-docs.sh [MINOR][DOCS] Minor doc fixes related with doc build and uses script dir in SQL doc gen script 2017-08-26 13:56:24 +09:00
gen-sql-markdown.py [SPARK-27328][SQL] Add 'deprecated' in ExpressionDescription for extended usage and SQL doc 2019-04-09 13:49:42 +08:00
mkdocs.yml
README.md [MINOR][DOC] Fix some typos and grammar issues 2018-04-06 13:37:08 +08:00

Spark SQL

This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.

Spark SQL is broken up into four subprojects:

  • Catalyst (sql/catalyst) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
  • Execution (sql/core) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a new public interface, SQLContext, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files.
  • Hive Support (sql/hive) - Includes an extension of SQLContext called HiveContext that allows users to write queries using a subset of HiveQL and access data from a Hive Metastore using Hive SerDes. There are also wrappers that allow users to run queries that include Hive UDFs, UDAFs, and UDTFs.
  • HiveServer and CLI support (sql/hive-thriftserver) - Includes support for the SQL CLI (bin/spark-sql) and a HiveServer2 (for JDBC/ODBC) compatible server.

Running sql/create-docs.sh generates SQL documentation for built-in functions under sql/site.