# Spark SQL
This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.
Spark SQL is broken up into four subprojects:

- Catalyst (`sql/catalyst`) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
- Execution (`sql/core`) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a public interface, `SQLContext`, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files (see the sketch after this list).
- Hive Support (`sql/hive`) - Includes an extension of `SQLContext` called `HiveContext` that allows users to write queries using a subset of HiveQL and to access data from a Hive Metastore using Hive SerDes. There are also wrappers that allow users to run queries that include Hive UDFs, UDAFs, and UDTFs.
- HiveServer and CLI support (`sql/hive-thriftserver`) - Includes support for the SQL CLI (`bin/spark-sql`) and a HiveServer2-compatible server (for JDBC/ODBC).
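To make the `sql/core` entry point concrete, below is a minimal sketch of executing a SQL query against a Parquet file through `SQLContext`. The application name, file path, view name, and column names (`name`, `age`) are hypothetical illustrations, not part of this module's documentation.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SqlCoreSketch {
  def main(args: Array[String]): Unit = {
    // Local SparkContext for illustration; the master and app name are assumptions.
    val sc = new SparkContext(
      new SparkConf().setAppName("sql-core-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // Load a Parquet file (hypothetical path) as a DataFrame and expose it
    // to SQL by registering it as a temporary view.
    val people = sqlContext.read.parquet("path/to/people.parquet")
    people.createOrReplaceTempView("people")

    // Run a SQL statement through SQLContext and print the result.
    sqlContext.sql("SELECT name FROM people WHERE age > 21").show()

    sc.stop()
  }
}
```

The same query could also be written with the DataFrame API, e.g. `people.filter("age > 21").select("name")`; either way, Catalyst turns it into a logical plan that the execution engine translates into Spark RDD operations.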
Running `sql/create-docs.sh` generates SQL documentation for built-in functions under `sql/site`.