
Kevin Yu d22db62785 [SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN subquery 2nd batch

## What changes were proposed in this pull request?

This is 2nd batch of test case for IN/NOT IN subquery. In this PR, it has these test cases:

`in-limit.sql`
`in-order-by.sql`
`not-in-group-by.sql`

These are the queries and results from running on DB2.

[in-limit DB2 version](https://github.com/apache/spark/files/743267/in-limit.sql.db2.out.txt)
[in-order-by DB2 version](https://github.com/apache/spark/files/743269/in-order-by.sql.db2.txt)
[not-in-group-by DB2 version](https://github.com/apache/spark/files/743271/not-in-group-by.sql.db2.txt)
[output of in-limit.sql DB2](https://github.com/apache/spark/files/743276/in-limit.sql.db2.out.txt)
[output of in-order-by.sql DB2](https://github.com/apache/spark/files/743278/in-order-by.sql.db2.out.txt)
[output of not-in-group-by.sql DB2](https://github.com/apache/spark/files/743279/not-in-group-by.sql.db2.out.txt)

## How was this patch tested?

This pr is adding new test cases.

Author: Kevin Yu <qyu@us.ibm.com>

Closes #16759 from kevinyu98/spark-18871-2.

2017-02-15 17:28:42 +01:00
catalyst	[SPARK-17076][SQL] Cardinality estimation for join based on basic column statistics	2017-02-15 08:21:51 -08:00
core	[SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN subquery 2nd batch	2017-02-15 17:28:42 +01:00
hive	[SPARK-19587][SQL] bucket sorting columns should not be picked from partition columns	2017-02-15 08:15:03 -08:00
hive-thriftserver	[SPARK-18857][SQL] Don't use `Iterator.duplicate` for `incrementalCollect` in Thrift Server	2017-01-10 13:27:55 +00:00
README.md	[SPARK-16557][SQL] Remove stale doc in sql/README.md	2016-07-14 19:24:42 -07:00

README.md

Spark SQL

This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.

Spark SQL is broken up into four subprojects:

- Catalyst (sql/catalyst) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
- Execution (sql/core) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a new public interface, SQLContext, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files.
- Hive Support (sql/hive) - Includes an extension of SQLContext called HiveContext that allows users to write queries using a subset of HiveQL and access data from a Hive Metastore using Hive SerDes. There are also wrappers that allow users to run queries that include Hive UDFs, UDAFs, and UDTFs.
- HiveServer and CLI support (sql/hive-thriftserver) - Includes support for the SQL CLI (bin/spark-sql) and a HiveServer2-compatible server (for JDBC/ODBC).
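
To make the idea of "manipulating trees of relational operators and expressions" more concrete, here is a minimal, self-contained Python sketch of a bottom-up rewrite rule applied to a small expression tree. It is not Catalyst's actual API (Catalyst is written in Scala); the names `Literal`, `Attribute`, `Add`, `transform_up`, and `constant_fold` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical expression nodes; these are NOT Catalyst classes.
@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Attribute:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Attribute, Add]


def transform_up(expr, rule):
    """Rewrite the children first, then apply `rule` to the rebuilt node."""
    if isinstance(expr, Add):
        expr = Add(transform_up(expr.left, rule), transform_up(expr.right, rule))
    return rule(expr)


def constant_fold(expr):
    """Illustrative rule: replace an addition of two literals with one literal."""
    if (
        isinstance(expr, Add)
        and isinstance(expr.left, Literal)
        and isinstance(expr.right, Literal)
    ):
        return Literal(expr.left.value + expr.right.value)
    return expr


# x + (1 + 2): only the inner addition can be folded.
tree = Add(Attribute("x"), Add(Literal(1), Literal(2)))
folded = transform_up(tree, constant_fold)
print(folded)
```

The nodes are immutable, so a rule returns a new tree rather than mutating the old one. That keeps each rewrite easy to reason about and to compose with other rules.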