27e815c31d
## What changes were proposed in this pull request? Implements `eval()` method for expression `AssertNotNull` so that we can convert local projection on LocalRelation to another LocalRelation. ### Before change: ``` scala> import org.apache.spark.sql.catalyst.dsl.expressions._ scala> import org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull scala> import org.apache.spark.sql.Column scala> case class A(a: Int) scala> Seq((A(1),2)).toDS().select(new Column(AssertNotNull("_1".attr, Nil))).explain java.lang.UnsupportedOperationException: Only code-generated evaluation is supported. at org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull.eval(objects.scala:850) ... ``` ### After the change: ``` scala> Seq((A(1),2)).toDS().select(new Column(AssertNotNull("_1".attr, Nil))).explain(true) == Parsed Logical Plan == 'Project [assertnotnull('_1) AS assertnotnull(_1)#5] +- LocalRelation [_1#2, _2#3] == Analyzed Logical Plan == assertnotnull(_1): struct<a:int> Project [assertnotnull(_1#2) AS assertnotnull(_1)#5] +- LocalRelation [_1#2, _2#3] == Optimized Logical Plan == LocalRelation [assertnotnull(_1)#5] == Physical Plan == LocalTableScan [assertnotnull(_1)#5] ``` ## How was this patch tested? Unit test. Author: Sean Zhong <seanzhong@databricks.com> Closes #14486 from clockfly/assertnotnull_eval. |
||
---|---|---|
.. | ||
catalyst | ||
core | ||
hive | ||
hive-thriftserver | ||
README.md |
Spark SQL
This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.
Spark SQL is broken up into four subprojects:
- Catalyst (sql/catalyst) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
- Execution (sql/core) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a new public interface, SQLContext, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files.
- Hive Support (sql/hive) - Includes an extension of SQLContext called HiveContext that allows users to write queries using a subset of HiveQL and access data from a Hive Metastore using Hive SerDes. There are also wrappers that allows users to run queries that include Hive UDFs, UDAFs, and UDTFs.
- HiveServer and CLI support (sql/hive-thriftserver) - Includes support for the SQL CLI (bin/spark-sql) and a HiveServer2 (for JDBC/ODBC) compatible server.