spark-instrumented-optimizer


LantaoJin 63e0711524	[SPARK-27899][SQL] Make HiveMetastoreClient.getTableObjectsByName available in ExternalCatalog/SessionCatalog API	2019-06-11 15:32:59 +08:00

## What changes were proposed in this pull request?

The new Spark ThriftServer SparkGetTablesOperation implemented in https://github.com/apache/spark/pull/22794 does a catalog.getTableMetadata request for every table. This can get very slow for large schemas (~50ms per table with an external Hive metastore).

Hive ThriftServer GetTablesOperation uses HiveMetastoreClient.getTableObjectsByName to get table information in bulk, but we don't expose that through our APIs that go through Hive -> HiveClientImpl (HiveClient) -> HiveExternalCatalog (ExternalCatalog) -> SessionCatalog.

If we added and exposed getTableObjectsByName through our catalog APIs, we could resolve that performance problem in SparkGetTablesOperation.

## How was this patch tested?

Add UT

Closes #24774 from LantaoJin/SPARK-27899.

Authored-by: LantaoJin <jinlantao@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
src	[SPARK-27899][SQL] Make HiveMetastoreClient.getTableObjectsByName available in ExternalCatalog/SessionCatalog API	2019-06-11 15:32:59 +08:00
v1.2.1	[SPARK-27749][SQL] hadoop-3.2 support hive-thriftserver	2019-06-05 08:40:05 -07:00
v2.3.5	[SPARK-27749][SQL] hadoop-3.2 support hive-thriftserver	2019-06-05 08:40:05 -07:00
pom.xml	[SPARK-27831][SQL][TEST] Move Hive test jars to maven dependency	2019-06-02 20:23:08 -07:00