spark-instrumented-optimizer


src	[SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD	2016-06-20 21:55:34 -07:00
pom.xml	[SPARK-15851][BUILD] Fix the call of the bash script to enable proper run in Windows	2016-06-15 20:11:23 -07:00

hyukjinkwon 4f7f1c4362 [SPARK-16044][SQL] input_file_name() returns empty strings in data sources based on NewHadoopRDD

## What changes were proposed in this pull request?

This PR makes the `input_file_name()` function return file paths instead of empty strings for external data sources based on `NewHadoopRDD`, such as [spark-redshift](cba5eee1ab/src/main/scala/com/databricks/spark/redshift/RedshiftRelation.scala (L149)) and [spark-xml](https://github.com/databricks/spark-xml/blob/master/src/main/scala/com/databricks/spark/xml/util/XmlFile.scala#L39-L47).

With these external data sources, the code below:

```scala
df.select(input_file_name()).show()
```

will produce

- **Before**

  ```
  +-----------------+
  |input_file_name()|
  +-----------------+
  |                 |
  +-----------------+
  ```

- **After**

  ```
  +--------------------+
  |   input_file_name()|
  +--------------------+
  |file:/private/var...|
  +--------------------+
  ```
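For readers who want to try the behaviour locally, here is a minimal sketch, not the test added by this PR. It assumes a Spark 2.x build with `SparkSession` available; the object name, the temporary directory, and the sample rows are all hypothetical. It writes a small text dataset, reads it back through `SparkContext.newAPIHadoopFile` (which produces a `NewHadoopRDD`), and selects `input_file_name()` on the resulting DataFrame.

```scala
import java.nio.file.Files

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.input_file_name

// Hypothetical driver object, for illustration only.
object InputFileNameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("input-file-name-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed scratch location: a fresh sub-path of a temp directory,
    // so that write.text does not collide with an existing directory.
    val dir = Files.createTempDirectory("input-file-name").resolve("data").toString
    Seq("a", "b", "c").toDF("value").write.text(dir)

    // newAPIHadoopFile uses the new Hadoop MapReduce API, so the
    // resulting RDD is backed by NewHadoopRDD.
    val lines = spark.sparkContext
      .newAPIHadoopFile(dir, classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
      .map { case (_, text) => text.toString } // copy out of the reused Text object

    // With this change applied, each row should carry the path of its source file.
    lines.toDF("value").select(input_file_name()).show(truncate = false)

    spark.stop()
  }
}
```

The key and value are converted to plain strings right away because Hadoop `Writable` objects are reused between records and are not serializable.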

## How was this patch tested?

Unit tests in `ColumnExpressionSuite`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #13759 from HyukjinKwon/SPARK-16044.

2016-06-20 21:55:34 -07:00
