spark-instrumented-optimizer/common
Tejas Patil c96d14abae [SPARK-19843][SQL] UTF8String => (int / long) conversion expensive for invalid inputs
## What changes were proposed in this pull request?

Jira: https://issues.apache.org/jira/browse/SPARK-19843

Added wrapper classes (`IntWrapper`, `LongWrapper`) that hold the primitive result of parsing. The parsing methods now return a boolean indicating whether parsing succeeded and store the parsed value in the wrapper, so invalid input is reported through the return value.
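
The wrapper idea can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `ParseSketch`, its nested `IntWrapper`, and the `toInt(String, IntWrapper)` signature are hypothetical stand-ins that operate on a plain `String` and handle only an optional sign followed by decimal digits. It shows how a parser can report failure through a boolean return value while handing the primitive result back through a mutable holder.

```java
public final class ParseSketch {

    /** Mutable holder for a parsed int (hypothetical stand-in for the patch's IntWrapper). */
    public static final class IntWrapper {
        public int value = 0;
    }

    /**
     * Parses a base-10 int from s.
     * On success, stores the result in out.value and returns true.
     * On malformed input or overflow, returns false and leaves out untouched.
     */
    public static boolean toInt(String s, IntWrapper out) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int i = 0;
        boolean negative = s.charAt(0) == '-';
        if (negative || s.charAt(0) == '+') {
            if (s.length() == 1) {
                return false; // a lone sign is not a number
            }
            i = 1;
        }
        // Accumulate as a negative number so Integer.MIN_VALUE is representable.
        int result = 0;
        final int limit = Integer.MIN_VALUE / 10;
        for (; i < s.length(); i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                return false; // non-digit character
            }
            if (result < limit) {
                return false; // multiplying by 10 would overflow
            }
            result = result * 10 - digit;
            if (result > 0) {
                return false; // subtraction wrapped around
            }
        }
        if (!negative) {
            if (result == Integer.MIN_VALUE) {
                return false; // +2147483648 does not fit in an int
            }
            result = -result;
        }
        out.value = result;
        return true;
    }
}
```

A caller checks the return value and reads `out.value` only when it is `true`, so the invalid-input path is an ordinary branch rather than exception handling.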

## How was this patch tested?

- Added new unit tests
- Ran a production job that converts strings to ints and verified its output

## Performance

Essentially unchanged when all strings are valid integers:

```
conversion to int:       Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------
trunk                         502 /  522         33.4          29.9       1.0X
SPARK-19843                   493 /  503         34.0          29.4       1.0X
```

About 220x faster when all strings are invalid integers:

```
conversion to int:      Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------
trunk                     33913 / 34219          0.5        2021.4       1.0X
SPARK-19843                  154 /  162        108.8           9.2     220.0X
```

Author: Tejas Patil <tejasp@fb.com>

Closes #17184 from tejasapatil/SPARK-19843_is_numeric_maybe.
2017-03-07 20:19:30 -08:00
network-common [MINOR][BUILD] Fix lint-java breaks in Java 2017-02-27 08:44:26 +00:00
network-shuffle [SPARK-19534][TESTS] Convert Java tests to use lambdas, Java 8 features 2017-02-19 09:42:50 -08:00
network-yarn [SPARK-19139][CORE] New auth mechanism for transport library. 2017-01-24 10:44:04 -08:00
sketch [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support 2017-02-16 12:32:45 +00:00
tags [SPARK-18993][BUILD] Unable to build/compile Spark in IntelliJ due to missing Scala deps in spark-tags 2016-12-28 12:17:33 +00:00
unsafe [SPARK-19843][SQL] UTF8String => (int / long) conversion expensive for invalid inputs 2017-03-07 20:19:30 -08:00