spark-instrumented-optimizer/sql/catalyst/src/main
Sylvain Zimmer 2460f03ffe [SPARK-16826][SQL] Switch to java.net.URI for parse_url()
## What changes were proposed in this pull request?
The java.net.URL class has a globally synchronized Hashtable, which limits the throughput of any single executor making many calls to parse_url(). Tests have shown that a 36-core machine can only reach 10% CPU utilization because the threads spend most of their time blocked on that lock.

This patch switches to java.net.URI, which has fewer features than java.net.URL but focuses on URI parsing, which is enough for parse_url().
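
As a rough illustration of the idea (not the code in this patch), the sketch below extracts URL parts using only java.net.URI. The class name `ParseUrlSketch`, the helper `part`, and the choice to return null for unparseable input are assumptions made for this example; the actual Spark expression may handle parts and errors differently.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: extract one part of a URL string via java.net.URI.
// This is NOT Spark's implementation; names and behaviour are assumptions.
public class ParseUrlSketch {

    // Returns the requested part, or null if the input cannot be parsed
    // or the part name is not recognised.
    static String part(String url, String partName) {
        URI uri;
        try {
            // URI parsing is purely syntactic; no protocol handler lookup is involved.
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return null;
        }
        switch (partName) {
            case "PROTOCOL":  return uri.getScheme();
            case "HOST":      return uri.getHost();
            case "PATH":      return uri.getRawPath();
            case "QUERY":     return uri.getRawQuery();
            case "REF":       return uri.getRawFragment();
            case "AUTHORITY": return uri.getRawAuthority();
            case "USERINFO":  return uri.getRawUserInfo();
            default:          return null;
        }
    }

    public static void main(String[] args) {
        String url = "http://spark.apache.org/path?query=1#Ref";
        System.out.println(part(url, "HOST"));
        System.out.println(part(url, "QUERY"));
        // A space in the host makes the string an invalid URI, so this yields null.
        System.out.println(part("http://exa mple.com/", "HOST"));
    }
}
```

Using the raw accessors (`getRawPath`, `getRawQuery`, and so on) keeps percent-encoded characters as they appear in the input, which is one plausible way to stay close to what java.net.URL returned; whether that matches the patch exactly is not stated here.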

New tests were added to make sure a few common edge cases didn't change behaviour.
https://issues.apache.org/jira/browse/SPARK-16826

## How was this patch tested?
I've kept the old URL code commented out for now, so that people can verify that the new unit tests also pass with java.net.URL.

Thanks to srowen for the help!

Author: Sylvain Zimmer <sylvain@sylvainzimmer.com>

Closes #14488 from sylvinus/master.
2016-08-05 20:55:58 +01:00
antlr4/org/apache/spark/sql/catalyst/parser [SPARK-16836][SQL] Add support for CURRENT_DATE/CURRENT_TIMESTAMP literals 2016-08-02 10:09:47 -07:00
java/org/apache/spark/sql [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGenerator 2016-07-26 18:08:07 -07:00
scala/org/apache/spark/sql [SPARK-16826][SQL] Switch to java.net.URI for parse_url() 2016-08-05 20:55:58 +01:00