f34b872aed
## What changes were proposed in this pull request? According to Brian Goetz et al in Java Concurrency in Practice, the double checked locking pattern has worked since Java 5, but only if the resource is declared volatile: > Subsequent changes in the JMM (Java 5.0 and later) have enabled DCL to work if resource is made volatile, and the performance impact of this is small since volatile reads are usually only slightly more expensive than nonvolatile reads. CachedRDDBuilder. cachedColumnBuffers and CachedRDDBuilder.clearCache both use DCL to manage the resource ``_cachedColumnBuffers``. The missing ingredient is that ``_cachedColumnBuffers`` is not volatile. Because of this, clearCache may see ``_cachedColumnBuffers`` as null, when in fact it is not, and therefore fail to un-cache the RDD. There may be other, more subtle bugs due to visibility issues. To avoid these issues, this PR makes ``_cachedColumnBuffers`` volatile. ## How was this patch tested? - Existing SQL unit tests - Existing pyspark-sql tests Closes #23768 from bersprockets/SPARK-26851. Authored-by: Bruce Robbins <bersprockets@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> |
||
---|---|---|
.. | ||
benchmarks | ||
src | ||
pom.xml |