[DOCS][MINOR] Update sql-performance-tuning.md
### What changes were proposed in this pull request?

Update the "Caching Data in Memory" section: add a suggestion to call the DataFrame `unpersist` method, making it consistent with the earlier suggestion to use the `persist` method.

### Why are the changes needed?

Keep the documentation consistent.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes the user-facing docs.

### How was this patch tested?

Manually.

Closes #33069 from Silverlight42/caching-data-doc.

Authored-by: Carlos Peña <Cdpm42@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
This commit is contained in:
parent: 2da42ca3b4
commit: c22f17c573
```diff
@@ -29,7 +29,7 @@ turning on some experimental options.
 Spark SQL can cache tables using an in-memory columnar format by calling `spark.catalog.cacheTable("tableName")` or `dataFrame.cache()`.
 Then Spark SQL will scan only required columns and will automatically tune compression to minimize
-memory usage and GC pressure. You can call `spark.catalog.uncacheTable("tableName")` to remove the table from memory.
+memory usage and GC pressure. You can call `spark.catalog.uncacheTable("tableName")` or `dataFrame.unpersist()` to remove the table from memory.
 
 Configuration of in-memory caching can be done using the `setConf` method on `SparkSession` or by running
 `SET key=value` commands using SQL.
```