Add an option to disable reference tracking in Kryo

Matei Zaharia 2013-07-15 01:55:54 +00:00
parent 238d0e6893
commit d47c16f78d
2 changed files with 15 additions and 1 deletions


@@ -210,6 +210,10 @@ class KryoSerializer extends spark.serializer.Serializer with Logging {
 val reg = Class.forName(regCls, true, classLoader).newInstance().asInstanceOf[KryoRegistrator]
 reg.registerClasses(kryo)
 }
+// Allow disabling Kryo reference tracking if user knows their object graphs don't have loops
+kryo.setReferences(System.getProperty("spark.kryo.referenceTracking", "true").toBoolean)
 kryo
 }
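
A minimal sketch of how the lookup in the hunk above behaves, assuming nothing beyond the JVM system-property API the diff itself uses. The object and method names (ReferenceTrackingFlag, referenceTrackingEnabled) are hypothetical and not part of this commit; the sketch only mirrors the expression passed to kryo.setReferences and does not touch Kryo or Spark.

```scala
// Hypothetical helper, not part of the commit: it repeats the property
// lookup from the diff so the default and the override can be seen in isolation.
object ReferenceTrackingFlag {
  val Key = "spark.kryo.referenceTracking"

  // Same expression as in the diff: an unset property falls back to "true".
  def referenceTrackingEnabled(): Boolean =
    System.getProperty(Key, "true").toBoolean

  def main(args: Array[String]): Unit = {
    System.clearProperty(Key)
    println(referenceTrackingEnabled()) // property unset, default applies

    // Opt out only if the serialized object graphs are known to have no loops.
    System.setProperty(Key, "false")
    println(referenceTrackingEnabled())
  }
}
```

Because the diff reads the property at the point where the Kryo instance is built, the property presumably needs to be set before the serializer is created for the change to take effect.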


@@ -197,9 +197,19 @@ Apart from these, the following properties are also available, and may be useful
 (e.g. map functions) reference large objects in the driver program.
 </td>
 </tr>
+<tr>
+<td>spark.kryo.referenceTracking</td>
+<td>true</td>
+<td>
+Whether to track references to the same object when serializing data with Kryo, which is
+necessary if your object graphs have loops and useful for efficiency if they contain multiple
+copies of the same object. Can be disabled to improve performance if you know this is not the
+case.
+</td>
+</tr>
 <tr>
 <td>spark.kryoserializer.buffer.mb</td>
-<td>32</td>
+<td>2</td>
 <td>
 Maximum object size to allow within Kryo (the library needs to create a buffer at least as
 large as the largest single object you'll serialize). Increase this if you get a "buffer limit