ba81b92402
### What changes were proposed in this pull request?

This PR adds support for Netty-level logging at the network layer. To enable Netty-level logging, a `LogHandler` must be added to the channel pipeline. The PR introduces a new class, `NettyLogger`, which constructs a log handler depending on the log level:

- `log4j.logger.org.apache.spark.network.util.NettyLogger=DEBUG`: a custom log handler is created that does not dump the message contents, which keeps the log more compact. When network-level encryption is switched on, this level might be sufficient.
- `log4j.logger.org.apache.spark.network.util.NettyLogger=TRACE`: Netty's own log handler is used, which dumps the message contents.
- Otherwise (the logger is neither TRACE nor DEBUG): the pipeline does not contain a log handler. The default setting therefore carries no runtime penalty, but a long-running app or service must be restarted with the new log level for the change to take effect.

### Why are the changes needed?

This level of logging proved sufficient while debugging an external-shuffle-related problem. Compared with tcpdump output, these log lines can be correlated more easily with Spark's internal calls. Moreover, the log layout can be configured to include thread names, so that a busy thread can be identified when a timeout occurs.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually.
#### DEBUG level

```
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*›
╰─$ tail -1 ./conf/log4j.properties
log4j.logger.org.apache.spark.network.util.NettyLogger=DEBUG
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*›
╰─$ ./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --master local\[8\] ./examples/target/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar README.md 2> >(grep NettyLogger) 1> /dev/null
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf] REGISTERED
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf] CONNECT: /172.30.64.219:61014
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] ACTIVE
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] REGISTERED
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] ACTIVE
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] WRITE 66B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] FLUSH
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] READ 66B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] WRITE: MessageWithHeader [headerLength: 74, bodyLength: 1552705]
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] FLUSH
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 74B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 - R:/172.30.64.219:61015] READ COMPLETE
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 2048B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 32768B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 65536B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ 10561B
21/09/10 15:24:35 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE
21/09/10 15:24:40 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 ! R:/172.30.64.219:61015] INACTIVE
21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 - R:/172.30.64.219:61014] READ COMPLETE
21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 ! R:/172.30.64.219:61014] INACTIVE
21/09/10 15:24:40 DEBUG NettyLogger: [id: 0xb9d94fcf, L:/172.30.64.219:61015 ! R:/172.30.64.219:61014] UNREGISTERED
21/09/10 15:24:40 DEBUG NettyLogger: [id: 0x28101520, L:/172.30.64.219:61014 ! R:/172.30.64.219:61015] UNREGISTERED
```

#### TRACE level

```
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*›
╰─$ tail -1 ./conf/log4j.properties
log4j.logger.org.apache.spark.network.util.NettyLogger=TRACE
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-36719*›
╰─$ ./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --master local\[8\] ./examples/target/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar README.md 1> /dev/null 2>&1
...
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786] REGISTERED
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786] CONNECT: /172.30.64.219:61044
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] ACTIVE
21/09/10 15:29:14 INFO TransportClientFactory: Successfully created connection to /172.30.64.219:61044 after 37 ms (0 ms spent in bootstraps)
21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] REGISTERED
21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] ACTIVE
21/09/10 15:29:14 INFO Utils: Fetching spark://172.30.64.219:61044/jars/original-spark-examples_2.12-3.3.0-SNAPSHOT.jar to /private/var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/spark-91e059f5-1e29-4727-8602-f81206bbe48b/userFiles-50b48490-8950-4c46-b3d3-61a2c85412a3/fetchFileTemp8803030587223485061.tmp
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] WRITE: 66B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 00 00 00 00 00 00 00 42 06 00 00 00 35 2f 6a 61 |.......B....5/ja|
|00000010| 72 73 2f 6f 72 69 67 69 6e 61 6c 2d 73 70 61 72 |rs/original-spar|
|00000020| 6b 2d 65 78 61 6d 70 6c 65 73 5f 32 2e 31 32 2d |k-examples_2.12-|
|00000030| 33 2e 33 2e 30 2d 53 4e 41 50 53 48 4f 54 2e 6a |3.3.0-SNAPSHOT.j|
|00000040| 61 72                                           |ar              |
+--------+-------------------------------------------------+----------------+
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] FLUSH
21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] READ: 66B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 00 00 00 00 00 00 00 42 06 00 00 00 35 2f 6a 61 |.......B....5/ja|
|00000010| 72 73 2f 6f 72 69 67 69 6e 61 6c 2d 73 70 61 72 |rs/original-spar|
|00000020| 6b 2d 65 78 61 6d 70 6c 65 73 5f 32 2e 31 32 2d |k-examples_2.12-|
|00000030| 33 2e 33 2e 30 2d 53 4e 41 50 53 48 4f 54 2e 6a |3.3.0-SNAPSHOT.j|
|00000040| 61 72                                           |ar              |
+--------+-------------------------------------------------+----------------+
21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] WRITE: MessageWithHeader [headerLength: 74, bodyLength: 1552705]
21/09/10 15:29:14 TRACE NettyLogger: [id: 0x362fc693, L:/172.30.64.219:61044 - R:/172.30.64.219:61045] FLUSH
21/09/10 15:29:14 TRACE NettyLogger: [id: 0xf1d25786, L:/172.30.64.219:61045 - R:/172.30.64.219:61044] READ: 74B
...
```

Closes #33962 from attilapiros/SPARK-36719.

Authored-by: attilapiros <piros.attila.zsolt@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> |
||
---|---|---|
.. | ||
kvstore | ||
network-common | ||
network-shuffle | ||
network-yarn | ||
sketch | ||
tags | ||
unsafe |
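
The commit message above describes three behaviours keyed on the `NettyLogger` log level: full message dumps at TRACE, a compact handler at DEBUG, and no handler otherwise. The following is a minimal, dependency-free sketch of that decision only. It is not the code from the PR: `NettyLoggerSketch`, `Level`, `HandlerKind`, `chooseHandler`, and `compactFormat` are hypothetical names introduced for illustration, and the real class presumably installs Netty handlers rather than returning an enum.

```java
import java.util.Optional;

// Hypothetical model of the level-dependent handler choice; not Spark's actual NettyLogger.
final class NettyLoggerSketch {

    // Simplified stand-in for the log4j level configured for the logger.
    enum Level { TRACE, DEBUG, INFO }

    // Which kind of handler would be placed in the channel pipeline.
    enum HandlerKind { FULL_DUMP, COMPACT }

    // TRACE -> handler that dumps message contents,
    // DEBUG -> compact handler without contents,
    // anything else -> no handler at all (no runtime penalty).
    static Optional<HandlerKind> chooseHandler(Level level) {
        switch (level) {
            case TRACE:
                return Optional.of(HandlerKind.FULL_DUMP);
            case DEBUG:
                return Optional.of(HandlerKind.COMPACT);
            default:
                return Optional.empty();
        }
    }

    // Compact rendering modelled on the DEBUG output in the commit message:
    // the channel description, the event name, and the byte count only.
    static String compactFormat(String channel, String event, int byteCount) {
        return channel + " " + event + " " + byteCount + "B";
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + chooseHandler(level));
        }
        System.out.println(compactFormat("[id: 0xb9d94fcf]", "READ", 66));
    }
}
```

Returning an empty `Optional` for the default case mirrors the point made in the commit message: when the logger is neither TRACE nor DEBUG, nothing is added to the pipeline, so the choice is fixed at pipeline construction and a level change needs a restart.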