Testing network / NAT performance

Rafał Miłecki zajec5 at gmail.com
Tue Jun 14 06:16:18 PDT 2022


On 12.06.2022 21:58, Rafał Miłecki wrote:
> 5. 7125323b81d7 ("bcm53xx: switch to kernel 5.4")
> 
> Improved network speed by 25% (256 Mb/s → 320 Mb/s).
> 
> I didn't have time to bisect this *improvement* to a single kernel
> commit. I tried profiling but it isn't obvious to me what caused that
> improvement.
> 
> Kernel 4.19:
>      11.94%  ksoftirqd/0      [kernel.kallsyms]       [k] v7_dma_inv_range
>       7.06%  ksoftirqd/0      [kernel.kallsyms]       [k] l2c210_inv_range
>       3.37%  ksoftirqd/0      [kernel.kallsyms]       [k] v7_dma_clean_range
>       2.80%  ksoftirqd/0      [kernel.kallsyms]       [k] l2c210_clean_range
>       2.67%  ksoftirqd/0      [kernel.kallsyms]       [k] bgmac_poll
>       2.63%  ksoftirqd/0      [kernel.kallsyms]       [k] __dev_queue_xmit
>       2.43%  ksoftirqd/0      [kernel.kallsyms]       [k] __netif_receive_skb_core
>       2.13%  ksoftirqd/0      [kernel.kallsyms]       [k] bgmac_start_xmit
>       1.82%  ksoftirqd/0      [kernel.kallsyms]       [k] nf_hook_slow
>       1.54%  ksoftirqd/0      [kernel.kallsyms]       [k] ip_forward
>       1.50%  ksoftirqd/0      [kernel.kallsyms]       [k] dma_cache_maint_page
> 
> Kernel 5.4:
>      14.53%  ksoftirqd/0      [kernel.kallsyms]       [k] v7_dma_inv_range
>       8.02%  ksoftirqd/0      [kernel.kallsyms]       [k] l2c210_inv_range
>       3.32%  ksoftirqd/0      [kernel.kallsyms]       [k] bgmac_poll
>       3.28%  ksoftirqd/0      [kernel.kallsyms]       [k] v7_dma_clean_range
>       3.12%  ksoftirqd/0      [kernel.kallsyms]       [k] __netif_receive_skb_core
>       2.70%  ksoftirqd/0      [kernel.kallsyms]       [k] l2c210_clean_range
>       2.46%  ksoftirqd/0      [kernel.kallsyms]       [k] __dev_queue_xmit
>       2.26%  ksoftirqd/0      [kernel.kallsyms]       [k] bgmac_start_xmit
>       1.73%  ksoftirqd/0      [kernel.kallsyms]       [k] __dma_page_dev_to_cpu
>       1.72%  ksoftirqd/0      [kernel.kallsyms]       [k] nf_hook_slow

Riddle solved. Change to bless/blame: 4e0c54bc5bc8 ("kernel: add support
for kernel 5.4").

First of all, bcm53xx uses
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y


OpenWrt's kernel Makefile in kernel 4.19:

ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS	+= -Os $(EXTRA_OPTIMIZATION)
else
KBUILD_CFLAGS   += -O2 -fno-reorder-blocks -fno-tree-ch $(EXTRA_OPTIMIZATION)
endif


OpenWrt's kernel Makefile in kernel 5.4:

ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE
KBUILD_CFLAGS += -O2 $(EXTRA_OPTIMIZATION)
else ifdef CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3
KBUILD_CFLAGS += -O3 $(EXTRA_OPTIMIZATION)
else ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS += -Os -fno-reorder-blocks -fno-tree-ch $(EXTRA_OPTIMIZATION)
endif


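As you can see, 4e0c54bc5bc8 accidentally moved -fno-reorder-blocks (and
-fno-tree-ch) from !CONFIG_CC_OPTIMIZE_FOR_SIZE to
CONFIG_CC_OPTIMIZE_FOR_SIZE, so bcm53xx (which uses
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE) is no longer built with those flags
on kernel 5.4.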
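
To make the difference explicit, here is a rough Python model of the two
conditionals quoted above. It is only a sketch: the function names are
made up for illustration, $(EXTRA_OPTIMIZATION) is left out, and "ifdef"
is approximated as simple set membership.

```python
def flags_4_19(config):
    """Model of the 4.19 conditional: everything that is not
    OPTIMIZE_FOR_SIZE falls into the else branch."""
    if "CONFIG_CC_OPTIMIZE_FOR_SIZE" in config:
        return ["-Os"]
    return ["-O2", "-fno-reorder-blocks", "-fno-tree-ch"]


def flags_5_4(config):
    """Model of the 5.4 conditional: the extra flags now sit in the
    OPTIMIZE_FOR_SIZE branch only."""
    if "CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE" in config:
        return ["-O2"]
    if "CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3" in config:
        return ["-O3"]
    if "CONFIG_CC_OPTIMIZE_FOR_SIZE" in config:
        return ["-Os", "-fno-reorder-blocks", "-fno-tree-ch"]
    return []


# bcm53xx sets CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
bcm53xx = {"CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE"}

old = flags_4_19(bcm53xx)
new = flags_5_4(bcm53xx)
dropped = [f for f in old if f not in new]

print("4.19:   ", " ".join(old))
print("5.4:    ", " ".join(new))
print("dropped:", " ".join(dropped))
```

For the bcm53xx config both kernels end up with -O2; the only difference
in this model is the two -fno-* flags that 5.4 no longer passes.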

I noticed a problem with -fno-reorder-blocks a long time ago, see:
[PATCH RFC] kernel: drop -fno-reorder-blocks
https://patchwork.ozlabs.org/project/openwrt/patch/20190409093046.13401-1-zajec5@gmail.com/

It should really get sorted out...


