CloudCoreRouter And RouterOS V6.x Tips And Tricks!

CloudCoreRouter and RouterOS v6.x
Tips and tricks!
Moscow, MUM Russia 2014

RouterOS v6 Tile architecture
First 64-bit RouterOS
Multi Memory Channel support (faster RAM)
Hardware Accelerated Multi-Threading (no RPS and IRQ needed)
Hardware Accelerated Encryption

Yes, still: Packet Flow Diagram (page 3)

Multi-Core Packet Processing
On receive, each packet gets assigned to a CPU core
RouterOS tries to keep packets from the same connection assigned to the same CPU core
If a CPU core is overloaded, packets of a single connection will be distributed between cores
Re-assigning a packet from one CPU core to another is a very “expensive” process
Processing a packet on each separate CPU core might take a different amount of time, so packet order might change during processing
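
The idea of keeping one connection on one core can be sketched as follows. This is only an illustration: the slides do not say how RouterOS picks a core, so the CRC32 hash over the connection 5-tuple, the function name pick_core and the default of 36 cores are all assumptions made for the example.

```python
import zlib

def pick_core(src_ip, dst_ip, proto, src_port, dst_port, num_cores=36):
    """Map a connection 5-tuple to a core index (hypothetical scheme,
    not the actual RouterOS algorithm)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    # The same 5-tuple always produces the same hash, hence the same core.
    return zlib.crc32(key) % num_cores

conn_a = ("10.0.0.1", "192.0.2.10", "tcp", 40000, 80)
conn_b = ("10.0.0.2", "192.0.2.10", "tcp", 40001, 80)

# Every packet of conn_a lands on one core, so its packets stay in order.
print(pick_core(*conn_a) == pick_core(*conn_a))  # True
```

With a scheme like this, ordering is preserved within a connection as long as no packets are moved to another core, which is the case the slide describes as expensive and prone to reordering.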

Fast Path
Fast Path allows packets to be forwarded without additional processing in the Linux kernel. It improves forwarding speeds significantly.
Fast Path requirements:
– Fast Path should be allowed in configuration
– Interface driver must have support
– Specific configuration conditions
Currently RouterOS has Fast Path handlers for: IPv4 routing, traffic generator, MPLS, bridge
More handlers will be added in the future

New Throughput test results

Throughput in millions of pps

Traffic Generator Tool
Traffic Generator is an evolution of the bandwidth-test tool
Traffic Generator can:
– Determine transfer rates, packet loss
– Detect out-of-order packets
– Collect latency and jitter values
– Inject and replay *.pcap files
– Working on TCP protocol emulation
“Quick” mode
Full Winbox support (coming soon)

Queuing Changes
Packets can be placed in a queue by any number of CPU cores, but are processed and taken out of the queue only by a single CPU core
In RouterOS v5.x there were several different places in a packet's “life-cycle” where it could be queued
In RouterOS v6.x the QoS system was redesigned so that queuing always happens in the same place relative to other processes in the router
Now all queuing happens at the very end of the packet's “life-cycle” in the router

HTB in RouterOS v5

HTB in RouterOS v6

Simple Queues
Matching algorithm has been updated:
– based on hash
– faster mismatches
At least 32 top-level queues are necessary to fully utilize the CCR1036's potential (9x faster than a single queue)

Queue Tree and CCR
Currently (RouterOS v6.11) only one CPU core can take packets out of one HTB tree
We are working on a possible update of the HTB algorithm, or on introducing a completely new method instead of HTB
Suggestions:
– Use Interface HTB as much as possible to offload traffic from HTB “global”
– Use simple queues

PPTP, L2TP and PPPoE on CCR
Changes introduced in v6.8:
– kernel drivers for ppp, pppoe, pptp, l2tp are now lock-less on transmit & receive
– all ppp packets (except discovery packets) can now be handled by multiple cores
– MPPE driver can now handle up to 256 out-of-order packets (previously even a single out-of-order packet was dropped)
– roughly doubled MPPE driver encryption performance

Single PPTP Tunnel Performance on CCR1036
in packets per second with 0.01% loss tolerance

Single L2TP Tunnel Performance on CCR1036
in packets per second with 0.01% loss tolerance

Single PPPoE Tunnel Performance on CCR1036
in packets per second with 0.01% loss tolerance

CCR and Packet Fragments
Currently (in RouterOS v6.11) Connection Tracking requires packets to be re-assembled before further processing
It is impossible to ensure that all fragments of a packet are received by the same CPU core
The process that stores and waits for fragments to re-assemble nullifies all multi-core benefits
We plan to:
– Add full support for Path MTU Discovery to all tunnels and interfaces
– Update Connection Tracking to handle fragments

Firewall Efficiency
Each firewall rule in RouterOS takes a dedicated place in system memory (RAM)
The CPU needs to process a packet through every rule that the packet passes before it is captured by a rule
Reducing the average number of rules that a packet needs to pass before it is captured can significantly improve your firewall performance
Make use of action jump
Simplify rules
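
The effect of action jump on the average number of rules checked can be estimated with a small model. This is a rough sketch under stated assumptions, not a measurement: traffic is assumed to be spread evenly over all rules, every rule is assumed to cost the same, and the function names avg_flat and avg_with_jumps are made up for the example.

```python
def avg_flat(n_rules):
    """Average rules checked in one flat chain: a packet captured by
    rule i has been compared against i rules."""
    return sum(range(1, n_rules + 1)) / n_rules

def avg_with_jumps(n_groups, rules_per_group):
    """Average rules checked when the parent chain holds only jump rules,
    each leading to a sub-chain of rules_per_group rules."""
    total = 0
    for g in range(1, n_groups + 1):             # jump rules checked in the parent chain
        for r in range(1, rules_per_group + 1):  # rules checked inside the sub-chain
            total += g + r
    return total / (n_groups * rules_per_group)

print(avg_flat(100))           # 50.5
print(avg_with_jumps(10, 10))  # 11.0
```

Under these assumptions, splitting 100 rules into 10 sub-chains of 10 cuts the average from about 50 checks per packet to 11, at the cost of 10 extra jump rules.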

Changes in the Firewall
Firewall now has “all-ppp” among the possibilities in interface matching
Only 2 dynamic “change-mss” mangle rules are created for “all-ppp” interfaces
New mangle actions “sniff-tzsp” and “sniff-pc” send the packet stream to a remote sniffer

Layer-7
Layer-7 is the most “expensive” firewall option: it takes a lot of memory and processing power to match each connection against a regexp string
Layer-7 should be used only on traffic that can't be identified in any other way
Layer-7 should be used only as a trigger: use connection-mark or address-list to keep track of related packets or connections
Do not use a direct action (like accept, drop) in a Layer-7 rule
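
The “trigger” pattern can be illustrated like this. It is a simplified model and not RouterOS code: the class Layer7Trigger, the sample pattern and the connection id are hypothetical, and a plain set stands in for connection-mark or address-list.

```python
import re

class Layer7Trigger:
    """Run the expensive regexp only until a connection is marked."""

    def __init__(self, pattern):
        self.pattern = re.compile(pattern)
        self.marked = set()   # stand-in for connection-mark / address-list
        self.regex_runs = 0   # counts how often the regexp was evaluated

    def handle(self, conn_id, payload):
        if conn_id in self.marked:
            return "marked"   # cheap lookup, no regexp needed
        self.regex_runs += 1
        if self.pattern.search(payload):
            self.marked.add(conn_id)
            return "marked"
        return "unmarked"

trigger = Layer7Trigger(rb"^GET /video")  # hypothetical pattern
packets = [b"GET /video/1 HTTP/1.1", b"\x00\x01binary", b"\x02\x03binary"]
results = [trigger.handle("conn-1", p) for p in packets]
print(results, trigger.regex_runs)  # ['marked', 'marked', 'marked'] 1
```

Only the first packet pays for the regexp match. The later packets could never have matched the pattern on their own, which is why the final action belongs on the mark rather than on the Layer-7 rule itself.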

Routing and CCR
Packet routing can utilize all cores
All dynamic routing protocols (more precisely, routing table updates and protocol calculations) in RouterOS v6.x are limited to a single core:
– One BGP full feed will take 1-3 min to load on CCR
– Two BGP full feeds will take 6 min to load on CCR
Try to avoid configurations that continuously update the routing table
All routing protocols will be updated to multi-core for RouterOS v7

IPSec and CCR
Hardware acceleration support for aes-cbc, md5, sha1, sha256 Authenticated Encryption with Associated Data (AEAD) was added on CCR in RouterOS v6.8
Now CCR1036 can handle 3.2 Gbps of encrypted IPSec traffic:
– Maintaining 80% CPU load
– No fragmentation (1470-byte packets)
– Many peers (100 separate tunnels)
– AES128 was used
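
For comparison with the packets-per-second figures on the other slides, the IPSec result can be converted to a packet rate. This is a back-of-the-envelope sketch: it assumes the 3.2 Gbps figure counts the 1470-byte packets themselves with no extra per-packet overhead, and the function name packets_per_second is made up for the example.

```python
def packets_per_second(throughput_bps, packet_bytes):
    """Convert a bit rate into a packet rate for fixed-size packets."""
    return throughput_bps / (packet_bytes * 8)

pps = packets_per_second(3.2e9, 1470)
print(round(pps))  # 272109
```

That is roughly 0.27 million packets per second in total, or about 2,700 per tunnel if the load is spread evenly over the 100 tunnels.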

Tools
/system resource cpu
/tool profile

Partitions
Partitions will always allow you to keep one working copy of RouterOS just one reboot away, and to back up configuration before major changes

Questions!!!
