How do you find the bottleneck of a network?

wop@infosec.pub · edit-2 1 year ago

How do you find the bottleneck of a network?

MSgtRedFox@infosec.pub · edit-2 10 months ago

Does fortigate not have a form of DMVPN like Cisco?

Just curious why ISP/third party MPLS? Purely interest.

Also, did you find this purely from user complaining or have monitoring tool?

I’m assuming using third party was supposed to offload the work/config from you?

wop@infosec.pub · 10 months ago

> Does fortigate not have a form of DMVPN like Cisco?

ADVPN (Auto-discovery VPN) seems to be the equivalent. https://docs.fortinet.com/document/fortimanager/7.2.0/single-datacenter-for-enterprise/282533/advpn

> Just curious why ISP/third party MPLS? Purely interest.

I guess it was easier at some point? That was way before my time there. But we are going to replace the MPLS part with simple internet breakout points on location and the rest with SD-WAN.

> Also, did you find this purely from user complaining or have monitoring tool?

Purely from users complaining and other departments getting frustrated about why their stuff was not working (e.g. Citrix). The new FW had to be installed in a short time and ‘everything’ worked fine at first. Problems only occurred after some load was put on the network. We failed - as in network dep - by NOT doing a stress/limit test of the network and finding this problem immediately, and NOT implementing some kind of monitoring that would have notified us of all those lost packets and connections. We caught up, but we should have done it in the first place, because it is necessary.

> I’m assuming using third party was supposed to offload the work/config from you?

Do you mean the ISP/MPLS provider? - If so, not really.
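
As a rough illustration of the "some kind of monitoring" lesson above, here is a minimal Python sketch that reads the summary line of a `ping` run and decides whether to raise an alert. The function names and the 1% threshold are hypothetical, and it assumes ICMP is allowed between the test hosts and that `ping` prints the usual "% packet loss" summary. A real setup would more likely use the firewall's own counters or an existing monitoring system.

```python
import re

# Matches the summary line printed by common ping implementations,
# e.g. "10 packets transmitted, 9 received, 10% packet loss, time 9012ms".
LOSS_RE = re.compile(r"([\d.]+)% packet loss")


def parse_ping_loss(output):
    """Return the packet loss percentage from ping output, or None if absent."""
    match = LOSS_RE.search(output)
    if match is None:
        return None
    return float(match.group(1))


def should_alert(loss_percent, threshold=1.0):
    """Alert when loss reaches the (hypothetical) threshold or could not be read."""
    return loss_percent is None or loss_percent >= threshold
```

Feeding it the captured output of something like `ping -c 10 <host>` on a schedule would at least flag sustained loss between the sites.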

MSgtRedFox@infosec.pub · 10 months ago

Good lessons learned here. Thanks for sharing.

wop@infosec.pub · 1 year ago

Ping - Update 2 @Avian_Carrier@infosec.pub @jharrison@infosec.pub @SgtKetchup@infosec.pub

I hope it is ok to ping you.

wop@infosec.pub · edit-2 1 year ago

Ping - Update 2 @Avian_Carrier@infosec.pub @jharrison@infosec.pub @SgtKetchup@infosec.pub

Ping - Update 3 @Avian_Carrier@infosec.pub @jharrison@infosec.pub @SgtKetchup@infosec.pub

Avian_Carrier@infosec.pub · 1 year ago

Figured you’d discover something like this. This behavior is usually caused by MTU/duplicate IP.

Good job sir :)

Hope to see a final update from you.

wop@infosec.pub · 1 year ago

Yeah, after more testing, we can say that the second IPsec tunnel was the issue. We reworked the route over a single tunnel and the full 100 Mbps is available again. Users are happy, I am happy, even though it was a little frustrating.

Thank you for your input!

Avian_Carrier@infosec.pub · 1 year ago

Good fucking job. Celebrate this weekend.

Avian_Carrier@infosec.pub · edit-2 1 year ago

Thank you for the ping and the update!

Looks like you’re on the right path to chasing the gremlins out. I’m glad iperf3 was helpful to you. It has helped me out tremendously many times.

For the record, you can always ping me anytime. I’m here to help and Lemmy notifications don’t work half the time. But direct mentions always work.

Please keep me in the loop with further updates. At this time, nothing further to add from me. You’re doing the right things.

wop@infosec.pub · 1 year ago

Yeah, notifications are really unreliable here. I’ve got another window for more stress tests today. I’ll post an update later today or tomorrow. The focus is on MTU/MSS.

Avian_Carrier@infosec.pub · edit-2 1 year ago

@wop@infosec.pub Apologies for the delay. I’ve been very tired lately. I’m most likely going to repeat some of the things others have mentioned and what you’ve already noted, but this would be my troubleshooting process. (NOTE: all tests should be run on the endpoints, not on the network infra.)

1. Traceroute from UK -> Germany and Germany -> UK. Look for latency spikes. The reason I say do both directions is that sometimes there are weird pathing issues that only show up in the opposite direction.
2. iperf3 from UK -> Germany and Germany -> UK.
   2a. Clear counters on switches/routers/firewalls.
   2b. During an extended iperf3 test, look for interface errors and CPU usage on the devices in the path.
   2c. This is tedious and will take time, but you’re dealing with gremlins.
3. tcpdump on both sides during a transaction. Check for retransmits and window scaling problems. Most likely not the endpoints, but something to rule out.
4. Monitor the FortiGate logs during all of this.
5. Set up test boxes in the UK and Germany that are exempt from the IPsec tunnels and test throughput again (if this is good, it is a clear indicator that the firewalls are fucked).
6. If all else fails, open a TAC case with Fortinet.
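
The iperf3 step above can be sketched in Python as follows. This is only an illustration under stated assumptions: an `iperf3 -s` server is already running on the far side, the client is on the PATH, and the test is TCP (the `retransmits` field is only reported for TCP on platforms that expose it). The helper names are made up for this example. The flags used are `-c` (client), `-t` (duration), `-J` (JSON output) and `-R` (reverse direction).

```python
import json
import subprocess


def run_iperf3(server, seconds=30, reverse=False):
    """Run an iperf3 TCP test against `server` and return the parsed JSON report."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")  # reverse mode: the server sends, the client receives
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def summarize(report):
    """Pull throughput (Mbit/s) and TCP retransmits out of an iperf3 JSON report."""
    end = report["end"]
    return {
        "sent_mbps": round(end["sum_sent"]["bits_per_second"] / 1e6, 1),
        "received_mbps": round(end["sum_received"]["bits_per_second"] / 1e6, 1),
        "retransmits": end["sum_sent"].get("retransmits"),
    }
```

Calling `summarize(run_iperf3(host))` and `summarize(run_iperf3(host, reverse=True))` from each site gives comparable numbers per direction. A large gap between the two directions, or a high retransmit count, is the kind of asymmetry the steps above are meant to surface.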

wop@infosec.pub · 1 year ago

No worries, thank you for your input!

What logging/debugging would you activate for that case? I’m not too familiar with FortiGate yet and would appreciate some tips, if you are familiar with it.
The IPsec tunnel is the only connection between these locations, so that is rather difficult. But I get what you mean and will check whether there is another option.

Good points!

Avian_Carrier@infosec.pub · 1 year ago

Not sure on the logging. I’m a data center guy and would rather see firewalls in the trash lol. They usually just cause problems.

For the WAN, surely there is some way you can reach those sites over the general internet. You have ISP connections.

Are you sharing BGP to the ISP? Maybe make a couple of 1:1 NATs with test boxes not in prod so that you can quickly test pathing outside of the tunnel.

wop@infosec.pub · 1 year ago

> Not sure on the logging. I’m a data center guy and would rather see firewalls in the trash lol. They usually just cause problems.

Haha - I’d like to disagree, but you are right.

> For the WAN, surely there is some way you can reach those sites over the general internet. You have ISP connections.

I for sure could do it, but it is not that easy to expose a server to the internet. There would be multiple departments involved and I need to get permission. And yeah, even with IP whitelisting. I guess that will be my last resort.

Still waiting for the test clients. Probably going to shift some hours into the weekend so I don’t disturb daily business.

SgtKetchup@infosec.pub · 1 year ago

Might be too simple but does a traceroute show pretty standard latency all the way down the line?

wop@infosec.pub · 1 year ago

I am certain that we block ICMP on multiple firewalls in between. I could allow it temporarily and check. Good suggestion.

jharrison@infosec.pub · 1 year ago

Blocking ICMP entirely is a recipe for weird stuff happening. There’s some ICMP worth blocking - redirects, etc - but turning it off entirely A) makes debugging stuff a nightmare and B) can break some things entirely e.g. MTU probing.
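
To make the MTU probing point concrete, here is a hedged Python sketch of a manual path MTU search. It binary-searches the largest ping payload that gets through with the Don't Fragment bit set, then adds 28 bytes (20-byte IPv4 header plus 8-byte ICMP header). The `ping_df` helper assumes Linux iputils `ping` syntax (`-M do`), and the function names and search bounds are chosen just for this example. It only works if ICMP echo is allowed end to end, which is exactly why blanket ICMP blocking hurts.

```python
import subprocess

IPV4_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def ping_df(host, payload):
    """Send one ping with the Don't Fragment bit set (Linux iputils syntax)."""
    cmd = ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host]
    return subprocess.run(cmd, capture_output=True).returncode == 0


def find_path_mtu(probe, low=548, high=1472):
    """Binary-search the largest payload for which probe(payload) succeeds.

    Returns the corresponding path MTU, or None if even `low` fails
    (for example because ICMP is blocked somewhere on the path).
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low + IPV4_ICMP_OVERHEAD
```

For a real run you would pass something like `lambda size: ping_df("remote-host", size)` as the probe, where `remote-host` is a placeholder. A result well below 1500 across a tunnel is expected because of encapsulation overhead. The TCP MSS for IPv4 would then be that MTU minus 40 bytes (20 for IPv4, 20 for TCP without options).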

wop@infosec.pub · 1 year ago

You are right. Still an active policy that we have to work on.