MX - Low TTL L3 FQDN Support

RaphaelL
Kind of a big deal

Hi,

 

Has anyone ever encountered issues with an L3 firewall rule that contains an FQDN with a 'low' TTL (60s)?

 

Context: Users are reporting issues with apps.powerapps.com (Microsoft). The problems are more frequent during busy hours (when multiple users are trying to reach the website).

 

I took pcaps and I can see the DNS requests/responses, followed by a bunch of SYN retransmissions to ONE of the IPs returned in the DNS response. A few seconds or minutes later, it starts working and the SAME IP is allowed.

 

This looks like the behavior described by Phil in: https://community.meraki.com/t5/Security-SD-WAN/Meraki-MX-Firewall-with-FQDN/m-p/116312

 

Currently in Canada, this FQDN returns either 13.107.226.36 or 13.107.253.36.

 

The documentation about FQDN support is pretty limited, so my knowledge about it is also limited. It feels like the MX only caches the first IP, and when another client (Client B) does a DNS request, it overrides the original IP received by Client A.

I can't repro this issue in my lab since I'm the only user, but users in branches all have the issue.

 

 

URL support with SNI lookup would solve this issue, but it is not supported by MXs.

CptnCrnch
Kind of a big deal

To my knowledge, MX is working exactly like you're describing it: the IP is simply cached for a specific amount of time. Don't know if the TTL is recognized over there.

This is, by the way, (almost) exactly how ASA / Firepower work internally. These platforms will take the TTL into account though (and could "extend" it internally if needed).

RaphaelL
Kind of a big deal

To the best of your knowledge, does the MX 'cache' all the IPs snooped from the DNS response?

 

From my example: for apps.powerapps.com, would it cache all the IPs returned by that query or only the first one?

If the answer is all IPs, then it answers most of my questions about this issue and I will have to do more troubleshooting.

 

I also have a case open to get more info on how the MX should behave. 


Cheers,

CptnCrnch
Kind of a big deal

I don't know the answer for the MX, but for ASA / Firepower it would be: all IPs.

 

EDIT: This has been discussed in another thread too https://community.meraki.com/t5/Security-SD-WAN/Meraki-MX-Firewall-with-FQDN/m-p/116396/highlight/tr...

PhilipDAth
Kind of a big deal

>would it cache  all the IPs returned by that query

 

That is my understanding of how the MX works.

PhilipDAth
Kind of a big deal

>This is by the way (almost) exactly how ASA

 

Negative.  The MX intercepts DNS queries to learn the IP address of FQDNs.

 

ASA uses a scheduler and pre-emptively looks up the DNS entries used in every ACL, even if they are not used.

PhilipDAth
Kind of a big deal

Yes, I run into this problem all the time.  Typically you end up having to get a list of all the IP addresses that can be used for that service (which can be a lot), make a group up, and allow those.

RaphaelL
Kind of a big deal

Thank you guys for your input! Really appreciate it!

 

One of the drawbacks of using FQDN rules 😕 

RaphaelL
Kind of a big deal

Best example I could find this morning.

 

Took a packet capture on the MX LAN side (that includes ALL DNS queries of every client).

 

 

RaphaelL_0-1701282569884.png

 

Client A did a DNS query for apps.powerapps.com & content.powerapps.com. It received a DNS reply with 13.107.226.36 & 13.107.253.36 with a TTL of 5s. Client A sent a TCP SYN 2.57 secs later, and despite the TTL not having expired yet, the flow was denied. No other client did a DNS query for those domains in the same period.

 

TTL (5s) > elapsed time (2.57s), so the cached entry should still have been valid.

The FQDN is allowed in the rule:

 

RaphaelL_1-1701282730538.png

 

 

I can't explain it. It doesn't seem to behave like it should... At the moment this is working 25-50% of the time and users are getting frustrated. Two options: allow the IPs (from a CDN that can change at any moment...) or route this traffic through our proxies. Ticket already open.

 

More to come..

PhilipDAth
Kind of a big deal

Maybe there is a wider issue at play - a bug.  It sounds like it might be expiring the DNS results too quickly.

RaphaelL
Kind of a big deal

I will try to see if there is a pattern (e.g. when DNS replies contain a TTL under 5s, 10s and so on)!

RaphaelL
Kind of a big deal

@PhilipDAth I got it!!!

 

RaphaelL_0-1701286565187.png

(it's my home network, I don't care about blurring stuff)

 

RaphaelL_1-1701286603276.png

 

I'm only allowing 10.22.0.79 (my home PC) to talk to apps.powerapps.com. The first DNS packet shows a DNS query with a response including the IP 13.107.246.36. I did a cURL to that IP (3-way handshake works fine). I then did another DNS query for a DIFFERENT FQDN that resolves to the SAME IPs. The MX sees that it is the same IP but a different FQDN, then (probably) flushes the entry. Now I can't reach 13.107.246.36 at all until I do a new DNS query for the allowed domain.

 

TL;DR: if you have to allow an FQDN, make sure that it is not behind a CDN, or you could end up with other FQDNs (not always present in your rules) that resolve to the same IP blocking your own traffic.

 

Is that expected? No idea. But this is what is happening, and I'm 200% sure of it since I can repro it at home with ease.

PhilipDAth
Kind of a big deal

Wow!  Brilliant find!

 

You need to log this as a bug.  It will help a LOT of people out if you manage to get this fixed.  Let me know the case number once you have it opened.

PhilipDAth
Kind of a big deal

Do you have access to a DNS zone you could play with?

 

If so, create two DNS records pointing to the same IP address.  This would be very simple to replicate then.  This would also make it MUCH simpler for a developer to work on this bug.

RaphaelL
Kind of a big deal

Luckily for me apps.powerapps.com and content.powerapps.com are sharing the same IPs. 


Steps to replicate:

1- Allow apps.powerapps.com in the FW rule.

2- Do a DNS query for apps.powerapps.com that will be snooped by the MX (to a resolver in a different VLAN or on the WAN, e.g. 8.8.8.8).

3- Do a curl to the IP returned by step 2.

4- Do a DNS query for content.powerapps.com that will be snooped by the MX (just like step 2).

5- Do a curl to the IP returned by step 4 (which HAS to be the same as in step 2).

The curl from step 3 should get a TCP reset from Microsoft (so the MX let it through).

The curl from step 5 should be denied by the MX (timeout).

🙂 


Case updated. Waiting for a reply. Will share case ID after that 
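
For anyone who wants to script the steps above, here is a rough Python sketch. It is only an approximation of the manual test: the function names are made up for illustration, `socket.getaddrinfo` goes through the OS resolver (which may answer from a local cache, in which case the MX never sees the query), and a plain TCP connect stands in for curl. The resolver and connector are passed in as parameters so the logic can be tried without touching a real network.

```python
import socket


def resolve_ipv4(fqdn):
    """Resolve an FQDN to a sorted list of IPv4 addresses via the OS resolver.

    Assumption: the query actually leaves the host and crosses the MX.
    A local DNS cache would hide it from the firewall.
    """
    infos = socket.getaddrinfo(fqdn, 443, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def tcp_connects(ip, port=443, timeout=5.0):
    """Return True if a TCP handshake to ip:port completes within the timeout.

    Simplification: any socket error (timeout, refusal, unreachable) counts
    as a failure.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def reproduce(allowed_fqdn, other_fqdn, resolve=resolve_ipv4, connect=tcp_connects):
    """Run steps 2 to 5 and return (before, after) dicts of ip -> reachable."""
    ips = resolve(allowed_fqdn)                    # step 2: query the allowed FQDN
    before = {ip: connect(ip) for ip in ips}       # step 3: connect to those IPs
    other_ips = set(resolve(other_fqdn))           # step 4: query the other FQDN
    shared = [ip for ip in ips if ip in other_ips]
    after = {ip: connect(ip) for ip in shared}     # step 5: retry the shared IPs
    return before, after
```

Calling `reproduce("apps.powerapps.com", "content.powerapps.com")` from a client behind the MX should, if the behavior described in this thread holds, give `True` values in `before` and `False` values in `after` for the shared IPs.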

RaphaelL
Kind of a big deal

Bonus info as I'm still trying to reverse engineer how that thing is working. 

 

The "DNS snooping table" (no idea what other vendors call it) doesn't track the source IP of the DNS query. It is global to the MX and probably looks like this:

 

FQDN : IP : TTL (which decrements every second)

If a new entry matches the same IP, it gets overridden.

 

I just tested it: by doing DNS queries (from another PC) for content.powerapps.com (the domain that isn't included in my firewall rule), I was able to "block" any additional tests done by my main PC (the one that did the DNS query for apps.powerapps.com, the domain that is allowed).

 

So! I could spam DNS queries for content.powerapps.com and prevent anyone on my network from reaching apps.powerapps.com. Cool cool 😀

 

I get the security idea behind that behavior, but at the same time I don't think it adds much.

PhilipDAth
Kind of a big deal

I bet it is actually:
IP : FQDN : TTL

and is keyed on IP address.

 

You look up the FQDN, and then update the primary key, the IP address.  The IP address is normally considered unique, so it is just updating that primary key.
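
To make that guess concrete, here is a toy Python model of a snooping table keyed on IP address. This is purely a sketch of the hypothesis in this thread, not Meraki's actual implementation; the class and method names are invented, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    fqdn: str
    expires_at: float


class SnoopTable:
    """Toy DNS snooping table: one entry per IP, global to the firewall."""

    def __init__(self):
        self.by_ip = {}

    def observe(self, fqdn, ips, ttl, now):
        """Record a snooped DNS response.

        Because the table is keyed on IP, the newest FQDN replaces any
        older FQDN stored for the same IP, even if its TTL is still valid.
        """
        for ip in ips:
            self.by_ip[ip] = Entry(fqdn, now + ttl)

    def is_allowed(self, ip, allowed_fqdns, now):
        """Allow the flow only if the IP currently maps to an allowed FQDN."""
        entry = self.by_ip.get(ip)
        if entry is None or now >= entry.expires_at:
            return False
        return entry.fqdn in allowed_fqdns


def demo():
    table = SnoopTable()
    allowed = {"apps.powerapps.com"}
    ip = "13.107.246.36"
    results = []

    # Client A resolves the allowed FQDN, then connects: allowed.
    table.observe("apps.powerapps.com", [ip], ttl=60, now=0.0)
    results.append(table.is_allowed(ip, allowed, now=1.0))

    # Any client resolves a different FQDN on the same IP: entry is overridden.
    table.observe("content.powerapps.com", [ip], ttl=60, now=2.0)
    results.append(table.is_allowed(ip, allowed, now=3.0))

    # A fresh query for the allowed FQDN restores access.
    table.observe("apps.powerapps.com", [ip], ttl=60, now=4.0)
    results.append(table.is_allowed(ip, allowed, now=5.0))
    return results
```

In this model `demo()` returns `[True, False, True]`, which matches the allow / deny / allow pattern seen in the captures, and it also shows why the client that sent the overriding query doesn't matter: the table has no notion of who asked.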

RaphaelL
Kind of a big deal

Yup, you are right! It probably looks like that!

CptnCrnch
Kind of a big deal

This is brilliant, thank you so much for your effort to explore the MX further! 👍

RaphaelL
Kind of a big deal

Update 2023-12-04: did more digging with the NSE.

 

We were able to repro and take logs of the following:

 

If the MX has a cache entry of the form IP:FQDN:TTL and it snoops another DNS response with the same IP but a different FQDN, it will override the entry, despite the TTL still being valid. The NSE was able to export that table into a txt file (which is pretty cool to see).


We have no clue if that is expected. More to come in the next few days.

PhilipDAth
Kind of a big deal

What is NSE?

 

This can not be expected behavior.

RaphaelL
Kind of a big deal

Network Support Engineer. 

 

Well Phil, you know better than us, we can't ever be sure 😅

K2_Josh
Building a reputation

As a workaround, you could consider allowlisting the Azure Front Door IP addresses (and rely on DNS to mitigate the risk of DNS resolution to malicious domains in these IP ranges). This would require pulling the IPv4 ranges from the Azure IP ranges JSON file and getting the ranges from the name/id "AzureFrontDoor.Frontend".

 

Ideally, scripting would already exist to update firewall rules or group policy based on IP ranges, but at least the documentation for firewall rules says that "Multiple IPs or subnets can be entered comma-separated.", so it might be pretty quick to get this in place, at least for testing.
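
A minimal Python sketch of that extraction step, assuming the downloaded JSON has a top-level `values` list whose items carry `name`, `id` and `properties.addressPrefixes` (check this against the file you actually download). The function names and the sample data are illustrative only; the sample uses documentation ranges, not real Azure prefixes.

```python
import ipaddress
import json


def front_door_ipv4(service_tags, tag_name="AzureFrontDoor.Frontend"):
    """Return the IPv4 prefixes listed under the given service tag."""
    prefixes = []
    for value in service_tags.get("values", []):
        if value.get("name") == tag_name or value.get("id") == tag_name:
            for prefix in value.get("properties", {}).get("addressPrefixes", []):
                # Keep IPv4 only; the file mixes IPv4 and IPv6 prefixes.
                if ipaddress.ip_network(prefix, strict=False).version == 4:
                    prefixes.append(prefix)
    return prefixes


def as_rule_destination(prefixes):
    """Join prefixes into the comma-separated form the rule field accepts."""
    return ",".join(prefixes)


# Illustrative sample with the assumed layout (documentation ranges only).
SAMPLE = json.loads("""
{
  "values": [
    {
      "name": "AzureFrontDoor.Frontend",
      "id": "AzureFrontDoor.Frontend",
      "properties": {
        "addressPrefixes": ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]
      }
    },
    {
      "name": "SomeOtherTag",
      "id": "SomeOtherTag",
      "properties": {"addressPrefixes": ["203.0.113.0/24"]}
    }
  ]
}
""")

destination = as_rule_destination(front_door_ipv4(SAMPLE))
```

With the real file you would load it with `json.load(open(path))` instead of `SAMPLE` and paste `destination` into the rule for testing. The ranges change over time, so the list would need to be refreshed periodically.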

RaphaelL
Kind of a big deal

End of the case:

 

Hello Raphael,

As it stands, the behavior that we are experiencing is expected. This means that if the MX receives DNS responses for 2 different URLs that share the same IP address, the latest DNS entry will override the previous one; the previous entry will not re-populate the DNS table of the MX.

 

 

Conclusion:

 

If you are creating L3 firewall rules with FQDNs, be careful if those FQDNs resolve to IPs behind a CDN that could host many other services under the same IP; you could end up with weird issues like the ones described in my post.
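
One way to check whether you are exposed is to go through DNS answers seen in a packet capture and flag IPs that are shared between an FQDN in your rules and an FQDN that is not. The sketch below assumes you have already extracted (FQDN, IP) pairs from the capture by whatever means you prefer; `find_collisions` is a made-up helper name, and the sample pairs are just the ones from this thread plus a documentation address.

```python
from collections import defaultdict


def find_collisions(observations, allowed_fqdns):
    """Map each shared IP to the non-allowed FQDNs that also resolve to it.

    observations: iterable of (fqdn, ip) pairs taken from DNS answers.
    allowed_fqdns: FQDNs present in the L3 firewall rules.
    """
    fqdns_by_ip = defaultdict(set)
    for fqdn, ip in observations:
        # Normalize case and strip a trailing root dot before comparing.
        fqdns_by_ip[ip].add(fqdn.lower().rstrip("."))
    allowed = {name.lower().rstrip(".") for name in allowed_fqdns}

    collisions = {}
    for ip, names in fqdns_by_ip.items():
        # An IP is at risk when both allowed and non-allowed names use it.
        if names & allowed and names - allowed:
            collisions[ip] = sorted(names - allowed)
    return collisions


observed = [
    ("apps.powerapps.com", "13.107.226.36"),
    ("content.powerapps.com", "13.107.226.36"),
    ("example.org", "192.0.2.1"),
]
at_risk = find_collisions(observed, ["apps.powerapps.com"])
```

Any IP that shows up in `at_risk` is one where a DNS query for the listed FQDNs could override the entry for your allowed FQDN.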

CptnCrnch
Kind of a big deal

Thanks for the update! Great catch!

PhilipDAth
Kind of a big deal

That is not good.

RaphaelL
Kind of a big deal

Agreed. I wasn't able to confirm whether other vendors behave like that.
