vMX HA on AWS - Issue on Failover

Solved
spendot

Scenario:

 

- Set up 2 vMX instances on AWS in 2 Availability Zones.

- Both vMXs show status "on", and AWS is working too.

- Able to ping (8.0.1.x & 172.19.4.x) when vMX-1A is up.

- Tested failover: when vMX-1A is down, traffic swings over to vMX-1B.

- After failover, the PC (172.19.4.x) and the AWS instance (8.0.1.x) are not able to ping each other from the console.

However, from the dashboard (vMX-1B) I can ping both the PC and the instance, but the PC cannot access the AWS instances.

 

I followed the guide at https://aws-quickstart.github.io/quickstart-cisco-meraki-sd-wan-vmx/

 

Not sure if it's because of the routing table in AWS.

Can any expert guide or advise?
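
One way to check the routing-table suspicion is to list, for each subnet, which route table it is currently associated with and where each route in that table points, once before and once after failover. The sketch below is illustrative only: `summarize_routes` is a hypothetical helper, and it assumes its input is the `RouteTables` list returned by boto3's EC2 `describe_route_tables` call.

```python
# Hypothetical helper: summarize which route table each subnet uses and
# where each route in that table points. The input is assumed to be the
# "RouteTables" list from boto3's EC2 client describe_route_tables().

# Keys that can hold a route's target in a describe_route_tables() entry.
TARGET_KEYS = ("InstanceId", "NetworkInterfaceId", "GatewayId",
               "NatGatewayId", "TransitGatewayId")


def summarize_routes(route_tables):
    """Return {subnet_id: (route_table_id, [(destination, target), ...])}."""
    summary = {}
    for table in route_tables:
        routes = []
        for route in table.get("Routes", []):
            destination = route.get("DestinationCidrBlock", "?")
            # Take the first target key present on this route entry.
            target = next((route[k] for k in TARGET_KEYS if k in route), "?")
            routes.append((destination, target))
        for assoc in table.get("Associations", []):
            subnet_id = assoc.get("SubnetId")
            if subnet_id:  # the main-table association carries no SubnetId
                summary[subnet_id] = (table["RouteTableId"], routes)
    return summary
```

With a real account, the input would come from something like `boto3.client("ec2", region_name=...).describe_route_tables()["RouteTables"]`. If the route for the remote network still targets vMX-1A's instance after failover, the routing table is the likely culprit.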

1 Accepted Solution
MyHomeNWLab

Did you deploy from CloudFormation?

 

When deployed from CloudFormation, a Lambda script for failover is also deployed.

If you do not use CloudFormation, you will need to configure the appropriate settings yourself.

 

[Lambda script]
https://github.com/aws-quickstart/quickstart-cisco-meraki-sd-wan-vmx/blob/main/functions/source/lamb...

 

This script is deployed by the CloudFormation template (AWS Quick Start for Meraki vMX).

PhilipDAth

You need to check the Lambda functions and CloudWatch events to see what is going on.

 

Long ago, before AWS and Meraki published their guides, I wrote my own. You might find it of interest:

https://www.ifm.net.nz/cookbooks/meraki-ha-vmx-amazon-aws.html 
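
To make the Lambda's CloudWatch logs more useful for this kind of check, the function can log why it chose a branch, not just which branch it chose. A minimal sketch, assuming a failover Lambda that decides from the EC2 instance state and a CloudWatch alarm state; `explain_decision` is a hypothetical helper, not part of any published guide.

```python
# Hypothetical helper: turn the two health inputs into a decision plus a
# log line, so CloudWatch Logs show why a failover did or did not happen.

def explain_decision(instance_state, alarm_state):
    """Return (primary_is_up, message) for the primary vMX.

    instance_state: EC2 state name, e.g. "running" or "stopped".
    alarm_state: CloudWatch alarm state, one of "OK", "ALARM",
                 "INSUFFICIENT_DATA".
    """
    primary_is_up = instance_state == "running" and alarm_state == "OK"
    message = (
        f"primary {'UP' if primary_is_up else 'DOWN'}: "
        f"instance_state={instance_state} alarm_state={alarm_state}"
    )
    return primary_is_up, message


# "INSUFFICIENT_DATA" counts as DOWN here, exactly like "ALARM".
up, msg = explain_decision("running", "INSUFFICIENT_DATA")
print(msg)
```

Logging both inputs makes it visible in CloudWatch Logs whether the function ran at all during the test, and whether it saw the primary as down when vMX-1A was stopped.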

 

 

 

spendot

We did follow the guide as per your link, but somehow it still doesn't work.

Below is our Lambda script:

 

import boto3
import json

# Change this to be the same as the region you are using
region='ap-southeast-1'

ec2 = boto3.resource('ec2',region_name=region)
cloudwatch = boto3.resource('cloudwatch',region_name=region)
client = boto3.client('ec2',region_name=region)

def change_subnet_routetable(subnetID,rtID):
  response = client.describe_route_tables(
      Filters=[{'Name': 'association.subnet-id','Values': [subnetID]}]
  )
  rtaID=response['RouteTables'][0]['Associations'][0]['RouteTableAssociationId']
  ec2.RouteTableAssociation(rtaID).replace_subnet(RouteTableId=rtID)

def lambda_handler(event, context):
  # Change this to be the instance ID of VMX-1A
  vmx1a = ec2.Instance('i-0a1afe42842357215')
  # Create detailed monitoring for vmx1a in 1 minute interval, VMX-1A-STATUS-CHECK is the name of the cloudwatch alarm config of vMX-1A
  vmx1aalarm = cloudwatch.Alarm('VMX-1A-STATUS-CHECK')
  # Change this to be the instance ID of VMX-1B - If setup no load balancing then no need check vMX-1B
  #vmx1b = ec2.Instance('i-05d0aa2e92b6bf890')
  # Create detailed monitoring for vmx1b in 1 minute interval
  #vmx1balarm = cloudwatch.Alarm('VMX-1B-STATUS-CHECK')
  if vmx1a.state['Name']=="running" and vmx1aalarm.state_value=='OK':
    print("VMX1A UP")
    # Add one of these lines for every subnet you have.  Change "rtb" to be the ID for the route table VMX-1A-UP
    # DEV private subnets
    change_subnet_routetable('subnet-0f06b39deb3a6a0bc','rtb-02c9f6a6f60bd36e4')
    change_subnet_routetable('subnet-08104babe83ad85d3','rtb-02c9f6a6f60bd36e4')
    change_subnet_routetable('subnet-00d984b9423f807db','rtb-02c9f6a6f60bd36e4')
    # DEV public subnets
    change_subnet_routetable('subnet-0783ca3c2a9a1fa87','rtb-070257ae8030b52c0')
    change_subnet_routetable('subnet-01a0bd7312b9bd17f','rtb-070257ae8030b52c0')
    change_subnet_routetable('subnet-0af94494946901b21','rtb-070257ae8030b52c0')
  else:
    print("VMX1A DOWN")
    # Add one of these lines for every subnet you have.  Change "rtb" to be the ID for the route table VMX-1A-DOWN
    change_subnet_routetable('subnet-0f06b39deb3a6a0bc','rtb-0a1c0956298d9af66')
    change_subnet_routetable('subnet-08104babe83ad85d3','rtb-0a1c0956298d9af66')
    change_subnet_routetable('subnet-00d984b9423f807db','rtb-0a1c0956298d9af66')
    # DEV public subnets
    change_subnet_routetable('subnet-0783ca3c2a9a1fa87','rtb-0ee718c05fe7d2f47')
    change_subnet_routetable('subnet-01a0bd7312b9bd17f','rtb-0ee718c05fe7d2f47')
    change_subnet_routetable('subnet-0af94494946901b21','rtb-0ee718c05fe7d2f47')

 

Not sure if this helps.
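
One thing that may be worth checking in a script like this: as far as I know, `describe_route_tables` returns each matching route table together with all of its associations, so `Associations[0]` is not necessarily the association for the subnet named in the filter when several subnets share one table. The result is also empty when a subnet has no explicit association and only uses the VPC's main route table. The following is a sketch of a more defensive lookup, not a confirmed fix for this thread; `find_association_id` and `move_subnet` are hypothetical names, and `client` is assumed to be a boto3 EC2 client.

```python
def find_association_id(route_tables, subnet_id):
    """Return the RouteTableAssociationId belonging to subnet_id, or None."""
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId") == subnet_id:
                return assoc["RouteTableAssociationId"]
    return None


def move_subnet(client, subnet_id, route_table_id):
    """Associate subnet_id with route_table_id; return the new association ID."""
    response = client.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    assoc_id = find_association_id(response["RouteTables"], subnet_id)
    if assoc_id is None:
        # No explicit association: the subnet follows the main route table.
        raise LookupError(f"no explicit route table association for {subnet_id}")
    result = client.replace_route_table_association(
        AssociationId=assoc_id, RouteTableId=route_table_id
    )
    return result["NewAssociationId"]
```

Matching on `SubnetId` means each call moves the subnet it was asked to move, and a missing association surfaces as a clear error in the Lambda log instead of an `IndexError`.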

spendot

No, I have not deployed with CloudFormation yet.

Is it a must to have this module in place, or do I need to edit the script above so that the failover can be done?

MyHomeNWLab

Deploying from CloudFormation first makes things easier to understand.

If you deploy manually, you will need to work out the necessary settings from the CloudFormation template.
I have deployed manually, but it took a long time to organize the configuration information.

 


> Is it a must to have this module in place, or do I need to edit the script above so that the failover can be done?

 

As long as the routing table gets switched, it does not matter which script is used.
I recommend choosing one that is easy to operate.

The AWS Quick Start script is useful.
It uses the Meraki Dashboard API to automatically reflect the routes learned by the Meraki vMX.
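
To illustrate the point that the switching mechanism is a free choice, a route's target can also be repointed inside a single route table instead of swapping subnet associations. This is a sketch only and is not the Quick Start script; `repoint_routes` is a hypothetical helper, and `client` is assumed to be a boto3 EC2 client.

```python
def repoint_routes(client, route_table_id, cidrs, instance_id):
    """Point each destination CIDR in one route table at instance_id.

    Uses EC2 replace_route, which only works for routes that already
    exist in the table. Returns the number of routes updated.
    """
    updated = 0
    for cidr in cidrs:
        client.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=cidr,
            InstanceId=instance_id,
        )
        updated += 1
    return updated
```

On failover this would be called with the route table ID, the list of remote-site CIDRs, and the instance ID of vMX-1B. The CIDR list has to be maintained by hand in this variant, which is the part the Quick Start script automates through the Dashboard API.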
