I'm not so familiar with Azure; most of my time is spent in Amazon AWS.
In Amazon I would deploy your redundant servers into two different availability zones (effectively you can think of this as just two different subnets). I would use a vMX to service the first subnet, and a second vMX to service the second subnet.
For example, if it was Exchange you could put a client access server (CAS) into each subnet. If one vMX failed, the clients would fail over to the second unit and the second CAS.
In this case, both vMXs would be active/active. So if you spread the load evenly across them, you could end up with a combined throughput of up to a gigabit per second.
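To make the active/active idea concrete, here is a rough sketch of how the two zones behave when both vMXs are up and when one fails. The zone and unit names are made up for illustration, and the 500 Mbit/s per vMX figure is only an assumption that matches the "gigabit across two units" estimate above; check the real rating for your vMX size.

```python
# Assumption: each vMX handles roughly 500 Mbit/s, so two give about 1 Gbit/s.
PER_VMX_MBPS = 500

# Hypothetical layout: one vMX and one CAS per availability zone (subnet).
ZONES = {
    "zone-a": {"vmx": "vmx-a", "cas": "cas-a"},
    "zone-b": {"vmx": "vmx-b", "cas": "cas-b"},
}


def active_zones(vmx_up):
    """Return the zones whose vMX is reported healthy, in a stable order."""
    return [
        zone
        for zone, units in sorted(ZONES.items())
        if vmx_up.get(units["vmx"], False)
    ]


def combined_throughput_mbps(vmx_up):
    """Upper bound on throughput if load is spread evenly over healthy vMXs."""
    return len(active_zones(vmx_up)) * PER_VMX_MBPS


if __name__ == "__main__":
    both_up = {"vmx-a": True, "vmx-b": True}
    a_failed = {"vmx-a": False, "vmx-b": True}
    print(active_zones(both_up), combined_throughput_mbps(both_up))
    print(active_zones(a_failed), combined_throughput_mbps(a_failed))
```

With both units healthy, traffic can use either zone; if vmx-a fails, only zone-b (and its CAS) remains and the ceiling drops to what a single vMX can carry.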
Also, if you really are pumping such large volumes of data between your on-premises DCs and Azure, you should consider getting an Azure ExpressRoute connection instead, and leave the vMX(s) for the branches to use.