This guide lists the maximum number of VPN tunnels (page 2).
>If one HA pair fails, whats the the convergence time to the Warm Spare?
Depends on the type of failure. Usually 10s to 30s.
Check out this DC to DC failover guide. It uses an active/active design. It uses a pair of MX in each DC (4 in total), but in your case you might only use two. Note that they MUST have a seperate and unique stub network each for this to work. You also need to use dynamic routing.
BGP is the most popular routing protocol for this kind of design.