[ixpmanager] Order of customers' config in the RS configuration of the IXP manager
Barry O'Donovan
barry.odonovan at inex.ie
Mon Dec 15 14:43:06 GMT 2014
Hi Andreas,
First of all, apologies for the delay in replying. You've sent a few
mails I've wanted to reply to - including to the Euro-IX Bird list - but
I've just been swamped.
Starting with the end of your email:
> Before filing a feature request, I wonder how other IXP-manager users
> handle this: Do you use a different procedure to update the route
> servers? Do humans need to review the changes before getting applied
> (or at least they get notified)? Do you solve the problem in a
> different approach?
At INEX we've been running route servers for >7 years and have been
auto-generating them from the very beginning. We've never had any issues.
Human review is not necessary, nor do we /require/ emails of changes. Our
Quagga-based route server is on our RANCID configuration management
system, which does email us. Bird is not (but see below re backups).
All changes are deterministic (* - see below also!) - if we haven't
enabled / disabled / changed a vlan interface for the route server, then
the configuration doesn't change. It's also important to say that there
are never 'human' changes to the generated script. The deterministic
nature of the configuration is the basis for the Travis-CI tests also
(more below).
Any changes in the configuration go via IXP Manager templates (with
testing including Travis CI tests for known good / expected against
generated). This is all documented in the wiki - http://git.io/Evv2MQ
and http://git.io/GrAnqg.
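As a rough illustration of the known good / expected against generated idea, here is a minimal Python sketch. It is not the actual Travis CI test from IXP Manager; the function name config_diff and the sample configuration text are hypothetical and only show the principle that a deterministic generator should yield an empty diff against a stored fixture.

```python
import difflib


def config_diff(expected: str, generated: str) -> list:
    """Return unified-diff lines; an empty list means the two configs match."""
    return list(difflib.unified_diff(
        expected.splitlines(),
        generated.splitlines(),
        fromfile="known-good",
        tofile="generated",
        lineterm="",
    ))


# Hypothetical fixture and generator output (documentation ASN and address).
known_good = "protocol bgp pb_as64496 {\n  neighbor 192.0.2.1 as 64496;\n}\n"
generated_same = known_good
generated_changed = known_good.replace("192.0.2.1", "192.0.2.2")

if __name__ == "__main__":
    # A non-empty diff would fail the test and show exactly what changed.
    for line in config_diff(known_good, generated_changed):
        print(line)
```

The same diff output could equally be mailed to the admins if change notifications are wanted.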
That's not to say we have blind faith that every step of the process
just works! I.e. there's no point reloading Bird if the API call results
in an error.
> Also, if someone would be interested to share scripts, we would
> certainly appreciate it. And we would be glad to upload ours if such
> a repository exists somewhere.
See: http://git.io/dsFP9w (documented in the wiki at: http://git.io/bv_i0g)
This is the script we use to download a new route server configuration
and it:
- ensures wget terminated successfully
- ensures the new file exists and has content
- ensures we have a minimum number of neighbors
- runs Bird parser against the new file
Only if these succeed do we reload Bird. At this point we could also
email a diff of new versus old.
If any step fails, the script produces output which cron emails to us.
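The checks listed above can be sketched roughly as follows in Python. This is not the script linked above; the function names, the "protocol bgp" counting heuristic, and the min_neighbors threshold are illustrative assumptions. The optional parser_cmd is whatever command validates the file; something like ["bird", "-p", "-c", "/path/to/new.conf"] is assumed here, so check the options your Bird version supports.

```python
import re
import subprocess


def count_neighbors(config_text: str) -> int:
    # Assumption: each neighbor is one "protocol bgp" stanza in the config.
    return len(re.findall(r"^\s*protocol\s+bgp\b", config_text, flags=re.MULTILINE))


def check_new_config(config_text: str, min_neighbors: int, parser_cmd=None) -> list:
    """Return a list of problems; an empty list means it is safe to reload."""
    if not config_text.strip():
        return ["new configuration is empty"]
    problems = []
    found = count_neighbors(config_text)
    if found < min_neighbors:
        problems.append(
            "only %d neighbors found, expected at least %d" % (found, min_neighbors)
        )
    if parser_cmd is not None:
        # Run the external parser; a non-zero exit code counts as a failure.
        result = subprocess.run(parser_cmd, capture_output=True, text=True)
        if result.returncode != 0:
            problems.append("parser failed: " + result.stderr.strip())
    return problems


# Hypothetical two-neighbor configuration for demonstration.
sample_config = "protocol bgp pb_as64496 {\n}\nprotocol bgp pb_as64497 {\n}\n"

if __name__ == "__main__":
    # Anything printed here would be picked up and emailed by cron.
    for problem in check_new_config(sample_config, min_neighbors=5):
        print(problem)
```

Only when the returned list is empty would the caller go on to reload Bird.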
Note also that Bird / Quagga are monitored on another level by our
Nagios implementation.
> The IXP manager does a great job creating the bird configuration
> through the skinned templates. What is missing, however, is a way to
> push the configuration to the route servers, check it, load it, and
> maybe notify the admins.
Please see script referenced above :-)
> For that reason we are writing a script that:
> a) updates the IXP manager ASN and prefixes database
> b) produces the bird configuration
> c) compares with the old bird config, notify us about changes
> d) push the configuration to the route servers Then, the route
> servers need to get reloaded; manually or semi-automatically
> at the begging, automatically later.
We've discussed this script with Rowan also. For (a) above, please note
that we have documentation at http://git.io/7jnKcQ
We have put a lot of effort into decoupling (a) from the RS build.
Chaining the update of the AS/prefix table to the RS build can lead to
~1 hour of processing time (ymmv depending on members, AS-SET unwrapping,
etc). What we have now is a transaction-safe process where they can be
run independently and on separate servers.
Note two things mentioned in the documentation for (a):
* we use transactions to update the database so, even in the middle of a
(route server configuration) refresh, a full set of prefixes for all
customers will still be available.
* The command will rigorously validate the return code and output of
BGPQ3 and it will throw an alert rather than removing prefixes when/if
BGPQ3 returns an empty prefix list where prefixes already exist in the
database.
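To make the two points above concrete, here is a minimal Python sketch using an in-memory SQLite database. It is not the IXP Manager implementation; the table layout, the replace_prefixes function, and the EmptyPrefixListError exception are hypothetical. It shows a prefix refresh done inside one transaction, and a refusal to wipe existing prefixes when the new list comes back empty.

```python
import sqlite3


class EmptyPrefixListError(Exception):
    """Raised instead of deleting prefixes when the new list is empty."""


def replace_prefixes(conn, customer_id, new_prefixes):
    existing = conn.execute(
        "SELECT COUNT(*) FROM prefixes WHERE customer_id = ?", (customer_id,)
    ).fetchone()[0]
    if not new_prefixes and existing > 0:
        # Alert rather than remove: an empty result is treated as suspicious.
        raise EmptyPrefixListError(
            "customer %d: empty list but %d prefixes stored" % (customer_id, existing)
        )
    # "with conn" commits on success and rolls back on any exception, so
    # readers never see a half-updated prefix set.
    with conn:
        conn.execute("DELETE FROM prefixes WHERE customer_id = ?", (customer_id,))
        conn.executemany(
            "INSERT INTO prefixes (customer_id, prefix) VALUES (?, ?)",
            [(customer_id, p) for p in new_prefixes],
        )


# Demonstration data (documentation prefixes).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefixes (customer_id INTEGER, prefix TEXT)")
replace_prefixes(conn, 1, ["192.0.2.0/24", "198.51.100.0/24"])
```

Calling replace_prefixes(conn, 1, []) at this point raises EmptyPrefixListError and leaves the two stored prefixes untouched.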
I suggest, in the UNIX way, a tool should do one job and do it well - a
complicated script trying to (unnecessarily) couple together different
jobs can be confusing and difficult to manage.
Rowan's script also had Git functionality. I'm not sure this is
necessary as:
- for a known database and version of IXP Manager, the configuration
is always reproducible.
- standard server backups should maintain all three of the above
without the additional need for Git.
- in 7 years, we have never needed to revisit an older configuration.
> And now our problem: If a customer has more that one connections
> (i.e, there are at least two different #vliid}, the order that these
> will be processed by the IXP manager is random.
Well, not random but rather based on the order returned by the database.
Granted, that may effectively be random ;-)
Interestingly, our test database always returned data in the same order;
even prefix and ASN lists - so we were never hit with this issue.
I've just tested and pushed the following diff:
--- a/application/Repositories/VlanInterface.php
+++ b/application/Repositories/VlanInterface.php
@@ -72,7 +72,7 @@ class VlanInterface extends EntityRepository
AND " . Customer::DQL_CUST_TRAFFICING . "
AND pi.status = :pistatus";
- $qstr .= " ORDER BY c.autsys ASC";
+ $qstr .= " ORDER BY c.autsys ASC, vli.id ASC";
This will ensure deterministic ordering in all cases.
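The effect of the extra sort key can be shown with a small Python sketch (the row dictionaries and the ordered function are hypothetical stand-ins for the query result, not IXP Manager code). Sorting on the AS number alone leaves rows with the same AS in whatever order the database returned them; adding the unique interface id as a tie-breaker gives one fixed order regardless of input order.

```python
import random


def ordered(rows):
    # Sort by AS number first, then by the unique vlan interface id.
    return sorted(rows, key=lambda r: (r["autsys"], r["vli_id"]))


# One customer (AS 64497) with two connections, plus another customer.
rows = [
    {"autsys": 64497, "vli_id": 12},
    {"autsys": 64497, "vli_id": 7},
    {"autsys": 64496, "vli_id": 30},
]

# Simulate the database returning the same rows in a different order.
shuffled = rows[:]
random.shuffle(shuffled)

if __name__ == "__main__":
    print(ordered(rows) == ordered(shuffled))
```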
...
> This causes diffs to appear, which require unnecessary human
> attention. (You can find our neighbor.cfg file at the end of this
> email)
I've also pushed changes to the select queries for routes / asn acls to
ensure a deterministic ordering of those.
Good feedback, thanks!
- Barry