[ixpmanager] SFLOW Under Reporting?
Ian Chilton
ian at lonap.net
Sat Jun 24 09:13:14 IST 2023
Hi,
On 2023-06-22 16:37, Nick Hilliard (INEX) wrote:
> yes, this would make a difference. For reference, if your switch config
> is automated with L2 ACLs, then we recommend using configured macs for
> sflow collection. If you have any rejected flows then, that means
> there's a straightforward misconfig, either in IXP Manager or else on
> the participant port.
Yep - the problem is that we were running an old version of the script.
When we did some work to update/re-deploy it in the past, the graphs
coming from the new deployment were not consistent with the old ones,
so it ended up on the back burner. That old script predates the ability
to switch between configured/discovered.
What I've done in the past day or so is deploy a new box, with the
scripts from the latest IXP Manager release. I've switched to using
configured MACs and put a temporary fix in place for subinterfaces.
This has fixed most of the dropped/rejected lines and means I can run
in debug mode without accumulating huge logs.
Curiously, I'm not seeing exactly the same results between the old and
new boxes, even though they are receiving the same data (fanned out
with sflowtool), but they are in the same ballpark.
The strange thing is that all of the sflow stats seem to follow the
correct trend, i.e. the shape of the graph, but they are ~50% too low
at any particular time.
I've tried different sflow sample rates: 8,192, 16,384, 32,768 and
65,536. The script seems to do the right thing and the sample rate
makes no difference, which is good at least. This, and the fact that
the result is consistently 50% of what it should be, would indicate
that it's not struggling with the number of flows, CPU, I/O etc.
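To illustrate why the sample rate should not change the result: each
sampled frame is scaled up by the sampling rate, so a higher rate gives
fewer samples but each one counts for more. A minimal Python sketch,
using made-up traffic numbers and hypothetical names (this is not the
handler's actual code, and real sampling is random rather than exactly
1 in N):

```python
def estimated_bps(sampled_frame_bytes, sample_rate, interval_s):
    """Scale sampled frame sizes up by the sampling rate to estimate bits/s."""
    total_bytes = sum(sampled_frame_bytes) * sample_rate
    return total_bytes * 8 / interval_s

# Hypothetical traffic: a fixed number of equal-sized frames in one interval.
FRAME_BYTES = 1000
TOTAL_FRAMES = 65_536_000
INTERVAL_S = 60

def estimate(sample_rate):
    # Idealised assumption: exactly one frame in every `sample_rate` is sampled.
    n_samples = TOTAL_FRAMES // sample_rate
    return estimated_bps([FRAME_BYTES] * n_samples, sample_rate, INTERVAL_S)

estimates = {rate: estimate(rate) for rate in (8192, 16384, 32768, 65536)}
for rate, bps in estimates.items():
    print(rate, bps)
```

Under these idealised assumptions all four sample rates give the same
estimate, which is consistent with the rate making no difference here.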
One thing I have noticed is that the default periodic/flush interval in
the code is 60s, but a flush can sometimes take longer than that to run
(possibly when it's reloading the MAC table?).
It took 93s here:
Jun 24 07:53:27 sflow sflow-to-rrd-handler[74757]: DEBUG: starting rrd flush at time interval: 60.001857, time: 1687589607
Jun 24 07:55:00 sflow sflow-to-rrd-handler[74757]: DEBUG: flush completed at 1687589700
Not sure if that has any effect on the resulting RRD files.
The standard run time is between 30s and 40s (I've modified the script
to show it):
Jun 24 08:56:02 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593362 (33s)
Jun 24 08:56:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 60.001306, time: 1687593389
Jun 24 08:57:05 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593425 (36s)
Jun 24 08:57:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 59.99794, time: 1687593449
Jun 24 08:58:02 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593482 (33s)
Jun 24 08:58:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 59.999949, time: 1687593509
Jun 24 08:58:58 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593538 (29s)
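The flush durations can also be pulled out of the debug log without
modifying the script, by pairing each "starting rrd flush" line with
the next "flush completed" line and subtracting the epoch timestamps.
A minimal Python sketch, assuming the debug message formats shown above
and a single handler process in the log (the function and variable
names are my own):

```python
import re

# Log lines copied from the excerpts above, one syslog entry per line.
LOG = """\
Jun 24 07:53:27 sflow sflow-to-rrd-handler[74757]: DEBUG: starting rrd flush at time interval: 60.001857, time: 1687589607
Jun 24 07:55:00 sflow sflow-to-rrd-handler[74757]: DEBUG: flush completed at 1687589700
Jun 24 08:56:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 60.001306, time: 1687593389
Jun 24 08:57:05 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593425 (36s)
Jun 24 08:57:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 59.99794, time: 1687593449
Jun 24 08:58:02 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593482 (33s)
Jun 24 08:58:29 sflow sflow-to-rrd-handler[76458]: DEBUG: starting rrd flush at time interval: 59.999949, time: 1687593509
Jun 24 08:58:58 sflow sflow-to-rrd-handler[76458]: DEBUG: flush completed at 1687593538 (29s)
"""

START = re.compile(r"starting rrd flush at time interval: [\d.]+, time: (\d+)")
DONE = re.compile(r"flush completed at (\d+)")

def flush_durations(lines):
    """Return seconds between each flush start and the next completion."""
    durations = []
    started = None
    for line in lines:
        m = START.search(line)
        if m:
            started = int(m.group(1))
            continue
        m = DONE.search(line)
        if m and started is not None:
            durations.append(int(m.group(1)) - started)
            started = None  # a completion with no matching start is ignored
    return durations

FLUSH_INTERVAL_S = 60  # the default periodic/flush interval mentioned above

durations = flush_durations(LOG.splitlines())
overruns = [d for d in durations if d > FLUSH_INTERVAL_S]
print(durations)  # [93, 36, 33, 29]
print(overruns)   # [93]
```

Run over a longer log, this would show how often a flush overruns the
60s interval.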
Ian