<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 10pt; font-family: Verdana,Geneva,sans-serif'>
<p>Hi Nick,</p>
<p>TPS is 0 for a while and then spikes up to as high as ~40,000 - obviously when it's doing the flush.</p>
<p>I do have graphs of IOPS, which I think is the same thing as TPS. It's odd, because nothing looks saturated or like it's struggling.</p>
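<p>For what it's worth, I captured the samples for graphing with something like this (just a sketch - it assumes GNU awk for strftime, and that tps is column 2 of iostat's device lines):</p>
<p># iostat --dec=0 -y vdb 1 | grep --line-buffered vdb | awk '{ print strftime("%H:%M:%S"), $2; fflush() }' &gt; tps.log</p>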
<p>It does seem to be hitting some kind of limit within the VM though, as I have since tested it on bare metal and it's noticeably better... and comparing the two graphs, the bare metal one is smoother, which would suggest it's not struggling there. I tried some I/O-related tweaks on the VM, but they didn't seem to help.</p>
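<p>For the record, the tweaks I tried were along these lines - just a sketch, assuming the virtio disk is vdb:</p>
<p># cat /sys/block/vdb/queue/scheduler<br /># echo none &gt; /sys/block/vdb/queue/scheduler<br /># echo 1024 &gt; /sys/block/vdb/queue/nr_requests<br /># blockdev --setra 4096 /dev/vdb</p>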
<p>Flush is taking ~15s on bare metal, down from ~30s on the VM.</p>
<p>Right now we're seeing 667G of exchange traffic with MRTG, 280G with the original sflow setup, 317G with the new VM (plus the subinterfaces fix), and 468G on bare metal. So sflow is still a long way under the actual interface traffic, but a lot better.</p>
<p>The filesystem (XFS) is mounted with noatime,nodiratime.</p>
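<p>For reference, the relevant fstab entry looks something like this (device and mount point are illustrative):</p>
<p>/dev/vdb1 /var/lib/rrd xfs noatime,nodiratime 0 0</p>
<p>(As I understand it, noatime already implies nodiratime on Linux, but being explicit does no harm.)</p>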
<p>Do you run in a VM or on bare metal, and with what specs?</p>
<p>Have you played with the flush and threads options on rrdcached, or do you just run with the defaults?</p>
<p>Interestingly, if I increase the number of threads from 4 to 8, the flush time goes up to ~60s. I guess that makes sense if the VM is I/O bound, but I would have thought more threads would help on real hardware, letting the script hand off updates quicker so they get written to disk in the background. Changing the flush time and jitter seems to make no notable difference.</p>
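<p>For reference, the invocation I'm testing looks roughly like this (paths and socket are illustrative; -t is the write threads, -w/-z the write timeout and jitter, -f the flush timer):</p>
<p># rrdcached -t 8 -w 1800 -z 1800 -f 3600 -j /var/lib/rrdcached/journal -b /var/lib/rrd -B -l unix:/var/run/rrdcached.sock</p>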
<p>Thanks,</p>
<p>Ian</p>
<p><br /></p>
<p id="reply-intro">On 2023-06-25 20:50, Nick Hilliard (INEX) wrote:</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div id="replybody1">
<div><span>Ian Chilton wrote on 24/06/2023 16:27:</span><br />
<blockquote style="font-size: 10pt; font-family: Verdana,Geneva,sans-serif;">
<div><span>vdb 5866 14013 16371 0 1654383830 1932780751 0</span></div>
</blockquote>
<br />5866 looks fairly high for the average tps since boot, but you need more granular output than this. You can get a time series sample of the 1s tps using e.g.<br /><br /># iostat --dec=0 -y vdb 1 | grep --line-buffered vdb<br /><br />I'd check out several hours of this - maybe throw it into a graph and see what's going on.<br /><br />Presumably the partition is mounted with performance tuning options, e.g. "noatime,delalloc"?<br /><br />We use FreeBSD for our sflow collector - the i/o performance was significantly better when we benchmarked it several years ago.<br /><br />Nick<br /><br /></div>
</div>
</blockquote>
</body></html>