Archive for the ‘Statistics’ Category

pmacct – my new best friend.

Saturday, October 24th, 2009

How do you manage your transit data? If you source your traffic through more than one upstream ISP – and you should – then you need a way to view those relationships in detail, and make sure you are receiving your best value for money. However, it’s not always easy convincing your manager (or your mother) why Internet traffic costs money (and how it is priced).

You need a way to see what traffic your routers are sending, and where. What I’m talking about here is Netflow. There is a wealth of information about this once-upon-a-time Cisco protocol, and its neighbours, jFlow, sFlow and NetStream.

I’ve tried so many software packages to make sense of my Netflow, from OSS command-line tools to huge Java-based, pay-per-port commercial packages. Not one has ever been agile enough (without substantial hacking) to document what is really happening at my network borders. The reason is that I don’t always control all our traffic on the edge. How can I account for the traffic that isn’t routed via my border routers?

Recently I saw on NANOG (or was it c-nsp list?) another vague request for network management tools. I found pmacct through the author, Paolo Lucente. He convinced me in one email that his OSS product (developed since 2003) would fit my needs. I’m going to show you via this HOWTO, as there isn’t that much newbie information in the googlesphere.

Our network runs a bit like this: border routers take full BGP routes from multiple transits, and this serves most of our network.

[Figure: network diagram]

We also have some servers connected directly to an ISP, as they push out megabandwidth compared with what our routers are capable of. I give the servers a default route into some IP space hosted by the ISP, which has bigger routing iron. The ISP provides a connection with an HSRP default gateway that I can plug these machines directly into. This way we don’t need to invest huge amounts of money in those big-name routers.

The problem with these latter machines is that they are not aware of the Internet. They just punt a packet towards a gateway, and get a packet back. I want to be able to see where these packets are going (i.e. the ASN), but our ISP doesn’t provide Netflow for us. This is the crux of why all other Netflow accounting packages haven’t worked for me.

I have a basic setup on these machines. The short story is that within their ufw/iptables rules I insert a ULOG rule for all traffic on their “external” interfaces. Then I run a program called fprobe-ulog which interprets the basics of these ULOG messages and respews Netflow packets towards my collector.

Before, I would script the Netflow records from these machines and match them against a BGP table, while my name-brand routers were lovingly supported by name-brand software packages. pmacct will aggregate all this functionality into one lightweight, stable and scalable system.

Now onto setting up pmacct.

  1. Enable Netflow on the interfaces over which you peer:
    Router(config-subif)#ip flow ingress
    Router(config-subif)#ip flow egress
    Router(config-subif)#ipv6 flow egress
    Router(config-subif)#ipv6 flow ingress
    Router(config-subif)#exit
    Router(config)#ipv6 flow-export destination 192.168.1.111 3001
    Router(config)#ipv6 flow-export version 9 bgp-nexthop
    Router(config)#ipv6 flow-export source Loopback1
    Router(config)#ip flow-export destination 192.168.1.111 3001
    Router(config)#ip flow-export source Loopback1
    Router(config)#ip flow-export version 5 origin-as bgp-nexthop
    


    Without the last line, your Netflow packets will not carry the ASN information.

  2. Make sure netflow packets are arriving at your collector:
    # tshark port 3001
    Running as user "root" and group "root". This could be dangerous.
    Capturing on eth0
      0.000000  94.228.64.1 -> 192.168.1.111 UDP Source port: 51901  Destination port: 3001
      0.000384 94.228.64.41 -> 192.168.1.111 UDP Source port: 57689  Destination port: 3001
      0.980030  94.228.64.2 -> 192.168.1.111 UDP Source port: 58299  Destination port: 3001
      0.980253 94.228.64.42 -> 192.168.1.111 UDP Source port: 56640  Destination port: 3001
      0.999718  94.228.64.1 -> 192.168.1.111 UDP Source port: 51901  Destination port: 3001
    5 packets captured
    # 

  3. Download and compile libbgpdump.
    # wget http://www.ris.ripe.net/source/libbgpdump-1.4.99.10.tar.gz
    # tar xvfz libbgpdump-1.4.99.10.tar.gz
    # cd libbgpdump-1.4.99.10
    # apt-get install build-essential libbz2-dev
    # ./configure
    # make
    

  4. Download the DFZ routes du jour from RIPE.
    # wget http://data.ris.ripe.net/rrc00/2009.10/bview.20091024.0759.gz
    # gunzip bview.20091024.0759.gz
    

  5. Parse the binary bview format, strip out any bogus lines, and reformat it as “$ASN,$PREFIX”:
    # ./libbgpdump-1.4.99.10/bgpdump -m bview.20091024.0759 > \
    ~/bgptable.20091024.0759.dump
    # awk -F\| '$0 !~ /INCOMPLETE/ {print $6,$7}' ~/bgptable.20091024.0759.dump \
    | awk '{ print $NF "," $1 }' | uniq > ~/bgptable.20091024.0759
    

  6. Now install pmacct! (I just leave the MySQL root password blank for testing.)
    # apt-get install libmysqlclient15-dev mysql-server libpcap-dev
    # wget http://www.pmacct.net/pmacct-0.12.0rc2.tar.gz
    # tar xvfz pmacct-0.12.0rc2.tar.gz
    # cd pmacct-0.12.0rc2
    # ./configure  --enable-mysql --enable-ipv6
    # make install
    # mysql < sql/pmacct-create-db_v6.mysql
    # mysql < sql/pmacct-grant-db.mysql
    


    This gets you up to the point of having a working MySQL table for nfacctd to insert the data into. In the next step I'll walk through some of the settings and how it all works.

  7. Now you are ready to create your config. pmacct ships three daemons (sfacctd, pmacctd and nfacctd) plus the pmacct command-line client.
    sfacctd is for sFlow. sFlow usually comes from hardware samplers on L3 switches. I don't use this, but I will have to in the future - glad to know it's there.
    pmacctd is for either running on a client machine in promiscuous mode (a bit like I do with the ULOG and fprobe) or using it as a Netflow aggregator which can normalize your flows in all sorts of weird and wonderful ways, and then forward them on to a Netflow datastore. Useful - but not for my basic setup.
    nfacctd collects the Netflow data. This is what I'm using.
    pmacct is the cli interface to the memory plugin. We are using the MySQL plugin - but to get us up and running we will start with the memory plugin.

    Let's create our config file:

    # vim /etc/nfacctd.conf
    
    daemonize: true
    aggregate: src_as,dst_as,dst_host,src_host,flows,dst_port,src_port,proto

    The values for aggregate are the fields from the Netflow datagrams that you are interested in storing. This is pretty self-evident.

    ! plugin_buffer_size: 1024
    nfacctd_port: 3001
    nfacctd_time_secs: true
    nfacctd_time_new: true
    ! read bgp table from here..
    nfacctd_as_new: file
    networks_file: /home/charlie/bgptable.20091024.0759
    plugins: memory
    sql_db: pmacct
    sql_table: acct_v6
    sql_table_version: 6
    sql_passwd: arealsmartpwd
    sql_user: pmacct
    sql_refresh_time: 30
    sql_history: 10m
    sql_history_roundoff: m

    Leave plugin_buffer_size commented out (the leading "!") until you are in production mode.
    Set your listening UDP port (3001).
    Configure the path to the BGP table we built before.
    Set the plugin to be "memory".
    The sql_* settings will remain unused until you are ready to store data in MySQL. Now let's try it out!

  8. Start the collector with the config you just wrote, then query the memory plugin:
    # nfacctd -f /etc/nfacctd.conf
    # pmacct -s
    SRC_AS  DST_AS  SRC_IP          DST_IP          SRC_PORT  DST_PORT  PROTOCOL  PACKETS   FLOWS   BYTES
    6849    47998   94.179.57.134   94.228.76.67    3153      445       tcp       2         1       96
    786     47998   131.251.141.20  193.34.28.19    4782      80        tcp       5         1       1152
    ^C

    Woo! There is data! I can even see the external routes being populated with the correct ASNs. Go back to your config and change memory to mysql and you're done. If you want to see the output without the ASN lookup, change nfacctd_as_new: file to nfacctd_as_new: false.

    There are two great resources in the distribution: CONFIG-KEYS, which will help you figure out your nfacctd.conf, and README.mysql, which explains how to expand upon this basic setup.

  9. From here on in you can start to create reports via the mysql cli or from some pre-made web front ends. Flox is a basic one. Here is a screenshot showing some IRC servers.

    [Screenshot: flox]

    You really must try this out. Paolo has kept the programs really easy and malleable, and he promises me that I can soon discard my ULOG setup for a native pmacctd ULOG implementation.

    More information about our network is of course available from RIS, DB and here.
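The two-stage awk pipeline in step 5 can be sketched as a single function. This is only an illustration under stated assumptions: that `bgpdump -m` emits pipe-separated lines with the prefix in field 6 and the AS path in field 7 (as the original awk implies), and that the origin AS is the last element of the path. The function name `asn_prefix_map` and the sample lines (documentation prefixes and private-use ASNs) are made up for the example. It also swaps `uniq` for `sort -u`, so duplicates collapse even when they are not adjacent.

```shell
# Turn `bgpdump -m` lines on stdin into "ORIGIN_ASN,PREFIX" pairs.
# Assumed layout: field 6 = prefix, field 7 = AS path (origin AS last).
asn_prefix_map() {
    awk -F'|' '
        $0 !~ /INCOMPLETE/ {                 # drop routes with INCOMPLETE origin
            n = split($7, path, " ")         # AS path, left to right
            if (n > 0) print path[n] "," $6  # last hop is the origin AS
        }
    ' | sort -u                              # one line per (ASN, prefix) pair
}

# Hypothetical sample input, for illustration only.
asn_prefix_map <<'EOF'
TABLE_DUMP2|1256371140|B|192.0.2.1|64500|198.51.100.0/24|64500 64501 64502|IGP|192.0.2.1|0|0||NAG||
TABLE_DUMP2|1256371140|B|192.0.2.2|64510|198.51.100.0/24|64510 64502|IGP|192.0.2.2|0|0||NAG||
TABLE_DUMP2|1256371140|B|192.0.2.1|64500|203.0.113.0/24|64500 64503|INCOMPLETE|192.0.2.1|0|0||NAG||
EOF
# prints: 64502,198.51.100.0/24
```

Both sample routes for 198.51.100.0/24 share the origin AS 64502, so they collapse to one line, and the INCOMPLETE route is dropped.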

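Before moving to MySQL reports, a quick per-ASN summary can be pulled straight out of the memory plugin. The sketch below assumes the column layout shown in the `pmacct -s` output in step 8 (SRC_AS first, BYTES last); the function name `bytes_per_src_as` is hypothetical, and the sample rows are the two from that output.

```shell
# Sum the BYTES column per SRC_AS from `pmacct -s` style output on stdin.
# Assumes SRC_AS is the first field and BYTES is the last field.
bytes_per_src_as() {
    awk '
        $1 ~ /^[0-9]+$/ { bytes[$1] += $NF }   # data rows only; skips the header
        END { for (asn in bytes) print asn, bytes[asn] }
    ' | sort -k2,2nr                           # biggest sources first
}

# Sample rows copied from the step 8 output.
bytes_per_src_as <<'EOF'
SRC_AS  DST_AS  SRC_IP          DST_IP          SRC_PORT  DST_PORT  PROTOCOL  PACKETS   FLOWS   BYTES
6849    47998   94.179.57.134   94.228.76.67    3153      445       tcp       2         1       96
786     47998   131.251.141.20  193.34.28.19    4782      80        tcp       5         1       1152
EOF
# prints:
# 786 1152
# 6849 96
```

On a live collector the same function would be fed with `pmacct -s | bytes_per_src_as`.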
Playlouder market research

Friday, October 19th, 2007

Back in June we commissioned an independent market survey carried out on our behalf by Entertainment Media Research. We presented a summary of this research this week at the “Who Is In Control” conference in Reykjavik. Here is a copy of the presentation: Market Research presentation

There are a number of interesting points to note and conclusions to draw. In particular: the implications for ISPs are that an MSP service bundled with broadband access would reduce customer acquisition costs and reduce churn; and music fans are prepared to pay a significant premium to vanilla broadband for unlimited legal access to music.

If we extrapolate from the findings of the survey it appears that there is a market of more than £250 million p.a. in the UK alone for the MSP service.

Bus waiting times

Friday, June 22nd, 2007

Lazy bastard and travelcard-holder that I am, I regularly hover around the bus stop for a while waiting to see if a 55 or a 243 will arrive and take me some of the ~10min walk from tube station to Playlouder MSP (see, relevance to work!).

Having a maths degree which I struggle to put to use, I find myself wondering about the following problem: How long do you wait for a bus before you start walking? What is the optimal strategy for this?

To simplify things you might presume, along with numerous textbook probability questions, that the gaps between buses follow an exponential distribution. In English: that buses are randomly scattered subject to an average number per time period. The answer is then quite clear-cut. There are only two valid strategies: one in which you always wait indefinitely until a bus arrives, and one where you always walk no matter what. If the average waiting time, plus the bus journey time, is greater than the walking time, then you always walk; otherwise you always wait.
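The reason there are only two strategies is the memoryless property of the exponential distribution. As a sketch, assume the wait T for the next bus is exponential with rate λ (so the mean wait is 1/λ; the symbols T, λ, s and t are introduced here and are not in the original post). Then for any time s already spent waiting:

```latex
P(T > s + t \mid T > s)
  = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P(T > t),
\qquad
E[\, T - s \mid T > s \,] = \frac{1}{\lambda}.
```

Having already waited s minutes tells you nothing about the remaining wait, so the comparison between waiting and walking comes out the same at every moment. Whatever was the right choice when you arrived at the stop is still the right choice later.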

This is a bit counter-intuitive, though, and doesn’t satisfy my desire for a practical strategy. In practice buses don’t follow an exponential distribution: the waiting times are correlated, as buses are subject to bunching phenomena, service disruptions etc. So if you wait 3 hours and no bus arrives, you might well be able to infer extra information about subsequent waiting times based on more sophisticated distributional assumptions (perhaps a bus route suspension is a probable occurrence? perhaps you’re likely to have just missed a bunched grouping of buses?). And so with a real-life bus distribution, there is actually likely to be a valid strategy based on “Wait for n minutes; if no bus arrives then start walking”.

It turns out that there’s a lot of theory on this, and the M/G/1 queue is a better model to use than the simpler M/M/1 queue, which is based on exponential waiting times. M/G/1 tells us that, if bus waiting times are correlated, you may actually expect to wait longer than the mean interval between buses! This is due to phenomena like bunching (‘why do buses always come in threes?’), and the extra expected time can be given in quite a general way, in terms of the mean and variance of the distribution of inter-arrival times.
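One standard form of that mean-and-variance result, from renewal theory, can be sketched as follows. The assumptions are mine rather than the post's: the gaps X between buses are independent and identically distributed with mean μ and variance σ², and you turn up at the stop at a time unrelated to the buses. W is your wait for the next bus.

```latex
E[W] = \frac{E[X^2]}{2\,E[X]}
     = \frac{\mu^2 + \sigma^2}{2\mu}
     = \frac{\mu}{2}\left(1 + \frac{\sigma^2}{\mu^2}\right)
```

Perfectly regular buses (σ = 0) give an expected wait of μ/2. Exponential gaps (σ = μ) give exactly μ. Once the gaps are more variable than that (σ > μ), as with heavy bunching, the expected wait exceeds the mean interval between buses.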

However, the M/G/1 queuing model is a little too general to infer more detail, such as specific strategies on how long to wait before walking. To do this I’d need to assume a particular distribution for bus arrival times, and it seems at this point that most of the research turns to using empirical distributions (i.e. just going out and measuring loads of buses rather than trying to derive from a mathematical model), and simulations (simulate a bunch of buses on a computer and measure the arrival time distribution in different situations).

And so I lack a simple answer: given sensible real-world assumptions about bus waiting times, how do I calculate a value for my “Wait n minutes and then walk” strategy?! Queuing theorists please respond kthx.

On with some MSP work ;)