notes from the fosdem 2018 networking devroom

5 February 2018 5:22 PM (fosdem | networking | userspace | snabb | vpp | dpdk | igalia)

Greetings, internet!

I am on my way back from FOSDEM and thought I would share with yall some impressions from talks in the Networking devroom. I didn't get to go to all that many talks -- FOSDEM's hallway track is the hottest of them all -- but I did hit a select few. Thanks to Dave Neary at Red Hat for organizing the room.

Ray Kinsella -- Intel -- The path to data-plane micro-services

The day started with a drum-beating talk that was very light on technical information.

Essentially Ray was arguing for an evolution of network function virtualization: in the days of yore, network functions ran on bare metal; then people started to run them in virtual machines; and now they run them in containers. What's next? Ray is saying that "cloud-native VNFs" are the next step.

The idea is for cloud-native VNFs to move away from "greedy" VNFs that take charge of the cores that are available to them, towards some kind of resource sharing. "Maybe users value flexibility over performance", says Ray. It's the Care Bears approach to networking: (resource) sharing is caring.

In practice he proposed two ways that VNFs can map to cores and cards.

One was in-process sharing, which if I understood him properly means running the network functions as nodes within a single VPP process. Basically in this case VPP or DPDK is the scheduler and multiplexes two or more network functions in one process.

The other was letting Linux schedule separate processes. In networking, we don't usually do it this way: we run network functions on dedicated cores on which nothing else runs. Ray was suggesting that perhaps network functions could be more like "normal" Linux services. Ray doesn't know if Linux scheduling will work in practice. Also it might mean allowing DPDK to work with 4K pages instead of the 2M hugepages it currently requires. This obviously has the potential for more latency hazards and would need some tighter engineering, and ultimately would have fewer guarantees than the "greedy" approach.

Interesting side things I noticed:

  • All the diagrams show Kubernetes managing CPU node allocation and interface assignment. I guess in marketing diagrams, Kubernetes has completely replaced OpenStack.

  • One slide showed guest VNFs differentiated between "virtual network functions" and "socket-based applications", the latter ones being the legacy services that use kernel APIs. It's a useful terminology difference.

  • The talk identifies user-space networking with DPDK (only!).

Finally, I note that Conway's law is obviously reflected in the performance overheads: because there are organizational isolations between dev teams, vendors, and users, there are big technical barriers between them too. The least-overhead forms of resource sharing are also those with the highest technical consistency and integration (nodes in a single VPP instance).

Magnus Karlsson -- Intel -- AF_XDP

This was a talk about getting good throughput from the NIC to userspace, but by using some kernel facilities. The idea is to get the kernel to set up the NIC and virtualize the transmit and receive ring buffers, but to let the NIC's DMA'd packets go directly to userspace.

The performance goal is 40Gbps for thousand-byte packets, or 25Gbps for traffic with only the smallest packets (64 bytes). The fast path does "zero copy" on the packets if the hardware has the capability to steer the subset of traffic associated with the AF_XDP socket to that particular process.
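
To put those throughput goals in terms of packet rates, here is a rough back-of-the-envelope sketch. It assumes (this is not from the talk) that each frame also occupies 20 extra bytes on the wire for preamble, start-of-frame delimiter, and inter-frame gap, and that the quoted packet sizes are full Ethernet frame sizes. The function names are made up for illustration.

```python
# Rough packet-rate arithmetic for the AF_XDP throughput goals.
# Assumption (not from the talk): 20 bytes of per-frame wire overhead
# (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap).

WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Packets per second needed to fill link_bps with fixed-size frames."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def ns_per_packet(link_bps: float, frame_bytes: int) -> float:
    """Average per-packet time budget, in nanoseconds."""
    return 1e9 / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    for gbps, size in [(40, 1000), (25, 64)]:
        pps = packets_per_second(gbps * 1e9, size)
        budget = ns_per_packet(gbps * 1e9, size)
        print(f"{gbps}Gbps at {size} bytes: "
              f"{pps / 1e6:.1f} Mpps, {budget:.1f} ns per packet")
```

Under these assumptions the small-packet goal is the harder one: roughly 37 million packets per second, versus about 5 million for the thousand-byte case.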

The AF_XDP project builds on XDP, a newish thing where a little kind of bytecode can run in the kernel or possibly on the NIC. One of the verdicts that an XDP program can return (REDIRECT) causes packets to be forwarded to user-space instead of handled by the kernel's otherwise heavyweight networking stack. AF_XDP is the bridge between XDP on the kernel side and an interface to user-space using sockets (as opposed to e.g. AF_INET). The performance goal was to be within 10% or so of DPDK's raw user-space-only performance.

The benefits of AF_XDP over the current situation would be that you have just one device driver, in the kernel, rather than having to have one driver in the kernel (which you have to have anyway) and one in user-space (for speed). Also, with the kernel involved, there is a possibility for better isolation between different processes or containers, when compared with raw PCI access from user-space.

AF_XDP is what was previously known as AF_PACKET v4, and its numbers are looking somewhat OK. Though it's not upstream yet, it might be interesting to get a Snabb driver here.

I would note that kernel-userspace cooperation is a bit of a theme these days. There are other points of potential cooperation or common domain sharing, storage being an obvious one. However I heard more than once this weekend the same "I don't know, that area of the kernel has a different culture" sort of concern that Daniel Vetter highlighted in his recent LCA talk.

François-Frédéric Ozog -- Linaro -- Userland Network I/O

This talk is hard to summarize. Like the previous one, it's again about getting packets to userspace with some support from the kernel, but the speaker went really deep and I'm not quite sure what in the talk is new and what is known.

François-Frédéric is working on a new set of abstractions for relating the kernel and user-space. He works on OpenDataPlane (ODP), which is kinda like DPDK in some ways. ARM seems to be a big target for his work; that x86-64 is also a target goes without saying.

His problem statement was, how should we enable fast userland network I/O, without duplicating drivers?

François-Frédéric was a bit negative on AF_XDP because (he says) it is so focused on packets that it neglects other kinds of devices with similar needs, such as crypto accelerators. Apparently the challenge here is accelerating a single large IPsec tunnel -- because the cryptographic operations are serialized, you need good single-core performance, and making use of hardware accelerators seems necessary right now for even a single 10Gbps stream. (If you had many tunnels, you could parallelize, but that's not the case here.)

He was also a bit skeptical about standardizing on the "packet array I/O model" which AF_XDP and most NICs use. What he means here is that most current NICs move packets to and from main memory with the help of a "descriptor array" ring buffer that holds pointers to packets. A transmit array stores packets ready to transmit; a receive array stores maximum-sized packet buffers ready to be filled by the NIC. The packet data itself is somewhere else in memory; the descriptor only points to it. When a new packet is received, the NIC fills the corresponding packet buffer and then updates the "descriptor array" to point to the newly available packet. This requires at least two memory writes from the NIC to memory: at least one to write the packet data (one per 64 bytes of packet data), and one to update the DMA descriptor with the packet length and possible other metadata.
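
As a way of making the packet array model concrete, here is a toy simulation of a receive ring. It is only a sketch: the class and field names are invented for illustration, no real NIC lays out its descriptors this way, and the DMA write count simply follows the accounting in the paragraph above (one write per 64 bytes of packet data, plus one for the descriptor).

```python
from dataclasses import dataclass
from math import ceil

CACHE_LINE = 64  # bytes moved per simulated DMA write, as in the text


@dataclass
class Descriptor:
    buffer_index: int    # which packet buffer this slot points to
    length: int = 0      # filled in by the simulated NIC on receive
    ready: bool = False  # True once a packet is waiting for the driver


class ReceiveRing:
    """Toy receive ring: descriptors point at fixed-size packet buffers."""

    def __init__(self, slots: int, buffer_size: int = 2048):
        self.buffers = [bytearray(buffer_size) for _ in range(slots)]
        self.descriptors = [Descriptor(i) for i in range(slots)]
        self.head = 0        # next slot the NIC will fill
        self.tail = 0        # next slot the driver will read
        self.dma_writes = 0  # running count of simulated DMA writes

    def nic_receive(self, packet: bytes) -> bool:
        """Simulate the NIC storing one packet; False if the ring is full."""
        desc = self.descriptors[self.head]
        if desc.ready:
            return False  # the driver has not consumed this slot yet
        buf = self.buffers[desc.buffer_index]
        if len(packet) > len(buf):
            raise ValueError("packet larger than receive buffer")
        buf[:len(packet)] = packet
        self.dma_writes += ceil(len(packet) / CACHE_LINE)  # packet data
        desc.length, desc.ready = len(packet), True
        self.dma_writes += 1                               # descriptor update
        self.head = (self.head + 1) % len(self.descriptors)
        return True

    def driver_poll(self):
        """Return the next received packet, or None if nothing is ready."""
        desc = self.descriptors[self.tail]
        if not desc.ready:
            return None
        data = bytes(self.buffers[desc.buffer_index][:desc.length])
        desc.ready = False  # hand the slot back to the NIC
        self.tail = (self.tail + 1) % len(self.descriptors)
        return data
```

In this model a 60-byte packet costs two writes and a 130-byte packet costs four, so for small packets the descriptor update is half of the DMA traffic.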

Although these writes go directly to cache, there's a limit to the number of DMA operations that can happen per second, and with 100Gbps cards, we can't afford to make one such transaction per packet.

François-Frédéric promoted an alternative I/O model for high-throughput use cases: the "tape I/O model", where packets are just written back-to-back in a uniform array of memory. Every so often a block of memory containing some number of packets is made available to user-space. This has the advantage of packing in more packets per memory block, as there's no wasted space between packets. This increases cache density and decreases DMA transaction count for transferring packet data, as we can use each 64-byte DMA write to its fullest. Additionally there's no side table of descriptors to update, saving a DMA write there.
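
A small sketch can show how the two models compare in DMA write counts. It uses the accounting from the text (64-byte writes, one extra descriptor write per packet in the array model). The optional per-packet length prefix in the tape model is my own assumption about how packets in a block might be delimited; the talk did not specify a format.

```python
from math import ceil

CACHE_LINE = 64  # bytes per DMA write, as in the text


def array_model_writes(packet_sizes):
    """One write per started cache line of each packet, plus one per descriptor."""
    return sum(ceil(size / CACHE_LINE) + 1 for size in packet_sizes)


def tape_model_writes(packet_sizes, length_prefix_bytes=0):
    """Packets packed back-to-back; count cache lines spanned by the whole run.

    length_prefix_bytes is a hypothetical per-packet framing header.
    """
    total = sum(size + length_prefix_bytes for size in packet_sizes)
    return ceil(total / CACHE_LINE)


if __name__ == "__main__":
    sizes = [100] * 1000  # a thousand 100-byte packets
    print("array model:", array_model_writes(sizes))
    print("tape model:", tape_model_writes(sizes))
    print("tape model, 4-byte prefix:", tape_model_writes(sizes, 4))
```

For a thousand 100-byte packets this gives 3000 writes for the array model against 1563 for the tape model, or 1625 with a 4-byte prefix per packet.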

Apparently the only cards currently capable of 100Gbps traffic, the Chelsio and Netcope cards, use the "tape I/O model".

Incidentally, the DMA transfer limit isn't the only constraint. Something I hadn't fully appreciated before was memory write bandwidth. Before, I had thought that because the NIC would transfer packet data directly to cache, this wouldn't necessarily cause any write traffic to RAM. Apparently that's not the case. Later over drinks (thanks to Red Hat's networking group for organizing), François-Frédéric asserted that the DMA transfers would eventually use up DDR4 bandwidth as well.

A NIC-to-RAM DMA transaction will write one cache line (usually 64 bytes) to the socket's last-level cache. This write will evict whatever was there before. As far as I can tell, there are three cases of interest here. The best case is where the evicted cache line is from a previous DMA transfer to the same address. In that case it's modified in the cache and not yet flushed to main memory, and we can just update the cache instead of flushing to RAM. (Do I misunderstand the way caches work here? Do let me know.)

However if the evicted cache line is from some other address, we might have to flush to RAM if the cache line is dirty. That causes memory write traffic. But if the cache line is clean, that means it was probably loaded as part of a memory read operation, and then that means we're evicting part of the network function's working set, which will later cause memory read traffic as the data gets loaded in again, and write traffic to flush out the DMA'd packet data cache line.
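
The three cases can be written down as a tiny lookup. This only encodes the mental model described above, which may well be an oversimplification of how real caches behave; the names are invented for illustration.

```python
from enum import Enum


class Evicted(Enum):
    """What a DMA write to the last-level cache pushes out."""
    SAME_ADDRESS_DIRTY = "dirty line from an earlier DMA write to the same address"
    OTHER_DIRTY = "dirty line from some other address"
    OTHER_CLEAN = "clean line from some other address"


def ram_traffic(evicted: Evicted) -> dict:
    """RAM traffic, in cache lines, attributed to one DMA write in this model."""
    if evicted is Evicted.SAME_ADDRESS_DIRTY:
        # Best case: the line is updated in cache, nothing goes to RAM.
        return {"writes": 0, "reads": 0}
    if evicted is Evicted.OTHER_DIRTY:
        # The dirty victim has to be flushed to RAM.
        return {"writes": 1, "reads": 0}
    # Clean victim: part of the working set is lost and is read back in
    # later, and the DMA'd packet line is eventually flushed out too.
    return {"writes": 1, "reads": 1}


if __name__ == "__main__":
    for case in Evicted:
        print(f"{case.value}: {ram_traffic(case)}")
```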

François-Frédéric simplified the whole thing to equate packet bandwidth with memory write bandwidth: yes, the packet goes directly to cache, but it is also written to RAM. I can't convince myself that that's the case for all packets, but I need to look more into this.
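
If we take that simplification at face value, the arithmetic is easy to sketch. The comparison point is an assumption of mine, not something from the talk: a single DDR4-2400 channel with a 64-bit data bus, at its theoretical peak rather than any measured figure.

```python
def gbps_to_gbytes_per_s(gbps: float) -> float:
    """Convert a line rate in gigabits per second to gigabytes per second."""
    return gbps / 8


def ddr4_channel_peak_gbytes_per_s(megatransfers_per_s: float,
                                   bus_bytes: int = 8) -> float:
    """Theoretical peak of one memory channel: transfers/s times bus width."""
    return megatransfers_per_s * 1e6 * bus_bytes / 1e9


if __name__ == "__main__":
    packet_write = gbps_to_gbytes_per_s(100)        # 100Gbps of packet data
    channel = ddr4_channel_peak_gbytes_per_s(2400)  # assumed DDR4-2400
    print(f"packet data: {packet_write:.1f} GB/s")
    print(f"one DDR4-2400 channel, peak: {channel:.1f} GB/s")
    print(f"share of one channel: {packet_write / channel:.0%}")
```

Under those assumptions, writing every packet of a 100Gbps stream to RAM would use about 12.5 GB/s, roughly two thirds of one channel's theoretical peak, before the network function reads or writes anything itself.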

Of course the cache pressure and the memory traffic are worse if the packet data is less compact in memory; and worse still if there is any need to copy data. Ultimately, processing small packets at 100Gbps is still a huge challenge for user-space networking, and it's no wonder that there are only a couple devices on the market that can do it reliably, not that I've seen either of them operate first-hand :)

Talking with Snabb's Luke Gorrie later on, he thought that it could be that we can still stretch the packet array I/O model for a while, given that PCIe gen4 is coming soon, which will increase the DMA transaction rate. So that's a possibility to keep in mind.

At the same time, apparently there are some "coherent interconnects" coming too which will allow the NIC's memory to be mapped into the "normal" address space available to the CPU. In this model, instead of having the NIC transfer packets to the CPU, the NIC's memory will be directly addressable from the CPU, as if it were part of RAM. The latency to pull data in from the NIC to cache is expected to be slightly longer than a RAM access; for comparison, RAM access takes about 70 nanoseconds.

For a user-space networking workload, coherent interconnects don't change much. You still need to get the packet data into cache. True, you do avoid the writeback to main memory, as the packet is already in addressable memory before it's in cache. But, if it's possible to keep the packet on the NIC -- like maybe you are able to add some kind of inline classifier on the NIC that could directly shunt a packet towards an on-board IPsec accelerator -- in that case you could avoid a lot of memory transfer. That appears to be the driving factor for coherent interconnects.

At some point in François-Frédéric's talk, my brain just died. I didn't quite understand all the complexities that he was taking into account. Later, after he kindly took the time to dispel some more of my ignorance, I understand more of it, though not yet all :) The concrete "deliverable" of the talk was a model for kernel modules and user-space drivers that uses the paradigms he was promoting. It's a work in progress from Linaro's networking group, with some support from NIC vendors and CPU manufacturers.

Luke Gorrie and Asumu Takikawa -- SnabbCo and Igalia -- How to write your own NIC driver, and why

This talk had the most magnificent beginning: a sort of "repent now ye sinners" sermon from Luke Gorrie, a seasoned veteran of software networking. Luke started by describing the path of righteousness leading to "driver heaven", a world in which all vendors have publicly accessible datasheets which parsimoniously describe what you need to get packets flowing. In this blessed land it's easy to write drivers, and for that reason there are many of them. Developers choose a driver based on their needs, or they write one themselves if their needs are quite specific.

But there is another path, says Luke, that of "driver hell": a world of wickedness and proprietary datasheets, where even when you buy the hardware, you can't program it unless you're buying a hundred thousand units, and even then you are smitten with the cursed non-disclosure agreements. In this inferno, only a vendor is practically empowered to write drivers, but their poor driver developers are only incentivized to get the driver out the door deployed on all nine architectural circles of driver hell. So they include some kind of circle-of-hell abstraction layer, resulting in a hundred thousand lines of code like a tangled frozen beard. We all saw the abyss and repented.

Luke described the process that led to Mellanox releasing the specification for its ConnectX line of cards, something that was warmly appreciated by the entire audience, users and driver developers included. Wonderful stuff.

My Igalia colleague Asumu Takikawa took the last half of the presentation, showing some code for the driver for the Intel i210, i350, and 82599 cards. For more on that, I recommend his recent blog post on user-space driver development. It was truly a ray of sunshine in dark, dark Brussels.

Ole Trøan -- Cisco -- Fast dataplanes with VPP

This talk was a delightful introduction to VPP, but without all of the marketing; the sort of talk that makes FOSDEM worthwhile. Usually at more commercial, vendory events, you can't really get close to the technical people unless you have a vendor relationship: they are surrounded by a phalanx of salesfolk. But at FOSDEM it is clear that we are all comrades out on the open source networking front.

The speaker expressed great personal pleasure at having been able to work on open source software; his relief was palpable. A nice moment.

He had some kind words about Snabb, too, saying at one point that "of course you can do it on snabb as well -- Snabb and VPP are quite similar in their approach to life". He trolled the horrible complexity diagrams of many "NFV" stacks whose components reflect the org charts that produce them more than the needs of the network functions in question (service chaining anyone?).

He did get to drop some numbers as well, which I found interesting. One is that recently they have been working on carrier-grade NAT, aiming for 6 terabits per second. Those are pretty big boxes and I hope they are getting paid appropriately for that :) For context he said that for a 4-unit server, these days you can build one that does a little less than a terabit per second. I assume that's with ten dual-port 40Gbps cards, and I would guess to power that you'd need around 40 cores or so, split between two sockets.
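
Spelling out that guess as arithmetic: the card count, port speed, and core count below are my assumptions from the paragraph above, not figures Ole gave, and the function names are made up.

```python
from math import ceil


def server_capacity_gbps(cards: int, ports_per_card: int,
                         gbps_per_port: int) -> int:
    """Aggregate line rate of one server, ignoring any PCIe or CPU limits."""
    return cards * ports_per_card * gbps_per_port


def servers_needed(target_gbps: float, per_server_gbps: float) -> int:
    """Whole servers required to reach the target aggregate rate."""
    return ceil(target_gbps / per_server_gbps)


if __name__ == "__main__":
    per_server = server_capacity_gbps(cards=10, ports_per_card=2,
                                      gbps_per_port=40)
    print("per server:", per_server, "Gbps")
    print("per core, with 40 cores:", per_server / 40, "Gbps")
    print("servers for 6 Tbps:", servers_needed(6000, per_server))
```

With these guesses, one box tops out at 800Gbps, each core has to sustain about 20Gbps, and a 6 terabit target needs at least eight such servers.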

Finally, he finished with a long example on lightweight 4-over-6. Incidentally this is the same network function my group at Igalia has been building in Snabb over the last couple years, so it was interesting to see the comparison. I enjoyed his commentary that although all of these technologies (carrier-grade NAT, MAP, lightweight 4-over-6) have the ostensible goal of keeping IPv4 running, in reality "we're day by day making IPv4 work worse", mainly by breaking the assumption that if you get traffic from port P on IP M, you can also send traffic to M from another port or another protocol and have it reach the target.
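
To illustrate why that assumption breaks, here is a much-simplified sketch of the kind of lookup a lightweight 4-over-6 border router does. It is not how Snabb or VPP implement it. The binding table, the addresses (taken from the documentation ranges), and the choice of a 6-bit port-set ID with no offset bits are all hypothetical.

```python
from ipaddress import IPv4Address, IPv6Address

PSID_LENGTH = 6  # hypothetical: 2**6 = 64 subscribers share each IPv4 address


def psid_of(port: int, psid_length: int = PSID_LENGTH) -> int:
    """Port-set ID: the top psid_length bits of the 16-bit port number.

    Simplification: no offset bits, so each subscriber gets one
    contiguous range of ports.
    """
    return port >> (16 - psid_length)


# Hypothetical binding table:
# (shared IPv4 address, PSID) -> IPv6 address of the subscriber's softwire.
BINDING_TABLE = {
    (IPv4Address("192.0.2.1"), 1): IPv6Address("2001:db8::1"),  # ports 1024-2047
    (IPv4Address("192.0.2.1"), 2): IPv6Address("2001:db8::2"),  # ports 2048-3071
}


def softwire_for(dst_ip: str, dst_port: int):
    """IPv6 softwire for an incoming IPv4 packet, or None if there is none."""
    return BINDING_TABLE.get((IPv4Address(dst_ip), psid_of(dst_port)))


if __name__ == "__main__":
    for port in (1500, 2500, 5000):
        print(port, "->", softwire_for("192.0.2.1", port))
```

The same IPv4 address leads to different subscribers depending on the destination port, and to nobody at all for ports outside any allocated set, which is why a reply to "M" on some other port may never reach the host you were talking to.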

All of these technologies also have problems with IPv4 fragmentation. Getting it right is possible but expensive. Instead, Ole mentions that he and a cross-vendor cabal of dataplane people have a "dark RFC" in the works to deprecate IPv4 fragmentation entirely :)

OK that's it. If I get around to writing up the couple of interesting Java talks I went to (I know right?) I'll let yall know. Happy hacking!
