{ "id": "https://ryan.freumh.org/network-layer-mobility.html", "title": "Network layer mobility", "link": "https://ryan.freumh.org/network-layer-mobility.html", "updated": "2025-03-24T00:00:00", "published": "2021-05-10T00:00:00", "summary": "<div>\n \n <span>Published 10 May 2021.</span>\n \n \n <span>Last update 24 Mar 2025.</span>\n \n </div>\n \n <div> Tags: <a href=\"/projects.html\" title=\"All pages tagged 'projects'.\">projects</a>, <a href=\"/research.html\" title=\"All pages tagged 'research'.\">research</a>. </div>\n \n \n\n \n<p><span>My undergraduate dissertation, “Ubiquitous\nCommunication for the Internet of Things: An Identifier-Locator\naddressing split overlay network”, explored how we can modify the\nInternet Protocol to better support resource-constrained highly mobile\nembedded devices. A copy can be found <a href=\"papers/2021-bsc-ubicomm.pdf\">here</a> (or <a href=\"https://studres.cs.st-andrews.ac.uk/Library/ProjectLibrary/cs4099/2021/rtg2-Final_Report.pdf\">here</a>\nfor St Andrews folk), and the associated implementation can be found at\n<a href=\"https://github.com/RyanGibb/ilnp-overlay-network\">ilnp-overlay-network</a>.</span></p>\n<h2>Network stack</h2>\n<p><span>First, some prerequisite networking\nknowledge. A network protocol stack is a view of how protocols are\norganised into layers. The <a href=\"https://en.wikipedia.org/wiki/OSI_model\">OSI model</a> describes\nnamed layers, including the physical, link, network, transport, and\napplication layers. Note the OSI model and TCP/IP have <a href=\"https://en.wikipedia.org/wiki/Internet_protocol_suite#Layer_names_and_number_of_layers_in_the_literature\">differing\nsemantics</a>, but this is beyond the scope of this blog post. 
The\nmodularity of protocols in a network stack has many advantages, such as\nallowing the protocol used at a layer to be exchanged\ntransparently.</span></p>\n<p><span>The protocol stack we’re concerned\nwith is based off the <a href=\"https://en.wikipedia.org/wiki/Internet_protocol_suite\">Internet\nProtocol suite</a>, also known as TCP/IP.</span></p>\n<p><span>This network stack is often referred\nto as an hourglass, with the Internet Protocol (IP) as the skinny\nwaist.</span></p>\n\n\n<img alt=\"Steve Deering. “Watching the Waist of the Protocol Hourglass”. In: IETF 51 London. 2001.\" src=\"./images/network-layer-mobility/diagrams/hourglass-cropped.svg\">\n\n<a href=\"https://people.cs.pitt.edu/~znati/Courses/WANs/Dir-Rel/Pprs/hourglass-london-ietf.pdf\">Steve\nDeering. “Watching the Waist of the Protocol Hourglass”. In: IETF 51\nLondon. 2001.</a>\n\n<p><span>Packets of a protocol are\nencapsulated by the protocol below, for example:</span></p>\n\n\n<img alt=\"Wikimedia UDP encapsulation.svg\" src=\"./images/network-layer-mobility/diagrams/udp-encapsulation.svg\">\n\n<a href=\"https://commons.wikimedia.org/wiki/File:UDP_encapsulation.svg\">Wikimedia\nUDP encapsulation.svg</a>\n\n<h2>Motivation</h2>\n<p><span>Ubiquitous Computing is a vision of the\nfuture of computing where devices are omnipresent and exist in many\nforms. The Internet of Things (IoT) is a modern interpretation of this\nwhich envisions many objects existing as Internet-connected smart\ndevices; such as wearable devices, smart vehicles, and smart appliances\nlike fridges, washing machines, and ovens. Many of these devices are\nphysically mobile, which requires network support when moving\nlocation.</span></p>\n<p><span>When we say network mobility in this\nblog, what we are in fact referring to is network layer (layer 3)\nmobility. This is also known as a vertical handoff, where the underlying\nlink layer technology can change, like moving from a WiFi to a cellular\nnetwork. 
This is to distinguish it from link layer (layer 2) mobility -\nhorizontal handoffs - where the link layer technology and layer 3\nnetwork remain the same but the network access point changes, such as\nwhen moving between cells in a mobile cellular network. Layer 2 mobility\nis insufficient when a mobile device moves between link layer\ntechnologies or layer 3 networks.</span></p>\n<p><span>Some examples of mobile IoT devices\nwould be health monitoring devices and smart vehicles. These devices may\nrequire constant connectivity while a large number of network\nconnectivity options come and go quickly, particularly in urban\nenvironments. For example, a health monitoring device might switch from a\ncellular network to a WiFi network when entering an office building where\nno cellular signal is available.</span></p>\n<p><span>The largest solution space for this at\nthe moment is implementing mobility through IoT middleware applications.\nMiddleware, sitting in the application layer, provides a platform for\ncommon functionality, including mobility. It is much easier\nto deploy such a solution than to rework the networking stack.\nHowever, it requires the application software to be written for and tied\nto a specific middleware API, which is rarely standardised. It also adds\nan additional layer to the node’s network stack, with performance and\nenergy use implications, which are particularly relevant to\nresource-constrained IoT devices.</span></p>\n<p><span>Ideally, what we want is network support\nfor mobility transparent to the application layer. If we were able to\nimplement mobility at the network layer it would solve our\nproblems!</span></p>\n<h2>Mobility in IP</h2>\n<p><span>As we’ve discussed, IP is the skinny\nwaist of the Internet. 
It ties all the other protocols together allowing\nnodes (computers in a network) to communicate over interoperating\nnetworks with potentially different underlying technologies.</span></p>\n<p><span>IP was designed in 1981. In the same\nyear, IBM introduced its Personal Computer (PC) weighing over 9kg.\nToday, many mobile computers exist in the form of personal smartphones,\nin addition to the IoT devices already discussed. IP was not designed\nfor such mobile devices and does not support mobility.</span></p>\n<p><span>There are two issues with IP\npertaining to mobility.</span></p>\n<p><span>The first is the\n<em>overloading</em> of IP address semantics. IP addresses are used to\nidentify a node’s location in the Internet with routing prefixes and to\nuniquely identify a node in some scope. This becomes an issue for\nmobility when a node changes its location in the network as it also has\nto change its IP address.</span></p>\n<p><span>This wouldn’t be an issue in and of\nitself if a transport (layer 4) flow could dynamically adjust to a new\nIP address, which brings us to the second issue with IP addresses: the\n<em>entanglement</em> of layers. All layers of the TCP/IP stack use IP\naddresses, and IP addresses are semi-permanently bound to an\ninterface.</span></p>\n<p><span>These issues together mean that when\nmoving network all existing communication flows have to be\nreestablished. This results in application-specific logic being required\nto deal with network transitions. This has performance and energy use\nimplications due to dropped packets when switching networks and having\nto reestablish communication sessions. For example, TCP’s 3-way\nhandshake has to be re-done, and cryptographic protocols like TLS have\nto redo their key exchange. 
The more resource-constrained a device is, as\nIoT devices are, and the more continuous connectivity is required, the\nmore important these considerations become.</span></p>\n<h2>ILNP</h2>\n<p><span>As IP was not designed with mobility in mind,\nmost solutions try to retrofit mobility onto IP somehow, such as the\nmiddleware platforms already discussed. This is symptomatic of a larger\nproblem: the ossification of the Internet. It’s easier to build up (in\nthe protocol stack) than to redesign it, especially when the protocol\nstack is as omnipresent and critical as the modern Internet. A radical\nchange in IP’s addressing from the Identifier-Locator Network Protocol\n(ILNP) architecture provides a solution to this mobility problem by\nseparating the semantics of IP addresses into their constituent parts:\nan identifier and a locator. An identifier uniquely identifies the node\n- within some scope - and the locator identifies the network in which\nthe node resides, giving the node’s location in the Internet. See <a href=\"https://tools.ietf.org/html/rfc6740\">RFC6740</a> for more\ndetail.</span></p>\n<p><span>The overloading of IP addresses is solved with\nthis Identifier-Locator addressing split. This also allows us to solve\nthe entanglement of layers:</span></p>\n\n\n<img alt=\"S. N. Bhatti and R. Yanagida. “Seamless internet connectivity for ubiquitous communication” In: PURBA UBICOMP. 2019\" src=\"./images/network-layer-mobility/diagrams/ilnp-ipv6-names-cropped.svg\">\n\n<a href=\"https://dl.acm.org/doi/abs/10.1145/3341162.3349315\">S. N. Bhatti\nand R. Yanagida. “Seamless internet connectivity for ubiquitous\ncommunication” In: PURBA UBICOMP. 
2019</a>\n\n<p><span>Applications that use DNS to obtain IP\naddresses (conforming to <a href=\"https://tools.ietf.org/html/rfc1958#section-4\">RFC1958</a>) will\nbe backwards compatible with ILNPv6 with modifications to DNS (<a href=\"https://tools.ietf.org/html/rfc6742\">RFC6742</a>).</span></p>\n<p><span>ILNP can be implemented as an extension to\nIPv6, called ILNPv6. ILNP can also be implemented as an extension to\nIPv4 as ILNPv4, but this is not as elegant as ILNPv6 and will not be\nconsidered here. The upper 64 bits of an IPv6 address are already used as\na routing prefix and are taken as the locator in ILNPv6. The lower 64\nbits, the interface identifier in IPv6, are taken as the identifier.\nILNPv6’s Identifier-Locator Vector (I-LV) corresponds to the IPv6\naddress. The syntax is identical but the semantics differ. That is, IPv6\naddresses and ILNPv6 I-LVs look the same on the wire but are interpreted\ndifferently.</span></p>\n\n\n<img alt=\"RFC6741\" src=\"./images/network-layer-mobility/diagrams/ilnp-ipv6-addresses-cropped.svg\">\n\n<a href=\"https://tools.ietf.org/html/rfc6741#section-3.1\">RFC6741</a>\n\n<p><span>So given an IPv6 address\n“2001:db8:1:2:3:4:5:6”, the ILNPv6 locator would be “2001:db8:1:2” and\nthe identifier “3:4:5:6”.</span></p>\n<p><span>ILNPv6 supports mobility through dynamic\nbinding of identifiers to locators, and ICMP locator update messages.\nThe locator of a node can change while retaining its identifier and\ncommunication flows. Additionally, ILNPv6 supports seamless connectivity\nduring a network transition with a soft handoff - making the new\nconnection before breaking the old connection. Note that this does\nrequire hardware support for multiple connections on the same adapter,\nsuch as through CDMA, or two physical network adapters.</span></p>\n<p><span><a href=\"https://tools.ietf.org/html/rfc6115\">RFC6115</a> contains a survey\nof other solutions available. 
Unlike alternatives, ILNPv6 requires\nupdates to the end hosts only, and does not require a proxy or agent,\ntunnelling, address mapping, or application modifications. The\ndisadvantage of this approach is that it requires a reworking of the\nwhole network stack, which makes it more difficult to deploy.</span></p>\n<p><span>ILNP also supports other functionality of\nbenefit to IoT devices, such as multihoming and locator rewriting relays\n(LRRs). Multihoming refers to connecting a node to more than one network,\nwhich enables a device to exploit any connectivity available. This is\nsupported by ILNP through allowing transport flows to use multiple\nlocators simultaneously via a dynamic binding of identifiers to\nlocators. LRRs are middleboxes that rewrite locators for privacy and\nsecurity benefits similar to those provided by NAT without breaking the\nend-to-end principle.</span></p>\n<h2>Overlay network</h2>\n<p><span>An overlay network is a ‘virtual’\nnetwork built on another network. Think <a href=\"https://www.torproject.org/\">Tor</a>. An underlay network is the\nunderlying network beneath an overlay network.</span></p>\n<p><span>To demonstrate the operation of the\nprotocol and its support for mobility an ILNPv6 overlay network was\ncreated on top of UDP/IPv6 Multicast. 
An IPv6 multicast group\ncorresponds to a locator in our overlay network, or a ‘virtual network’.\nThere is a mechanical translation between 32-bit locators and 64-bit\nIPv6 multicast groups.</span></p>\n<p><span>This overlay network was\nimplemented in user space with Python due to time constraints of the\nproject and difficulties associated with kernel programming.</span></p>\n<p><span>A simple transport protocol (STP)\nwas created for demultiplexing received ILNPv6 packets by wrapping them\nwith a port, similar to UDP.</span></p>\n\n\n<img alt=\"Overlay network protocol stack\" src=\"./images/network-layer-mobility/diagrams/overlay-network-stack.svg\">\n\nOverlay network protocol\nstack\n\n<p><span>Note that in our overlay network,\nfor a node, an interface simply refers to a locator which the node is\nconnected to, via configuration files. The node will have connected to\nthe corresponding IP multicast address.</span></p>\n<h2>Discovery protocol</h2>\n<p><span>A discovery protocol was\nrequired for nodes to discover each other and to discover routing paths.\nIt is inspired by the IPv6 Neighbour Discovery Protocol. Nodes send\nsolicitations (requests for advertisements) and advertisements\n(responses to solicitations). Both solicitations and advertisements\ncontain a node’s hostname, set of valid locators, and identifier. This\nmeans that hostname resolution is included in our protocol, which was\ndone to avoid the complications of a DNS deployment in our\noverlay.</span></p>\n<p><span>A simple flood and backwards\nlearn approach was taken. When a node receives a discovery protocol\nmessage on an interface it forwards it to every other interface. This\nrelies on the ILNPv6 hop count being decremented to avoid infinitely\nlooping packets in circular topologies. 
Nodes eavesdrop on discovery\nprotocol messages so one solicitation is sufficient for all nodes in a\nnetwork to learn about all the others.</span></p>\n<p><span>Discovery protocol messages are\nsent to a special ILNPv6 all-nodes locator - essentially local broadcast\nin a virtual network. Forwarding happens at the discovery protocol\nlayer, not the ILNPv6 layer.</span></p>\n<p><span>Backwards learning is done on\nthese discovery protocol messages; when an ILNPv6 packet is received the\nforwarding table is updated, mapping the source locator of the packet to\nthe interface it was received on. This means the discovery protocol\nserves to bootstrap the network by populating the forwarding\ntable.</span></p>\n<p><span>This protocol scales poorly -\nthe number of messages grows quadratically with the number of\nnetworks containing a node - but it is sufficient for our\npurposes.</span></p>\n<p><span>See an example operation of the\nprotocol below. Node A is in network 1, node B in network 2, and node C\nin both networks.</span></p>\n\n\n<img alt=\"Discovery protocol example topology\" src=\"./images/network-layer-mobility/diagrams/discovery-protocol-topology.svg\">\n\nDiscovery protocol example\ntopology\n\n\n\n<img alt=\"Discovery protocol example sequence diagram\" src=\"./images/network-layer-mobility/diagrams/discovery-protocol-sequence-diagram.svg\">\n\nDiscovery protocol example sequence\ndiagram\n\n<h2>Locator updates</h2>\n<p><span>Our overlay network supports\nmobility with locator update messages as part of the ILNPv6 layer. The\nmobile node (MN) sends a locator update over its old locator, and the\ncorresponding node (CN) responds with an acknowledgement via the new\nlocator - verifying that a path exists between the new locator and the\nCN.</span></p>\n<p><span>The discovery message sent by the\nMN on the new locator is simply for path discovery as the CN will not\nknow how to route to 0:0:0:c with no node sending discovery messages\nfrom that locator. 
An alternative solution to this would have been to\nmake nodes send packets to all connected interfaces if there is no\nmapping in the forwarding table.</span></p>\n<p><span>See an example of a MN moving from\nlocator 0:0:0:a to locator 0:0:0:c, in a communication session with a CN\nin locator 0:0:0:b, below:</span></p>\n\n\n<img alt=\"locator update example topology\" src=\"./images/network-layer-mobility/diagrams/locator-update-topology.svg\">\n\nlocator update example\ntopology\n\n\n\n<img alt=\"locator update example sequence diagram\" src=\"./images/network-layer-mobility/diagrams/locator-update-sequence-diagram.svg\">\n\nlocator update example sequence\ndiagram\n\n<h2>Experiments</h2>\n<p><span>To demonstrate the operation of the\noverlay network on resource-constrained IoT devices a Raspberry Pi\ntestbed communicating via ethernet was used. Previous work in this area\nhas been confined to workstation or server machines.</span></p>\n<p><img src=\"./images/network-layer-mobility/testbed.jpg\"></p>\n<p><span>The virtual network topology was 3\nnetworks that the MN moved between every 20 seconds, one of which the CN\nresided in.</span></p>\n<p><img src=\"./images/network-layer-mobility/diagrams/experiment.svg\"></p>\n<p><span>The experimental application sent an\nMTU packet with a sequence number every 10ms from the MN to CN, and CN\nto MN, resulting in a throughput of 266.6kB/s.</span></p>\n<p><span>Looking at the received sequence by the\nCN we can see that there’s no loss or misordering - just a smooth\nseamless line with a constant gradient. 
The dotted vertical lines show\nthe network transitions.</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Received sequence numbers vs time on CN\" src=\"./images/network-layer-mobility/graphs/exp3/received-sequence-numbers-vs-time-on-cn.svg\">\n<img alt=\"Received sequence numbers vs time on MN\" src=\"./images/network-layer-mobility/graphs/exp3/received-sequence-numbers-vs-time-on-mn.svg\">\n\n\nReceived sequence numbers vs time on\nCN\nReceived sequence numbers vs time on\nMN\n\n\n\n<p><span>Looking at the throughputs we can see\ndiscrete rectangles for each individual locator showing the separation\nbetween locator uses. The smooth aggregate throughput shows that, as\nsuggested by the sequence number graphs, there is seamless connectivity\nbetween network transitions. Note that the locators listed refer to the\nlocator the MN is connected to, even for the throughputs on the\nCN.</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Throughput in 1s buckets vs Time on CN\" src=\"./images/network-layer-mobility/graphs/exp3/throughput-in-1s-buckets-vs-time-on-cn.svg\">\n<img alt=\"Throughput in 1s buckets vs Time on MN\" src=\"./images/network-layer-mobility/graphs/exp3/throughput-in-1s-buckets-vs-time-on-mn.svg\">\n\n\nThroughput in 1s buckets vs Time on\nCN\nThroughput in 1s buckets vs Time on\nMN\n\n\n\n<h2>System stability issues</h2>\n<p><span>An interesting hardware\nproblem was encountered when performing experiments with the overlay\nnetwork on the Raspberry Pi testbed that caused system stability\nissues.</span></p>\n<p><span>Taking experiment 3 as an\nexample, the received sequence numbers were mostly linear, but there\nwere horizontal gaps and sometimes subsequent spikes (likely due to\nbuffering on one of the nodes):</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Received sequence numbers vs time on CN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/received-sequence-numbers-vs-time-on-cn.svg\">\n<img alt=\"Received sequence numbers vs time on MN\" 
src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/received-sequence-numbers-vs-time-on-mn.svg\">\n\n\nReceived sequence numbers vs time on\nCN\nReceived sequence numbers vs time on\nMN\n\n\n\n<p><span>There was no loss,\nhowever.</span></p>\n<p><span>This issue could be seen a\nlot more clearly in the throughput graphs:</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Throughput in 1s buckets vs Time on CN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/throughput-in-1s-buckets-vs-time-on-cn.svg\">\n<img alt=\"Throughput in 1s buckets vs Time on MN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/throughput-in-1s-buckets-vs-time-on-mn.svg\">\n\n\nThroughput in 1s buckets vs Time on\nCN\nThroughput in 1s buckets vs Time on\nMN\n\n\n\n<p><span>There are drops in\nthroughput, corresponding to horizontal gaps in the graph, and sometimes\nsubsequent spikes, corresponding to the spikes in received sequence\nnumbers.</span></p>\n<p><span>As the main focus of this\nproject is networking, the network was the first place the problem was\nassumed to lie, as a scheduling or buffering issue. But the UDP\nsend was not blocking, and the threading and thread synchronisation were\nworking perfectly. Pinning the process to a specific CPU core\nwith <code>$ taskset 0x1 &lt;program&gt;</code> did not help. Using\n<code>tcpdump</code> showed the same gaps in packets sent and received\non the CN, router, and MN.</span></p>\n<p><span>Running <code>top</code> on\nthe Pi during an experiment showed that when system issues occurred (printed\nas a warning by the experiment program) the process was in a ‘D’ state.\nThis means it was in an uninterruptible sleep due to I/O; the sleep\ncannot be interrupted because data corruption could otherwise occur. As network issues were already ruled out,\nthe only other I/O was logging. A long D state seems to be a common\nissue in Network File Systems (NFS), but that is not used here. 
A system\nrequest to display the list of blocked (D state) tasks with\n<code>echo w &gt; /proc/sysrq-trigger</code> was made when the process\nwas running. The relevant section of the kernel log from this\nis:</span></p>\n<pre><code>$ dmesg\n...\n[6367695.195711] sysrq: Show Blocked State\n[6367695.199742] task PC stack pid father\n[6367695.199791] jbd2/mmcblk0p2- D 0 824 2 0x00000028\n[6367695.199801] Call trace:\n[6367695.199818] __switch_to+0x108/0x1c0\n[6367695.199828] __schedule+0x328/0x828\n[6367695.199835] schedule+0x4c/0xe8\n[6367695.199843] io_schedule+0x24/0x90\n[6367695.199850] bit_wait_io+0x20/0x60\n[6367695.199857] __wait_on_bit+0x80/0xf0\n[6367695.199864] out_of_line_wait_on_bit+0xa8/0xd8\n[6367695.199872] __wait_on_buffer+0x40/0x50\n[6367695.199881] jbd2_journal_commit_transaction+0xdf0/0x19f0\n[6367695.199889] kjournald2+0xc4/0x268\n[6367695.199897] kthread+0x150/0x170\n[6367695.199904] ret_from_fork+0x10/0x18\n[6367695.199957] kworker/1:1 D 0 378944 2 0x00000028\n[6367695.199984] Workqueue: events dbs_work_handler\n[6367695.199990] Call trace:\n[6367695.199998] __switch_to+0x108/0x1c0\n[6367695.200004] __schedule+0x328/0x828\n[6367695.200011] schedule+0x4c/0xe8\n[6367695.200019] schedule_timeout+0x15c/0x368\n[6367695.200026] wait_for_completion_timeout+0xa0/0x120\n[6367695.200034] mbox_send_message+0xa8/0x120\n[6367695.200042] rpi_firmware_transaction+0x6c/0x110\n[6367695.200048] rpi_firmware_property_list+0xbc/0x178\n[6367695.200055] rpi_firmware_property+0x78/0x110\n[6367695.200063] raspberrypi_fw_set_rate+0x5c/0xd8\n[6367695.200070] clk_change_rate+0xdc/0x500\n[6367695.200077] clk_core_set_rate_nolock+0x1cc/0x1f0\n[6367695.200084] clk_set_rate+0x3c/0xc0\n[6367695.200090] dev_pm_opp_set_rate+0x3d4/0x520\n[6367695.200096] set_target+0x4c/0x90\n[6367695.200103] __cpufreq_driver_target+0x2c8/0x678\n[6367695.200110] od_dbs_update+0xc4/0x1a0\n[6367695.200116] dbs_work_handler+0x48/0x80\n[6367695.200123] process_one_work+0x1c4/0x460\n[6367695.200129] 
worker_thread+0x54/0x428\n[6367695.200136] kthread+0x150/0x170\n[6367695.200142] ret_from_fork+0x10/0x1\n[6367695.200155] python3 D 0 379325 379321 0x00000000\n[6367695.200163] Call trace:\n[6367695.200170] __switch_to+0x108/0x1c0\n[6367695.200177] __schedule+0x328/0x828\n[6367695.200184] schedule+0x4c/0xe8\n[6367695.200190] io_schedule+0x24/0x90\n[6367695.200197] bit_wait_io+0x20/0x60\n[6367695.200204] __wait_on_bit+0x80/0xf0\n[6367695.200210] out_of_line_wait_on_bit+0xa8/0xd8\n[6367695.200217] do_get_write_access+0x438/0x5e8\n[6367695.200224] jbd2_journal_get_write_access+0x6c/0xc0\n[6367695.200233] __ext4_journal_get_write_access+0x40/0xa8\n[6367695.200241] ext4_reserve_inode_write+0xa8/0xf8\n[6367695.200248] ext4_mark_inode_dirty+0x68/0x248\n[6367695.200255] ext4_dirty_inode+0x54/0x78\n[6367695.200262] __mark_inode_dirty+0x268/0x4a8\n[6367695.200269] generic_update_time+0xb0/0xf8\n[6367695.200275] file_update_time+0xf8/0x138\n[6367695.200284] __generic_file_write_iter+0x94/0x1e8\n[6367695.200290] ext4_file_write_iter+0xb4/0x338\n[6367695.200298] new_sync_write+0x104/0x1b0\n[6367695.200305] __vfs_write+0x78/0x90\n[6367695.200312] vfs_write+0xe8/0x1c8\n[6367695.200318] ksys_write+0x7c/0x108\n[6367695.200324] __arm64_sys_write+0x28/0x38\n[6367695.200330] el0_svc_common.constprop.0+0x84/0x218\n[6367695.200336] el0_svc_handler+0x38/0xa0\n[6367695.200342] el0_svc+0x10/0x2d4</code></pre>\n<p><span>Looking at the\n<code>python3</code> task stacktrace:</span></p>\n<ul>\n<li><p><span><code>jbd2</code> is\nthe thread that updates the filesystem journal, and <code>ext4</code> is\nthe default file system on Ubuntu (as well as on a lot of other\ndistributions).</span></p></li>\n<li><p><span>We can see that an\ninode is marked as dirty with <code>ext4_mark_inode_dirty</code>, and a\nfile written with <code>ext4_file_write_iter</code>, and then a virtual\nfile system write <code>vfs_write</code> is translated into an ARM write\n<code>__arm64_sys_write</code>.</span></p>\n<p><span>So 
this is happening\nduring a file write.</span></p></li>\n<li><p><span>In ARM,\n<code>svc</code> means supervisor call, and <code>el0</code> means exception\nlevel 0 (the lowest exception level), so some sort of exception\noccurs and is then handled with\n<code>el0_svc_handler</code>.</span></p></li>\n</ul>\n<p><span>Running\n<code>strace -r -t -v -p &lt;PID of process&gt;</code>, we can see the\nwrites that take an exceptionally long amount of time. Here is an\nexample where the write of 228 bytes to file descriptor 5 executes\nsuccessfully but takes 2.24 seconds to complete:</span></p>\n<pre><code>21:47:28.684124 (+ 0.000226) write(7, &quot;2021-04-10 21:47:28.684061 [0:0:&quot;..., 194) = 194\n21:47:28.684381 (+ 0.000256) write(1, &quot;2021-04-10 21:47:28.684308 [alic&quot;..., 122) = 122\n21:47:28.684583 (+ 0.000202) write(1, &quot;\\n&quot;, 1) = 1\n21:47:28.684786 (+ 0.000202) pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=5647000}, NULL) = 0 (Timeout)\n21:47:28.690796 (+ 0.006023) pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=0}, NULL) = 0 (Timeout)\n21:47:30.930965 (+ 2.240200) write(5, &quot;2021-04-10 21:47:30.930813 0:0:0&quot;..., 228) = 228\n21:47:30.931427 (+ 0.000433) getuid() = 1000\n21:47:30.931812 (+ 0.000385) socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 9\n21:47:30.932142 (+ 0.000328) ioctl(9, SIOCGIFINDEX, {ifr_name=&quot;eth0&quot;, }) = 0\n21:47:30.932506 (+ 0.000364) close(9) = 0\n21:47:30.933208 (+ 0.000705) write(4, &quot;2021-04-10 21:47:30.933090 [ff12&quot;..., 348) = 348</code></pre>\n<p><span>So the problem seems to be\nexceptions that sometimes occur during file writes, which take a long\ntime to resolve. These block the process executing by putting it in a D\nstate until the write returns, affecting the system stability. These\nexceptions being the cause would make sense, as these issues aren’t\noccurring consistently, but rather intermittently. 
This is happening on\nthe MN, on the router, and on the CN, so its effect is being amplified 3\ntimes. These exceptions are likely due to the page cache being flushed\nto disk, combined with poor performance of the Pis’ SD cards. But\nfinding the root cause would require more investigation. Regardless,\nenough is now known to fix the problem.</span></p>\n<p><span>Removing the logging\nimproved the system stability, but the issues still occurred with\nreduced frequency. This is because the experimental log is written to\n<code>stdout</code>, and <code>stdout</code> is redirected to a file on\ndisk.</span></p>\n<p><span>The program was being run\non the Pis through SSH, redirecting <code>stdout</code> to a file, like\nthis:</span></p>\n<pre><code>$ ssh HOST &quot;RUN &gt; EXPERIMENT_LOG_FILE&quot;</code></pre>\n<p><span>Changing this\nto:</span></p>\n<pre><code>$ ssh HOST &quot;RUN | cat &gt; EXPERIMENT_LOG_FILE&quot;</code></pre>\n<p><span>fixed the issue once and\nfor all.</span></p>\n<p><span>This essentially spawns\nanother process to write to the file, and lets the pipe buffer data between\nthem. 
When an I/O exception occurs the writing process is put in a D\nstate until the exception is handled, but the Python process is\nunaffected as its output is buffered until the writing process is able\nto read from it again.</span></p>\n<h2>Conclusion</h2>\n<p><span>This project has involved creating an\nILNP overlay network, focusing on protocol design and operation;\nperforming an experimental analysis with resource-constrained IoT\ndevices; and demonstrating the protocol’s support for mobility with\nseamless network transitions through the use of a soft\nhandoff.</span></p>\n<p><span>The limitations of this project are the\nperformance of the program due to the overlay and use of Python; the\nscaling of the discovery protocol; that only one application program is\nsupported for a virtual network stack, as it runs in a single process\nwithout IPC; and that only one instance of the program can be run on a\nmachine, due to the multicast UDP socket used by each instance of the\nprogram being bound to the same port.</span></p>\n<p><span>Further work in this area\nincludes:</span></p>\n<ul>\n<li>experimenting with a kernel implementation of ILNPv6 on IoT\ndevices</li>\n<li>investigating a multihoming policy and the benefits gained from the\nmultipath effect for IoT devices</li>\n<li>performing experiments with IoT devices transitioning between networks\nusing a wireless communication link layer such as IEEE 802.11/WiFi, as\nthis is more appropriate than Ethernet for an IoT context</li>\n<li>performing experiments with two mobile nodes communicating</li>\n<li>performing experiments with even more resource-constrained devices\nthan Raspberry Pis, such as wireless sensor nodes</li>\n</ul>\n\n\n<p><span>As mentioned at the start, see the <a href=\"papers/2021-bsc-ubicomm.pdf\">dissertation</a> on which this blog\nwas based for a bit more nuance, and a lot more detail.</span></p>\n<p><span>If you have any questions or comments on\nthis feel free to <a href=\"./about.html#contact\">get 
in\ntouch</a>.</span></p>", "content": "<div>\n \n <span>Published 10 May 2021.</span>\n \n \n <span>Last update 24 Mar 2025.</span>\n \n </div>\n \n <div> Tags: <a href=\"/projects.html\" title=\"All pages tagged 'projects'.\">projects</a>, <a href=\"/research.html\" title=\"All pages tagged 'research'.\">research</a>. </div>\n \n \n\n \n<p><span>My undergraduate dissertation, “Ubiquitous\nCommunication for the Internet of Things: An Identifier-Locator\naddressing split overlay network”, explored how we can modify the\nInternet Protocol to better support resource-constrained highly mobile\nembedded devices. A copy can be found <a href=\"papers/2021-bsc-ubicomm.pdf\">here</a> (or <a href=\"https://studres.cs.st-andrews.ac.uk/Library/ProjectLibrary/cs4099/2021/rtg2-Final_Report.pdf\">here</a>\nfor St Andrews folk), and the associated implementation can be found at\n<a href=\"https://github.com/RyanGibb/ilnp-overlay-network\">ilnp-overlay-network</a>.</span></p>\n<h2>Network stack</h2>\n<p><span>First, some prerequisite networking\nknowledge. A network protocol stack is a view of how protocols are\norganised into layers. The <a href=\"https://en.wikipedia.org/wiki/OSI_model\">OSI model</a> describes\nnamed layers, including the physical, link, network, transport, and\napplication layers. Note the OSI model and TCP/IP have <a href=\"https://en.wikipedia.org/wiki/Internet_protocol_suite#Layer_names_and_number_of_layers_in_the_literature\">differing\nsemantics</a>, but this is beyond the scope of this blog post. 
The\nmodularity of protocols in a network stack has many advantages, such as\nallowing the protocol used at a layer to be exchanged\ntransparently.</span></p>\n<p><span>The protocol stack we’re concerned\nwith is based off the <a href=\"https://en.wikipedia.org/wiki/Internet_protocol_suite\">Internet\nProtocol suite</a>, also known as TCP/IP.</span></p>\n<p><span>This network stack is often referred\nto as an hourglass, with the Internet Protocol (IP) as the skinny\nwaist.</span></p>\n\n\n<img alt=\"Steve Deering. “Watching the Waist of the Protocol Hourglass”. In: IETF 51 London. 2001.\" src=\"./images/network-layer-mobility/diagrams/hourglass-cropped.svg\">\n\n<a href=\"https://people.cs.pitt.edu/~znati/Courses/WANs/Dir-Rel/Pprs/hourglass-london-ietf.pdf\">Steve\nDeering. “Watching the Waist of the Protocol Hourglass”. In: IETF 51\nLondon. 2001.</a>\n\n<p><span>Packets of a protocol are\nencapsulated by the protocol below, for example:</span></p>\n\n\n<img alt=\"Wikimedia UDP encapsulation.svg\" src=\"./images/network-layer-mobility/diagrams/udp-encapsulation.svg\">\n\n<a href=\"https://commons.wikimedia.org/wiki/File:UDP_encapsulation.svg\">Wikimedia\nUDP encapsulation.svg</a>\n\n<h2>Motivation</h2>\n<p><span>Ubiquitous Computing is a vision of the\nfuture of computing where devices are omnipresent and exist in many\nforms. The Internet of Things (IoT) is a modern interpretation of this\nwhich envisions many objects existing as Internet-connected smart\ndevices; such as wearable devices, smart vehicles, and smart appliances\nlike fridges, washing machines, and ovens. Many of these devices are\nphysically mobile, which requires network support when moving\nlocation.</span></p>\n<p><span>When we say network mobility in this\nblog, what we are in fact referring to is network layer (layer 3)\nmobility. This is also known as a vertical handoff, where the underlying\nlink layer technology can change, like moving from a WiFi to a cellular\nnetwork. 
This is to distinguish it from link layer (layer 2) mobility -\nhorizontal handoffs - where the link layer technology and layer 3\nnetwork remain the same but the network access point changes, such as\nwhen moving between cells in a mobile cellular network. Layer 2 mobility\nis insufficient when a mobile device moves between link layer\ntechnologies or layer 3 networks.</span></p>\n<p><span>Some examples of mobile IoT devices\nwould be health monitoring devices and smart vehicles. These devices may\nrequire constant connectivity with a fast-changing large number of\nnetwork connectivity options available, particularly in urban\nenvironments. For example, a health monitoring device might switch from a\ncellular network to a WiFi network when entering an office building where\nno cellular signal is available.</span></p>\n<p><span>The largest solution space for this at\nthe moment is implementing mobility through IoT middleware applications.\nMiddleware, sitting in the application layer, provides a platform for\ncommon functionality, including mobility. Such a solution is much easier\nto deploy than reworking the networking stack.\nHowever, it requires the application software to be written for and tied\nto a specific middleware API, which is rarely standardised. It also adds\na layer to the node’s network stack, with performance and\nenergy use implications, which are particularly relevant to\nresource-constrained IoT devices.</span></p>\n<p><span>Ideally, what we want is network support\nfor mobility transparent to the application layer. If we were able to\nimplement mobility at the network layer it would solve our\nproblems!</span></p>\n<h2>Mobility in IP</h2>\n<p><span>As we’ve discussed, IP is the skinny\nwaist of the Internet. 
It ties all the other protocols together allowing\nnodes (computers in a network) to communicate over interoperating\nnetworks with potentially different underlying technologies.</span></p>\n<p><span>IP was designed in 1981. In the same\nyear, IBM introduced its Personal Computer (PC) weighing over 9kg.\nToday, many mobile computers exist in the form of personal smartphones,\nin addition to the IoT devices already discussed. IP was not designed\nfor such mobile devices and does not support mobility.</span></p>\n<p><span>There are two issues with IP\npertaining to mobility.</span></p>\n<p><span>The first is the\n<em>overloading</em> of IP address semantics. IP addresses are used to\nidentify a node’s location in the Internet with routing prefixes and to\nuniquely identify a node in some scope. This becomes an issue for\nmobility when a node changes its location in the network as it also has\nto change its IP address.</span></p>\n<p><span>This wouldn’t be an issue in and of\nitself if a transport (layer 4) flow could dynamically adjust to a new\nIP address, which brings us to the second issue with IP addresses: the\n<em>entanglement</em> of layers. All layers of the TCP/IP stack use IP\naddresses, and IP addresses are semi-permanently bound to an\ninterface.</span></p>\n<p><span>These issues together mean that when\nmoving network all existing communication flows have to be\nreestablished. This results in application-specific logic being required\nto deal with network transitions. This has performance and energy use\nimplications due to dropped packets when switching networks and having\nto reestablish communication sessions. For example, TCP’s 3-way\nhandshake has to be re-done, and cryptographic protocols like TLS have\nto redo their key exchange. 
The more resource-constrained a device, such\nas IoT devices, and the more continuous connectivity is required, the\nmore important these considerations become.</span></p>\n<h2>ILNP</h2>\n<p><span>As IP was not designed with mobility in mind,\nmost solutions try to retrofit mobility onto IP somehow, such as the\nmiddleware platforms already discussed. This is symptomatic of a larger\nproblem: the ossification of the Internet. It’s easier to build up (in\nthe protocol stack) than to redesign it, especially when the protocol\nstack is as omnipresent and critical as the modern Internet. A radical\nchange in IP’s addressing from the Identifier-Locator Network Protocol\n(ILNP) architecture provides a solution to this mobility problem by\nseparating the semantics of IP addresses into their constituent parts:\nan identifier and a locator. An identifier uniquely identifies the node\n- within some scope - and the locator identifies the network in which\nthe node resides, giving the node’s location in the Internet. See <a href=\"https://tools.ietf.org/html/rfc6740\">RFC6740</a> for more\ndetail.</span></p>\n<p><span>The overloading of IP addresses is solved with\nthis Identifier-Locator addressing split. This also allows us to solve\nthe entanglement of layers:</span></p>\n\n\n<img alt=\"S. N. Bhatti and R. Yanagida. “Seamless internet connectivity for ubiquitous communication” In: PURBA UBICOMP. 2019\" src=\"./images/network-layer-mobility/diagrams/ilnp-ipv6-names-cropped.svg\">\n\n<a href=\"https://dl.acm.org/doi/abs/10.1145/3341162.3349315\">S. N. Bhatti\nand R. Yanagida. “Seamless internet connectivity for ubiquitous\ncommunication” In: PURBA UBICOMP. 
2019</a>\n\n<p><span>Applications that use DNS to obtain IP\naddresses (conforming to <a href=\"https://tools.ietf.org/html/rfc1958#section-4\">RFC1958</a>) will\nbe backwards compatible with ILNPv6 with modifications to DNS (<a href=\"https://tools.ietf.org/html/rfc6742\">RFC6742</a>).</span></p>\n<p><span>ILNP can be implemented as an extension to\nIPv6, called ILNPv6. ILNP can also be implemented as an extension to\nIPv4 as ILNPv4, but this is not as elegant as ILNPv6 and will not be\nconsidered here. The upper 64 bits of an IPv6 address are already used as\na routing prefix and are taken as the locator in ILNPv6. The lower 64\nbits, the interface identifier in IPv6, are taken as the identifier.\nILNPv6’s Identifier-Locator Vector (I-LV) corresponds to the IPv6\naddress. The syntax is identical but the semantics differ. That is, IPv6\naddresses and ILNPv6 I-LVs look the same on the wire but are interpreted\ndifferently.</span></p>\n\n\n<img alt=\"RFC6741\" src=\"./images/network-layer-mobility/diagrams/ilnp-ipv6-addresses-cropped.svg\">\n\n<a href=\"https://tools.ietf.org/html/rfc6741#section-3.1\">RFC6741</a>\n\n<p><span>So given an IPv6 address\n“2001:db8:1:2:3:4:5:6”, the ILNPv6 locator would be “2001:db8:1:2” and\nthe identifier “3:4:5:6”.</span></p>\n<p><span>ILNPv6 supports mobility through dynamic\nbinding of identifiers to locators, and ICMP locator update messages.\nThe locator of a node can change while retaining its identifier and\ncommunication flows. Additionally, ILNPv6 supports seamless connectivity\nduring a network transition with a soft handoff - making the new\nconnection before breaking the old connection. Note that this does\nrequire hardware support for multiple connections on the same adapter,\nsuch as through CDMA, or two physical network adapters.</span></p>\n<p><span><a href=\"https://tools.ietf.org/html/rfc6115\">RFC6115</a> contains a survey\nof other solutions available. 
Unlike the alternatives, ILNPv6 requires\nupdates to the end hosts only, and does not require a proxy or agent,\ntunnelling, address mapping, or application modifications. The\ndisadvantage of this approach is that it requires a reworking of the\nwhole network stack, which makes it more difficult to deploy.</span></p>\n<p><span>ILNP also supports other functionality of\nbenefit to IoT devices, such as multihoming and locator rewriting relays\n(LRRs). Multihoming refers to connecting a node to more than one network\nwhich enables a device to exploit any connectivity available. This is\nsupported by ILNP through allowing transport flows to use multiple\nlocators simultaneously via a dynamic binding of identifiers to\nlocators. LRRs are middleboxes that rewrite locators for privacy and\nsecurity benefits similar to those provided by NAT without breaking the\nend-to-end principle.</span></p>\n<h2>Overlay network</h2>\n<p><span>An overlay network is a ‘virtual’\nnetwork built on another network. Think <a href=\"https://www.torproject.org/\">tor</a>. An underlay network is the\nunderlying network beneath an overlay network.</span></p>\n<p><span>To demonstrate the operation of the\nprotocol and its support for mobility an ILNPv6 overlay network was\ncreated on top of UDP/IPv6 Multicast. 
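</span></p>\n<p><span>As a minimal sketch of what this involves at the socket level (not the dissertation’s actual code), the Python below opens a UDP/IPv6 socket and joins one multicast group using the standard <code>socket</code> module. The function names, the group <code>ff12::a</code>, the port, and the interface name are assumptions chosen for illustration.</span></p>\n<pre><code>import socket\nimport struct\n\n\ndef make_mreq(group, ifindex):\n    # struct ipv6_mreq: 16-byte group address followed by an interface index.\n    return socket.inet_pton(socket.AF_INET6, group) + struct.pack('@I', ifindex)\n\n\ndef open_group_socket(group, port, ifname):\n    # Hypothetical helper: one UDP/IPv6 socket joined to one multicast group.\n    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)\n    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)\n    sock.bind(('::', port))\n    ifindex = socket.if_nametoindex(ifname)\n    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, make_mreq(group, ifindex))\n    return sock</code></pre>\n<p><span>For example, <code>open_group_socket('ff12::a', 10000, 'eth0')</code> would return a socket that receives datagrams sent to that group on <code>eth0</code>, assuming such an interface exists.</span></p>\n<p><span>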
An IPv6 multicast group\ncorresponds to a locator in our overlay network, or a ‘virtual network’.\nThere is a mechanical translation between 32-bit locators and 64-bit\nIPv6 multicast groups.</span></p>\n<p><span>This overlay network was\nimplemented in user space with Python due to time constraints of the\nproject and difficulties associated with kernel programming.</span></p>\n<p><span>A simple transport protocol (STP)\nwas created for demultiplexing received ILNPv6 packets by wrapping them\nwith a port, similar to UDP.</span></p>\n\n\n<img alt=\"Overlay network protocol stack\" src=\"./images/network-layer-mobility/diagrams/overlay-network-stack.svg\">\n\nOverlay network protocol\nstack\n\n<p><span>Note that in our overlay network,\nfor a node, an interface simply refers to a locator which the node is\nconnected to, via configuration files. The node will have connected to\nthe corresponding IP multicast address.</span></p>\n<h2>Discovery protocol</h2>\n<p><span>A discovery protocol was\nrequired for nodes to discover each other and to discover routing paths.\nIt is inspired by the IPv6 Neighbour Discovery Protocol. Nodes send\nsolicitations (requests for advertisements) and advertisements\n(responses to solicitations). Both solicitations and advertisements\ncontain a node’s hostname, set of valid locators, and identifier. This\nmeans that hostname resolution is included in our protocol, which was\ndone to avoid the complications of a DNS deployment in our\noverlay.</span></p>\n<p><span>A simple flood and backwards\nlearn approach was taken. When a node receives a discovery protocol\nmessage on an interface it forwards it to every other interface. This\nrelies on the ILNPv6 hop count being decremented to avoid infinitely\nlooping packets in circular topologies. 
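</span></p>\n<p><span>The flooding rule just described can be sketched in a few lines of Python. This is an illustration rather than the project’s implementation: the dictionary message format, the <code>hops</code> field name, and the <code>flood</code> function are hypothetical.</span></p>\n<pre><code>def flood(message, received_on, interfaces):\n    # Decrement the hop count; a message whose count is exhausted is dropped,\n    # which bounds forwarding in circular topologies.\n    remaining = message['hops'] - 1\n    if max(remaining, 0) == 0:\n        return []\n    forwarded = dict(message, hops=remaining)\n    # Forward a copy on every interface except the one it arrived on.\n    return [(iface, forwarded) for iface in interfaces if iface != received_on]</code></pre>\n<p><span>The function returns the list of (interface, message) pairs to transmit, so the caller decides how they are actually sent.</span></p>\n<p><span>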
Nodes eavesdrop on discovery\nprotocol messages so one solicitation is sufficient for all nodes in a\nnetwork to learn about all the others.</span></p>\n<p><span>Discovery protocol messages are\nsent to a special ILNPv6 all nodes locator - essentially local broadcast\nin a virtual network. Forwarding happens at the discovery protocol\nlayer, not the ILNPv6 layer.</span></p>\n<p><span>Backwards learning is done on\nthese discovery protocol messages; when an ILNPv6 packet is received the\nforwarding table is updated mapping the source locator of the packet to\nthe interface it was received on. This means the discovery protocol\nserves to bootstrap the network by populating the forwarding\ntable.</span></p>\n<p><span>This protocol scales poorly -\nthe number of messages grows quadratically with the number of\nnetworks containing a node - but it is sufficient for our\npurposes.</span></p>\n<p><span>See an example operation of the\nprotocol below. Node A is in network 1, node B in network 2, and node C\nin both networks.</span></p>\n\n\n<img alt=\"Discovery protocol example topology\" src=\"./images/network-layer-mobility/diagrams/discovery-protocol-topology.svg\">\n\nDiscovery protocol example\ntopology\n\n\n\n<img alt=\"Discovery protocol example sequence diagram\" src=\"./images/network-layer-mobility/diagrams/discovery-protocol-sequence-diagram.svg\">\n\nDiscovery protocol example sequence\ndiagram\n\n<h2>Locator updates</h2>\n<p><span>Our overlay network supports\nmobility with locator update messages as part of the ILNPv6 layer. The\nmobile node (MN) sends a locator update over its old locator, and the\ncorrespondent node (CN) responds with an acknowledgement via the new\nlocator - verifying that a path exists between the new locator and the\nCN.</span></p>\n<p><span>The discovery message sent by the\nMN on the new locator is simply for path discovery as the CN will not\nknow how to route to 0:0:0:c with no node sending discovery messages\nfrom that locator. 
An alternative solution to this would have been to\nmake nodes send packets to all connected interfaces if there is no\nmapping in the forwarding table.</span></p>\n<p><span>See an example of a MN moving from\nlocator 0:0:0:a to locator 0:0:0:c, in a communication session with a CN\nin locator 0:0:0:b, below:</span></p>\n\n\n<img alt=\"locator update example topology\" src=\"./images/network-layer-mobility/diagrams/locator-update-topology.svg\">\n\nlocator update example\ntopology\n\n\n\n<img alt=\"locator update example sequence diagram\" src=\"./images/network-layer-mobility/diagrams/locator-update-sequence-diagram.svg\">\n\nlocator update example sequence\ndiagram\n\n<h2>Experiments</h2>\n<p><span>To demonstrate the operation of the\noverlay network on resource-constrained IoT devices a Raspberry Pi\ntestbed communicating via ethernet was used. Previous work in this area\nhas been confined to workstation or server machines.</span></p>\n<p><img src=\"./images/network-layer-mobility/testbed.jpg\"></p>\n<p><span>The virtual network topology was 3\nnetworks that the MN moved between every 20 seconds, one of which the CN\nresided in.</span></p>\n<p><img src=\"./images/network-layer-mobility/diagrams/experiment.svg\"></p>\n<p><span>The experimental application sent an\nMTU packet with a sequence number every 10ms from the MN to CN, and CN\nto MN, resulting in a throughput of 266.6kB/s.</span></p>\n<p><span>Looking at the received sequence by the\nCN we can see that there’s no loss or misordering - just a smooth\nseamless line with a constant gradient. 
The dotted vertical lines show\nthe network transitions.</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Received sequence numbers vs time on CN\" src=\"./images/network-layer-mobility/graphs/exp3/received-sequence-numbers-vs-time-on-cn.svg\">\n<img alt=\"Received sequence numbers vs time on MN\" src=\"./images/network-layer-mobility/graphs/exp3/received-sequence-numbers-vs-time-on-mn.svg\">\n\n\nReceived sequence numbers vs time on\nCN\nReceived sequence numbers vs time on\nMN\n\n\n\n<p><span>Looking at the throughputs we can see\ndiscrete rectangles for each individual locator showing the separation\nbetween locator uses. The smooth aggregate throughput shows that, as\nsuggested by the sequence number graphs, there is seamless connectivity\nbetween network transitions. Note that the locators listed refer to the\nlocator the MN is connected to, even for the throughputs on the\nCN.</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Throughput in 1s buckets vs Time on CN\" src=\"./images/network-layer-mobility/graphs/exp3/throughput-in-1s-buckets-vs-time-on-cn.svg\">\n<img alt=\"Throughput in 1s buckets vs Time on MN\" src=\"./images/network-layer-mobility/graphs/exp3/throughput-in-1s-buckets-vs-time-on-mn.svg\">\n\n\nThroughput in 1s buckets vs Time on\nCN\nThroughput in 1s buckets vs Time on\nMN\n\n\n\n<h2>System stability issues</h2>\n<p><span>An interesting hardware\nproblem was encountered when performing experiments with the overlay\nnetwork on the Raspberry Pi testbed that caused system stability\nissues.</span></p>\n<p><span>Taking experiment 3 as an\nexample, the received sequence numbers were mostly linear, but there\nwere horizontal gaps and sometimes subsequent spikes (likely due to\nbuffering on one of the nodes):</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Received sequence numbers vs time on CN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/received-sequence-numbers-vs-time-on-cn.svg\">\n<img alt=\"Received sequence numbers vs time on MN\" 
src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/received-sequence-numbers-vs-time-on-mn.svg\">\n\n\nReceived sequence numbers vs time on\nCN\nReceived sequence numbers vs time on\nMN\n\n\n\n<p><span>There was no loss,\nhowever.</span></p>\n<p><span>This issue could be seen a\nlot more clearly in the throughput graphs:</span></p>\n\n\n\n\n\n\n\n\n<img alt=\"Throughput in 1s buckets vs Time on CN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/throughput-in-1s-buckets-vs-time-on-cn.svg\">\n<img alt=\"Throughput in 1s buckets vs Time on MN\" src=\"./images/network-layer-mobility/systems-issues-graphs/exp3/throughput-in-1s-buckets-vs-time-on-mn.svg\">\n\n\nThroughput in 1s buckets vs Time on\nCN\nThroughput in 1s buckets vs Time on\nMN\n\n\n\n<p><span>There are drops in\nthroughput, corresponding to horizontal gaps in the graph, and sometimes\nsubsequent spikes, corresponding to the spikes in received sequence\nnumbers.</span></p>\n<p><span>As the main focus of this\nproject is networking, the problem was first assumed to lie there,\nas a scheduling or buffering issue. But the UDP\nsend was not blocking, and the threading and thread synchronisation were\nworking perfectly. Pinning the process to a specific CPU core\nwith <code>$ taskset 0x1 &lt;program&gt;</code> was tried, to no avail. Using\n<code>tcpdump</code> showed the same gaps in packets sent and received\non the CN, router, and MN.</span></p>\n<p><span>Running <code>top</code> on\nthe Pi during an experiment showed that when systems issues occurred (printed\nas a warning by the experiment program) the process was in a ‘D’ state.\nThis means it was in an uninterruptible sleep waiting on I/O, as\ninterrupting it could cause data corruption. As network issues were already ruled out,\nthe only other I/O was logging. A long D state seems to be a common\nissue in Network File Systems (NFS), but that is not used here. 
A system\nrequest to display the list of blocked (D state) tasks with\n<code>echo w &gt; /proc/sysrq-trigger</code> was made when the process\nwas running. The relevant section of the kernel log from this\nis:</span></p>\n<pre><code>$ dmesg\n...\n[6367695.195711] sysrq: Show Blocked State\n[6367695.199742] task PC stack pid father\n[6367695.199791] jbd2/mmcblk0p2- D 0 824 2 0x00000028\n[6367695.199801] Call trace:\n[6367695.199818] __switch_to+0x108/0x1c0\n[6367695.199828] __schedule+0x328/0x828\n[6367695.199835] schedule+0x4c/0xe8\n[6367695.199843] io_schedule+0x24/0x90\n[6367695.199850] bit_wait_io+0x20/0x60\n[6367695.199857] __wait_on_bit+0x80/0xf0\n[6367695.199864] out_of_line_wait_on_bit+0xa8/0xd8\n[6367695.199872] __wait_on_buffer+0x40/0x50\n[6367695.199881] jbd2_journal_commit_transaction+0xdf0/0x19f0\n[6367695.199889] kjournald2+0xc4/0x268\n[6367695.199897] kthread+0x150/0x170\n[6367695.199904] ret_from_fork+0x10/0x18\n[6367695.199957] kworker/1:1 D 0 378944 2 0x00000028\n[6367695.199984] Workqueue: events dbs_work_handler\n[6367695.199990] Call trace:\n[6367695.199998] __switch_to+0x108/0x1c0\n[6367695.200004] __schedule+0x328/0x828\n[6367695.200011] schedule+0x4c/0xe8\n[6367695.200019] schedule_timeout+0x15c/0x368\n[6367695.200026] wait_for_completion_timeout+0xa0/0x120\n[6367695.200034] mbox_send_message+0xa8/0x120\n[6367695.200042] rpi_firmware_transaction+0x6c/0x110\n[6367695.200048] rpi_firmware_property_list+0xbc/0x178\n[6367695.200055] rpi_firmware_property+0x78/0x110\n[6367695.200063] raspberrypi_fw_set_rate+0x5c/0xd8\n[6367695.200070] clk_change_rate+0xdc/0x500\n[6367695.200077] clk_core_set_rate_nolock+0x1cc/0x1f0\n[6367695.200084] clk_set_rate+0x3c/0xc0\n[6367695.200090] dev_pm_opp_set_rate+0x3d4/0x520\n[6367695.200096] set_target+0x4c/0x90\n[6367695.200103] __cpufreq_driver_target+0x2c8/0x678\n[6367695.200110] od_dbs_update+0xc4/0x1a0\n[6367695.200116] dbs_work_handler+0x48/0x80\n[6367695.200123] process_one_work+0x1c4/0x460\n[6367695.200129] 
worker_thread+0x54/0x428\n[6367695.200136] kthread+0x150/0x170\n[6367695.200142] ret_from_fork+0x10/0x1\n[6367695.200155] python3 D 0 379325 379321 0x00000000\n[6367695.200163] Call trace:\n[6367695.200170] __switch_to+0x108/0x1c0\n[6367695.200177] __schedule+0x328/0x828\n[6367695.200184] schedule+0x4c/0xe8\n[6367695.200190] io_schedule+0x24/0x90\n[6367695.200197] bit_wait_io+0x20/0x60\n[6367695.200204] __wait_on_bit+0x80/0xf0\n[6367695.200210] out_of_line_wait_on_bit+0xa8/0xd8\n[6367695.200217] do_get_write_access+0x438/0x5e8\n[6367695.200224] jbd2_journal_get_write_access+0x6c/0xc0\n[6367695.200233] __ext4_journal_get_write_access+0x40/0xa8\n[6367695.200241] ext4_reserve_inode_write+0xa8/0xf8\n[6367695.200248] ext4_mark_inode_dirty+0x68/0x248\n[6367695.200255] ext4_dirty_inode+0x54/0x78\n[6367695.200262] __mark_inode_dirty+0x268/0x4a8\n[6367695.200269] generic_update_time+0xb0/0xf8\n[6367695.200275] file_update_time+0xf8/0x138\n[6367695.200284] __generic_file_write_iter+0x94/0x1e8\n[6367695.200290] ext4_file_write_iter+0xb4/0x338\n[6367695.200298] new_sync_write+0x104/0x1b0\n[6367695.200305] __vfs_write+0x78/0x90\n[6367695.200312] vfs_write+0xe8/0x1c8\n[6367695.200318] ksys_write+0x7c/0x108\n[6367695.200324] __arm64_sys_write+0x28/0x38\n[6367695.200330] el0_svc_common.constprop.0+0x84/0x218\n[6367695.200336] el0_svc_handler+0x38/0xa0\n[6367695.200342] el0_svc+0x10/0x2d4</code></pre>\n<p><span>Looking at the\n<code>python3</code> task stacktrace:</span></p>\n<ul>\n<li><p><span><code>jbd2</code> is\nthe thread that updates the filesystem journal, and <code>ext4</code> is\nthe default file system of Ubuntu (and of many other\ndistributions)</span></p></li>\n<li><p><span>We can see that an\ninode is marked as dirty with <code>ext4_mark_inode_dirty</code>, and a\nfile written with <code>ext4_file_write_iter</code>, and then a virtual\nfile system write <code>vfs_write</code> is translated into an ARM write\n<code>__arm64_sys_write</code>.</span></p>\n<p><span>So 
this is happening\nduring a file write.</span></p></li>\n<li><p><span>In ARM,\n<code>svc</code> means supervisor call, and <code>el0</code> exception\nlevel 0 (the lowest level of exception), so some sort of exception\noccurs and is then handled with\n<code>el0_svc_handler</code>.</span></p></li>\n</ul>\n<p><span>Running\n<code>strace -r -t -v -p &lt;PID of process&gt;</code>, we can see the\nwrites that take an exceptionally long amount of time. Here is an\nexample where the write of 228 bytes to file descriptor 5 executes\nsuccessfully but takes 2.24 seconds to complete:</span></p>\n<pre><code>21:47:28.684124 (+ 0.000226) write(7, &quot;2021-04-10 21:47:28.684061 [0:0:&quot;..., 194) = 194\n21:47:28.684381 (+ 0.000256) write(1, &quot;2021-04-10 21:47:28.684308 [alic&quot;..., 122) = 122\n21:47:28.684583 (+ 0.000202) write(1, &quot;\\n&quot;, 1) = 1\n21:47:28.684786 (+ 0.000202) pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=5647000}, NULL) = 0 (Timeout)\n21:47:28.690796 (+ 0.006023) pselect6(0, NULL, NULL, NULL, {tv_sec=0, tv_nsec=0}, NULL) = 0 (Timeout)\n21:47:30.930965 (+ 2.240200) write(5, &quot;2021-04-10 21:47:30.930813 0:0:0&quot;..., 228) = 228\n21:47:30.931427 (+ 0.000433) getuid() = 1000\n21:47:30.931812 (+ 0.000385) socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 9\n21:47:30.932142 (+ 0.000328) ioctl(9, SIOCGIFINDEX, {ifr_name=&quot;eth0&quot;, }) = 0\n21:47:30.932506 (+ 0.000364) close(9) = 0\n21:47:30.933208 (+ 0.000705) write(4, &quot;2021-04-10 21:47:30.933090 [ff12&quot;..., 348) = 348</code></pre>\n<p><span>So the problem seems to be\nexceptions that sometimes occur during file writes, which take a long\ntime to resolve. These block the process executing by putting it in a D\nstate until the write returns, affecting the system stability. These\nexceptions being the cause would make sense, as these issues aren’t\noccurring consistently, but rather intermittently. 
This is happening on\nthe MN, on the router, and on the CN; so its effect is being amplified 3\ntimes. These exceptions are likely due to the page cache being flushed\nto disk, combined with poor performance of the Pi’s SD cards. But\nfinding the root cause would require more investigation. Regardless,\nenough is now known to fix the problem.</span></p>\n<p><span>Removing the logging\nimproved the system stability, but the issues still occurred with\nreduced frequency. This is because the experimental log is written to\n<code>stdout</code>, and <code>stdout</code> is redirected to a file on\ndisk.</span></p>\n<p><span>The program was being run\non the Pis through SSH, redirecting <code>stdout</code> to a file, like\nthis:</span></p>\n<pre><code>$ ssh HOST &quot;RUN &gt; EXPERIMENT_LOG_FILE&quot;</code></pre>\n<p><span>Changing this\nto:</span></p>\n<pre><code>$ ssh HOST &quot;RUN | cat &gt; EXPERIMENT_LOG_FILE&quot;</code></pre>\n<p><span>fixed the issue once and\nfor all.</span></p>\n<p><span>This essentially spawns\nanother process to write to the file, and lets the pipe buffer between\nthem. 
When an I/O exception occurs the writing process is put in a D\nstate until the exception is handled, but the Python process is\nunaffected as its output is buffered until the writing process is able\nto read from it again.</span></p>\n<h2>Conclusion</h2>\n<p><span>This project has involved creating an\nILNP overlay network, focusing on protocol design and operation;\nperforming an experimental analysis with resource-constrained IoT\ndevices; and demonstrating the protocol’s support for mobility with\nseamless network transitions through the use of a soft\nhandoff.</span></p>\n<p><span>The limitations of this project are the\nperformance of the program due to the overlay and use of Python; the\nscaling of the discovery protocol; only one application program is\nsupported for a virtual network stack as it runs on a single process\nwithout IPC; and only one instance of the program can be run on a\nmachine, due to the multicast UDP socket used by each instance of the\nprogram being bound to the same port.</span></p>\n<p><span>Further work in this area\nincludes:</span></p>\n<ul>\n<li>experimenting with a kernel implementation of ILNPv6 on IoT\ndevices</li>\n<li>investigating a multihoming policy and the benefits gained from the\nmultipath effect for IoT devices</li>\n<li>performing experiments with IoT devices transitioning between networks\nusing a wireless communication link layer such as IEEE 802.11/WiFi, as\nthis is more appropriate than Ethernet for an IoT context</li>\n<li>performing experiments with two mobile nodes communicating</li>\n<li>performing experiments with even more resource-constrained devices\nthan Raspberry Pis, such as wireless sensor nodes</li>\n</ul>\n\n\n<p><span>As mentioned at the start, see the <a href=\"papers/2021-bsc-ubicomm.pdf\">dissertation</a> on which this blog\nwas based for a bit more nuance, and a lot more detail.</span></p>\n<p><span>If you have any questions or comments on\nthis feel free to <a href=\"./about.html#contact\">get 
in\ntouch</a>.</span></p>", 9 "content_type": "html", 10 "categories": [], 11 "source": "https://ryan.freumh.org/atom.xml" 12}