A Public Option for the Core
Contact author: Scott Shenker
This project is focused not on the Internet architecture -- as defined by layering, the narrow waist of IP, and other core design principles -- but on the Internet infrastructure, as embodied in the technologies and organizations that provide Internet service. We consider both the challenges and the opportunities that make this an auspicious time to revisit how we might best structure the Internet's infrastructure. Currently, the tasks of transit-between-domains and last-mile-delivery are jointly handled by a set of ISPs that interconnect through BGP. Instead, we propose cleanly separating these two tasks. For transit, we propose the creation of a "public option" for the Internet's core backbone. This public option core, which complements rather than replaces the backbones used by large-scale ISPs, would (i) run an open market for backbone bandwidth so it could leverage links offered by third parties, and (ii) structure its terms-of-service to enforce network neutrality so as to encourage competition and reduce the advantage of large incumbents.
Autotune
Michael Alan Chang, Will Wang, Aurojit Panda, Scott Shenker
Most large web-scale applications are now built by composing collections of microservices, sometimes hundreds or even thousands of them. Operators need to decide how many resources are allocated to each microservice, and these allocations can have a large impact on application performance. Manually determining allocations that are both cost-efficient and meet performance requirements is challenging, even for experienced operators. In this paper we present Autotune, an end-to-end tool that automatically minimizes resource utilization while maintaining good application performance.
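As a rough illustration of the kind of search such a tool performs, the toy loop below greedily shrinks per-microservice core allocations while an end-to-end latency objective still holds. The latency model, service names, and SLO are invented for illustration; this is not Autotune's actual algorithm.

```python
def latency(alloc):
    """Stand-in performance model: each service slows down as its cores shrink."""
    return sum(10.0 / cores for cores in alloc.values())

def autotune(alloc, slo):
    """Greedily remove one core at a time anywhere the SLO still holds."""
    improved = True
    while improved:
        improved = False
        for svc in alloc:
            if alloc[svc] > 1:
                alloc[svc] -= 1
                if latency(alloc) <= slo:
                    improved = True        # keep the cheaper allocation
                else:
                    alloc[svc] += 1        # roll back: SLO would be violated
    return alloc

alloc = autotune({"frontend": 8, "cart": 8, "search": 8}, slo=9.0)
```

The rollback step keeps the SLO as an invariant, so the loop terminates at an allocation where no single service can be shrunk further.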
CESSNA
Aisha Mushtaq, Yotam Harchol, Vivian Fang, Murphy McCauley, Aurojit Panda, Scott Shenker
The introduction of computational resources at the network edge allows application designers to offload computation from clients and/or servers, thereby reducing response latency and backbone bandwidth. More fundamentally, edge-computing moves applications from a client-server model to a client-edge-server model. While this is an attractive paradigm for many use cases, it raises the question of how to design client-edge-server systems so they can tolerate edge failures and client mobility. This is particularly challenging when edge processing is strongly stateful. In this work we propose a design for meeting this challenge called the Client-Edge-Server for Stateful Network Applications (CESSNA).
CFM
Emmanuel Amaro, Christopher Branner-Augmon, Zhihong Luo, Amy Ousterhout, Marcos K. Aguilera, Aurojit Panda, Sylvia Ratnasamy, Scott Shenker
As memory requirements grow and advances in memory technology slow, the availability of sufficient main memory is increasingly the bottleneck in large compute clusters. One solution is memory disaggregation, where jobs can remotely access memory on other servers, or far memory. This project presents a faster swapping mechanism and a far-memory-aware cluster scheduler that together (a system we call CFM) make it possible to support far memory at rack scale. While far memory is not a panacea, for memory-intensive workloads CFM can provide performance improvements on the order of 10% or more even without changing the total amount of memory available.
CellBricks
Zhihong Luo, Silvery Fu, Mark Theis, Shaddi Hasan, Sylvia Ratnasamy, Scott Shenker
Markets in which competition thrives are good for both consumers and innovation but, unfortunately, competition is not thriving in the increasingly important cellular market. We propose CellBricks, a novel cellular architecture that lowers the barrier to entry for new operators by enabling users to consume access on-demand from any available cellular operator — small or large, trusted or untrusted. CellBricks achieves this by moving support for mobility and user management (authentication and billing) out of the network and into end hosts. These changes, we believe, bring valuable benefits beyond enabling competition: they lead to a cellular infrastructure that is simpler and more efficient. We design, build, and evaluate CellBricks, showing that its benefits come at little-to-no cost in performance compared to what's achieved by current cellular infrastructure.
Economics of Data Sharing
Contact author: Scott Shenker
There are many learning tasks -- detecting bank fraud, predicting treatment effectiveness -- that are done by separate entities, but where sharing the data would produce better results. However, there are both privacy and incentive problems with sharing data. We propose the use of learning brokers that receive data from various entities and perform two valuable functions. First, they run the training algorithms over the data, and only give the inference engines to others, not the raw data. Second, they share the data in such a way that all entities have an incentive to share everything with the broker. This work is in the very early stages.
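A minimal sketch of the broker's first function follows. The class, the record format, and the "training" procedure (a simple threshold fit) are all hypothetical stand-ins; a real broker would train actual models and add privacy protections.

```python
class Broker:
    """Learning-broker sketch: pooled data never leaves the broker;
    only the trained inference function is released."""
    def __init__(self):
        self._pool = []                        # raw records stay in here

    def contribute(self, records):
        self._pool.extend(records)             # (score, is_fraud) pairs

    def train(self):
        """'Training' here is just fitting a threshold between the labeled
        classes: a stand-in for a real learning algorithm."""
        pos = [x for x, label in self._pool if label]
        neg = [x for x, label in self._pool if not label]
        threshold = (min(pos) + max(neg)) / 2
        return lambda x: x >= threshold        # inference engine only

broker = Broker()
broker.contribute([(0.9, True), (0.2, False)])   # entity A's examples
broker.contribute([(0.8, True), (0.3, False)])   # entity B's examples
detect = broker.train()                          # A and B receive this, not data
```

Both entities end up with a detector trained on the pooled examples, while neither ever sees the other's raw records.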
Edgy
John Westhoff, Aisha Mushtaq, Amir Shahatit, Yotam Harchol, Aurojit Panda, Scott Shenker
In the last few years there has been increased interest in deploying application logic at the network edge. However, to use edge resources effectively, applications must replicate state across edges. In current applications, the policy governing when and where state is replicated is embedded in the application; however, this policy depends not just on application logic but also on workloads and resource availability, both of which can vary significantly over an application's lifetime, resulting in poor performance when adopting edge computing. In this work we propose Edgy, a framework that decouples replication logic from application logic, thus enabling applications to better respond to workload and infrastructure changes.
Efficient Work Stealing
Contact author: Amy Ousterhout
Datacenter servers must balance load across many cores (sometimes dozens). Existing techniques such as work stealing perform well for long tasks, but can be inefficient for short tasks that take only a couple of microseconds. At these timescales, cores may spend a significant fraction of their time just looking for work, rather than actually doing useful work; this wastes CPU resources that could be used by other applications on the same server. We are exploring techniques to perform load balancing more efficiently, so that requests are handled faster and cores waste fewer cycles looking for work.
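The basic mechanism under study can be sketched as follows: a toy work-stealing simulation, not our implementation. At real microsecond timescales the stealing loop itself is the overhead being measured, which this sketch only hints at via the steal count.

```python
import collections
import random

class Core:
    def __init__(self):
        self.tasks = collections.deque()       # per-core task queue

def run(cores):
    """Drain all tasks; count how many had to be stolen from another core."""
    done = stolen = 0
    while True:
        progress = False
        for core in cores:
            if core.tasks:
                core.tasks.popleft()()         # run a task from the local queue
                done += 1
                progress = True
            else:
                victims = [c for c in cores if c is not core and c.tasks]
                if victims:                    # idle: steal from the far end
                    random.choice(victims).tasks.pop()()  # of a victim's queue
                    stolen += 1
                    done += 1
                    progress = True
        if not progress:
            return done, stolen

cores = [Core() for _ in range(4)]
results = []
for n in range(8):                             # all work initially on core 0
    cores[0].tasks.append(lambda n=n: results.append(n))
done, stolen = run(cores)
```

With all work arriving at one core, most tasks are completed via steals, which is exactly the regime where per-steal overhead dominates for microsecond-scale tasks.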
Enabling a Permanent Revolution in Internet Architecture
James Murphy McCauley, Yotam Harchol, Aurojit Panda, Barath Raghavan, Scott Shenker
The research community has developed a number of interesting proposals for new Internet architectures (e.g., NDN and XIA), which sometimes leads to asking the high-stakes question of which one of these new proposals should succeed the current Internet architecture. This project, presented at SIGCOMM 2019 as "Enabling a Permanent Revolution in Internet Architecture", explores a vision for the future wherein multiple Internet architectures can be deployed and coexist on the same existing infrastructure. The proposed design radically reduces the requirements for deploying a new architecture (i.e., it doesn't require replacing every router in the world), and removes the requirement that a single architectural design must meet everyone's needs.
Kappa
Wen Zhang, Vivian Fang, Aurojit Panda, Scott Shenker
Serverless computing (e.g., AWS Lambda) was initially designed for event-driven applications, where each event handler is guaranteed to complete within a limited time duration. Kappa aims to enable general-purpose, parallel computation on serverless platforms. To do this, Kappa provides a continuation-based checkpointing mechanism that allows long-running computations on time-bounded lambda functions, and a message-passing concurrency API for easily expressing parallelism and exploiting the elasticity of serverless platforms.
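The checkpointing idea can be illustrated with Python generators standing in for continuations. Kappa's real mechanism inserts continuations via code transformation; the function names and the step-count "time budget" below are ours, for illustration only.

```python
import pickle

def long_sum(state=None):
    """A long-running computation expressed so that it yields a picklable
    checkpoint after every step (a stand-in for a continuation)."""
    i, total = state or (0, 0)
    while i < 10:
        total += i
        i += 1
        yield (i, total)               # checkpoint: enough state to resume

def run_with_timeouts(make_task, steps_per_lambda):
    """Simulate time-bounded lambdas: each 'invocation' runs a few steps and
    then times out; the next invocation resumes from the last checkpoint."""
    checkpoint = None
    invocations = 0
    while True:
        invocations += 1               # launch a fresh (simulated) lambda
        task = make_task(pickle.loads(pickle.dumps(checkpoint))
                         if checkpoint else None)
        last, ran = None, 0
        for last in task:
            ran += 1
            if ran == steps_per_lambda:
                break                  # the lambda's time budget is exhausted
        if last is None or last[0] >= 10:
            return last, invocations   # computation finished
        checkpoint = last

result, invocations = run_with_timeouts(long_sum, steps_per_lambda=3)
```

A computation of ten steps completes across four three-step "lambdas", with each restart picking up from serialized state rather than from the beginning.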
OS for the Modern Age
Contact author: Murphy McCauley
This NetSys project attempts to rethink foundational aspects of operating system design for servers in the modern age, confronting issues such as changing performance bottlenecks (e.g., due to vast performance changes in underlying technologies like networks and storage), the central importance of isolation (in the sense of performance, data, and metadata), and shifting workloads (such as extremely short and varied edge workloads).
Pancake
Pancake is the first system to protect key-value stores from access pattern leakage attacks with small constant-factor bandwidth overhead. Pancake uses a new approach, which we call frequency smoothing, to transform plaintext accesses into uniformly distributed encrypted accesses to an encrypted data store. We show that frequency smoothing prevents access pattern leakage attacks by passive persistent adversaries in a new formal security model. Pancake is integrated into three key-value stores used in production clusters. Pancake achieves 229× better throughput than non-recursive Path ORAM, and comes within 3–6× of insecure baselines for these key-value stores.
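A toy model of frequency smoothing follows, for intuition only: the real scheme also injects fake accesses, encrypts replicas, and adapts to changing distributions. The idea is to replicate each key in proportion to its access frequency, so that every stored replica is touched with (roughly) equal probability.

```python
import collections
import random

def build_replicas(freq, total_replicas):
    """Give each key a number of replicas proportional to its access
    frequency, so every replica ends up equally popular."""
    return {key: max(1, round(p * total_replicas)) for key, p in freq.items()}

def access(key, replicas):
    """A logical access touches one of the key's replicas, chosen uniformly."""
    return (key, random.randrange(replicas[key]))

freq = {"a": 0.5, "b": 0.25, "c": 0.25}             # skewed plaintext pattern
replicas = build_replicas(freq, total_replicas=8)   # a -> 4, b -> 2, c -> 2
random.seed(0)
counts = collections.Counter()
for _ in range(8000):
    key = random.choices(list(freq), weights=list(freq.values()))[0]
    counts[access(key, replicas)] += 1
# The store now sees ~1000 hits on each of the 8 replicas: a near-uniform
# pattern, even though key "a" is twice as popular as "b" or "c".
```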
Persimmon
Wen Zhang, Scott Shenker, Irene Zhang
Distributed in-memory storage systems are crucial for meeting the low latency requirements of modern datacenter services. However, they lose all state on failure, so recovery is expensive and data loss is always a risk. The Persimmon system leverages persistent memory (PM) to convert existing in-memory storage systems into persistent, crash-consistent versions with low overhead and minimal code changes. Persimmon offers a simple Persistent State Machine abstraction for PM and implements persistence through operation logging on the critical path and a novel crash-consistent shadow execution technique for log digestion in the background.
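The persistent state machine abstraction can be sketched as follows. This is a deliberate simplification: an fsync'd log file stands in for persistent memory, and recovery replays the whole log rather than using Persimmon's shadow-execution digestion.

```python
import json
import os
import tempfile

class PersistentKV:
    """A key-value state machine made persistent via operation logging.
    (An fsync'd file stands in for persistent memory in this sketch.)"""
    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}
        if os.path.exists(log_path):           # recovery: replay the log
            with open(log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["type"] == "set":
            self.state[op["key"]] = op["value"]
        elif op["type"] == "delete":
            self.state.pop(op["key"], None)

    def execute(self, op):
        with open(self.log_path, "a") as f:    # log on the critical path...
            f.write(json.dumps(op) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(op)                        # ...then apply in memory

path = os.path.join(tempfile.mkdtemp(), "ops.log")
kv = PersistentKV(path)
kv.execute({"type": "set", "key": "x", "value": 1})
kv.execute({"type": "set", "key": "y", "value": 2})
kv.execute({"type": "delete", "key": "x"})
recovered = PersistentKV(path)                 # simulate a crash and restart
```

Because every operation is durably logged before it is applied, the rebuilt instance reaches exactly the pre-crash state.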
Predicting System Performance
Silvery Fu, Saurabh Gupta, Radhika Mittal, Sylvia Ratnasamy
The ability to predict system performance is key to enabling better system optimization and planning. Given recent advances in machine learning (ML), one might ask whether ML is a natural fit for this task. We study whether and how ML can be used to predict performance. Our findings reveal that the performance variability stemming from optimization or randomization techniques in many applications makes performance prediction inherently difficult; in such cases, no ML technique can predict performance with high enough accuracy. We show how eliminating the discovered sources of variability greatly improves prediction accuracy. Since it is difficult to eliminate all possible sources of performance variability, we further discuss how ML models can be extended to cope with them by learning a distribution rather than a point estimate.
Privoxy
Michael Alan Chang, Wen Zhang, Eric Sheng, Aurojit Panda, Mooly Sagiv, Scott Shenker
Several recent laws (e.g., GDPR and CCPA) constrain how applications collect and utilize user data, and ensuring compliance to these constraints in an application is challenging. Privoxy is a system that enforces data access policies for web applications using a proxy that interposes on the connection between the application and the database. To verify policy compliance, Privoxy leverages SMT solvers and, to achieve adequate performance, generalizes and caches the results of previous compliance decisions. We demonstrate that Privoxy supports legacy web applications while adding only modest overheads.
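The generalize-and-cache idea can be sketched like this. A hand-written check stands in for the SMT solver, and the query shapes, policy, and function names are invented for illustration: queries are generalized to templates by stripping literals, so one cached decision covers every query of the same shape.

```python
import functools
import re

POLICY_ALLOWED_COLUMNS = {"name", "city"}      # hypothetical access policy

def template_of(query):
    """Generalize a concrete query by replacing literals, so one cached
    decision covers every query of the same shape."""
    return re.sub(r"'[^']*'|\b\d+\b", "?", query)

checks = {"count": 0}                          # how often the slow check runs

@functools.lru_cache(maxsize=None)
def compliant(template):
    """Expensive compliance check (SMT-solver stand-in), run once per template."""
    checks["count"] += 1
    m = re.match(r"SELECT (.+) FROM users WHERE id = \?$", template)
    if not m:
        return False
    cols = {c.strip() for c in m.group(1).split(",")}
    return cols <= POLICY_ALLOWED_COLUMNS

def run_query(query):
    if not compliant(template_of(query)):
        raise PermissionError(query)
    return "rows..."                           # the proxy would forward to the DB

run_query("SELECT name, city FROM users WHERE id = 7")
run_query("SELECT name, city FROM users WHERE id = 42")   # same template: cache hit
```

The second query reuses the first decision, so the expensive check runs only once for the two compliant queries; a query touching a disallowed column is rejected.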
Remote Memory Calls
Contact author: Emmanuel Amaro
We propose extensions to RDMA, called Remote Memory Calls (RMCs), that allow applications to install a customized set of 1-sided RDMA operations. We are exploring how RMCs can be implemented on the forthcoming generation of SmartNICs, and we will compare their performance to pure 1-sided and 2-sided RDMA operations.
Lloyd Brown, Arvind Krishnamurthy (UW), Ganesh Ananthanarayanan (MSR), Ethan Katz-Bassett (Columbia), Sylvia Ratnasamy, Scott Shenker
The conventional wisdom requires that all congestion control algorithms deployed on the public Internet be TCP-friendly. If universally obeyed, this requirement would greatly constrain the future of such congestion control algorithms. If partially ignored, as is increasingly likely, then there could be significant inequities in the bandwidth received by different flows. To avoid this dilemma, we propose an alternative to the TCP-friendly paradigm that can accommodate innovation, is consistent with the Internet’s current economic model, and is feasible to deploy given current usage trends.
Contact author: Murphy McCauley
This project aims to deliver a routing resiliency mechanism that is easily implementable, easily deployable, and easily manageable, while offering packet delivery rates that rival those of the most sophisticated resiliency mechanisms.
Savanna
Lloyd Brown, Peter Gao, Ed Oakes, Wen Zhang, Aurojit Panda, Sylvia Ratnasamy, Scott Shenker
Serverless computing is a relatively recent cloud computing paradigm that allows customers to write lightweight, short-lived functions that are executed on resources provisioned and managed by cloud providers. This architecture was originally designed to simplify stateless, event-driven applications such as those that handle web requests, compress images, or generate thumbnails. However, the paradigm has been increasingly adopted in other domains including data analytics, machine learning, and sensor data processing. These domains, and other potential applications, could benefit from better fault tolerance, consistent concurrent file access, and improved I/O performance beyond what current serverless offerings provide. In this paper we propose Savanna, a system that implements these features for serverless applications. Savanna requires applications to use a file API, but is otherwise transparent to serverless applications.
Contact author: Christopher Branner-Augmon
Many common algorithms are data oblivious, meaning that their memory access patterns are independent of their input data (e.g., common matrix operations). Our goal in this project is to exploit the predictability of memory access patterns in data-oblivious algorithms in order to reduce their memory footprint, while limiting their performance degradation. We do this with a smart memory prefetcher, which uses information garnered from one execution to accurately prefetch on subsequent executions of an application. For data-oblivious applications, this approach can achieve much better prefetching accuracy than existing approaches.
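A toy model of the approach follows; the memory model, kernel, and access stride are invented for illustration. Because a data-oblivious kernel's access sequence depends only on its input size, a trace recorded on one run predicts every access of the next run, so the prefetcher can stay exactly one access ahead.

```python
class Memory:
    """Toy two-level memory: reads miss unless the address was prefetched."""
    def __init__(self, data):
        self.data = data
        self.cache = set()
        self.misses = 0

    def prefetch(self, addr):
        self.cache.add(addr)

    def read(self, addr):
        if addr not in self.cache:
            self.misses += 1                   # a miss models a slow fetch
            self.cache.add(addr)
        return self.data[addr]

def kernel(mem, n, record=None, replay=None):
    """Data-oblivious kernel: its access sequence depends only on n."""
    total = 0
    for step in range(n):
        addr = (step * 7) % n                  # fixed, data-independent stride
        if record is not None:
            record.append(addr)                # first run: log the trace
        if replay is not None and step + 1 < len(replay):
            mem.prefetch(replay[step + 1])     # later runs: fetch one ahead
        total += mem.read(addr)
    return total

data = list(range(16))
trace = []
cold = Memory(data)
total_cold = kernel(cold, 16, record=trace)    # every access misses
warm = Memory(data)
total_warm = kernel(warm, 16, replay=trace)    # only the first access misses
```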
Theory of Network Neutrality
Contact author: Scott Shenker
There is a huge literature on network neutrality, which typically starts with our current economic arrangements on the Internet and asks about the wisdom of certain limited measures (such as termination fees or prioritized service). This project instead looks at the set of economic actors and asks which economic arrangements would lead to the socially optimal outcome. This work is in the very early stages.
Tunnels as an Abstraction
Contact author: Scott Shenker
Network APIs such as UNIX sockets, DPDK, and Netmap assume that networks provide only end-to-end connectivity. However, networks increasingly include SmartNICs and programmable switches that can implement both network and application functions. Several recent works have shown the benefit of offloading application functionality to the network, but using these approaches requires changing not just the applications, but also network and system configuration. In this project we propose a network API that provides a uniform abstraction for offloads, aiming to simplify their use.
Understanding the root causes and fixes of congestion collapse
Contact author: Aisha Mushtaq
In many ways, our current approach to Internet congestion control arose from the need to deal with a series of congestion collapses. While some of those innovations were explicitly intended to prevent congestion collapse, others helped give TCP better congestion control more generally. In this paper we try to identify the aspects of congestion control that are necessary and sufficient to prevent congestion collapse. We argue that -- in the context of a basic congestion control framework that includes such features as sliding window, reasonably accurate RTTs, and backing off retransmit timers -- we need two and only two relatively straightforward conditions on retransmit timers to prevent congestion collapse. Making it easy to tell if a congestion control algorithm is congestion-collapse resistant will be important as congestion control turns away from the current paradigm and adopts rather different approaches, as is starting to occur.
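One of the ingredients the argument relies on, exponential backoff of the retransmit timer, can be illustrated with a small calculation. This is a simplified model of our own, not the paper's formal conditions.

```python
def retransmit_times(rtt_estimate, acked_after):
    """Times at which one unacked segment is (re)sent: the first timeout is
    the RTT estimate, and each successive timeout doubles (backoff)."""
    sends = [0.0]
    timeout = rtt_estimate
    t = 0.0
    while True:
        t += timeout
        if t >= acked_after:       # the ack finally arrives; stop resending
            break
        sends.append(t)
        timeout *= 2               # exponential backoff

    return sends

# A segment unacked for 100 time units is retransmitted only 6 times;
# a fixed one-RTT timeout would have retransmitted it ~99 times, adding
# load to an already-congested path -- the seed of congestion collapse.
with_backoff = retransmit_times(rtt_estimate=1.0, acked_after=100.0)
```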
Vertex
Lloyd Brown, Scott Shenker, Aurojit Panda, Amy Ousterhout, Brian Kim, Michael Perry
The goal of edge computing is to make nearby computing resources available to a rich set of applications. However, current offerings provide incomplete interfaces that limit applications, and disparate interfaces that prevent applications from using edges from other providers. To fix this we propose Vertex, a standard edge computing interface to enable provider-agnostic edge computing with a fuller set of abstractions.
Zed - Leveraging Data Types to Process Heterogeneous and Evolving Data
Contact author: Amy Ousterhout, Silvery Fu
Processing heterogeneous and evolving data is challenging with today’s data frameworks. In the Zed project, we re-architect data processing from the ground up to handle data with heterogeneous and evolving schemas by design. We argue that the key to doing so is to make data types first-class members of our data-processing systems, particularly of our data models and query languages.
dSpace
Silvery Fu, Sylvia Ratnasamy
dSpace is an open and modular programming framework that aims to simplify and accelerate the development of smart space applications. To achieve this, dSpace provides two key building blocks -- digivices, which implement device control and actuation, and digilakes, which process IoT data to generate events and insights -- together with novel abstractions for composing these building blocks into higher-level abstractions. We apply dSpace to home automation systems and show how developers can easily and flexibly compose these basic abstractions to support a wide range of home automation scenarios.