The full program will be available in May 2021. We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. Sponsored by USENIX in cooperation with ACM SIGOPS. We present the results of a 1% experiment at fleet scale as well as the longitudinal rollout in Googles warehouse scale computers. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linuxs NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. See the USENIX Conference Submissions Policy for details. These results outperform state-of-the-art HTAP systems by several orders of magnitude on transactional performance, while just incurring little performance slowdown (5% over pure OLTP workloads) and still enjoying data freshness for analytical queries (less than 20 ms of maximum delay) in the failure-free case. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh . Questions? We present TEMERAIRE, a hugepage-aware enhancement of TCMALLOC to reduce CPU overheads in the applications code. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. We propose a new framework for computing the embeddings of large-scale graphs on a single machine. To help more profitably utilize sanitizers, we introduce SanRazor, a practical tool aiming to effectively detect and remove redundant sanitizer checks. The chairs will review paper conflicts to ensure the integrity of the reviewing process, adding or removing conflicts if necessary. USENIX, like other scientific and technical conferences and journals, prohibits these practices and may, on the recommendation of a program chair, take action against authors who have committed them. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. We compare Marius against two state-of-the-art industrial systems on a diverse array of benchmarks. Camera-ready submission (all accepted papers): 15 Mars 2022. The symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. Author Response Period 23 artifacts received the Artifacts Functional badge (88%). Papers must be in PDF format and must be submitted via the submission form. Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses. See www.cs.cmu.edu/~mmv/Veloso.html for her scientific publications. Academic and industrial participants present research and experience papers that cover the full range of theory . We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (s-scale) processing times. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. The novel aspect of the nanoPU is the design of a fast path between the network and applications---bypassing the cache and memory hierarchy, and placing arriving messages directly into the CPU register file. PLDI is a premier forum for programming language research, broadly construed, including design, implementation, theory, applications, and performance. . Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. Further, Vegito can recover from cascading machine failures by using the columnar backup in less than 60 ms. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. As a member of ACCT, I have served two years on the bylaws and governance committee and two years on the finance and audit committee. We will look at various problems and approaches, and for each, see if blockchain would help. USENIX new Date().getFullYear()>document.write(new Date().getFullYear()); Grants for Black Computer Science Students Application, Propose an interesting, compelling solution, Demonstrate the practicality and benefits of the solution, Clearly describe the paper's contributions, Clearly articulate the advances beyond previous work. Existing algorithms are designed to work well for certain workloads. We demonstrate that the hardware thread scheduler is able to lower RPC tail response time by about 5 while enabling the system to sustain 20% higher load, relative to traditional thread scheduling techniques. Instead, we propose addressing the root cause of the heuristics problem by allowing software to explicitly specify to the device if submitted requests are latency-sensitive. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. SanRazor adopts a novel hybrid approach it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 1416, 2021. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. Horcrux-compliant web servers perform offline analysis of all the JavaScript code on any frame they serve to conservatively identify, for every JavaScript function, the union of the page state that the function could access across all loads of that page. In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. As has been standard practice in OSDI and SOSP in recent years, we will allow authors to submit quick responses to PC reviews: they will be made available to the PC before the final online discussion and PC meeting. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, osdi21chairs@usenix.org, well in advance of the submission deadline. Welcome to the SOSP 2021 Website. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. This yielded 6% fewer TLB miss stalls, and 26% reduction in memory wasted due to fragmentation. In this paper, we present Vegito, a distributed in-memory HTAP system that embraces freshness and performance with the following three techniques: (1) a lightweight gossip-style scheme to apply logs on backups consistently; (2) a block-based design for multi-version columnar backups; (3) a two-phase concurrent updating mechanism for the tree-based index of backups. The ZNS+ also allows each zone to be overwritten with sparse sequential write requests, which enables the LFS to use threaded logging-based block reclamation instead of segment compaction. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. For realistic workloads, KEVIN improves throughput by 68% on average. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. VLDB 2021: Venue Tivoli Hotel & Congress Center Arni Magnussons Gade 2 1577 Copenhagen, Denmark +45 3268 4300 In-person attendees can purchase tickets for the park / gardens with a 15% discount, which is a special offer by Tivoli Hotel & Congress Center to VLDB 2021 attendees. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. Moreover, to handle dynamic workloads, Nap adopts a fast NAL switch mechanism. For general conference information, see https://www . Authors are also encouraged to contact the program co-chairs, osdi21chairs@usenix.org, if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues. Authors may upload supplementary material in files separate from their submissions. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. However, with the increasingly speedy transactions and queries thanks to large memory and fast interconnect, commodity HTAP systems have to make a tradeoff between data freshness and performance degradation. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Furthermore, to enable automatic runtime optimization, GNNAdvisor incorporates a lightweight analytical model for an effective design parameter search. KEVIN combines a fast, lightweight, and POSIX compliant file system with a key-value storage device that performs in-storage indexing. These are hard deadlines, and no extensions will be given. Radia Perlman is a Fellow at Dell Technologies. Title Page, Copyright Page, and List of Organizers | Log search and log archiving, despite being critical problems, are mutually exclusive. Sam Kumar, David E. Culler, and Raluca Ada Popa, University of California, Berkeley. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. The experimental results show that Penglai can support 1,000s enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lcuyer, Microsoft Research. (Registered attendees: Sign in to your USENIX account to download these files. Jiachen Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Ding Ding, Department of Computer Science, New York University; Huan Wang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Conrad Christensen, Department of Computer Science, New York University; Zhaoguo Wang and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Jinyang Li, Department of Computer Science, New York University. Zeph enforces privacy policies cryptographically and ensures that data available to third-party applications complies with users' privacy policies. All the times listed below are in Pacific Daylight Time (PDT). Today, privacy controls are enforced by data curators with full access to data in the clear. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. However, your OSDI submission must use an anonymized name for your project or system that differs from any used in such contexts. Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. Ankit Bhardwaj and Chinmay Kulkarni, University of Utah; Reto Achermann, University of British Columbia; Irina Calciu, VMware Research; Sanidhya Kashyap, EPFL; Ryan Stutsman, University of Utah; Amy Tai and Gerd Zellweger, VMware Research. Welcome to the SOSP 2021 Website. Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead, and no run-time cost. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. Evaluations show that Vegito can perform 1.9 million TPC-C NewOrder transactions and 24 TPC-H-equivalent queries per second simultaneously, which retain the excellent performance of specialized OLTP and OLAP counterparts (e.g., DrTM+H and MonetDB). Call for Papers. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. How can we design systems that will be reliable despite misbehaving participants? A graph neural network (GNN) enables deep learning on structured graph data. Currently, for large graphs, CPU servers offer the best performance-per-dollar over GPU servers. Conference site 49 papers accepted out of 251 submitted. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, The Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. For conference information, . (Visa applications can take at least 30 working days to process.) In addition, increasing CPU core counts further complicate kernel development. Consensus bugs are bugs that make Ethereum clients transition to incorrect blockchain states and fail to reach consensus with other clients. HotCRP.com signin Sign in using your HotCRP.com account. One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant the same safety property is repeatedly checked leading to unnecessarily wasted computing resources. Compared to a state-of-the-art fuzzer, Fluffy improves the fuzzing throughput by 510 and the code coverage by 2.7 with various optimizations: in-process fuzzing, fuzzing harnesses for Ethereum clients, and semantic-aware mutation that reduces erroneous test cases. Responses should be limited to clarifying the submitted work. OSDI '21 Technical Sessions All the times listed below are in Pacific Daylight Time (PDT). By submitting a paper, you agree that at least one of the authors will attend the conference to present it. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented.