About

Welcome!
I'm currently a Ph.D. student in Computer Science at the University of Colorado Boulder. My research focuses on Serverless Computing and the intersection of Machine Learning and Systems. I also work as a research assistant under the supervision of Prof. Eric Keller. Feel free to reach out if you're interested!

Experience

CoreOS Intern

May 2021 - Aug 2021
Apple, Cupertino, CA, USA

Performance analysis and improvement of Sysdiagnose by up to 40% on all Apple platforms

Software Engineer Intern

May 2020 - Aug 2020
Salesforce, Louisville, CO, USA

Design and implementation of a proxy server for both REST and gRPC calls, intercepting REST and gRPC requests/responses to generate stubs for mocking purposes
Software Engineer Intern at Data Manager Setup team, C360 product

Research Assistant

Aug 2018 - Present
Computer Systems Lab, University of Colorado Boulder, CO, USA

My current research focuses on Serverless Computing and the intersection of machine learning/deep learning and networked systems.

Teaching Assistant

2016 - 2018
University of Tehran, Tehran, Iran

Teaching Assistant for the following courses:

Computer Networking

Operating Systems & Operating Systems Lab

Computer Architecture

Introduction to Computing Systems and C Programming

Artificial Intelligence

Publications

Escra: Event-driven, Sub-second Container Resource Allocation

July 2022
ICDCS 2022: 42nd IEEE International Conference on Distributed Computing Systems

This paper pushes the limits of automated resource allocation in container environments. Recent works set container CPU and memory limits by automatically scaling containers based on past resource usage. However, these systems are heavy-weight and run on coarse-grained time scales, resulting in poor performance when predictions are incorrect. We propose Escra, a container orchestrator that enables fine-grained, event-based resource allocation for a single container and distributed resource allocation to manage a collection of containers. Escra performs resource allocation on sub-second intervals within and across hosts, allowing operators to cost-effectively scale resources without performance penalty. We evaluate Escra on two types of containerized applications: microservices and serverless functions. In microservice environments, fine-grained and event-based resource allocation can reduce application latency by up to 96.9% and increase throughput by up to 3.2x when compared against the current state-of-the-art. Escra can increase performance while simultaneously reducing 50th and 99th%ile CPU waste by over 10x and 3.2x, respectively. In serverless environments, Escra can reduce CPU reservations by over 2.1x and memory reservations by more than 2x while maintaining similar end-to-end performance.

Optimizing and Extending Serverless Platforms: A Survey

Dec 2021
SDS 2021: The Eighth International Conference on Software Defined Systems

Serverless Computing is a new cloud computing paradigm wherein people in academia and industry are actively proposing either interesting improvements or building excellent applications on top of it. AWS, Google Cloud, Microsoft Azure, and IBM are popular samples of public clouds that offer Function-as-a-Service on top of their Serverless Computing platforms. Although this paradigm has had numerous advantages for software developers and programmers, it has introduced new challenges to cloud providers. Factors like fine-grained pricing and pay-as-you-go manner, eliminating the responsibility of resource management on the developer side, promises of elasticity and highly-available service, fault tolerance, auto-scaling, and being able to run embarrassingly parallel jobs make it a suitable platform for developers. On the other hand, efficient resource management, offering low-latency service, and providing proper security/isolation have been partly the main challenges introduced on the cloud provider side. This paper presents a literature review on today's Serverless platform optimizations and extensions that people have proposed and implemented to further capitalize the Serverless infrastructure. In the end, we will provide the current Serverless paradigm's limitations and a few future directions and research opportunities regarding Serverless Computing.

Survey Paper

SmartOS: towards automated learning and user-adaptive resource allocation in operating systems

Aug 2021
APSys '21: Proceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems

Today's operating systems typically apply a one-size-fits-all approach to resource management, such as applying a scheduler that treats all processes of equal importance. The goal of this paper is to explore a learning-based approach to resource management in modern operating systems in which the OS automatically learns what tasks the user deems to be most important at that time and seamlessly adjusts allocation of CPU, memory, I/O, and network bandwidth to prioritize user preferences on demand. We demonstrate an implementation of such a learning-based OS in Linux and present evaluation results showing that a reinforcement learning-based approach can rapidly learn and adjust system resources to meet user demands.

SmartOS Paper

Software Packet-Level Network Analytics at Cloud Scale

Feb 2021
IEEE Transactions on Network and Service Management

As networks grow in speed, scale, and complexity, operating them reliably requires continuous monitoring and increasingly sophisticated analytics. Because of these requirements, the platforms that support analytics in cloud-scale networks face demands for both higher throughput (to keep up with high packet rates) and increased generality and programmability (to cover a wider range of applications). Recent proposals have worked toward these goals by offloading analytics application logic to line-rate programmable data plane hardware, as scaling existing software analytics platforms is prohibitively expensive. The rigid design and constrained resources of data plane devices, however, fundamentally limit the types of analysis and the number of tasks that can run concurrently. In this article, we demonstrate that generality need not be sacrificed for high performance. Rather than offloading entire analytics applications to hardware, the core idea of our work is to offload only critical preprocessing tasks that are shared among applications (e.g., load balancing) to a line-rate hardware frontend while optimizing the core analytics software to exploit properties of network analytics workloads. Based on this design, we present Jetstream, a hybrid platform for network analytics that can run custom software-based analytics pipelines at throughputs of up to 250 million packets per second on a 16-core commodity server. Jetstream makes sophisticated, network-wide packet analytics feasible without compromising on generality or performance.

Toccoa Paper

Resource Management in Cloud Computing Using Machine Learning: A Survey

Dec 2020
19th IEEE International Conference on Machine Learning and Applications, 2020

Efficient resource management in cloud computing research is a crucial problem because over-provisioning resources increases costs for cloud providers and cloud customers; under-provisioning resources increases the application latency, and it may violate service level agreement, which eventually makes cloud providers lose their customers and income. As a result, researchers have been striving to develop optimal resource management in the cloud computing environments in different ways, such as container placement, job scheduling and multi-resource scheduling. Machine learning techniques are extensively used in this area. In this paper, we present a comprehensive survey on the projects which leveraged machine learning techniques for resource management solutions in the cloud computing environment. At the end, we provide a comparison between these works. Furthermore, we propose some future directions that will guide researchers to advance this field.

(Poster) Efficient Microservices with Elastic Containers

Dec 2019
Proceedings of the 15th International Conference on emerging Networking EXperiments and Technologies (CoNEXT)

Containers are a popular mechanism used among application developers when deploying their systems on cloud platforms. Both developers and cloud providers are constantly looking to simplify container management, provisioning, and monitoring. In this paper, we present a container management layer that sits beside a container orchestrator that runs, what we call, Elastic Containers. Each elastic container contains multiple subcontainers that are connected to a centralized Global Cloud Manager (GCM). The GCM gathers subcontainer resource utilization information directly from inside each kernel running the subcontainers. The GCM then tries to efficiently and optimally distribute resources between the application subcontainers residing on a distributed environment.

Elastic Containers Paper

Shimmy: Shared Memory Channels for High Performance Inter-Container Communication

July 2019
USENIX Workshop on Hot Topics in Edge Computing (HotEdge 19), USENIX Association, 2019

With the increasing need for more reactive services, and the need to process large amounts of IoT data, edge clouds are emerging to enable applications to be run close to the users and/or devices. Following the trend in hyperscale clouds, applications are trending toward a microservices architecture where the application is decomposed into smaller pieces that can each run in its own container and communicate with each other over a network through well defined APIs. This improves the development effort and deployability, but also introduces inefficiencies in communication. In this paper, we rethink the communication model, and introduce the ability to create shared memory channels between containers supporting both a pub/sub model and streaming model. Our approach is not only applicable to the edge clouds but also beneficial in core cloud environments. Local communication is made more efficient, and remote communication is efficiently supported through synchronizing shared memory regions via RDMA.

Shimmy Paper

Projects

Here are some more recent and notable projects:

Mini Internet - Design and configure a backbone network running OSPF as IGP and BGP,
with different customers, transit services, and peers,
applying routing policies based on the relationships,
and using route reflection, redistribution for statically routed customers, and BGP attributes for traffic manipulation
Tools: Cisco Router (IOS), GNS3
IP Routing course project
Prediction and characterization of application power use in a high performance computing environment - Machine Learning course project, Department of Computer Science, University of Colorado Boulder
Re-implementing the paper and improving the results by using feature engineering techniques and adding LSTM
Rootkit Module in Linux - Develop an LKM that intercepts a predefined Linux kernel syscall in order to change the behavior of the "ls" command, plus a utility function that checks whether a syscall has been changed in the syscall table (Kernel version: 4.x)
Advanced Operating Systems course project
Service Discovery & Version Updating - Build and run a simple Flask app, using Vagrant, Docker, Ansible, Etcd, Registrator, Confd, Nginx, Bash script and automate the version update
DevOps in the Cloud course project
Web Proxy - A web proxy written in Python, supporting the HTTP protocol, with URL caching and an admin interface
Computer Networking course project
Module in Floodlight - Add a module to the Floodlight controller that exchanges a key with each new host and registers it as a valid host in the network
Computer Networking Lab course project
Kernel Programming (Kernel version 2.6.x) - Add a new semaphore to the kernel that implements the Priority Inheritance Protocol to avoid priority inversion
Operating Systems Lab course project
Airplane Reservation Web App - An airplane reservation web application, using MVC, Object Oriented Patterns, HTML, CSS, JS, Bootstrap, AngularJS, JSP, JavaEE, Socket Programming in Java, Tomcat, Log4J, JUnit, Git, Maven, Docker, Kubernetes, Minikube, HSQL DB, Session State, and handling SQL Injection, CSRF issues, and Access Control
Internet Engineering course project
Web-Dota - A Dota-like game implemented entirely in the database, using EERD Design, SQL Server, SQL Server Management Studio, Stored Procedures, Functions & Views, Agent, and a basic Windows Forms application
Database Lab course project
Customs House Software - An application for customs house procedures, covering design (prototype, domain modeling, system sequence diagram, class diagram, etc.), implementation (C#, SQL Server database), and testing (unit tests, integration tests)
System Analysis & Design course project
Genetic Algorithm - Artificial Intelligence course project, Department of Electrical & Computer Engineering, University of Tehran
An algorithm using a genetic approach to solve a minimization problem

Skills & Proficiency

Programming Languages (C/C++, Python, Java, Bash Script)

Linux Kernel

Software Defined Networking

DevOps

Web Programming

Project Management/Version Control

Awards & Honors

USENIX ATC 2019 Grant Sponsored by NSF and VMware

July 2019
2019 USENIX Annual Technical Conference, Renton, WA, USA

Best Prize in Entrepreneurship Track

Feb 2019
T9Hacks Hackathon, University of Colorado Boulder

Early Career Professional Development Fellowship

Aug 2018
Department of Computer Science, University of Colorado Boulder

Best Undergraduate Project Award

Feb 2018
Department of Electrical and Computer Engineering, University of Tehran

Build and deploy virtual machine live migration in a cloud environment, using OpenStack, NFS, and vSphere

Ranked 4th among All Information Technology Students

2015 - 2016 Academic Year
Department of Electrical and Computer Engineering, University of Tehran