Ten years ago deploying a development environment sucked.
Heroku and AWS were in their infancy, relegated to that special set of developer tools best left to people with beards who listened to Interpol and only drank pour-over coffee. Most of the world still relied on colos: colocated computing centers where you rented a raw physical server or a virtual machine running out of a remote, managed data center.
This meant that deploying an app was a lengthy, painful process. I’d first have to set up an operating system on the machine, usually deployed via disk image. Vagrant didn’t exist back then, so even if that disk image contained a few applications and libraries I’d still have to painfully “string” everything together after the image was successfully installed. Somewhere along the line I would also have to figure out how to get that machine properly (and ideally safely) exposed to the web so it could service requests from someone other than me logged in as root.
Basically it sucked.
Even still, it was better than it was five years before that. Virtualization dramatically dropped the price of computing, allowing small teams and individual developers to rent just the slice of infrastructure they needed, whereas before you would have had to rent a full colo’d box. VMware well deserved its multi-billion dollar valuation for its work on ESX, and the trinity of hypervisors (ESX, Microsoft’s Hyper-V, and the FOSS Linux KVM) enabled the cost of running a development environment to drop to an astoundingly low level.
But early virtualization really just solved the cost problem. Two other problems remained: managing complexity, especially keeping workloads isolated to maintain secure multi-tenancy and to make the root cause of complex problems easier to identify, and reducing the time it takes to deploy and update infrastructure.
Over the last decade we’ve married virtualization with abstraction to address these problems. Today application infrastructure is increasingly dynamic. Rather than relying on a VM or a full colo’d server and a Rube Goldberg machine of personal scripts to deploy an app, I can quickly deploy a container on Kubernetes and have Chef, Puppet, or Terraform handle the “string everything together and get it online” portion of the job. Serverless computing is rapidly approaching on the horizon as well, promising the ability to deploy some compute jobs without having to worry about infrastructure.
Despite these monumental advances in infrastructure, we are still fairly far away from dealing with our three key problems:
- Making computing cheaper
- Making deploying applications in isolation easier and safer
- Making automatically configuring application infrastructure easier
You still need to know how to instrument things in Terraform. You still need to know how to configure Kubernetes and Nomad. And in the case of serverless, a burgeoning sector of infrastructure that is bizarrely infatuated with functional programming (I’m glaring at you, hipster hackers), you need to be bought into a specific style of programming in order to use it properly today.
Atomic Computing: Solving the Three Key Problems
Computing is a process of progress. Infrastructure is no exception, and continually iterating on virtualization and infrastructure abstraction is a generational process to try and find better solutions to our three key problems.
I like to call this process Atomic Computing. I draw the name from a similar search in physics to reduce the elements (and later molecules and elementary particles) to a single isolated unit. Just like 19th and early 20th century physicists and chemists tried to reduce the world to atomic units, in infrastructure we are trying to reduce the complexities of infrastructure to isolated, discrete units of computing for running applications.
If we draw a parallel between the development of atomic theory and quantum mechanics with what we’re doing in infrastructure, we start to see some interesting similarities. As we started to iterate on atomic (and later quantum) theory to move towards this set of singular, isolated units to describe fundamental aspects of matter and energy, we spent increasing amounts of time defining frameworks of abstraction that governed how that unit collectively mapped to addressing the higher “level” of problems.
In physics and chemistry, this meant understanding how atoms and molecules explained behavior we could observe without microscopes. Then, when we created quantum theory, it meant figuring out how subatomic particles contributed to the behavior we observed in atoms.
In infrastructure, the frameworks of abstraction we need to discover are around simplifying the developer experience with regard to the three key problems.
Successfully moving from VMs to containers meant we needed (and still need) to figure out how applications are provisioned, strung together, and ultimately made cheaper and easier to deploy given the current state of the developer world. Successfully moving “one level down” from container-focused infrastructure to atomic serverless units means we need to answer those same questions: how do we simplify the process of deploying a new atomic unit, configuring it, and providing it more cheaply to developers given their current way of provisioning infrastructure?
Finding our Brownian Motion
Honestly, I don’t know what’s next after serverless. Serverless itself is a fairly new and exciting field of infrastructure, and we’re in its infancy today much like virtual clouds like AWS were in their infancy a decade ago. But it’s clear that serverless and more popular contemporary computing form factors like containers are just a mile marker on the atomic computing road of progress.
I like to draw a parallel between physics and chemistry and infrastructure because a similar challenge emerged in the 19th century that led to the discovery of the atom.
In 1827, botanist Robert Brown observed a curious behavior in the movement of microscopic particles suspended in liquid. He observed how particulates from pollen moved in erratic, seemingly random ways under a microscope.
Decades later, in 1905, Brown’s discovery of Brownian Motion led a young Albert Einstein and his peers to prove the existence of the atom, a unit that had long been theorized to exist but had been undefined and unproven until Einstein extrapolated from Brownian Motion to define an abstract framework that described this next atomic unit.
Figuring out what the next atomic computing unit will be is a lot like proving the existence of the atom. Rather than trying to figure out exactly what that unit will be, we can use observations about the necessary behavior of that unit and extrapolate a framework that defines how that unit must exist.
Returning to our three key problems that govern atomic computing (making computing cheaper, making deployments of apps in isolation easier and safer, and making that environment easier to configure), we can look at how other areas of computing are advancing toward solutions to those problems to help define what the next atomic unit will be.
Better Hyperthreading: Making Computing Cheaper
The next atomic unit of computing needs to make computing significantly cheaper. Just like virtualization made it cheaper to deploy apps on AWS, the next atomic unit needs to reduce the total economic cost associated with deploying a computing workload in order to make an impact and be successful.
One area where I believe there is a significantly untapped set of technologies is in multithreading. Multithreading is the process of stretching a computing task across a number of computing cores. It was a necessary development over the last 20 years to figure out how to “divide and conquer” different sub-problems of a computing task as Moore’s Law has begun to slow down, and multithreading today is already used in a number of applications from high performance computing to bitcoin mining.
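To make “divide and conquer” concrete, here is a minimal sketch in Python. The function names and the sum-of-squares task are my own illustration, not taken from any particular framework. It splits the input into chunks, hands each chunk to a worker, and combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Solve one sub-problem: sum the squares of a slice of the input."""
    return sum(n * n for n in chunk)


def split(values, parts):
    """Divide the input into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(values), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, workers=4):
    """Fan the chunks out to a worker pool, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, split(values, workers))
        return sum(partials)
```

Calling `parallel_sum_of_squares(list(range(10)), workers=3)` gives the same answer as the single-threaded version. One caveat: in standard CPython builds the global interpreter lock keeps threads from running Python bytecode on several cores at once, so for CPU-bound work like this you would likely swap in `ProcessPoolExecutor` to actually spread the load across cores. The shape of the code stays the same, which is rather the point: the hard part is deciding how to split and recombine the work.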
But we really haven’t embraced multithreading broadly across computing. Multithreading is complicated to employ as a software developer, and really requires specialized training to do properly. Even in high performance computing tasks like rendering video games, multithreading is rarely used properly to fully take advantage of modern processors’ multitude of cores, which is why the superior single-threaded performance of Intel processors continues to dominate in gaming over AMD counterparts that are otherwise arguably better at multicore processing.
This isn’t a technology problem. This is a cultural problem. We need a new generation of developers who have natively “grown up” around multithreaded programming to push forward new computing frameworks that support multithreading and make it easier and more widespread.
Widespread multithreading has the potential to significantly reduce the cost of computing. Superior multithreading at the hypervisor level will allow new atomic units of computing to be deployed and run performantly on an already busy virtual environment. The fundamental technology for multithreading at the hypervisor level already exists. But there are still significant improvements to be made in the hyperthreading framework that governs how this works.
A new generation of infrastructure engineers who have grown up around computing platforms where multithreading was normal (e.g., mobile development and GPU development) will likely bring a new perspective and new technology that reduces the cost of computing for a new atomic computing architecture.
Cryptographic Isolation and Improved Provisioning: Making Deployment Safer and Easier
The next atomic computing architecture needs to make deploying isolated computing environments safer and easier. Just like the movement from VMs to containers is celebrated for reducing the complexity of spinning up an isolated computing unit, whatever comes next needs to reduce the complexity of provisioning an atomic computing platform to run a workload.
Examining this aspect of atomic computing leads me to question the staying power of serverless computing as we know it today. While serverless significantly reduces the complexity of some computing workloads, it really doesn’t do so in isolation. Tossing a lambda job to AWS Lambda certainly means I don’t have to worry as much about provisioning anything as I would with Kubernetes or a full VM.
But it also means that I need to spend a significant amount of time and effort in the “pre-work” to running this job. I need to think about the complexity of other helper applications to supply instance variables and data to the lambda job. I need to figure out the exact order in which this job will run relative to these other helper applications or services, as the serverless form factor is unsurprisingly spartan. And if I’m using other AWS services like RDS, I need to have my lambda function operate within the same region, which suggests there is some kind of geographical data challenge behind the scenes and that the world’s biggest serverless platform really isn’t as dynamic as we might think.
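A small sketch of what that pre-work looks like from inside the function, written in Python. The `handler(event, context)` shape and the `AWS_REGION` variable come from Lambda’s Python runtime. Everything else here (the `DB_HOST` and `DB_REGION` variable names, the `order_id` field, the `MissingConfig` error) is hypothetical and only for illustration.

```python
import os


class MissingConfig(RuntimeError):
    """Raised when the pre-work around the function was not done."""


def load_settings(environ):
    """Collect everything the function expects its surroundings to provide."""
    required = ("DB_HOST", "DB_REGION")  # hypothetical variable names
    missing = [name for name in required if name not in environ]
    if missing:
        raise MissingConfig("missing environment variables: " + ", ".join(missing))
    return {
        "db_host": environ["DB_HOST"],
        "db_region": environ["DB_REGION"],
        # AWS_REGION is set by the Lambda runtime itself.
        "function_region": environ.get("AWS_REGION", "unknown"),
    }


def handler(event, context):
    """Entry point in the (event, context) shape Lambda uses for Python."""
    settings = load_settings(os.environ)
    # Surface the region coupling instead of discovering it at query time.
    same_region = settings["function_region"] == settings["db_region"]
    return {
        "order_id": event.get("order_id"),  # hypothetical event field
        "db_host": settings["db_host"],
        "same_region_as_db": same_region,
    }
```

The function body itself is tiny. Nearly all of it is checking that something outside the function (configuration, a database, the right region) was set up correctly beforehand, which is the part serverless doesn’t take off my plate.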
Also why do I need to write a lambda function? Why we have decided to rely solely on one programming paradigm to define serverless computing form factors is beyond me. Yes, as someone who also was forced (at gunpoint) to learn Scheme I get the value of functional programming for distributed computing. But in a world where we’re trying to get the next, and current, generation of developers onto the next big thing in infrastructure through diverse and approachable programming languages like Python, shouldn’t we be trying to similarly promote non-functional programming paradigms that are typically used in those languages?
I believe serverless is a pretty solid mile marker improvement on some types of computing workloads and will help define what comes next. But the next atomic computing unit (be it a derivative of serverless or otherwise) will need to significantly improve on these challenges in order to simplify the process of scheduling and deploying a computing workload.
Furthermore, that workload needs to be better isolated. We already discussed why isolation is important for security and for simplifying root cause analysis when debugging infrastructure problems. The security part here is key though, as modern adversaries hacking virtualized infrastructures are getting better at navigating around measures of isolation.
This is where cryptographic technology like Intel’s SGX platform is going to play a key role. Introducing cryptographic protection of computing tasks at a fundamental level within the application will ensure a strong gap between different computing tasks. Today we rely solely on the division between different containers or VMs, a weak form of isolation that still exposes the computing job to attacks at either a lower level of the processing stack or at the scheduling level.
If the next computing unit can introduce strong, tamper-resistant cryptography at both levels, this alone will provide a strong casus belli for moving to this new atomic architecture.
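To give a rough feel for the “tamper-resistant” half of that idea, here is a software-only sketch in Python using an HMAC from the standard library. To be clear about the assumptions: this is not SGX and provides none of the hardware isolation or confidentiality an enclave does. The `seal` and `open_sealed` names and the job format are made up for illustration. All it shows is a job that refuses to run if anything between submission and execution (a scheduler, say) has altered it.

```python
import hashlib
import hmac
import json


def seal(job, key):
    """Serialize a job and attach an HMAC-SHA256 tag computed over the bytes."""
    payload = json.dumps(job, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def open_sealed(payload, tag, key):
    """Return the job only if the tag still matches the payload."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched via timing.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("payload was modified after it was sealed")
    return json.loads(payload)
```

The obvious weakness is the key: whoever holds it can forge a valid tag, and in plain software the key sits in memory that a lower layer of the stack can read. That gap is the argument for pushing this kind of protection down into the hardware.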
Intelligent Scheduling: Making Configuration Easier
The next generation of computing needs to simplify the process of configuration. It’s one thing to have this atomic computing unit deploy itself quickly, securely, and reduce total cost. But if it’s too hard to configure, none of these advances may be enough for it to be embraced by the developer mainstream.
Again we can look to the current state of AWS Lambda to see why this challenge is important. Once I have set up my Rube Goldberg machine to run my lambda infrastructure, if I change anything in that infrastructure it poses a significant (and potentially very frustrating) challenge to ensure that job runs properly. With the world of infrastructure becoming increasingly dynamic and multicloud, change within my infrastructure is almost assured.
The next atomic computing unit needs to address this by providing a more elegant mechanism for working with a scheduler to adapt to infrastructure changes. This isn’t something the computing unit can or should do on its own. Instead, we need to ensure that there is a codified way for next-generation schedulers to operate with that computing unit to fully abstract the “primitives” of how that unit runs: its storage, its networking configuration, and so on.
Much like Brownian Motion defined a relationship for how atoms operate with each other, some other kind of protocol needs to exist for intelligent schedulers to help whatever is the next generation of atomic computing architecture operate in near complete abstraction from its underlying infrastructure. This is a monumental task that likely falls to the Chefs, Puppets, Terraforms, and Pulumis of the world, as well as the open source communities that promote them all.
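I don’t know what that protocol looks like, but its smallest possible version might be sketched like this in Python: the unit declares the primitives it needs, the scheduler knows what the infrastructure currently provides, and the difference between the two is the work to be done. The `Primitives` type, its field names, and the `plan` function are all hypothetical and not modeled on any existing scheduler’s API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Primitives:
    """What a unit declares it needs; all field names are hypothetical."""
    storage_gb: int
    network: str
    region: str


def plan(declared, observed):
    """Compare declared needs with what the infrastructure currently provides
    and return the changes a scheduler would have to make."""
    want, have = asdict(declared), asdict(observed)
    return {
        name: {"from": have[name], "to": want[name]}
        for name in want
        if want[name] != have[name]
    }
```

If the infrastructure changes underneath the unit, the scheduler recomputes `plan` and acts on the result. The unit’s own declaration never has to change, which is the kind of abstraction I’m arguing for.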
The Future is Atomic
Turing was right. We don’t really know what’s coming in virtualization and atomic computing.
But using this framework of analyzing the relationship of how infrastructure must always iterate to solve these three key problems, we can get a good idea of what the next popular atomic computing unit will look like.
About the Author
This article is written by Andy Manoske, product manager at HashiCorp and Advisor at Amplify Partners.