Advancing Sustainability Innovation at VMware
With the rapid adoption of compute-intensive technologies, such as cryptocurrencies, machine learning and big data, datacenter electricity use — currently 200 billion kWh or 1% of global electricity consumption — could increase ten-fold over the next decade. This massive energy consumption comes with a hefty carbon footprint that would thwart the global effort to halve carbon emissions before 2030. Over the last 10 years, workloads and Internet traffic have grown eight-fold and twelve-fold, respectively, but advances in datacenter hardware and operational efficiencies, like solid-state drives, RISC processors, virtualized computing and cooling technology, have kept datacenter electricity consumption flat.
The big question is: could technological innovation help meet the growing demand, without a proportional increase in our carbon footprint? At VMware, we think so; and we think it will also come with significant business co-benefits.
In the first installment of this blog series, “Accelerating Decarbonization of the World’s Digital Infrastructure,” I defined and explained why the next frontier for innovation is sustainable computing. In the second, “Pathways to Sustainable Clouds,” I delved into the following five key strategies for achieving sustainable computing through workload energy and carbon efficiency:
- Visualizing energy use and carbon footprint
- Maximizing productive host utilization
- Operating energy-efficient IT hardware
- Designing compute-efficient and carbon-aware applications
- Powering workloads with renewable energy
In this final post, I’ll share some recently released and in-pipeline sustainability innovations* in these areas.
A long history of sustainable computing
VMware has always been committed to sustainability. Our foundational compute virtualization product, ESXi, is inherently sustainable — it dramatically reduces the IT and datacenter infrastructure needed to run workloads. Since the launch of ESXi in 1998, VMware has extended virtualization to software-based storage (vSAN) and networking (NSX), making it possible to downsize the need for external storage and networking infrastructure by leveraging existing host capacity.
VMware Cloud Foundation brings together ESXi, vSAN, and NSX, wrapping them in management and security layers to provide an on-premises software-defined datacenter (SDDC) private cloud. VMware Cloud Foundation enables customers to replicate the public cloud’s more efficient, shared environments and higher utilization. A 2020 IDC white paper sponsored by VMware (“Enabling More Agile & Sustainable Business through Carbon-Efficient Digital Transformations,” August 2020) estimated that, globally, since 2003, VMware’s virtualized compute, storage, and networking technologies have enabled our customers to collectively avoid the deployment of 142 million servers, the consumption of 2.4 billion megawatt-hours of energy, and the emission of 1.2 billion metric tons of greenhouse gases. That’s roughly equivalent to three percent of the world’s carbon emissions in 2019.
At the end of 2020, we doubled down on sustainability innovation as part of our 2030 Agenda, a decade-long environmental, social, and governance (ESG) commitment to foster a more sustainable, equitable, and secure world.
Visualizing energy use and carbon footprint
vRealize Operations Sustainability Dashboards
vRealize Operations helps customers automate management of their virtualized, hybrid, and multi-cloud environments with a unified operations platform that delivers continuous performance, capacity, and cost optimization. The recent release of vRealize Operations 8.6 adds carbon management, which quantifies the carbon-footprint reduction enabled by workload virtualization. It also quantifies host carbon footprint and provides insights into further carbon-reduction (and cost-reduction) opportunities, including:
- Identifying powered-off and idle VMs, orphaned files, and old snapshots
- Identifying aging hardware
- VM rightsizing
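To make these checks concrete, here is a minimal sketch of how an inventory could be sorted into powered-off, idle, and rightsizing candidates. It is not how vRealize Operations works internally: the `VM` record, the `classify` function, and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    powered_on: bool
    vcpus: int
    avg_cpu_pct: float   # average CPU utilization over the review window
    peak_cpu_pct: float  # peak CPU utilization over the same window


# Hypothetical thresholds, not values used by any VMware product.
IDLE_AVG_CPU_PCT = 2.0
OVERSIZED_PEAK_CPU_PCT = 30.0


def classify(vm: VM) -> str:
    """Return a coarse reclamation category for one VM."""
    if not vm.powered_on:
        return "powered-off"
    if vm.avg_cpu_pct < IDLE_AVG_CPU_PCT:
        return "idle"
    # A multi-vCPU VM whose peak never gets close to its allocation
    # is a candidate for a smaller configuration.
    if vm.vcpus > 1 and vm.peak_cpu_pct < OVERSIZED_PEAK_CPU_PCT:
        return "rightsize"
    return "ok"


inventory = [
    VM("build-01", True, 8, 45.0, 92.0),
    VM("legacy-db", False, 4, 0.0, 0.0),
    VM("test-web", True, 2, 0.6, 3.0),
    VM("reporting", True, 16, 6.0, 22.0),
]
report = {vm.name: classify(vm) for vm in inventory}
```

In practice the thresholds and the length of the review window would need tuning per environment, and memory would be assessed alongside CPU.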
Green Metrics: You can’t manage what you can’t measure
The long-term goal of Green Metrics, a capability of ESXi, is to quantify and visualize energy and carbon at the host, container, and application layers, as well as across multi-cloud environments. Modern servers can report their real-time power draw (in watts), but not how that power is being used by the container and application layers. Green Metrics builds on vSphere’s current energy metrics capability to provide improved monitoring and optimization, enabling customers to manage the carbon footprint of their cloud operations and applications. Look for the first version in the vSphere 8.0 release in June!
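One way to picture the attribution problem is the sketch below, which splits a measured host power draw across workloads and converts the result to carbon. The attribution rule is an assumption made for illustration (idle power shared evenly, the remainder split by CPU time) and is not a description of how Green Metrics does it. The function names and the grid-intensity figure are likewise hypothetical.

```python
def apportion_power(host_watts, idle_watts, cpu_seconds):
    """Split measured host power across workloads.

    Assumed rule: idle power is shared evenly, and the dynamic
    remainder is split in proportion to each workload's CPU time.
    """
    n = len(cpu_seconds)
    total_cpu = sum(cpu_seconds.values())
    dynamic = max(host_watts - idle_watts, 0.0)
    shares = {}
    for name, cpu in cpu_seconds.items():
        active = dynamic * cpu / total_cpu if total_cpu else 0.0
        shares[name] = idle_watts / n + active
    return shares


def carbon_grams(watts, hours, grid_g_per_kwh):
    """Convert a steady power draw into grams of CO2-equivalent."""
    return watts / 1000.0 * hours * grid_g_per_kwh


# Example: a host drawing 400 W, of which 100 W is assumed idle power.
shares = apportion_power(400.0, 100.0, {"web": 30.0, "db": 60.0, "batch": 10.0})
# One day of the "db" share on a grid assumed to emit 400 gCO2e/kWh.
db_daily_grams = carbon_grams(shares["db"], 24, 400.0)
```

Other attribution rules (for example, weighting by memory or I/O) would give different splits. The point is only that per-workload numbers require a model on top of the host-level reading.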
Maximizing productive host utilization
Disaster recovery in the cloud
Imagine the amount of energy usage, carbon emissions, and infrastructure costs the enterprise could eliminate if backup and disaster recovery (DR) datacenters were unnecessary! Unfortunately, they are necessary. But the VMware Cloud Disaster Recovery (VCDR) solution significantly reduces the infrastructure required for backup functionality, helping customers reduce the carbon footprint of their DR operations by up to 80%, with commensurate cost savings. Customers can quickly spin up capacity during DR operations and conduct DR testing and failover without the “always-on” infrastructure. In addition, VCDR operates on VMware Cloud on AWS, which is backed by AWS’s Climate Pledge to power its datacenters with 100% renewable energy by 2025, reducing carbon even further.
Software-defined datacenter on a SmartNIC
Project Monterey leverages the hardware capabilities of today’s “smart” network-interface cards (SmartNICs) to offload VMware’s compute, storage, and networking virtualization, security, and management capabilities. A SmartNIC is a NIC with a general-purpose CPU (ARM processor), out-of-band management, and virtualized device functionality. This offloading frees up a host’s core CPUs to support additional business applications, improves performance, and facilitates sharing of hardware components like GPUs, FPGAs, and storage across a host cluster. This can minimize the host infrastructure needed to support business workloads, reducing energy and carbon. The development team is working to quantify this energy and carbon-reduction benefit, along with potential CAPEX and OPEX savings, so stay tuned for those metrics.
VM Desired State Configuration (VMDSC)
One cause of low host utilization is the presence of oversized VMs that are provisioned with more CPU and/or memory than necessary. These resources are stranded and unavailable to other workloads. Over-provisioning a VM’s CPU and memory is believed to help avoid performance issues, which is why it’s so common. In aggregate, oversized VMs represent a lot of stranded resources, which is costly and comes with an unnecessary carbon footprint. While vRealize Operations and CloudHealth make it easy to find oversized VMs, remedying them is tricky, because rightsizing a VM requires shutting it down and restarting it to activate the new configuration. This can be a time-consuming hassle, requiring coordination with the business and application owners.
VM Desired State Configuration (VMDSC), a downloadable vCenter appliance, addresses this issue, making it easy and low-risk to resize VMs and recover stranded assets for productive use. This helps increase the productive utilization of existing hosts and slows the deployment of new infrastructure. VMDSC waits until the next guest OS reboot and then implements the new, rightsized VM configuration state. VMDSC also facilitates configuration expansion for under-sized VMs. If you are using vRealize Operations, it can help you identify VMs that need rightsizing, then invoke the VMDSC API to schedule selected VMs for rightsizing. Learn more about VMDSC.
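The "record now, apply at the next reboot" pattern described above can be sketched in a few lines. This is an illustration of the idea only, not the VMDSC API: the `DesiredStateStore` class and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VMConfig:
    vcpus: int
    memory_gb: int


class DesiredStateStore:
    """Tracks current and pending configurations per VM (hypothetical)."""

    def __init__(self) -> None:
        self.current: Dict[str, VMConfig] = {}
        self.desired: Dict[str, VMConfig] = {}

    def register(self, vm: str, config: VMConfig) -> None:
        self.current[vm] = config

    def set_desired(self, vm: str, config: VMConfig) -> None:
        # Record intent only; the running VM is left untouched.
        self.desired[vm] = config

    def on_guest_reboot(self, vm: str) -> VMConfig:
        # Apply any pending configuration while the guest is down anyway.
        pending = self.desired.pop(vm, None)
        if pending is not None:
            self.current[vm] = pending
        return self.current[vm]


store = DesiredStateStore()
store.register("app-01", VMConfig(vcpus=8, memory_gb=32))
store.set_desired("app-01", VMConfig(vcpus=4, memory_gb=16))
before_reboot = store.current["app-01"]       # still 8 vCPU / 32 GB
after_reboot = store.on_guest_reboot("app-01")  # now 4 vCPU / 16 GB
```

Because the change rides along with a reboot that was going to happen anyway, no separate maintenance window has to be negotiated with application owners.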
Zombie VM detection
Here, we are defining “zombies” as VMs and servers that were originally deployed for a specific purpose but are no longer doing useful work. We find that zombies are common in customers’ cloud environments (from 15% to more than 50% of servers or VMs). This can be a significant cost and security risk and represents an enormous environmental impact in the form of wasted infrastructure capacity, energy consumption, and carbon emissions. For example, in 2019, in the process of completing a datacenter migration, VMware found that 47% of its VMs were not being used and were retired! Zombies are so prevalent because they are easy to create (how many zombie apps do you have on your phone?) but can be hard to find. VMware’s vRealize Operations and CloudHealth can help find the ones that are powered off or have little to no CPU utilization. However, many zombies have some residual activity that is ancillary to the primary application, such as virus scanning, patch updates, and backups. Current zombie detectors will miss these “crawling zombies” because, based on their activity, they look like they might be doing productive work.
To overcome the challenge, we are testing an approach that monitors VMs over their lifespans and watches for dramatic and persistent drop-offs in their activity. Any seasonal and quasi-periodic activity is removed from the activity signal. If the remaining activity is persistently low, then the VM becomes a zombie candidate and is watched further. Correlated zombie behavior across multiple activity metrics reinforces evidence of zombie status. Since legitimate VMs can go dormant for long periods (weeks or months), our zombie detection is patient, so it will minimize false positives. If a VM becomes active again, it is removed from the zombie candidate list. Ultimately, our goal is to identify zombie VMs in customers’ cloud environments, highlight the associated financial and carbon costs, and provide remediation options (like VMDSC) to recover costs and stranded resources. We are in the process of testing this against customer data to hone the detection algorithm.
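As a rough illustration of the idea, the sketch below flags a VM whose activity has dropped and stayed low, while ignoring periodic spikes such as a weekly backup. It is not the algorithm under test at VMware. A trailing rolling median stands in for proper removal of seasonal activity, and the function names, window, ratio, and patience values are all hypothetical.

```python
from statistics import median


def rolling_median(series, window):
    """Trailing rolling median; short periodic spikes drop out of it."""
    out = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        out.append(median(series[start:i + 1]))
    return out


def is_zombie_candidate(series, window=7, drop_ratio=0.1, patience=28):
    """True if smoothed activity stayed below drop_ratio times the
    earlier peak level for each of the last `patience` samples."""
    if len(series) < window + patience:
        return False  # not enough history: stay patient
    baseline = rolling_median(series, window)
    history, recent = baseline[:-patience], baseline[-patience:]
    reference = max(history)
    if reference <= 0:
        return False
    return all(value <= drop_ratio * reference for value in recent)


# Daily CPU %: 60 busy days, then 40 quiet days with a weekly backup spike.
crawling = [40.0] * 60 + [50.0 if d % 7 == 0 else 1.0 for d in range(40)]
# A VM that went quiet for a month and then resumed real work.
revived = [40.0] * 60 + [1.0] * 30 + [40.0] * 10

# A naive check on the raw 28-day mean sees 8.0, above the 4.0 cutoff,
# so it would miss the crawling zombie.
naive_recent_mean = sum(crawling[-28:]) / 28
```

Here `is_zombie_candidate(crawling)` is true while `is_zombie_candidate(revived)` is false, which mirrors the rule that a VM leaves the candidate list once it becomes active again. A real detector would also correlate several activity metrics, as described above, rather than rely on CPU alone.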
Flowgate: Converging facility and IT management in datacenters
Historically, datacenter and IT infrastructure have been managed as separate systems, despite their strong interdependencies. Datacenter management systems abound with data on datacenter infrastructure (DCIM), assets, IP, configuration (CMDB), facility topology, and power and environment, but they have little visibility into the IT application layer. Similarly, IT management systems have no knowledge of datacenter operations or state outside the servers. For example, datacenter cooling systems merely react to changes in IT heat loads as server activity increases or decreases, and the remedy for hot spots is to blast more cold air. The outcome is sub-optimal operations across both infrastructures.
Project Flowgate, an open-source, vendor-agnostic vRealize Operations plugin, helps enterprises converge facility and IT systems to form a unified view for more efficient operations management. Flowgate ingests, processes, and correlates metadata and runtime metrics from various systems, such as DCIM, CMDB, and IT systems. By combining these disparate data sets into one view, datacenter and IT managers are better equipped to optimize operations and make smarter management choices. This unified information is accessible to IT administrators via IT infrastructure-management systems, such as vCenter and vRealize Operations. They can also view and analyze additional resources (such as power supply, cooling capacity, and temperature and humidity for every specific server) and conduct many facility-aware operations that were previously impossible. Early results with Quarkdata for China Telecom show an initial cooling system power savings of 35%.
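To show what correlating the two data sets makes possible, here is a minimal sketch that joins facility readings with IT metrics and lists heavily loaded hosts sitting in hot spots. It does not use Flowgate’s data model or API: the shared asset ID, the field names, and the temperature limit are assumptions for illustration.

```python
def unify(facility, it_metrics):
    """Join facility and IT records on a shared asset ID (assumed key)."""
    view = {}
    for asset_id in facility.keys() & it_metrics.keys():
        view[asset_id] = {**facility[asset_id], **it_metrics[asset_id]}
    return view


def hot_spot_hosts(view, inlet_limit_c=27.0):
    """Hosts above the inlet temperature limit, busiest first.

    These are candidates for moving workloads elsewhere instead of
    simply increasing cooling.
    """
    hot = [a for a, rec in view.items() if rec["inlet_temp_c"] > inlet_limit_c]
    return sorted(hot, key=lambda a: view[a]["cpu_pct"], reverse=True)


# Facility-side readings (for example, from a DCIM system).
facility = {
    "srv-01": {"rack": "A1", "inlet_temp_c": 29.5, "power_w": 410},
    "srv-02": {"rack": "A2", "inlet_temp_c": 22.0, "power_w": 180},
    "srv-03": {"rack": "B1", "inlet_temp_c": 28.0, "power_w": 350},
}
# IT-side metrics (for example, from a virtualization manager).
it_metrics = {
    "srv-01": {"cpu_pct": 85.0, "vm_count": 14},
    "srv-02": {"cpu_pct": 20.0, "vm_count": 5},
    "srv-03": {"cpu_pct": 60.0, "vm_count": 9},
    "srv-04": {"cpu_pct": 35.0, "vm_count": 7},  # no facility record
}

view = unify(facility, it_metrics)
candidates = hot_spot_hosts(view)
```

Neither data set alone can answer this question. The facility side knows where the hot spots are, and the IT side knows which workloads are generating the heat.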
Flowgate is available as a vRealize Operations management pack, as of release 8.2.
Designing compute-efficient and carbon-aware applications
Sustainable software development
An exciting, emerging area of sustainable computing is sustainable software development (SSD). It is focused on two key areas: 1) minimizing the datacenter infrastructure required to run application instances through efficient software design, architecture, and coding, and 2) designing software to be carbon-aware to run when and where energy is cleanest. To this end, VMware recently joined the Green Software Foundation (GSF) to learn about and contribute to the development of SSD principles, tools, libraries, metrics, SDKs, and standards. While we are early on in this adventure, we are excited to see how we can apply SSD to application modernization to help customers reduce both the costs and carbon footprints of running their enterprise applications.
You can learn more about VMware’s recent GSF membership here.
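To illustrate the carbon-aware half of SSD, the sketch below picks the start time for a deferrable job that minimizes average grid carbon intensity over the job’s duration. It assumes an hourly forecast is available from some source. The forecast values and the `best_start` function are hypothetical and not part of any GSF or VMware tool.

```python
def best_start(forecast, duration):
    """Return (start_index, mean_intensity) of the cleanest window.

    `forecast` is a list of hourly grid carbon intensities (gCO2e/kWh)
    and `duration` is the job length in hours.
    """
    if duration <= 0 or duration > len(forecast):
        raise ValueError("duration must fit inside the forecast")
    window = sum(forecast[:duration])
    best_i, best_sum = 0, window
    for i in range(1, len(forecast) - duration + 1):
        # Slide the window one hour: add the new hour, drop the old one.
        window += forecast[i + duration - 1] - forecast[i - 1]
        if window < best_sum:
            best_i, best_sum = i, window
    return best_i, best_sum / duration


# Assumed 12-hour forecast, cleanest around the middle of the period.
forecast = [450, 420, 380, 300, 210, 180, 190, 260, 340, 410, 460, 480]
start, mean_intensity = best_start(forecast, duration=3)
```

With this forecast, a three-hour job started at hour 4 runs at roughly 193 gCO2e/kWh on average, compared with about 417 gCO2e/kWh if it started immediately. The same selection logic can be applied across regions ("where") as well as across time ("when").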
Developer Sustainability Dashboard
The Developer Sustainability Dashboard for analyzing software builds is a prototype designed to raise awareness among developers and managers of the compute resources used for builds, both sandbox and official. This real-time dashboard surfaces key information on the builds (such as energy and carbon impact) that was previously not visible, enabling development teams to make sustainability-conscious decisions throughout the development lifecycle.
The aim of this dashboard is to provide an intuitive visualization of the resources used in build processes — exposing data and metrics — to prompt changes in development practices. The product teams can take charge, understand the resources used for software development, and make informed sustainability-conscious choices.
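The kind of figures such a dashboard could surface can be sketched with simple arithmetic: energy is average power multiplied by build duration, and carbon is energy multiplied by grid carbon intensity. The build records, field names, and grid intensity below are hypothetical and do not reflect the prototype’s actual data.

```python
from collections import defaultdict

GRID_G_PER_KWH = 400.0  # assumed grid carbon intensity, gCO2e/kWh


def build_footprint(avg_power_w, duration_s, grid_g_per_kwh=GRID_G_PER_KWH):
    """Return (energy in kWh, carbon in grams CO2e) for one build."""
    energy_kwh = avg_power_w * duration_s / 3_600_000  # W*s -> kWh
    return energy_kwh, energy_kwh * grid_g_per_kwh


# Hypothetical build records.
builds = [
    {"team": "core", "kind": "official", "avg_power_w": 300, "duration_s": 3600},
    {"team": "core", "kind": "sandbox", "avg_power_w": 300, "duration_s": 1200},
    {"team": "ui", "kind": "sandbox", "avg_power_w": 150, "duration_s": 2400},
]

# Aggregate energy and carbon per team, as a dashboard might.
totals = defaultdict(lambda: {"kwh": 0.0, "g_co2e": 0.0})
for build in builds:
    kwh, grams = build_footprint(build["avg_power_w"], build["duration_s"])
    totals[build["team"]]["kwh"] += kwh
    totals[build["team"]]["g_co2e"] += grams
```

Even a simple rollup like this makes patterns visible, such as how much of a team’s footprint comes from sandbox builds rather than official ones.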
Powering workloads with renewable energy
Zero Carbon Committed
In May of last year, VMware launched our Zero Carbon Committed (ZCC) public-cloud partner initiative. It was born from our recognition that as the only cross-cloud and cloud-neutral platform, VMware is uniquely positioned to partner with our customers to help them achieve their decarbonization goals, so we can all work together towards a sustainable future. We have an enormous opportunity to accelerate the decarbonization of digital infrastructure through our public-cloud partner ecosystem.
The ZCC vision is that by 2030, all VMware public clouds will have zero carbon emissions through energy-efficient and 100% renewable-energy powered datacenters. Our portal helps customers easily find our VMware Zero Carbon Committed providers, who have set goals to power their datacenters with 100% renewable energy or achieve carbon neutrality by or before 2030. These partners are also running highly efficient operations using VMware’s full stack of virtualized compute, storage, and networking solutions, as well as VMware Cloud Director, which facilitates multi-tenancy to achieve high host utilization and high performance.
Envisioning a sustainable future
These projects are a sampling of the sustainability innovation that’s happening at VMware. To match and exceed the efficiency gains of the past decade and push towards zero carbon — even as Internet traffic and workloads grow exponentially — we will need to keep sustainability top of mind in every layer of the world’s cloud operations: IT infrastructure, DC infrastructure, software, and energy sourcing. This is a journey of co-innovation and collaboration with our employees, customers, and partners. Together we can redefine the future!
*Disclaimers: Overviews of new technology represent no commitment from VMware to deliver these features in any generally available product. This post may describe product features or functionality that are currently under development — technical feasibility and market demand will affect final delivery. Features this post refers to are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. This article may contain hyperlinks to non-VMware websites that are created and maintained by third parties who are solely responsible for the content on such websites.
Nicola is the Director of Sustainability Innovation at VMware in the Office of the CTO. She collaborates with product and R&D teams to improve product and operational efficiency helping customers.