Introduction
In the fast-growing world of containerized applications, efficient container data management is essential for operational success and cost-effectiveness. This blog highlights a case study from Crossvale, showing how a healthcare client of the PodOps Managed Platform Engineering Service overcame significant challenges in their Red Hat OpenShift environment. When slow storage caused application failures after an upgrade, the PodOps team recommended Portworx to the client, which led to roughly 10 times better storage performance for this use case and an estimated 63% reduction in overall operating costs. The case study illustrates the critical role of advanced container data management in unlocking the value of Red Hat OpenShift at scale. Read below how Crossvale is using Portworx to solve challenges for their customers.
Problem: Slow ODF storage caused application failures post-upgrade
Main issue
A PodOps℠ by Crossvale™ healthcare client was initially running Red Hat OpenShift 4.8 with Red Hat OpenShift Data Foundation (ODF) 4.8. Some of the applications hosted on this setup took approximately 20 minutes to start because they relied on a volume containing around 1 million files. After the upgrade to OpenShift 4.9 and ODF 4.9, these applications could no longer start at all. The only error message recorded in the logs was “context deadline exceeded.”
Additional issue
An ODF pod was using over 70 gigabytes of memory, more than the existing 64-gigabyte AWS (Amazon Web Services) instances could provide. Because the next available instance size was twice as large, the PodOps℠ team had to move the ODF nodes to 128-gigabyte instances. This significantly raised the AWS compute costs of maintaining the OpenShift cluster.
Solution: Change the container storage to Portworx
Diagnosis
While conducting performance testing, the Crossvale PodOps℠ Team determined that ODF was delivering inadequate storage speed. Further investigation and troubleshooting with the Red Hat support team revealed that, due to SELinux constraints, the entire filesystem on the volume had to be relabeled every time a pod started. To address this issue temporarily, we implemented a workaround that skips the relabeling. However, this workaround deviates from security best practices, so it could not be a permanent solution.
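To get a feel for why a per-file operation hurts on a volume with around 1 million files, it helps to time a plain recursive walk that reads each entry's metadata. The sketch below is only a rough proxy for a relabel (it reads metadata and does not touch SELinux labels), and the function name is a hypothetical one chosen for illustration:

```python
import os
import time
from pathlib import Path


def time_metadata_walk(root: Path) -> tuple[int, float]:
    """Visit every entry under root and read its metadata.

    Returns (entries_visited, elapsed_seconds). This only approximates the
    cost of touching each file once; it does not change any labels.
    """
    start = time.perf_counter()
    entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat reads the inode without following symlinks
            os.lstat(os.path.join(dirpath, name))
            entries += 1
    return entries, time.perf_counter() - start
```

Running this against a mounted copy of the volume on each storage backend gives a quick, like-for-like indication of how metadata-heavy operations scale with the file count.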
This is where Portworx comes into the picture. Portworx was able to replicate the same use case on Portworx volumes and saw a significant improvement in performance, approximately 10 times better. For detailed test results, please reach out to Crossvale at https://www.crossvale.com/contact/
Implementation
Upon recognizing the significantly improved storage speed with Portworx, we encountered our initial challenge: migrating all volumes from the ODF StorageClass to a Portworx StorageClass. This process couldn’t simply involve renaming the StorageClass on the volumes; it required creating entirely new Portworx volumes for each existing ODF volume within the cluster and transferring all associated data.
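Since the StorageClass of an existing claim cannot be changed in place, every migrated volume needs a new claim of its own. As a minimal sketch of that step, assuming PersistentVolumeClaim manifests are handled as plain dictionaries, the function below derives a new claim from an existing one. The function name, the `-px` suffix, and the StorageClass names are hypothetical placeholders, not the names used in the actual migration:

```python
import copy


def new_claim_from(old_pvc: dict, new_storage_class: str, suffix: str = "-px") -> dict:
    """Build a new PVC manifest that mirrors old_pvc on a different StorageClass.

    Only the fields needed to request an equivalent volume are carried over;
    binding details such as volumeName are deliberately left out so that the
    new StorageClass provisions a fresh volume.
    """
    meta = old_pvc["metadata"]
    spec = old_pvc["spec"]
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": meta["name"] + suffix,  # hypothetical naming scheme
            "namespace": meta["namespace"],
            "labels": copy.deepcopy(meta.get("labels", {})),
        },
        "spec": {
            "accessModes": list(spec["accessModes"]),
            "resources": {
                "requests": {"storage": spec["resources"]["requests"]["storage"]}
            },
            "storageClassName": new_storage_class,
        },
    }


# Example with placeholder names
old = {
    "metadata": {"name": "app-data", "namespace": "demo", "labels": {"app": "demo"}},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "100Gi"}},
        "storageClassName": "odf-example-sc",
        "volumeName": "pvc-1234",
    },
}
new = new_claim_from(old, "portworx-example-sc")
```

The resulting manifest still has to be applied to the cluster and populated with data; the sketch covers only the manifest transformation.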
This operation presented several complexities, including the limitation imposed by ODF’s lower performance and the necessity to handle ongoing data writes by applications while copying data from the current volumes to the new Portworx volumes.
To address these challenges, our approach aimed to minimize downtime, data loss, and the potential for human error. As a solution, we developed automation capable of performing incremental backups for all volumes slated for migration and configuring applications with the new volumes. Crucially, this automation could conduct backups while the applications remained operational, without affecting performance. While the initial backup process took hours and, in some cases, days for certain volumes, subsequent incremental backups completed in a matter of seconds, contingent on the time elapsed since the last backup.
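The reason later passes finish so quickly is that an incremental copy only transfers what changed since the previous pass. As a simplified illustration (not the actual automation), the sketch below copies a file only when it is missing at the destination or its size or modification time differs. It assumes both volumes are mounted as local directories, and the function name is hypothetical:

```python
import os
import shutil
from pathlib import Path


def incremental_copy(src: Path, dst: Path) -> int:
    """Copy new or changed files from src to dst; return how many were copied.

    A file counts as unchanged when size and modification time both match,
    which works because shutil.copy2 preserves the modification time.
    Deletions on the source side are not propagated in this sketch.
    """
    copied = 0
    for root, _dirs, files in os.walk(src):
        rel = Path(root).relative_to(src)
        (dst / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            source = Path(root) / name
            target = dst / rel / name
            s = source.stat()
            if target.exists():
                t = target.stat()
                if t.st_size == s.st_size and t.st_mtime_ns == s.st_mtime_ns:
                    continue  # already up to date, skip the transfer
            shutil.copy2(source, target)
            copied += 1
    return copied
```

The first call copies everything, while a repeat call on an unchanged tree copies nothing, so the final pass during a maintenance window only has to move the files written since the previous pass.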
In the final phase of implementation, we executed the automation during a maintenance window, ensuring a seamless transition. Remarkably, all applications were operational with the new volumes configured in less than 5 minutes, preserving data integrity throughout the process.
The PodOps℠ Team, as part of the Managed Platform Engineering maintenance service, spearheaded the design and execution of this solution. Using a service like PodOps℠ by Crossvale™ together with a high-performance platform like Portworx reduces the need for, and the cost of, maintaining internal FTEs with expertise in troubleshooting and resolving complex containerization issues.
Testing
To test under realistic conditions, we migrated all volumes from the Production (Prod) cluster to the Laboratory (Lab) cluster and then ran the automation multiple times in the Lab environment. We also carried out a comprehensive assessment of data integrity and application functionality. During the Production intervention, we collaborated with the client’s developers, who conducted their own validation process.
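One common way to assess data integrity after a copy is to compare checksums of every file on both sides. The sketch below shows that idea under the assumption that the source and destination volumes are both mounted as directories; it is an illustration with hypothetical function names, not the validation procedure used in this project:

```python
import hashlib
import os
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            h = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def diff_trees(src: Path, dst: Path) -> dict[str, list[str]]:
    """Report files missing from dst, extra in dst, or with different content."""
    a, b = tree_digests(src), tree_digests(dst)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

A migration can be considered byte-for-byte complete when all three lists come back empty. Hashing every file is slow on a volume with around 1 million files, so a check like this fits a Lab run better than a short maintenance window.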
Gain: Applications launch, reducing storage and client costs
Performance Enhancement
As demonstrated by the benchmarks, Portworx not only provided over 10 times better performance than ODF for this use case but also significantly reduced resource consumption. This improvement both lowered costs and notably enhanced the usability of the application.
In fact, this transition to Portworx resulted in a remarkable estimated 63% reduction in overall operating costs for the customer, encompassing infrastructure expenses, licenses, and subscriptions when compared to the combined costs of ODF and the required infrastructure.
Lessons Learned
As an integral part of PodOps℠ Best Practices, we follow a Standard Operating Procedure (SOP) that we apply consistently across our customer base. This SOP is the result of lessons learned from experiences like the one described here. Its primary purpose is to mitigate the risk of failures and downtime for our clients while improving the performance of our maintenance processes.
We recognize that a robust container storage solution is vital for containerized applications, offering crucial functionalities such as data persistence, high availability, scalability, snapshots, and backup, all of which contribute significantly to data protection and efficient recovery.
Through these experiences, we’ve come to appreciate the importance of having a backup of applications and an alternative cluster readily available before proceeding with upgrades. To address this need, we have developed an automation system that empowers us to swiftly create a new cluster with all the applications up and running in a matter of hours, minimizing disruptions and ensuring a seamless transition.
To discover how Crossvale PodOps and Portworx can transform your container operations while also reducing operating costs, check out Crossvale’s PodOps Managed Platform Engineering and Portworx at Portworx.com