Reducing Operational Costs for Video Hosting

Project Description
The client provides high-load video hosting services. The entire infrastructure runs on bare-metal servers in Europe and Canada, with UCDN as a backup CDN that absorbs load spikes in emergencies. The content is divided into several independent blocks, each served by its own server group. Our team had previously described the entire infrastructure as code (Ansible, Terraform).

The main request from the client was to reduce operational expenses on infrastructure while maintaining data storage reliability and availability.

Key Metrics
  • 54.7% reduction in operational expenses
  • 72% reduction in server fleet with increased reliability
  • 83% of requests served from the SSD cache
  • 3+ petabytes of monthly traffic for each server group
  • 400+ domain names associated with various projects

Project Goals
Reduce operational expenses by 30% while maintaining 99.9% availability.

Key Challenges and Results
We analyzed how efficiently the nodes delivered content and identified IOPS as the bottleneck. Further analysis showed that during peak user activity, most requests targeted a small portion of the total content volume. We ran an experiment and proposed a new node deployment variant:
  • Increased the storage density of the disk subsystem by replacing the existing HDDs with higher-capacity drives.
  • Increased the capacity of the SSD that hosts the OS, freeing space for caching.
  • Configured a custom version of bcache, a technology that builds hybrid storage from slow but high-capacity HDDs and fast SSDs.
  • Optimized file system parameters after analyzing the storage structure, which made access to a random file on the HDD backend 12% faster.

As a result of these changes to the typical server configuration, we achieved a 20x to 30x increase in IOPS per node. On average, 83% of all requests were served from the fast SSD.

We also audited and financially optimized spending on the guaranteed outbound bandwidth for the content distribution servers. After analyzing the databases, we developed a plan to defragment the storage systems by removing unused content and migrating to newer, more efficient servers. The migration was seamless: there were no user-facing failures, and load was switched over smoothly for more than 400 domains.
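
The finding that a small share of content draws most peak requests can be checked before committing SSD space to caching. The following is a minimal sketch, not the tooling used in the project: the function name, the log format (one file id per request), and the toy numbers are all assumptions made for illustration. It estimates what share of requests would be served from a cache that holds only the most-requested files.

```python
from collections import Counter


def estimated_hit_ratio(requests, sizes, cache_bytes):
    """Estimate the share of requests served from a cache of hot files.

    requests:    iterable of file ids, one per request (hypothetical log format)
    sizes:       dict mapping file id -> size in bytes
    cache_bytes: SSD space available for caching
    """
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0:
        return 0.0

    used = 0
    hits = 0
    # Greedily fill the cache with the most-requested files first.
    for file_id, n in counts.most_common():
        size = sizes[file_id]
        if used + size > cache_bytes:
            continue  # does not fit; a smaller, colder file still might
        used += size
        hits += n
    return hits / total


# Toy data for illustration only, not project measurements.
log = ["a"] * 6 + ["b"] * 3 + ["c"]
sizes = {"a": 40, "b": 40, "c": 40}
ratio = estimated_hit_ratio(log, sizes, cache_bytes=80)
print(f"{ratio:.0%}")  # prints 90%
```

This is a static idealization: a real block cache admits and evicts data dynamically, so the result is best read as a rough sizing estimate rather than a prediction of the measured hit rate.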
As a result, we reduced the number of rented servers by 72% while increasing storage reliability by moving from RAID-5 to RAID-6. The final reduction in operational expenses reached 54.7%, well above the planned 30%.
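
The RAID-5 to RAID-6 trade-off comes down to simple arithmetic: RAID-6 spends one more disk per array on parity and in return survives two simultaneous disk failures instead of one. The sketch below illustrates this; the function name and the array size (12 disks of 16 TB) are hypothetical and do not reflect the client's actual hardware.

```python
def raid_profile(level, disks, disk_tb):
    """Return usable capacity and tolerated disk failures for RAID-5 or RAID-6."""
    parity_disks = {5: 1, 6: 2}[level]  # capacity of this many disks goes to parity
    min_disks = parity_disks + 2        # RAID-5 needs at least 3 disks, RAID-6 at least 4
    if disks < min_disks:
        raise ValueError(f"RAID-{level} needs at least {min_disks} disks")
    return {
        "usable_tb": (disks - parity_disks) * disk_tb,
        "tolerated_failures": parity_disks,
    }


# Hypothetical array: 12 disks of 16 TB each.
raid5 = raid_profile(5, disks=12, disk_tb=16)
raid6 = raid_profile(6, disks=12, disk_tb=16)
print(raid5)  # {'usable_tb': 176, 'tolerated_failures': 1}
print(raid6)  # {'usable_tb': 160, 'tolerated_failures': 2}
```

In this example the second parity disk costs about 9% of usable capacity, a price that shrinks as the number of disks per array grows.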
