{"id":20,"date":"2025-02-24T16:30:20","date_gmt":"2025-02-24T16:30:20","guid":{"rendered":"https:\/\/s461.sofamoci.com\/?p=20"},"modified":"2025-02-24T16:30:20","modified_gmt":"2025-02-24T16:30:20","slug":"how-to-optimize-your-cloud-data-costs-4-steps-to-reduce-cloud-data-platform-costs","status":"publish","type":"post","link":"https:\/\/s461.sofamoci.com\/?p=20","title":{"rendered":"How to optimize your cloud data costs: 4 steps to reduce cloud data platform costs"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1800\" height=\"1200\" src=\"https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3-1024x683.jpg\" alt=\"\" class=\"wp-image-21\" srcset=\"https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3-1024x683.jpg 1024w, https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3-300x200.jpg 300w, https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3-768x512.jpg 768w, https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3-1536x1024.jpg 1536w, https:\/\/s461.sofamoci.com\/wp-content\/uploads\/2025\/02\/3.jpg 1800w\" sizes=\"auto, (max-width: 1800px) 100vw, 1800px\" \/><\/figure>\n\n\n\n<p>If you have managed a cloud data platform, you have undoubtedly gotten that call.&nbsp; You know the one, it&#8217;s usually from finance or the office of the CFO, inquiring about your monthly spend. And it usually comes in one of two forms:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>This usage trend is on track to exceed our annual budget and contract. WTF?<\/li>\n\n\n\n<li>What the hell is going on with usage? 
It shot up by 30% last month!<\/li>\n<\/ul>\n\n\n\n<p>While both are clear and present dangers to cloud data platform owners, they don\u2019t have to be. In this post, I\u2019ll share four essential components of a cost control capability for your cloud data platform (CDP). I\u2019ll provide some examples using Snowflake, but the features and principles will also apply to other CDPs.<\/p>\n\n\n\n<p>Financial management (FinOps) for cloud infrastructure is one of the most pressing issues for organizations today. With Gartner now predicting cloud data consumption will rise to 75% of all cloud data, up from less than 10% in 2016, it\u2019s not surprising to see new products like Capital One Slingshot or CloudZero, which help organizations manage the cost of the Snowflake data platform. These principles will apply whether you use a third-party tool, the CDP\u2019s tools, or build custom solutions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Visibility<\/h2>\n\n\n\n<p>We\u2019ve all heard the saying, \u201cif you can\u2019t measure it, you can\u2019t manage it.\u201d It holds for controlling cost and is the starting point for developing our cost control capabilities. With calls for more visibility into usage and cost, cloud data platform providers realize that more observability is good for both customers and vendors. Many have begun providing dashboards and user interfaces for increased visibility. Although having a single pane of glass that offers a view into real-time or near real-time usage metrics is critical, cost control capabilities will need to go deeper, because incurring a $20K overrun from a killer query now happens too easily.<\/p>\n\n\n\n<p>An essential part of surfacing usage insights is tagging your resources with additional metadata. Cloud data platforms have rich metadata capabilities that vendors make available to customers. Attaching metadata or tags to the granular usage data will provide valuable insights into usage, performance, and cost.
You can now better understand the usage and cost associated with a resource and align it with the line of business to understand the value. Be sure to apply metadata to all cost-incurring resources such as compute, tasks, or storage. For example, Snowflake supports query tagging at the user and session level, and each query\u2019s tag is then recorded in the query history.<\/p>\n\n\n\n<p>It is worth noting that developing a quality taxonomy for tags will enable us to do much more than cost and usage controls. For example, some platforms offer metadata-driven policies that dynamically apply privacy and access controls.<\/p>\n\n\n\n<p>The compute cost is often the most significant portion of the cloud data cost, so it makes sense to track it closely. Here again, attaching additional metadata to virtual warehouses (elastic compute) will provide further insights. Additionally, tools like dbt enable metadata tagging when developing data models.<\/p>\n\n\n\n<p>There are a few other resources to tag and review besides the virtual warehouse. Take a hard look at the \u201cserverless\u201d platform features as well. Serverless features are great for ease of use but can lead to unseen and unmanaged costs. These fall into the following categories:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Automated ingestion like Snowflake\u2019s Snowpipe.<\/li>\n\n\n\n<li>Background or scheduled tasks.<\/li>\n\n\n\n<li>Automated re-indexing or re-clustering.<\/li>\n<\/ul>\n\n\n\n<p>Here are a few tips:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>First, tag every cost-incurring resource with additional metadata.<\/li>\n\n\n\n<li>Ensure all virtual warehouses have auto-suspend and auto-resume configured.<\/li>\n\n\n\n<li>Visualize usage trends.
Even a simple forecast can save thousands of dollars.<\/li>\n\n\n\n<li>Visualize and track the total number of virtual warehouses over time to reduce compute sprawl.<\/li>\n\n\n\n<li>Measure the utilization of the warehouses: too high usually means high concurrency or undersized resources, while too low means wasted consumption.<\/li>\n\n\n\n<li>Visualize the usage and costs for all environments (e.g., development, QA, production).<\/li>\n\n\n\n<li>Tag and monitor real-time streaming or change data capture (CDC) separately from other compute services (warehouse for analytics, data feeds, etc.).<\/li>\n\n\n\n<li>Experiment and simplify.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Monitors and Controls<\/h2>\n\n\n\n<p>So, now that you have visualized your insights, let&#8217;s follow that with actions. First, your cost control capability needs proactive monitoring and automated controls.<\/p>\n\n\n\n<p>Resource monitoring and alerting are critical components of cost control, and these features vary by CDP. Therefore, review the CDP&#8217;s monitoring and alerting capabilities to better understand which levers you can pull.<\/p>\n\n\n\n<p>As an example, let\u2019s examine Snowflake\u2019s approach. Snowflake has resource monitors at the account and warehouse levels. Resource monitors track usage, compare it to quotas, and perform actions. It is worth noting that multiple resources can also be grouped into a single resource monitor.
Snowflake\u2019s resource monitors support three types of actions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Notification only.<\/li>\n\n\n\n<li>Notification and Suspend.<\/li>\n\n\n\n<li>Notification and Suspend immediately.<\/li>\n<\/ul>\n\n\n\n<p>Here are a few best practices for monitoring and alerting:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>For each account in the organization, create a resource monitor that only sends alerts.<\/li>\n\n\n\n<li>Monitor all cost-incurring resources using quotas, notifications, and service suspension actions or rules.<\/li>\n\n\n\n<li>Add metadata tags to all serverless features that fall outside of resource monitoring.<\/li>\n\n\n\n<li>Use additional custom actions to alert and suspend services where needed.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><a><\/a>Optimization<\/h2>\n\n\n\n<p>Price performance is the relationship between cost and performance, and it\u2019s not always easy to understand or interpret. So, when it comes to consumption-based pricing, price performance is another form of cost control. Look at the diagram below; why does x-small compute cost more than x-large while still having poorer performance? Why does going from 2x-large to 3x-large increase cost by over 60% and provide little performance gain? In the first case, it may be due to data spillage, effectively thrashing, by the compute resources. In the second case, compute has been over-provisioned, so there are excess compute resources for the job.<\/p>\n\n\n\n<p>Let\u2019s face it: data solutions must embrace cloud-first or cloud-native approaches because the old methods used for on-premises solutions aren\u2019t well suited for the cloud. Long-running queries, poorly designed models, and needlessly scheduled data pipelines are candidates for cost control and an opportunity to delight our stakeholders.<\/p>\n\n\n\n<p>With some refactoring focused on performance optimization, you can save 15-20% on compute costs.
Try some of these cloud cost optimization tips:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Watch out for data spillage, the equivalent of your computer thrashing or writing data to disk due to low memory. Cloud data platforms do this, too, and it usually indicates undersized compute.<\/li>\n\n\n\n<li>Properly size your compute for the workload. Over-provisioning for small queries will not improve performance. Likewise, under-provisioning will not save money on large queries. Due to the \u201cdata spillage\u201d issue listed above, it will usually cost more.<\/li>\n\n\n\n<li>Ensure your analytics tools support \u201cjoin optimization\u201d (include only the minimum number of joins required).<\/li>\n\n\n\n<li>Develop separate pipeline and replication schedules for each environment. Sizing each environment properly can yield significant savings. For example, development and integration environments may need fewer compute resources and fewer, smaller data pipelines, and costly real-time replication may be reduced to daily or on-demand replication.<\/li>\n\n\n\n<li>As always, simplify wherever you can.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><a><\/a>Collaboration<\/h2>\n\n\n\n<p>Collaborate, collaborate, collaborate. Create a cost control team with stakeholders from architecture, engineering, analytics, and business departments. Having a diverse group dramatically increases the impact of any cost control initiative. The typical time commitment is 30 minutes weekly, and even less once the process is up and running. This team sets priorities, leverages technical expertise, and ensures stakeholder buy-in.<\/p>\n\n\n\n<p>Quick wins are crucial for the long-term success of the team. Knowing this, I like to create aggressive 30-, 60-, and 90-day goals for the team.<\/p>\n\n\n\n<p><strong>Pro tip:<\/strong>&nbsp;Don\u2019t be afraid to set a Big Hairy Audacious Goal, or BHAG, as management guru Jim Collins calls them.
Often, these are easily achieved because there is so much low-hanging fruit. For example, in my last role as a data leader in financial services, we had a BHAG of a 15% reduction in compute credits in 30 days. A few months later, a peer called asking, \u201cWhat did you do? We were using compute growth as a proxy measure for data ingestion progress &amp; productivity.\u201d Aside from exposing the wrong metric for data ingestion progress, our early quick wins in cost controls were a huge success.<\/p>\n\n\n\n<p>Leverage the momentum of those quick wins by creating longer-term goals for systemic cost improvements. CFOs and CDOs must become more symbiotic in the cloud data world, so having more cost transparency coupled with those quick wins will go a long way.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><a><\/a>Where to go from here?<\/h2>\n\n\n\n<p>All of the above are important to any cost control capability you undertake, whether you use third-party tools, your CDP\u2019s features, a custom set of services, or some combination. There is rarely a single solution to any problem. I recommend forming your team now, getting started ASAP (you\u2019re only one unmonitored resource away from a fire drill!), and putting the right resources in place to surface, monitor, and optimize cloud cost and usage.<\/p>\n\n\n\n<p>So next time you have a call with the CFO, she may just be saying, \u201cGreat job! Look how much money we saved this month!\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have managed a cloud data platform, you have undoubtedly gotten that call.&nbsp; You know the one, it&#8217;s usually from finance or the office of the CFO, inquiring about your monthly spend.
And it usually comes in one of&#8230; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-20","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/posts\/20","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=20"}],"version-history":[{"count":1,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/posts\/20\/revisions"}],"predecessor-version":[{"id":22,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=\/wp\/v2\/posts\/20\/revisions\/22"}],"wp:attachment":[{"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=20"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=20"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/s461.sofamoci.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=20"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}