Sustainable Architectural Patterns: Storage, Cache and Batch
The architectural decisions we make every day — how we structure storage, how we manage caching, how we orchestrate batch jobs — determine the carbon footprint of software to an extent 10-100 times greater than micro-optimizations in code. A poorly designed architecture that queries the database on every request, does not use caching, and processes data in real time when it could do so in batch, can consume orders of magnitude more energy than a well-designed architecture.
According to an analysis by the Green Software Foundation Impact Framework, architectural choices related to storage tiering, caching, and batch scheduling represent 40-60% of the emission reduction potential of an enterprise software system. This is not just about technical optimization: it is a professional responsibility that every architect and developer should consider an integral part of their craft.
In this ninth article of the Green Software series, we will explore the most effective sustainable architectural patterns: from intelligent storage tiering to carbon-aware caching, from batch scheduling during green energy windows to sustainable API design. With real code examples and a complete case study demonstrating a 45% carbon footprint reduction for an e-commerce site with 1 million daily visits.
What You Will Learn
- Tiered storage (hot/warm/cold/archive) and data lifecycle policies to reduce energy
- Carbon-aware caching patterns: geo-intelligent CDN, multi-level caching, optimized invalidation
- Batch processing during green energy windows with Carbon Aware SDK
- Right-sizing and auto-scaling down: how to avoid "idle computing" that wastes energy
- Sustainable database patterns: query optimization, materialized views, read replicas
- Network efficiency: compression, HTTP/3, edge computing
- Sustainable frontend patterns: lazy loading, image optimization, dark mode energy
- Sustainable API design: pagination, field selection, GraphQL vs REST
- Carbon monitoring with Prometheus, Grafana, and SCI score per service
- E-commerce case study: -45% carbon footprint by applying all patterns
Green Software Series — 10 Articles
| # | Title | Focus | Status |
|---|---|---|---|
| 1 | Green Software Engineering Principles | GSF, SCI, 8 core principles | Published |
| 2 | Measuring Emissions with CodeCarbon | CodeCarbon, Python energy profiling | Published |
| 3 | Carbon Aware SDK: Intelligent Scheduling | Temporal and geographic workload shifting | Published |
| 4 | Climatiq API: Real-Time Emission Data | Emissions API, conversion factors | Published |
| 5 | GreenOps: Sustainable Infrastructure as Code | Green Terraform, spot instances, auto-scaling | Published |
| 6 | AI and Carbon Footprint: Responsible Training | Efficient models, LoRA, quantization | Published |
| 7 | Scope 3 in Software Pipelines | Upstream/downstream emissions, supply chain | Published |
| 8 | Scope Modeling: Simulating Impact | Simulation, what-if analysis, green roadmap | Published |
| 9 | Sustainable Architectural Patterns | Storage, Cache, Batch, API design | This article |
| 10 | ESG, CSRD and Compliance for Software Teams | Mandatory reporting, audit trail, ESG metrics | Next |
Storage Tiering: The Right Data in the Right Place
The first major source of energy waste in modern systems is unoptimized storage: rarely accessed data sitting on high-performance NVMe SSDs or, worse, in RAM. An enterprise SSD draws 6-10 watts at idle and up to 25 watts under load, an HDD draws 5-8 watts, while tape or cold archive storage draws less than 0.01 watts per TB at idle.
The tiered storage strategy consists of classifying data based on access frequency and automatically moving it to the energy-appropriate storage level. An average organization that correctly implements tiered storage reduces storage costs by 40-70% and the associated carbon footprint by a similar proportion.
4-Level Storage Architecture
| Level | Type | Latency | Cost/TB/month | Energy/TB | Typical Use |
|---|---|---|---|---|---|
| Hot | NVMe SSD / Redis | < 1ms | $200-500 | High (25W) | Active data from last 30 days |
| Warm | Standard SSD / S3 Standard | 1-10ms | $20-50 | Medium (8W) | Data 1-12 months, weekly access |
| Cold | HDD / S3 IA / GCS Nearline | 50-250ms | $4-10 | Low (3W) | Data 1-7 years, monthly access |
| Archive | Glacier / GCS Archive / Tape | 1-12 hours | $0.40-1 | Minimal (<0.01W) | Data >7 years, compliance, backup |
Implementing Automatic Data Lifecycle Policies
The core of tiered storage is lifecycle policy automation. Instead of manually managing data movement, we define rules that the system applies automatically based on data age and access frequency.
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional
import boto3
import logging
logger = logging.getLogger(__name__)
class StorageTier(Enum):
HOT = "hot"
WARM = "warm"
COLD = "cold"
ARCHIVE = "archive"
# Energy profile per tier: idle power (W per TB stored) and access energy (Wh per GB accessed)
TIER_ENERGY_PROFILE = {
StorageTier.HOT: {"idle_w_per_tb": 25, "access_wh_per_gb": 0.002},
StorageTier.WARM: {"idle_w_per_tb": 8, "access_wh_per_gb": 0.0008},
StorageTier.COLD: {"idle_w_per_tb": 3, "access_wh_per_gb": 0.0003},
StorageTier.ARCHIVE: {"idle_w_per_tb": 0.01, "access_wh_per_gb": 0.00001},
}
@dataclass(frozen=True)
class LifecyclePolicy:
"""Immutable: defines transition thresholds between tiers."""
hot_to_warm_days: int = 30
warm_to_cold_days: int = 90
cold_to_archive_days: int = 365
delete_after_days: Optional[int] = None # None = keep forever
@dataclass(frozen=True)
class DataAsset:
"""Represents a data asset with its metrics."""
key: str
size_gb: float
created_at: datetime
last_accessed: datetime
current_tier: StorageTier
access_count_30d: int
class DataLifecycleManager:
"""Manages data lifecycle toward energy-optimal tiers."""
def __init__(self, policy: LifecyclePolicy, s3_client=None):
self._policy = policy
self._s3 = s3_client or boto3.client("s3")
def evaluate_tier_transition(self, asset: DataAsset) -> Optional[StorageTier]:
"""
Determines if an asset should be moved.
Returns the new tier or None if no change is needed.
"""
age_days = (datetime.utcnow() - asset.created_at).days
days_since_access = (datetime.utcnow() - asset.last_accessed).days
# Rule 1: archive for very old data
if age_days > self._policy.cold_to_archive_days:
if asset.current_tier != StorageTier.ARCHIVE:
return StorageTier.ARCHIVE
# Rule 2: cold for rarely accessed data
elif age_days > self._policy.warm_to_cold_days and asset.access_count_30d < 2:
if asset.current_tier in (StorageTier.HOT, StorageTier.WARM):
return StorageTier.COLD
# Rule 3: warm for mature data with moderate access
elif age_days > self._policy.hot_to_warm_days and asset.access_count_30d < 10:
if asset.current_tier == StorageTier.HOT:
return StorageTier.WARM
return None # No change needed
def calculate_carbon_savings(
self,
asset: DataAsset,
target_tier: StorageTier,
carbon_intensity_g_kwh: float = 350 # European average 2025
) -> dict:
"""Estimates monthly CO2 savings from a tier change."""
current_profile = TIER_ENERGY_PROFILE[asset.current_tier]
target_profile = TIER_ENERGY_PROFILE[target_tier]
# Monthly idle energy (30 days * 24 hours)
current_idle_wh = (current_profile["idle_w_per_tb"] * asset.size_gb / 1000) * 720
target_idle_wh = (target_profile["idle_w_per_tb"] * asset.size_gb / 1000) * 720
savings_wh = current_idle_wh - target_idle_wh
savings_kwh = savings_wh / 1000
savings_co2_g = savings_kwh * carbon_intensity_g_kwh
return {
"monthly_savings_kwh": round(savings_kwh, 4),
"monthly_savings_co2_g": round(savings_co2_g, 2),
"annual_savings_co2_kg": round(savings_co2_g * 12 / 1000, 3),
}
def execute_transition(self, asset: DataAsset, target_tier: StorageTier, bucket: str) -> dict:
"""Moves the asset to the new tier on S3 with the appropriate storage class."""
storage_class_map = {
StorageTier.WARM: "STANDARD_IA",
StorageTier.COLD: "GLACIER_IR",
StorageTier.ARCHIVE: "DEEP_ARCHIVE",
}
storage_class = storage_class_map.get(target_tier, "STANDARD")
savings = self.calculate_carbon_savings(asset, target_tier)
try:
# Copy with new storage class (immutable: creates new S3 object)
self._s3.copy_object(
CopySource={"Bucket": bucket, "Key": asset.key},
Bucket=bucket,
Key=asset.key,
StorageClass=storage_class,
MetadataDirective="COPY",
)
logger.info(
"Tier transition: %s -> %s | CO2 saved: %.2fg/month",
asset.current_tier.value, target_tier.value,
savings["monthly_savings_co2_g"]
)
return {"success": True, "savings": savings}
except Exception as e:
logger.error("Transition failed for %s: %s", asset.key, e)
return {"success": False, "error": str(e)}
# AWS S3 Lifecycle Policy as JSON (Infrastructure-as-Code alternative)
S3_LIFECYCLE_POLICY = {
"Rules": [
{
"ID": "GreenDataLifecycle",
"Status": "Enabled",
"Transitions": [
{"Days": 30, "StorageClass": "STANDARD_IA"},
{"Days": 90, "StorageClass": "GLACIER_IR"},
{"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
],
"Expiration": {"Days": 2555}, # 7 years then delete
}
]
}
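To make the idle-power arithmetic concrete, here is a minimal standalone sketch of the monthly savings calculation for moving 1 TB from hot to archive. It mirrors the formula in `calculate_carbon_savings` above; the 350 gCO2/kWh intensity is the same assumed European average.

```python
# Minimal sketch of the tier-transition savings math.
# Idle power per tier (W per TB) mirrors TIER_ENERGY_PROFILE above.
IDLE_W_PER_TB = {"hot": 25, "warm": 8, "cold": 3, "archive": 0.01}

def monthly_co2_savings_g(size_gb: float, src: str, dst: str,
                          carbon_intensity_g_kwh: float = 350) -> float:
    """Grams of CO2 saved per month by moving size_gb from tier src to dst."""
    hours_per_month = 720  # 30 days * 24 hours

    def idle_kwh(tier: str) -> float:
        # W/TB * TB * hours -> Wh, then /1000 -> kWh
        return IDLE_W_PER_TB[tier] * (size_gb / 1000) * hours_per_month / 1000

    return (idle_kwh(src) - idle_kwh(dst)) * carbon_intensity_g_kwh

# 1 TB moved from hot to archive saves roughly 6.3 kg CO2 per month:
print(round(monthly_co2_savings_g(1000, "hot", "archive"), 1))
```

The same function can feed a fleet-wide report: multiply by the number of eligible assets to size the opportunity before automating transitions.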
Compression: The Most Undervalued Pattern
Data compression simultaneously reduces storage consumption (fewer bytes to maintain) and network traffic (fewer bytes to transfer). The ratio of energy saved to energy spent on compression is typically 10:1 to 100:1, making it one of the most energy-efficient patterns.
import zlib
import lz4.frame
import zstandard as zstd
import time
from dataclasses import dataclass
from typing import Callable
@dataclass(frozen=True)
class CompressionProfile:
"""Immutable profile of a compression algorithm."""
name: str
compress_fn: Callable[[bytes], bytes]
decompress_fn: Callable[[bytes], bytes]
cpu_intensity: float # 1.0 = baseline, <1 = less CPU, >1 = more CPU
best_for: str
def benchmark_compression(data: bytes, profile: CompressionProfile) -> dict:
"""Measures energy efficiency of an algorithm on real data."""
# Compression
start = time.perf_counter()
compressed = profile.compress_fn(data)
compress_ms = (time.perf_counter() - start) * 1000
# Decompression
start = time.perf_counter()
profile.decompress_fn(compressed)
decompress_ms = (time.perf_counter() - start) * 1000
ratio = len(data) / len(compressed)
# Estimated energy: CPU_time * intensity * 0.001 Wh per ms of CPU
compress_energy_mwh = compress_ms * profile.cpu_intensity * 0.001
storage_savings_pct = (1 - 1/ratio) * 100
return {
"algorithm": profile.name,
"ratio": round(ratio, 2),
"storage_savings_pct": round(storage_savings_pct, 1),
"compress_ms": round(compress_ms, 2),
"decompress_ms": round(decompress_ms, 2),
"compress_energy_mwh": round(compress_energy_mwh, 4),
"best_for": profile.best_for,
}
# Main algorithm profiles
COMPRESSION_PROFILES = [
CompressionProfile(
name="zlib-6",
compress_fn=lambda d: zlib.compress(d, level=6),
decompress_fn=zlib.decompress,
cpu_intensity=1.0,
best_for="Universal compatibility, HTTP responses"
),
CompressionProfile(
name="lz4",
compress_fn=lz4.frame.compress,
decompress_fn=lz4.frame.decompress,
cpu_intensity=0.15, # Very fast, less CPU
best_for="Real-time streams, high access frequency"
),
CompressionProfile(
name="zstd-3",
compress_fn=lambda d: zstd.ZstdCompressor(level=3).compress(d),
decompress_fn=lambda d: zstd.ZstdDecompressor().decompress(d),
cpu_intensity=0.4,
best_for="Optimal ratio/speed balance (recommended)"
),
CompressionProfile(
name="zstd-19",
compress_fn=lambda d: zstd.ZstdCompressor(level=19).compress(d),
decompress_fn=lambda d: zstd.ZstdDecompressor().decompress(d),
cpu_intensity=2.5, # High CPU for maximum compression
best_for="Cold/archive storage, rare data, nightly batch"
),
]
# Practical rule for compression level selection:
# HOT tier -> lz4 (minimal latency, ultra-fast decompression)
# WARM tier -> zstd-3 (optimal balance)
# COLD tier -> zstd-9 (better ratio, tolerable latency)
# ARCHIVE -> zstd-19 or brotli-11 (maximum storage savings)
Carbon-Aware Caching: Serve More by Computing Less
Caching is probably the most powerful pattern for reducing software emissions: every cache hit completely eliminates the energy consumption of the corresponding computation. A system with a 90% cache hit rate performs only 1/10 of the computations compared to one without cache, with proportional energy savings.
But "carbon-aware caching" goes beyond simple performance optimization: it considers the grid carbon intensity to decide what to preload, how long to keep data in cache, and when to perform cache warming operations.
Multi-Level Cache Architecture
Cache Levels and Energy Impact
| Level | Technology | Latency | Energy per Hit | Energy Saved vs DB |
|---|---|---|---|---|
| L1: In-Process | HashMap, LRU in RAM | < 0.1ms | ~0.001 mWh | 99.9% savings |
| L2: Distributed | Redis, Memcached | 0.1-1ms | ~0.01 mWh | 99% savings |
| L3: CDN Edge | CloudFront, Fastly, Cloudflare | 1-20ms | ~0.05 mWh | 95% savings |
| DB Query | PostgreSQL, MySQL | 5-100ms | ~1-10 mWh | — baseline |
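A back-of-the-envelope model built on the per-hit figures from the table shows why hit rate dominates energy: the blended cost per request falls almost linearly toward the cache cost. The 0.01 mWh and 5 mWh figures below are the indicative L2 and mid-range DB values from the table.

```python
def blended_energy_mwh(hit_rate: float, hit_cost_mwh: float = 0.01,
                       miss_cost_mwh: float = 5.0) -> float:
    """Average energy per request for a given cache hit rate.

    Assumed costs (indicative, from the table above): ~0.01 mWh for a
    distributed-cache hit, ~5 mWh for a full database query.
    """
    return hit_rate * hit_cost_mwh + (1 - hit_rate) * miss_cost_mwh

no_cache = blended_energy_mwh(0.0)
with_cache = blended_energy_mwh(0.90)
savings_pct = (1 - with_cache / no_cache) * 100
print(f"{savings_pct:.1f}% energy saved at 90% hit rate")
```

At a 90% hit rate the model yields roughly 90% energy savings per request, which is why raising hit rate is usually the first lever to pull.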
import Redis from "ioredis";
interface CacheEntry<T> {
readonly data: T;
readonly cachedAt: number;
readonly ttlMs: number;
readonly carbonIntensityAtCache: number; // gCO2/kWh when cached
}
interface CarbonAwareCacheConfig {
readonly l1MaxEntries: number;
readonly l1DefaultTtlMs: number;
readonly l2DefaultTtlMs: number;
readonly lowCarbonThreshold: number; // gCO2/kWh
readonly highCarbonTtlMultiplier: number; // Longer TTL when carbon is high
}
const DEFAULT_CONFIG: CarbonAwareCacheConfig = {
l1MaxEntries: 1000,
l1DefaultTtlMs: 60_000, // 1 minute L1
l2DefaultTtlMs: 300_000, // 5 minutes L2
lowCarbonThreshold: 200, // <200 gCO2/kWh = green energy
highCarbonTtlMultiplier: 3, // 3x longer TTL on dirty energy
};
class CarbonAwareMultiLevelCache<T> {
private readonly l1 = new Map<string, CacheEntry<T>>();
private readonly config: CarbonAwareCacheConfig;
private readonly redis: Redis;
private currentCarbonIntensity = 350; // Default, updated periodically
  // Hit/miss counters for reporting (mutated internally as the cache is used)
private readonly metrics = {
l1Hits: 0,
l2Hits: 0,
misses: 0,
carbonSavedGrams: 0,
};
constructor(redisClient: Redis, config: Partial<CarbonAwareCacheConfig> = {}) {
this.redis = redisClient;
this.config = { ...DEFAULT_CONFIG, ...config };
}
async get(key: string): Promise<T | null> {
// L1: local memory (no I/O, minimal energy)
const l1Entry = this.l1.get(key);
if (l1Entry && !this.isExpired(l1Entry)) {
this.metrics.l1Hits++;
this.metrics.carbonSavedGrams += 0.005; // ~5mg CO2 saved vs DB
return l1Entry.data;
}
// L2: distributed Redis
try {
const raw = await this.redis.get(key);
if (raw) {
const entry: CacheEntry<T> = JSON.parse(raw);
if (!this.isExpired(entry)) {
// Promote to L1
this.setL1(key, entry.data, this.config.l1DefaultTtlMs);
this.metrics.l2Hits++;
this.metrics.carbonSavedGrams += 0.003; // ~3mg CO2 vs DB
return entry.data;
}
}
} catch (err) {
console.warn("L2 cache read failed, fallback to source:", err);
}
this.metrics.misses++;
return null;
}
async set(key: string, data: T): Promise<void> {
// Adaptive TTL based on current carbon intensity
// When energy is green (low carbon), shorter TTL is acceptable
// When energy is "dirty" (high carbon), longer TTL to reduce recomputations
const isHighCarbon = this.currentCarbonIntensity > this.config.lowCarbonThreshold;
const ttlMultiplier = isHighCarbon ? this.config.highCarbonTtlMultiplier : 1;
const l1TtlMs = this.config.l1DefaultTtlMs * ttlMultiplier;
const l2TtlMs = this.config.l2DefaultTtlMs * ttlMultiplier;
this.setL1(key, data, l1TtlMs);
    // Write-through to L2 (Redis) with TTL in milliseconds
const entry: CacheEntry<T> = {
data,
cachedAt: Date.now(),
ttlMs: l2TtlMs,
carbonIntensityAtCache: this.currentCarbonIntensity,
};
await this.redis.set(key, JSON.stringify(entry), "PX", l2TtlMs);
}
private setL1(key: string, data: T, ttlMs: number): void {
    // Evict the oldest entry when full (Map iterates in insertion order: a FIFO approximation of LRU)
if (this.l1.size >= this.config.l1MaxEntries) {
const firstKey = this.l1.keys().next().value;
if (firstKey) this.l1.delete(firstKey);
}
this.l1.set(key, {
data,
cachedAt: Date.now(),
ttlMs,
carbonIntensityAtCache: this.currentCarbonIntensity,
});
}
private isExpired(entry: CacheEntry<T>): boolean {
return Date.now() - entry.cachedAt > entry.ttlMs;
}
updateCarbonIntensity(intensityGCO2PerKWh: number): void {
this.currentCarbonIntensity = intensityGCO2PerKWh;
}
  getMetrics(): { l1Hits: number; l2Hits: number; misses: number; carbonSavedGrams: number; hitRate: string } {
    const total = this.metrics.l1Hits + this.metrics.l2Hits + this.metrics.misses;
    const hitRate = total > 0
      ? ((this.metrics.l1Hits + this.metrics.l2Hits) / total * 100).toFixed(1) + "%"
      : "N/A";
    return { ...this.metrics, hitRate };
  }
}
}
Carbon-Aware CDN: Serving from the Green Edge
A CDN like Cloudflare or Fastly has edge nodes in dozens of regions with very different carbon intensities. Routing traffic to edges with greener energy, when latency permits, can reduce serving emissions by 20-40%.
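The routing logic can be sketched as a constrained selection: among the edges that satisfy the latency budget, pick the one with the lowest carbon intensity. The regions, latencies, and intensities below are sample values, not real measurements.

```python
# Sketch: choose the CDN edge with the lowest carbon intensity among those
# that still meet the latency budget. All values are illustrative samples.
EDGES = [
    {"region": "eu-north", "latency_ms": 35, "carbon_g_kwh": 40},   # hydro/wind mix
    {"region": "eu-west", "latency_ms": 12, "carbon_g_kwh": 300},
    {"region": "eu-south", "latency_ms": 18, "carbon_g_kwh": 250},
]

def pick_green_edge(edges: list[dict], max_latency_ms: float) -> dict:
    """Greenest edge within the latency budget; falls back to lowest latency."""
    eligible = [e for e in edges if e["latency_ms"] <= max_latency_ms]
    if not eligible:
        return min(edges, key=lambda e: e["latency_ms"])
    return min(eligible, key=lambda e: e["carbon_g_kwh"])

print(pick_green_edge(EDGES, max_latency_ms=25)["region"])  # eu-south
```

With a relaxed budget (say 50 ms) the very green eu-north edge wins; with a tight one the selector degrades gracefully to the fastest edge. This "latency budget first, carbon second" ordering keeps user experience intact.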
// Cache invalidation: one of the hardest problems in software
// Carbon-aware pattern: invalidate in batch during low carbon intensity windows
interface InvalidationJob {
readonly tags: readonly string[];
readonly priority: "immediate" | "carbon-optimal" | "batch-night";
readonly scheduledAt: Date;
readonly maxDelayMs: number;
}
class CarbonAwareCacheInvalidator {
  private readonly pendingJobs: InvalidationJob[] = [];
  private readonly carbonAwareSdk: any; // Carbon Aware SDK client, injected via constructor

  constructor(carbonAwareSdk: any) {
    this.carbonAwareSdk = carbonAwareSdk;
  }
async scheduleInvalidation(
tags: string[],
priority: InvalidationJob["priority"] = "carbon-optimal",
maxDelayMs: number = 3_600_000 // 1 hour tolerance
): Promise<{ jobId: string; scheduledFor: Date }> {
if (priority === "immediate") {
await this.executePurge(tags);
return { jobId: crypto.randomUUID(), scheduledFor: new Date() };
}
// Find the window with lowest carbon intensity in the next maxDelayMs
const optimalTime = await this.findGreenWindow(maxDelayMs);
const job: InvalidationJob = {
tags: Object.freeze(tags),
priority,
scheduledAt: optimalTime,
maxDelayMs,
};
    // Queue the job; each job object itself is immutable (readonly fields)
    this.pendingJobs.push(job);
return {
jobId: crypto.randomUUID(),
scheduledFor: optimalTime,
};
}
private async findGreenWindow(maxDelayMs: number): Promise<Date> {
const windowEnd = new Date(Date.now() + maxDelayMs);
try {
// Carbon Aware SDK: find the moment with lowest carbon intensity
const forecast = await this.carbonAwareSdk.getForecast({
location: "westeurope",
start: new Date(),
end: windowEnd,
duration: 15, // Job requires ~15 minutes
});
return new Date(forecast.optimalWindow.start);
} catch {
// Fallback: execute immediately if forecast is unavailable
return new Date();
}
}
private async executePurge(tags: string[]): Promise<void> {
// Cloudflare Cache Tag Purge API
await fetch("https://api.cloudflare.com/client/v4/zones/ZONE_ID/purge_cache", {
method: "POST",
headers: {
"Authorization": "Bearer CF_TOKEN",
"Content-Type": "application/json",
},
body: JSON.stringify({ tags }),
});
}
}
Carbon-Aware Batch Processing: Work When Energy Is Green
Batch processing is the ideal candidate for temporal shifting: moving flexible workloads to the moments when the grid's carbon intensity is lowest. A system that schedules its batch jobs for periods of renewable surplus can cut the associated emissions by 30-70% compared to a fixed-schedule execution.
The 02:00 AM Batch Paradox
Many systems schedule batch jobs at 02:00 "because there is less traffic." But in Europe, night is not always the moment of lowest carbon intensity: solar is absent after dark and wind output varies. Depending on the region, the 10:00-14:00 window (solar peak) or 02:00-06:00 (steady wind) can have far lower carbon intensity than other hours. Use real forecast data from the Carbon Aware SDK instead of temporal rules of thumb.
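A toy comparison makes the point. Given an hourly carbon-intensity forecast (the values below are illustrative), shifting from a fixed 02:00 slot to the greenest hour of the day yields a large saving:

```python
# Sketch: compare a fixed 02:00 schedule against the greenest hour in a
# sample 24h carbon-intensity forecast (gCO2/kWh, illustrative values).
forecast = {0: 320, 2: 310, 4: 290, 6: 300, 8: 280, 10: 180,
            12: 150, 14: 170, 16: 240, 18: 330, 20: 350, 22: 340}

def fixed_vs_green(forecast: dict[int, float], fixed_hour: int = 2) -> dict:
    """CO2 savings of running at the greenest hour vs a fixed hour."""
    green_hour = min(forecast, key=forecast.get)
    fixed = forecast[fixed_hour]
    green = forecast[green_hour]
    return {
        "fixed_hour": fixed_hour,
        "green_hour": green_hour,
        "savings_pct": round((fixed - green) / fixed * 100, 1),
    }

print(fixed_vs_green(forecast))  # in this sample, 12:00 (solar peak) beats 02:00
```

In this sample the solar peak at noon cuts emissions by about half versus the 02:00 habit, which is exactly the gap the scheduler below exploits with real forecast data.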
from celery import Celery
from datetime import datetime, timedelta
from typing import Optional, NamedTuple
import httpx
import logging
logger = logging.getLogger(__name__)
app = Celery("green_batch", broker="redis://localhost:6379/0")
class GreenWindow(NamedTuple):
"""Optimal execution window for carbon footprint."""
start: datetime
end: datetime
carbon_intensity_g_kwh: float
savings_pct_vs_now: float
class CarbonAwareBatchScheduler:
"""Schedules batch jobs during lowest carbon intensity windows."""
BASE_CARBON_API = "https://api.electricitymap.org/v3"
def __init__(self, api_token: str, location: str = "IT"):
self._token = api_token
self._location = location
async def get_optimal_window(
self,
job_duration_minutes: int,
max_delay_hours: int = 12,
min_savings_pct: float = 15.0
) -> Optional[GreenWindow]:
"""
Finds the optimal window to run a batch job.
Returns None if no window with sufficient savings is found.
"""
headers = {"auth-token": self._token}
async with httpx.AsyncClient() as client:
# Current carbon intensity
current_resp = await client.get(
f"{self.BASE_CARBON_API}/carbon-intensity/latest",
params={"zone": self._location},
headers=headers
)
current_data = current_resp.json()
current_intensity = current_data["carbonIntensity"]
# Forecast for the next hours
forecast_resp = await client.get(
f"{self.BASE_CARBON_API}/carbon-intensity/forecast",
params={"zone": self._location},
headers=headers
)
forecast = forecast_resp.json()
now = datetime.utcnow()
deadline = now + timedelta(hours=max_delay_hours)
best_window: Optional[GreenWindow] = None
min_intensity = current_intensity
for slot in forecast["forecast"]:
slot_time = datetime.fromisoformat(slot["datetime"].replace("Z", "+00:00"))
slot_time = slot_time.replace(tzinfo=None)
if slot_time < now or slot_time > deadline:
continue
intensity = slot["carbonIntensity"]
if intensity < min_intensity:
min_intensity = intensity
savings_pct = (current_intensity - intensity) / current_intensity * 100
if savings_pct >= min_savings_pct:
best_window = GreenWindow(
start=slot_time,
end=slot_time + timedelta(minutes=job_duration_minutes),
carbon_intensity_g_kwh=intensity,
savings_pct_vs_now=round(savings_pct, 1),
)
return best_window
async def schedule_green(
self,
task_name: str,
job_duration_minutes: int,
task_kwargs: dict,
max_delay_hours: int = 12
) -> dict:
"""Schedules a Celery task during the greenest window."""
window = await self.get_optimal_window(
job_duration_minutes=job_duration_minutes,
max_delay_hours=max_delay_hours
)
if window:
delay_seconds = (window.start - datetime.utcnow()).total_seconds()
task = app.send_task(
task_name,
kwargs=task_kwargs,
countdown=max(0, int(delay_seconds))
)
logger.info(
"Scheduled '%s' for %s (%.1f%% CO2 savings, %.0fg/kWh)",
task_name, window.start.isoformat(),
window.savings_pct_vs_now, window.carbon_intensity_g_kwh
)
return {
"task_id": task.id,
"scheduled_for": window.start.isoformat(),
"carbon_intensity": window.carbon_intensity_g_kwh,
"co2_savings_pct": window.savings_pct_vs_now,
}
else:
# No green window available: execute now
task = app.send_task(task_name, kwargs=task_kwargs)
logger.warning("No green window found for '%s', executing immediately", task_name)
return {"task_id": task.id, "scheduled_for": "now", "co2_savings_pct": 0}
# Celery task definitions with carbon metrics
@app.task(name="batch.nightly_report", bind=True)
def nightly_report_batch(self, report_date: str) -> dict:
"""
Generates nightly reports. Not time-critical: ideal for temporal shifting.
Typical carbon savings: 20-60% by shifting from 02:00 to green window.
"""
logger.info("Generating report for %s (carbon-optimal execution)", report_date)
# ... report logic
return {"status": "completed", "report_date": report_date}
@app.task(name="batch.data_sync", bind=True)
def data_sync_batch(self, source: str) -> dict:
"""
Data synchronization between systems. Tolerant of delays up to a few hours.
"""
# ... sync logic
return {"status": "synced", "source": source}
# Scheduler that uses carbon-awareness instead of fixed crontab
async def schedule_nightly_jobs():
scheduler = CarbonAwareBatchScheduler(
api_token="your_electricity_maps_token",
location="IT" # Italy
)
# Report: accepts up to 12h delay
await scheduler.schedule_green(
task_name="batch.nightly_report",
job_duration_minutes=45,
task_kwargs={"report_date": datetime.now().strftime("%Y-%m-%d")},
max_delay_hours=12
)
# Data sync: accepts up to 6h delay
await scheduler.schedule_green(
task_name="batch.data_sync",
job_duration_minutes=20,
task_kwargs={"source": "salesforce"},
max_delay_hours=6
)
Right-Sizing and Auto-Scaling Down: Stopping Invisible Waste
A 2025 Gartner study estimates that 35-40% of enterprise cloud resources are over-provisioned: servers running at 10-15% CPU, databases with 90% of memory unused, Lambda functions with 3GB of RAM allocated when 256MB would suffice. This "idle computing" is pure energy waste.
Systematic right-sizing — reducing resources to the minimum needed to meet performance requirements — is often the single intervention with the best energy ROI. It does not require code rewrites: it is a matter of configuration and monitoring.
Right-Sizing Strategies for Carbon Reduction
| Strategy | Typical Savings | Complexity | Risk |
|---|---|---|---|
| Reduce over-provisioned instance sizes | 20-40% | Low | Low (easy rollback) |
| Aggressive auto-scaling down | 30-60% | Medium | Medium (cold start latency) |
| Serverless for intermittent workloads | 50-90% | High | Medium (cold start, vendor lock-in) |
| Spot/Preemptible instances for batch | 60-80% | High | High (interruptions) |
| Schedule shutdown outside hours (dev/staging) | 40-70% | Low | None (non-prod environments) |
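The last row of the table, scheduled shutdown of non-production environments, is worth quantifying because it is essentially free. A quick sketch of the savings share, assuming the environment draws roughly constant power while running:

```python
# Sketch: share of weekly energy avoided by powering off a dev/staging
# environment outside working hours. Assumes roughly constant power draw.
WEEK_HOURS = 168  # 7 days * 24 hours

def shutdown_savings_pct(on_hours_per_week: float) -> float:
    """Percent of weekly energy avoided by being off the rest of the time."""
    return round((1 - on_hours_per_week / WEEK_HOURS) * 100, 1)

# Running only Mon-Fri 08:00-20:00 (5 days * 12 h = 60 h/week):
print(shutdown_savings_pct(60))  # 64.3
```

A 64% reduction for non-production environments sits comfortably in the table's 40-70% range, with zero risk to users.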
import boto3
from datetime import datetime, timedelta
from dataclasses import dataclass
@dataclass(frozen=True)
class ScalingPolicy:
"""Immutable scaling policy with carbon awareness."""
min_capacity: int
max_capacity: int
target_cpu_pct: float
scale_in_cooldown_sec: int
green_hour_min_capacity: int # Minimum capacity during green hours
low_traffic_scale_in_factor: float # Aggressive factor during low traffic hours
def create_carbon_aware_scaling_policies(
asg_name: str,
policy: ScalingPolicy,
region: str = "eu-west-1"
) -> dict:
"""
Configures Auto Scaling Group with carbon-aware policies.
More aggressive scale-in during low traffic hours (at night)
where energy might be greener AND traffic is low.
"""
autoscaling = boto3.client("autoscaling", region_name=region)
# Main policy: target tracking on CPU
main_policy = autoscaling.put_scaling_policy(
AutoScalingGroupName=asg_name,
PolicyName=f"{asg_name}-carbon-aware-cpu",
PolicyType="TargetTrackingScaling",
TargetTrackingConfiguration={
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ASGAverageCPUUtilization"
},
"TargetValue": policy.target_cpu_pct,
"ScaleInCooldown": policy.scale_in_cooldown_sec,
"ScaleOutCooldown": 120,
"DisableScaleIn": False,
}
)
# Scheduled action: aggressive nightly scale down
# Combines low traffic + potential green energy
autoscaling.put_scheduled_update_group_action(
AutoScalingGroupName=asg_name,
ScheduledActionName=f"{asg_name}-night-scaledown",
Recurrence="0 22 * * *", # Every day at 22:00 UTC
MinSize=policy.green_hour_min_capacity,
MaxSize=policy.max_capacity,
DesiredCapacity=policy.green_hour_min_capacity,
)
# Scheduled action: morning scale up before traffic
autoscaling.put_scheduled_update_group_action(
AutoScalingGroupName=asg_name,
ScheduledActionName=f"{asg_name}-morning-scaleup",
Recurrence="0 7 * * MON-FRI", # Mon-Fri at 07:00 UTC
MinSize=policy.min_capacity,
MaxSize=policy.max_capacity,
DesiredCapacity=policy.min_capacity + 2, # Pre-warm before traffic
)
return {
"asg_name": asg_name,
"main_policy_arn": main_policy["PolicyARN"],
"estimated_monthly_co2_reduction_pct": 35, # Typical for this pattern
}
# Lambda Right-sizing: find optimal memory
# Principle: excess RAM = unnecessary carbon cost
def optimize_lambda_memory(function_name: str, region: str = "eu-west-1") -> dict:
"""
Analyzes a Lambda's memory usage and suggests the right size.
AWS Lambda Power Tuning (open source tool) automates this process.
"""
lambda_client = boto3.client("lambda", region_name=region)
cloudwatch = boto3.client("cloudwatch", region_name=region)
# Retrieve current configuration
config = lambda_client.get_function_configuration(FunctionName=function_name)
current_memory_mb = config["MemorySize"]
# Retrieve CloudWatch metrics: max memory used over last 7 days
# (in production use AWS Lambda Power Tuning for full analysis)
metrics = cloudwatch.get_metric_statistics(
Namespace="AWS/Lambda",
MetricName="MaxMemoryUsed",
Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        StartTime=datetime.utcnow() - timedelta(days=7),
EndTime=datetime.utcnow(),
Period=86400,
Statistics=["Maximum"],
)
if not metrics["Datapoints"]:
return {"status": "insufficient_data"}
max_used_mb = max(dp["Maximum"] for dp in metrics["Datapoints"])
    # Safety buffer: 30% above observed max, rounded up to the next 64MB,
    # clamped to Lambda's 128MB-10240MB allowed range
    recommended_mb = min(max(int(max_used_mb * 1.3 / 64 + 1) * 64, 128), 10240)
co2_reduction_pct = max(0, (current_memory_mb - recommended_mb) / current_memory_mb * 100)
return {
"current_memory_mb": current_memory_mb,
"max_observed_mb": int(max_used_mb),
"recommended_mb": recommended_mb,
"potential_co2_reduction_pct": round(co2_reduction_pct, 1),
        # Assumes ~1,000 execution-hours/month; $0.0000166667 is the per-GB-second price
        "annual_cost_savings_usd": round(
            (current_memory_mb - recommended_mb) / 1024 * 0.0000166667 * 3_600_000 * 12, 2
        ),
}
Sustainable Database Patterns: Fewer Queries, Less Carbon
The database is often the component with the highest energy consumption in an enterprise system. Every query involves disk I/O, buffer allocation in RAM, CPU cycles for parsing, planning, and execution. Optimizing queries is not just a performance matter: it is a direct reduction in emissions.
Pattern 1: Materialized Views to Reduce Recomputations
Materialized views are the most effective pattern for eliminating costly aggregate queries that are re-executed continuously. Instead of recalculating SUM, COUNT, complex JOINs on every request, the result is precomputed and updated periodically or via triggers.
-- PROBLEM: Heavy aggregate query executed 1000x per day
-- Each execution: 2-5 seconds, 100-500ms CPU, I/O intensive
-- Estimate: ~500mWh/day for this query alone
-- Heavy query BEFORE (executed on every request)
SELECT
c.category_id,
c.name AS category_name,
COUNT(DISTINCT o.order_id) AS total_orders,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
AVG(oi.quantity * oi.unit_price) AS avg_order_value,
COUNT(DISTINCT o.customer_id) AS unique_customers
FROM categories c
JOIN products p ON p.category_id = c.category_id
JOIN order_items oi ON oi.product_id = p.product_id
JOIN orders o ON o.order_id = oi.order_id
WHERE o.created_at >= NOW() - INTERVAL '30 days'
GROUP BY c.category_id, c.name;
-- SOLUTION: Materialized view with refresh during green window
CREATE MATERIALIZED VIEW mv_category_metrics_30d AS
SELECT
c.category_id,
c.name AS category_name,
COUNT(DISTINCT o.order_id) AS total_orders,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
AVG(oi.quantity * oi.unit_price) AS avg_order_value,
COUNT(DISTINCT o.customer_id) AS unique_customers,
NOW() AS last_refreshed
FROM categories c
JOIN products p ON p.category_id = c.category_id
JOIN order_items oi ON oi.product_id = p.product_id
JOIN orders o ON o.order_id = oi.order_id
WHERE o.created_at >= NOW() - INTERVAL '30 days'
GROUP BY c.category_id, c.name
WITH DATA;
-- Index for O(1) query
CREATE UNIQUE INDEX idx_mv_category_metrics ON mv_category_metrics_30d (category_id);
-- Scheduled refresh: CONCURRENT allows reads during refresh
-- Schedule during green windows (e.g. with pg_cron + carbon intensity check)
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_category_metrics_30d;
-- With pg_cron (schedule refresh every hour during daytime solar hours)
SELECT cron.schedule(
'refresh-category-metrics',
'0 * * * *', -- Every hour; carbon-aware logic in the application
'REFRESH MATERIALIZED VIEW CONCURRENTLY mv_category_metrics_30d'
);
-- Query after: O(1) on materialized view
-- Estimated savings: 99% of original computation
SELECT * FROM mv_category_metrics_30d ORDER BY total_revenue DESC;
-- Pattern 2: Read Replica to separate workloads
-- Analytical reads (intensive) -> read replica
-- Writes -> primary (minimal load)
-- Pattern 3: Partial Indexes to reduce I/O
-- INSTEAD OF: index on all 50M orders
CREATE INDEX idx_orders_status_all ON orders(status, created_at);
-- BETTER: index only on 500K active orders (1% of total)
-- 99% less I/O, 99% less space, much faster maintenance
CREATE INDEX idx_orders_status_active ON orders(status, created_at)
WHERE status IN ('pending', 'processing', 'shipped');
-- Pattern 4: Connection Pooling to reduce overhead
-- PgBouncer: max_client_conn=1000, pool_size=20
-- Reduces: TCP handshakes, SSL negotiation, process fork overhead
-- Estimated savings: 30-50% PostgreSQL CPU on high-concurrency workloads
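To make the pooling pattern concrete, here is a toy pool in Python, a stand-in for what PgBouncer does: pay the connection setup cost (TCP, SSL, auth on a real database) once, then reuse the connection. It uses sqlite3 purely for illustration; a production pool adds health checks, timeouts, and transaction-level reuse.

```python
import sqlite3
import queue

class TinyPool:
    """Toy connection pool: pay connection setup cost once, reuse it N times."""

    def __init__(self, size: int = 5):
        self._pool: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            # Setup (TCP + SSL + auth for a real DB) happens only here, once
            self._pool.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self) -> sqlite3.Connection:
        return self._pool.get()

    def release(self, conn: sqlite3.Connection) -> None:
        self._pool.put(conn)

pool = TinyPool(size=2)
conn = pool.acquire()
result = conn.execute("SELECT 1 + 1").fetchone()[0]
pool.release(conn)  # Connection goes back to the pool, ready for reuse
```

Every request that reuses a pooled connection skips the handshake energy entirely, which is where the CPU savings quoted above come from.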
Query Optimization: N+1 Problem and Eager Loading
The N+1 problem is one of the most common and costly anti-patterns in terms of emissions: instead of a single query that retrieves all the necessary data, N+1 separate queries are executed, one for the parent list plus one per row for its relation. With N=1000 orders, 1,001 queries are issued instead of one or two, multiplying the per-operation overhead (and carbon footprint) roughly a thousandfold.
from sqlalchemy import select
from sqlalchemy.orm import selectinload, joinedload, Session
from typing import Sequence
# ANTI-PATTERN: N+1 queries - AVOID
def get_orders_naive(session: Session, limit: int = 100) -> list:
"""
PROBLEMATIC: for 100 orders generates 101 queries.
100 orders -> 1 query
100 customers -> 100 separate queries
Estimate: ~2mWh per 100 orders. On 10,000 requests/day = 20Wh/day.
"""
orders = session.execute(select(Order).limit(limit)).scalars().all()
# Every access to order.customer triggers a new query! (lazy loading)
return [{"id": o.id, "customer": o.customer.email} for o in orders]
# GREEN PATTERN: Eager loading with selectin
def get_orders_green(session: Session, limit: int = 100) -> list:
"""
OPTIMIZED: 2 total queries instead of N+1.
Query 1: all orders
Query 2: all customers in a single IN query
Estimate: 0.02mWh per 100 orders. Savings: 99%.
"""
stmt = (
select(Order)
.options(selectinload(Order.customer)) # 2 total queries
.limit(limit)
)
orders = session.execute(stmt).scalars().all()
return [{"id": o.id, "customer": o.customer.email} for o in orders]
# Even better: joinedload for 1 single query
def get_orders_single_query(session: Session, limit: int = 100) -> list:
"""
ULTRA-OPTIMIZED: 1 single query with JOIN.
Ideal when the number of relationships is low.
Estimate: 0.01mWh per 100 orders. Savings: 99.5%.
"""
stmt = (
select(Order)
.options(joinedload(Order.customer)) # JOIN: 1 single query
.limit(limit)
)
orders = session.execute(stmt).unique().scalars().all()
return [{"id": o.id, "customer": o.customer.email} for o in orders]
# Pattern: Projection - select only needed fields
def get_order_summary(session: Session, order_ids: list[int]) -> list[dict]:
"""
Select ONLY needed fields, not SELECT *.
On tables with 50+ columns, SELECT * transfers 10-20x more data.
"""
stmt = (
select(
Order.id,
Order.total_amount,
Order.status,
# Only 3 fields instead of 50+
)
.where(Order.id.in_(order_ids))
)
rows = session.execute(stmt).all()
return [{"id": r.id, "total": float(r.total_amount), "status": r.status} for r in rows]
Network Efficiency: Every Transferred Byte Has a Carbon Cost
Transferring data has a real energy cost: 0.06-0.1 kWh per GB for internet traffic (backbone + last mile). An application that transfers 10TB of uncompressed data per day consumes approximately 600-1000 kWh on transmission alone, equivalent to 200-400 kg CO₂ per day (at the European average intensity of 350 gCO₂/kWh).
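These figures can be reproduced in a few lines, using the constants quoted above:

```python
# Energy and carbon cost of data transfer (constants from the text above)
KWH_PER_GB_LOW, KWH_PER_GB_HIGH = 0.06, 0.1    # backbone + last mile
GRID_INTENSITY_G_PER_KWH = 350                 # assumed European average

daily_transfer_gb = 10_000                     # 10 TB/day

energy_low = daily_transfer_gb * KWH_PER_GB_LOW    # 600 kWh/day
energy_high = daily_transfer_gb * KWH_PER_GB_HIGH  # 1000 kWh/day

carbon_low_kg = energy_low * GRID_INTENSITY_G_PER_KWH / 1000    # 210 kg CO2/day
carbon_high_kg = energy_high * GRID_INTENSITY_G_PER_KWH / 1000  # 350 kg CO2/day

print(f"{energy_low:.0f}-{energy_high:.0f} kWh/day, "
      f"{carbon_low_kg:.0f}-{carbon_high_kg:.0f} kg CO2/day")
```

Which lands in the ~200-400 kg CO₂/day range cited above once rounded.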
HTTP/3 and QUIC: Protocol-Level Efficiency
HTTP/3 with QUIC eliminates the head-of-line blocking problem and reduces the round-trips needed to establish connections. For applications with many concurrent requests, HTTP/3 can reduce latency by 15-30% and, consequently, the active CPU time of servers.
import express, { Request, Response, NextFunction } from "express";
import compression from "compression";
const app = express();
// Intelligent compression: choose the best algorithm
// Brotli: better ratio (15-25% superior to gzip), supported by all modern browsers
// Gzip: fallback for older clients
app.use(compression({
// Compress only if savings are significant
threshold: 1024, // Min 1KB to compress
level: 6, // CPU/ratio balance
filter: (req, res) => {
// Don't compress already-compressed images (jpeg, webp, png)
const contentType = res.getHeader("Content-Type") as string || "";
if (contentType.includes("image/")) return false;
return compression.filter(req, res);
},
}));
// Middleware: minimal response with field selection
// Instead of returning the full object, respond only with requested fields
function fieldSelectionMiddleware(req: Request, res: Response, next: NextFunction): void {
const originalJson = res.json.bind(res);
res.json = (body: any) => {
const fields = req.query["fields"];
if (!fields || typeof fields !== "string" || !body) {
return originalJson(body);
}
const requestedFields = fields.split(",").map(f => f.trim());
// Filter only requested fields (does not mutate original body)
const filtered = Array.isArray(body)
? body.map(item => pickFields(item, requestedFields))
: pickFields(body, requestedFields);
return originalJson(filtered);
};
next();
}
function pickFields(obj: Record<string, unknown>, fields: string[]): Record<string, unknown> {
return fields.reduce<Record<string, unknown>>((acc, field) => {
if (field in obj) {
return { ...acc, [field]: obj[field] };
}
return acc;
}, {});
}
app.use(fieldSelectionMiddleware);
// Example: GET /api/users?fields=id,name,email
// Instead of returning 50 fields, returns only 3
// Typical payload reduction: 60-90%
// Cache-Control headers: reduces repeated requests
function addCacheHeaders(res: Response, maxAgeSeconds: number): void {
res.setHeader("Cache-Control", `public, max-age=${maxAgeSeconds}, stale-while-revalidate=60`);
res.setHeader("Vary", "Accept-Encoding, Accept");
}
app.get("/api/products/:id", async (req: Request, res: Response) => {
const product = await getProduct(req.params["id"]);
// Static data: aggressive caching
if (product?.isStatic) {
addCacheHeaders(res, 86400); // 1 day
} else {
addCacheHeaders(res, 300); // 5 minutes
}
res.json(product);
});
// ETag for efficient cache validation
// Instead of re-downloading, the client checks if data has changed
app.get("/api/catalog", async (req: Request, res: Response) => {
const catalog = await getCatalog();
const { createHash } = await import("node:crypto");
const etag = createHash("md5")
.update(JSON.stringify(catalog))
.digest("hex");
// If ETag unchanged: respond 304 (0 bytes of payload!)
if (req.headers["if-none-match"] === etag) {
res.status(304).end();
return;
}
res.setHeader("ETag", etag);
res.json(catalog);
});
async function getProduct(id: string): Promise<any> {
return { id, isStatic: true, name: "Product" };
}
async function getCatalog(): Promise<any[]> {
return [];
}
Sustainable Frontend Patterns: The Client Is Part of the Problem
The frontend is often the most overlooked component in software emissions analysis, but shipping and executing heavy JavaScript on millions of devices has a large aggregate impact. A 500 KB script served to 1 million devices per day means roughly 500 GB of daily transfer, on the order of 11-18 MWh per year at the 0.06-0.1 kWh/GB rate seen earlier, before even counting the CPU cost of parsing and executing it on every device. That energy is spent on user hardware, not on our servers.
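The transfer-side share of this footprint follows directly from the per-GB network cost discussed in the previous section:

```python
# Annual network energy for shipping one JS bundle to 1M devices/day
BUNDLE_MB = 0.5                      # 500 KB bundle
DEVICES_PER_DAY = 1_000_000
KWH_PER_GB_LOW, KWH_PER_GB_HIGH = 0.06, 0.1   # from the network section

daily_gb = BUNDLE_MB / 1000 * DEVICES_PER_DAY          # ~500 GB/day
annual_kwh_low = daily_gb * 365 * KWH_PER_GB_LOW       # ~10,950 kWh/year
annual_kwh_high = daily_gb * 365 * KWH_PER_GB_HIGH     # ~18,250 kWh/year

print(f"~{annual_kwh_low / 1000:.0f}-{annual_kwh_high / 1000:.0f} MWh/year "
      "in transfer alone, before parse/execute cost on each device")
```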
Image Optimization: The Quickest Win
Images represent on average 50-70% of the weight of a web page. Switching from JPEG to WebP/AVIF reduces sizes by 25-50%, with an equivalent reduction in data transfer and decoding time on the user's device.
// Angular 17+ ships NgOptimizedImage in @angular/common
import { NgOptimizedImage } from '@angular/common';
import { Component, ChangeDetectionStrategy } from '@angular/core';
@Component({
selector: 'app-product-card',
standalone: true,
imports: [NgOptimizedImage],
changeDetection: ChangeDetectionStrategy.OnPush, // Reduces change detection cycles
template: `
<!-- NgOptimizedImage: automatic srcset + lazy loading for non-priority images -->
<!-- Don't also set loading/decoding by hand: the directive manages them, and
an explicit loading attribute conflicts with priority images -->
<img
ngSrc="products/laptop-pro.jpg"
[width]="400"
[height]="300"
[priority]="isAboveFold"
alt="Laptop Pro - Front view"
/>
`
})
export class ProductCardComponent {
isAboveFold = false;
}
// build pipeline: automatic conversion to WebP/AVIF with sharp
// scripts/optimize-images.ts
import sharp from 'sharp';
import { readdir, stat } from 'fs/promises';
import path from 'path';
async function optimizeImages(inputDir: string, outputDir: string): Promise<void> {
const files = await readdir(inputDir);
const imageFiles = files.filter(f => /\.(jpg|jpeg|png)$/i.test(f));
const results = await Promise.all(imageFiles.map(async (file) => {
const inputPath = path.join(inputDir, file);
const baseName = path.basename(file, path.extname(file));
// Generate WebP (universal support 2025)
const webpPath = path.join(outputDir, `${baseName}.webp`);
await sharp(inputPath)
.webp({ quality: 80, effort: 6 })
.toFile(webpPath);
// Generate AVIF (superior compression, modern browsers)
const avifPath = path.join(outputDir, `${baseName}.avif`);
await sharp(inputPath)
.avif({ quality: 65, effort: 7 })
.toFile(avifPath);
const [origSize, webpSize, avifSize] = await Promise.all([
stat(inputPath).then(s => s.size),
stat(webpPath).then(s => s.size),
stat(avifPath).then(s => s.size),
]);
return {
file,
origKB: (origSize / 1024).toFixed(1),
webpKB: (webpSize / 1024).toFixed(1),
avifKB: (avifSize / 1024).toFixed(1),
webpSaving: ((1 - webpSize/origSize) * 100).toFixed(1) + '%',
avifSaving: ((1 - avifSize/origSize) * 100).toFixed(1) + '%',
};
}));
console.table(results);
}
Dark Mode and Energy Savings on OLED
Dark Mode: Real Impact on OLED Displays
On OLED displays (present in all premium smartphones and many laptops since 2024), black pixels consume literally 0 energy (the OLED pixel does not emit light when turned off). An interface with a pure black background (#000000) on OLED can consume up to 60-80% less display energy compared to a white background (#FFFFFF). With billions of OLED devices, offering a well-implemented dark mode is a significant reduction in user-side emissions.
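A rough, assumption-laden estimate of what this means per device (the wattage and usage figures below are illustrative placeholders, not measurements):

```python
# Illustrative OLED display energy estimate (assumed values, not measured)
WHITE_UI_DISPLAY_W = 1.2     # assumed smartphone OLED power on a mostly-white UI
DARK_SAVING_FRACTION = 0.6   # lower end of the 60-80% range cited above
HOURS_PER_DAY = 3.0          # assumed daily screen time in the app

daily_saving_wh = WHITE_UI_DISPLAY_W * DARK_SAVING_FRACTION * HOURS_PER_DAY
annual_saving_kwh = daily_saving_wh * 365 / 1000

print(f"~{daily_saving_wh:.1f} Wh/day, ~{annual_saving_kwh:.2f} kWh/year per device")
# Small per device, but multiplied across millions of users it adds up
```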
/* Dark mode: use true black (#000000) to maximize OLED savings */
/* Even #0d0d0d is sufficient and more aesthetically pleasing */
:root {
--bg-primary: #ffffff;
--bg-secondary: #f5f5f5;
--text-primary: #1a1a1a;
--surface: #ffffff;
}
/* Auto dark mode: activates based on system preference */
@media (prefers-color-scheme: dark) {
:root {
--bg-primary: #000000; /* True black for OLED */
--bg-secondary: #0d0d0d; /* Near-black, more visual comfort */
--text-primary: #e8e8e8;
--surface: #111111;
}
}
/* Manual class for toggle */
.dark-theme {
--bg-primary: #000000;
--bg-secondary: #0d0d0d;
--text-primary: #e8e8e8;
--surface: #111111;
}
/* Reduce motion: reduces animations = less GPU = less energy */
@media (prefers-reduced-motion: reduce) {
*,
*::before,
*::after {
animation-duration: 0.01ms !important;
animation-iteration-count: 1 !important;
transition-duration: 0.01ms !important;
scroll-behavior: auto !important;
}
}
/* System font: avoid Google Fonts when possible */
/* System fonts = 0 downloads, instant rendering */
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", system-ui, sans-serif;
}
/* If custom font is needed: use font-display: swap to avoid blocking */
@font-face {
font-family: "CustomFont";
src: url("/fonts/custom.woff2") format("woff2");
font-display: swap; /* Don't block rendering while downloading */
font-weight: 400 700; /* Variable font: 1 file for all weights */
}
Sustainable API Design: Fewer Round-Trips, Less Carbon
API design has a direct impact on emissions: poorly designed APIs require more round-trips, transfer more data than necessary, and force clients to make more requests to obtain the information they need.
Efficient Pagination: Cursor vs Offset
Offset pagination is energy-expensive: to retrieve page 100 with 20 items, the database must scan and discard 2000 rows (OFFSET 2000). With cursor-based pagination, the database jumps directly to the correct point using an index, consuming energy proportional only to the number of returned rows, not the skipped ones.
import { Pool } from "pg";
interface PaginationResult<T> {
readonly data: readonly T[];
readonly nextCursor: string | null;
readonly totalCount?: number;
}
// ANTI-PATTERN: Offset pagination
// Query cost: O(offset + limit) - grows with pagination depth
async function getProductsOffset(
db: Pool,
page: number,
pageSize: number = 20
): Promise<PaginationResult<any>> {
const offset = page * pageSize;
// EXPENSIVE: the DB scans 'offset' rows only to discard them
const result = await db.query(
"SELECT * FROM products ORDER BY id LIMIT $1 OFFSET $2",
[pageSize, offset]
);
// Example page 1000 with 20 items: DB scans 20,020 rows
// Estimate: 50x slower and 50x more expensive than cursor pagination
return { data: result.rows, nextCursor: null };
}
// GREEN PATTERN: Cursor-based pagination
// Query cost: O(limit) - constant, uses index
async function getProductsCursor(
db: Pool,
cursor: string | null,
pageSize: number = 20
): Promise<PaginationResult<any>> {
let query: string;
let params: any[];
if (cursor) {
// Decode cursor: contains the ID of the last seen element
const lastId = parseInt(Buffer.from(cursor, "base64").toString("utf-8"));
query = `
SELECT id, name, price, category_id
FROM products
WHERE id > $1
ORDER BY id ASC
LIMIT $2
`;
params = [lastId, pageSize + 1]; // +1 to know if there is a next page
} else {
query = `
SELECT id, name, price, category_id
FROM products
ORDER BY id ASC
LIMIT $1
`;
params = [pageSize + 1];
}
const result = await db.query(query, params);
const rows = result.rows;
const hasMore = rows.length > pageSize;
const data = hasMore ? rows.slice(0, pageSize) : rows;
const lastItem = data[data.length - 1];
// Encode next cursor
const nextCursor = hasMore && lastItem
? Buffer.from(String(lastItem.id)).toString("base64")
: null;
return {
data,
nextCursor,
// Don't calculate totalCount (expensive): use only if needed with estimate
};
}
// GraphQL vs REST: when each is more efficient
// REST: efficient for simple, well-defined resources, native HTTP cache
// GraphQL: efficient for complex UIs with many components requiring different data
// Carbon-aware GraphQL pattern: DataLoader for batch N+1
import DataLoader from "dataloader";
// Without DataLoader: 1000 resolvers -> 1000 separate SQL queries
// With DataLoader: 1000 resolvers -> 1 SQL query with WHERE IN
const productLoader = new DataLoader<number, any>(
async (productIds: readonly number[]) => {
const result = await db.query(
"SELECT * FROM products WHERE id = ANY($1)",
[[...productIds]]
);
// Map to maintain correct order
const productMap = new Map(result.rows.map(p => [p.id, p]));
return productIds.map(id => productMap.get(id) || null);
},
{ batch: true, maxBatchSize: 100 } // Max 100 per query for safety
);
declare const db: Pool;
Carbon Monitoring: Measure to Improve
"What you cannot measure, you cannot improve." Service-level carbon monitoring allows you to identify the most emissive components, track progress over time, and calculate the SCI score (Software Carbon Intensity) as a corporate KPI.
from prometheus_client import Counter, Histogram, Gauge, start_http_server
from functools import wraps
import time
import httpx
import logging
from typing import Callable, Any
logger = logging.getLogger(__name__)
# Custom Prometheus metrics for carbon monitoring
CARBON_INTENSITY_GAUGE = Gauge(
"carbon_intensity_g_co2_per_kwh",
"Current grid carbon intensity in gCO2/kWh",
["region"]
)
ENERGY_CONSUMED_COUNTER = Counter(
"energy_consumed_wh_total",
"Total energy consumed in Wh",
["service", "endpoint", "method"]
)
CARBON_EMITTED_COUNTER = Counter(
"carbon_emitted_gco2_total",
"Total carbon emitted in gCO2",
["service", "endpoint", "method"]
)
REQUEST_DURATION_HISTOGRAM = Histogram(
"http_request_duration_seconds",
"HTTP request duration",
["service", "endpoint", "method"],
buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5]
)
SCI_SCORE_GAUGE = Gauge(
"sci_score_mgco2_per_request",
"Software Carbon Intensity score in mgCO2 per functional unit",
["service"]
)
# Estimated energy consumption per resource type
# Based on empirical benchmarks for typical servers (TDP ~200W, utilization 20%)
ENERGY_ESTIMATES_WH = {
"cpu_second": 0.011, # ~11 mWh per second of CPU @ 200W TDP, 20% util
"memory_gb_second": 0.000375, # ~0.375 mWh per GB*s RAM
"ssd_read_gb": 0.0002, # ~0.2 mWh per GB read from SSD
"ssd_write_gb": 0.0004, # ~0.4 mWh per GB written to SSD
"network_gb": 0.1,  # ~100 mWh per GB (server-side NIC only; end-to-end internet transfer is far higher, ~0.06-0.1 kWh/GB)
}
class CarbonMetricsCollector:
"""Collects carbon metrics for Prometheus + Grafana."""
def __init__(self, service_name: str, region: str = "IT"):
self._service = service_name
self._region = region
self._current_intensity = 350.0 # Default gCO2/kWh
async def update_carbon_intensity(self) -> None:
"""Updates carbon intensity from the grid in real time."""
try:
async with httpx.AsyncClient(timeout=5) as client:
resp = await client.get(
"https://api.electricitymap.org/v3/carbon-intensity/latest",
params={"zone": self._region},
headers={"auth-token": "YOUR_TOKEN"}
)
data = resp.json()
self._current_intensity = data["carbonIntensity"]
CARBON_INTENSITY_GAUGE.labels(region=self._region).set(self._current_intensity)
except Exception as e:
logger.warning("Carbon intensity fetch failed: %s", e)
def track_request(self, endpoint: str, method: str):
"""Decorator to track carbon footprint of every endpoint."""
def decorator(func: Callable) -> Callable:
@wraps(func)
async def wrapper(*args: Any, **kwargs: Any) -> Any:
start = time.perf_counter()
result = await func(*args, **kwargs)
duration_s = time.perf_counter() - start
# Estimate energy: based on duration (proxy for CPU+I/O)
estimated_energy_wh = duration_s * ENERGY_ESTIMATES_WH["cpu_second"]
# Carbon emitted
carbon_g = estimated_energy_wh * self._current_intensity / 1000
# Update Prometheus metrics
labels = {"service": self._service, "endpoint": endpoint, "method": method}
ENERGY_CONSUMED_COUNTER.labels(**labels).inc(estimated_energy_wh * 1000) # in mWh
CARBON_EMITTED_COUNTER.labels(**labels).inc(carbon_g * 1000) # in mgCO2
REQUEST_DURATION_HISTOGRAM.labels(**labels).observe(duration_s)
return result
return wrapper
return decorator
def update_sci_score(self, total_carbon_mgco2: float, functional_units: int) -> None:
"""
Updates the SCI score: mgCO2 per functional unit (e.g. per request).
SCI formula: (E * I + M) / R
E = energy, I = intensity, M = embodied carbon, R = functional units
"""
if functional_units > 0:
sci = total_carbon_mgco2 / functional_units
SCI_SCORE_GAUGE.labels(service=self._service).set(sci)
# Grafana dashboard query examples (PromQL):
#
# Carbon footprint per endpoint (top 10 most emissive):
# topk(10, sum by (endpoint) (rate(carbon_emitted_gco2_total[5m])))
#
# SCI score over time:
# sci_score_mgco2_per_request{service="api-gateway"}
#
# Carbon saved from cache hits:
# rate(cache_hits_total[5m]) * 0.005 # 5mg CO2 saved per hit
#
# Current carbon intensity vs green threshold:
# carbon_intensity_g_co2_per_kwh > 300 # Alert if exceeds threshold
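A worked example of the SCI formula used in `update_sci_score` above, with illustrative input values:

```python
# SCI = (E * I + M) / R, worked example with illustrative inputs
E = 12.0        # energy consumed in the period, kWh
I = 350.0       # grid carbon intensity, gCO2/kWh
M = 800.0       # amortized embodied carbon for the period, gCO2
R = 1_000_000   # functional units: requests served in the period

sci_g_per_request = (E * I + M) / R
sci_mg_per_request = sci_g_per_request * 1000

print(f"SCI = {sci_mg_per_request:.1f} mgCO2 per request")
```

Here (12 kWh x 350 g/kWh + 800 g) / 1M requests gives 5 mgCO₂ per request, the kind of value the `sci_score_mgco2_per_request` gauge would report.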
Case Study: E-Commerce with 1M Daily Visits — -52% Carbon Footprint
Let us see how all these patterns integrate in a concrete case. An e-commerce site with 1 million daily visits, 50,000 orders/day, and an unoptimized legacy architecture. The team implemented the patterns described in this article over a 6-month project. The results speak for themselves.
Architecture Baseline Before Optimization
| Component | Configuration Before | Problem | Carbon/month (est.) |
|---|---|---|---|
| Web servers | 20x c5.2xlarge always-on | Average CPU 8%, 92% idle | 450 kg CO₂ |
| Database | db.r5.4xlarge, SELECT * everywhere | N+1 queries, no indexes | 280 kg CO₂ |
| S3 Storage | Everything in Standard tier | 5 years of logs never accessed in hot storage | 90 kg CO₂ |
| Batch jobs | Fixed at 02:00 UTC | No carbon awareness | 120 kg CO₂ |
| CDN/Cache | Cache hit rate 35% | TTLs too short, no edge caching | 180 kg CO₂ |
| Frontend | JS bundle 2.8MB, JPEG | No code splitting, unoptimized images | 200 kg CO₂ |
| **Total** | | | 1,320 kg CO₂/month |
Results After Optimization (6 Months)
| Intervention | Pattern Applied | Carbon Reduction | Implementation Time |
|---|---|---|---|
| Auto-scaling + EC2 right-sizing | Aggressive scale down, spot instances for batch | -195 kg CO₂/month (43%) | 2 weeks |
| Query optimization + materialized views | N+1 fix, projection, partial indexes | -140 kg CO₂/month (50%) | 4 weeks |
| S3 lifecycle policies | Tiered storage: hot/warm/cold/archive | -72 kg CO₂/month (80%) | 1 week |
| Carbon-aware batch scheduling | Carbon Aware SDK, green windows | -48 kg CO₂/month (40%) | 3 weeks |
| Multi-level cache (Redis + CDN) | Cache hit rate: 35% -> 87% | -126 kg CO₂/month (70%) | 6 weeks |
| Frontend: WebP/AVIF + code splitting | Bundle 2.8MB -> 380KB, WebP images | -110 kg CO₂/month (55%) | 3 weeks |
| **Total saved** | | -691 kg CO₂/month (-52%) | 6 months total |
The final result is a 52% reduction in monthly carbon footprint (from 1,320 kg to 629 kg CO₂), an annual saving of approximately 8.3 tonnes of CO₂ equivalent, comparable to taking about 4 cars off the road for a year. And no ground-up rewrite was needed: every intervention was incremental.
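The headline figures follow directly from the per-component numbers in the two tables:

```python
# Verifying the case-study totals from the tables above (kg CO2/month)
baseline = {"web": 450, "db": 280, "s3": 90, "batch": 120, "cdn": 180, "frontend": 200}
savings = {"web": 195, "db": 140, "s3": 72, "batch": 48, "cdn": 126, "frontend": 110}

total_before = sum(baseline.values())        # 1320 kg CO2/month
total_saved = sum(savings.values())          # 691 kg CO2/month
total_after = total_before - total_saved     # 629 kg CO2/month
reduction_pct = total_saved / total_before * 100   # ~52%
annual_saving_t = total_saved * 12 / 1000          # ~8.3 t CO2e/year

print(f"{total_before} -> {total_after} kg/month "
      f"(-{reduction_pct:.0f}%), {annual_saving_t:.1f} t CO2e/year")
```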
Key Lesson from the Case Study
The two interventions with the best ROI (carbon saved / effort) were:
- S3 lifecycle policies (80% reduction, 1 week of work): just a few lines of Infrastructure-as-Code configuration. This is the pattern with the highest impact-to-effort ratio by far.
- Auto-scaling and right-sizing (43% reduction, 2 weeks): the vast majority of systems are over-provisioned. Reducing provisioning to the minimum needed with auto-scaling has an immediate and measurable impact.
Takeaway: always start with the simplest patterns. 80% of the gains come from 20% of the interventions.
Anti-Patterns to Avoid: The 5 Most Common Traps
Anti-Pattern #1: Aggressive Cache Warming Outside Green Hours
Massively preloading the cache during carbon intensity peaks (typically evenings, when solar is absent and gas covers demand) consumes more energy than it saves. Cache warming should be scheduled during green windows identified by the Carbon Aware SDK.
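A minimal sketch of picking a green window for cache warming, assuming an hourly carbon-intensity forecast is available (e.g. from the Carbon Aware SDK or Electricity Maps; the forecast values below are made up):

```python
# Pick the greenest contiguous window in an hourly carbon-intensity forecast.
# Values are illustrative, in gCO2/kWh, one per hour starting at 00:00.
forecast = [380, 360, 350, 340, 330, 310, 270, 220, 180, 160,
            150, 145, 150, 170, 210, 260, 310, 360, 410, 430,
            420, 410, 400, 390]

def greenest_window(forecast: list[int], duration_h: int) -> tuple[int, float]:
    """Return (start_hour, avg_intensity) of the lowest-carbon window."""
    best_start, best_avg = 0, float("inf")
    for start in range(len(forecast) - duration_h + 1):
        avg = sum(forecast[start:start + duration_h]) / duration_h
        if avg < best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg

start, avg = greenest_window(forecast, duration_h=3)
print(f"Warm the cache starting at {start:02d}:00 (avg {avg:.0f} gCO2/kWh)")
```

With this forecast the 3-hour warming job lands in the midday solar trough instead of the evening gas peak.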
Anti-Pattern #2: Verbose Logs in Hot Storage
Keeping years of debug logs in S3 Standard (hot tier) is an economic and energy waste. Logs older than 30 days are rarely accessed; after 90 days, almost never. Implementing lifecycle policies is the first action to take on any legacy system.
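As a sketch, the lifecycle policy can be expressed as the dict you would pass to boto3's `put_bucket_lifecycle_configuration` (bucket name, prefix, and retention period here are hypothetical):

```python
# S3 lifecycle rules for a hypothetical "app-logs" bucket: logs cool down
# automatically instead of sitting in the hot Standard tier for years.
lifecycle_config = {
    "Rules": [
        {
            "ID": "logs-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},    # warm
                {"Days": 90, "StorageClass": "GLACIER"},        # cold
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},  # archive
            ],
            "Expiration": {"Days": 2555},  # delete after ~7 years of retention
        }
    ]
}

# With boto3 this would be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="app-logs", LifecycleConfiguration=lifecycle_config)
```

A few lines of configuration like this were the highest-ROI intervention in the case study below.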
Anti-Pattern #3: Always-On for Intermittent Workloads
A service that receives 10 requests per day does not need to run 24/7 on a dedicated instance. Lambda, Cloud Run, or Fargate with scale-to-zero is the green choice for intermittent workloads. An always-on EC2 t3.small instance emits ~15 kg CO₂/month; Lambda for 10 requests/day emits <0.001 kg CO₂/month. 15,000 times less.
Anti-Pattern #4: SELECT * in Production
SELECT * transfers every column of a table even when the application uses only 2-3 of them. On tables with 50+ columns and millions of rows, this multiplies the transferred data by 10-20x, with a corresponding energy impact. Always use explicit projection.
Anti-Pattern #5: Monolithic JavaScript Bundle
A 3 MB JS bundle downloaded by 1 million users per day means roughly 3 TB of daily transfer (180-300 kWh/day at 0.06-0.1 kWh/GB), plus the CPU cost of parsing and executing it on every device. With code splitting and lazy loading, the initial bundle can shrink to 100-200 KB, a 15-30x reduction in client-side transfer and parse energy. Energy consumed on user devices is part of the software's Scope 3 emissions.
Checklist: Sustainable Patterns for Your Next Sprint
High Priority (Maximum Impact, Low Complexity)
- Implement S3/GCS lifecycle policies to automatically move data to colder tiers
- Analyze and reduce EC2/GCE instance provisioning (AWS Compute Optimizer, GCP Recommender)
- Add scheduled shutdown for development/staging environments outside working hours
- Enable gzip/brotli compression on all HTTP endpoints (if not already active)
- Convert images to WebP/AVIF in the build pipeline
- Remove all `SELECT *` and replace with explicit projection
Medium Priority (High Impact, Moderate Complexity)
- Implement multi-level caching (L1 in-process + L2 Redis) for the most accessed resources
- Create materialized views for frequently executed aggregate queries
- Migrate time-flexible batch jobs to Carbon Aware SDK scheduling
- Resolve N+1 problems with eager loading or DataLoader (GraphQL)
- Implement cursor-based pagination to replace offset pagination
- Add appropriate ETag and Cache-Control to API responses
Low Priority (Incremental Improvements)
- Evaluate migration to HTTP/3 for latency-sensitive, high-concurrency applications
- Implement field selection in REST endpoints (query param `?fields=`)
- Add dark mode with true black to reduce consumption on user OLED devices
- Configure carbon monitoring with Prometheus + Grafana dashboard for SCI score
- Introduce partial indexes for queries filtered on small data subsets
- Evaluate serverless (Lambda, Cloud Functions) for workloads with <100 requests/hour
Conclusions: Sustainable Architecture as Standard Practice
Sustainable architectural patterns are not a luxury reserved for large tech companies with sustainability budgets: they are good engineering practices that simultaneously improve performance, costs, and environmental impact. Tiered storage, efficient caching, carbon-aware batch scheduling, right-sizing, and query optimization produce systems that are more efficient across every dimension.
The case study shows that it is possible to reduce a legacy system's carbon footprint by 50%+ in 6 months with incremental interventions, without having to rewrite the architecture from scratch. The key is to start with high-impact, low-complexity interventions (lifecycle policies, right-sizing, query optimization), measure the results with SCI metrics, and then proceed toward more sophisticated optimizations.
The CSRD Directive (Corporate Sustainability Reporting Directive), in force for large European companies from 2025 and for medium companies from 2026, will require many organizations to report software emissions as part of Scope 3 emissions. Sustainable architectures are no longer just an ethical choice: they are becoming a regulatory and competitive requirement.
Next Article in the Series
The tenth and final article will complete the Green Software series with ESG, CSRD and Compliance for Software Teams: how to structure mandatory software emissions reporting, create audit trails for regulators, and integrate SCI metrics into corporate governance processes. We will also cover practical tools for reporting: GHG Protocol, ESRS E1, and the GSF Impact Framework.
Additional Resources
- Carbon Aware SDK (Green Software Foundation): github.com/Green-Software-Foundation/carbon-aware-sdk
- Electricity Maps API: Real-time carbon intensity data for 50+ regions
- AWS Compute Optimizer: Automatic right-sizing recommendations
- Impact Framework (GSF): SCI calculation per component
- Web Almanac 2024 (HTTP Archive): Real statistics on web performance and size
- Cloud Carbon Footprint: Open source tool for measuring multi-provider cloud emissions