Ciao! Sono

Federico Calò

Sviluppatore Software | Divulgatore Tecnico

Creo applicazioni web moderne e strumenti digitali personalizzati per aiutare le attività a crescere attraverso l'innovazione tecnologica. La mia passione è unire informatica ed economia per generare valore reale.

Contattami

Chi Sono

La mia passione per l'informatica è nata tra i banchi dell'Istituto Tecnico Commerciale di Maglie, dove ho scoperto il potere della programmazione e il fascino di creare soluzioni digitali. Fin da subito, ho capito che l'informatica non era solo codice, ma uno strumento straordinario per trasformare idee in realtà.

Durante gli studi superiori in Sistemi Informativi Aziendali, ho iniziato a intrecciare informatica ed economia, comprendendo come la tecnologia possa essere il motore della crescita per qualsiasi attività. Questa visione mi ha accompagnato all'Università degli Studi di Bari, dove ho conseguito la Laurea in Informatica, approfondendo le mie competenze tecniche e la mia passione per lo sviluppo software.

Oggi metto questa esperienza al servizio di imprese, professionisti e startup, creando soluzioni digitali su misura che automatizzano processi, ottimizzano risorse e aprono nuove opportunità di business. Perché la vera innovazione inizia quando la tecnologia incontra le esigenze reali delle persone.

Le Mie Competenze

Analisi Dati & Modelli Previsionali

Trasformo i dati in insights strategici con analisi approfondite e modelli predittivi per decisioni informate

Automazione Processi

Creo strumenti personalizzati che automatizzano operazioni ripetitive e liberano tempo per attività a valore aggiunto

Sistemi Custom

Sviluppo sistemi software su misura, dalle integrazioni tra piattaforme alle dashboard personalizzate

const federico = {
  nome: "Federico Calò",
  ruolo: "Sviluppatore Software",
  città: "Bari, Italia",
  missione: "Aiutare attraverso l'informatica",
  passioni: [
    "Codice Pulito",
    "Innovazione",
    "Crescita Continua"
  ]
};

La Mia Missione

Credo fermamente che l'informatica sia lo strumento più potente per trasformare le idee in realtà e migliorare la vita delle persone.

🚀

Democratizzare la Tecnologia

La mia missione è rendere l'informatica accessibile a tutti: dalle piccole imprese locali alle startup innovative, fino ai professionisti che vogliono digitalizzare la propria attività. Ogni realtà merita di sfruttare le potenzialità del digitale.

💡

Unire Informatica ed Economia

Non è solo questione di scrivere codice: è capire come la tecnologia possa generare valore reale. Intrecciando competenze informatiche e visione economica, aiuto le attività a crescere, ottimizzare processi e raggiungere nuovi traguardi di efficienza e redditività.

🎯

Creare Soluzioni su Misura

Ogni attività è unica, e così devono esserlo le soluzioni. Sviluppo strumenti personalizzati che rispondono alle esigenze specifiche di ciascun cliente, automatizzando processi ripetitivi e liberando tempo per ciò che conta davvero: far crescere il business.

Trasforma la Tua Attività con la Tecnologia

Dicembre 2024

Visualizza

Master SQL

RoadMap.sh

Novembre 2024

Visualizza

Oracle Certified Foundations Associate

Oracle

Ottobre 2024

Visualizza

People Leadership Credential

Connect

Settembre 2024

💻 Linguaggi & Tecnologie

☕Java

🐍Python

📜JavaScript

🅰️Angular

⚛️React

🔷TypeScript

🗄️SQL

🐘PHP

🎨CSS/SCSS

🔧Node.js

🐳Docker

🌿Git

💼

12/2024 - Presente

Custom Software Engineering Analyst

Accenture

Bari, Puglia, Italia · Ibrida Analisi e sviluppo di sistemi informatici attraverso l'utilizzo di Java e Quarkus in Health and Public Sector. Formazione continua su tecnologie moderne per la creazione di soluzioni software personalizzate ed efficienti e sugli agenti.

💼

06/2022 - 12/2024

Analista software e Back End Developer Associate Consultant

Links Management and Technology SpA

Esperienza nell'analisi di sistemi software as-is e flussi ETL utilizzando PowerCenter. Formazione completata su Spring Boot per lo sviluppo di applicazioni backend moderne e scalabili. Sviluppatore Backend specializzato in Spring Boot, con esperienza in progettazione di database, analisi, sviluppo e testing dei task assegnati.

💼

02/2021 - 10/2021

Programmatore software

Adesso.it (prima era WebScience srl)

Esperienza nell'analisi AS-IS e TO-BE, evoluzioni SEO ed evoluzioni website per migliorare le performance e l'engagement degli utenti.

🎓

2018 - 2025

Laurea in Informatica

Università degli Studi di Bari Aldo Moro

Bachelor's degree in Computer Science, focusing on software engineering, algorithms, and modern development practices.

📚

2013 - 2018

Diploma - Sistemi Informativi Aziendali

Istituto Tecnico Commerciale di Maglie

Technical diploma specializing in Business Information Systems, combining IT knowledge with business management.

Contattami

Hai un progetto in mente? Parliamone! Compila il form qui sotto e ti risponderò al più presto.

* Campi obbligatori. I tuoi dati saranno utilizzati solo per rispondere alla tua richiesta.

Satellite and Weather APIs for AgriTech: Predictive Data with Sentinel-2 and Python

In 2024, the European Space Agency acquired more than 1.5 petabytes of data every single day from the Sentinel satellites of the Copernicus programme. Of that volume, the share dedicated to agricultural monitoring across Europe represents the largest and strategically most significant segment. Yet the vast majority of farming operations — including the most technologically advanced ones — have not yet tapped into this public, free, and scientifically validated infrastructure.

The barrier is not a lack of data: it is the complexity of accessing it, processing it, and translating it into concrete agronomic decisions. A Sentinel-2 satellite passes over your vineyard every 5 days at a 10-metre-per-pixel resolution. Each pass produces 13 spectral bands encoding vegetation health, water stress, chlorophyll content, and pest presence. Combined with weather data, ground-level IoT sensors, and predictive models, this data can radically transform agronomic decision quality — cutting input costs by 15–25% and boosting yields by up to 10–20%.

The global Earth Observation market is valued at $10.07 billion in 2025 according to Straits Research, with projected growth to $17.20 billion by 2033 (CAGR 6.92%). The agriculture segment accounts for 21% of applications and is growing at the fastest pace, driven by demand for precision farming, yield monitoring, and irrigation optimisation. In this article we build, step by step, a complete Python pipeline to access Sentinel-2 via CDSE (Copernicus Data Space Ecosystem), compute NDVI and other vegetation indices, integrate weather data from Open-Meteo, and feed a predictive model for water stress.

What You Will Learn in This Article

Architecture of the Copernicus programme and how to access Sentinel-2 for free via CDSE
Vegetation indices: NDVI, EVI, SAVI, LAI — mathematical formulas, interpretation, and thresholds for Italian crops
Complete Python implementation: CDSE authentication, band download, NDVI computation with rasterio and numpy
Weather API comparison: Open-Meteo, OpenWeatherMap, Tomorrow.io, and integration with satellite data
Geospatial pipeline with GeoPandas, rasterio, and PostGIS for vector and raster data
Water-stress predictive model: combining Sentinel-2 + weather + IoT with scikit-learn
Practical case study: a Primitivo vineyard in Puglia, seasonal NDVI monitoring, and irrigation optimisation
Italian regulatory context: AGEA, SIAN, CAP open data, and digital PAC

FoodTech Series — All Articles

#	Article	Level	Status
1	IoT Pipeline for Precision Agriculture with Python and MQTT	Advanced	Available
2	Edge ML for Crop Monitoring: Computer Vision in the Field	Advanced	Available
3	Satellite and Weather APIs for AgriTech: Predictive Data (you are here)	Advanced	Current
4	Blockchain Traceability in Food: From Field to Supermarket	Intermediate	Coming soon
5	Computer Vision for Quality Control in the Food Industry	Advanced	Coming soon
6	FSMA and Digital Compliance: Automating Regulatory Processes	Intermediate	Coming soon
7	Vertical Farming: Environmental Control with IoT and ML	Advanced	Coming soon
8	Demand Forecasting for Food Retail with Prophet and LightGBM	Intermediate	Coming soon
9	Farm Intelligence Dashboard: Real-Time Analytics with Grafana	Intermediate	Coming soon
10	Food Supply Chain Optimisation: ML for Waste Reduction	Intermediate	Coming soon

The Copernicus Programme: Europe's Earth Observation Infrastructure

The European Union's Copernicus programme is the largest Earth observation infrastructure ever built. Managed by ESA (European Space Agency) and the European Commission with operational support from EUMETSAT, Copernicus operates a constellation of satellites called "Sentinels", each designed for specific monitoring missions. For agriculture, the reference satellite is Sentinel-2.

The Sentinel-2 constellation consists of two twin satellites (Sentinel-2A and Sentinel-2B) in sun-synchronous orbit at 786 km altitude. The two-satellite configuration guarantees a revisit time of 5 days at the equator and 2–3 days at European latitudes (Italy included). Each satellite carries the MSI (MultiSpectral Instrument) sensor, a pushbroom scanner that acquires data across a 290 km swath width and 13 spectral bands distributed across three spatial resolutions: 10 metres, 20 metres, and 60 metres.

Sentinel-2's most strategic feature is its complete availability at no cost: since 2014, all Copernicus data has been available in open access for any use, commercial or non-commercial. The new access portal since 2023 is the Copernicus Data Space Ecosystem (CDSE), which replaced the previous Copernicus Open Access Hub (SciHub) that was decommissioned in 2023. CDSE provides immediate access to the full Sentinel-2 archive, with a latency from acquisition to availability of approximately 3 hours for the most recent imagery.

      Sentinel-2 Satellite: MSI Sensor Technical Specifications
      
        BandWavelength (nm)ResolutionPrimary Application

        B1 - Coastal aerosol44360 mAir quality, atmospheric correction
B2 - Blue49010 mVegetation/soil discrimination, mapping
B3 - Green56010 mVegetation vigour, leaf colour
B4 - Red66510 mChlorophyll, NDVI (red band)
B5 - Vegetation Red Edge70520 mVegetation stress, Red Edge NDVI
B6 - Vegetation Red Edge74020 mChlorophyll content, species classification
B7 - Vegetation Red Edge78320 mBiomass, LAI
B8 - NIR84210 mNDVI (NIR band), biomass, LAI
B8a - Vegetation Red Edge86520 mCanopy structure, narrow NDVI
B9 - Water vapour94060 mAtmospheric water vapour correction
B10 - SWIR - Cirrus137560 mCirrus cloud detection
B11 - SWIR161020 mWater stress, leaf water content
B12 - SWIR219020 mAdvanced water stress, senescence

      
    

The two Sentinel-2 processing levels most relevant for agriculture are: Level-1C (L1C) — top-of-atmosphere radiance with geometric correction, and Level-2A (L2A) — bottom-of-atmosphere (BOA) surface reflectance with atmospheric correction applied via the Sen2Cor algorithm. For direct agricultural applications, Level-2A is the recommended choice since it has already been corrected for atmospheric influence, making reflectance values comparable across different dates and geographic locations.

Satellite Platforms for AgriTech: Full Comparison 2025

Sentinel-2 is not the only option available. The satellite data market for agriculture includes commercial providers like Planet Labs, established operators like Landsat (NASA/USGS), and cloud-integrated platforms like Google Earth Engine. The choice depends on a complex trade-off between spatial resolution, acquisition frequency, cloud coverage handling, and available budget.

      Satellite Platform Comparison for Precision Agriculture - 2025
      
        
            Platform
            Spatial Resolution
            Revisit Frequency
            Spectral Bands
            Cost
            Data Latency
            API
          

        
            Sentinel-2 (ESA/Copernicus)
            10 m (vis/NIR), 20 m (SWIR)
            5 days (2–3 in Europe)
            13 MSI bands
            Free (open data)
            ~3 hours from acquisition
            CDSE OData, STAC, SentinelHub, OpenEO
          

            Planet PlanetScope
            3.7 m
            Daily (near-daily)
            8 bands (PS2.SD)
            Commercial subscription; free for research
            24 hours
            Planet API v1, Subscriptions API
          

            Landsat 8/9 (NASA/USGS)
            30 m (multispectral), 15 m (pan)
            16 days
            11 OLI/TIRS bands
            Free (open data)
            24–48 hours
            USGS EarthExplorer, Google Earth Engine
          

            MODIS (NASA Terra/Aqua)
            250 m / 500 m / 1 km
            1–2 days
            36 bands
            Free (open data)
            24–48 hours
            NASA EarthData STAC, Google Earth Engine
          

            Google Earth Engine
            Dataset-dependent (10 m–1 km)
            Dataset-dependent
            Multi-dataset integrated
            Free non-commercial; paid commercial
            Immediate cloud processing
            Python ee, JavaScript API
          

            Maxar WorldView-3
            0.3 m (pan), 1.24 m (multi)
            1–4 days
            29 bands (CAVIS)
            ~$25–40/km²
            4–8 hours
            Maxar Streaming API
          

            Airbus Pleiades Neo
            0.3 m
            1–2 days
            6 multispectral bands
            ~€15–30/km²
            24–48 hours
            OneAtlas API
          

      
    

For the vast majority of agricultural applications in Italy, Sentinel-2 is the optimal choice: 10-metre resolution is sufficient for plots larger than 0.5 hectares (the threshold below which pixel statistics become unreliable), the 2–3-day revisit frequency across Italy is adequate for seasonal monitoring, and the complete absence of cost removes any economic barrier to adoption. Planet Labs becomes relevant only for applications requiring near-daily monitoring (for example, detecting the early onset of water stress in high-value crops like intensive horticulture) or sub-10m resolution. MODIS and Landsat remain useful for large-area analyses and multi-decade historical time series.

Vegetation Indices: NDVI, EVI, SAVI and LAI for Italian Crops

Vegetation Indices (VI) are metrics mathematically derived from combinations of spectral bands that quantify specific properties of vegetation. They exploit the fact that chlorophyll absorbs strongly in the red band (665 nm) and reflects strongly in the near-infrared (842 nm): a ratio between these bands encodes vegetation density and vitality with mathematical elegance.

NDVI — Normalized Difference Vegetation Index

NDVI is the most widely used vegetation index in the world, introduced by Rouse et al. in 1973. The mathematical formula is:

NDVI Formula

NDVI = (NIR - Red) / (NIR + Red)

On Sentinel-2:
NDVI = (B8 - B4) / (B8 + B4)

Range: -1.0 to +1.0

NDVI values are interpreted according to a standardised scale, but the optimal thresholds vary by crop and phenological stage. The table below shows reference thresholds for the main Italian crops, derived from scientific literature and empirical validation across typical growing regions (Po Valley, Puglia, Sicily, Tuscany):

      NDVI Interpretation for Italian Crops
      
        
            NDVI Range
            General Interpretation
            Wheat (Triticum)
            Grapevine (Vitis)
            Tomato (Solanum)
            Maize (Zea mays)
            Olive (Olea)
          

        
            < 0.1
            Bare soil, rock, snow
            Unseeded ground
            Winter, pre-bud break
            Post-harvest
            Pre-sowing
            Bare inter-row soil
          

            0.1 - 0.2
            Very sparse or dry vegetation
            Initial emergence
            Dormancy / severe stress
            Recent transplant
            Emergence (VE)
            Severe water/thermal stress
          

            0.2 - 0.4
            Sparse vegetation, moderate stress
            Early tillering
            Pre-flowering, slight stress
            Early vegetative growth
            Stages V1–V3
            Sparse canopy, stress present
          

            0.4 - 0.6
            Moderate vegetation, good health
            Full stem extension
            Pre-flowering / fruit set
            Active vegetative development
            Stages V4–V8
            Normal canopy, well irrigated
          

            0.6 - 0.8
            Dense vegetation, excellent vigour
            Heading (seasonal peak)
            Full foliation, veraison
            Maximum cover (flowering)
            Stages V10–VT (tasselling)
            Dense canopy, optimal condition
          

            > 0.8
            Very dense vegetation, forest
            Rare (field-edge woodland)
            Hedgerows and boundary trees
            Greenhouses with full cover
            Rare optimal conditions
            High-density olive groves
          

      
    

EVI — Enhanced Vegetation Index

EVI, developed by Huete et al. for the MODIS sensor, corrects the limitations of NDVI in high-density vegetation areas (NDVI saturation beyond 0.7) and in zones with exposed soils (NDVI errors caused by soil reflectance). The formula is:

EVI = G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L)

On Sentinel-2:
EVI = 2.5 * (B8 - B4) / (B8 + 6*B4 - 7.5*B2 + 1)

Standard constants:
  G  = 2.5  (gain factor)
  C1 = 6.0  (aerosol resistance coefficient)
  C2 = 7.5  (aerosol resistance coefficient)
  L  = 1.0  (canopy background adjustment factor)

Range: -1.0 to +1.0 (typically 0 to 0.9 for vegetation)

SAVI — Soil Adjusted Vegetation Index

SAVI, proposed by Huete in 1988, is particularly useful in semi-arid environments such as Puglia or Sicily, where exposed soil significantly influences spectral response. The correction factor L reduces soil influence:

SAVI = (NIR - Red) * (1 + L) / (NIR + Red + L)

On Sentinel-2:
SAVI = (B8 - B4) * (1 + 0.5) / (B8 + B4 + 0.5)

L = 0.5 (standard value for medium vegetation)
L = 1.0 for very sparse vegetation
L = 0.25 for dense vegetation

Advantageous over NDVI when:
- Vegetation cover < 40%
- Bright, light-coloured soils (limestone, sandy)
- Semi-arid zones of southern Italy

LAI — Leaf Area Index

The LAI quantifies the total leaf surface area per unit of ground surface area. It is a fundamental agronomic parameter correlated with potential productivity and transpiration. Sentinel-2 allows LAI estimation through empirical or physical algorithms (SNAP BioPhysical Processor):

# Empirical LAI estimate from NDVI (Boegh et al. relationship, corrected for Sentinel-2)
# Valid for European cereal and herbaceous crops

LAI_estimate = -0.3 + 10.2 * NDVI  # valid for NDVI in [0.2, 0.8]

# For grapevine (Italy-specific relationship from literature):
LAI_vine = 0.57 * EXP(3.28 * NDVI)  # Dalla et al., 2019

# Indicative LAI thresholds for Italian crops:
# Wheat:     optimal LAI 3-5 m2/m2 at heading
# Maize:     optimal LAI 3-5 m2/m2 at flowering
# Grapevine: optimal LAI 1.5-2.5 m2/m2 during veraison
# Tomato:    optimal LAI 3-4 m2/m2 at maximum cover

Limitations of Satellite Vegetation Indices

Cloud cover: Sentinel-2 is an optical sensor — clouds render images unusable. In Italy, average cloud cover is 48% in winter and 15% in summer. The QA60 band enables automatic pixel-level cloud masking.
NDVI saturation: Beyond NDVI ~0.7–0.8, sensitivity drops sharply. For high-density crops, use EVI instead.
Mixed pixels: At 10-metre resolution, a single pixel may contain both vegetation and bare soil. For plots smaller than 0.5 ha, mean NDVI is less representative.
Phenological seasonality: Comparing NDVI across different dates requires accounting for the phenological stage, not just the absolute value.
Inter-annual variability: NDVI also varies with seasonal weather conditions (temperature, rainfall) independently of agronomic practices.

Python Implementation: CDSE Access and Sentinel-2 Download

The Copernicus Data Space Ecosystem (CDSE) became the single access portal for Sentinel data in 2023. It offers four main APIs: OData API (product search and download), STAC API (SpatioTemporal Asset Catalog), OpenEO API (standardised processing), and Sentinel Hub API (processing as a service). To get started, create a free account at dataspace.copernicus.eu and generate OAuth2 credentials.

Python Environment Setup and Dependencies

# Install required libraries
pip install sentinelhub rasterio numpy pandas geopandas shapely \
            requests python-dotenv matplotlib scipy scikit-learn \
            openmeteo-requests retry-requests xarray

# requirements.txt with validated versions for Python 3.11+
# sentinelhub==3.11.4       - CDSE and SentinelHub interface
# rasterio==1.4.0           - geospatial raster read/write
# numpy==2.1.0              - matrix computation for bands
# geopandas==1.0.1          - vector data and spatial operations
# shapely==2.0.6            - geometries for AOI (Area of Interest)
# openmeteo-requests==0.3.3 - Open-Meteo API client
# scikit-learn==1.5.2       - water stress predictive model

CDSE Authentication with OAuth2

import os
import requests
from datetime import datetime, timedelta
from dotenv import load_dotenv

load_dotenv()

# CDSE credentials configuration (from environment variables)
CDSE_USERNAME = os.getenv("CDSE_USERNAME")
CDSE_PASSWORD = os.getenv("CDSE_PASSWORD")
CDSE_CLIENT_ID = "cdse-public"
CDSE_TOKEN_URL = "https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token"

def get_cdse_access_token(username: str, password: str) -> str:
    """
    Obtain OAuth2 access token from the CDSE Identity Server.
    Token lifetime is 600 seconds (10 minutes).
    """
    response = requests.post(
        CDSE_TOKEN_URL,
        data={
            "grant_type": "password",
            "client_id": CDSE_CLIENT_ID,
            "username": username,
            "password": password,
        },
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        timeout=30,
    )
    response.raise_for_status()
    token_data = response.json()
    return token_data["access_token"]


class CDSESession:
    """Authenticated session with automatic token refresh management."""

    def __init__(self, username: str, password: str):
        self._username = username
        self._password = password
        self._token: str | None = None
        self._token_expiry: datetime | None = None

    def _refresh_token(self) -> None:
        self._token = get_cdse_access_token(self._username, self._password)
        # 60-second safety margin before actual expiry (600s)
        self._token_expiry = datetime.utcnow() + timedelta(seconds=540)

    def get_headers(self) -> dict:
        if self._token is None or datetime.utcnow() >= self._token_expiry:
            self._refresh_token()
        return {"Authorization": f"Bearer {self._token}"}


# Initialise session
session = CDSESession(CDSE_USERNAME, CDSE_PASSWORD)

Searching Sentinel-2 Products via OData API

from shapely.geometry import box
import json

ODATA_BASE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1"

def search_sentinel2_products(
    bbox: tuple[float, float, float, float],  # (lon_min, lat_min, lon_max, lat_max)
    start_date: str,   # format: "2024-05-01T00:00:00.000Z"
    end_date: str,     # format: "2024-05-31T23:59:59.000Z"
    cloud_cover_max: float = 20.0,
    product_type: str = "S2MSI2A",  # L2A = Bottom of Atmosphere, preferred for agriculture
    max_results: int = 20,
) -> list[dict]:
    """
    Search for Sentinel-2 products in the CDSE catalogue.

    Args:
        bbox: Bounding box in order (lon_min, lat_min, lon_max, lat_max)
        start_date: Start date (ISO 8601 format)
        end_date: End date (ISO 8601 format)
        cloud_cover_max: Maximum cloud cover percentage (0-100)
        product_type: S2MSI2A (L2A, recommended) or S2MSI1C (L1C)
        max_results: Maximum number of results

    Returns:
        List of dictionaries with product metadata
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    aoi_wkt = f"POLYGON(( {lon_min} {lat_min}, {lon_max} {lat_min}, {lon_max} {lat_max}, {lon_min} {lat_max}, {lon_min} {lat_min} ))"

    filter_query = (
        f"Collection/Name eq 'SENTINEL-2' "
        f"and Attributes/OData.CSC.DoubleAttribute/any(att:att/Name eq 'cloudCover' "
        f"    and att/OData.CSC.DoubleAttribute/Value le {cloud_cover_max}) "
        f"and Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' "
        f"    and att/OData.CSC.StringAttribute/Value eq '{product_type}') "
        f"and ContentDate/Start ge {start_date} "
        f"and ContentDate/Start lt {end_date} "
        f"and OData.CSC.Intersects(area=geography'SRID=4326;{aoi_wkt}')"
    )

    params = {
        "$filter": filter_query,
        "$orderby": "ContentDate/Start desc",
        "$top": max_results,
        "$expand": "Attributes",
    }

    response = requests.get(
        f"{ODATA_BASE_URL}/Products",
        params=params,
        timeout=60,
    )
    response.raise_for_status()
    data = response.json()
    return data.get("value", [])


# Example: search for imagery over the Manduria vineyard (Puglia)
# Growing season: May-September 2024
bbox_manduria = (17.4, 40.3, 17.7, 40.5)  # Primitivo di Manduria DOC area

products = search_sentinel2_products(
    bbox=bbox_manduria,
    start_date="2024-05-01T00:00:00.000Z",
    end_date="2024-09-30T23:59:59.000Z",
    cloud_cover_max=10.0,
    product_type="S2MSI2A",
)

print(f"Found {len(products)} Sentinel-2 L2A products with cloud cover <= 10%")
for p in products[:5]:
    cloud = next(
        (a["Value"] for a in p.get("Attributes", []) if a["Name"] == "cloudCover"),
        "N/A",
    )
    print(f"  {p['Name']} | Cloud: {cloud:.1f}% | Date: {p['ContentDate']['Start'][:10]}")

Downloading and Accessing Bands with Sentinel Hub Python

For processing bands without downloading the full scene (which weighs 800 MB – 1.2 GB per L2A product), it is more efficient to use the Sentinel Hub Process API, which returns only the requested bands for the region of interest:

from sentinelhub import (
    SHConfig,
    SentinelHubRequest,
    DataCollection,
    MimeType,
    BBox,
    CRS,
    bbox_to_dimensions,
)
import numpy as np

# SentinelHub configuration via CDSE
config = SHConfig()
config.sh_client_id = os.getenv("SH_CLIENT_ID")       # from apps.sentinel-hub.com
config.sh_client_secret = os.getenv("SH_CLIENT_SECRET")
config.sh_base_url = "https://sh.dataspace.copernicus.eu"  # CDSE endpoint

# Area of Interest (AOI) definition - Manduria vineyard ~100 ha
aoi_bbox = BBox(
    bbox=[17.42, 40.34, 17.52, 40.42],
    crs=CRS.WGS84,
)
# Compute raster dimensions at 10m resolution
size = bbox_to_dimensions(aoi_bbox, resolution=10)
print(f"Raster dimensions: {size[0]} x {size[1]} pixels at 10m")

# Evalscript to download B04 (Red) and B08 (NIR) for NDVI computation
evalscript_ndvi_bands = """
//VERSION=3
function setup() {
    return {
        input: [{
            bands: ["B04", "B08", "CLM"],  // Red, NIR, Cloud Mask
            units: "REFLECTANCE"
        }],
        output: {
            bands: 3,
            sampleType: "FLOAT32"
        }
    };
}

function evaluatePixel(sample) {
    // Returns [Red, NIR, CloudMask]
    return [sample.B04, sample.B08, sample.CLM];
}
"""

# Data request
request = SentinelHubRequest(
    evalscript=evalscript_ndvi_bands,
    input_data=[
        SentinelHubRequest.input_data(
            data_collection=DataCollection.SENTINEL2_L2A.define_from(
                "s2l2a",
                service_url=config.sh_base_url,
            ),
            time_interval=("2024-07-15", "2024-07-20"),
            other_args={"dataFilter": {"maxCloudCoverage": 10}},
        )
    ],
    responses=[SentinelHubRequest.output_response("default", MimeType.TIFF)],
    bbox=aoi_bbox,
    size=size,
    config=config,
)

# Download data (returns numpy array [H, W, Bands])
images = request.get_data()
band_data = images[0]  # shape: (height, width, 3)

red_band   = band_data[:, :, 0].astype(np.float32)  # B04 - Red
nir_band   = band_data[:, :, 1].astype(np.float32)  # B08 - NIR
cloud_mask = band_data[:, :, 2]                      # CLM - 0=clear, 1=cloud

print(f"Bands downloaded - Shape: {red_band.shape}")
print(f"Red B04 - Min: {red_band.min():.4f}, Max: {red_band.max():.4f}, Mean: {red_band.mean():.4f}")
print(f"NIR B08 - Min: {nir_band.min():.4f}, Max: {nir_band.max():.4f}, Mean: {nir_band.mean():.4f}")
print(f"Cloud pixels: {cloud_mask.sum()} out of {cloud_mask.size} total ({100*cloud_mask.mean():.1f}%)")

NDVI Computation and GeoTIFF Export

import rasterio
from rasterio.transform import from_bounds
from rasterio.crs import CRS as RasterioCRS
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

def calculate_ndvi(
    red: np.ndarray,
    nir: np.ndarray,
    cloud_mask: np.ndarray | None = None,
) -> np.ndarray:
    """
    Compute NDVI from the red and NIR bands.
    Handles division by zero with np.errstate.
    Masks cloudy pixels with np.nan.

    Returns:
        NDVI array as float32, range [-1, 1], NaN for invalid pixels
    """
    with np.errstate(divide="ignore", invalid="ignore"):
        ndvi = np.where(
            (nir + red) == 0,
            np.nan,
            (nir - red) / (nir + red),
        ).astype(np.float32)

    # Mask clouds
    if cloud_mask is not None:
        ndvi = np.where(cloud_mask == 1, np.nan, ndvi)

    # Clip physically impossible values (instrumental artefacts)
    ndvi = np.clip(ndvi, -1.0, 1.0)
    return ndvi


def save_ndvi_geotiff(
    ndvi: np.ndarray,
    bbox_wgs84: tuple[float, float, float, float],
    output_path: str,
) -> None:
    """
    Save the NDVI array as a georeferenced GeoTIFF in WGS84.

    Args:
        ndvi: 2D NDVI array (H, W)
        bbox_wgs84: (lon_min, lat_min, lon_max, lat_max)
        output_path: Output file path
    """
    height, width = ndvi.shape
    lon_min, lat_min, lon_max, lat_max = bbox_wgs84

    transform = from_bounds(lon_min, lat_min, lon_max, lat_max, width, height)

    with rasterio.open(
        output_path,
        "w",
        driver="GTiff",
        height=height,
        width=width,
        count=1,
        dtype=rasterio.float32,
        crs=RasterioCRS.from_epsg(4326),
        transform=transform,
        compress="lzw",
        nodata=np.nan,
    ) as dst:
        dst.write(ndvi, 1)
        dst.update_tags(
            NDVI_COMPUTED_AT=datetime.utcnow().isoformat(),
            SENTINEL2_PRODUCT_TYPE="S2MSI2A",
            FORMULA="(B08-B04)/(B08+B04)",
        )


# Compute and save NDVI
ndvi_map = calculate_ndvi(red_band, nir_band, cloud_mask)

# NDVI statistics for the AOI (excluding NaN)
valid_pixels = ndvi_map[~np.isnan(ndvi_map)]
print(f"\nNDVI Statistics - Manduria Vineyard (July 2024)")
print(f"  Valid pixels: {len(valid_pixels)} out of {ndvi_map.size}")
print(f"  Mean NDVI:    {valid_pixels.mean():.3f}")
print(f"  Median NDVI:  {np.median(valid_pixels):.3f}")
print(f"  NDVI p10:     {np.percentile(valid_pixels, 10):.3f} (stressed zones)")
print(f"  NDVI p90:     {np.percentile(valid_pixels, 90):.3f} (optimal zones)")

bbox_wgs84 = (17.42, 40.34, 17.52, 40.42)
save_ndvi_geotiff(ndvi_map, bbox_wgs84, "ndvi_manduria_20240717.tif")
print("NDVI GeoTIFF saved: ndvi_manduria_20240717.tif")

Weather APIs for AgriTech: Comparison and Implementation

Weather data is the second fundamental data source in precision agriculture. Combined with NDVI and ground-level IoT data, it enables building predictive models for water stress, disease risk, evapotranspiration, and yield forecasting. The weather API landscape in 2025 spans open-source free providers, commercial providers with free tiers, and regional data from Italy's ARPA networks.

      Weather APIs for AgriTech — Comparison 2025
      
        
            Provider
            Price
            Spatial Resolution
            Forecast
            Agro Variables
            Historical Archive
            API
          

        
            Open-Meteo
            Free (non-commercial); ~$15/month commercial
            1–11 km (multi-model)
            16 days
            ET0, soil temp, soil moisture, vapour pressure
            1940–present (ERA5)
            REST JSON, Python client
          

            OpenWeatherMap
            Free up to 60 calls/min; $40/month pro
            2.5 km (ICON)
            5 days (free), 16 days (pro)
            Limited, no ET0
            Paid history only
            REST JSON, Python SDK
          

            Tomorrow.io
            Free 500 calls/day; $199/month core
            1 km
            21 days
            SoilMoisture, ET0, spray conditions, pest risk
            6 years
            REST, WebSocket, gRPC
          

            ARPA Regional (Italy, 15 regions)
            Free (regional open data)
            Point stations (~1,500 stations across Italy)
            No (observed data only)
            All agro-meteorological variables
            Full archive (decades)
            WMS, REST, CSV/JSON download
          

            Meteoam / CFS/ECMWF
            ECMWF open data free; CFS NOAA free
            9–25 km
            10–45 days
            Basic meteorological variables
            1940–present via ERA5
            ECMWF API (Python), CDS API
          

            Visual Crossing
            Free 1,000 records/day; $35/month standard
            Station-interpolated
            15 days
            ET0, Growing Degree Days, frost risk
            Unlimited paid
            REST JSON, CSV, SQL-like
          

      
    

For AgriTech applications in production, the recommendation is to use Open-Meteo as the base (free, open-source, ERA5 data for historical records, 16-day forecast) and supplement it with ARPA station networks for local calibration. Open-Meteo includes critical agrometeorological variables such as ET0 (reference evapotranspiration), soil moisture at multiple depths, and water vapour pressure — variables that are not always freely available from commercial providers.

Open-Meteo Client with Caching and Retry Logic

import openmeteo_requests
import requests_cache
import pandas as pd
from retry_requests import retry
from dataclasses import dataclass
from typing import Optional

@dataclass
class WeatherForecast:
    """Daily weather data for agronomic decision-making."""
    date: pd.DatetimeIndex
    temperature_max: np.ndarray    # Celsius
    temperature_min: np.ndarray    # Celsius
    precipitation: np.ndarray      # mm
    et0: np.ndarray                # mm/day - reference evapotranspiration (FAO-56)
    wind_speed_max: np.ndarray     # km/h
    solar_radiation: np.ndarray    # MJ/m2/day
    soil_moisture_0_7cm: np.ndarray   # m3/m3 (0-7 cm depth)
    soil_moisture_7_28cm: np.ndarray  # m3/m3 (7-28 cm depth)
    soil_temp_0_7cm: np.ndarray    # Celsius


def get_weather_forecast(
    latitude: float,
    longitude: float,
    start_date: Optional[str] = None,   # "YYYY-MM-DD" for historical archive
    end_date: Optional[str] = None,
    forecast_days: int = 16,
) -> WeatherForecast:
    """
    Retrieve weather data from Open-Meteo with local cache (1 hour) and auto-retry.
    Combines forecast (16 days) with ERA5 historical archive if start_date is provided.

    Args:
        latitude: WGS84 latitude
        longitude: WGS84 longitude
        start_date: If provided, requests historical data (not forecast)
        end_date: End of historical period
        forecast_days: Forecast horizon in days (1-16)

    Returns:
        WeatherForecast with all agrometeorological parameters
    """
    # Local HTTP cache (1-hour validity) + retry on transient errors
    cache_session = requests_cache.CachedSession(".cache/open_meteo", expire_after=3600)
    retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
    client = openmeteo_requests.Client(session=retry_session)

    base_url = (
        "https://archive-api.open-meteo.com/v1/archive"
        if start_date
        else "https://api.open-meteo.com/v1/forecast"
    )

    params = {
        "latitude": latitude,
        "longitude": longitude,
        "daily": [
            "temperature_2m_max",
            "temperature_2m_min",
            "precipitation_sum",
            "et0_fao_evapotranspiration",
            "wind_speed_10m_max",
            "shortwave_radiation_sum",
            "soil_moisture_0_to_7cm",
            "soil_moisture_7_to_28cm",
            "soil_temperature_0_to_7cm",
        ],
        "timezone": "Europe/Rome",  # Italian time zone
        "wind_speed_unit": "kmh",
    }

    if start_date:
        params["start_date"] = start_date
        params["end_date"] = end_date
    else:
        params["forecast_days"] = forecast_days

    # Open-Meteo uses ERA5 for historical archive (1940-present), excellent for agriculture
    if start_date:
        # ERA5-Land archive: high resolution 9 km for Italy
        params["models"] = "era5_land"

    responses = client.weather_api(base_url, params=params)
    r = responses[0]  # first (and only) location

    daily = r.Daily()

    return WeatherForecast(
        date=pd.date_range(
            start=pd.to_datetime(daily.Time(), unit="s", utc=True),
            end=pd.to_datetime(daily.TimeEnd(), unit="s", utc=True),
            freq=pd.Timedelta(seconds=daily.Interval()),
            inclusive="left",
        ),
        temperature_max      = daily.Variables(0).ValuesAsNumpy(),
        temperature_min      = daily.Variables(1).ValuesAsNumpy(),
        precipitation        = daily.Variables(2).ValuesAsNumpy(),
        et0                  = daily.Variables(3).ValuesAsNumpy(),
        wind_speed_max       = daily.Variables(4).ValuesAsNumpy(),
        solar_radiation      = daily.Variables(5).ValuesAsNumpy(),
        soil_moisture_0_7cm  = daily.Variables(6).ValuesAsNumpy(),
        soil_moisture_7_28cm = daily.Variables(7).ValuesAsNumpy(),
        soil_temp_0_7cm      = daily.Variables(8).ValuesAsNumpy(),
    )


# Example: weather for the Manduria vineyard (2024 growing season)
lat_manduria, lon_manduria = 40.38, 17.47

weather_historical = get_weather_forecast(
    latitude=lat_manduria,
    longitude=lon_manduria,
    start_date="2024-04-01",
    end_date="2024-09-30",
)
weather_forecast = get_weather_forecast(
    latitude=lat_manduria,
    longitude=lon_manduria,
    forecast_days=16,
)

df_weather = pd.DataFrame({
    "date":      weather_historical.date,
    "t_max_C":   weather_historical.temperature_max,
    "t_min_C":   weather_historical.temperature_min,
    "rain_mm":   weather_historical.precipitation,
    "et0_mm":    weather_historical.et0,
    "rad_MJ":    weather_historical.solar_radiation,
    "sw_0_7cm":  weather_historical.soil_moisture_0_7cm,
    "sw_7_28cm": weather_historical.soil_moisture_7_28cm,
})

print(df_weather.describe())
print(f"\nTotal ET0 growing season Apr-Sep 2024: {df_weather['et0_mm'].sum():.1f} mm")
print(f"Total rainfall Apr-Sep 2024: {df_weather['rain_mm'].sum():.1f} mm")
water_deficit = df_weather["et0_mm"].sum() - df_weather["rain_mm"].sum()
print(f"Seasonal water deficit (ET0 - rainfall): {water_deficit:.1f} mm")

Complete Geospatial Pipeline: GeoPandas, rasterio and PostGIS

Professional management of satellite data requires a complete geospatial infrastructure that integrates raster data (NDVI satellite imagery, weather maps) with vector data (plot boundaries, management zones, sampling points). The pipeline described here adopts the de-facto standard Python geospatial stack: rasterio for rasters, GeoPandas for vectors, PostGIS as the spatial database, and GDAL as the underlying format translation engine.

Geospatial Pipeline Architecture

AgriTech Geospatial Pipeline Technology Stack

+-------------------------------------------------------------------------+
|                     INPUT DATA                                          |
|  [Sentinel-2 GeoTIFF]  [Plot Shapefiles]      [Weather/IoT CSV]        |
|         |                      |                       |                |
|    rasterio                geopandas             pandas                 |
+---------|----------------------|---------------------- |----------------+
          |                      |                       |
+---------v----------------------v-----------------------v----------------+
|                   PROCESSING LAYER (Python)                            |
|  [NDVI Calculation]  [Zonal Statistics]  [Spatial Join]               |
|  numpy/rasterio      rasterstats         geopandas                    |
+---------|----------------------|---------------------- |----------------+
          |                      |                       |
          +----------------------v-----------------------+
                                 |
+--------------------------------v----------------------------------------+
|               STORAGE LAYER (PostGIS + S3)                              |
|  [PostGIS - geometries + attributes]  [S3/MinIO - GeoTIFF rasters]     |
|  - Plot table with mean NDVI          - Original raster files           |
|  - Historical series per plot          - Satellite scene archive        |
|  - GIST spatial index                  - Cloud Optimized GeoTIFF        |
+--------------------------------|----------------------------------------+
                                 |
+--------------------------------v----------------------------------------+
|               SERVING LAYER                                             |
|  [REST API - FastAPI]  [Dashboard - Grafana]  [ML Models]              |
|  NDVI per plot          Real-time NDVI maps     Stress prediction       |
+-------------------------------------------------------------------------+

Zonal Statistics: Mean NDVI per Plot with rasterstats

import geopandas as gpd
from rasterstats import zonal_stats
import rasterio
from sqlalchemy import create_engine, text
from geoalchemy2 import Geometry
import pandas as pd

# Load vineyard plot shapefile
gdf_plots = gpd.read_file("data/plots_manduria_vineyard.shp")
gdf_plots = gdf_plots.to_crs("EPSG:4326")  # Reproject to WGS84

print(f"Plots loaded: {len(gdf_plots)}")
print(gdf_plots[["plot_id", "variety", "hectares", "planting_year"]].head(10))

def compute_zonal_ndvi(
    ndvi_raster_path: str,
    plots_gdf: gpd.GeoDataFrame,
    acquisition_date: str,
) -> gpd.GeoDataFrame:
    """
    Compute zonal NDVI statistics for each plot.

    Per plot, computes:
    - Mean NDVI (overall vigour indicator)
    - Median NDVI (robust to outliers)
    - NDVI std (intra-plot variability)
    - 10th percentile (stressed zones within the plot)
    - 90th percentile (optimal zones)
    - Percentage of valid pixels (excludes cloud NaN)

    Returns:
        GeoDataFrame with added NDVI columns
    """
    stats = zonal_stats(
        vectors=plots_gdf.geometry,
        raster=ndvi_raster_path,
        stats=["mean", "median", "std", "percentile_10", "percentile_90", "count", "nodata"],
        nodata=np.nan,
        geojson_out=False,
    )

    df_stats = pd.DataFrame(stats)
    df_stats.columns = [
        "ndvi_mean", "ndvi_median", "ndvi_std",
        "ndvi_p10", "ndvi_p90", "pixel_count", "pixel_nodata",
    ]
    df_stats["acquisition_date"] = pd.to_datetime(acquisition_date)
    df_stats["pixel_valid_pct"] = (
        df_stats["pixel_count"] /
        (df_stats["pixel_count"] + df_stats["pixel_nodata"].fillna(0)) * 100
    )

    # Stress classification based on mean NDVI (thresholds for grapevine in summer)
    df_stats["stress_category"] = pd.cut(
        df_stats["ndvi_mean"],
        bins=[-1.0, 0.25, 0.40, 0.55, 0.70, 1.0],
        labels=["Severe Stress", "Moderate Stress", "Normal", "Good", "Optimal"],
    )

    return gpd.GeoDataFrame(
        pd.concat([plots_gdf.reset_index(drop=True), df_stats], axis=1),
        geometry="geometry",
        crs=plots_gdf.crs,
    )


# Compute zonal NDVI for all plots
gdf_ndvi = compute_zonal_ndvi(
    ndvi_raster_path="ndvi_manduria_20240717.tif",
    plots_gdf=gdf_plots,
    acquisition_date="2024-07-17",
)

# Stressed plots - irrigation priority
stressed_plots = gdf_ndvi[
    gdf_ndvi["stress_category"].isin(["Severe Stress", "Moderate Stress"])
]
print(f"\nStressed plots ({len(stressed_plots)}/{len(gdf_ndvi)}):")
print(stressed_plots[["plot_id", "variety", "ndvi_mean", "stress_category", "hectares"]])

# Persist to PostGIS for spatial queries
engine = create_engine("postgresql://agritech:pass@localhost:5432/vineyard_db")

gdf_ndvi.to_postgis(
    name="ndvi_historical",
    con=engine,
    if_exists="append",
    index=False,
    dtype={"geometry": Geometry("MULTIPOLYGON", srid=4326)},
)
print("\nNDVI data saved to PostGIS (table: ndvi_historical)")

PostGIS Queries for Advanced Spatial Analysis

-- PostGIS schema for the AgriTech system
CREATE TABLE IF NOT EXISTS plots (
    plot_id         SERIAL PRIMARY KEY,
    cadastral_code  VARCHAR(20) UNIQUE NOT NULL,
    variety         VARCHAR(50),
    hectares        NUMERIC(8, 4),
    planting_year   INTEGER,
    training_system VARCHAR(30),
    geom            GEOMETRY(MULTIPOLYGON, 4326)
);

CREATE INDEX idx_plots_geom ON plots USING GIST(geom);

CREATE TABLE IF NOT EXISTS ndvi_historical (
    id              SERIAL PRIMARY KEY,
    plot_id         INTEGER REFERENCES plots(plot_id),
    satellite_date  DATE NOT NULL,
    ndvi_mean       NUMERIC(6, 4),
    ndvi_median     NUMERIC(6, 4),
    ndvi_std        NUMERIC(6, 4),
    ndvi_p10        NUMERIC(6, 4),
    ndvi_p90        NUMERIC(6, 4),
    pixel_valid_pct NUMERIC(5, 2),
    stress_category VARCHAR(20),
    satellite       VARCHAR(20) DEFAULT 'Sentinel-2',
    created_at      TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_ndvi_plot_date ON ndvi_historical(plot_id, satellite_date);

-- Query: NDVI trend over last 4 acquisitions per plot
SELECT
    p.cadastral_code,
    p.variety,
    n.satellite_date,
    n.ndvi_mean,
    LAG(n.ndvi_mean, 1) OVER (PARTITION BY n.plot_id ORDER BY n.satellite_date) AS ndvi_prev,
    n.ndvi_mean - LAG(n.ndvi_mean, 1) OVER (PARTITION BY n.plot_id ORDER BY n.satellite_date) AS ndvi_delta
FROM plots p
JOIN ndvi_historical n ON p.plot_id = n.plot_id
WHERE n.satellite_date >= NOW() - INTERVAL '60 days'
ORDER BY p.plot_id, n.satellite_date;

-- Query: plots with NDVI drop > 0.1 in the last month (stress alert)
SELECT
    p.cadastral_code,
    p.variety,
    p.hectares,
    latest.ndvi_mean AS current_ndvi,
    prev.ndvi_mean   AS ndvi_30d_ago,
    (latest.ndvi_mean - prev.ndvi_mean) AS change
FROM plots p
JOIN ndvi_historical latest ON p.plot_id = latest.plot_id
    AND latest.satellite_date = (SELECT MAX(satellite_date) FROM ndvi_historical WHERE plot_id = p.plot_id)
JOIN ndvi_historical prev ON p.plot_id = prev.plot_id
    AND prev.satellite_date = (SELECT MAX(satellite_date) FROM ndvi_historical
        WHERE plot_id = p.plot_id AND satellite_date < NOW() - INTERVAL '25 days')
WHERE (latest.ndvi_mean - prev.ndvi_mean) < -0.10
ORDER BY change ASC;

Water Stress Predictive Model with ML

Combining satellite NDVI with weather and IoT data enables building water-stress predictive models with 3–7 days of lead time, enabling proactive irrigation planning. The model described here integrates three data sources: the Sentinel-2 NDVI time series, the water balance computed from ET0 and precipitation, and soil moisture from IoT sensors (or from Open-Meteo ERA5-Land when physical sensors are not available).

Feature Engineering for the Model

import pandas as pd
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report
import joblib
from typing import Tuple

def build_stress_features(
    df_ndvi: pd.DataFrame,      # columns: date, ndvi_mean, ndvi_std, ndvi_p10
    df_weather: pd.DataFrame,   # columns: date, t_max_C, t_min_C, rain_mm, et0_mm, sw_0_7cm
    lookback_days: int = 14,
) -> pd.DataFrame:
    """
    Build feature set for the water stress predictive model.
    Combines satellite NDVI with water balance and weather variables.

    Generated features:
    - Current NDVI and change over 5/10 days
    - Cumulative water deficit (ET0 - rainfall) over 7/14/21 days
    - Growing Degree Days (GDD) from season start
    - Average soil moisture and trend
    - Thermal variability (t_max - t_min)
    """
    df = df_weather.merge(df_ndvi, on="date", how="left").sort_values("date")
    df["ndvi_mean"] = df["ndvi_mean"].interpolate(method="linear", limit=5)

    # Cumulative water balance (mm) - positive = surplus, negative = deficit
    df["water_balance"] = df["rain_mm"] - df["et0_mm"]
    df["deficit_7d"]  = df["water_balance"].rolling(7).sum()
    df["deficit_14d"] = df["water_balance"].rolling(14).sum()
    df["deficit_21d"] = df["water_balance"].rolling(21).sum()

    # NDVI trend (point and windowed change)
    df["ndvi_delta_5d"]  = df["ndvi_mean"] - df["ndvi_mean"].shift(5)
    df["ndvi_delta_10d"] = df["ndvi_mean"] - df["ndvi_mean"].shift(10)

    # Growing Degree Days (base 10 degrees - standard for grapevine)
    df["t_mean"] = (df["t_max_C"] + df["t_min_C"]) / 2
    df["gdd_daily"]      = np.maximum(df["t_mean"] - 10, 0)
    df["gdd_cumulative"] = df["gdd_daily"].cumsum()

    # Thermal variability (thermal stress indicator)
    df["diurnal_range"] = df["t_max_C"] - df["t_min_C"]

    # Average soil moisture and trend
    df["sm_avg_7d"] = df["sw_0_7cm"].rolling(7).mean()
    df["sm_trend"]  = df["sw_0_7cm"] - df["sw_0_7cm"].shift(3)

    feature_cols = [
        "ndvi_mean", "ndvi_std", "ndvi_p10",
        "ndvi_delta_5d", "ndvi_delta_10d",
        "deficit_7d", "deficit_14d", "deficit_21d",
        "et0_mm", "rain_mm",
        "t_max_C", "t_min_C", "diurnal_range",
        "gdd_cumulative", "gdd_daily",
        "sm_avg_7d", "sm_trend",
    ]

    return df[["date"] + feature_cols].dropna()


def train_stress_model(
    features_df: pd.DataFrame,
    labels: pd.Series,  # 0=no stress, 1=moderate stress, 2=severe stress
) -> Tuple[Pipeline, dict]:
    """
    Train GradientBoosting model for water stress classification.
    Uses sklearn pipeline with integrated StandardScaler.

    Returns:
        (trained_pipeline, validation_metrics)
    """
    feature_cols = [c for c in features_df.columns if c != "date"]
    X = features_df[feature_cols].values
    y = labels.values

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("clf", GradientBoostingClassifier(
            n_estimators=200,
            learning_rate=0.05,
            max_depth=4,
            subsample=0.8,
            random_state=42,
        )),
    ])

    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)

    cv_scores = cross_val_score(pipeline, X, y, cv=5, scoring="f1_weighted")

    metrics = {
        "f1_cv_mean": cv_scores.mean(),
        "f1_cv_std":  cv_scores.std(),
        "classification_report": classification_report(
            y_test, y_pred,
            target_names=["No Stress", "Moderate Stress", "Severe Stress"],
        ),
    }

    return pipeline, metrics


# Save model for deployment
def save_model(pipeline: Pipeline, path: str) -> None:
    joblib.dump(pipeline, path)
    print(f"Model saved: {path}")


def predict_stress_7days(
    pipeline: Pipeline,
    features_forecast: pd.DataFrame,  # features computed on 7-day forecast data
) -> pd.DataFrame:
    """
    Predict water stress for the next 7 days.
    Uses features computed on weather forecast data (Open-Meteo, 16 days).

    Returns:
        DataFrame with date, class probabilities, and predicted class
    """
    feature_cols = [c for c in features_forecast.columns if c != "date"]
    X_forecast = features_forecast[feature_cols].values

    proba = pipeline.predict_proba(X_forecast)
    pred_class = pipeline.predict(X_forecast)

    labels = ["no_stress", "moderate_stress", "severe_stress"]
    result_df = pd.DataFrame(proba, columns=[f"prob_{l}" for l in labels])
    result_df["predicted_class"] = pred_class
    result_df["date"] = features_forecast["date"].values
    result_df["predicted_label"] = [labels[c] for c in pred_class]

    return result_df[["date", "predicted_label", "prob_no_stress",
                       "prob_moderate_stress", "prob_severe_stress"]]

Case Study: Seasonal NDVI Monitoring in a Primitivo Vineyard in Manduria

To make the described concepts concrete, we present a case study inspired by a real application in the DOC Primitivo di Manduria territory (Taranto, Puglia). The area presents conditions typical of Mediterranean viticulture: hot dry summers (summer rainfall < 50 mm), calcareous clay soils, and training systems ranging from the traditional bush vine (alberello pugliese) to vertical shoot positioning (controspalliera).

      Case Study Parameters — Primitivo Vineyard, Manduria
      
        ParameterValue

        LocationManduria (TA), Puglia — lat 40.38, lon 17.47
Total area35 hectares divided into 12 plots
Main varietiesPrimitivo (80%), Negroamaro (20%)
Training systemsBush vine — alberello (8 plots), VSP — controspalliera (4 plots)
Planting year2003–2015 (mixed young/mature vines)
IrrigationDrip irrigation (not all plots)
Monitoring satelliteSentinel-2A/B, L2A, 10m resolution
Acquisitions used28 scenes in 2024 season (cloud < 20%)
Integrated IoT sensors6 weather stations, 18 soil moisture sensors at 30/60 cm
Analysis periodApril – October 2024

      
    

Seasonal NDVI Results and Irrigation Decisions

The seasonal NDVI analysis revealed significantly differentiated water stress patterns across plots, with direct implications for irrigation management and product quality:

      NDVI Results per Plot — 2024 Growing Season
      
            Plot
            Variety
            NDVI Apr
            NDVI Jun
            NDVI Jul (peak)
            NDVI Aug
            NDVI Sep
            Irrigation Alert
          
            Block A1Primitivo / bush vine
            0.280.420.580.410.31
            Moderate stress August
          
            Block A2Primitivo / bush vine
            0.220.360.470.290.24
            Severe stress Aug–Sep
          
            Block B1Primitivo / VSP
            0.310.530.660.570.44
            No alert
          
            Block B2Primitivo / VSP
            0.290.490.630.520.40
            No alert
          
            Block C1Negroamaro / bush vine
            0.250.440.550.380.28
            Moderate stress August
          
            Block C2Negroamaro / VSP
            0.300.510.610.500.38
            No alert

The pattern that emerged is clear and agronomically significant: bush vine plots without supplemental irrigation show more pronounced water stress in August and September compared to VSP plots with drip irrigation. Block A2 in particular showed an NDVI decline from 0.47 to 0.29 between July and August 2024, correlated in time with a 23-day dry spell and maximum temperatures consistently exceeding 36 degrees Celsius.

Automated Alerting Script

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from typing import List

def check_ndvi_alerts(
    gdf_ndvi: gpd.GeoDataFrame,
    stress_threshold: float = 0.35,
    min_valid_pixels_pct: float = 70.0,
) -> List[dict]:
    """
    Check plots with NDVI below the stress threshold.
    Generates an alert list for agronomist notification.

    Args:
        gdf_ndvi: GeoDataFrame with ndvi_mean and pixel_valid_pct columns
        stress_threshold: NDVI threshold below which to trigger alert (0.35 default for grapevine)
        min_valid_pixels_pct: Minimum valid pixel percentage for data reliability

    Returns:
        List of dicts with per-plot alert details
    """
    alerts = []
    for _, row in gdf_ndvi.iterrows():
        if row["pixel_valid_pct"] < min_valid_pixels_pct:
            continue  # Too many cloudy pixels, data unreliable

        if row["ndvi_mean"] < stress_threshold:
            severity = "SEVERE" if row["ndvi_mean"] < 0.25 else "MODERATE"
            alerts.append({
                "plot_id":        row["plot_id"],
                "variety":        row.get("variety", "N/A"),
                "hectares":       row.get("hectares", 0),
                "current_ndvi":   round(row["ndvi_mean"], 3),
                "ndvi_threshold": stress_threshold,
                "severity":       severity,
                "date":           row.get("acquisition_date", "N/A"),
                "pixel_valid":    round(row["pixel_valid_pct"], 1),
            })

    return sorted(alerts, key=lambda x: x["current_ndvi"])


def send_alert_email(
    alerts: List[dict],
    recipient: str,
    sender: str,
    smtp_host: str = "smtp.gmail.com",
    smtp_port: int = 587,
) -> None:
    """Send NDVI alert email to the agronomist."""
    if not alerts:
        return

    body_lines = [
        f"Satellite NDVI Monitoring Alert — {alerts[0]['date']}\n",
        f"Plots under water stress: {len(alerts)}\n",
        "=" * 60,
        "",
    ]
    for a in alerts:
        body_lines.append(
            f"  [{a['severity']}] Plot {a['plot_id']} ({a['variety']}, {a['hectares']} ha)\n"
            f"    NDVI: {a['current_ndvi']} (threshold: {a['ndvi_threshold']}) | Valid pixels: {a['pixel_valid']}%\n"
        )

    msg = MIMEMultipart()
    msg["Subject"] = f"[NDVI ALERT] {len(alerts)} plots under stress — {alerts[0]['date']}"
    msg["From"]    = sender
    msg["To"]      = recipient
    msg.attach(MIMEText("\n".join(body_lines), "plain"))

    smtp_password = os.getenv("SMTP_PASSWORD")
    with smtplib.SMTP(smtp_host, smtp_port) as server:
        server.starttls()
        server.login(sender, smtp_password)
        server.send_message(msg)
        print(f"Alert sent to {recipient} for {len(alerts)} stressed plots")

ROI of the Satellite Monitoring System

Quantifying the Return on Investment of satellite monitoring is essential for justifying the technological investment to viticulture businesses. For the Manduria vineyard, the cost-benefit analysis for the 2024 season shows:

      ROI Analysis — Satellite Monitoring System, Manduria Vineyard (35 ha, 2024 Season)
      
        ItemValueNotes

        COSTS

            Sentinel-2 data access (CDSE)€0
            Free open data
          
            Weather API Open-Meteo€0
            Non-commercial free tier
          
            Python pipeline development€2,500
            30 dev hours × €83/h (one-time, amortised over 3 years)
          
            Cloud hosting (VM + storage)€600/year
            4 vCPU VM + 50 GB S3 storage
          
            Agronomic integration consultancy€1,000
            One-time, crop-specific threshold calibration
          
Total Year 1 Cost€4,100Subsequent years: ~€1,400/year
ESTIMATED BENEFITS (35 ha, 100,000 bottles/year)

            Irrigation reduction (timing optimisation)€2,800
            15% water saving, 8,500 m³ × €0.33/m³
          
            Reduced crop protection inputs€1,750
            Targeted intervention in at-risk zones only (−20% treatments)
          
            Grape quality improvement€8,500
            +0.5 mean Brix points in 12 ha, +€0.20/kg quality premium
          
            Water stress loss recovery€3,200
            8% yield recovery in 2 plots with predicted severe stress
          
Total Year 1 Benefits€16,250
Year 1 ROI296%Payback in approximately 3 months

Google Earth Engine: Cloud Processing for Large-Scale Analysis

For analysis at regional or national scale — for example, variety mapping across an entire DOC appellation, assessing spring frost damage at provincial scale, or monitoring CAP compliance for paying agencies — Google Earth Engine (GEE) offers cloud-native computing capacity that would otherwise require dedicated HPC infrastructure.

import ee

# Google Earth Engine authentication (requires GEE account)
ee.Authenticate()
ee.Initialize(project="your-gee-project-id")

def calculate_ndvi_gee(
    aoi_geojson: dict,
    start_date: str,
    end_date: str,
    cloud_threshold: int = 20,
) -> ee.ImageCollection:
    """
    Compute NDVI at large scale using Google Earth Engine.
    Suited for analysis across DOC appellations or entire regions.

    Args:
        aoi_geojson: GeoJSON of the area of interest
        start_date: "YYYY-MM-DD"
        end_date:   "YYYY-MM-DD"
        cloud_threshold: Maximum cloud cover percentage (0-100)

    Returns:
        ImageCollection with NDVI computed for each scene
    """
    aoi = ee.Geometry(aoi_geojson)

    def mask_s2_clouds(image: ee.Image) -> ee.Image:
        """Mask clouds using the Sentinel-2 QA60 band."""
        qa = image.select("QA60")
        cloud_bit_mask = 1 << 10   # bit 10: opaque clouds
        cirrus_bit_mask = 1 << 11  # bit 11: cirrus clouds
        mask = qa.bitwiseAnd(cloud_bit_mask).eq(0).And(
               qa.bitwiseAnd(cirrus_bit_mask).eq(0))
        return image.updateMask(mask).divide(10000)  # scale to [0,1]

    def add_ndvi(image: ee.Image) -> ee.Image:
        """Add NDVI band to the Sentinel-2 image."""
        ndvi = image.normalizedDifference(["B8", "B4"]).rename("NDVI")
        return image.addBands(ndvi)

    collection = (
        ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
        .filterBounds(aoi)
        .filterDate(start_date, end_date)
        .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", cloud_threshold))
        .map(mask_s2_clouds)
        .map(add_ndvi)
    )

    return collection


# Compute seasonal NDVI median mosaic (cloud-free composite)
def get_seasonal_ndvi_mosaic(
    collection: ee.ImageCollection,
    aoi: ee.Geometry,
) -> ee.Image:
    """Create seasonal NDVI mosaic (median is robust to residual outliers)."""
    ndvi_mosaic = collection.select("NDVI").median().clip(aoi)
    return ndvi_mosaic


# Example: Primitivo di Manduria DOC appellation (~21,000 ha)
aoi_doc_manduria = {
    "type": "Polygon",
    "coordinates": [[
        [17.2, 40.2], [17.8, 40.2], [17.8, 40.6],
        [17.2, 40.6], [17.2, 40.2],
    ]],
}

collection_summer = calculate_ndvi_gee(
    aoi_geojson=aoi_doc_manduria,
    start_date="2024-07-01",
    end_date="2024-08-31",
    cloud_threshold=15,
)
print(f"Available scenes: {collection_summer.size().getInfo()}")

# Export NDVI mosaic to Google Drive (for subsequent local Python analysis)
ndvi_mosaic = get_seasonal_ndvi_mosaic(collection_summer, ee.Geometry(aoi_doc_manduria))
task = ee.batch.Export.image.toDrive(
    image=ndvi_mosaic,
    description="NDVI_DOC_Manduria_Summer2024",
    folder="AgriTech_Export",
    fileNamePrefix="ndvi_manduria_summer2024",
    region=ee.Geometry(aoi_doc_manduria),
    scale=10,
    crs="EPSG:4326",
    maxPixels=1e10,
)
task.start()
print(f"Export started — Task ID: {task.id}")

Regulatory Framework and Open Data for AgriTech in Italy

Italy's agricultural open data ecosystem is structured around three main actors: AGEA (Agency for Agricultural Disbursements), the SIAN (National Agricultural Information System), and connected regional portals. In February 2025, AGEA completed the migration of SIAN to the National Strategic Pole (Polo Strategico Nazionale), updating the cloud infrastructure and expanding geospatial data processing capabilities.

      Main Italian Agricultural Open Data Sources
      
        SourceURLAvailable DataFormatUpdate Frequency

        
            SIAN Open Data
            dati.sian.it
            High-resolution orthophotos, AI-based Land Use Map, Farm Graphic Plans
            GeoTIFF, Shapefile, WFS
            Annual
          

            AGEA EO Services
            sian.it
            Multispectral orthophotos, X-band radar imagery, DEM, DSM
            GeoTIFF, ECW
            Periodic (campaign)
          

            ISTAT — Agricultural Statistics
            istat.it/agricoltura
            UAA, production, farm structure, agricultural census
            CSV, JSON, SDMX
            Annual
          

            National Geoportal MATTM
            geoportale.gov.it
            Carta Natura, Corine Land Cover, IDT, DBT
            WMS, WFS, Shapefile
            Variable
          

            Regional ARPA Networks (15 regions)
            arpa.[region].it
            Station climate/weather data, regional forecast models
            CSV, JSON, NetCDF
            Near real-time
          

            Copernicus Land Service
            land.copernicus.eu
            CORINE Land Cover, Grassland, Water and Wetness, Urban Atlas
            GeoTIFF, Shapefile
            Triennial/annual
          

      
    

PNRR Transizione 5.0 Tax Credits for AgriTech

Investments in satellite monitoring systems and precision agriculture can qualify for benefits under Piano Transizione 5.0 (Law 207/2024 — 2025 Budget Law). Agricultural businesses, admitted from 2025, can access tax credits for investment in 4.0 tangible and intangible capital goods, including farm management software with artificial intelligence and geospatial data analytics components.

Transizione 5.0 Tax Credit Rates for Agricultural Businesses (2025)

Investment Band	Energy Saving 3–6%	Saving 6–10%	Saving >10%
Up to €2.5M	35%	40%	45%
€2.5M to €10M	15%	20%	25%
€10M to €50M	5%	10%	15%

Precision irrigation systems driven by NDVI and weather forecasts can qualify for the "energy saving >6%" band if they document a reduction in water consumption (pumping energy) versus the baseline. A certified technical report is required.

Production Architecture: Best Practices and Anti-Patterns

Having described the components, it is important to synthesise best practices for a robust production implementation and identify the common anti-patterns that lead to fragile or inaccurate systems.

Best Practices — AgriTech Satellite Monitoring System

Progressive cloud filtering: Do not rely solely on the product-level cloud cover percentage (Level-1 metadata). Always apply pixel-level masking with QA60 (Sentinel-2) or BQA (Landsat). A product at 5% cloud cover may have a specific plot completely obscured by a localised cloud.
Cloud Optimized GeoTIFF (COG): Archive NDVI rasters in COG format for efficient range-request access from S3/MinIO. Avoid classic GeoTIFFs not optimised for cloud storage.
Consistent projection: Standardise EVERYTHING to EPSG:4326 (WGS84) for final data, or to UTM Zone 32N (EPSG:32632) for Italy to preserve correct metric measurements. Never mix CRS without explicit reprojection.
Ground truth IoT validation: Calibrate NDVI thresholds against field measurements (soil moisture sensors, portable LAI meter, field-collected agronomic data). NDVI alone without local calibration can yield 15–30% false-positive/negative alerts.
Historical archive management: Maintain at least 3–5 years of NDVI time series for each plot. The NDVI anomaly relative to the historical mean (same month, same crop) is far more informative than the absolute value.
Rate limiting and caching: CDSE and Open-Meteo have rate limits. Always implement local caching (file-based or Redis) to avoid redundant downloads. A single Sentinel-2 download takes 30–120 seconds — never request the same data twice.
Logging and audit trail: Every NDVI computation must be traceable to the source satellite scene, acquisition date, cloud cover percentage, and algorithm version. Essential for CAP/AGEA audits and for debugging anomalies.

Common Anti-Patterns — AgriTech Satellite Systems

Skipping atmospheric correction: Using Level-1C (top-of-atmosphere) instead of Level-2A (bottom-of-atmosphere) introduces systematic NDVI errors on the order of 10–20% under conditions of high atmospheric humidity (Po Valley fog, Adriatic proximity). ALWAYS use L2A for quantitative applications.
NDVI as the sole indicator: Relying exclusively on NDVI while ignoring EVI (for dense canopies), NDWI (early water stress), and Red Edge NDVI (early stress before it is visible in standard NDVI) is insufficient. A professional system uses at least 3–4 indices in combination.
Pixel size vs. plot size mismatch: Applying NDVI to plots smaller than 0.3 ha at 10-metre resolution produces statistically unreliable means (fewer than 30 pixels). For small plots, consider Planet Labs (3.7 m) or drone-mounted sensors.
Missing temporal compositing: Taking the single available acquisition for a date — even at 19% cloud cover — leads to NDVI artefacts in partially cloudy zones. Always use medoid compositing over a 10–15-day temporal window.
Pipeline without data versioning: Recomputing NDVI with different Python/rasterio/sentinelhub versions produces slightly different values for the same source data. Version computation pipelines with DVC or equivalent.

Complete Pipeline: Orchestration with Prefect

Integrating all the described components, the complete satellite monitoring pipeline can be orchestrated with Prefect (see the Pipeline Orchestration series) for automatic execution every 5 days — aligned with the Sentinel-2 revisit frequency.

from prefect import flow, task
from prefect.schedules import CronSchedule
from datetime import datetime, timedelta
import logging

logger = logging.getLogger(__name__)

@task(retries=3, retry_delay_seconds=60, name="search-sentinel2-products")
def search_products_task(bbox: tuple, lookback_days: int = 7) -> list[dict]:
    """Prefect task: search for recent Sentinel-2 products."""
    end_date = datetime.utcnow()
    start_date = end_date - timedelta(days=lookback_days)
    products = search_sentinel2_products(
        bbox=bbox,
        start_date=start_date.strftime("%Y-%m-%dT00:00:00.000Z"),
        end_date=end_date.strftime("%Y-%m-%dT23:59:59.000Z"),
        cloud_cover_max=20.0,
    )
    logger.info(f"Found {len(products)} Sentinel-2 products")
    return products


@task(retries=2, retry_delay_seconds=120, name="download-compute-ndvi")
def compute_ndvi_task(
    product: dict,
    aoi_bbox: tuple,
    output_dir: str = "/data/ndvi",
) -> str:
    """Prefect task: download bands and compute NDVI for one product."""
    # Implementation based on functions defined in previous sections
    acquisition_date = product["ContentDate"]["Start"][:10]
    output_path = f"{output_dir}/ndvi_{acquisition_date}.tif"

    # Download + compute (code omitted for brevity, uses functions above)
    logger.info(f"NDVI computed for {acquisition_date}: {output_path}")
    return output_path


@task(name="zonal-stats-postgis")
def zonal_stats_task(ndvi_path: str, plots_shapefile: str) -> dict:
    """Prefect task: compute zonal statistics and persist to PostGIS."""
    gdf_plots = gpd.read_file(plots_shapefile).to_crs("EPSG:4326")
    acquisition_date = ndvi_path.split("ndvi_")[1].replace(".tif", "")

    gdf_ndvi = compute_zonal_ndvi(ndvi_path, gdf_plots, acquisition_date)

    engine = create_engine(os.getenv("POSTGIS_URL"))
    gdf_ndvi.to_postgis("ndvi_historical", engine, if_exists="append", index=False)

    alerts = check_ndvi_alerts(gdf_ndvi)
    return {"plots_processed": len(gdf_ndvi), "alerts": len(alerts), "alert_details": alerts}


@task(name="send-alerts")
def alerts_task(stats: dict) -> None:
    """Prefect task: send alert email if any alerts are present."""
    if stats["alerts"] > 0:
        send_alert_email(
            alerts=stats["alert_details"],
            recipient=os.getenv("AGRONOMIST_EMAIL"),
            sender=os.getenv("ALERT_EMAIL"),
        )
        logger.info(f"Alerts sent for {stats['alerts']} plots")


@flow(
    name="satellite-monitoring-pipeline",
    schedule=CronSchedule(cron="0 8 */5 * *"),  # every 5 days at 08:00
)
def satellite_monitoring_flow(
    bbox: tuple = (17.35, 40.28, 17.60, 40.50),
    plots_shapefile: str = "/data/plots_vineyard.shp",
) -> None:
    """
    Complete satellite monitoring pipeline for the vineyard.
    Runs every 5 days, aligned with the Sentinel-2 revisit frequency.
    """
    products = search_products_task(bbox=bbox)

    if not products:
        logger.warning("No products found in the last period")
        return

    # Process the most recent product with minimum cloud cover
    best_product = sorted(
        products,
        key=lambda p: next(
            (a["Value"] for a in p.get("Attributes", []) if a["Name"] == "cloudCover"), 100
        ),
    )[0]

    ndvi_path = compute_ndvi_task(best_product, bbox)
    stats = zonal_stats_task(ndvi_path, plots_shapefile)
    alerts_task(stats)

    logger.info(
        f"Pipeline complete: {stats['plots_processed']} plots, {stats['alerts']} alerts"
    )


if __name__ == "__main__":
    satellite_monitoring_flow()

Conclusions and Next Steps

Sentinel-2 satellite data, freely accessible through CDSE, is arguably the most underutilised data source in Italian AgriTech. With Python, rasterio, sentinelhub, and a CDSE account, you can build in just a few days an NDVI monitoring system that outperforms many commercial paid solutions in quality and granularity.

The key to success is not technological sophistication but local agronomic calibration: NDVI thresholds, alert time windows, and feature weights in the predictive model must be tuned to the specific crop, local climate, and the farm's agronomic practices. A system calibrated for grapevine in Puglia will not perform equally well for wheat in the Po Valley without human calibration intervention.

Key takeaways for anyone building a similar system:

Always use Sentinel-2 Level-2A (atmospheric correction applied) for quantitative comparisons across different dates.
Combine NDVI with at least one complementary index (EVI for dense vineyards, NDWI for early water stress, Red Edge NDVI for chlorophyll monitoring).
Integrate Open-Meteo ERA5-Land for the water balance: ET0 + rainfall + model-derived soil moisture is an excellent proxy when you do not have IoT sensors in the field.
Maintain a historical series of at least 3 years per plot: the anomaly relative to the historical mean is far more actionable than an absolute NDVI value.
Scale with Google Earth Engine only when you need to process regional or national scales: for individual farms, the local Python pipeline on CDSE is more controllable and avoids vendor lock-in.

Continue the FoodTech Series

You have built the satellite pipeline. Now explore the other articles in the series to complete the AgriTech architecture:

Article 1: IoT Pipeline for Precision Agriculture with Python and MQTT — How to integrate ground-level sensors with the satellite pipeline.
Article 2: Edge ML for Crop Monitoring: Computer Vision in the Field — Computer vision models for early detection of crop diseases.
Article 4: Blockchain Traceability in Food: From Field to Supermarket — How NDVI data feeds quality certificates on-chain.

To dive deeper into MLOps and the deployment of the predictive models described in this article, see the MLOps for Business with MLflow series and the Enterprise LLM: RAG and Fine-Tuning series.

Band	Wavelength (nm)	Resolution	Primary Application
B1 - Coastal aerosol	443	60 m	Air quality, atmospheric correction
B2 - Blue	490	10 m	Vegetation/soil discrimination, mapping
B3 - Green	560	10 m	Vegetation vigour, leaf colour
B4 - Red	665	10 m	Chlorophyll, NDVI (red band)
B5 - Vegetation Red Edge	705	20 m	Vegetation stress, Red Edge NDVI
B6 - Vegetation Red Edge	740	20 m	Chlorophyll content, species classification
B7 - Vegetation Red Edge	783	20 m	Biomass, LAI
B8 - NIR	842	10 m	NDVI (NIR band), biomass, LAI
B8a - Vegetation Red Edge	865	20 m	Canopy structure, narrow NDVI
B9 - Water vapour	940	60 m	Atmospheric water vapour correction
B10 - SWIR - Cirrus	1375	60 m	Cirrus cloud detection
B11 - SWIR	1610	20 m	Water stress, leaf water content
B12 - SWIR	2190	20 m	Advanced water stress, senescence

Platform	Spatial Resolution	Revisit Frequency	Spectral Bands	Cost	Data Latency	API
Sentinel-2 (ESA/Copernicus)	10 m (vis/NIR), 20 m (SWIR)	5 days (2–3 in Europe)	13 MSI bands	Free (open data)	~3 hours from acquisition	CDSE OData, STAC, SentinelHub, OpenEO
Planet PlanetScope	3.7 m	Daily (near-daily)	8 bands (PS2.SD)	Commercial subscription; free for research	24 hours	Planet API v1, Subscriptions API
Landsat 8/9 (NASA/USGS)	30 m (multispectral), 15 m (pan)	16 days	11 OLI/TIRS bands	Free (open data)	24–48 hours	USGS EarthExplorer, Google Earth Engine
MODIS (NASA Terra/Aqua)	250 m / 500 m / 1 km	1–2 days	36 bands	Free (open data)	24–48 hours	NASA EarthData STAC, Google Earth Engine
Google Earth Engine	Dataset-dependent (10 m–1 km)	Dataset-dependent	Multi-dataset integrated	Free non-commercial; paid commercial	Immediate cloud processing	Python ee, JavaScript API
Maxar WorldView-3	0.3 m (pan), 1.24 m (multi)	1–4 days	29 bands (CAVIS)	~$25–40/km²	4–8 hours	Maxar Streaming API
Airbus Pleiades Neo	0.3 m	1–2 days	6 multispectral bands	~€15–30/km²	24–48 hours	OneAtlas API

NDVI Range	General Interpretation	Wheat (Triticum)	Grapevine (Vitis)	Tomato (Solanum)	Maize (Zea mays)	Olive (Olea)
< 0.1	Bare soil, rock, snow	Unseeded ground	Winter, pre-bud break	Post-harvest	Pre-sowing	Bare inter-row soil
0.1 - 0.2	Very sparse or dry vegetation	Initial emergence	Dormancy / severe stress	Recent transplant	Emergence (VE)	Severe water/thermal stress
0.2 - 0.4	Sparse vegetation, moderate stress	Early tillering	Pre-flowering, slight stress	Early vegetative growth	Stages V1–V3	Sparse canopy, stress present
0.4 - 0.6	Moderate vegetation, good health	Full stem extension	Pre-flowering / fruit set	Active vegetative development	Stages V4–V8	Normal canopy, well irrigated
0.6 - 0.8	Dense vegetation, excellent vigour	Heading (seasonal peak)	Full foliation, veraison	Maximum cover (flowering)	Stages V10–VT (tasselling)	Dense canopy, optimal condition
> 0.8	Very dense vegetation, forest	Rare (field-edge woodland)	Hedgerows and boundary trees	Greenhouses with full cover	Rare optimal conditions	High-density olive groves

Provider	Price	Spatial Resolution	Forecast	Agro Variables	Historical Archive	API
Open-Meteo	Free (non-commercial); ~$15/month commercial	1–11 km (multi-model)	16 days	ET0, soil temp, soil moisture, vapour pressure	1940–present (ERA5)	REST JSON, Python client
OpenWeatherMap	Free up to 60 calls/min; $40/month pro	2.5 km (ICON)	5 days (free), 16 days (pro)	Limited, no ET0	Paid history only	REST JSON, Python SDK
Tomorrow.io	Free 500 calls/day; $199/month core	1 km	21 days	SoilMoisture, ET0, spray conditions, pest risk	6 years	REST, WebSocket, gRPC
ARPA Regional (Italy, 15 regions)	Free (regional open data)	Point stations (~1,500 stations across Italy)	No (observed data only)	All agro-meteorological variables	Full archive (decades)	WMS, REST, CSV/JSON download
Meteoam / CFS/ECMWF	ECMWF open data free; CFS NOAA free	9–25 km	10–45 days	Basic meteorological variables	1940–present via ERA5	ECMWF API (Python), CDS API
Visual Crossing	Free 1,000 records/day; $35/month standard	Station-interpolated	15 days	ET0, Growing Degree Days, frost risk	Unlimited paid	REST JSON, CSV, SQL-like

Parameter	Value
Location	Manduria (TA), Puglia — lat 40.38, lon 17.47
Total area	35 hectares divided into 12 plots
Main varieties	Primitivo (80%), Negroamaro (20%)
Training systems	Bush vine — alberello (8 plots), VSP — controspalliera (4 plots)
Planting year	2003–2015 (mixed young/mature vines)
Irrigation	Drip irrigation (not all plots)
Monitoring satellite	Sentinel-2A/B, L2A, 10m resolution
Acquisitions used	28 scenes in 2024 season (cloud < 20%)
Integrated IoT sensors	6 weather stations, 18 soil moisture sensors at 30/60 cm
Analysis period	April – October 2024

Plot	Variety	NDVI Apr	NDVI Jun	NDVI Jul (peak)	NDVI Aug	NDVI Sep	Irrigation Alert
Block A1	Primitivo / bush vine	0.28	0.42	0.58	0.41	0.31	Moderate stress August
Block A2	Primitivo / bush vine	0.22	0.36	0.47	0.29	0.24	Severe stress Aug–Sep
Block B1	Primitivo / VSP	0.31	0.53	0.66	0.57	0.44	No alert
Block B2	Primitivo / VSP	0.29	0.49	0.63	0.52	0.40	No alert
Block C1	Negroamaro / bush vine	0.25	0.44	0.55	0.38	0.28	Moderate stress August
Block C2	Negroamaro / VSP	0.30	0.51	0.61	0.50	0.38	No alert

Item	Value	Notes
COSTS
Sentinel-2 data access (CDSE)	€0	Free open data
Weather API Open-Meteo	€0	Non-commercial free tier
Python pipeline development	€2,500	30 dev hours × €83/h (one-time, amortised over 3 years)
Cloud hosting (VM + storage)	€600/year	4 vCPU VM + 50 GB S3 storage
Agronomic integration consultancy	€1,000	One-time, crop-specific threshold calibration
Total Year 1 Cost	€4,100	Subsequent years: ~€1,400/year
ESTIMATED BENEFITS (35 ha, 100,000 bottles/year)
Irrigation reduction (timing optimisation)	€2,800	15% water saving, 8,500 m³ × €0.33/m³
Reduced crop protection inputs	€1,750	Targeted intervention in at-risk zones only (−20% treatments)
Grape quality improvement	€8,500	+0.5 mean Brix points in 12 ha, +€0.20/kg quality premium
Water stress loss recovery	€3,200	8% yield recovery in 2 plots with predicted severe stress
Total Year 1 Benefits	€16,250
Year 1 ROI	296%	Payback in approximately 3 months

Source	URL	Available Data	Format	Update Frequency
SIAN Open Data	dati.sian.it	High-resolution orthophotos, AI-based Land Use Map, Farm Graphic Plans	GeoTIFF, Shapefile, WFS	Annual
AGEA EO Services	sian.it	Multispectral orthophotos, X-band radar imagery, DEM, DSM	GeoTIFF, ECW	Periodic (campaign)
ISTAT — Agricultural Statistics	istat.it/agricoltura	UAA, production, farm structure, agricultural census	CSV, JSON, SDMX	Annual
National Geoportal MATTM	geoportale.gov.it	Carta Natura, Corine Land Cover, IDT, DBT	WMS, WFS, Shapefile	Variable
Regional ARPA Networks (15 regions)	arpa.[region].it	Station climate/weather data, regional forecast models	CSV, JSON, NetCDF	Near real-time
Copernicus Land Service	land.copernicus.eu	CORINE Land Cover, Grassland, Water and Wetness, Urban Atlas	GeoTIFF, Shapefile	Triennial/annual