Solar Radio Data on Cloud
OVRO-LWA solar data products are published on AWS so users can browse, download, and analyze observations without the staging workflow required by the on-site portal.
Original data portal
The legacy entry point for preview and query is the OVRO LWA Data Portal.
That portal is useful for discovery, but two practical limits motivated a cloud mirror:
- Shared uplink bandwidth — the whole observatory shares about 1 Gbps upload capacity, so large transfers compete with other services.
- Multi-step workflow — users must stage → download → process on their own machine, which is slow and inconvenient for routine science or classroom use.
AWS data products
Solar radio data products are mirrored to a public S3 bucket:
- Bucket explorer: ovro-lwa-solar on AWS S3
On AWS you can:
- Download files directly from S3 at high throughput, without competing for the observatory uplink.
- Stream or load data programmatically in Python (no manual staging on the observatory server).
- Integrate the same URLs into notebooks, pipelines, and Colab workflows.
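As a rough illustration of the last point, an object in a public bucket can usually be addressed by a plain HTTPS URL. The sketch below builds one from a key, assuming the generic virtual-hosted-style S3 endpoint (`<bucket>.s3.amazonaws.com`). The helper `s3_public_url` is hypothetical and not part of any package mentioned on this page, the example key is only assumed from the prefix and filename conventions described here, and the bucket's region-specific endpoint may differ.

```python
from urllib.parse import quote

BUCKET = "ovro-lwa-solar"


def s3_public_url(key: str, bucket: str = BUCKET) -> str:
    """Build an HTTPS URL for a public S3 object (hypothetical helper)."""
    # Percent-encode the key but keep "/" so the prefix structure survives.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


# Assumed example key; check the bucket explorer for the real layout.
example_key = (
    "image_hdf/2026/04/05/"
    "ovro-lwa-352.lev1_mfs_10s.2026-04-05T200001Z.image_I.hdf"
)
print(s3_public_url(example_key))
```

A URL built this way can be passed to any tool that reads over HTTPS, provided the object exists under that key.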
Interactive example (Google Colab)
A full walkthrough notebook is available on Google Colab. It shows end-to-end examples of listing objects, querying by time, polarization, and format, downloading, and plotting products from the bucket.
Install the helper package (same API as in the notebook):
pip install "git+https://github.com/ovro-eovsa/lwa-solar-util.git"
import lwasolarutl as lsu
Code snippets
The examples below mirror the Colab notebook. The bucket ovro-lwa-solar allows unsigned (anonymous) public reads, and image products live under prefixes like image_hdf/YYYY/MM/DD/.
(1) List files in a given directory with boto3
import boto3
from botocore import UNSIGNED
from botocore.config import Config
import lwasolarutl as lsu
bucket = "ovro-lwa-solar"
date = "20260405" # YYYYMMDD or YYYY-MM-DD
prefix = lsu.aws.date_to_s3_prefix(date) # e.g. image_hdf/2026/04/05/
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
You can also reuse the package client: s3 = lsu.aws.get_s3_client().
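For readers who want to see how a date maps onto the image_hdf/YYYY/MM/DD/ layout without installing the package, here is a minimal standard-library sketch. The function `date_to_prefix` is a hypothetical stand-in written for illustration; it is not the implementation of lsu.aws.date_to_s3_prefix, and the default root `image_hdf` is taken from the prefix example on this page.

```python
from datetime import datetime


def date_to_prefix(date: str, root: str = "image_hdf") -> str:
    """Turn 'YYYYMMDD' or 'YYYY-MM-DD' into 'root/YYYY/MM/DD/' (illustrative)."""
    for fmt in ("%Y%m%d", "%Y-%m-%d"):
        try:
            day = datetime.strptime(date, fmt)
        except ValueError:
            continue  # try the next accepted format
        return f"{root}/{day:%Y/%m/%d}/"
    raise ValueError(f"unrecognized date: {date!r}")


print(date_to_prefix("20260405"))    # image_hdf/2026/04/05/
print(date_to_prefix("2026-04-05"))  # image_hdf/2026/04/05/
```

Parsing through `datetime.strptime` also rejects impossible dates such as month 13, which a plain string slice would let through.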
(2) Query files by time range, polarization, and format (lsu)
get_flist_from_s3 filters level-1 image products on a given day. Filenames look like:
ovro-lwa-352.lev1_mfs_10s.2026-04-05T200001Z.image_I.hdf
import lwasolarutl as lsu
keys = lsu.aws.get_flist_from_s3(
    date="20260405",
    fmt="mfs",      # e.g. mfs, fch; matches lev1_{fmt}_10s in the filename
    pol="I",        # Stokes I, V, ...
    t0="19:00:00",  # optional time-of-day window (UTC in filename)
    t1="21:00:00",
)
for key in keys[:5]:
    print(key)
With metadata (timestamp parsed from the filename):
entries = lsu.aws.get_flist_from_s3(
    date="20260405",
    fmt="mfs",
    pol="I",
    t0="19:00:00",
    t1="21:00:00",
    return_metadata=True,
)
for e in entries[:3]:
    print(e["time"], e["key"])
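To make the filename convention concrete, the sketch below parses a name of the form shown above and applies a time-of-day window. It is an illustration only: the regular expression is inferred from the single example filename, `parse_key` and `in_window` are hypothetical helpers, and the structure of the returned dictionary is an assumption rather than the package's actual metadata format.

```python
import re
from datetime import datetime, time, timezone

# Pattern inferred from the example filename; other products may differ.
NAME_RE = re.compile(
    r"lev1_(?P<fmt>[a-z]+)_10s\."
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{6})Z\."
    r"image_(?P<pol>[A-Z]+)\.hdf$"
)


def parse_key(key: str):
    """Return fmt, pol and a UTC datetime for a key, or None if it does not match."""
    m = NAME_RE.search(key)
    if m is None:
        return None
    when = datetime.strptime(m["stamp"], "%Y-%m-%dT%H%M%S")
    return {
        "key": key,
        "fmt": m["fmt"],
        "pol": m["pol"],
        "time": when.replace(tzinfo=timezone.utc),
    }


def in_window(entry, t0: str, t1: str) -> bool:
    """True if the entry's time of day lies within [t0, t1], both 'HH:MM:SS'."""
    tod = entry["time"].time()
    return time.fromisoformat(t0) <= tod <= time.fromisoformat(t1)


entry = parse_key("ovro-lwa-352.lev1_mfs_10s.2026-04-05T200001Z.image_I.hdf")
print(entry["fmt"], entry["pol"], entry["time"].isoformat())
print(in_window(entry, "19:00:00", "21:00:00"))
```

Anchoring the pattern at the end of the string means the same function works on a bare filename or on a full S3 key with a prefix.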
(3) Download
Download one object by S3 key:
import lwasolarutl as lsu
key = keys[0] # from the query above
local_path = lsu.aws.download_s3_key(
    key,
    "cache/" + key.split("/")[-1],
    overwrite=False,
)
print("saved to", local_path)
Download the first readable HDF from a query result (skips corrupt files):
import lwasolarutl as lsu
entries = lsu.aws.get_flist_from_s3(
    date="20260405",
    fmt="mfs",
    pol="I",
    t0="19:00:00",
    t1="21:00:00",
    return_metadata=True,
)
entry, hdf_path = lsu.aws.download_readable_hdf(entries, cache_dir="cache")
print(entry["name"], "->", hdf_path)
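For context on what "readable" can mean, here is a minimal sketch of a cheap pre-check on already-downloaded files. It only tests for the 8-byte HDF5 signature at the start of the file, assuming no user block precedes it. This is not how download_readable_hdf is implemented, and a truncated file can still pass, so a full check would open the file with an HDF5 reader. Both function names are hypothetical.

```python
from pathlib import Path

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file


def looks_like_hdf5(path) -> bool:
    """Cheap sanity check: the file exists and starts with the HDF5 signature."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def first_plausible(paths):
    """Return the first path that passes the signature check, else None."""
    return next((p for p in paths if looks_like_hdf5(p)), None)
```

A check like this is useful mainly for catching empty files or HTML error pages saved under an .hdf name before handing them to a heavier reader.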
Recover FITS from HDF and plot (after download):
fits_path = lsu.recover_fits_from_h5(hdf_path, "cache/recovered.fits")
fig, axes = lsu.visualization.slow_pipeline_default_plot(fits_path, add_logo=False)
Suggested workflow
- Browse the S3 explorer to see the directory layout for a given day.
- Use lsu.aws.get_flist_from_s3(...) to query by date, fmt, pol, and optional t0/t1, or run the full Colab notebook.
- Download with lsu.aws.download_s3_key or download_readable_hdf, then recover and plot with lsu.recover_fits_from_h5 and lsu.visualization.slow_pipeline_default_plot.
- For quick checks of recent activity, the OVRO LWA Data Portal and live spectrum stream remain helpful complements.