Foundations of Amateur Radio
Recently I saw a social media post featuring a screenshot of some random website with pretty charts and indicators describing "current HF propagation". Aside from lacking a date, it helpfully included notations like "Solar Storm Imminent" and "Band Closed".
It made me wonder, not for the first time, what the reliability of this type of notification is. Does it actually indicate what you might expect when you get on air to make noise? Is it globally relevant? Is the data valid, or real-time? You get the idea.
How do you determine the relationship between this pretty display and reality?
Immediately the WSPR or Weak Signal Propagation Reporter database came to mind. It's a massive collection of signal reports capturing time, band, station and other parameters, one of which is the Signal-to-Noise Ratio or SNR.
If a change in the number of sun spots, or in a geomagnetic index, affected propagation, could we see an effect on the SNR?
Although there's close on a million records per day, I'll note in advance that my current approach of taking a daily average across all reports on a specific band completely ignores the number of reports, the types and direction of antennas, the distance between stations, transmitter power, local noise or any number of other variables.
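To make that averaging concrete, here is a minimal sketch in Python. It assumes each report has already been boiled down to a date, a band and an SNR value. The sample rows and the function name `daily_band_average` are made up for illustration and are not taken from the actual analysis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample reports: (date, band in metres, SNR in dB).
reports = [
    ("2024-01-01", 20, -12),
    ("2024-01-01", 20, -20),
    ("2024-01-01", 40, -5),
    ("2024-01-02", 20, -18),
]


def daily_band_average(rows):
    """Average the SNR for each (date, band) pair, ignoring everything else."""
    groups = defaultdict(list)
    for day, band, snr in rows:
        groups[(day, band)].append(snr)
    return {key: mean(values) for key, values in groups.items()}


averages = daily_band_average(reports)
for (day, band), snr in sorted(averages.items()):
    print(day, band, snr)
```

Note that a day with one report and a day with a million reports each end up as a single number, which is exactly the limitation described above.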
Using the online "wspr.live" database, looking only at 2024, I linked the daily recorded WSPR SNR average per band to the Sun Spot Numbers and Geomagnetic Index and immediately ran into problems. For starters, the daily Sun Spot Number or SSN, from the Royal Observatory of Belgium, does not appear to be complete. I'm not yet sure why.
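Linking the three data sets comes down to matching them on the date. The sketch below assumes each series has already been reduced to one value per day and stored in a dictionary keyed by date. The values and the name `link_by_date` are invented for illustration, and this is not how the "wspr.live" database itself is queried.

```python
# Hypothetical daily values, keyed by ISO date string.
snr_by_day = {"2024-01-01": -16.0, "2024-01-02": -18.0, "2024-01-03": -15.5}
ssn_by_day = {"2024-01-01": 120.0, "2024-01-03": 131.0}  # 2 January missing
kp_by_day = {"2024-01-01": 2.0, "2024-01-02": 3.3, "2024-01-03": 1.7}


def link_by_date(snr, ssn, kp):
    """Keep only the dates present in all three series (an inner join)."""
    common = sorted(snr.keys() & ssn.keys() & kp.keys())
    return [(day, snr[day], ssn[day], kp[day]) for day in common]


linked = link_by_date(snr_by_day, ssn_by_day, kp_by_day)
for row in linked:
    print(row)
```

Any day missing from one series silently drops out of the result, so a gap in the SSN data also removes that day's SNR and geomagnetic values from the comparison.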
For example, there's only 288 days of SSN data in 2024. Does this mean that the observers were on holiday on the other 78 days, or was the SSN zero? Curiously there's 60 days where there's more than one recording and as a bonus, on New Year's Eve 2024, there's three recordings, all with the same time stamp, midnight, with 181, 194 and 194 sun spots, so I took the daily average. Also, I ignored the timezone, since that's not apparent.
Similarly, the Geomagnetic Index data from the Helmholtz Centre for Geosciences in Potsdam, Germany has several weird artefacts around 1970s data, but fortunately not within 2024 that I saw. The data is collected every three hours, so I averaged that, too.
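A rough sketch of that kind of audit, in Python, might look like this. It uses the three New Year's Eve values mentioned above as sample input. The function names and the record layout are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# The three New Year's Eve 2024 recordings, as (day, sun spot number).
records = [
    (date(2024, 12, 31), 181),
    (date(2024, 12, 31), 194),
    (date(2024, 12, 31), 194),
]


def group_by_day(rows):
    """Collect every recording for a day, so duplicates become visible."""
    groups = defaultdict(list)
    for day, ssn in rows:
        groups[day].append(ssn)
    return groups


def missing_days(days_with_data, year):
    """List every calendar day in the year that has no recording."""
    day = date(year, 1, 1)
    missing = []
    while day.year == year:
        if day not in days_with_data:
            missing.append(day)
        day += timedelta(days=1)
    return missing


grouped = group_by_day(records)
duplicates = [day for day, values in grouped.items() if len(values) > 1]
daily = {day: mean(values) for day, values in grouped.items()}
gaps = missing_days(daily, 2024)

print(len(daily), "days with data,", len(duplicates), "with duplicates,",
      len(gaps), "missing")
```

Since 2024 is a leap year with 366 days, 288 days with data leaves 78 missing, which matches the count above.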
After excluding days where the SSN was missing, I ran into the next issue: my database query was too big. That's understandable, since there are many reports in this database, 2 billion, give or take, for 2024 alone.
Normally I'd be running this type of query on my own hardware, but you might know that I lost my main research computer last year. Well, I didn't lose it as such, I can see it from where I am right now, but it won't power up. Money aside, I've been working on it, but being unceremoniously moved from Intel to ARM is not something I'd recommend.
I created a script that extracted the data, one day at a time, with