What is an anomaly?
Every day, the Global Historical Climatology Network collects temperatures from 90,000 weather stations. Dating back as far as the late 1700s, the records provide an incredible source of insight into our changing climate.
Using this data, we can determine what the weather is normally like for most places on Earth. We can tell you that the average low temperature in New York City on January 11th is 29°F and that the average high temperature in Los Angeles on July 24th is 80°F.
Once we know what temperatures to expect on any given day with a certain degree of confidence, we can sift out the uneventful days, leaving only anomalous weather events.
These criteria enabled us to track the last 50 years of temperature anomalies and categorize them into four types.
COLD anomalies occur on days when the daily high or low temperature falls below its expected range.
WARM anomalies occur on days when the daily high or low temperature rises above its expected range.
STRONG anomalies occur on those rare days when both the daily high and low temperatures rise above (STRONG WARM) or fall below (STRONG COLD) their expected ranges.
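The typology above can be sketched as a small function. This is a minimal illustration, not the code behind the analysis: the function name, the (lower, upper) tuple format, and the "MIXED" label for a day whose high is unusually warm while its low is unusually cold (or the reverse) are all assumptions, since the article does not say how such a day is handled.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound) of the expected range


def classify_day(high: float, low: float,
                 high_range: Range, low_range: Range) -> Optional[str]:
    """Label one day's high/low pair, or return None if nothing is anomalous."""
    high_cold = high < high_range[0]
    high_warm = high > high_range[1]
    low_cold = low < low_range[0]
    low_warm = low > low_range[1]

    # Both measurements out of range in the same direction: STRONG anomaly.
    if high_warm and low_warm:
        return "STRONG WARM"
    if high_cold and low_cold:
        return "STRONG COLD"

    # Hypothetical label: opposite directions are not covered by the article.
    if (high_warm and low_cold) or (high_cold and low_warm):
        return "MIXED"

    # Exactly one measurement out of range.
    if high_warm or low_warm:
        return "WARM"
    if high_cold or low_cold:
        return "COLD"
    return None


# Example: expected high range 40-70°F, expected low range 20-50°F.
print(classify_day(75, 45, (40, 70), (20, 50)))
print(classify_day(35, 15, (40, 70), (20, 50)))
```

The first call has only the high out of range, so it is a plain WARM day; the second has both measurements below range, so it is STRONG COLD.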
A clear trend
Since 1964, the proportion of WARM and STRONG WARM anomalies has risen from about 42% of the total to almost 67% of the total, an average increase of 0.5 percentage points per year. This trend, fitted with a generalized linear model, accounts for 40% of the year-to-year variation in warm versus cold anomalies, and is highly significant, with a p-value close to zero. Though we remain cautious about making predictions based on this model, it suggests that this yearly proportion of warm anomalies will regularly exceed 70% in the 2030s.
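To make the yearly proportion concrete, here is a rough sketch of how it could be computed from labeled anomalies and how a trend line could be fitted to it. It uses an ordinary least-squares line as a simpler stand-in for the generalized linear model mentioned above, and the function names and the (year, label) record format are assumptions made for illustration. The sample records are toy values, not the actual data.

```python
from collections import Counter

WARM_LABELS = {"WARM", "STRONG WARM"}


def warm_share_by_year(records):
    """records: iterable of (year, label) pairs, one per anomaly.

    Returns {year: percentage of that year's anomalies that are warm}.
    """
    totals, warm = Counter(), Counter()
    for year, label in records:
        totals[year] += 1
        if label in WARM_LABELS:
            warm[year] += 1
    return {year: 100.0 * warm[year] / totals[year] for year in sorted(totals)}


def linear_trend(shares):
    """Least-squares slope (points per year) and intercept of share vs. year."""
    years = list(shares)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares.values()) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (shares[x] - mean_y) for x in years)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy records only: 1 of 4 warm in 1964, 2 of 4 in 1965, 3 of 4 in 1966.
toy = (
    [(1964, "WARM")] + [(1964, "COLD")] * 3
    + [(1965, "WARM"), (1965, "STRONG WARM")] + [(1965, "COLD")] * 2
    + [(1966, "WARM")] * 3 + [(1966, "STRONG COLD")]
)
shares = warm_share_by_year(toy)
slope, intercept = linear_trend(shares)
print(shares, slope)
```

The slope is expressed in percentage points per year, the same unit as the rate quoted above.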
Yet despite the clarity of this trend, it is difficult to detect on a day-to-day basis. This past winter's polar vortex, for example, prompted some pundits to claim that global warming is a "hoax". But while daily and even annual weather trends may be subject to severe fluctuations, climate change shifts starkly into focus when viewed through a historical lens.
How we did it
Starting with the raw files on NOAA's FTP server, we downloaded all historical daily measurements from each of the 90,000 weather stations worldwide, standardized their schemas, and geocoded the stations' locations. The resulting 90 GB file contained nearly 900 million rows. To ensure data quality, we filtered this file to stations in the U.S. that had consistently collected data throughout the 50-year period from 1964 to 2013. This left us with 2,716 stations and 25 GB of data.
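The coverage filter might look something like the sketch below. The article does not define "consistently", so this version assumes a station qualifies if it reports at least one observation in every year from 1964 through 2013. The function name and the (station_id, year) input format are likewise hypothetical.

```python
from collections import defaultdict


def stations_with_full_coverage(observations, first_year=1964, last_year=2013):
    """observations: iterable of (station_id, year) pairs, one per daily record.

    Returns the set of station IDs that have data in every year of the period.
    Assumption: "consistent" means no year is entirely missing.
    """
    years_seen = defaultdict(set)
    for station_id, year in observations:
        if first_year <= year <= last_year:
            years_seen[station_id].add(year)
    required = last_year - first_year + 1  # 50 years, inclusive
    return {sid for sid, years in years_seen.items() if len(years) == required}


# Station "A" reports every year; station "B" is missing 1990.
obs = [("A", y) for y in range(1964, 2014)]
obs += [("B", y) for y in range(1964, 2014) if y != 1990]
print(stations_with_full_coverage(obs))
```

A stricter rule, such as a minimum number of observations per month, would slot into the same structure by changing what gets added to each station's set.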
Armed with this refined dataset, we computed the historical range of low and high temperatures for each station for each month of the year.
We then compared each station's daily temperatures to its corresponding monthly distribution. If one or both of these measurements fell in the bottom or top 2% of that distribution on a given day, we labeled it an "anomaly" according to the typology above. For instance, over the 50-year period, we identified 64 temperature anomalies in Roanoke, Alabama in the month of April, which has an average low of 46°F and high of 76°F.
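As an illustration of the two steps above, the sketch below derives a per-station, per-month expected range from the 2nd and 98th percentiles and then checks a reading against it. The function names, the input format, and the choice of percentile interpolation method are assumptions, as the article does not specify them. In practice the same routine would run twice, once on daily highs and once on daily lows.

```python
from collections import defaultdict
from statistics import quantiles


def monthly_ranges(readings):
    """readings: iterable of (station_id, month, temperature).

    Returns {(station_id, month): (2nd percentile, 98th percentile)}.
    Each group needs at least two readings.
    """
    grouped = defaultdict(list)
    for station_id, month, temp in readings:
        grouped[(station_id, month)].append(temp)

    ranges = {}
    for key, temps in grouped.items():
        # n=50 yields 49 cut points at 2%, 4%, ..., 98%.
        cuts = quantiles(temps, n=50, method="inclusive")
        ranges[key] = (cuts[0], cuts[-1])
    return ranges


def is_anomalous(temp, expected_range):
    """True if the reading falls in the bottom or top 2% tail."""
    lower, upper = expected_range
    return temp < lower or temp > upper


# Toy data: one station, April, temperatures 0 through 100.
toy_readings = [("STATION_X", 4, float(t)) for t in range(101)]
april = monthly_ranges(toy_readings)[("STATION_X", 4)]
print(april, is_anomalous(1.0, april), is_anomalous(50.0, april))
```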
Nationally, this process yielded about 3 million records, or just over 1,000 anomalous days per station.
By the numbers
(Interactive figures: most warm anomalies; warm anomalies in 2011; most anomalous day.)
Enigma.io is a search engine and API for public data. We find certain datasets to be especially powerful because the underlying phenomena they capture have such a fundamental impact on our world. Climate affects our agricultural production, the price of gasoline, the livelihood of small businesses or temporary farm workers, and ultimately the sustainability of our species on this planet.
On March 31st, the Intergovernmental Panel on Climate Change released a report reaffirming the scientific community's consensus that the "worst is yet to come." The report projects the effects of climate change on society in the coming decades and questions, for instance, the sustainability of our food supply. The White House also recently declared that "every citizen will be affected by climate change," and issued a call to action for the open exploration of climate data.
At Enigma, our priority is to enable the exploration of climate change by bringing the data to the table. We have come a long way since the earliest efforts to gather climate data. However, accessing the data can be a painful process, especially when datasets are siloed in scattered locations or stored in varied formats. Weather signals alone can be powerful indicators, but it is by placing them in relation to our economy and society that the impact of climate change becomes meaningful.
We are committed to connecting this entire world of public data to the developers, researchers, data scientists, and climatologists who are striving to make sense of our changing climate and its effect on our lives. We will support civic projects in any capacity, so get in touch if you're looking to make an impact; we'd love to help out. Join us in this effort and go grab some data.