Berkeley Earth New Data Release

Berkeley Earth has just released a new version of its dataset. It is more comprehensive than the version released in October 2011 and fixes some bugs in that initial release. You can access the new dataset here:

Berkeley Earth is an excellent example of the new movements in open data that allow for
citizen science initiatives as well as artists' use of scientific data.

The Berkeley Earth data set is now publicly available here.

The Berkeley Earth analysis programs are now publicly available here.
Summary charts, using all of the available data, are here.
A video showing the Berkeley Earth land temperature anomaly is here.

The Berkeley Earth Surface Temperature Study has created a preliminary
merged data set by combining 1.6 billion temperature
reports from 16 preexisting data archives. Whenever possible, we have
used raw data rather than previously homogenized or
edited data. After eliminating duplicate records, the current archive
contains 39,390 unique stations. This is more than five times
the 7,280 stations found in the Global Historical Climatology Network
Monthly data set (GHCN-M) that has served as the focus
of many climate studies. The GHCN-M is limited by strict requirements
for record length, completeness, and the need for nearly
complete reference intervals used to define baselines. We have
developed new algorithms that reduce the need to impose these
requirements (see methodology), and as such we have intentionally
created a more expansive data set.

We performed a series of tests to identify dubious data and merge
identical data coming from multiple archives. In general, our
process was to flag dubious data rather than simply eliminating it.
Flagged values were generally excluded from further analysis,
but their content is preserved for future consideration.
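
To make the flag-rather-than-delete idea concrete, here is a minimal Python sketch. It is not the Berkeley Earth code: the Reading record, the function names, the sample values, and the out-of-range thresholds are all hypothetical, chosen only to illustrate merging identical records from several archives and flagging dubious values while keeping them in the archive.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    """One monthly temperature report (hypothetical record layout)."""
    station_id: str
    month: str        # e.g. "1950-01"
    temp_c: float
    source: str       # archive the report came from
    flags: set = field(default_factory=set)


def merge_identical(readings):
    """Collapse reports that are identical across archives, keeping the first seen."""
    seen = {}
    for r in readings:
        key = (r.station_id, r.month, r.temp_c)
        if key not in seen:
            seen[key] = r
    return list(seen.values())


def flag_dubious(readings, low=-90.0, high=60.0):
    """Mark implausible values instead of deleting them.

    The low/high bounds are illustrative assumptions, not Berkeley Earth's tests.
    """
    for r in readings:
        if not (low <= r.temp_c <= high):
            r.flags.add("out_of_range")
    return readings


def usable(readings):
    """Select unflagged reports for analysis; flagged ones stay in the archive."""
    return [r for r in readings if not r.flags]


# Made-up sample data: one duplicate across archives and one implausible value.
readings = [
    Reading("ST001", "1950-01", -3.2, "archive_a"),
    Reading("ST001", "1950-01", -3.2, "archive_b"),
    Reading("ST001", "1950-02", 999.9, "archive_a"),
    Reading("ST002", "1950-01", 12.5, "archive_b"),
]

merged = flag_dubious(merge_identical(readings))
clean = usable(merged)
```

Here merged still holds the implausible report with its flag attached, while clean leaves it out. Keeping the flagged record means a later analysis can revisit the decision without going back to the source archives.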