Historical data recorded decades ago on flimsy paper, microfilm, or microfiche can easily be lost over time, calling for urgent intervention. An earlier rescue effort, NOAA’s Climate Database Modernization Program, made good progress but was unfortunately canceled in 2011. The International Environmental Data Rescue Organization (IEDRO) was established to pick up the slack, mobilizing technical and human resources towards rescue projects worldwide. Since then, IEDRO has saved millions of data records with the help of volunteers with a passion for and commitment to environmental data rescue.
Still, a vast number of records remain to be rescued, and saving them requires the engagement of the wider public. Here, we summarize the main steps of the data rescue process so that everyone can understand its best practices and practical challenges.

Data in books and microfilms
The data rescue process generally takes seven steps:
- Pinpointing where and which records are to be rescued:
The urgency and cost of environmental data rescue mean that not all records can be rescued. An estimated 500,000 records of crucial historical weather data are permanently lost every day, while it costs $10 to digitize one page of records and $2 to key in 100 observations. Data rescue efforts therefore have to prioritize the records that matter most for climate modeling, weather forecasting, or other meteorological applications.
Proceeding depends on the interest of the owners of the historical records, but an early step is verifying that the data of interest are not already available in digital form. IEDRO first checks with the National Centers for Environmental Information (NCEI) regarding its needs. In return, the rescued data will ultimately be stored in the NCEI database.
Sometimes IEDRO learns of the existence of records through personal connections, which can lead to significant results. For example, IEDRO once learned from a retired police officer that the Jesuit priests in Punta Arenas kept 504,000 local surface observations from 1870 to 1966 in a private museum, Museo Maggiorino Borgatello. The data enable the reconstruction of the atmosphere’s circulation in this very remote part of the world.
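To get a feel for the unit costs quoted above, here is a minimal back-of-the-envelope sketch. The function name `estimate_rescue_cost` is hypothetical, and the calculation assumes the two quoted figures simply add up, which a real project budget would not (travel, equipment, and training are ignored).

```python
# Unit costs taken from the figures quoted in this article.
COST_PER_PAGE_USD = 10.0      # digitizing one page of records
COST_PER_100_OBS_USD = 2.0    # keying in 100 observations


def estimate_rescue_cost(pages: int, observations: int) -> float:
    """Return a rough cost in US dollars (hypothetical helper)."""
    if pages < 0 or observations < 0:
        raise ValueError("counts must be non-negative")
    page_cost = pages * COST_PER_PAGE_USD
    keying_cost = observations / 100 * COST_PER_100_OBS_USD
    return page_cost + keying_cost


# Keying cost alone for the 504,000 Punta Arenas observations.
print(estimate_rescue_cost(pages=0, observations=504_000))  # 10080.0
```

Even for a single collection, keying alone runs to about ten thousand dollars under these assumptions, which is why prioritization comes first.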

Victims of Flooding in Malawi
- Planning the project:
IEDRO will communicate virtually and in person with the data owners, usually the local meteorological service, for multiple purposes: seeking their consent to save and freely share the data, assessing the local conditions, and devising a rescue plan to fit those conditions.
After a thorough discussion online, two volunteers will typically be on-site for a week. They need to confirm what kinds and frequencies of measurements were recorded, at which locations, and over which years. Also important are the medium on which the records are stored, their condition, and what resources are available to assist in the rescue, such as agency staff and their understanding of data rescue procedures. This visit is a necessary step before further planning, as the information from data owners could be inaccurate. For example, the meteorological service in Uzbekistan reported having about 315,000 pages of data, but IEDRO’s team counted 17 million on-site.
After the investigation, IEDRO will draft a detailed rescue plan, including people, training, equipment, management, quality check procedures, rescue sequence, and timeline. With a plan and cost estimate in hand, IEDRO can work with the data owner to find funds for project operations.
In-person visits are crucial to the operational success of a data rescue project, and they also provide an opportunity to further solidify the commitment of local leadership. Getting data owners’ agreement could be the most challenging part of the whole rescue process. Data owners have many concerns over digitizing or sharing their data, such as national security and potential revenue sources, even though much of the cost of the operation often comes from external sources and leaves the data owner with equipment worth hundreds or thousands of dollars.
- Setting up the site:
Once data owners agree on the rescue plan, IEDRO will assist in setting up the equipment and training the personnel for the actual data rescue work. The IEDRO team will purchase and install the computers, scanners/cameras, and memory cards and train the local volunteers or paid contributors on the best practices and workflows.

On-site Training: Mozambique Data Rescue Instruction by Dr. Wasila Thiaw NOAA
- Inventorying the records
The first step of the actual rescue work is inventorying the data records. IEDRO will work with technical staff in the local office to understand how the records were stored in the past and reorganize them by station, date, time, and type (e.g., temperature, precipitation, or hydrological data). For future data validation, the process should be tracked in an Excel table detailing the metadata and the number of pages, microfilms, or microfiche inventoried.
Meanwhile, the rescue team determines which part of the records should be prioritized, taking into account the local meteorological service’s needs and the deterioration of the records.
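As a rough illustration of what such a tracking table might hold, the sketch below writes inventory rows as CSV, which Excel can open. The column names, the `InventoryEntry` class, and the condition ranking are all assumptions made for this example, not IEDRO's actual template.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Hypothetical ranking: lower number = more deteriorated = rescue sooner.
CONDITION_RANK = {"poor": 0, "fair": 1, "good": 2}


@dataclass
class InventoryEntry:
    station_id: str
    data_type: str    # e.g. "temperature", "precipitation"
    start_date: str   # ISO date of the first record
    end_date: str     # ISO date of the last record
    medium: str       # "paper", "microfilm", or "microfiche"
    item_count: int   # pages or film items counted
    condition: str    # one of the CONDITION_RANK keys


def write_inventory(entries, handle):
    """Write the inventory as CSV with one header row."""
    names = [f.name for f in fields(InventoryEntry)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


def rescue_order(entries):
    """Sort so that the most deteriorated records come first."""
    return sorted(entries, key=lambda e: CONDITION_RANK[e.condition])


entries = [
    InventoryEntry("ST001", "temperature", "1950-01-01", "1959-12-31",
                   "paper", 3650, "good"),
    InventoryEntry("ST001", "precipitation", "1930-01-01", "1939-12-31",
                   "paper", 3652, "poor"),
]
buffer = io.StringIO()
write_inventory(entries, buffer)
print(buffer.getvalue())
print([e.data_type for e in rescue_order(entries)])
```

Sorting by condition alone is a simplification; in practice the local meteorological service's needs would weigh in as well.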

Nicaragua Inventory
- Imaging the data
When the inventory is finished, data rescuers can move on to imaging the paper, microfilm, or microfiche records to protect them from further deterioration. Photographic imaging is usually better than scanning, as cameras neither damage book bindings nor require special equipment, maintenance, or training. In preparation, volunteers will set up tables, cameras with memory cards on camera stands or tripods, two to four lights, and computers. Then each record will be placed on the table, flattened by a sheet of glass, and imaged. Images will be downloaded to computers, with copies stored on flash drives for later uploading to an appropriate data archive along with the digitized data.
Practical experience provides some tips for ensuring the quality of images. The first is to keep track of how many and which pages have been imaged and where they are stored after validation. An Excel file is useful for this purpose. The second is to validate periodically before backing up images, checking that (1) images are readable, (2) the number of pages imaged matches the count in the tracking file, and (3) the image file names match the content of the corresponding images.
For example, whenever rescuers image a page of records, they update the tracking file with the image and its file name. The IEDRO team has been using a camera-embedded program to automatically rename image files to a standard naming convention, such as [station identification number_data type_date_page number]. Every 300 pages, volunteers check the image files. If all pages are imaged clearly, with none missing, and all are correctly named, they download and write validated files into the appropriate storage media and update the tracking file with the directory in which the files are saved. The imaging tracking file should also match the inventory tracking file.
If you are interested in more details about the imaging process, please check out pages 13-15 of the Guidelines on Best Practices for Climate Data Rescue.
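The naming convention and the batch check described above could be sketched as follows. The exact file name layout (underscores, an eight-digit date, a four-digit page number, a `.jpg` extension) and the helper names are assumptions for illustration; readability of the images still has to be checked by eye.

```python
import re
from datetime import date

# Assumed concrete form of the convention:
# <station id>_<data type>_<YYYYMMDD>_<page number>.jpg
NAME_PATTERN = re.compile(
    r"^(?P<station>[A-Za-z0-9]+)_(?P<dtype>[a-z]+)"
    r"_(?P<date>\d{8})_(?P<page>\d{4})\.jpg$"
)


def image_name(station: str, dtype: str, day: date, page: int) -> str:
    """Build a file name following the assumed convention."""
    return f"{station}_{dtype}_{day:%Y%m%d}_{page:04d}.jpg"


def validate_batch(tracked_names, image_names):
    """Compare the tracking file against the images actually on the card."""
    tracked, found = set(tracked_names), set(image_names)
    return {
        "missing": sorted(tracked - found),      # tracked but no image
        "untracked": sorted(found - tracked),    # image but not tracked
        "badly_named": sorted(n for n in found if not NAME_PATTERN.match(n)),
    }


page_12 = image_name("85934", "temp", date(1901, 3, 7), 12)
page_13 = image_name("85934", "temp", date(1901, 3, 7), 13)
report = validate_batch([page_12, page_13], [page_12, "IMG_0001.jpg"])
print(page_12)   # 85934_temp_19010307_0012.jpg
print(report)
```

A batch would only be copied to storage when all three lists in the report are empty.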

Imaging equipment
- Digitizing the data
The approaches to digitizing data from images depend on whether the data is recorded as line graphs or textual tables. For line graphs, typically strip charts, a software program is available to estimate the coordinates of data points from the lines. For textual tables, typically handwritten, rescuers have to manually read and enter the data into a database with the help of a web-based application (Weather Wizards web application) or a spreadsheet-based approach developed by Old Weather.
Many might not be familiar with strip charts in the meteorological context. Please refer to this blog [a hyperlink will be inserted here to another blog about strip charts] for a detailed introduction to strip charts and the special programs used to digitize them.
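To give a sense of what estimating coordinates from a line involves, here is a minimal sketch of one common idea, linear axis calibration: two reference marks with known values on each axis are enough to convert any traced pixel position into a time and a measurement. This is an assumption for illustration only, not a description of the program IEDRO uses, and all pixel positions and values below are made up.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel position to a data value,
    assuming the chart axis is linear between two reference marks."""
    if pixel_a == pixel_b:
        raise ValueError("reference marks must be at different pixels")
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + (pixel - pixel_a) * scale

    return to_value


# Hypothetical calibration: x pixels 100..1300 span hours 0..24;
# y pixel 800 is -10 degrees C and y pixel 200 is 30 degrees C
# (image rows grow downward, so the y scale comes out negative).
to_hour = make_axis_mapper(100, 0.0, 1300, 24.0)
to_temp = make_axis_mapper(800, -10.0, 200, 30.0)

# Pixel positions traced along the inked line (made-up values).
traced = [(100, 650), (700, 500), (1300, 575)]
points = [(to_hour(x), to_temp(y)) for x, y in traced]
for hour, temp in points:
    print(f"{hour:5.1f} h  {temp:6.2f} C")
```

Real strip charts often have curved time lines and paper distortion, so a linear mapping like this is only a starting point.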
In practice, there is an additional must-have step during digitization to ensure the accuracy of data entries: having two or more volunteers independently key in data from the same images. Any differences between their entries are examined separately and resolved. Data values can also be checked for plausibility, always referring back to the image from which they were obtained for validation.
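A minimal sketch of this double-keying comparison and plausibility check might look like the following. The data layout (a dictionary keyed by date and hour), the function names, and the plausibility bounds are assumptions for the example.

```python
def compare_entries(first, second):
    """Return (key, value_a, value_b) for every cell where two
    independent keyings disagree or where one of them is missing."""
    keys = sorted(set(first) | set(second))
    return [
        (key, first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    ]


def implausible(values, low, high):
    """Return the keys whose values fall outside [low, high]."""
    return [
        key for key, value in sorted(values.items())
        if value is not None and not (low <= value <= high)
    ]


# Two volunteers key the same page of temperatures (made-up values).
keyer_a = {("1901-03-07", 6): 4.2, ("1901-03-07", 12): 9.1,
           ("1901-03-07", 18): 71.0}
keyer_b = {("1901-03-07", 6): 4.2, ("1901-03-07", 12): 9.7,
           ("1901-03-07", 18): 71.0}

print(compare_entries(keyer_a, keyer_b))
print(implausible(keyer_a, low=-60.0, high=60.0))
```

The two checks complement each other: the 12:00 reading is caught because the keyers disagree, while the 18:00 reading is caught only by the range check, because both keyers entered the same suspicious value. In either case the original image is the final authority.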

An example of line graphs and textual tables: Chile Bathythermograph
- Making the data accessible to everyone
The last step is to log the images and data into an open archival database, with the approval of the data owners, of course. IEDRO typically uses NOAA’s National Centers for Environmental Information (NCEI) in Asheville, North Carolina, for this archival step, although other archives are available.
A Practical Challenge
Data rescuing might sound technically complex, but the practical challenge lies in project management.
As you read through the data rescue process above, you might have noticed many measures designed to ensure the completeness and accuracy of data imaging and digitization. These measures are checkpoints that are part of the monitoring process. As in a food safety management system, ensuring that each procedure is done as required works better for quality control than testing samples of the final product against the relevant standards.
For example, in the digitization step, some projects enter data only once, or those entering data copy from each other, to cut the time and money that should have been committed. Without a comparison of independent data entries, the digitized data could easily contain errors that have a significant impact on climate modeling and weather forecasting.
And during the imaging process, if the volunteers do not comply with best practices, many images could be unreadable, missing, or mislabeled, causing problems for downstream work. Management thus steps in, overseeing each essential step to avoid the rescuing efforts ending up with poor-quality data.
Another prevalent problem, funding not being spent on the data rescue project itself, also calls for effective project management. The World Bank and International Monetary Fund have long been patrons of data rescue efforts and have encountered this problem multiple times. Now, IEDRO works with them and other new funders to implement a solution: setting benchmarks that a project must meet to receive its funding. Project managers play a crucial role in helping the project meet the benchmarks. They set up the project plan, make a timeline for deliverables, track the project’s progress, and monitor the human and financial resources.
The Future
Looking ahead, today’s artificial intelligence could streamline the data rescue process. In the past, a project in Pakistan took 12-14 years to image and digitize all the data. In the future, AI could speed up the process by automating the data digitization step. For now, although the technology has improved, data rescue remains costly and time-consuming, as many parts involve intensive manual work. IEDRO calls for volunteers interested in the rescue work to participate, contributing to a better understanding of the climate and environment.
By Xinyu Zheng
Special thanks to Rick Crouthamel, Monica Drazba, and John Pye for sharing their experience in data rescue projects.
Photos are credited to Randy McCracken