Response to Dag Lindgren's blog entry: "Deforestation in the North ?!"
By Matt Hansen and Peter Potapov of the University of Maryland
*This entry is a reply to Dag Lindgren's blog entry of 22.09.2014, "Deforestation in the North?!".
The purpose of this reply is to restate and clarify methodological issues and definitions used to produce the Hansen et al. (2013) global forest data suite. The Hansen et al. products consist of three data layers: gross forest cover gain, gross forest cover loss, and tree canopy cover. All three layers were derived independently, but from the same input dataset (Landsat time-series) and with the same algorithm (bagged CART). Training data to relate to the Landsat inputs were “derived from image interpretation methods, including mapping of crown/no crown categories using very high spatial resolution data such as Quickbird imagery, existing percent tree cover layers derived from Landsat data, and global MODIS percent tree cover, rescaled using the higher spatial resolution percent tree cover data sets. Image interpretation on-screen was used to delineate change and no change training data for forest cover loss and gain.” Definitions of the land-cover type or change process, the quality of training data, and the limitations of Landsat multi-temporal metrics all strongly influence output product accuracy.
Percent tree cover is an outgrowth of the standard MODIS Vegetation Continuous Fields product (Hansen et al. 2013). Each pixel depicts the percent tree cover for trees 5m or taller in height at the Landsat pixel scale. This product estimates the biophysical presence of tree cover for the year 2000 and was used to bring context to mapped forest cover loss and gain between 2000 and 2012. A bagged regression tree approach was used to estimate percent tree cover, and the product was not validated as part of this study.
Gross forest cover loss (GFCL) depicted stand-replacement disturbances and was defined as an independent class without regard to tree canopy cover or forest land use. GFCL may represent the loss of open or closed forest stands and woodlands, or natural forest, plantations, orchards, etc. Product validation based on a probability-based sample showed that omission and commission error rates were balanced and that the user’s and producer’s accuracies were above 80% for each biome and for the globe. A comparison of Hansen et al. results with Swedish national estimates of GFCL provided in the blog post confirmed a high loss detection accuracy. The quality of the GFCL product within areas of low tree cover is lower (as suggested by validation results) than in dense forests. It is important to highlight that accuracy measured against a sample of Landsat data does not capture the limitations of Landsat data itself, such as sub-pixel change that is not visible at Landsat spatial resolution.
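For readers less familiar with the two accuracy terms used above, the following is a minimal sketch of how user's and producer's accuracy are computed for a single mapped class. The function name and the sample counts are hypothetical and chosen only for illustration; they are not figures from the validation, and the sketch ignores the stratum weights that a probability-based sample design would apply.

```python
def class_accuracies(tp, fp, fn):
    """Return (user's accuracy, producer's accuracy) for one mapped class.

    tp: samples mapped as loss and confirmed as loss in the reference data
    fp: samples mapped as loss but not loss in the reference (commission)
    fn: reference loss samples that the map missed (omission)
    """
    users = tp / (tp + fp)      # 1 - commission error rate
    producers = tp / (tp + fn)  # 1 - omission error rate
    return users, producers


# Hypothetical counts, chosen so that omission and commission are balanced.
users, producers = class_accuracies(tp=85, fp=15, fn=15)
print(f"user's accuracy: {users:.2f}, producer's accuracy: {producers:.2f}")
```

When commission and omission counts are equal, as in this made-up example, the two accuracies coincide, which is what "balanced" error rates means here.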
The gross forest cover gain (GFCG) product mapped the inverse of GFCL, or the transition from a no tree cover state to a tree cover state. As Hansen et al. stated in the supporting materials, “longer-lived regrowing stands of tree cover that did not begin as non-forest within the study period were not mapped as forest gain”. Our GFCG definition and mapping method have two important consequences. First, large areas of growing stock that were covered with young trees in 2000 were not part of the GFCG characterization. We were mapping unambiguous non-treed areas as the reference state for GFCG. This omits early stage recovery as a starting point. Second, areas of slow regrowth from a non-treed state that did not reach the 5m tree height threshold were not mapped as GFCG. Additionally, the class as defined was validated and shown to underestimate GFCG. For areas of no tree cover in 2000 that were mapped as gain, tree heights in the boreal biome were shorter than those of GFCG in lower-latitude biomes. Median height of GFCG in the boreal biome estimated from mid-decadal LiDAR data was 5m (Figure S8 from the Science paper), meaning that detection was near the height threshold for the GFCG class. Taking all of these factors into account, GFCG is very conservatively estimated if viewed from a growing stock perspective. Improved detection of GFCG may require a longer time-series, especially for the boreal biome. We have recently developed and prototyped a method for forest loss/gain mapping and net forest cover change assessment using 30 years of Landsat data. The paper describing our results is currently under review.
The gross forest cover loss (GFCL) and gross forest cover gain (GFCG) products were created using change classification, not via the post-classification comparison of individual tree cover or forest cover maps. Post-classification comparison typically results in higher uncertainties because the combined errors of the individual classification results can overwhelm the actual change; it is also unsuitable for mapping gross loss in regions with fast tree re-establishment (such as tropical timber plantations). Another important aspect of the loss product (in its current form) is that it only allows a single (usually first) change event per pixel. If a second clearing followed the first change event, it was ignored. We understand, however, that areas where forest regrowth resulted in repeated clearing within 12 years are rare, and not common for boreal regions.
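The single-event rule can be illustrated with a short sketch. It assumes, purely for illustration, that each pixel carries one loss flag per year from 2001 to 2012; the function name and this per-year representation are hypothetical and do not describe how the product was actually generated. The sketch always keeps the first event, whereas the product records "usually" the first.

```python
def first_loss_year(yearly_loss_flags, start_year=2001):
    """Return the year of the first flagged loss event, or None if no loss.

    yearly_loss_flags: sequence of booleans, one per year, where True means
    a stand-replacement disturbance was detected in that year (hypothetical
    representation used only for this illustration).
    """
    for offset, flagged in enumerate(yearly_loss_flags):
        if flagged:
            return start_year + offset  # later events are ignored
    return None


# Hypothetical pixel cleared in 2003 and cleared again in 2011.
flags = [False, False, True, False, False, False,
         False, False, False, False, True, False]
print(first_loss_year(flags))
```

Under this rule the second clearing in 2011 leaves no trace in the loss layer, so repeated clearing on the same pixel is counted once.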
In summary, the characterization of GFCL and that of GFCG rely on fundamentally different frames of reference. GFCL is a mainly abrupt dynamic where tree cover is removed over a short period. The resulting mapping capability relies on the observation of the abrupt change from a treed to a non-treed state. GFCG is a more gradual process, where the establishment of tree cover from a non-treed state takes years and is a function of local climate and other factors governing tree growth. As such, its detection relies on a longer period of observations. For the twelve-year period of study, one cannot expect a capability where loss and gain are mapped in a directly comparable manner, even for a region where intensive forest management over the long term results in net zero forest cover change.
Given the differences in GFCL and GFCG definitions and the unbalanced omission/commission rates of GFCG, these data are not suitable for net forest cover change estimation. In our paper, we do not describe or otherwise present net forest change results. It must also be reiterated that forest change detected using our biophysically-based definition may be significantly different from official forest area estimates where forest is often defined as a land use category. We have already received a number of comments criticizing the Hansen et al. dataset as not immediately suitable for certain applications (such as estimating loss within high conservation value forests). As a reply, Hansen et al. suggested that “The creative use of these data through their integration with forest type, land use, carbon stock, protected area, and other data sets is appropriate and recommended”. Several recent publications (Tyukavina et al., 2013; Margono et al., 2014; Zhuravleva et al., 2013) have illustrated this approach through the integration of the GFCL product with forest type maps in assessing the context of forest loss. More such applications are encouraged. The Hansen et al. products were provided for unrestricted public use especially to allow for such studies. However, the misuse of the data due to miscomprehension of the themes as presented is probably unavoidable.
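As a rough illustration of the kind of integration recommended above, the sketch below overlays a loss mask on a forest type map and counts loss pixels per type. The grids, the class labels, and the function name are all hypothetical; a real analysis would read co-registered rasters and convert pixel counts to area.

```python
from collections import Counter


def loss_by_forest_type(loss_mask, forest_type):
    """Count loss pixels per forest type for two co-registered grids.

    loss_mask: rows of 0/1 values, 1 meaning mapped forest cover loss
    forest_type: rows of class labels with the same shape as loss_mask
    """
    counts = Counter()
    for loss_row, type_row in zip(loss_mask, forest_type):
        for lost, ftype in zip(loss_row, type_row):
            if lost:
                counts[ftype] += 1
    return dict(counts)


# Hypothetical 3x3 grids, for illustration only.
loss = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
types = [
    ["primary", "primary", "plantation"],
    ["plantation", "primary", "other"],
    ["other", "other", "primary"],
]
print(loss_by_forest_type(loss, types))
```

Separating loss by context in this way lets a user distinguish, for example, plantation harvest from clearing of natural forest, which the loss layer alone does not do.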
We would like to highlight several comments in the blog post that misinterpret our method and product. Section #8 stated that the algorithm is a “black box” and the methodology not transparent. We strongly disagree with this comment. First, the format of the Science paper does not allow for full provision of methodological details; we therefore supplied references to our earlier Landsat papers that focus specifically on the methodology of GFCL mapping (Hansen et al., 2008; Potapov et al., 2011 and 2012). Second, the supervised classification algorithm used is well-established for global tree cover mapping and monitoring, going back to the AVHRR and MODIS sensors, and more recently with Landsat. The method in its basic approach has not changed; it relates multi-temporal spectral reflectance data to biophysical processes defined by training data. Section #9 is also misleading. Boreal forest areas are characterized by a shorter vegetation season; however, cloud cover within the boreal domain is usually lower than that of the humid tropics. Landsat observation frequency at boreal latitudes is also 2-3 times higher because the sensor's near-polar orbits converge. The areas with the lowest cloud-free observation frequency are in equatorial rainforests, with boreal regions usually having above-average cloud-free observation frequency.