IS-Count: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling

Stanford University

Visualization of distributions in Africa. The proposal distribution learned from the population density raster (Subfig. (c)) closely matches the ground truth building distribution (Subfig. (d)). Ideally, the proposal distribution would be exactly proportional to the ground truth, since the importance sampling estimate then has zero variance.


Object detection in high-resolution satellite imagery is emerging as a scalable alternative to on-the-ground survey data collection in many environmental and socioeconomic monitoring applications. However, performing object detection over large geographies can still be prohibitively expensive due to the high cost of purchasing imagery and compute. Inspired by traditional survey data collection strategies, we propose an approach to estimate object count statistics over large geographies through sampling. Given a cost budget, our method selects a small number of representative areas by sampling from a learnable proposal distribution. Using importance sampling, we are able to accurately estimate object counts after processing only a small fraction of the images compared to an exhaustive approach. We show empirically that the proposed framework achieves strong performance on estimating the number of buildings in the United States and Africa, cars in Kenya, brick kilns in Bangladesh, and swimming pools in the U.S., while requiring as few as 0.01% of satellite images compared to an exhaustive approach.



arXiv 2112.09126, 2021.


Chenlin Meng*, Enci Liu*, Willie Neiswanger, Jiaming Song, Marshall Burke, David B. Lobell, Stefano Ermon. "IS-Count: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling", to appear in Proc. 36th AAAI Conference on Artificial Intelligence (AAAI 2022).
Bibtex (coming soon)


An illustration of the IS-Count framework in comparison to the exhaustive approach to object counting.

An exhaustive approach (Subfig. (a)) downloads all image tiles covering the target region, maps the objects in each image using a trained model, and sums the counts over all images to produce a total count. However, purchasing satellite imagery for a large target region can be expensive.
In contrast, IS-Count greatly reduces cost by constructing a proposal distribution that is representative of the real object distribution. The major steps in IS-Count are as follows:
  • First, we construct the base distribution from covariate rasters such as population density or nightlight intensity (NL). Specifically, the probability of sampling a pixel is its value normalized over all raster pixels within the target region.
  • Next, we learn the proposal distribution, either by using the base distribution directly (identity mapping) or by fine-tuning it with isotonic regression. We then select a small number of informative areas for object counting by sampling from the proposal distribution.
  • Finally, human annotators label the sampled areas, and the total object count is estimated using importance sampling.
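
The three steps above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the released implementation: the region is treated as a flat list of discrete tiles, the identity mapping is used for the proposal, and the function names (`base_distribution`, `is_count_estimate`) and the toy numbers are hypothetical. The callable `count_fn` stands in for a human annotator labeling a sampled tile.

```python
import random


def base_distribution(covariate):
    """Normalize non-negative covariate pixel values into sampling probabilities."""
    total = sum(covariate)
    if total <= 0:
        raise ValueError("covariate raster must have positive total mass")
    return [v / total for v in covariate]


def is_count_estimate(proposal, count_fn, n_samples, rng):
    """Importance-sampling estimate of sum_i count_fn(i).

    Tiles are drawn with replacement from `proposal`, and each labeled count
    is reweighted by 1 / proposal[i]. Tiles with zero probability are never
    drawn, so the estimate is only unbiased if such tiles contain no objects.
    """
    tiles = rng.choices(range(len(proposal)), weights=proposal, k=n_samples)
    return sum(count_fn(i) / proposal[i] for i in tiles) / n_samples


# Toy data (made up): a 4-tile "raster" and hypothetical per-tile object counts.
covariate = [1.0, 2.0, 3.0, 4.0]
true_counts = [12, 18, 33, 37]  # true total is 100

proposal = base_distribution(covariate)  # identity mapping from the base distribution
rng = random.Random(0)
estimate = is_count_estimate(proposal, lambda i: true_counts[i], 1000, rng)
print(f"estimated total: {estimate:.1f} (true total: {sum(true_counts)})")
```

If the counts were exactly proportional to the proposal (say `[10, 20, 30, 40]` here), every reweighted sample would equal the true total and the estimate would be exact for any sample size, which is why a proposal close to the real object distribution is desirable.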

Estimating Building Counts

We can use IS-Count to estimate the total object count in different regions with as few as 0.001% of the data while achieving estimation error as low as 1.0%. Specifically, the estimation error is computed as the absolute difference between the estimated and the ground truth counts (given by the Microsoft Building Footprints in the US and Google Open Buildings in Africa) divided by the ground truth. We show the estimation error map of building counts in the US states and African countries below.
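
As a small illustration of the error metric just described, the following sketch computes the relative absolute error. The function name `estimation_error` and the example numbers are hypothetical.

```python
def estimation_error(estimated, ground_truth):
    """Absolute difference between estimate and ground truth, divided by ground truth."""
    if ground_truth <= 0:
        raise ValueError("ground truth count must be positive")
    return abs(estimated - ground_truth) / ground_truth


# Made-up numbers: an estimate of 101,000 buildings against 100,000 true buildings.
print(f"{estimation_error(101_000, 100_000):.1%}")
```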


Error rate map of building count estimation using different methods (averaged over 20 runs).

Compared to exhaustive approaches, IS-Count incurs far lower cost for purchasing high-resolution satellite images (e.g., 0.01% of the images) and requires fewer labeling hours, while still estimating object counts with high accuracy.
The building count experiments show that incorporating prior knowledge from covariate data via importance sampling improves estimation performance over the uniform sampling baseline.
We also found that IS-Count with a tuned proposal distribution converges faster to the ground truth object count as the sample size increases: the population (isotonic) proposal converges to the ground truth building count faster than the population (identity) proposal.
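
To make the identity versus isotonic distinction concrete, here is a rough sketch of how a base distribution might be tuned with isotonic regression. It assumes the regression maps covariate values to object counts observed in a few labeled samples, which the text above does not specify. The helpers (`pava`, `fit_isotonic`, `tuned_proposal`) are hypothetical, and the fit uses a plain pool-adjacent-violators pass rather than any particular library.

```python
import bisect


def pava(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit (equal weights)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def fit_isotonic(xs, ys):
    """Fit a non-decreasing step function from covariate values to counts."""
    order = sorted(range(len(xs)), key=lambda j: xs[j])
    sorted_x = [xs[j] for j in order]
    fitted_y = pava([ys[j] for j in order])

    def predict(x):
        # Use the fitted value of the closest labeled sample at or below x,
        # clamping to the first sample for smaller inputs.
        k = bisect.bisect_right(sorted_x, x) - 1
        return fitted_y[max(k, 0)]

    return predict


def tuned_proposal(covariate, predict, eps=1e-9):
    """Map every pixel through the fitted function, then renormalize."""
    weights = [max(predict(v), 0.0) + eps for v in covariate]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical labeled samples: covariate value and object count at each one.
sample_covariate = [1.0, 2.0, 3.0, 4.0]
sample_counts = [1.0, 3.0, 2.0, 4.0]
predict = fit_isotonic(sample_covariate, sample_counts)
print(tuned_proposal([1.0, 2.0, 3.0, 4.0], predict))
```

The small `eps` keeps every pixel's probability strictly positive, so no area is excluded from sampling outright.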

Related Work

Prior work has used high-resolution satellite imagery to estimate poverty levels [Ayush et al., 2020], [Ayush et al., 2021]. Recent work has also achieved great success in object mapping by exhaustively collecting and labeling all satellite image tiles in a target region [Crowther et al., 2015], [Yu et al., 2018], [Yi et al., 2021b]. Nevertheless, these exhaustive approaches require a huge amount of high-resolution satellite imagery over a large region and long hours of human annotation, which are often unaffordable for researchers and make the resulting statistics hard to update. In contrast, we are interested in developing a more cost- and time-efficient human-in-the-loop object counting pipeline by utilizing importance sampling.