In this paper, ECB economists (Günter W. Beck, Kai Carstensen, Jan-Oliver Menz, Richard Schnorrenberger and Elisabeth Wieland) construct a new inflation index based on high-frequency data:
We study how millions of granular and weekly household scanner data combined with machine learning can help to improve the real-time nowcast of German inflation.
Our nowcasting exercise targets three hierarchy levels of inflation: individual products, product groups, and headline inflation. At the individual product level, we construct a large set of weekly scanner-based price indices that closely match their official counterparts, such as butter and coffee beans.
Within a mixed-frequency setup, these indices significantly improve inflation nowcasts already after the first seven days of a month. For nowcasting product groups such as processed and unprocessed food, we apply shrinkage estimators to exploit the large set of scanner-based price indices, resulting in substantial predictive gains over autoregressive time series models.
Finally, by adding high-frequency information on energy and travel services, we construct competitive nowcasting models for headline inflation that are on par with, or even outperform, survey-based inflation expectations.
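To make the shrinkage step in the abstract more concrete, here is a minimal sketch of a ridge regression that nowcasts a monthly target from a large set of predictors. It is not the authors' model: the abstract does not say which shrinkage estimators they use, so ridge is swapped in as one common choice. The data are synthetic, and the names `ridge_fit`, `ridge_predict` and the penalty `lam` are illustrative assumptions.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge regression (assumed estimator, not from the paper).

    Predictors are standardised and the target is centred, so the penalty
    `lam` shrinks only the slope coefficients, never the intercept.
    """
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd
    y_bar = y.mean()
    k = Z.shape[1]
    beta = np.linalg.solve(Z.T @ Z + lam * np.eye(k), Z.T @ (y - y_bar))
    return {"mu": mu, "sd": sd, "intercept": y_bar, "beta": beta}


def ridge_predict(model, X):
    """Apply the training-sample scaling, then the fitted coefficients."""
    Z = (X - model["mu"]) / model["sd"]
    return model["intercept"] + Z @ model["beta"]


# Synthetic stand-in: 120 months, 60 scanner-style indices observed early
# in the month, of which only the first 5 actually drive the target.
rng = np.random.default_rng(0)
n_months, n_indices = 120, 60
X = rng.normal(size=(n_months, n_indices))
true_w = np.zeros(n_indices)
true_w[:5] = 0.3
y = X @ true_w + rng.normal(scale=0.5, size=n_months)

# Fit on the first 96 months, nowcast the remaining 24.
train, test = slice(0, 96), slice(96, None)
model = ridge_fit(X[train], y[train], lam=10.0)
nowcast = ridge_predict(model, X[test])
rmse = float(np.sqrt(np.mean((nowcast - y[test]) ** 2)))
```

The point of the penalty is that with many predictors and few monthly observations, unrestricted least squares overfits; shrinking the coefficients trades a little bias for lower variance. In practice `lam` would be chosen by cross-validation or a real-time out-of-sample exercise rather than fixed at 10.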
The data details are interesting:
Our dataset comes from the household panel of the market research company GfK and contains daily purchases of fast-moving consumer goods, i.e. products that are bought regularly and consumed quickly, for the period from 2003 to 2022. The purchases covered are mainly food and non-durable goods such as shampoo or toothpaste, which are scanned by panel participants at home and therefore referred to as household scanner data.
On average, the GfK household panel for Germany comprises around 30,000 households, 200,000 products (measured at the barcode level) and 30 million observations per year. In addition, the dataset contains detailed product descriptions and has its own product classification system. These descriptions allow the data to be mapped to the most disaggregate level used in the German consumer price statistics, such as “butter”, “coffee beans” and “toothpaste”.
In total, we can map the household scanner data to more than 180 product groups of the German Harmonised Index of Consumer Prices (HICP), covering around 12% of the German consumer basket and typical outlet types such as supermarkets and discounters. From these, we derive price indices using common index methods often applied by statistical offices in the context of scanner data. We show that our scanner data-based price indices track their official counterparts very well.
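The quoted passage does not name the index methods used, so the following is only a sketch of one approach often applied to scanner data: a chained Jevons index, i.e. the geometric mean of week-on-week price relatives over barcodes observed in both weeks, computed from weekly unit values. The function names, the tuple layout of `purchases` and the toy numbers are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def unit_values(purchases):
    """Weekly unit value per barcode: total spend divided by total quantity.

    `purchases` is an iterable of (week, barcode, price, quantity) tuples
    (a hypothetical layout, not the GfK format).
    """
    spend = defaultdict(float)
    qty = defaultdict(float)
    for week, barcode, price, quantity in purchases:
        spend[(week, barcode)] += price * quantity
        qty[(week, barcode)] += quantity
    return {key: spend[key] / qty[key] for key in spend}


def chained_jevons(purchases):
    """Chain week-on-week Jevons links into an index with first week = 100."""
    uv = unit_values(purchases)
    weeks = sorted({week for week, _ in uv})
    index = {weeks[0]: 100.0}
    for prev, cur in zip(weeks, weeks[1:]):
        # Only barcodes bought in both weeks contribute to the link.
        matched = [b for (w, b) in uv if w == cur and (prev, b) in uv]
        if matched:
            log_rel = sum(
                math.log(uv[(cur, b)] / uv[(prev, b)]) for b in matched
            ) / len(matched)
        else:
            log_rel = 0.0  # no overlap: carry the index forward unchanged
        index[cur] = index[prev] * math.exp(log_rel)
    return index


# Toy data for one product group: barcode B drops out and C enters in week 3.
purchases = [
    (1, "A", 2.00, 1), (1, "B", 4.00, 2),
    (2, "A", 2.20, 1), (2, "B", 4.40, 1),
    (3, "A", 2.20, 3), (3, "C", 1.00, 1),
]
butter_index = chained_jevons(purchases)
```

Matching on barcodes keeps each link a like-for-like comparison, which is why entering and exiting products (B and C above) do not move the index directly. Chained indices on high-frequency data can suffer from chain drift, so a production version would likely need a multilateral method or a longer comparison window.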