Geocoding converts addresses into coordinate pairs, but the conversion often introduces uncertainties into the geocoded data, including data omissions and positional errors. These uncertainties can greatly affect spatial analysis, which makes it necessary to estimate the minimum percentage of data that must be accurately geocoded. While previous studies have primarily focused on estimating the impact of data omissions on spatial crime patterns, the effects of positional errors and their comparative influence remain underexplored. We aimed to estimate the minimum acceptable hit rate (MAHR) under two uncertainty scenarios, data omissions and positional errors, and to compare their differential effects while accounting for point intensity and clustering levels. We conducted a simulation study to estimate the MAHR across various scenarios and compared the results with real-world crime data. The results showed that the MAHR varied differently under positional errors than under data omissions, particularly when accounting for point intensity and clustering levels. In most cases, an accurate geocoding rate of 85% was sufficient to represent the spatial pattern of real-world crime data. However, for densely clustered points or high-intensity data, an accurate geocoding rate of 90% or even higher was necessary. The findings help inform the selection of geocoded data in the presence of omissions or positional errors.