Handling Multicollinearity on Social Spatial Data Using Geographically Weighted Random Forest

Binti Kurniati, Yuliani Setia Dewi, Alfian Futuhul Hadi

© 2023 Yuliani Setia Dewi, published by UIKTEN. This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

Citation Information: SAR Journal. Volume 6, Issue 3, Pages 149-153, ISSN 2619-9955, September 2023.

Received: 24 July 2023.
Revised: 25 August 2023.
Accepted: 01 September 2023.
Published: 26 September 2023.


Crime covers all harmful acts that violate the laws in force in Indonesia as well as social and religious norms. The crime total is the number of incidents reported to the police, drawn from public reports and from cases in which the police caught the perpetrators red-handed. A Poisson model can be used to analyze these data, but spatial heterogeneity in the data makes the model less accurate. This research investigates methods for data with spatial heterogeneity: Geographically Weighted Regression (GWR), Geographically Weighted Poisson Regression (GWPR), and Geographically Weighted Random Forest (GW-RF). We compare how the GWR, GWPR, and GW-RF models of criminal cases in East Java handle multicollinearity in the data. The results indicate that the GW-RF model is the best of the three for modeling criminal cases, with the smallest RMSE and MAPE values and an R-squared value close to 1. Grouping locations by their three most important variables yields six groups of regencies/cities in East Java, Indonesia. The important variables vary between groups, and the poverty severity index is not among the three most important variables in any location.
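For readers unfamiliar with the three comparison criteria named in the abstract (RMSE, MAPE, and R-squared), the following is a minimal sketch of how they are commonly computed. The function names and the sample numbers are illustrative assumptions, not the authors' code or data, and the paper may use slightly different definitions.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error: lower is better."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no observed value is zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Coefficient of determination: values near 1 indicate a close fit."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up crime counts for three hypothetical locations (not from the study).
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

rmse_value = rmse(observed, predicted)
mape_value = mape(observed, predicted)
r2_value = r_squared(observed, predicted)
print(rmse_value, mape_value, r2_value)
```

Under these definitions, a model with smaller RMSE and MAPE and an R-squared closer to 1 fits the observed counts more closely, which is the sense in which the abstract ranks the three models.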

Keywords – Crime, GWR, GWPR, GW-RF, multicollinearity, spatial heterogeneity

