R Packages Developed by Our Research Group
- Anomaly detection and repairing for COVID-19 data: cdcar v1.0
- Over the past few months, the COVID-19 outbreak has spread across the world. A reliable and accurate dataset of cases is vital for scientists conducting related research and for policymakers making decisions. We collect U.S. COVID-19 daily reported data from four open sources: the New York Times, the COVID-19 Data Repository by Johns Hopkins University, the COVID Tracking Project at The Atlantic, and USAFacts, and then compare the similarities and differences among them.
- To obtain reliable data for further analysis, Wang et al. (2020) examined the cyclical pattern in the reported cases and the following frequently occurring anomalies: (1) order dependency violations, (2) point or period anomalies, and (3) reporting delays.
- To address these issues, we developed the cdcar R package, which provides anomaly detection methods and repairs the data where corrections are necessary. In addition, we integrate the reported COVID-19 cases with county-level auxiliary information on local features from official sources, such as health infrastructure, demographic, socioeconomic, and environmental information, which is also essential for understanding the spread of the virus.
- For public use, a GitHub repository provides daily updated and cleaned data.
Reference:
Wang, G., Gu, Z., Li, X., Yu, S., Kim, M., Wang, Y., Gao, L. and Wang, L. (2020). Comparing and integrating US COVID-19 data from multiple sources with anomaly detection and repairing. [arXiv: 2006.01333]
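As a rough illustration of the first anomaly type listed above, an order dependency violation occurs when a cumulative case count falls below the previous day's value. The sketch below is written in Python for brevity and is not the cdcar API; the function name `order_violations` and the sample counts are hypothetical.

```python
def order_violations(cumulative):
    """Return the indices t where cumulative[t] < cumulative[t - 1].

    Cumulative counts should never decrease over time, so any such
    index marks a candidate anomaly to inspect or repair.
    """
    return [
        t
        for t in range(1, len(cumulative))
        if cumulative[t] < cumulative[t - 1]
    ]


# Hypothetical cumulative counts for one county, one value per day.
counts = [10, 12, 15, 14, 18, 18, 17, 20]
print(order_violations(counts))  # prints [3, 6]
```

Flagging a day is only the detection step; how a flagged value should be repaired depends on the methods in Wang et al. (2020) and is not shown here.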
- Spatiotemporal epidemic model (STEM): STEM v1.0
- Wang et al. (2020) established a new spatiotemporal epidemic modeling (STEM) framework for space-time infected/death count data to study the dynamic pattern in the spread of COVID-19. The proposed methodology can be used to dissect the spatial structure and dynamics of the spread, as well as to assess how the outbreak may unfold through time and space.
Reference:
Wang, L., Wang, G., Gao, L., Li, X., Yu, S., Kim, M., Wang, Y. and Gu, Z. (2020). Spatiotemporal dynamics, nowcasting and forecasting of COVID-19 in the United States. [arXiv: 2004.14103]
- Triangulation in 2D domains: Triangulation v1.0
This R package triangulates an arbitrary polygonal domain.
Reference:
Lai, M. J. and Wang, L. (2013). Bivariate penalized splines for regression. Statistica Sinica, 23, 1399-1417.
This R package provides the bivariate spline basis functions and implements the bivariate penalized spline smoothing over triangulation in Lai and Wang (2013).
Reference:
Lai, M. J. and Wang, L. (2013). Bivariate penalized splines for regression. Statistica Sinica, 23, 1399-1417.
Reference:
Yu, S., Wang, G., Wang, L., Liu, C. and Yang, L. (2020). Estimation and inference for generalized geoadditive models. Journal of the American Statistical Association, Theory and Methods, 115, 761-774.
Wang, L., Wang, G., Lai, M. J. and Gao, L. (2020). Efficient estimation of partially linear models for data on complicated domains by bivariate penalized splines over triangulation. Statistica Sinica, 30, 347-369.
Lai, M. J. and Wang, L. (2013). Bivariate penalized splines for regression. Statistica Sinica, 23, 1399-1417.
- Generalized spatially varying coefficient models: GSVCM v1.0 [Under Development]
Reference:
Mu, J., Wang, G. and Wang, L. (2018). Estimation and inference in spatially varying coefficient models. Environmetrics, 29:e2485.
Kim, M. and Wang, L. (2020). Generalized spatially varying coefficient models. Journal of Computational and Graphical Statistics. In press.