NSF Data Infrastructure Building Blocks: Scalable Capabilities for Spatial Data Synthesis

Massive spatial data collected from numerous sources are increasingly used to instrument our natural, human, and social systems at unprecedented scales, providing tremendous opportunities to gain dynamic insight into complex phenomena. Though such big data streams play crucial roles in many scientific domains and promise to enable a wide range of decision-making practices with significant societal impacts, exploiting them successfully poses significant challenges. On the one hand, spatial and location attributes serve as a common key to many types of data, such as census and population, land use and cover, floodplain, and vegetation distribution. Spatial data synthesis, often perceived as a significant benefit, can be used to link disparate pieces of data that pertain to common spatial references and units. On the other hand, data are collected and managed using diverse spatial references and units that are based on different representation models and assumptions.

To break through these challenges, this project aims to establish a suite of scalable capabilities for spatial data synthesis enabled by innovative cloud computing and cyberGIS and driven by multiple scientific communities. These capabilities will also be designed to support integration with cyberGIS analytics and workflows for solving scientific problems. The project establishes core capabilities through a spiral approach: it first develops the capabilities for solving specific scientific problems and then engages broader communities to validate and improve them. The scientific problems revolve around two interrelated themes: 1) measuring urban sustainability based on a number of social, environmental, and physical factors and processes; and 2) examining population dynamics by synthesizing multiple state-of-the-art population data sources with social media data.

NSF Grant Number: 1443080

People: Kiumars Soltani (soltani2@illinois.edu), Pierre Riteau (priteau@uchicago.edu), Hao Hu (haohu@illinois.edu), Anand Padmanabhan (apadmana@illinois.edu), Kate Keahey (keahey@anl.gov), Shaowen Wang (shaowen@illinois.edu)


  • Hu, H., Lin, T., Wang, S., and Rodriguez, L.F. “A cyberGIS approach to uncertainty and sensitivity analysis in biomass supply chain optimization,” Applied Energy, v.203, 2017, p. 26–40. DOI: https://doi.org/10.1016/j.apenergy.2017.03.107
  • Wang, S. “CyberGIS and spatial data science,” GeoJournal, v.81, 2016, p. 965–968.
  • Armstrong, M.P., Wang, S. and Zhang, Z. (2018) “The Internet of Things and Fast Data Streams: Prospects for Geospatial Data Science in Emerging Information Ecosystems”. In: M. Freundschuh and D. Sinton (Eds.), Frontiers of Geospatial Data Science. Conference Proceedings, AutoCarto/UCGIS 2018, the 22nd International Research Symposium on Computer-based Cartography and GIScience (pp. 11-17)
  • Armstrong, M. P., Wang, S., and Zhang, Z. (2019) “The Internet of Things and fast data streams: prospects for geospatial data science in emerging information ecosystems”. Cartography and Geographic Information Science, 46(1), 39-56.
  • Liu, F., Keahey, K., Riteau, P., and Weissman, J. “Dynamically Negotiating Capacity Between On-demand and Batch Clusters.” In: Proceedings of the Supercomputing’17 Conference. 2017.
  • Gao, Y., Li, T., Wang, S., Jeong, M., and Soltani, K. (2018) “A Multidimensional Spatial Scan Statistics Approach to Movement Pattern Comparison”. International Journal of Geographical Information Science (IJGIS), 32(7): 1304-1325
  • Gao, Y., Wang, S., Padmanabhan, A., Yin, J., and Cao, G. (2018) “Mapping Spatiotemporal Patterns of Events Using Social Media: A Case Study of Influenza Trends”. International Journal of Geographical Information Science (IJGIS), 32(3), 425-449
  • Hu, H., Yin, D., Liu, Y. Y., Terstriep, J., Hong, X., Wendel, J., and Wang, S. (2018) “TopoLens: Building a CyberGIS Community Data Service for Enhancing the Usability of High-resolution National Topographic Datasets”. Concurrency and Computation: Practice and Experience, DOI: https://doi.org/10.1002/cpe.4682
  • Jeong, M.-H., Cai, Y., Sullivan, C. J., and Wang, S. (2016). “Data depth based clustering analysis”. In: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2016), October 31 – November 3, 2016, San Francisco Bay Area, California, USA.
  • Jeong, M., Yin, J., and Wang, S. (2018) “Outliers Detection and Comparison of Origin-Destination Flows with Data Depth”. In: Proceedings of the 10th International Conference on Geographic Information Science (GIScience 2018), August 28 – 31, 2018, Melbourne, Australia
  • Keahey, K., Riteau, P., and Timkovich, N. (2017). “LambdaLink: an Operation Management Platform for Multi-Cloud Environment”. In: Proceedings of the 10th International Conference on Utility and Cloud Computing, pp. 39-46. ACM.
  • Lin, T., Wang, S., Rodríguez, L. F., Hu, H., and Liu, Y.Y. “CyberGIS-Enabled Decision Support Platform for Biomass Supply Chain Optimization,” Environmental Modelling & Software, 2015.
  • Fu, Q., Timkovich, N., Riteau, P., and Keahey, K. (2018) “A Step towards Hadoop Dynamic Scaling”. In: The 20th IEEE International Conference on High-Performance Computing and Communications (HPCC-2018), Exeter, United Kingdom. June 2018
  • Soliman, A., Soltani, K., Yin, J., Padmanabhan, A., and Wang, S. (2017). “Social sensing of urban land use based on analysis of Twitter users’ mobility patterns”. PLoS ONE, 12(7). DOI: https://doi.org/10.1371/journal.pone.0181657
  • Soliman, A., Yin, J., Soltani, K., Padmanabhan, A., and Wang, S. 2015. “Where Chicagoans tweet the most: Semantic analysis of preferential return locations of Twitter users”. Proceedings of the First ACM SIGSPATIAL International Workshop on Smart Cities and Urban Analytics (UrbanGIS’15).
  • Soltani, K., Soliman, A., Padmanabhan, A., and Wang, S. 2016. “UrbanFlow: Large-scale Framework to Integrate Social Media and Authoritative Landuse Maps”. Proceedings of the 2016 Annual Conference on Extreme Science and Engineering Discovery Environment (XSEDE’16). July 17-21. Miami, Florida.
  • Wang, S., Hu, H., Lin, T., Liu, Y., Padmanabhan, A., and Soltani, K. “CyberGIS for Data-Intensive Knowledge Discovery,” ACM SIGSPATIAL Newsletter, 2014.
  • Wang, S., Liu, Y., and Padmanabhan, A. “Open CyberGIS Software for Geospatial Research and Education in the Big Data Era,” SoftwareX, 2015. DOI: https://doi.org/10.1016/j.softx.2015.10.003
  • Xu, Z., Guan, K., Casler, N., Peng, B., and Wang, S. (2018) “A 3D Convolutional Neural Network Method for Land Cover Classification Using LiDAR and Multi-Temporal Landsat Imagery”. ISPRS Journal of Photogrammetry and Remote Sensing, accepted
  • Yin, D., Liu, Y., Padmanabhan, A., Terstriep, J., Rush, J., and Wang, S. (2017). “A CyberGIS-Jupyter Framework for Geospatial Analytics at Scale”. In: Proceedings of the 2017 Practice & Experience in Advanced Research Computing (PEARC’17). July 9–13. New Orleans, LA.
  • Yin, J., Soliman, A., Yin, D., and Wang, S. (2017). “Depicting Urban Boundaries from a Mobility Network of Spatial Interactions: A Case Study of Great Britain with Geo-located Twitter Data”. International Journal of Geographical Information Science. DOI: https://doi.org/10.1080/13658816.2017.1282615
  • Zhang, Z., Demšar, U., Wang, S., and Virrantaus, K. (2018) “A Spatial Fuzzy Influence Diagram for Modelling Spatial Objects’ Dependencies: A Case Study on Tree-related Electric Outages”. International Journal of Geographical Information Science (IJGIS), 32(2): 349-366