Share Big Geodata at Web-Scale on AWS

Date of Presentation: 
Wednesday, December 2, 2015
2015 Fall

Abstract - Amazon Web Services has changed the economics of IT and has more than a million active customers in 190 countries, including 1,700 government agencies and 4,500 education institutions. AWS customers benefit from massive economies of scale on shared infrastructure, but the benefit they mention most often is the speed and agility afforded by deploying on AWS. One of the services that makes this possible is Amazon Simple Storage Service (Amazon S3), our object store and one of our first services. S3 plays a central role in many systems and is now home to trillions of objects. It regularly peaks at millions of requests per second and is used to store many kinds of data, including genomic data, video, and satellite imagery. One of the fundamental differences between S3 and traditional file systems is that you can securely share data across systems at web scale without financial risk. This makes S3 especially attractive for public sector and education customers who want either to make data more open or to peer with other institutions in a secure manner.

This talk will illustrate best practices for open or shared data in the cloud. I will spend some time on some of the lesser-known features of S3, including the requester-pays feature. Having covered the basics of S3, I will use a simple map tiling architecture that combines CloudFront (CDN), Lambda (compute service), and MapServer/GDAL running on an auto-scaling group of EC2 instances (VMs) to show how this works in practice across organizational accounts. For those of you interested in going deeper, I will then focus on what you need to know in order to build your own national or even global map server, using the first session's real-time map tiling architecture. Because the GeoTIFFs used in this demo are stored in a bucket that is marked requester-pays and allows authenticated read access, anyone with an AWS account has immediate access to all of the source big geodata and can bulk copy the data wherever they wish. However, I will show that the cloud, because it supports both highly available and flexible compute, makes it unnecessary to move the data anywhere, allowing one authoritative store of data to serve a world of different systems and users.
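
As a rough illustration of the requester-pays access described above, the following Python sketch shows how a reader with their own AWS account might fetch one GeoTIFF. It is not taken from the talk: it assumes the boto3 SDK is installed and credentials are configured, and the bucket name, object key, and function names are hypothetical placeholders. The one S3-specific detail is the `RequestPayer="requester"` argument, which tells S3 that the caller accepts the request and data transfer charges.

```python
from typing import Any, Dict


def requester_pays_get_kwargs(bucket: str, key: str) -> Dict[str, Any]:
    """Build the GetObject arguments for a bucket marked requester-pays."""
    # RequestPayer="requester" acknowledges that the caller, not the bucket
    # owner, is billed for this request and the associated data transfer.
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def download_geotiff(bucket: str, key: str, dest_path: str) -> None:
    """Stream one object from a requester-pays bucket to a local file."""
    import boto3  # imported lazily so the helper above works without the SDK

    s3 = boto3.client("s3")  # uses the caller's own AWS credentials
    response = s3.get_object(**requester_pays_get_kwargs(bucket, key))
    with open(dest_path, "wb") as out:
        for chunk in response["Body"].iter_chunks():
            out.write(chunk)


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    print(requester_pays_get_kwargs("example-geodata-bucket", "tiles/sample.tif"))
```

Because the caller pays per request, a compute service running in the same cloud can read the data in place in the same way, rather than copying it first.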


Mark Korver is the Geospatial Lead on the Solution Architecture team at Amazon Web Services (AWS) and is based in Seattle. He has 8 years of experience building cloud architectures, both as a customer and as an employee of AWS. With his many years running companies focused on geospatial, Mark has a dual role in business development and technical architecture. Before focusing on geospatial, Mark was the first Solution Architect on the State, Local Government, and Education (SLED) team, part of the Worldwide Public Sector group. During the formative years of the SLED team, Mark served customers across the US, supporting projects at universities such as MIT, University of Washington, Berkeley, and Stanford, and worked with public entities such as the Seattle Police Department and King County, WA. Prior to Amazon, Mark was CTO/Founder at SpatialCloud LLC, a geo-services startup running on AWS. Before that, Mark started and spent 10 years building Alchemedia Inc., a software development firm based in Tokyo. At Alchemedia he helped create a business with core strengths in commercial mortgage loan management systems and delivered geo-enabled applications to customers including Kajima Corp., Autodesk Japan, Minato City Government (Tokyo), NTT group, Lehman Brothers, Misawa Homes, Hitachi, Japan Space Imaging, Zenrin, Shimizu Corp., and the University of Tokyo. Mark holds a Master in City Planning from MIT, with a specialization in technology transfer as it applies to international development projects, and is proficient in Japanese.

Mark Korver
Speaker affiliation: 
Amazon Web Services