How You Can Benefit from the Big Data Movement
Jerry Skaw remembers a time when all computing was done on a huge, centralized mainframe in every office. He likens that era to the transformation happening today, believe it or not, because big data is being stored on “centralized” computers at Microsoft, Amazon and Google, and companies are accessing that data through web browsers.
“This is the reason for cloud computing: We do the heavy lifting so that customers don’t have to,” said Skaw, UltraCam sales and marketing manager for Microsoft. “If you are a company building a Bing Maps-based application, you don’t have to have all this data.”
When one aerial data acquisition flight can deliver 3.3 terabytes of information, it increasingly makes sense financially for geospatial professionals to work with a large “warehouse,” where the data can be accessed, stored and backed up cheaply. Firms no longer need to keep data stored on dusty shelves, where it might be unused for years or lost altogether, and they no longer need to spend the capital on their own hardware and software for handling huge amounts of data.
“I think you’re going to see Web services become more and more prevalent,” Skaw said. “Cloud computing is basically going to become like a utility bill you pay on a monthly basis.”
The amount of data being created is hard to fathom. In a presentation at the MAPPS Winter Conference last month, Mark Baker, business development manager for Esri, said 90 percent of the data in the world today has been created in the last two years. In 2011 alone, 2.5 quintillion bytes of data were created worldwide – that’s 25 with 17 zeros after it.
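To put that figure in more familiar units, the short sketch below converts it and compares it with the 3.3-terabyte flight mentioned earlier. It assumes decimal (SI) units, where one terabyte is 10^12 bytes; the variable names are illustrative only.

```python
# Figures quoted in the article; decimal (SI) units are an assumption.
BYTES_CREATED_2011 = 25 * 10**17   # 2.5 quintillion bytes
BYTES_PER_FLIGHT = 33 * 10**11     # 3.3 terabytes from one acquisition flight

exabytes = BYTES_CREATED_2011 / 10**18
terabytes = BYTES_CREATED_2011 // 10**12
equivalent_flights = BYTES_CREATED_2011 / BYTES_PER_FLIGHT

print(f"{exabytes} exabytes")
print(f"{terabytes:,} terabytes")
print(f"about {equivalent_flights:,.0f} flights' worth of data")
```

By this rough arithmetic, the 2011 total equals 2.5 exabytes, or the output of several hundred thousand such flights.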
The question becomes: What can geospatial professionals do with all this data?
“Many organizations have legacy data, legacy information, stored across many different mediums in their business,” Baker said. “They may have archives of data that is now offline and online archives as well. Many of them will have access to historical information for trends analysis, and you can combine that with more current information for some predictive analysis.”
In order to achieve predictive analysis – basically unlocking the value in geospatial data – Baker said firms may need to find a partner. The datasets are likely to be nontraditional to the geospatial profession, such as demographic information and purchasing power. This information would enable the retail world or state and federal governments to answer some basic questions about geography and relationships in a much more efficient way.
“It’s a combination of these types of geospatial datasets, what we provide in tapestry data, sensor data, and aligning that with social media – what sort of comments are going on through Twitter and Facebook around a particular theme – and then trying to do some predictive analysis on potential purchasing power for retail sites that will be of value in the next six to 12 months,” Baker said.
Potential partners may include big players such as IBM, Amazon, Microsoft and Google or small, third-party app developers. “From a geospatial world, that’s something that’s quite emergent at the moment,” Baker said.
Skaw suggests thinking about big data in terms of the three V’s: volume, velocity and variety. The volume must be handled through cloud computing in order to save firms money on infrastructure costs like hardware, software and staff time. The velocity can best be handled by a web services organization. And the variety is where geospatial professionals come in: They can turn that data into visuals – a fourth V – through mapping applications.
“When you sift through that data in a traditional format – a spreadsheet or whatever – you might eventually make sense of it,” he said. “But when you map it, it’s immediately apparent. You can see the trend. … When you use Esri or other geospatial tools on the back end, you can do predictive analysis.”
Although it may not generate revenue right away, using big data in the cloud saves firms money on infrastructure. And in a sense, it makes money, too.
Skaw provided the example of a real estate company using geospatial data to map houses on the market. Potential home buyers can enter their criteria remotely on the web and immediately see results on a map. These visuals add value for a real estate agent: They enable rapid analysis of stored data within a geospatial context, which leads to real-time user insight, better decision-making and improved efficiency.
“It’s really all about visualization,” Skaw said. “(Microsoft’s Bing Maps) provides them with aerial views, so you can see the surroundings, and we provide them with oblique views, so they can see building facades and entrances. They can see a specific property, but also get an idea of what the surrounding area looks like, even before they go out there.
“They’re not making money every time somebody hits (a) web-mapping application, but … if particular real estate agencies have this particular service, I’m going to work with them over another.”
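The real estate scenario Skaw describes can be sketched in a few lines. This is a minimal illustration, not how Bing Maps or any agency's application actually works: the listings, field names and function names are all hypothetical, and the output uses GeoJSON simply because many web-mapping tools can display it.

```python
import json

# Hypothetical listings; every value here is made up for illustration.
LISTINGS = [
    {"address": "12 Elm St", "price": 250_000, "bedrooms": 3, "lat": 39.74, "lon": -104.99},
    {"address": "48 Oak Ave", "price": 410_000, "bedrooms": 4, "lat": 39.76, "lon": -105.02},
    {"address": "7 Pine Ct", "price": 199_000, "bedrooms": 2, "lat": 39.70, "lon": -104.95},
]


def matching_listings(listings, max_price, min_bedrooms):
    """Keep only the listings that satisfy the buyer's criteria."""
    return [
        item for item in listings
        if item["price"] <= max_price and item["bedrooms"] >= min_bedrooms
    ]


def to_geojson(listings):
    """Wrap listings as a GeoJSON FeatureCollection of point features."""
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [item["lon"], item["lat"]]},
            "properties": {"address": item["address"], "price": item["price"]},
        }
        for item in listings
    ]
    return {"type": "FeatureCollection", "features": features}


matches = matching_listings(LISTINGS, max_price=300_000, min_bedrooms=3)
print(json.dumps(to_geojson(matches), indent=2))
```

The buyer's criteria act as a filter on the stored data, and the map layer is just the filtered result with coordinates attached, which is why the answer can appear on screen immediately.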
This type of nontraditional data can arrive much more rapidly, from multiple sensors and live feeds. The goal for geospatial professionals is to make sense of the real-time data, Baker said.
“To me, it’s nontraditional geospatial information, information that has a point of reference to it, but may come from a nonconventional source,” he said. “It may come from your mobile device, your cell phone, weblogs. It may come from … transaction records. It may come from real-time traffic monitoring and navigation, and climate, and weather and so forth. It’s a multitude of these newer-generation sensors.
“We need to be thinking about things that we weren’t really thinking about as sources of geospatial information in the past. It’s about being able to value-drive that information from those nontraditional types of sensors. There is an (inherent) knowledge of geography in the data, but it’s very unstructured, and it may be very large.”
That’s big data to a T: very large and growing. Mark Korver, government and education solution architect for Amazon Web Services, said in his presentation at MAPPS that the number of objects stored in Amazon’s Simple Storage Service (S3) was 2.9 billion in the fourth quarter of 2006. That increased to 905 billion in the first quarter of 2012, with most of that growth (500 billion) coming in 2011.
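The growth Korver describes is easier to grasp as a rate. The sketch below works only from the three figures quoted above; the elapsed time of about 5.25 years between the two quarters is an assumption, and the constant names are illustrative.

```python
# Object counts quoted from Korver's presentation.
OBJECTS_Q4_2006 = 2.9e9
OBJECTS_Q1_2012 = 905e9
ADDED_IN_2011 = 500e9
YEARS_ELAPSED = 5.25  # assumption: Q4 2006 to Q1 2012 is about 5.25 years

growth_factor = OBJECTS_Q1_2012 / OBJECTS_Q4_2006
yearly_multiplier = growth_factor ** (1 / YEARS_ELAPSED)
share_2011 = ADDED_IN_2011 / (OBJECTS_Q1_2012 - OBJECTS_Q4_2006)

print(f"overall growth: {growth_factor:.0f}x")
print(f"average yearly multiplier: {yearly_multiplier:.2f}x")
print(f"share of growth added in 2011: {share_2011:.0%}")
```

Under that assumption, the object count grew a little more than 300-fold, which works out to roughly a tripling each year, with just over half of the total increase arriving in 2011.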
Cloud computing can handle that massive amount of data because it has virtually unlimited capacity and allows people from all over the world to work on the same project autonomously.
With big data on the cloud, geospatial firms can instantly have a global presence. But the future is still cloudy. No one knows how big data in the geospatial profession will impact the world – even in as little as two years from now.
“(It has) yet to fully emerge in the geospatial world, which I think is quite exciting, really,” Baker said. “I don’t think we have all the answers here. The other people on the panel talked about how they can manage big data, but the uses of that are still emerging. … It’s the uses of the data that we’ve got to think about in the future.”