Big Data vs SDI? It's not an either/or.


4 comments posted
David, in just the few years since you posted this article, I think we’ve seen progress in the development of standards around RESTful APIs and ways into social media content repositories. But as of 2014 I still don’t see the kind of clean global metadata standard that would help integrate data more easily across disciplines. What I see emerging lately are on-premise and cloud platforms, like IRI (CoSort) and Splunk respectively, that discover, prepare or mash up, and analyze data in both unstructured and structured feeds or repositories, regardless of whether they are unified. Please post an update on the SDI vs. Big Data debate.

Posted by Harinath Prabhakaran (not verified) on Wed, 2014-09-03 10:51

(Posted by Lance McKee for Dr Michael Sanderson)
As an end user I want two things when I come to assemble data for an activity or make a decision. First, I want to be able to access data that is catalogued (so I can work out semantically whether these data are relevant) and indexed (ideally in a form that lets me link my own data to the data I discover). Second, I want to be able to get at these data with a search tool that then enables me to conflate them with data that I own (data in my own data warehouse). So big data is both structured and unstructured. If it's structured in an SDI (and I accept it may come with Digital Rights Management access rules) and I have a set of tools that I use, I want to know these will work with the SDI. If the data are unstructured (as much of social media is), I ideally want to use the same tools to add sense to them and inform the decisions I make. The web as we know it was for documents. Space and time are not documents. These need to be added to the web in a form which allows space and time to be leveraged if we are to move forward. Things are developing rapidly and at such times chaos rules. So we have RDF/SPARQL and I see the emergence of (from Bing, Yahoo and Google) as a parallel move to add structure to unstructured web documents, so there is a need for standards to emerge or I won't be able to traverse any data successfully, Big or otherwise.

Posted by Lance McKee on Tue, 2011-11-08 09:36

Requirements for both would nevertheless argue for a refoundation of the DCP architecture model, in recognition of new tools and frameworks.
Such a refoundation would simply empower the business-to-business web services that we are used to implementing for software and data model interoperability, while opening the door to this acknowledged new 'social web' computing era for geospatial domain practitioners.
A refoundation is needed simply because one fundamental aspect of big data analytics is the computing infrastructure.
This computing infrastructure is nothing like the way SDIs or EO Ground Segments deploy resources. It would more clearly embrace the distributed computing platform of the Web 2.0, which so many big players have implemented to address reliability, scalability and performance requirements while coping with the 'data deluge' phenomenon!
Thanks, David, for sharing this very interesting summary.

Posted by Hervé Caumont (not verified) on Sat, 2011-10-29 07:08

I agree it's not an either/or - life never is. Standards like OGC's are important to enable data sharing between systems, much like travellers need power adapters to plug their appliances into foreign sockets. And of course you have adopted KML as a standard, which should ensure that lay users, too, will share spatial data with a semblance of intelligence. I do realise there are applications where structure and accuracy are very important (when I worked in the oil industry a positional error of just 20 metres could cause millions of dollars in losses). The point I was making is that anyone who, like me, grew up in the traditional geodata industry will need to let go of some of our desire to structure and standardise everything, or we will miss the boat in the new world of big data. But of course both worlds have something to bring to the table, so I think we are all agreed on that.

Posted by Thierry_G (not verified) on Mon, 2011-10-24 08:14