
How BIG is your Big Data & Why should you Care?!

How Big is your Big Data & Why should you Care?! An a16z podcast introduced by Michael Coupland, featuring Steven Sinofsky, Prat Moghe, Gaurav Dhillon and Roman Stanek. Steven Sinofsky leads the conversation with the founders of Cazena, SnapLogic and GoodData about the opportunity and variety of ways forward for companies looking to make the most of the data that matters. The transcript was created using Watson Speech to Text and has been released to accompany the audio of the podcast.
by @pk


Making the most of the Data that matters!

As an organisation, getting your data strategy and data act together is more important now than ever. That said, the path to nirvana is neither clear nor well defined. However, a few companies and founders have been swimming in this space for a while and are willing to share their insights with us.

Data these days doesn’t reside in one place. With the advent of the internet, social media and mobile, data is born in multiple places, travels through multiple pipes and resides in multiple stores.

Knowing this, and learning how to harness the untapped insights, can determine whether an organisation lives or dies, grows or shrinks, competes or loses.

Also note that one organisation’s data strategy or data playbook might be completely irrelevant to another, so keep that in mind when you try to apply the lessons others have learnt.

Thank you to Steven Sinofsky (from a16z), Prat Moghe (from Cazena), Gaurav Dhillon (from SnapLogic), Roman Stanek (from GoodData) and Speech to Text (from IBM Watson).

I have enclosed a link to the audio for our listeners who prefer audio.

I have summarised the highlights for those who are strapped for time.

And I have enclosed the transcript which was created using Watson Speech to Text.

Making the most of the Data that matters


  • Data finding Data
  • Data has gravity
  • Excel vs Tableau
  • Pipeline
  • Data Jockey
  • Predictive Analytics
  • BDaaS (Big Data as a Service)
  • Connect Big Data faster across Premises, Apps, Things
  • Analytics Distribution Platform


Welcome to the a16z podcast. I’m Michael Coupland. Every organization these days is clear about the need to get its data act together, but that doesn’t mean the path toward data bliss is clear. Data has gravity: it resides in different places at different organizations, on premise, in the cloud, and flowing from external sources. And the rate of change within organizations is always different, so an approach to handling data that works for one company may be the exact wrong thing for yours. Steven Sinofsky leads the conversation with three founders, Prat Moghe from Cazena, Gaurav Dhillon from SnapLogic and Roman Stanek from GoodData, about the opportunity and variety of ways forward for companies looking to make the most of the data that matters. Steven Sinofsky kicks things off.

This is super fascinating because, behind the scenes, all of you share a very similar set of problems, challenges and opportunities when it comes to dealing with data. Often what differentiates you from your competitor is how you get the data, what you do with that data, and what decisions you make based on that data. And it’s a world that’s being completely inverted from what we used to know. Data used to be the province of a very small group of people who would generate reports, print them out, move them up the chain and distill them down into something to talk about in PowerPoint slides. And now we have the opportunity, if you build out the right infrastructure, to access that data, analyze it, look at it and make choices, all from a mobile device, all using the cloud.
And so that is the centerpiece of this session. What’s really interesting is that represented among you are CIOs and CMOs, and so we have a sort of supplier-consumer relationship that we want to navigate: the desire for ubiquitous access, the needs of security, the challenge of on-prem, cloud, hybrid cloud, private cloud, and then just the desire for faster, more and better. So to explore this topic I’m super excited to bring up three great executives and founders of portfolio companies, who will introduce themselves as they join us here.

Hi, I’m Prat Moghe, founder and CEO of Cazena.

I’m Gaurav Dhillon, founder and CEO of SnapLogic. I have been in the data business for twenty-plus years. I formerly co-founded Informatica and, as chief executive, built that up into a decent-size public company. SnapLogic is version 2.0 of my journey. Pleased to be here.

And I’m Roman Stanek, founder and CEO of GoodData.

So I was trying to decide where to start with this, and I think I just want to start with a big question that I want each of us to look at, which is demystifying the “big” part of data and helping people understand, when building out a company in the new cloud, SaaS, mobile world, what is big about big data?

If I could take that? Yeah, sure.

Yeah, so it’s interesting. On one hand, I tend to cringe every time I hear the words “big data”. But on the other hand, if you just look at the world: before Cazena, I ran the product line at Netezza, which was a big data appliance, and if I remember our customers back then, anybody who had three or four petabytes of data was considered a huge customer. Now when I talk to people they talk about mobile data, social data, existing data, and so petabytes is no longer, you know, like being up there.
On the other hand, you talk to many customers where it’s not about volume; it’s about having existing data but just being able to get it together and analyze it faster. So a lot of it is about agility of data. I sort of define big data as a mindset: it’s about being really fast about using data to make decisions. So it’s not just about petabytes of data; it’s about how fast you can leverage data to create business outcomes, and that mindset is what is different now.

So just help us with Cazena in particular: where is Cazena on the stack, or as a business problem? What is Cazena? But not too much of a pitch.

Yeah, that’s always a founder challenge: you ask for the pitch and right away you get “do you solve the world hunger problem?” No. What Cazena essentially is trying to do is move big enterprises to leverage the cloud for their big data processing. What we’re seeing is that large enterprises, CIOs, CMOs, are at this crossroads where the stack is transforming and the cloud is coming along, so there’s an opportunity for a new platform, but they’re all wrestling with figuring out how to use the cloud [inaudible].

So, taking a slightly different perspective, you’re coming at it from above. What’s big about your big data?

Well, it’s big enough. But actually the sort of breakthrough for me on all this is, if you think about the data warehousing industry, which is about a ten-billion-dollar industry, with local successes here in Europe, BusinessObjects, a phenomenal success in the nineties, and various others, Qlik, et cetera. But basically, if you think of the nineties, what we essentially had was an industry around data warehousing and analytics that was fundamentally about the barcode scanner.
Right? Here’s a technology that was invented to help you get out of a supermarket fast: instead of standing in line, you can check out. That began an industry of analytics. Nielsen and IRI would count how much beer was sold: was it more of the local brands, did people drink more Stella or something else? And now we’re going beyond that sort of comparing this versus that, plus the geography. In big data, to me the fundamental breakthrough is bringing together information from multiple places and producing insights where the data finds the data.

So, for example, a consumer packaged goods company looking at the sale of lipsticks would traditionally be looking, in a classical business intelligence way, at price, volume, geography, sector. But when you bring in a social media stream and see the discussion around that particular product, you find that out-of-stock is a big deal. So what big data provided is the insight that certain colors of lipstick being out of stock was a huge issue for people. And, more importantly, that it was out of stock because people were ending the life of that product based on volume, whereas they should be, and now they are, looking at the lifetime value of the product, particularly for lipstick shades that apply to minorities, or to someone who may be a lower-volume purchaser but who, once they find the right shade, is going to buy it for life. So to me, data finding the data is the magic of big data. That is the promise that is being fulfilled in a productive way that we never could in the nineties [inaudible].

But with GoodData, part of it is actually putting that in front of your typical member of the marketing team, the sales team, the field. How does that fit into big data?

I absolutely believe that, with all the investment in Hadoop this and Hadoop that, most companies are still [inaudible].
Hadoop, or the data warehouse, or whatever, is a place where data goes to die, and our goal is to actually change that. We see GoodData as the last mile of analytics. That’s the last mile that connects your users, your business partners, your business networks, internal and external audiences, with the data in Hadoop and the data in data warehouses and so on. And it’s kind of non-trivial, because we all kind of know how data looks, but our customers are people literally in the field, people who manage stores, people who manage shops and so on, and we need to deliver data to them in a way they can actually understand. There’s a big impedance mismatch between the way the data is in the data warehouse and in Hadoop, which is actually a huge advantage of Hadoop, that data can be stored in so many ways, but it doesn’t help somebody who manages a shop to actually understand it. And so our goal is to be that last mile of analytics. And the way we actually do it is that we let our customers, big banks, big telcos, big insurance companies and so on, white-label GoodData and sell it under their names. So we have about half a million users, and very few of them actually know they use GoodData, because they see somebody else’s logo. But it’s okay as long as they get access to the data, which is kind of the biggest problem today.

So what I find fascinating about trying to navigate this space is that in most corporations finding the answer to any question is often incredibly difficult. And yet I want to know anything: is a movie ticket available, how many cars are available to drive me somewhere, can I get a plane ticket, a hotel? As a consumer I have this immense access to data.
And so I think, how do we break down that barrier? Because, representing the CMOs in the audience, they know that all the bar codes are being scanned, they know that you’re doing great reporting, they know it’s there, but there’s some impedance mismatch. What is it?

It’s good: I ask a question and they’re fighting over answering it. So you can each give your perspective.

I sort of have a contrarian point of view. I think it’s not about technology, first off, and it’s not fundamentally about saying “I want to ask any question I want”. Because the moment you take that approach it then becomes, like you were saying, a Hadoop store where you can ask any question, as opposed to saying, OK, what are you really going to get done, what’s the business outcome? And so what we’ve seen when you look at many big data projects is that the ones that fail are the ones where people have taken the approach of saying “I want to collect all the data and then figure out what questions I can ask; I want to look for hidden patterns”. As opposed to people who look at it and say “I’ve got a marketing problem: I don’t know how to track my existing customers so that I can upsell; I want to convert an existing customer much better; I want to give a better experience”. Whenever they’ve approached it with a business problem, and then asked what data they need to bring together to answer that question, they’ve had better luck in terms of formulating a narrower scope for those projects and asking those questions. Any time it becomes “collect the data and figure out what it is”, you start having those issues.

So it’s sort of the top-down approach versus the bottom-up.

I have a different view. [inaudible] We have a five-pound bet on who disagrees first.
So the vantage point we have, and what I’ve come to believe, again looking at the nineties and looking at this decade in this century, is that it is not that we a priori know what the problems facing us are. We don’t, because there’s too much going on, there’s too much change, all the way from the world economy, recession, the entrance of competitors; the intensity of change is too great. And I think some of the hangover that the data industry has had is this whole data warehousing batch hangover. The truth is we’re living in a world of streams of information, and consumers, particularly people who came into the workforce this century, millennials, have an expectation of wanting stuff now. So the vantage point that we see is that combining streams and, to use an overused word, mashing things up and providing new insights is hugely important, and that is dynamic and it is done in an interactive way. Broadly speaking you have some kind of context, are we going to sell more widgets, are we trying to kill the competition or whatever, but you don’t know exactly how.

How do you engage? I’m only asking that because I think part of it is how we get from a model of, like, every Friday the analysts show up and find the products that aren’t selling well, or where I need to stock something, to where there’s this exploration. How do you enable that?

That’s where I was actually going. I actually believe that the biggest problem with all data, big data, small data, any data, is the rate of change of business. You don’t want to be doing the same search every Friday. And in the current setup, IT is supposed to govern and curate data, and business is supposed to do exploration, and it doesn’t work.
For IT to really be in charge of the data, the tools and the turnaround times and so on are way too long for business to actually depend on it. And so business then goes to Excel and other products you might be familiar with. And so that’s the biggest problem. I’ve been in so many meetings with the CMO and the CIO where it’s like cognitive dissonance; they don’t have the same kind of shared interest. And so I actually believe that your example, that as a consumer I have so much information and so on, is actually not a good example, because the information I get is curated by Intuit, curated by Google and so on, and business people don’t have that kind of experience in any company. That’s why they go to Tableau and Excel and Qlik and so on, because they have essentially given up on IT getting them that information. It is in Hadoop and it is in the data warehouse and so on, but there is this, again, last mile. And I’m not saying that we are able to solve it generically at GoodData, but we are at least solving it for certain types of problems, getting data to business networks, getting data over the [inaudible]. There will never be one solution for everything, because the rate of change in business is way too high.

But I feel like we’ve just now got to talk about this exploration idea, and then, coming back to Roman’s examples, what have you been working with?

Speaking about how data changes business models, there is a fast-growing restaurant in the U.S.; this is kind of like the next Chipotle.
And the guys there grew up in Chipotle, which again is a pretty hot chain, but they basically decided to build from the ground up completely differently. The way these guys are thinking about data: people come in to these guys for, you know, salads, and they look at you and say, this is Steven, Steven likes eggplant, he’s vegetarian, you know. Their whole idea is that if I could profile people coming in at noon, as they come in, I’ll basically figure out how to build the right product for you. So it’s fresh, it’s customized for you, but you still want to do that at scale. So there’s a whole new breed, maybe you heard this this morning, the “full stack” companies, with verticalized experiences, everything that matters. And where I think it’s going is that all that data gets surfaced; it’s in a product, it’s there in some form. If it can be surfaced to the right people, you get magic.

So let me ask you, though: what you describe is not like a person doing a report. So help me understand: are there elements of a whole new style of analysis, based on machine learning, based on incorporating other data sources? How does this fit? Because I think this leap is super critical to understanding why the new tools have to be cloud-based and why they do what they do.

So it’s not about what’s not selling on Friday, but why. It’s more about what we could do, based on the data that we have, that would help us be more successful, without doing traditional business intelligence. And it doesn’t matter whether we do it on premise, in Excel, or in the cloud: the traditional rear-view-mirror view of business intelligence has some element of return, but it’s also at some point been well done. There are ways to improve it, but we’re getting to a point of diminishing marginal returns.
Where the returns are is, well, the people in technology are doing predictive analytics and trying to figure out what the data is telling them. And in that they’re using machine learning algorithms and certain kinds of open source algorithms, many of which come from Berkeley’s AMPLab, and they do categorization, regression and so on. The University of California, Berkeley has a whole variety of some of the leading technologies, like Spark, and you can download them and use them. But you need the people behind it. So what we have now is a new population of user who is using the data from Hadoop, and that person is the data scientist. This is now widespread outside of financial services. Goldman Sachs and Morgan Stanley always had quant jocks. Not everybody has quant jocks. And how can we obliterate the barrier to enabling that person, with information in near real time, to do a better job of predicting their company’s future? I think the battle has shifted to that.

So, even though I don’t believe it’s about data scientists, I absolutely believe in what you said about machine learning and so on. Unfortunately, most companies are not big enough to have a big enough sample for machine learning. What makes the big banks, what makes Google, what makes Amazon and so on is that the data sample is so big they can actually learn from it. A typical company will look at, you know, a thousand invoices, and there is nothing to learn from. So I actually believe that’s why analytics done in the cloud has actually a lot of value, because we see data across tens of thousands of companies, and you can actually do machine learning on massive data sets that individually don’t actually mean anything.

Roman, desktop analytics, that’s reports? No, no, it’s not predictive. [inaudible] Let me.
But let me turn around and ask a little bit differently. Two people mentioned Excel, so that was my cue to talk about Excel. What I think is so interesting is that if I were to query the room, I have a feeling we would find out that most people find Excel the most valuable analytical tool that they’re using. And there are maybe two reasons; let’s just touch on them. One of them is just that it’s the one that they can use, that does what they want. But another one is that the gap between the CMO and the IT organization is often that there’s data missing, that there is some source, it could be geographic data, it could be, wow, this report doesn’t even list all the stores we have, all the outlets, or it doesn’t have our weblogs, or there’s just another part of the data that isn’t yet in Hadoop in some way. And so much of the job is just bringing that together and then applying that knowledge. I do think that’s what differentiates; the difference between Minneapolis and Bentonville in the U.S. is not necessarily the products they sell.

I’ll start this one, and I’m going to be short. I always believe there are two types of people in the world:
People who can use Excel pivot tables and people who cannot use Excel pivot tables. We all belong in one of the categories, and I’m one of the people who can’t.

We worked really hard to make pivot tables [inaudible].

And so your example actually assumes that people know how to use Excel pivot tables. That’s why some of the most frequently used data analytics tools are extremely basic, because they look and feel like a sheet of paper, a two-dimensional sheet of paper. So that’s the problem in analytics: on one hand we have very complex systems, with Spark and Hadoop and so on, and yet most of the people actually using those tools don’t like the abstractions; they like a sheet of paper, cells and columns. And so it’s very difficult to bridge that. That’s why you actually see that the analytics that is successful is either embedded, so it’s very technical and industry-specific and so on, or people are using Excel. I believe that all of us compete with Excel, with somebody who sits with Excel and then sends it over email.

With Microsoft a good partner and investor in our company, I’m of a different point of view on this. Just a show of hands in this room, and don’t feel afraid: how many of you use Tableau in your shop? Just two hands? I find it hard to believe.
Yeah, so my view is that tools are really hard to change, because tools usually embody a business process, they embody insight. My view is that for large companies, particularly the ones that are analytically driven, the way for them to really leverage fast-changing data and a fast-changing business is that they’re going to try to figure out how to get there, like you were asking in this question. So the legacy data flow is probably going to stay on premise for a while. Now the question becomes: how do you leverage these new technologies, how do you leverage the cloud? And so it’s going to be an augmentation strategy, where there’s this concept coming up which is called the pipeline. The pipeline idea is that data is like a river: it flows. And so maybe some part of the data, whether it’s external or internal, flows to the cloud, where certain kinds of processing will happen. And then over time what will happen is you take that data and maybe land it in certain places where data scientists can analyze it. Some data will continue to go to Excel, some data will continue to go to Tableau, to BI analysts. So it’s not going to be a way where everything just disrupts overnight; people who do things are going to keep doing them. I’m not going to pick on any tool, but the point is every technology is good for doing something; it doesn’t subsume everything else. Spark doesn’t subsume data warehousing. Hadoop doesn’t subsume streaming. So they’re just different technologies for different jobs.

Speaking of subsuming, which is a great way to ask this, because I do want to recognize that the CIOs in the room are dealing with a very real challenge and real opportunity, which is, I’m guessing, that for most all of the people in this room their system of record is an on-premise, structured, SQL, Oracle-based system.
And for all the CMOs, that’s the starting point for most of the data that they need to get to. How do we help the customers in the room bridge that reality that they deal with? Or, said another way, where’s the opportunity, how do they start a new project, what do they do?

So we get asked this all the time. Sometimes those people use my former products and so on. So look, here’s what we recommend. First of all, I believe the CIO and CMO have kissed and made up in a big way. Here they are together, sometimes at the same table; nobody’s hit somebody on the head with anything yet. But beyond that, what you have is this concept of a pipeline. At least, it was called the information factory in the nineties; it’s a pipeline now because it has real-time streaming attributes. But people should be thinking about being able to use the new price-performance of Hadoop and Spark to obliterate their traditional data warehousing appliance, not unplug it. But the rising tide of the data lake, we think, will drown out the data warehouse in the fullness of time.

So that’s an interesting exponential change; that’s probably a huge opportunity that might be worth thinking about for a second, which is: even if you have a petabyte in your structured stores today, if you turn on the right sources within the company you’ll very quickly have lots more than that to potentially work with, and that might end up being even more valuable.

And I actually want to answer your original question: how do we deal with the fact that most data is on premise? We are a cloud-based company, and I need to grow my business.
So all of our focus is actually on data that goes across the firewall anyway. We really help companies to monetize their data and deliver analytics to business networks. Our biggest customer is one of the large credit card issuers, and they have merchants, they have issuing banks, they have acquirers, so most of their audience for data actually sits outside the firewall. So instead of emailing data in CSV files or Excel files and so on, we actually helped them build a kind of analytics distribution platform. And I believe that we don’t have time to wait for companies to move their primary data to the cloud; that may not even happen. But more and more data sets are being used in these across-the-firewall scenarios, in mobile scenarios and so on, and that’s where you see ninety percent of all our focus.

I think one of the things that’s so interesting is that in general there’s just more data outside of your organization than there is inside, and a key part of the tools is just how you connect those.

I think there are a few things going on. There is definitely a shift towards the cloud. It all depends on the vertical: some verticals are, I’d say, more skeptical, because there are regulations that basically say certain data cannot physically leave. But in most companies, even in financial services, we’ve noticed that they’re very eager to explore the cloud, either for external data or even for some of their internal data. Not all data is equal: some data is PII, some data is not. And there are newer technologies that allow you to encrypt data in motion and at rest. The cloud has matured, and there are very sophisticated security controls.
You’re seeing that it’s not as religious as it used to be, and that’s one of the reasons why you’re seeing CMOs and CIOs come together: because there are platforms whereby the CIOs can basically now stand up projects to move data, provision things in AWS, on Redshift, on Azure, and run these projects so that they feel like they’re part of a private cloud while they’re really running on public cloud. So it’s changing quickly.

Yes, and it’s always one of the challenges in these panels: we want to address the breadth of needs and realize that people are always going to be in a different place in making this change.

At the same time, we are in Europe, and what makes for this kind of slowdown is local regulations and so on. Instead of putting up one data center for our whole user base, we need to build multiple data centers, and that kind of fragmentation and balkanization of data will continue. It’s going to be more and more difficult for cloud vendors to manage that and understand all the regulations and so on.

So I think, look, the fact is data has gravity, and you have to respect it. If your data is on premise, you should probably put your Hadoop or other kinds of analytics on premise. If your data is in the cloud, say you’re using a website hosted at Amazon or at Deutsche Telekom or Swisscom in Switzerland, then obviously it makes more sense to have the analytics located there. But I think in all cases what is blindingly clear from looking at the various people we talk to is that you have to have a predictive analytics capability. However you do it, on premise or in the cloud, you’re dead without it. And certain things make sense to migrate to that and certain things do not. If you’re a credit card issuer, conflict resolution, how a customer returns a product: there is no money in trying to put that into a modern system.
If you’ve somehow got it going and worked out the disagreements, leave it where it is. But trying to understand how a social media stream interacts with the product, how people are complaining on Twitter rather than to your call center, is of great importance. So we think that is probably what people should do: get predictive analytics in place and start bringing in, based on data gravity, the right sort of technologies to solve that issue.

Yeah, I think you’ve talked about technologies and tools. The other thing we’ve noticed is that it’s all about people. The shift is a transformation, and transformation always begins with leaders. And you notice that it’s CIOs, CMOs, forward-thinking guys. Sometimes it’s CEOs that help push that; sometimes it’s CDOs, chief data officers or chief digital officers. There is one great guy in the audience, Osama [inaudible], I don’t know where you’re sitting, but he’s had those experiences before, and he’s in a financial services company, and he’s thinking about how to bring those experiences. So I think it’s all about getting those people and then bridging [inaudible].

Awesome. Well, thanks everybody for a lively discussion; I appreciate the insights on data.
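
For readers who think in code, the “data finding data” lipstick example from the discussion can be sketched in a few lines of Python: a volume-only view would retire slow-selling shades, while a social media stream shows that some of those same shades are drawing out-of-stock complaints. This is only an illustrative sketch. The shade names, sales figures, posts, the `volume_cutoff` threshold and both function names are made up for the example and do not come from the podcast.

```python
from collections import Counter

# Hypothetical weekly unit sales per lipstick shade (the classic BI view).
weekly_units = {"ruby": 900, "coral": 750, "mocha": 120, "espresso": 95}

# Hypothetical social media posts that mention a shade.
social_posts = [
    {"shade": "mocha", "text": "Mocha is out of stock again, third store I tried"},
    {"shade": "espresso", "text": "why is espresso always out of stock??"},
    {"shade": "ruby", "text": "love the new ruby shade"},
    {"shade": "mocha", "text": "Found my shade for life, but mocha is OUT OF STOCK"},
    {"shade": "coral", "text": "coral is ok I guess"},
]


def out_of_stock_mentions(posts):
    """Count, per shade, the posts that complain about availability."""
    counts = Counter()
    for post in posts:
        if "out of stock" in post["text"].lower():
            counts[post["shade"]] += 1
    return counts


def shades_at_risk(units, posts, volume_cutoff=200):
    """Shades a volume-only rule would retire that still draw
    out-of-stock complaints in the social stream."""
    mentions = out_of_stock_mentions(posts)
    return sorted(
        shade
        for shade, sold in units.items()
        if sold < volume_cutoff and mentions[shade] > 0
    )


if __name__ == "__main__":
    print(shades_at_risk(weekly_units, social_posts))
```

Neither data set flags the problem on its own: the sales table only says the shade sells slowly, and the posts only say people are annoyed. The signal appears when the two are joined on the shade.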
