Thursday, October 4, 2012

Are we ready for the next On-Ramp? Big Data, Analytics, and Human-Centric Computing


This session consisted of a panel with speakers from Intel, Cisco, Facebook, and Georgia Tech.  The session started with an overview, then each presenter gave a 'TED-like' short presentation.

Overview
To start off with, what is meant by Big Data?  Some definitions are:

  • more than 100 TB
  • anything that doesn't fit on your servers
  • received as ultra-high-speed streams (e.g. Twitter, Facebook)
  • growing at > 60% per year
  • deployed on scale-out infrastructure
  • two or more data formats and/or sources

These all contribute to a characterization of Big Data:

  • variety is huge
  • volume is huge
  • value is heterogeneous
  • velocity is quite high

Two additional features are viscosity and variability (the inherent uncertainty in the data, which means looking at the errors in any analysis).
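
To get a feel for the "growing at > 60% per year" definition above, here is a small Python sketch of the compound-growth arithmetic.  It was not shown in the session; the 10 TB starting size and the function names are hypothetical, chosen only to illustrate how quickly that growth rate reaches the "more than 100 TB" threshold.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for a dataset to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def projected_size_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Dataset size in TB after `years` of compound growth."""
    return start_tb * (1 + annual_growth) ** years


if __name__ == "__main__":
    growth = 0.60    # the 60% per year figure from the list of definitions
    start_tb = 10.0  # hypothetical starting size, for illustration only

    print(f"Doubling time: {doubling_time_years(growth):.2f} years")
    for year in range(6):
        size = projected_size_tb(start_tb, growth, year)
        print(f"Year {year}: {size:.1f} TB")
```

Under these assumptions the data doubles roughly every year and a half, and a 10 TB store passes 100 TB in its fifth year.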

Now that we know what Big Data is, what do we mean when we talk about analytics?  Some examples include: social network analysis, IT infrastructure optimization, weather forecasting, life sciences research, and fraud detection.  All of these impact humans, human society, and our relationships, especially in how we use and engage with data.  For example, 750 million photos were uploaded to Facebook in just two days.

Janet Ramey
The first 'TED' speaker was Janet Ramey, who works at Cisco.  She was part of the early building of 'the Web' - moving data and organizing it.

Janet talked about Technical Assistance Center Engineers - not call center agents as you might expect.  These engineers have BSc or MSc degrees in CS, speak over 17 languages, and deal with problems ranging from your website being down to your company being down, or even your entire country being down.  They value timely, accurate issue resolution from highly qualified experts with a rich history of innovation.

How is this related to Big Data?  Cisco uses Big Data to improve their engineering: they forecast and schedule TAC engineer resources, continuously measure and adjust the schedule, and simulate new models using data from customer profiles, case data, and the profiles of their engineers.

Janet closed by telling us that, in her view, to be technical leaders we just need a lot of curiosity, followed by training and building knowledge.

Moira Burke
Moira is a data scientist at Facebook.  She spoke about her experience running usability studies at Facebook that ask 'what's the relationship between Facebook use and well-being?'

There are three classes of FB use: directed communication, passive consumption, and broadcasting.  It turns out that the first of these is associated with increases in social support.  However, there was no improvement in well-being outcomes with passive consumption or broadcasting.

How big are the increases?  Really big!  With major life events, the only non-negligible increase in feelings of well-being is when there is a death in the family.  This is the only life event that is matched by directed communication on Facebook.

There are design implications of this research for Facebook, too.  They are now encouraging direct communication through their UI by moving comments on feed items inline rather than making you navigate to a different page.  There are also friend recommendations to help you engage more with people close to you.

Eva K. Lee
Eva has a background in mathematics.  Although she works in theory and computation, she is very application-driven.  She works with homeland security on vaccine development, disaster response, epidemics and pandemics, and response to biological/nuclear/chemical/terrorist attacks.  She develops predictive models, optimization models, pattern recognition, and machine learning.  She stresses building models that are realistic and deal with real scenarios, helping users in the real world.

I found the Facebook talk the most interesting because it was short, focused on a single study, and showed concrete methods and results rather than an array of problems to be solved.
