Big Impact with Big Data Part Two

This is the second post in the series describing my keynote at the Big Data Smart Knowledge Forum, held at Tianjin University of Technology on October 13th. You can find the first here.

In the second half of my keynote I want to share some applications for big data that I see. As I noted earlier, I am not a data scientist yet. I have worked my way about halfway through this course sequence and find myself improving my R skills. I have the statistical skills to handle factor analysis and logistic regression. This makes R easier for me. I know nothing (yet) of TensorFlow, Hadoop, and Python.

The machine learning… That’s your job. That’s what I mean by interdisciplinary.


My first foray into Big Data traces back to my PhD work at the New Literacies Research Lab at the University of Connecticut. In conjunction with Clemson University we conducted a formative design research experiment exploring how students locate, evaluate, synthesize, and communicate information when reading on the web.

This also provided me with my first chance to code, beyond HTML markup, since I quit coding BASIC in 6th grade. We built a simulated closed internet and a fake social network to assess students' online research and comprehension abilities. I spent countless hours doing instructional design, iterating on items, and coding XML files.

What my colleagues found should startle you. Even after controlling for prior knowledge and reading ability, there were differences in online reading ability based on race and economics. Once again we were recreating the inequities of the past.

If you look at the results you would see these differences across all the subsets of skills. Look at evaluation in particular. Why does nobody challenge what they read and write online? Anyone want to take a guess? Think about it. How do we teach students? With textbooks, and at the end of each chapter there are questions. Do you know how you answer the first question? By finding the first bold word in the chapter.

There are no bold words in life. We set students up for failure and create reward systems for the quick answer. We need what Ian O’Byrne calls healthy skeptics. We cannot simply accept what our elders say as true. Authority is not a constant; it ebbs and flows like most variables.

The ORCA (Online Research and Comprehension Assessment) study used a large sample, but we could apply these tools to Big Data. What if we built web plug-ins that scaffolded understanding? The log files could shed light on the best methods for teaching folks how to read, write, and participate on the web.

Mozilla Thimble

I am a contributor to Mozilla, the good people behind the open source browser Firefox. We developed a tool to teach coding called Thimble. It is a web-based editor that shows you your changes in real time.
My role was to test out early versions with my students and provide feedback on design. I also helped to write curriculum used by thousands across the globe.

I wanted to show you some of the insights I have drawn from the web analytics, but Google is currently blocked in China due to the seating of the next Congress. Trust me, many people across the globe use Thimble to learn to code.

Yet we can do so much more. We need help from those who have the expertise to design the learning analytics. We have so much data.

In the United States we have a saying, “finding a needle in a haystack.” This means looking for a sewing needle in a pile of grass or hay. As Bryan Mathers illustrates, adding more needles to the haystack does not make finding the needle any easier.

I wonder whether we can build predictive code coaches using web analytics, such as how many times the tutorial is clicked, or regression analysis that models the pathways students take. Smart tutors can help us teach students how to code.
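To make the idea concrete, here is a minimal sketch of what such a coach might start from. It fits a one-feature logistic regression that predicts whether a student gets stuck from the number of tutorial clicks. Everything in it is an assumption for illustration: the click counts and the "stuck" labels are made up, the function names are hypothetical, and nothing here reflects real Thimble analytics.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(clicks, stuck, learning_rate=0.1, epochs=5000):
    """Fit P(stuck) = sigmoid(w * clicks + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(clicks)
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(clicks, stuck):
            error = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict_stuck(model, tutorial_clicks):
    """Return the estimated probability that a student is stuck."""
    w, b = model
    return sigmoid(w * tutorial_clicks + b)


# Hypothetical log data: tutorial clicks per session, and whether the
# student later abandoned the project (1) or finished it (0).
sample_clicks = [0, 1, 2, 3, 6, 7, 8, 9]
sample_stuck = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(sample_clicks, sample_stuck)
if predict_stuck(model, 8) > 0.5:
    print("This student may need a nudge from the coach.")
```

A real coach would need far more than one feature, and whether tutorial clicks actually signal trouble is an empirical question for the log files to answer.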

Social Network Analysis

I also find social network analysis to be another interesting Big Data field. I have taught and taken many MOOCs. In fact I love them. Due to MOOCs I can say that, unlike Steve Jobs, Mark Zuckerberg, and Bill Gates, I haven’t dropped out of just one Ivy League college. I have quit them all.

We must remember the M for massively in MOOC never meant size. When David Cormier coined the term MOOC he meant for the word massively to modify open. It had nothing to do with ten or ten thousand participants. Instead MOOCs were meant to make learning open and visible.

We turn to Social Network Analysis to understand these learning connections. Search is now social. Learning has always been social but it’s now amplified. When I need the answer to a question I don’t go to Baidu or Google. I go to my networks and ask the experts who reside within my nodes.
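One of the simplest measures in social network analysis is degree centrality: the share of other people in the network that a given person is directly connected to. The sketch below computes it from a list of who-replied-to-whom pairs. The names and the reply data are hypothetical, and a real analysis of a MOOC would likely use a dedicated graph library and richer measures.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return each node's degree divided by the number of other nodes."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore people replying to themselves
        neighbors[a].add(b)
        neighbors[b].add(a)
    total = len(neighbors)
    if total < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(peers) / (total - 1) for node, peers in neighbors.items()}


# Hypothetical reply pairs from a course discussion stream.
replies = [("ana", "ben"), ("ana", "chen"), ("ana", "dee"), ("ben", "chen")]

centrality = degree_centrality(replies)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
```

The people with the highest scores are the ones most others go to, which is one rough way to spot the experts in a network.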

Open Badges

We can also apply Big Data to open badges. Open badges represent a new way to credential learning. Instead of handing out paper diplomas we can now connect the criteria for learning with the evidence students create on their journey. Metadata lives behind each badge and we can use this to uncover new pathways for learning.

Hundreds of thousands of badges have been issued since open badges began in 2011.

Bots, Badges, and AI

I am bullish on chatbots. I believe they can complete many of the low-hanging assessment tasks that take up so much of my time. For example, in my current classes I have created a pathway called “Academic Blogger.” One of the badges requires students to use headers and images in their posts. A machine should identify this work and free up my time to provide more holistic writing advice to students.
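A check like that is well within reach of a small script. Here is a minimal sketch, using only Python's standard library, that scans a post's HTML and reports whether it contains at least one header and one image. The class name, function name, and thresholds are hypothetical; this is not how any existing badge platform works.

```python
from html.parser import HTMLParser


class PostScanner(HTMLParser):
    """Counts header and image tags in a blog post's HTML."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headers = 0
        self.images = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in self.HEADER_TAGS:
            self.headers += 1
        elif tag == "img":
            self.images += 1


def meets_badge_criteria(post_html, min_headers=1, min_images=1):
    """Return True if the post has enough headers and images."""
    scanner = PostScanner()
    scanner.feed(post_html)
    scanner.close()
    return scanner.headers >= min_headers and scanner.images >= min_images


# Hypothetical student post.
post = '<h2>My Research</h2><p>Some text.</p><img src="chart.png" alt="chart">'
print(meets_badge_criteria(post))
```

The script only confirms that the elements are present. Judging whether the headers organize the argument or the image adds anything is still the teacher's job.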

I believe the stream is the future of online education. We can mix in smart tutor bots that can provide help when students need it. Take Thimble again as an example. Why can’t students ask a chatbot how to change some bit of their CSS?

Bringing it Home

Overall I think advances in Big Data and learning analytics can help us make our cities our campus, not just the classroom. For too long we have had the shackles of the Carnegie hour thrust upon us, as if time in the class equates to learning.

Open Badges, Big Data and bots can create new partnerships between the private and public sector. Dual enrollment classes could pop up between universities and secondary schools. Students could get credit for activities they conduct as volunteers or at the public library.

Big Data has the potential to help us recognize what great thinkers have always known. Learning happens anytime and anywhere.
