Big Data

1) Big Data (DUE: NOV.13)

In this assignment, you will experiment with Big Data and what media theorist Lev Manovich calls ‘cultural analytics’. The practical side of this one is slightly more demanding than the others, which will be factored in during grading (12% for practical component, 8% for reflection).

As we will discuss in lecture, Big Data is so hot right now. Many scholars and popular journalists are excited about the possibilities presented by the enormous amount of data that users generate online and through mobile devices and apps. Proponents claim that these new data sets will tell us new things about culture, society, history, politics, and technology: basically, everything. At the very least, Big Data shows us things we already knew in a different way, and helps us to make connections that we might not otherwise make.

The first step in the practical component of this project is to find some Big Data. There is an endless supply out there, so get creative! You can use anything from your personal blog’s web traffic, to celebrity X’s retweets and mentions on Twitter, to your favourite hockey player’s ice time or Corsi rating, or even larger sets of public data such as those found at the following links:
• https://www.quandl.com/collections/society
• http://www.reddit.com/r/datasets/
• http://www.kdnuggets.com/2011/02/free-public-datasets.html
• http://topsy.com/
The most important thing is that you find a data set that piques your interest and will tell an interesting story, or which poses an intriguing problem.

Next, experiment with visualizing your data using one of the many free tools available online. Here are some suggestions (there are many others):
• https://gephi.github.io/
• https://plot.ly/
• http://www.cytoscape.org/
• http://www.tableau.com/ (sign up for free trial)
• https://cartodb.com/ (sign up for free account)
Be sure to consult the School of Data tutorials on Big Data and visualization (see the required reading list). These will help you wrangle and work with your data. We will also set aside time in seminar to workshop some of these programs. Be sure to take screenshots, or otherwise ensure that you can share your data visualization with your TA.
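As a rough illustration of what "wrangling" can involve before you open one of the tools above, here is a minimal Python sketch that counts how often each value appears in one column of a CSV export. The sample rows, the column names (`date`, `kind`) and the helper names are all made up for illustration; your own data set will look different, and the tools listed above can do this kind of counting for you.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a real export (not real data).
SAMPLE_CSV = """date,kind
2014-11-01,retweet
2014-11-01,mention
2014-11-02,retweet
2014-11-02,retweet
2014-11-03,mention
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, column):
    """Count how many rows share each value in the given column."""
    return Counter(row[column] for row in rows)


def text_bars(counts):
    """Return one crude text bar per label, sorted alphabetically by label."""
    return [f"{label} {'#' * n} {n}" for label, n in sorted(counts.items())]


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    for line in text_bars(count_by(rows, "kind")):
        print(line)
```

Counting by `date` instead of `kind` would give a per-day total, which is the sort of table a charting tool can plot directly.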

For your critical reflection, think about the story told by your Big Data visualization. Were you surprised at patterns that emerged? How did visualization change your perception of the data? What are the strengths of visualization? What are its limitations? What kinds of conclusions can you draw? How might these be misinterpreted? Who might be interested in your data set? Governments and NGOs? Corporations? Small businesses? How might they put it to use? What kinds of problems would it help to solve? What kinds of problems might it create or make worse?

Be sure to make reference to Andrejevic’s article – how does your data set either complicate or support his concern about the way we encounter and work with the glut of information at our fingertips? If Wasik were to write the “Agent Zero” chapter today, how might he reflect on the relationship between Big Data and the digital footprint generated by his invented online persona?

The goal here is not necessarily to say whether Big Data is ‘good’ or ‘bad’, but rather to explore its benefits and limitations through practical application.

I got this from my professor:

But remember, too, that the rationale behind these assignments is that you will research and learn about the relevant areas on your own, or by talking with each other, rather than simply following our exact instructions. It’s a practice-based exercise in learning by doing.

Finally, here is a list of some further readings and resources that will help you to understand data collection and visualization.


• “Big Data or Too Much Information?” from Smithsonian Magazine
• http://www.smithsonianmag.com/innovation/big-data-or-too-much-information-82491666

• Web Scraping
• http://blog.hartleybrody.com/web-scraping
• http://codecr.am/blog/post/7

• “Long Data”, not “Big Data”
• http://www.wired.com/opinion/2013/01/forget-big-data-think-long-data

• World data supply is 2.8 zettabytes – 0.5% available for analysis and visualization
• http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume

• Beware the Errors of Big Data
• http://www.wired.com/opinion/2013/02/big-data-means-big-errors-people

• How President Obama’s Campaign Used Big Data To Rally Individual Voters
• http://www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally-voters

POSSIBLE DATA SETS


• Sample data on Google’s BigQuery service
• https://developers.google.com/bigquery/docs/overview
• https://bigquery.cloud.google.com

• Wikileaks’ “Afghan War Diary”
• http://www.wikileaks.org/wiki/Afghan_War_Diary,_2004-2010

• Craigslist – see this tutorial:
• http://codecr.am/blog/post/7
• …or format your search like this:
• http://YOURCITY.craigslist.org/search/sss?query=SearchString
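If you want to generate several search links at once, the URL pattern above can be filled in with a few lines of Python. This is only a sketch: the function name is made up, "toronto" and "mountain bike" are example values, and it assumes the site still accepts the pattern exactly as given above, which may not be the case.

```python
from urllib.parse import urlencode


def craigslist_search_url(city, search_string):
    """Fill in the YOURCITY and SearchString parts of the pattern above.

    urlencode escapes spaces and punctuation so the search string is
    safe to place in a URL.
    """
    query = urlencode({"query": search_string})
    return f"http://{city}.craigslist.org/search/sss?{query}"


if __name__ == "__main__":
    # Example values only; substitute your own city and search terms.
    print(craigslist_search_url("toronto", "mountain bike"))
```

Building the link this way, rather than typing it by hand, avoids broken URLs when a search string contains spaces or symbols.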

• Stock market
• http://www.gummy-stuff.org/Yahoo-data.htm
• http://stackoverflow.com/questions/754593/source-of-historical-stock-data

• Social networks; Twitter dump and Google+ API
• https://developers.google.com/+/api

• Sports statistics
• http://www.basketball-reference.com/blog/?cat=40

• Scid Chess Database (historical player ratings, photos)
• http://scid.sourceforge.net/download.html
