Big data is a term juicy and nebulous enough to take on many different definitions and orientations. At minimum it refers to scale (of data), speed (of processing and analyzing) and complexity (of the algorithms employed). It is heralded by Forbes as “the hottest sector in IT at the moment” and just as regularly explored as a tangle of political and ethical concerns. But perhaps most importantly for the conversations we have been having over the last two semesters, I think big data means something quite different for the humanities and the social sciences. For the humanities, the stirrings of data are typically seen as an exciting opportunity: as James Grossman notes, the techniques and technologies of big data offer historians not only new opportunities for analysis and collaboration, but also the possibility of non-academic employment (as well as academic employment, one hopes). For the social sciences, and sociology in particular, thinking “data” is familiar territory. Indeed, as Burrows and Savage note, academic sociologists were pioneers in data collection and analysis, particularly in the development of the social survey. One might say that social data collection is methodologically and epistemologically foundational to sociology. Thus some social scientists, most prominently perhaps Bruno Latour, have argued that large-scale data mining and visualization might allow for a reconfiguration of sociological epistemology, away from the levels of structure and individual. Lev Manovich writes of something similar when he critically analyzes the notion that new computational tools might mean that “we no longer have to choose between data size and data depth.”
While some are optimistic about what these new technologies might mean for the social sciences, others see more of a threat than an opportunity (Burrows and Savage call it the “Coming Crisis of Empirical Sociology”), in large part because both the collection and the analysis of this data have largely occurred outside the academy. Nigel Thrift has written about this as the emergence of “knowing capitalism,” when “capitalism began to intervene in, and make a business out of, thinking the everyday” (Knowing Capitalism, p. 1). Indeed, it would seem that for both humanists and social scientists, working with truly big data requires, as Manovich notes, a reliance upon either the state (the military and domestic security apparatus being the most data-hungry parts of the state) or, more likely, privately owned, profit-minded collections, many of which are only partly publicly accessible. What are the political ramifications of working with data that has been collected and (often) organized for purposes that rarely include critical inquiry (mostly, of course, the purpose is to sell stuff)?
As you have probably guessed at this point, my engagement with the phenomenon known as “big data” has thus far been mostly theoretical in nature, so in the interest of not letting that completely dominate my post here, I’m going to stop now and pivot to some more practical concerns.
At their best, interactive web-based visualizations open up the possibility of some level of data “transparency” and even manipulation (a notable example being this CUNY-designed slider map showing changes in NYC’s racial demographics). As “Tooling Up for the Digital Humanities” notes, this allows users themselves to find novel patterns and correlations. Unfortunately, this level of transparency and interactivity still seems to be rare. More frequently, data visualization is an increasingly popular way to tell a particular story, usually a relatively simple one. “Tooling Up” observes that “many viewers are not necessarily used to reading visualizations critically,” but perhaps there is something about engaging with data through visualization that thwarts criticality, or at least pushes towards simpler rather than more complex answers to social questions. I’ll occasionally give class assignments in which students have to find a data visualization on a certain topic that they find particularly compelling, and quite often they bring me well-designed infographics that clearly tell a relatively simple story about a complex topic. Can we beautifully and clearly convey nuance and complexity in data visualizations?
More interesting to me than using visualizations to tell a story are the possibilities of data visualization as a research method for seeing patterns and tracing connections. Simply visualizing data by time or geography can reveal surprising relationships. Mapping is an obvious area where visualizing can help us quickly see interesting patterns in data, but it seems clear that things like Google’s Ngram Viewer and even word clouds could be useful tools in the early stages of a research project. Has anyone used visualization tools in actually developing lines of critical inquiry?