I want to begin with a confession.
I don't understand how data is collected, how algorithms work, how I participate in a range of data-gathering extravaganzas, trading (as I do) information about where I am so that I can find out about a great bicycling route or a local brew pub, though I have enough of an inkling to want to be cautious.
However, I have watched many episodes of the show Person of Interest (Nolan, Plageman, Abrams, & Burk, 2011), a show about the role of artificial intelligence used to gather up an extensive amount of data on individuals and determine who is a risk and who is at risk in life and death scenarios. While the show mostly is intriguing for its ways of exploring contemporary data capabilities, it reminds me to proceed with caution when considering data collection. While I'm claiming only ignorance about data, informed by a television show, I still would like to raise a few questions and suggest some future collective moves on the part of organizations like the Council of Writing Program Administrators (CWPA) and the Conference on College Composition and Communication Committee on Computers in Composition and Communication (7Cs) regarding the ethics of data collection for writing program administrator (WPA) types like myself who feel always increasing pressure to provide assessment data to others.
I should say that a colleague is taking over as the writing director at my school, which affords me the luxury of returning to research on surveillance and mobility. But I mull my future research as I'm sending him the files for the program, and I wonder whether or not to reveal the level of paranoia that I have about data collection to him.
In addition to having tamped down the desire to somehow spread the contagion of data paranoia, I have been tempted to order him his own copies of my favorite assessment texts in the field (Adler-Kassner & O'Neill, 2010; Broad, 2003; Huot, 2002; O'Neill, Moore, & Huot, 2009) but have not yet succumbed, nor have I sent him all the PDF files of my favorite assessment texts designed to help a WPA decide on a plan for assessment. However, I have suggested that he read through the case studies in assessment available on the CWPA website, and I have encouraged him to read through the other resources there, to read through the last year of subject headings for the CWPA listserv (WPA-L), and to dig into a few threads from the past couple of years that are intriguing, I think, for a new WPA.
Given my own inclinations to push around the questions of surveillance and political issues regarding privacy in data collection, I tried to find more tempered and helpful information for him that might prove useful as he encounters a range of products designed to facilitate assessment. However, guidelines are scant in rhetoric and composition, and it may be helpful to see the 7Cs group collaborate with the CWPA, National Council of Teachers of English (NCTE), and perhaps the Modern Language Association (MLA) to establish some guidelines for new WPAs and for the community at large. It behooves us to pay attention to the extremely large and always growing data pool of student essays and other data being gathered up not only by Google, with its amazing reach in providing universities with email and document capabilities, but also by campus-wide programs like Turnitin or MyCompLab (through Pearson), Re:Writing (Bedford/St. Martin's), and programs that started as or cater to local institutions like Raider Writer (Texas Tech), Marca (originally affiliated with the University of Georgia), MyReviewers (University of South Florida), or Eli Review (originally affiliated with Michigan State).
Rhetoric and composition has worked effectively to create shared documents that help us to argue for the kinds of assessment that might be most productive for our students and our programs, but we haven't yet created position statements on data collection, and with the freshman writing sequences required at so many institutions, powerful data accumulates with each passing semester. At the very minimum we should have a set of statements that indicates to a new WPA what is at stake with each kind of data collection option, how that data should be collected, what kinds of access we as a community need to have to that data, and what kinds of regulation or oversight we would like our leading organizations to have with regard to that data. Our best people should be contributing their voices to the conversations, contributing perspectives on the cleaning of data, the questions that are asked of data, and the conclusions that are drawn from that data (cf. boyd & Crawford, 2012, for a significant discussion regarding shifting notions of research in the land of big data analysis).
While people like me aren't trained or able to participate in the kinds of data analysis that require an awareness of what it means for Pearson to hold a significant market share of programs like MyCompLab, we need people who are savvy on these big-picture issues. Me? I'm mostly concerned with figuring out how to meet local and state assessment expectations in ways that will benefit our local students, improve our instruction, and develop our programs, but I need a way to make decisions that address the larger-picture considerations. Even if only 20% of the nation's writing programs adopt MyCompLab, or if only 20% of the nation's universities rely on Blackboard or Canvas, the pools of data that accumulate yearly from freshman writers are enormous.
When I think about my colleague making decisions about what programs to promote and what kinds of resources to select, I want to say to him, people may send you seemingly kind and supportive emails, suggesting that you participate in their pilot projects. Try our program; see if it works for you, they will say: We'll give you back-end data, and you'll have those tedious assessment reports ready in a jiffy. Or similar messages may come from an administrator trying to provide support who might offer to fund this or that e-writing or eportfolio system, with the hopes that there might be more possibility for longitudinal assessment as a student's writing record accumulates over many semesters.
When this happens, I want to say to him, remember that the machines are everywhere. Giant pools of data from students' essays are accumulating (along with collections of their emails, tweets, posts, pins). I want to say to him that it might make sense to watch something like Person of Interest and consider what artificial intelligence machines may look like because of the data we contribute. I'd rather be able to say to him, look, there's a web page on the CWPA website, linked also with CCCC, and 7Cs, and MLA, where people can review these programs and give you a bottom-line analysis of their strengths and weaknesses when making recommendations for a local writing program assessment resource.
At the moment, Google, Pearson, Macmillan (Bedford/St. Martin's), Turnitin, and other courseware companies are under no obligation to explain to you, as the WPA, what the machine will know or what it will be able to accomplish when it starts to pool much larger data sets, though all trends suggest that the programmers are increasingly savvy at coding in shared ways that facilitate data transfer between these online giants. In other words, our Instructure (Canvas) courseware program intersects with Pearson's textbook delivery and collection of students' responses to a range of modules in ways that I fail to imagine at an adequate level.
The large companies market their resources effectively. But a less obvious temptation may arise for the new WPA when talking to a well-meaning and incredibly industrious colleague at another school who comes up with an online venue that he or she wants to develop into something larger. There's a query in your mailbox. Would you like to participate, perhaps a few sections to start, in our great program that offers these bells and whistles? This is not to say that all companies, especially the disciplinary-grown ones, are up to nefarious data collection. It is to say that we should be ready to question what, how, and why the data is being collected, even from locally produced platforms.
Here's What We Can Do
We as a community need to be doing the groundwork on data collection to decide on the kinds of position statements that will aid companies in shaping policies and procedures to address our concerns, as we have the potential to not offer corporations large pools of student (and instructor) data.
We need a way of determining the "best in show," or the most writer-friendly data collectors so that the new WPA can make wise decisions without extensive savvy on data collection. We need a way for these companies to register and to be reviewed by a group of people in-field, to effectively assess and then recommend the company based on a strengths and weaknesses analysis. Some of this review process should be open to WPAs who are always in the midst of deciding on the best resources. This suggestion would probably best be met by a crowdsourcing approach, one that effectively envisions sustainability. Finally, we need a group of people with data savvy who can participate in data analyses within these companies. It is also well past time for us to collaborate with Ph.D.-granting assessment programs in universities, sharing our expertise, and figuring out viable ways to help train new accreditation experts to address writing concerns.
As a field, with at least a passing understanding about data collection, privacy concerns, and surveillance options, we need to somehow develop a more powerful voice in the analysis of the always-growing mass of data or live with the consequences of not finding a way to negotiate with these large entities regarding our programs' contributions to their giant pools of data.
Adler-Kassner, Linda, & O'Neill, Peggy. (2010). Reframing writing assessment to improve teaching and learning. Logan, UT: Utah State University Press.
boyd, danah, & Crawford, Kate. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication, & Society, 15(5), 662–679. Retrieved from http://www.danah.org/papers/2012/BigData-ICS-Draft.pdf
Broad, Bob. (2003). What we really value: Beyond rubrics in teaching and assessing writing. Logan, UT: Utah State University Press.
Council of Writing Program Administrators. (n.d.). NCTE-WPA white paper on writing assessment in colleges and universities. Retrieved from http://wpacouncil.org/assessment-gallery
Council of Writing Program Administrators (CWPA) listserv. (n.d.). Retrieved from http://wpacouncil.org/wpa-1
Grabill, Jeff, Hart-Davidson, Bill, & McLeod, Mike. (n.d.). Writer review—named Eli. Retrieved from http://elireview.com
Huot, Brian. (2002). (Re)articulating writing assessment for teaching and learning. Logan, UT: Utah State University Press.
Instructure. (n.d.). Retrieved August 11, 2015, from http://www.instructure.com
MyWritingLab. (n.d.). Retrieved August 11, 2015, from http://www.pearsonmylabandmastering.com/northamerica/mywritinglab
Nolan, Jonathan, Plageman, Greg, Abrams, J.J., & Burk, Bryan. (2011). Person of interest [Television series]. New York, NY: CBS.
O'Neill, Peggy, Moore, Cindy, & Huot, Brian. (2009). A guide to college writing assessment. Logan, UT: Utah State University Press.
Raider Writer at Texas Tech. (n.d.). Retrieved August 11, 2015, from https://www.depts.ttu.edu/english/fyc/
Re:Writing. (n.d.). Retrieved August 11, 2015, from http://bcs.bedfordstmartins.com/rewrting2e/#t_526483____
University of South Florida. (n.d.). My Reviewers. Retrieved August 11, 2015, from http://myreviewers.com