Kairos 19.3: Wolff, Baby, We Were Born to Tweet

Study Background

All tweets discussed in this webtext come from a larger corpus of 2.5 million tweets containing the word Springsteen. I archived the tweets using yourTwapperKeeper between 20 February 2012 and 20 October 2013—dates that corresponded with the beginning and end of Bruce Springsteen’s 2012–2013 Wrecking Ball tour with the E Street Band. Designed by John O’Brien, yourTwapperKeeper searches Twitter’s Streaming API and Search API by keyword or hashtag and archives tweets on a server hosted by the researcher (Kelly et al., 2010). It has quickly become a preferred method (Bruns & Liang, 2012) for archiving tweets because of its ease of use, ability to export to Excel and other formats, and robust archiving capacity. YourTwapperKeeper collects tweet content, username, reply username, and other information, though Highfield, Harrington, and Bruns (2013) have pointed out that it does not collect retweets created using Twitter’s retweet button (p. 322). Manual retweets (those within quotation marks or containing a RT followed by the original content) are collected in the archive.
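The distinction between button retweets and manual retweets can be illustrated with a small filter. This is a minimal sketch, not yourTwapperKeeper's own logic: it assumes a manual retweet is either the token RT followed by an @username, or a quotation mark immediately followed by an @username. The function name, the patterns, and the sample usernames are all hypothetical.

```python
import re

# Assumption: "RT @user ..." marks a manual retweet.
RT_PATTERN = re.compile(r'(?:^|\s)RT\s*@\w+')
# Assumption: a quoted retweet opens with a straight or curly quote, then @user.
QUOTED_PATTERN = re.compile(r'["“]@\w+')


def is_manual_retweet(text: str) -> bool:
    """Return True if the tweet text looks like a manually composed retweet."""
    return bool(RT_PATTERN.search(text) or QUOTED_PATTERN.search(text))


# Hypothetical sample tweets, not drawn from the archive.
samples = [
    'RT @fan_a: Springsteen opens with Badlands',
    '"@fan_b: best night ever" Agreed! Springsteen',
    'Springsteen at the Izod Center tonight!',
]
manual = [s for s in samples if is_manual_retweet(s)]
```

Patterns this loose will miss some manual retweets and flag some ordinary tweets, so the result would still need a manual check.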

Screen shot of Bill Wolff's yourTwapperKeeper homepage showing the Springsteen archive, which contains 2,508,013 tweets. Archiving began on Monday, February 20, 2012 at 1:03 am, GMT.

There are two other limitations to the archive. First, it does not include all tweets about the concert at the Izod Center; tweets about the concert that did not contain the word Springsteen were not archived. I also later learned that the show had its own hashtag, #wbnj2 (Wrecking Ball New Jersey show 2); a search on Topsy.com found 80 total tweets, a relatively small corpus. These are not included in the analysis unless an author used both the #wbnj2 hashtag and the word Springsteen. Second, on rare occasions yourTwapperKeeper will stop collecting tweets, so it is up to the researcher to check the archive daily to ensure it is still running. The archive continued running throughout the time the Izod Center concert tweets were collected. Despite these kinds of limitations, Bruns and Liang (2012) observed: "No dataset captured by using the Twitter API is guaranteed to be entirely comprehensive. . . however, such research nonetheless remains valid and important" (p. 1).

As a complement to yourTwapperKeeper, I also used Martin Hawksey’s Twitter Archive Google Spreadsheet (TAGS) (Gaffney & Puschmann, 2013, p. 62; Hawksey, 2013). As its name suggests, TAGS collects tweets from Twitter's Streaming API in a Google spreadsheet. Because of Google's character limits, TAGS is best suited to archives of fewer than 15,000 tweets (such as course hashtags). To bring my tweets into TAGS, I used Google's import feature. I then ran Hawksey's integrated TAGSExplorer to create corpus network visualizations. Hawksey designed TAGSExplorer to take data from a TAGS spreadsheet and integrate with the d3.js graphing library to create dynamic interactive visualization environments (Hawksey, 2011). The visualization maps interactions in a particular corpus by representing each author as a node in a circular-shaped cloud of authors. Nodes connected by dark lines indicate @replies; those connected by dotted lines indicate @mentions. The larger the node, the more replies and mentions (Hawksey, 2011).
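The node-and-line structure described above can be sketched as a plain edge list. This is an illustrative reconstruction under stated assumptions, not TAGSExplorer's code: the field names `from_user`, `to_user`, and `text` are hypothetical, and node weight is assumed to count every reply or mention that touches an author, whether sent or received.

```python
import re
from collections import Counter

MENTION = re.compile(r'@(\w+)')


def build_edges(tweets):
    """Turn tweets into (source, target, kind) edges.

    kind is "reply" (drawn as a dark line) or "mention" (drawn as a dotted line).
    """
    edges = []
    for tweet in tweets:
        author = tweet["from_user"].lower()
        reply_to = (tweet.get("to_user") or "").lower()
        if reply_to:
            edges.append((author, reply_to, "reply"))
        for name in MENTION.findall(tweet["text"]):
            name = name.lower()
            # Skip the reply target (already counted) and self-mentions.
            if name != reply_to and name != author:
                edges.append((author, name, "mention"))
    return edges


def node_weights(edges):
    """Assumption: a node grows with every edge that touches it."""
    weights = Counter()
    for source, target, _kind in edges:
        weights[source] += 1
        weights[target] += 1
    return weights


# Hypothetical tweets, not drawn from the archive.
tweets = [
    {"from_user": "fan_a", "to_user": "fan_b", "text": "@fan_b what a show! Springsteen"},
    {"from_user": "fan_c", "to_user": None, "text": "Springsteen with @fan_a and @fan_b tonight"},
]
edges = build_edges(tweets)
weights = node_weights(edges)
```

An edge list of this kind is the usual input for a force-directed layout, which is the type of graph d3.js is commonly used to draw.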

Tweets from the April 4, 2012 concert at the Izod Center were chosen following Daniel Cavicchi's (1998) ethnographic approach, in which he began his study of fans at a concert he attended with his wife (pp. 22–37). I attended the April 4 show with my wife and thought the lived experience would help me navigate the meaning of the tweets just as Cavicchi's lived experience helped him understand various Springsteen fan discourses. To keep the corpus a manageable size, I filtered tweets to a set timeframe: between 5:00 pm on the night of the concert and 9:00 am the following morning, that is, outside traditional work hours.
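The timeframe filter can be expressed as a short sketch. In the study this work was done in Excel; the Python below is only an equivalent illustration. It assumes each tweet carries a Unix timestamp in seconds (GMT) and that the concert's local time corresponds to the `America/New_York` zone; the function and constant names are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the Izod Center's local time is US Eastern time.
EASTERN = ZoneInfo("America/New_York")
WINDOW_START = datetime(2012, 4, 4, 17, 0, tzinfo=EASTERN)  # 5:00 pm, concert night
WINDOW_END = datetime(2012, 4, 5, 9, 0, tzinfo=EASTERN)     # 9:00 am, next morning


def to_local(unix_seconds: int) -> datetime:
    """Convert a GMT Unix timestamp to the concert's local time."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).astimezone(EASTERN)


def in_window(unix_seconds: int) -> bool:
    """True if the tweet falls between 5:00 pm and 9:00 am local time."""
    return WINDOW_START <= to_local(unix_seconds) <= WINDOW_END


# 1333584000 is midnight GMT on April 5, 2012, i.e. 8:00 pm local on April 4.
local_time = to_local(1333584000)
keep = in_window(1333584000)
```

Converting before filtering matters here because a tweet stamped April 5 in GMT may have been sent during the show on the evening of April 4.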

Grounded Theory Methodology

I analyzed the tweets using a grounded theory approach informed by those advocated by Anselm Strauss and Juliet Corbin (1990), Kathy Charmaz (2006), and John W. Creswell (2006). Charmaz (2006) argued that "grounded theory methods consist of systematic, yet flexible guidelines for collecting and analyzing qualitative data to construct theories 'grounded' in the data themselves" (p. 2). Strauss and Corbin (1990) described grounded theory methodology as one where "data collection, analysis, and theory stand in reciprocal relationship with each other. One does not begin with a theory, then prove it. Rather, one begins with an area of study and what is relevant to that area is allowed to emerge" (p. 23). Because theories emerge from the data rather than from the head of the researcher, one of the many benefits of using a grounded theory approach is that any bias a researcher may have is significantly diminished.

Grounded theory was chosen for this study following Creswell’s (2006) description that it "is a good design to use when a theory is not available to explain a process. The literature may have models available, but they were developed and tested on samples and populations other than those of interest to the research" (p. 66). Studies have been conducted on various Twitter corpuses but none on the Springsteen fan community. Further, Creswell (2006) described grounded theory as a "qualitative research design in which the inquirer generates a general explanation (a theory) of a process, action, or interaction shaped by the views of a large number of participants" (p. 63). With over 14,000 tweets in the initial corpus, I had a large number of participants who composed and tweeted their views. My goal for the study was to learn about the views, actions, and interactions of those participants. Creswell (2006) argued that in grounded theory data is collected from "multiple individuals who have responded to an action or participated in a process about a central phenomenon" (p. 120). That is, the methodology does not require triangulation to ensure study validity. The tweets in my corpus have been collected (archived) from multiple individuals who are participating in a process (composing tweets) about a central phenomenon (Bruce Springsteen as fan object). Whereas many grounded theory studies have interviews as the units of analysis, I modified the methods so each tweet served as a single unit of analysis. After categories were generated via open and axial coding processes, each tweet was assigned a primary code and, when applicable, one or multiple secondary codes. Memos and lists of code definitions were maintained throughout the process. I have ensured the reliability of assigned codes by conducting inter-rater reliability tests using ReCal, which was developed by Deen G. Freelon (2010).

I prepared, coded, and analyzed tweets by following these steps:

1. Downloaded tweets composed on April 4 and 5, 2012, from yourTwapperKeeper as an Excel file.
2. Opened the file in Excel, adjusted Unix time, which is set to Greenwich Mean Time, to account for the concert’s time zone, and converted the time and date to Western time and date conventions.
3. Created categories of tweets using what Strauss and Corbin (1990) called open coding and Charmaz (2011) called initial coding based on observed phenomena. Strauss and Corbin (1990) described open coding as "the part of the analysis that pertains specifically to the naming and categorizing of phenomena through close examination of the data" (p. 62). Charmaz (2006) observed that through this initial coding, researchers "remain open to exploring whatever theoretical possibilities we can discern from the data" (p. 47). Open coding is what makes the rest of the research possible. Three researchers (myself and two research assistants) labeled tweets in groups of 200 based on the content of the tweets, noting such items as their subject, occurrence of @replies, hashtags, links to images, and so on. Researchers met, discussed their findings, added categories to a master list, and created and/or modified definitions. This months-long process was completed to learn more about the community being studied and determine a specific study focus. Memos were kept throughout the process.
4. Once a focus was located, employed a modified version of axial coding (Strauss & Corbin, 1990) to generate categories directly relating to the focus (Creswell, 2007, p. 64). My process was informed by Charmaz’s (2011) description of using gerunds for category names to showcase actions (p. 136). These categories help explain the phenomena observed in the tweets.
5. Coded the tweets along categories generated in axial coding into one primary code and, if necessary, one or more secondary codes.
6. Used selective coding (Strauss & Corbin, 1990) to make connections between the categories defined during axial coding to help generate theories about what is suggested by the analysis.
7. Conducted an inter-rater reliability (IRR) test on a small sample to determine the effectiveness of code definitions.
8. Revised definitions based on the IRR test results and then completed another IRR test on a larger sample to determine the effectiveness of the revised code definitions. (If necessary, repeated this process until the revised definitions resulted in exemplary IRR scores.)
9. Imported tweets into TAGS to create visualizations.
10. Wrote the article based on findings, sending the initial draft to members of the Springsteen fan community on Twitter to ensure representations of fans are accurate. Incorporated fan feedback in later drafts following a practice advocated by Henry Jenkins (2013, p. 7).
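The inter-rater reliability steps above can be illustrated with two common agreement measures. This is a minimal sketch, not ReCal's implementation: it assumes two coders, one primary code per tweet, and simple percent agreement and Cohen's kappa as the statistics. The code names in the example are hypothetical and are not the study's categories.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of tweets to which both coders assigned the same primary code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of tweets.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of each coder's rate of using each code.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical primary codes for four tweets.
coder_1 = ["sharing", "sharing", "reporting", "reporting"]
coder_2 = ["sharing", "reporting", "reporting", "reporting"]
agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
```

With three coders, as in this study, a multi-rater statistic would be needed instead; the two-coder case is shown only because it makes the chance correction easy to follow.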

Baby, We Were Born to Tweet
Springsteen Fans, the Writing Practices of In Situ Tweeting,
and the Research Possibilities for Twitter

William I. Wolff • @billwolff
