Hello all, the following is the documentation regarding my final project for this semester’s Collective Methods course.
Throughout the semester, we learned predominantly about two topics, which I would categorize as “The Narrative” and “The Data”. The former refers to how a collective narrative can be told, while the latter refers to obtaining information from the internet. Keeping these two topics in mind, I set out to design a final project that could incorporate both.
Originally, I set out to create a program that would measure the amount of online support for some current U.S. presidential hopefuls and present it visually. In other words, it would display the Twitter support of the top six American presidential candidates. To do this, I imagined a visual display showing the candidates’ heads, which would shift in size in proportion to each candidate’s support on Twitter. For example, should one candidate have a lot of support, then their head would be large, and vice versa. The shifting sizes were meant to construct a narrative about these candidates’ political presence online, and to tell the story of their success and failure in relation to one another. The idea was that the visual narrative would be based on the participation of tweeters from around the globe who posted on Twitter in support of each candidate. I facilitated the narrative by making the visual display, while the actual content of the narrative was supplied by the masses who participate in the digital sphere.
The second part of the project involved obtaining that data from the internet, so as to properly gauge each candidate’s online following. To do this, I decided to use the tweepy Python library to scrape Twitter for supportive tweets regarding each candidate.
I began this project by taking our scraping exercises and applying them to Twitter. In conjunction with vanilla tweepy, I used the tweepy LiveStreamer library so that I could get real-time results for each query. In this process I learned that simply scraping for a general term, such as a hashtag of a candidate’s name, was not enough, as such hashtags are often used critically or ironically and are therefore by no means a good indication of one’s Twitter support. More specific hashtags, by contrast, were overwhelmingly supportive. For example, scraping for a hashtag with a candidate’s campaign slogan would often provide useful tweets from supporters of that candidate. Therefore, I had to limit my searches to hashtags that overwhelmingly produced supportive tweets for each candidate.
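The filtering idea can be sketched in a few lines of Python. This is only a minimal illustration, not the code I actually ran: the function name `count_support`, the candidate labels, and the `#slogana`/`#teama` hashtags are hypothetical, and the tweets are assumed to have already been collected as plain strings.

```python
import re
from collections import Counter

# Hypothetical mapping from candidate to supportive hashtags (lowercase).
# Only #makeamericagreatagain comes from the write-up; the rest are made up.
SUPPORTIVE_TAGS = {
    "Donald Trump": {"#makeamericagreatagain"},
    "Candidate A": {"#slogana", "#teama"},
}

HASHTAG_RE = re.compile(r"#\w+")


def count_support(tweets, tags_by_candidate=SUPPORTIVE_TAGS):
    """Count tweets that contain at least one supportive hashtag per candidate."""
    counts = Counter({name: 0 for name in tags_by_candidate})
    for text in tweets:
        # Normalise case so #SloganA and #slogana are treated alike.
        found = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        for name, tags in tags_by_candidate.items():
            if found & tags:  # a tweet counts once, however many tags match
                counts[name] += 1
    return counts
```

A tweet that only mentions a candidate's name, such as one tagged `#CandidateA`, is ignored here, which mirrors the decision to count slogan hashtags only.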
The issue with this was that these hashtags were often so numerous that they would never work with my tweepy code, as Twitter only allows so many requests within a fifteen-minute window. For example, using tweepy to find #Hippo would often yield about 400 tweets, which was no issue. However, searching for a hashtag such as Donald Trump’s slogan #MakeAmericaGreatAgain would always result in a plethora of errors, particularly the 429 (“Too Many Requests”) error, as this query would literally return thousands of responses. Moreover, tweepy would parse quickly but count out the results at a rate of about one result per second, which made each search a lengthy wait of about 5-10 minutes, and that was only when scraping hashtags small enough to be counted. This prevented me from scraping the data for each candidate, as I had originally intended. On one occasion, this code even caused my computer to crash, which was, needless to say, frustrating.
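One common way to cope with 429 errors is to wait and retry with increasingly long pauses. The sketch below shows that pattern in plain Python; it is not what my project did, and `RateLimitError`, `fetch_with_backoff`, and the `fetch` callable are hypothetical stand-ins for whatever a real client raises and calls. As far as I know, tweepy's `API` constructor also accepts a `wait_on_rate_limit=True` argument, which may be the simpler route.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on RateLimitError, wait and retry, doubling the delay."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            sleep(delay)
            delay *= 2  # exponential backoff: 1s, 2s, 4s, ...
```

With a real fifteen-minute rate window, `base_delay` would need to be far larger than one second. The `sleep` parameter exists only so the waiting can be swapped out when trying the function.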
Another aspect of the scraping worth mentioning was the struggle with coding. Python is, I believe, generally easy to use. Moreover, documentation for tweepy is extensive and easily found on the internet. However, this documentation is typically written and read by people who understand tweepy on a level beyond my own, as the jargon and the prescribed remedies in the documentation are not written in lay terms. This was problematic for my research, and certainly slowed my progress. However, I was able to learn how to use the LiveStreamer library to an extent, which was paramount for this project.
To work around these obstacles, I had to compromise: I used tweepy to scrape information regarding animals instead, such as the aforementioned #Hippo, and attributed the results to a candidate. This way I would still be using tweepy and the LiveStreamer library, but would also have some attainable numbers to fuel the visual aid.
Beyond the data scraping, my final relied on creating the visual narrative through which to display each candidate’s political support. I used Processing to create a sketch with a screen that housed the dynamic heads of each candidate, and had each change size based on the tweepy output. This was perhaps the simplest part of the process, as the steps were easy for me to accomplish. I began by writing a sketch and placing a photoshopped PNG of Donald Trump’s head into it. This initial sketch allowed me to change the size of his head based on the position of the mouse within the Processing window. Then, I created another sketch where I simply had his head bouncing around the window and oscillating between sizes ranging from 100 pixels to 1000 pixels. The next step was to photoshop the other candidates’ heads into PNGs, so that they matched Donald Trump’s and had no backgrounds. The final step was to design the final visual display and integrate all the visual and technical assets.
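The link between the tweet counts and the head sizes can be sketched as a simple linear scaling. This is an illustration in Python rather than the Processing sketch itself. The function name `head_sizes` is hypothetical, and reusing the 100 to 1000 pixel range from the bouncing sketch is an assumption.

```python
MIN_SIZE = 100   # pixels; smallest size used in the bouncing sketch
MAX_SIZE = 1000  # pixels; largest size used in the bouncing sketch


def head_sizes(counts, min_size=MIN_SIZE, max_size=MAX_SIZE):
    """Linearly map each count to a size between min_size and max_size."""
    if not counts:
        return {}
    top = max(counts.values())
    if top == 0:
        # No tweets at all: every head gets the minimum size.
        return {name: min_size for name in counts}
    return {
        # The candidate with the most tweets gets max_size; zero gets min_size.
        name: round(min_size + (max_size - min_size) * n / top)
        for name, n in counts.items()
    }
```

Scaling against the largest count keeps every head on screen, at the cost of showing only relative support rather than absolute numbers.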
Here are the asset pictures used for my final:
I think the goal of each final project is to challenge oneself with the tools and lessons learned over the semester, which is what I attempted for this project. Along the way, much self-education occurs as well, which helps solidify one’s understanding of the concepts and proficiency with the tools. For me, the scraping aspect was certainly the most challenging, as it required me to learn the various ways in which tweepy could be used. In addition, as previously mentioned, my research was greatly impeded by the technical jargon used in the tweepy documentation. I was able to use the Skype office hours a few times, but the time difference, in conjunction with my schedule, made this option tricky, as I was not always available. However, this issue was unavoidable given the great time difference.
Moreover, I am disappointed that I couldn’t achieve my ultimate goal of scraping the actual hashtags regarding each candidate. It is my hope that I can work on this project a little more to see if I might be able to do it in the future, as I really appreciate the value of scraping and would like to better understand this concept.
Overall, I am glad that I learned what I could in this class, and was able to put some of it to use for my final. I appreciate the study of the dynamics of narrative, and will keep those lessons in mind with all future projects. Moreover, I really did enjoy learning about scraping, as I believe it is a tool that more people should use, since it creates greater digital self-efficacy when searching the web. Therefore, I will conclude by stating that the greatest outcome of this project was my practice with using tweepy to scrape Twitter.