Michael J. Price Lab for Digital Humanities

Star Wars Fanfiction

Peter Decherney

Professor of English and Cinema Studies

Jehoshua Eliashberg

Sebastian S. Kresge Professor of Marketing; Professor of Operations, Information and Decisions

Co-Investigators: 

Scott Enderle, Digital Humanities Specialist

James Fiumara, Research Project Manager, Linguistic Data Consortium

Funding Period: 
May, 2018

Professors Peter Decherney (SAS) and Jehoshua Eliashberg (Wharton) have been working with Scott Enderle (Penn Libraries) and James Fiumara (Linguistic Data Consortium) on a project studying movie fanfiction.

Communities of devoted movie fans write stories for each other using the characters, plots, and sometimes dialogue from movies. One popular fanfiction website, “Archive of Our Own” or AO3, hosts more than 3.6 million works, and the collection continues to grow.

The team started by looking at the 20,000+ works of fanfiction devoted to Star Wars: The Force Awakens (2015). They built topic modeling software that compares the movie’s script to its fanfiction, looking for the parts of the script that fans engage with most. Sometimes fans quote lines directly, and other times they rework scenes or speculate about characters’ lives outside of the film narrative.
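As a rough illustration of the simplest case, direct quotation, the sketch below counts how many fan works share a short word sequence with each line of a script. It is not the team's software: it swaps topic modeling for plain n-gram overlap, the function names (`tokenize`, `ngrams`, `reuse_counts`) are hypothetical, and the sample text is invented for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the set of all n-word sequences in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def reuse_counts(script_lines, fan_works, n=3):
    """For each script line, count the fan works sharing at least one n-gram with it."""
    work_grams = [ngrams(tokenize(work), n) for work in fan_works]
    counts = Counter()
    for index, line in enumerate(script_lines):
        line_grams = ngrams(tokenize(line), n)
        counts[index] = sum(1 for grams in work_grams if line_grams & grams)
    return counts


# Invented placeholder text, not taken from any real script or fan work.
script = [
    "The old freighter will have to do.",
    "We need to reach the hangar before sunrise.",
]
works = [
    "She sighed and said the old freighter will have to do, then ran.",
    "Nobody believed the old freighter will ever fly again.",
    "They hoped to reach the hangar before sunrise.",
]

counts = reuse_counts(script, works)
for index, line in enumerate(script):
    print(counts[index], line)
```

Exact n-gram matching only catches quotation and near-quotation. Reworked scenes and speculation about characters share vocabulary and themes rather than exact phrases, which is presumably where a topic-based comparison becomes necessary.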

Through a browser interface, users can graph the most re-used sections of the script and overlay the results on additional forms of analysis, such as sentiment, character, or narrative analyses. Do fans engage more with angry scenes, for example? Or do they prefer to write about darker characters?

The team is in the process of adding more scripts and fan works to the database, so that users can run analyses on any film with a significant fan community. Because we designed our scripts and visualizations to be reused on many different kinds of data, the workflow for this process is largely automated, and can be reused by researchers investigating many different kinds of questions in the future.

A future iteration will add a predictive element, forecasting the sections of a new script that fans are most likely to engage with through re-writing and re-imagining.