“Gendered” is a new series of posts which look at gender stereotypes with data. The goal is to expose the stereotypes and equip people with tools that will help recognize them in everyday life. Because they are everywhere. Really.
A few months ago a friend posted a picture that had been making the rounds on the internet a couple of years earlier: side-by-side covers of two teen magazines – Girls’ life and Boys’ life. The difference was so striking that it caused a modest uproar.
It caught my attention and I visited the websites of these magazines. The situation looked even worse when you compared the covers of dozens of issues over time. To quantify this, because that’s what I do, I analyzed the words that occurred most commonly on these covers and ended up creating my first (ever) infographic. Enjoy and degender.
For those of you who want to look under the hood and see how this was done, here are some details:
Sources – girlslife.com, boyslife.org, magazine-agent.com-sub.info, childstats.gov
Reader percentages were estimated by dividing the number of readers reported on each magazine’s website (e.g. the Girls’ life numbers) by the number of children of the corresponding age in the US.
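To make the estimate concrete, here is a minimal sketch of that calculation in R. The two figures are hypothetical placeholders chosen only to show the arithmetic; they are not the numbers behind the infographic, and the function name `reader_percentage` is my own.

```r
# Hypothetical figures for illustration only (not the values used in the infographic)
readers  <- 2e6   # assumed readership reported on a magazine's website
children <- 20e6  # assumed number of US children in the matching age range

# Share of children in the age range who read the magazine, in percent
reader_percentage <- function(readers, children) {
  stopifnot(children > 0)
  100 * readers / children
}

pct <- reader_percentage(readers, children)
print(round(pct, 1))
```

Swap in the readership figure from the magazine's site and the population count from childstats.gov to get a real estimate.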
Analysis of text on the magazine covers – I first created data files with all the words/sentences from the magazine covers. I then used the R tm (text mining) package to stem the text and remove common words like “a” and “the”. I also removed four irrelevant but common words that appeared in almost every issue and would otherwise dominate the cloud and make the other words harder to see: “quiz”, “story”, “scout”, “true” (Boys’ life is a Boy Scouts magazine, so “scout” appears in every issue). Finally, I used the wordcloud package to create the colourful word clouds. Below is the code I used, in case you’d like to do some text mining yourself.
library(tm)
library(SnowballC)
library(wordcloud)
library(RColorBrewer)

#read the cover text (taken from the second column) as plain strings
data <- read.csv('girls_life.csv', stringsAsFactors = FALSE)
docs <- Corpus(VectorSource(data[,2]))

#convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))
#remove numbers
docs <- tm_map(docs, removeNumbers)
#remove common English stopwords
docs <- tm_map(docs, removeWords, stopwords('english'))
#remove punctuation
docs <- tm_map(docs, removePunctuation)
#remove extra white spaces
docs <- tm_map(docs, stripWhitespace)
#stem the words
docs <- tm_map(docs, stemDocument)
#remove additional stopwords (this runs after stemming, so list them in stemmed form)
docs <- tm_map(docs, removeWords, c('quiz'))

#build a term-document matrix and convert it to a data frame of word frequencies
dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
head(d, 10)

#generate the word cloud
par(bg = '#FFDD9D')
#brewer.pal() provides at most 9 colours for the PuBuGn palette
wordcloud(d$word, d$freq,
          colors = brewer.pal(n = 9, name = "PuBuGn"),
          random.order = FALSE, rot.per = 0.3)
3 thoughts on “Gendered: Girls’ life vs Boys’ life magazines”
It would be interesting to do the reverse of this. Find gender neutral magazines, like National Geographic for kids, Donald Duck (is that still a thing or am I too old), etc, then examine the themes/topics and readership. Could we draw a correlation between topics and readership?
That’s a great idea. One thing to take into account is how these magazines are marketed and what the target audience is. This will affect readership composition.