What kinds of parenting books are sold on Amazon.sg?

I was curious about the kinds of parenting books sold on Amazon.sg. Searching for “parenting” on Amazon.sg returns a list of books like this:

What the search results look like on Amazon.sg



Web-scraping search results from the Amazon.sg website

Searching for “parenting” on Amazon.sg returned 75 pages of results covering about 1,400 books. I then scraped the title and publication type of each book. My scraping code can be accessed here.
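For readers who have not scraped a site before, here is a minimal sketch of the general idea using the rvest package. It is not my actual scraping code: the URL pattern in `search_url()` and the CSS selector `h2.book-title` are assumptions for illustration, and Amazon's real markup differs and changes over time. The demo parses an inline HTML snippet rather than making a live request.

```r
library(rvest)

# Hypothetical URL pattern for page `page` of a keyword search (an assumption)
search_url <- function(page, keyword = "parenting") {
  paste0("https://www.amazon.sg/s?k=", keyword, "&page=", page)
}

# Pull the text of every element matching `selector` from a parsed page.
# "h2.book-title" is a made-up selector used only for this demo.
extract_titles <- function(page_html, selector = "h2.book-title") {
  html_text2(html_elements(page_html, selector))
}

# Demo on an inline snippet instead of a live page
demo_page <- minimal_html('
  <div><h2 class="book-title">Cribsheet</h2><span>Paperback</span></div>
  <div><h2 class="book-title">The Danish Way of Parenting</h2><span>Paperback</span></div>
')
demo_titles <- extract_titles(demo_page)
demo_titles

# A live run would loop over the result pages, roughly:
# titles <- unlist(lapply(1:75, function(p) extract_titles(read_html(search_url(p)))))
```

Keeping the extraction step in its own function makes it easy to check against a saved page before looping over all the result pages.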



Let’s take a look at the variables in the dataframe.

glimpse(data_all)
## Rows: 1,198
## Columns: 3
## $ id       <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
## $ title    <chr> "Family Fun Night Conversation Starters Placemats: 375 Questi…
## $ pub_type <chr> "Novelty Book", "Paperback", "Hardcover", "Paperback", "Hardc…



From the first few rows of the data, we can see the titles and types of publication. The book “How to Raise Kids Who Aren’t Assholes” seems interesting.

library(knitr)
data_all %>%  head(5) %>% kable(digits = 2)
| id | title | pub_type |
|---|---|---|
| 1 | Family Fun Night Conversation Starters Placemats: 375 Questions That Celebrate Family and Create Lasting Memories | Novelty Book |
| 2 | Cribsheet: A Data-Driven Guide to Better, More Relaxed Parenting, from Birth to Preschool: 2 | Paperback |
| 3 | Parenting: 14 Gospel Principles That Can Radically Change Your Family | Hardcover |
| 4 | The Danish Way of Parenting: What the Happiest People in the World Know About Raising Confident, Capable Kids | Paperback |
| 5 | How to Raise Kids Who Aren’t Assholes: Science-Based Strategies for Better Parenting–from Tots to Teens | Hardcover |

Preprocessing the data

Let’s turn the dataframe into a corpus, then use the quanteda package to tokenize the titles and perform some pre-processing. I removed numbers, punctuation, and symbols, lowercased the tokens, and filtered out stopwords (words that are not useful for analysis, e.g., is, and, we). I then formed unigrams and bigrams and converted the tokens object to a document-feature matrix (dfm).

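library(dplyr)     # for the %>% pipe
library(quanteda)

# Build a corpus with one document per book title
corp <- corpus(data_all,
               docid_field = "id",
               text_field = "title")

# Tokenize, lowercase, drop stopwords, then form unigrams and bigrams
tok <- 
    corp %>% 
    tokens(remove_numbers = TRUE,
           remove_punct = TRUE,
           remove_separators = TRUE,
           remove_symbols = TRUE,
           remove_url = TRUE,
           include_docvars = TRUE) %>% 
    tokens_tolower() %>% 
    tokens_select(pattern = stopwords("en"),
                  selection = "remove") %>% 
    tokens_ngrams(n = 1:2) 

# Convert the tokens to a document-feature matrix
dfm <-
    tok %>% dfm() 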
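To see what this pipeline produces, here is a small sketch that runs the same steps on two made-up titles (the titles and the `_demo` object names are invented for illustration). It shows that stopwords disappear, that bigrams are joined with an underscore, and that the dfm holds one column per unigram or bigram.

```r
library(quanteda)

# Two made-up titles, named so the documents get readable ids
titles_demo <- c(t1 = "Raising Happy Kids",
                 t2 = "The Happy Baby Sleep Guide")

# Same steps as above: tokenize, lowercase, drop stopwords, add bigrams
tok_demo <- tokens(corpus(titles_demo),
                   remove_numbers = TRUE,
                   remove_punct = TRUE)
tok_demo <- tokens_tolower(tok_demo)
tok_demo <- tokens_select(tok_demo,
                          pattern = stopwords("en"),
                          selection = "remove")
tok_demo <- tokens_ngrams(tok_demo, n = 1:2)

# Tokens for the first title: unigrams plus underscore-joined bigrams
as.list(tok_demo)$t1

# One row per title, one column per unigram or bigram
dfm_demo <- dfm(tok_demo)
term_totals <- colSums(dfm_demo)   # how often each term occurs overall
sort(term_totals, decreasing = TRUE)
```

Because the stopword "the" is removed before the n-grams are formed, the second title contributes the bigram `happy_baby` rather than `the_happy`.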

Plotting frequency of word occurrences in book titles

library(ggplot2)
library(geomtextpath)  # provides geom_textsegment()

# data_tidy is not defined in this post; it is presumably the dfm in tidy form
# (one row per document-term pair with a `count` column), e.g. tidytext::tidy(dfm).
# nus_blue, tar_blue, and font are colour and font settings defined elsewhere.
word_freq <-
    data_tidy %>% 
    group_by(term) %>% 
    summarize(word_n = sum(count)) %>% 
    ungroup()

# Plot the 30 most frequent terms, excluding the search word itself
word_freq %>% 
    filter(!term %in% c("parent", "parents", "parenting")) %>% 
    slice_max(word_n, n = 30) %>% 
    ggplot(aes(x = word_n, y = reorder(term, word_n))) +
    geom_textsegment(aes(yend = term, xend = 0, label = term),
                     alpha = .7,
                     size = 5,
                     linewidth = 1.5,
                     linecolor = "black",
                     textcolor = nus_blue, fontface = 7, family = font) +
    geom_point(size = 3, color = tar_blue, alpha = 1) +
    labs(title = "Parenting books on Amazon.sg",
         subtitle = "Most used words in book titles",
         x = "# times word appeared",
         y = "Word",
         caption = "GERARDCHUNG.COM") +
    theme(plot.title = element_text(size = 22, face = "bold"),
          plot.subtitle = element_text(size = 18),
          axis.title.x = element_text(size = 12),
          axis.title.y = element_text(size = 12),
          axis.text.y = element_blank(),   # terms are drawn on the segments instead
          axis.text.x = element_text(size = 15),
          axis.ticks.y = element_blank()) +
    scale_x_continuous(limits = c(0, 190), breaks = seq(0, 200, by = 20), expand = c(0, 0))



The most frequent words suggest several kinds of books: guides to parenting; books about children of different ages, from infants and toddlers to older children and teens; books probably aimed at new or pregnant parents; books on improving newborns’ sleep; and activity books. Some books are revised versions or new editions.

More books for mothers?

Would there be more book titles mentioning “mothers” than “fathers”?

# Plural forms (dads, mums, mothers, fathers) are already matched by the singular patterns
str_view_all(dadmum1$title, regex(pattern = "dad|mum|mother|father", ignore_case = TRUE), match = TRUE)



There are twice as many books for mothers as for fathers. Still, most books are written for parents in general and make no reference to either mothers or fathers.
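A comparison like this can be tallied by flagging each title for mother-related and father-related words and then cross-classifying the flags. The sketch below is not the code behind the figures above: the five titles are made up, and the split into two patterns is my assumption about one reasonable way to do it.

```r
library(stringr)

# Made-up titles for illustration only
titles_demo <- c(
  "The Expectant Father",
  "Mum's Big Book of Baby Sleep",
  "Dad Jokes for New Fathers",
  "Raising Happy Kids",
  "Letters to Mothers and Fathers"
)

# Flag titles that mention mothers or fathers (case-insensitive)
mentions_mum <- str_detect(titles_demo, regex("mum|mother", ignore_case = TRUE))
mentions_dad <- str_detect(titles_demo, regex("dad|father", ignore_case = TRUE))

# Classify each title by which parent it refers to
audience <- ifelse(mentions_mum & mentions_dad, "both",
            ifelse(mentions_mum, "mums",
            ifelse(mentions_dad, "dads", "neither")))

table(audience)
```

Note that these patterns, like the one used above, would miss the American spelling “mom”; adding it to the mother pattern would be a natural extension.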



Would word usage in titles differ between books for mums and books for dads?



What Topics Were These Parenting Books On Amazon.sg About?

To identify topics, I ran structural topic modeling (STM) on the titles. STM builds on the Latent Dirichlet Allocation algorithm; its basic premise is to model documents as distributions of topics (topic prevalence) and topics as distributions of words. In other words, it quantitatively infers the latent topics that generated the words used in the titles. STM provides a quick way to “qualitatively” analyze a large set of textual data. If you want to read more about analyzing open-ended survey responses using STM, read my paper here.
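The “documents as distributions of topics, topics as distributions of words” premise can be made concrete with toy numbers. This is not a fitted model: the two topics, four words, and every probability below are invented purely for illustration.

```r
# Topic prevalence: each row is a title, each column a topic; rows sum to 1
theta <- matrix(c(0.9, 0.1,
                  0.2, 0.8),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("title_1", "title_2"),
                                c("sleep_topic", "teen_topic")))

# Topic content: each row is a topic, each column a word; rows sum to 1
beta <- matrix(c(0.50, 0.40, 0.05, 0.05,
                 0.10, 0.05, 0.45, 0.40),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("sleep_topic", "teen_topic"),
                               c("baby", "sleep", "teen", "love")))

# Probability of each word in each title: a topic-weighted mixture
word_probs <- theta %*% beta
round(word_probs, 3)
```

A title that is mostly about the sleep topic (title_1) puts most of its probability on “baby” and “sleep”. Fitting a topic model runs this logic in reverse: it starts from the observed words and estimates `theta` and `beta`.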

Identify the possible number of topics using exclusivity and semantic coherence

The first challenge is to identify the number of latent topics (k) that generated the distribution of words and topics. Based on a plot of semantic coherence against exclusivity, the best value seems to be k = 26 topics, which scores best on semantic coherence and exclusivity.
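For intuition, semantic coherence is higher when a topic's top words tend to appear together in the same documents. The sketch below computes the usual co-occurrence score, the sum of log((D(v_i, v_j) + 1) / D(v_j)) over pairs of top words, in base R on a made-up document-term matrix. The function name `semantic_coherence` and the toy matrix are assumptions for illustration; it is not the code used for the plot, and the stm package has its own `semanticCoherence()` and `exclusivity()` functions for fitted models.

```r
# Co-occurrence-based coherence for one topic's top words.
# dtm: documents x terms count matrix; top_words: most to least probable.
semantic_coherence <- function(dtm, top_words) {
  stopifnot(length(top_words) >= 2)
  present <- dtm[, top_words, drop = FALSE] > 0  # does the word occur in the doc?
  score <- 0
  for (i in 2:length(top_words)) {
    for (j in 1:(i - 1)) {
      co_docs <- sum(present[, i] & present[, j])  # docs containing both words
      docs_j  <- sum(present[, j])                 # docs containing the higher-ranked word
      score <- score + log((co_docs + 1) / docs_j)
    }
  }
  score
}

# Made-up matrix: four titles, four terms
dtm_demo <- matrix(
  c(1, 1, 0, 0,
    1, 1, 0, 0,
    0, 0, 1, 1,
    1, 0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("title_", 1:4), c("baby", "sleep", "teen", "love"))
)

coh_related   <- semantic_coherence(dtm_demo, c("baby", "sleep"))  # often co-occur
coh_unrelated <- semantic_coherence(dtm_demo, c("baby", "love"))   # never co-occur
c(related = coh_related, unrelated = coh_unrelated)
```

Words that never share a title pull the score down, so a topic made of such words looks less coherent. Exclusivity is a separate measure of how specific the top words are to one topic, which is why the two are read together when choosing k.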

Twenty-six Topics and their highly associated words

With a model of 26 topics best fitting the data, let’s take a look at the highly associated words for each of the 26 topics. We see books on guiding parents to love their kids and teens (Topic 1), on motherhood and pregnancy (Topics 3 and 24), on helping parents develop rich relationships with their kids (Topic 25), on babies’ language development and sleep (Topics 9 and 6), on journaling the process of motherhood and pregnancy (Topic 11), and on what to expect in the first year of parenting (Topic 5). There are also activity books for kids (Topic 14), dad-joke books for parents (Topic 18), books related to children with autism (Topic 10), and books on home-schooling (Topic 13).



| Topic | Expected topic proportion | Top 6 terms |
|---|---|---|
| Topic 2 | 0.109 | famili, babi, mom, practic, blue, calendar |
| Topic 1 | 0.082 | life, love, teen, kid, boy, guid |
| Topic 25 | 0.073 | children, rais, guid, kid, rich, relationship |
| Topic 3 | 0.068 | guid, pregnanc, edit, birth, first, revis |
| Topic 7 | 0.063 | child, kid, rais, learn, children, empow |
| Topic 20 | 0.060 | help, kid, child, children, guid, mind |
| Topic 19 | 0.058 | talk, child, children, edit, littl, way |
| Topic 9 | 0.052 | babi, languag, help, child, guid, children |
| Topic 24 | 0.041 | mother, guid, mom, child, day, motherhood |
| Topic 16 | 0.032 | kid, children, child, build, power, thrive |
| Topic 11 | 0.030 | journal, pregnanc, babi, kid, keepsak, memori |
| Topic 23 | 0.028 | babi, know, year, need, everi, first |
| Topic 18 | 0.027 | book, girl, dad, joke, kid, bodi |
| Topic 4 | 0.026 | read, novel, six, help, french, child |
| Topic 13 | 0.026 | child, home, guid, read, age, teach |
| Topic 5 | 0.026 | year, expect, first_year, famili, first, effect |
| Topic 12 | 0.025 | child, rais, children, shape, kid, complet |
| Topic 22 | 0.025 | littl, rais, book, guid, littl_book, children |
| Topic 14 | 0.025 | activ, workbook, child, fun, kid, exercis |
| Topic 17 | 0.024 | kid, guid, way, overcom, go, kid_guid |
| Topic 15 | 0.023 | rais, success, kid, child, children, world |
| Topic 21 | 0.019 | children, adult, rais, happi, child, kind |
| Topic 10 | 0.018 | autism, mindset, think, help, kid, skill |
| Topic 6 | 0.017 | sleep, babi, babi_sleep, night, solut, help |
| Topic 8 | 0.017 | magic, old, babi, wean, year_old, year |
| Topic 26 | 0.005 | children, guid, feel, way, emot, kid |

Conclusion

It was interesting to see which topics of parenting books are offered on Amazon.sg. They range from practical books such as journals and joke books to guidebooks and books on parenting skills and knowledge. There are still more books for mums than for dads. It would be interesting to look at how these topics change with publication date. Would the number of books for dads change over time? Would certain topics (e.g., books on brain development) change over time?

About me

Go to my personal website, gerardchung.com, to check out my other ongoing work.