A Collection of Data Science Take-Home Challenges

Marketing Email Campaign


Optimizing marketing campaigns is one of the most common data science tasks.

Among the many possible marketing tools, one of the most efficient is using emails. Beside its efficiency, emails are great cause they are free and can be easily personalized.

Email optimization involves personalizing the text and/or the subject, who should receive it, when should be sent, etc. Machine Learning excels at this.

Challenge Description

The marketing team of an e-commerce site has launched an email campaign. This site has email addresses from all the users who created an account in the past.

They have chosen a random sample of users and emailed them. The email let the user know about a new feature implemented on the site. From the marketing team perspective, success is if the user clicks on the link inside of the email. This link takes the user to the company site.

You are in charge of figuring out how the email campaign performed and were asked the following questions:


We have 3 tables downloadable by clicking here.

The 3 tables are:

    email_table - info about each email that was sent


    email_opened_table -  the id of the emails that were opened at least once. 


    link_clicked_table -  the id of the emails whose link inside was clicked at least once. This user was then brought to the site.



    Let's check one email that was sent

head(email_table, 1)

Column Name Value Description
email_id 85120 The Id of the email
email_text short_email That was a short email
email_version personalized It was personalized with the user name in the text
hour 2 It was sent at 2AM user local time
weekday Sunday It was sent on a Sunday
user_country US The user is based in the US
user_past_purchases 5 The user in the past has bought 5 items from the site

    Let's check if that email was opened

subset(email_opened_table, email_id == 85120)

<0 rows> (or 0-length row.names) # Nop. The user never opened it.

    We would obviously expect that the user never clicked on the link, since you need to open the email in the first place to be able to click on the link inside. Let's check:

subset(link_clicked_table, email_id == 85120)

<0 rows> (or 0-length row.names) # The user obviously never clicked on the link.

Click here for mock phone interviews to practice similar questions, DS takehome challenges, and more