The Impact of Message Framing in Physical Activity (Bachelor's Thesis)
Overview
In this project, I studied the impact of message framing on physical activity levels. I sent participants one of two types of emails, positively or negatively framed, kept a control group that received no emails, and measured the change in activity levels over one month. In the end, no significant effects were found, although I believe this reflects the nature of the intervention rather than the overall effectiveness of message framing.
Methods
- Describe the research design (e.g., survey, experiment, field study).
- Tools used (e.g., Python for data analysis, R, Qualtrics for surveys).
- Data collection and sample details (ensuring privacy/anonymization).
Mathematical model and Thesis
In essence, I used a moderation model in which the type of message (positive, negative, or control) was the independent variable, self-efficacy and attitudes towards physical activity were the moderators, and the change in physical activity over one month was the dependent variable. As a flow chart, this can be visualized as
graph LR
D{Self-Efficacy} -->|β| B;
A --> X((( )));
D --> X;
X --> |δ| B
A[Message Framing] --> |α| B[Change in Physical Activity];
A --> Y((( )));
E --> Y;
Y --> |ε| B;
E{Attitudes} --> |γ| B;
In particular, writing \(X\) for message framing, \(M_1\) and \(M_2\) for the moderators self-efficacy and attitudes respectively, and \(Y\) for the change in physical activity, the model takes the form
\[ Y = c + \alpha X + \beta M_1 + \gamma M_2 + \delta X M_1 + \varepsilon X M_2, \]
for which I used a linear regression to find the best-fitting coefficients \(c, \alpha, \beta, \gamma, \delta, \varepsilon\). (Some extra work was needed because \(X\) is really a categorical variable, so it was encoded into dummy variables.)
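To illustrate the dummy-variable encoding, here is a minimal sketch of how a moderation model of this kind could be fitted by least squares. It is not the analysis code from the thesis. The function names, the group labels, and the choice of the control group as the reference category are all assumptions made for the example, and it relies on NumPy.

```python
import numpy as np


def build_design_matrix(group, m1, m2):
    """Build the regression columns for a moderation model.

    Assumption: `group` holds the labels "control", "positive", "negative",
    and the control group is the reference category (both dummies are 0).
    Columns: intercept, 2 dummies, M1, M2, dummy*M1 (2), dummy*M2 (2).
    """
    group = np.asarray(group)
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    d_pos = (group == "positive").astype(float)
    d_neg = (group == "negative").astype(float)
    return np.column_stack([
        np.ones_like(m1),          # intercept c
        d_pos, d_neg,              # main effect of framing (dummy coded)
        m1, m2,                    # main effects of the two moderators
        d_pos * m1, d_neg * m1,    # framing x self-efficacy interactions
        d_pos * m2, d_neg * m2,    # framing x attitudes interactions
    ])


def fit_moderation_model(group, m1, m2, y):
    """Return the least-squares coefficients, in design-matrix column order."""
    design = build_design_matrix(group, m1, m2)
    coefs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coefs


if __name__ == "__main__":
    # Synthetic data only, to show the call; these are not study results.
    rng = np.random.default_rng(0)
    n = 90
    group = rng.choice(["control", "positive", "negative"], size=n)
    m1 = rng.normal(size=n)
    m2 = rng.normal(size=n)
    y = rng.normal(size=n)
    print(fit_moderation_model(group, m1, m2, y))
```

Because \(X\) has three levels, it becomes two dummy columns, so each of \(\alpha\), \(\delta\) and \(\varepsilon\) is represented by two coefficients in this sketch, one per non-control group.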
Here you can find my complete thesis.
Supplementary Code
I needed code for two main purposes: analysing the data in the .csv files and automating the emails. In particular, the code that I ran every day took care of the following:
- Update my csv file of users, recording how many days into the intervention each user was and identifying which procedures needed to be carried out.
- If a user had finished the intervention (participated for over 30 days), remove them from the email list and place them in a different file to be used for the final survey.
- Otherwise, check which group the user belonged to and assign a message. A message was sent to a user every 3 days from the start of the intervention, and it could be gain-framed, loss-framed, or no message at all (for the control group).
- Send the emails.
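The daily decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions: the names `daily_action`, `INTERVENTION_DAYS` and `MESSAGE_INTERVAL` are hypothetical, the group labels are made up for the example, and the sketch assumes that messages go out on days 0, 3, 6, and so on. In the actual script the schedule comes from the per-day lines of the subject and body text files.

```python
from datetime import datetime

INTERVENTION_DAYS = 30  # from the text: the intervention lasts 30 days
MESSAGE_INTERVAL = 3    # from the text: one message every 3 days


def daily_action(start: datetime, group: str, now: datetime) -> str:
    """Decide what to do for one participant on a given day.

    Assumption: `group` is one of "gain", "loss" or "control", and the first
    message is sent on day 0 (the day the participant signed up).
    """
    elapsed = (now - start).days
    if elapsed > INTERVENTION_DAYS:
        # Participant is done: move them to the final-survey list
        return "final_survey"
    if group == "control":
        # The control group never receives a framed message
        return "none"
    if elapsed % MESSAGE_INTERVAL == 0:
        return f"send_{group}"
    return "none"


if __name__ == "__main__":
    signup = datetime(2024, 1, 1)
    for day in (1, 2, 4, 5):
        today = datetime(2024, 1, day)
        print(day, daily_action(signup, "gain", today))
```

Keeping the decision separate from the csv handling and the SMTP calls makes it possible to check the schedule without sending anything.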
import smtplib
import pandas as pd
import random
from email.mime.text import MIMEText
from datetime import datetime
from dataclasses import dataclass
from typing import List
# First define the constants SEND_EMAILS, INIT_SURVEY_PATH, FINAL_SURVEY_PATH, EMAILS_AND_GROUPS_PATH, OVER_30_PATH, SENDER, PASSWORD
@dataclass
class EmailObj:
    email: str
    subject: str
    body: str
def extract_subjects_and_bodies() -> tuple:
    """Extracts the subjects and bodies from the .txt
    files and returns them as lists of strings.
    Returns:
        tuple: (subjects_text, bodies_text)
        With both being lists of 3 lists of 31 strings
    """
    subjects_text = [[], [], []]
    bodies_text = [[], [], []]
    for group_num in range(3):
        with open(f'subjects_group_{group_num+1}.txt', 'r') as file:
            for line in file:
                # Append each subject line, removing leading/trailing whitespace
                subjects_text[group_num].append(line.strip())
        with open(f'body_group_{group_num+1}.txt', 'r') as file:
            for line in file:
                # Append each body line, removing leading/trailing whitespace
                bodies_text[group_num].append(line.strip())
    return subjects_text, bodies_text
def match_emails_to_person() -> List[EmailObj]:
    """Match the emails to the subjects and bodies corresponding to them
    depending on the group number and the number of days since the survey was filled.
    Returns:
        List[EmailObj]: list of EmailObj objects, one for each entry in the
        EMAILS_AND_GROUPS_PATH file that has a message scheduled for today
    """
    subjects_text, bodies_text = extract_subjects_and_bodies()
    emails_groups_df = pd.read_csv(EMAILS_AND_GROUPS_PATH)
    emailObjects = []
    for _, row in emails_groups_df.iterrows():
        email = row['Email Address']
        group_number = row['Group Number']
        date = row['Timestamp']
        date = datetime.strptime(date, '%m/%d/%Y %H:%M:%S')
        num_days = (datetime.now() - date).days
        # Skip users whose day count is past the end of the message files
        if num_days >= len(subjects_text[group_number-1]) \
                or num_days >= len(bodies_text[group_number-1]):
            continue
        subject = subjects_text[group_number-1][num_days]
        body = bodies_text[group_number-1][num_days]
        # Debug output for a single test address
        if email == "snmendozav@gmail.com":
            print(f"Email: {email}, Group: {group_number}, Days: {num_days}", subject)
            # print(f"Email: {email}, Group: {group_number}, Days: {num_days}, Subject: {subject}, Body: {body}")
        # An empty line in the text files means no message is due on that day
        if subject == "" or body == "":
            continue
        emailObjects.append(EmailObj(email, subject, body))
    return emailObjects
def send_email(emailObjects: List[EmailObj], sender: str, password: str):
    """Send one email for each entry in the emailObjects list
    Args:
        emailObjects (List[EmailObj]): EmailObj objects to be sent
        sender (str): Email address of the sender
        password (str): Password of the sender's email
    Returns:
        None: No return value
    """
    for emailObject in emailObjects:
        recipient = emailObject.email
        subject = emailObject.subject
        body = emailObject.body
        msg = MIMEText(body, 'html')
        msg['Subject'] = subject
        msg['From'] = sender
        # recipient is a single address string, so it is used directly
        # (joining it with ', ' would split it into separate characters)
        msg['To'] = recipient
        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp_server:
            smtp_server.login(sender, password)
            smtp_server.sendmail(sender, recipient, msg.as_string())
        # print("Message sent!")
    return None
def update_emails_and_groups() -> None:
    """Updates the emails_and_groups.csv file by adding the emails from the
    INIT_SURVEY_PATH file that are not yet tracked in any of the other files
    Returns:
        None: No return value
    """
    init_survey_df = pd.read_csv(INIT_SURVEY_PATH)
    emails_groups_df = pd.read_csv(EMAILS_AND_GROUPS_PATH)
    final_survey_df = pd.read_csv(FINAL_SURVEY_PATH)
    over_30_df = pd.read_csv(OVER_30_PATH)
    # Iterate over the "Email Address" column of the initial survey
    for _, row in init_survey_df.iterrows():
        email = row['Email Address']
        init_time = row['Timestamp']
        # Only add emails that are not yet in any of the tracking files
        if (email not in emails_groups_df['Email Address'].values
                and email not in final_survey_df['Email Address'].values
                and email not in over_30_df['Email Address'].values):
            # Generate a random group number between 1 and 3
            group_number = random.randint(1, 3)
            date = datetime.strptime(init_time, '%m/%d/%Y %H:%M:%S')
            num_days = (datetime.now() - date).days
            # Add a new row to emails_groups_df
            new_row = {'Email Address': email,
                       'Timestamp': init_time,
                       'Group Number': group_number,
                       'Number of Days': num_days}
            emails_groups_df = pd.concat([emails_groups_df, pd.DataFrame([new_row])], ignore_index=True)
    emails_groups_df.to_csv(EMAILS_AND_GROUPS_PATH, index=False)
    return None
def find_over_30_days():
    """Finds the emails that have been in the EMAILS_AND_GROUPS_PATH
    file for over 30 days (the code uses a threshold of 32 days). Then removes
    them and reallocates them into the OVER_30_PATH file.
    Returns:
        None: No return value
    """
    emails_groups_df = pd.read_csv(EMAILS_AND_GROUPS_PATH)
    over_30_df = pd.read_csv(OVER_30_PATH)
    for idx, row in emails_groups_df.iterrows():
        # Recompute the day count from the sign-up timestamp; the stored value
        # is otherwise only set once, when the user is first added
        date = datetime.strptime(row['Timestamp'], '%m/%d/%Y %H:%M:%S')
        num_days = (datetime.now() - date).days
        emails_groups_df.at[idx, 'Number of Days'] = num_days
        if num_days > 32 and row['Email Address'] not in over_30_df['Email Address'].values:
            newrow = {'Email Address': row['Email Address'],
                      'Timestamp': row['Timestamp'],
                      'Group Number': row['Group Number']}
            over_30_df = pd.concat([over_30_df, pd.DataFrame([newrow])], ignore_index=True)
            emails_groups_df.drop(idx, inplace=True)
    over_30_df.to_csv(OVER_30_PATH, index=False)
    emails_groups_df.to_csv(EMAILS_AND_GROUPS_PATH, index=False)
    return None
def find_filled_final_survey():
    """Removes from the OVER_30_PATH file the emails that have filled out the
    final survey, and sends an email with the final survey link to those in
    the OVER_30_PATH file that have not filled it out yet.
    Returns:
        None: No return value
    """
    over_30_df = pd.read_csv(OVER_30_PATH)
    final_survey_df = pd.read_csv(FINAL_SURVEY_PATH)
    for idx, row in over_30_df.iterrows():
        email = row['Email Address']
        if email in final_survey_df['Email Address'].values:
            # Survey already completed: stop tracking this user
            over_30_df.drop(idx, inplace=True)
        else:
            subject = "Final Step!"
            body = "Congratulations, you are almost there! To finalize this study (and participate in the raffle for the prize), please complete the following short survey: https://forms.gle/gSwfwrJHFtMmsMZC8\n\n Thank you for your participation!"
            if SEND_EMAILS:
                send_email([EmailObj(email, subject, body)], SENDER, PASSWORD)
            else:
                print("Sending email to: ", email)
                # print("Subject: ", subject)
                # print("Body: ", body)
    final_survey_df.to_csv(FINAL_SURVEY_PATH, index=False)
    over_30_df.to_csv(OVER_30_PATH, index=False)
    return None
def daily_process():
    """The process that needs to be run every day
    """
    # Update all the .csv files
    update_emails_and_groups()
    find_over_30_days()
    find_filled_final_survey()
    # Match the emails to the subjects and bodies corresponding to them
    emailObjects = match_emails_to_person()
    # Send the emails
    if SEND_EMAILS:
        send_email(emailObjects, SENDER, PASSWORD)
    else:
        for emailObj in emailObjects:
            print("Sending email to: ", emailObj.email)
This automation was needed because the sample was large enough to make sending each email individually unfeasible.
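The script expects the constants SEND_EMAILS, INIT_SURVEY_PATH, FINAL_SURVEY_PATH, EMAILS_AND_GROUPS_PATH, OVER_30_PATH, SENDER and PASSWORD to be defined before it runs. One possible way to supply them without writing the password into the source file is to read them from environment variables, as in the sketch below. The helper `load_settings`, the environment variable names, and every file name other than emails_and_groups.csv are assumptions made for this example.

```python
import os
from pathlib import Path


def load_settings(env=os.environ) -> dict:
    """Collect the constants the daily script needs from environment variables.

    Assumption: all csv files live in one directory, given by STUDY_DATA_DIR.
    """
    data_dir = Path(env.get("STUDY_DATA_DIR", "."))
    return {
        # Emails are only sent when SEND_EMAILS is explicitly set to "1"
        "SEND_EMAILS": env.get("SEND_EMAILS", "0") == "1",
        "INIT_SURVEY_PATH": data_dir / "init_survey.csv",      # hypothetical name
        "FINAL_SURVEY_PATH": data_dir / "final_survey.csv",    # hypothetical name
        "EMAILS_AND_GROUPS_PATH": data_dir / "emails_and_groups.csv",
        "OVER_30_PATH": data_dir / "over_30.csv",              # hypothetical name
        "SENDER": env.get("STUDY_SENDER", ""),
        "PASSWORD": env.get("STUDY_PASSWORD", ""),
    }


if __name__ == "__main__":
    settings = load_settings()
    # Print everything except the password
    for name, value in settings.items():
        if name != "PASSWORD":
            print(name, "=", value)
```

Defaulting SEND_EMAILS to off means a test run only prints the recipients, which matches the dry-run branch already present in the script.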
Results
Summarize findings and their implications. Explain how results informed decisions or further research.
Impact
My results suggest that emails were not an effective intervention channel. Indeed, the participants who received negative or positive framing showed no statistically significant difference in activity levels from the control group. This should be considered in future interventions, and other delivery methods (such as mobile applications) may be worth adopting. More broadly, the study is a step towards understanding which types of behavioural interventions can make an impact on the activity levels of students.
Next Steps & Reflections
What could be improved, follow-up studies, lessons learned.