CSV files in static folder regularly wiped out on Heroku

#1 by Juliana

Hi everyone,

My oTree app writes CSV files that are stored in the static folder. The data in those CSV files needs to be used by the app later on. In my pilot study, I realised that the CSV files are wiped out by Heroku every 24 hours. Has anyone come across this issue and, if so, what is the best way to permanently store the data that the app writes to the CSV files in the static folder?

Thank you so much!

Juliana

#2 by Chris_oTree

Don’t use CSV. Store the data in a model instead: player, participant, or ideally an ExtraModel.

#3 by Juliana

Thanks, Chris. I use CSV files because my experiment involves multiple sessions / experiments (each with different apps), run across several weeks. The CSV files store the data about the participants' decisions that needs to be used in the subsequent sessions / experiments. The data is not linked to a participant code (as defined in oTree), but to another identifier that allows me to track who completes all sessions.

If I need to use the CSV files, does anyone know about any solution so that the data is not wiped out from those files?

#4 by Chris_oTree

You can store any custom data in a Postgres database directly on Heroku. Provision a second Postgres database and connect to it using psycopg2:

https://stackoverflow.com/questions/15634092/connect-to-an-uri-in-postgres

You can get the connection parameters from the database URL that Heroku creates automatically when the add-on is provisioned. The name of the config variable holding that URL usually contains a color like GREEN, BLUE, etc.
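As a rough sketch of what that could look like: the snippet below splits a database URL into connection parameters and passes them to psycopg2. The variable name HEROKU_POSTGRESQL_GREEN_URL and the helper names are assumptions for illustration; use whatever name appears in your app's config vars.

```python
import os
from urllib.parse import urlparse


def connection_params(database_url):
    """Split a postgres:// URL into keyword arguments for psycopg2.connect."""
    parts = urlparse(database_url)
    return {
        "dbname": parts.path.lstrip("/"),
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to the default Postgres port
    }


def connect_second_db(env_var="HEROKU_POSTGRESQL_GREEN_URL"):
    """Open a connection to the second database.

    env_var is a hypothetical name; check your own config vars.
    """
    import psycopg2  # third-party package, imported here so the parser above works without it

    return psycopg2.connect(sslmode="require", **connection_params(os.environ[env_var]))
```

psycopg2 should also accept the URL string directly as its first argument, so the parsing step is mainly useful if you want to inspect or log the individual parameters.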

#5 by Juliana

Thank you for your response Chris, that is helpful!

In terms of implementation, how does the solution you suggest compare with using Simple File Upload (https://www.simplefileupload.com/heroku)? Since I have limited programming knowledge, I would rather invest time in the solution that is the easiest from a programming perspective. However, I am not yet confident I can use Simple File Upload for what I need. Specifically, my oTree app writes CSV files with data on individual decisions that need to be used in subsequent sessions (run several weeks apart).

For instance, in week 2 of the study, only people who completed week 1 can participate, and the parameters of the experiment in week 2 depend on decisions they made in week 1. In week 1, the oTree app writes a CSV file with a participant identifier (to track who has participated) and the decisions that are relevant for the session in week 2. In week 2, the oTree app reads the CSV file, which determines some parameters of the session in week 2 (e.g. time spent on the real effort task). The issue I currently have with Heroku is that the CSV files are wiped out every 24 hours, so in week 2 the data is no longer there and participants cannot complete the session.

I have to use the csv files method, and I would like to get advice on what is the easiest solution (from a programming perspective) to keep the data written by the oTree app in the csv files.

Thank you so much!

Juliana

#6 by Chris_oTree

Simple file upload seems designed for the case where you need participants to upload files from their browser. But that doesn't seem to be what you're doing.

Are you sure you need to use CSV? If you're storing data for read/write from Python code, it seems more appropriate to use SQL tables (Postgres).

#7 by Juliana

Thank you Chris, that is very helpful. Is there any example of oTree code that writes and reads data from a second Postgres database (the solution you suggest)?

Cheers,
Juliana

#8 by BonnEconLab

I have used Juliana’s approach myself for the very same purpose: storing participants’ decisions from the first part of an experiment in a CSV file so that the decisions can be referred to in subsequent parts several days or even weeks later on the same server. We did not save the CSV file in the static folder, though. We saved it in the top folder of the oTree project, that is, the same folder in which the settings.py file is located.

Our study was run on a web server in-house, and it worked flawlessly.

I don’t have any experience with Heroku. Maybe Heroku won’t wipe the CSV file every 24 hours if it is not stored in the static folder but in a different location? Have you tried that, Juliana?

#9 by Juliana

Hi!

Thank you so much for sharing this information with me. The issue is specific to the way Heroku is built. I can see why you don't have the issue if you are not using Heroku. I will try what you suggest though, who knows if that solves the problem... 

Does anyone know about good alternatives to Heroku to deploy an oTree experiment online? Unfortunately at my university, we do not have a web server in-house that is accessible by non-university students.

Thank you!

Cheers,
Juliana

#10 by Chris_oTree

The files will get deleted no matter what folder you put them in. 

The Postgres solution is just a few lines of code. See the psycopg2 tutorial.

#11 by Chris_oTree

You will first need to create a table with the columns you want. That can be done from the Postgres command line. Then in your Python code you use INSERT to write a row and SELECT to read a row.
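The steps above could be sketched roughly as follows. The table name pre_survey and its columns are made up for illustration, and conn is assumed to be an open psycopg2 connection to the second database. Treat this as a starting point, not a finished implementation.

```python
# Run once, e.g. from the Postgres command line (hypothetical table and columns).
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS pre_survey (
    participant_id TEXT PRIMARY KEY,
    session INTEGER,
    time_slider_1 INTEGER
)
"""


def save_decision(conn, participant_id, session, time_slider_1):
    """Write one row with INSERT; conn is an open psycopg2 connection."""
    with conn:  # commits on success, rolls back on error
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO pre_survey (participant_id, session, time_slider_1) "
                "VALUES (%s, %s, %s)",
                (participant_id, session, time_slider_1),
            )


def load_decision(conn, participant_id):
    """Read one row with SELECT; returns a tuple, or None if the ID is unknown."""
    with conn:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT session, time_slider_1 FROM pre_survey "
                "WHERE participant_id = %s",
                (participant_id,),
            )
            return cur.fetchone()
```

Passing the values as a separate tuple, rather than formatting them into the SQL string, lets psycopg2 handle quoting, which matters when the identifier is typed in by participants.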

#12 by BonnEconLab

Chris_oTree wrote:

> The files will get deleted no matter what folder you put them in.

Hmmmm. Why does Heroku do that? Out of security concerns, in fear of malicious files?

What happens if the file already exists in the data that you upload to Heroku, and you let oTree change the file’s contents during the experiment? Is the file then reset to the version that you uploaded?

#13 by Juliana

@BonnEconLab:

I have tested this, and the CSV file seems to be reset to the uploaded version.

@Chris:

If I use a second database for the data currently stored in CSV files, what happens if I upload a new version of the software during the course of the experiment? Since the experiment runs for one month, there is scope for something to go wrong with the server, or I might realise I need to change something. If I restart the database (because I uploaded the software again), will the second database (with the info currently in the CSV files) also be wiped out, just like the main database?

Thank you.
Juliana

#14 by Chris_oTree

The second database will not be affected by updates to oTree or resetdb, etc. You manage it totally separately.

#15 by Juliana

Thanks!

Below is an example of the code I currently use to read data from the CSV file, and another example that writes data to the CSV file. Would you be able to give an example of how I would modify them to read from / write to a second database?

Thank you so much!
Juliana


Reading data:

import csv

def check_id(id):
    # Return True if the given ID appears in id_database.csv
    with open('id_database.csv', encoding='utf-8-sig') as f:
        for row in csv.DictReader(f):
            if row["participant_id"] == id:
                return True
    return False
    
    
    
Writing data:

from csv import DictWriter

def update_pre_survey_database(values, id):
    # Append one row of decisions to pre_survey_database.csv
    filename = 'pre_survey_database.csv'
    headers = ['id', 'session', 'time_slider_1']
    values['session'] = 2
    values['id'] = id
    # The with block closes the file automatically
    with open(filename, 'a', newline='') as f_object:
        DictWriter(f_object, fieldnames=headers).writerow(values)
