oTree Forum >

Data too large, how to export csv?

#1 by Hauke

Hi Chris & Community,

I just ran an experiment where I made some recordings that I stored as base64 strings. As a consequence, the resulting table requires quite some memory (250 MB maybe). 

As a result, I cannot open the session's data tab. Moreover, I general CSV Data Export doesn't work either. Both yield application errors after some time of trying.

This brings me to the question of how to export the data in the desired format. I suspect that I'll have to use heroku but I don't know how (+ I don't have any real SQL experience).

Can someone help out?

Best
Hauke

#2 by Chris_oTree

That's really big. Storing media in the DB can quickly overwhelm the system...I think you will need to connect to the postgres DB directly. Install postgres on your computer, and then connect to the database (you can get the connection parameters in DATABASE_URL config var). Or use Heroku PG backups to download the DB as a blob and load it in postgres locally.

If you want some limited subset of the data without the base64 stuff, you should use custom_export.

#3 by Hauke

Thanks!

Do you have some postgres snippets that you use on your end to format the all_apps_wide.csv you provide? Or where can I find them?

My hope is that I use postgres and then export a data file that looks the same as usual.

#4 by Chris_oTree

You would need to use raw SQL queries on the tables individually. It won't be straightforward to replicate oTree's CSV export which involves complex code.

#5 by Chris_oTree

Actually another thing you can try is to install the same version of oTree locally, then set your DATABASE_URL environment variable to match DATABASE_URL on Heroku. Then run 'otree prodserver'. Then you should have otree running locally with the database from your Heroku server.

Then try doing data export. Since the server is running the advantages are: 

- you won't be subjected to Heroku's request cutoffs such as 30 seconds
- you are not restricted by Heroku's RAM limit but rather your PC's full resources
- If you generate any files they will be created locally on your PC, not on the Heroku server which is harder to download from.

But this is experimental and you still need to know what you are doing to get the data...and definitely back up the DB first.

#6 by BonnEconLab

If you can access the Heroku server via the terminal, you could try to dump the contents of your database via

pg_dump --dbname=postgresql://postgres:<PASSWORD>@127.0.0.1:5432/django_db > FILENAME.sql

(the port 5432 may be different). This would produce a file (which will be large in your case) that contains all SQL commands to recreate the entire database. That is, it would also contain all the inputs of your participants in plain text so that you could extract the information that you need.

The file would also allow you to recreate the database locally such that you could subsequently perform the CSV export locally.

Write a reply

Set forum username