Weird server problem

#1 by Christoph

Hi everybody,
apologies in advance, this will be a somewhat vague post, as I am currently fishing for solutions/ideas. I ran a study on Friday on our own server (otree version 5.9.3) with over 2000 participants. Beforehand, I tested as far as I could whether the server can handle a larger amount of traffic, and it seemed to be working well. Still, the server ran into problems: otree stopped responding, so that at one point participants were not able to open the study. Here is part of the errors I encountered (the rest of the logs is mostly the same message repeated):

Traceback (most recent call last):
  File "/home/lab/venv_otree/lib/python3.7/site-packages/uvicorn/protocols/http/h11_impl.py", line 172, in handle_events
    event = self.conn.next_event()
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_connection.py", line 443, in next_event
    exc._reraise_as_remote_protocol_error()
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_util.py", line 76, in _reraise_as_remote_protocol_error
    raise self
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_connection.py", line 425, in next_event
    event = self._extract_next_receive_event()
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_connection.py", line 367, in _extract_next_receive_event
    event = self._reader(self._receive_buffer)
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_readers.py", line 73, in maybe_read_from_IDLE_client
    request_line_re, lines[0], "illegal request line: {!r}", lines[0]
  File "/home/lab/venv_otree/lib/python3.7/site-packages/h11/_util.py", line 88, in validate
    raise LocalProtocolError(msg)
h11._util.RemoteProtocolError: illegal request line: bytearray(b'\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03\xa2\xa5\xbd\xefF\x9b\xb7\xfe\xcdR\xa9\x1ds\x1c\x9c\xa6.\x1b\xbb\xb4\x1888\xf7$\xb4\xc7$>\xa2s\x97 #=l\xdf\x93\xbf\x81\xdd4\xe4Ktf$\x97l\xebc\x8d\xac\x87\xab\xe5\x7f\x8a@\xa8\xc7%\x96\x8c1\x00 zz\x13\x01\x13\x02\x13\x03\xc0+\xc0/\xc0,\xc00\xcc\xa9\xcc\xa8\xc0\x13\xc0\x14\x00\x9c\x00\x9d\x00/\x005\x01\x00\x01\x93\xaa\xaa\x00\x00\x00\x00\x00"\x00 \x00\x00\x1dotree01.awi.uni-heidelberg.de\x00\x17\x00\x00\xff\x01\x00\x01\x00\x00')
Invalid HTTP request received.

At this point I simply decided to restart the server and run the remainder of the study in smaller batches. However, last night I ran into the same problem. This time, though, the problem occurred after only around 10 participants (3 could still finish the study). There were no errors recorded. The server itself still seemed to work fine, but otree was no longer responding (e.g. trying to open pages in the admin section was not possible; they simply kept loading). It worked again after stopping the server and rerunning prodserver.

After yesterday I am a bit lost as to what the problem is. Since I restarted the server and set up a completely new study, it seems strange to me that a similar problem occurs after only a handful of people. I recently used the server to run two studies with a couple of hundred people each, without any problems.

Has anyone encountered this behavior before, by any chance? Any ideas what might have gone wrong here?

Kind regards
Christoph

#2 by Domnica

Hi Christoph,

This might not be helpful, but what database are you using? (Changing to PostgreSQL resolved a lot of issues for me.) Also, are you on a university server? (Our IT often blocks some requests on our servers, so it might be worth checking.)

Best,

Domnica

#3 by FlavourDave (edited)

Hey,

I'm facing the exact same problem, even when using Postgres and otree prodserver on an AWS EC2 machine. Googling the error brings up results referring to a bug in uvicorn/gunicorn: https://github.com/encode/uvicorn/issues/441 and https://github.com/encode/uvicorn/issues/344

This is quite problematic, as the entire web server is unreachable once this happens until prodserver is restarted by hand! This has already interfered with one of our experiments and required constant monitoring: checking for this error and restarting as soon as it occurs… kind of a showstopper for using otree (5.9.7) in this state.

Best,
Dave
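
As a stopgap for the manual monitoring described in #3, a small watchdog script could do the polling. The following is only a minimal sketch using the Python standard library, not anything shipped with oTree: the function names, the polling interval, and the failure threshold are all assumptions, and the actual restart step is left to whatever process manager you use.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_responsive(url, timeout=5.0):
    """Return True if the server answers an HTTP request within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as exc:
        # The server did answer, just with an error status.
        return exc.code < 500
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused connection, timeout, or a malformed response.
        return False


def needs_restart(history, threshold=3):
    """Return True if the most recent `threshold` checks all failed."""
    recent = history[-threshold:]
    return len(recent) == threshold and not any(recent)


def watch(url, interval=30.0, threshold=3):
    """Poll `url` until it has failed `threshold` checks in a row, then return."""
    history = []
    while not needs_restart(history, threshold):
        # Keep only the most recent results so the list stays small.
        history = (history + [is_responsive(url)])[-threshold:]
        time.sleep(interval)
    print(f"{url} failed {threshold} checks in a row; restart prodserver")
```

Calling `watch("http://localhost:8000/")` (the URL is a placeholder) would block until three consecutive checks fail. Requiring several failures in a row avoids restarting on a single slow response.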

#4 by gr0ssmann

I am intrigued by this. I/we run a very large setup of many oTree servers on one Debian machine and we have never encountered this. Could it be that configuring nginx as a reverse proxy like this:

    add_header Last-Modified $date_gmt;
    add_header Cache-Control 'no-store, no-cache';
    if_modified_since off;
    expires off;
    etag off;

… resolves the issue? Can you post your nginx configs?

#5 by FlavourDave (edited)

Hi,

We currently don't use a reverse proxy, just plain otree prodserver via HTTP. (We have used this setup before without any problems.)

Best,
Dave

#6 by gr0ssmann

I will spare you the web security anecdote and only remark that this may be one reason why this issue can appear. Nginx is likely to filter out invalid requests, so in my/our case they "never make it" to oTree. Just a hypothesis of course, but it might well be worth investigating.
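
For illustration: the payload in the traceback from #1 begins with `\x16\x03\x01`, which is how a TLS handshake record starts, and it contains the server's hostname. This suggests that the "illegal request line" came from a client attempting HTTPS against the plain-HTTP port. Below is a minimal sketch of the kind of check a front-end filter could make on the first bytes of a connection. The function name is hypothetical, and this is neither nginx's nor uvicorn's actual implementation.

```python
def classify_first_bytes(data: bytes) -> str:
    """Roughly classify the first bytes a client sends on a new connection."""
    # A TLS record header starts with the content type (0x16 = handshake),
    # followed by the major protocol version byte 0x03.
    if len(data) >= 3 and data[0] == 0x16 and data[1] == 0x03:
        return "tls-handshake"
    # An HTTP/1.x request line starts with an uppercase ASCII method token
    # followed by a space, e.g. b"GET / HTTP/1.1".
    method, sep, _rest = data.partition(b" ")
    if sep and method.isalpha() and method.isupper():
        return "http"
    return "unknown"
```

Feeding it the start of the logged bytearray, `classify_first_bytes(b"\x16\x03\x01\x02\x00\x01")`, yields `"tls-handshake"`, while an ordinary request such as `b"GET / HTTP/1.1\r\n"` yields `"http"`.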

#7 by FlavourDave (edited)

Sure, agreed. But since we usually complete the collection within a few hours or days and delete the instance directly afterwards, this is less of an issue for us at the moment. However, it is new that we get this error with this setup and have to restart completely to be reachable from the Internet again.

#8 by gr0ssmann (edited)

I wouldn't rule out that using nginx with a reasonable config can fix the error, though. If you don't want to set it up with TLS etc., I can offer that we arrange something using one of my domains. Email me at otree [at] [my_username_with_0_replaced_by_o] [dot] io if you're interested. I'd be interested to figure out whether this is resolved by using nginx as a "filter".

#9 by gr0ssmann

Oops, I misspecified my email address. See the edited message above for the correct version.

#10 by FlavourDave

Seriously, many thanks for your offer, but you don't have to make the effort. Next time, I will go the extra mile and set up nginx with HTTPS. Your argument that pre-filtering those requests can avoid this error sounds perfectly plausible. In previous surveys, this extra step was not necessary, but in the current situation it is apparently unavoidable. I will report back whether the error still appears with a more technically reasonable setup.

However, I think it's still important that the otree team has this issue on their radar; perhaps the error lies deeper.
