Solved

Why are my tests getting a '500 Internal Server Error'?

  • 8 February 2021

The symptoms:

  • Failing Test Commands
  • Elements which can’t be located
  • Screenshots showing a “500 Internal Server Error” page

Best answer by dylanatsauce 8 February 2021, 08:38


What that error means

There’s good news, and there’s bad news.  The good news is that the test is telling us exactly which component of your test architecture went wrong.  That error is one of the HTTP status codes: a pre-defined set of codes which indicate the outcome of an HTTP request (such as the one a browser makes to a server).  They’re grouped into five classes by their first digit, and each response has a specific three-digit number.
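To make the grouping concrete, here’s a minimal Python sketch that maps a status code to its class using the first digit.  The helper name `status_class` is my own invention for illustration; `HTTPStatus` comes from Python’s standard library.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Return the class of an HTTP status code, based on its first digit."""
    classes = {
        1: "informational",
        2: "successful",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


# 500 falls in the 5xx class: the server, not the browser, hit the problem.
print(HTTPStatus(500).phrase)  # Internal Server Error
print(status_class(500))       # server error
```

Anything in the 5xx class points at a server somewhere along the request path rather than at the browser making the request.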

 

This error here is 500 Internal Server Error.  The docs over at Mozilla tell us that a 500 error means:

The server has encountered a situation it doesn't know how to handle. 

So, one of the servers that was processing this request couldn’t do so properly, and had to raise an error instead. 

 

This is where the bad news comes in.  This error is almost always generated by your system under test, or by one of the intervening systems in your infrastructure; it’s not coming from a Sauce Labs machine.  That makes it effectively impossible for Sauce Labs to diagnose or correct the problem (see my note at the end).

 

What to do about it

Discovering the root cause of Internal Server Errors depends an awful lot on the Internals of your Server.  Here’s a rough guide of steps to take.

 

If the error happens before your test fully loads your application

This could indicate that the connection to your system under test is causing issues.  Perhaps there’s a malfunctioning proxy server, you’ve configured Sauce Connect to use the wrong proxy, or a load balancer is failing intermittently.

You should start by trying to determine which machine is causing problems:

  1. The error page itself might list which server raised the error  
  2. Check the browser’s URL bar to see if it’s the same URL your test was navigating to
  3. If you’re using Sauce Connect, gather some verbose logs and send them to us; they might allow us to tell you which machine is responsible
  4. Check with your security team to see if there might be a configuration requirement you’ve missed for traffic
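If you can reach the application from a machine inside your own network, a small script can collect some of the same clues as steps 1 and 2.  This is only a rough sketch under assumptions: the function names `summarize` and `probe` are made up for this example, and it relies on the standard `Server` and `Via` response headers, which your infrastructure may not set or may strip.  It also won’t see exactly what the browser in your test saw, so treat its output as a hint rather than proof.

```python
import urllib.error
import urllib.request


def summarize(expected_url, final_url, status, headers):
    """Collect the clues that hint at which machine answered the request."""
    return {
        "status": status,
        "final_url": final_url,
        # A different final URL suggests a redirect to another system.
        "redirected": final_url != expected_url,
        # These headers are optional; None means the header was absent.
        "server": headers.get("Server"),
        "via": headers.get("Via"),
    }


def probe(url, timeout=10.0):
    """Request `url` and summarize the response, even if it is an error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return summarize(url, resp.geturl(), resp.status, resp.headers)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses, but the error
        # object still carries the status code, URL, and headers.
        return summarize(url, err.url, err.code, err.headers)


# Example call (the URL is a placeholder, not a real endpoint):
# print(probe("https://your-app.example/login"))
```

A `redirected` value of `True` together with a 500 status suggests the failing machine is the one you were sent to, not the one you asked for.  Note that the comparison is a plain string match, so a redirect that only adds a trailing slash will also show up as `True`.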

Once you’ve figured out which server to blame, you’re gonna have to check its logs.  This is where any advice becomes almost useless: how your system logs, and what it logs, is impossible for me to know, I’m afraid!

 

If the error happens when your application is navigating between systems

If your app only experiences issues when going between different systems, you should try following the steps for “If the error happens before your test fully loads your application”, above.  This includes cases where your test needs to log into a separate site before being redirected to the app under test.

 

If the error happens when doing specific actions in your tests or only when using a specific browser

Something about those actions or that browser is making your system unhappy!  Sauce Labs’ browsers and devices are configured to interact with your applications just like they would if being used by a genuine user (see my note at the end).  An error of this kind means either that you’ve found a genuine bug in your system under test or that your specific test data is invalid.

You’re still going to have to check your server logs to find out what’s wrong, but at least you should have a good idea what to look for.

 

A note on root cause

I’ve worked at Sauce Labs for over eight years and handled thousands of tickets.  In that time, I have heard of one single case where a 500 Internal Server Error was caused by something Sauce Labs did… And even then, the fault ultimately lay with the system under test.  The problem was that one of our network traffic servers correctly included some metadata with a request.  The system under test wasn’t expecting this metadata and, instead of ignoring it like it should, it decided to freak out.  Almost anything is possible, so I won’t rule it out… But it’s vanishingly unlikely for this one specific error to be generated by our systems.
