4xx responses for bad query strings?

NESTING, DAVID M (SBCSI) Mon, 02 Aug 2004 15:13:33 -0700

I'm trying to determine a proper way to indicate a requested piece of information was 
not found, where the information is keyed not just on the URI components identifying 
the HTTP resource, but upon the query string as well.


If I have a content display resource at, say, http://example.com/path/display, and 
that resource generates a different page depending upon query string*, I might have 
URLs like:

http://example.com/path/display?page1
http://example.com/path/display?page2

If, however, one of those "logical" pages doesn't exist, with what HTTP response code 
should this resource respond?

My first thought was to use a 404 response, but it can be argued pretty convincingly 
that this response code indicates the HTTP resource itself (the "display" resource) is 
missing, when it's not.  It's just that the *query* to that resource failed to return 
any content.

My second thought was to use a 403, since that could be interpreted as just a generic 
refusal to handle the request.  We could put "not found" within the response body.  
This is a lot better, because there's no risk of user agent implementations (or users) 
thinking that the resource itself has gone MIA.

I don't think a 500-series error would be appropriate, because that implies a problem 
with the server, not with the request.

It could also be argued that the resource itself is operating correctly and the 
request was fine and is getting a valid response (of "no such content"), therefore it 
should use a 200 response code.  The problem with this, though, is that search engines 
will end up indexing it as though it were legitimate, which is not desirable**.
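
To make the candidates above concrete, here is a minimal sketch of the "display" 
resource with the status for a missing logical page left as a parameter.  The PAGES 
table, the respond() function and the "page3" query are hypothetical, made up only 
for illustration; this is not a recommendation of any one code.

```python
from http import HTTPStatus
from urllib.parse import urlsplit

# Hypothetical content store keyed on the query string (an assumption for this sketch).
PAGES = {"page1": "First page", "page2": "Second page"}


def respond(url, missing_status=HTTPStatus.NOT_FOUND):
    """Return (status_code, body) for a request to the display resource.

    missing_status is the code sent when the query string names no known page.
    """
    query = urlsplit(url).query
    if query in PAGES:
        return int(HTTPStatus.OK), PAGES[query]
    # The resource itself exists; only the query failed to select any content.
    return int(missing_status), "not found"


# Compare the three candidate codes discussed above for a page that does not exist.
for candidate in (HTTPStatus.NOT_FOUND, HTTPStatus.FORBIDDEN, HTTPStatus.OK):
    status, body = respond("http://example.com/path/display?page3", candidate)
    print(status, candidate.phrase, body)
```

In every case the body is the same; the only thing that changes is what a user agent 
or search engine is told about the request.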

I'm curious to know if there is a recommended practice here.  Many HTTP response 
codes are tied to the presence, absence, or abilities of the HTTP *resource* itself, 
without addressing resources that may change behavior based upon a query string.

Thanks for your help.

David Nesting

[*] - IMO, a proper solution would be to utilize some sort of URI translation to make 
the URI look more like /path/display/page1 or /path/page1, but I don't have that 
luxury here.
[**] - The use of "robots exclusion" information is not acceptable, because it either 
requires knowing in advance what query strings are valid and what are not (for use in 
the robots.txt file), or assumes the responses are HTML.
