Border case: Basic authentication with a comma in the auth realm fails

Hello libwww,

I've encountered a situation where browsers "just work" but
LWP::UserAgent fails.

This is a border case which apparently went unnoticed for decades, and
for my own problem I've found a ridiculously easy workaround.  So I'm
not sure whether it is a good idea or worth the effort to change
things.  I volunteer to create a test case but am hesitating with a
fix because this might break things in the real world.


SETUP:

   * A web server with basic authentication for an auth realm (chosen
     somewhat, but not completely, arbitrarily):

       $realm = "data, protected";

   * An LWP::UserAgent which gets passed the correct credentials with:
   
       $ua->credentials($netloc,$realm,$uid,$password);


SYMPTOM:

  No matter what, you'll get a 401 status code for every request.
  
  The web server sends a WWW-Authenticate header:

    WWW-Authenticate: Basic realm="data, protected"

  On the client side, it turns out that the requests don't contain the
  corresponding Authorization header.


ROOT CAUSE:

  When processing the WWW-Authenticate header, LWP::UserAgent
  unconditionally changes all commas in this header's value to
  semicolons:
  
    $challenge =~ tr/,/;/;  # "," is used to separate auth-params!!

  But this modifies the realm if there's a comma in it.  Therefore,
  there's no longer a match when the credentials are looked up later
  in the process, so no credentials are sent with the request.
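
  The effect can be reproduced without a server.  The following is a
  minimal sketch in plain Perl, not the actual LWP::UserAgent code
  path: the %credentials hash and the realm-extracting regex are
  made up for illustration, only the tr/// line is taken from LWP.

```perl
use strict;
use warnings;

# Credentials as the caller registered them, keyed by the original realm.
# (Illustrative store only; LWP keeps its credentials differently.)
my %credentials = ( 'data, protected' => [ 'uid', 'password' ] );

# The header value as sent by the server.
my $challenge = 'Basic realm="data, protected"';

# The unconditional translation quoted above.
$challenge =~ tr/,/;/;

# Extract the realm from the (now modified) challenge.
my ($realm) = $challenge =~ /realm="([^"]*)"/;

print "realm used for the lookup: $realm\n";
print "credentials found: ",
    ( exists $credentials{$realm} ? "yes" : "no" ), "\n";
```

  The lookup key becomes "data; protected", which was never
  registered, so the sketch reports that no credentials were found.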

  The code change happened almost exactly 20 years ago, in commit
  c4cefa219297e42ce73a10b6ad1fe4d9a19a9373 (1997-12-01).  Since then,
  RFC 2068 (dated Jan 1997) was obsoleted by RFC 2617 (Jun 1999),
  which in turn has been obsoleted by RFC 7235 (Jun 2014).  But does
  that mean it is reasonably safe to rely on HTTP::Headers::Util to
  split words as intended?

  Is it safe to do that replacement only _after_ the string
  "auth-params" or are there other values where that replacement is
  required?


WORKAROUNDS:

  * Either get rid of the comma in the web server's realm definition.
    Can be tricky if it isn't _your_ server.

  * Or do the same unconditional modification to the realm before
    passing it to the LWP::UserAgent's credentials method.  (BTW:
    Fixing it in LWP::UserAgent's get_basic_credentials method is
    possible but ugly, and not fully sufficient because you are
    advised to override this method when subclassing).

  * Or, if you're actually using a derived class like WWW::Mechanize
    and talking to just one application, use the credentials method of
    this class, which allows you to omit $netloc and $realm.
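
  The second workaround could look roughly like the sketch below.
  The helper name realm_as_seen_by_lwp is hypothetical; it simply
  repeats the translation LWP::UserAgent applies to the challenge, so
  that the registered realm matches the one looked up later.  The
  call to credentials is shown only as a comment, with the variables
  from SETUP.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the same comma-to-semicolon translation
# that LWP::UserAgent applies to the WWW-Authenticate value.
sub realm_as_seen_by_lwp {
    my ($realm) = @_;
    ( my $seen = $realm ) =~ tr/,/;/;
    return $seen;
}

my $realm     = 'data, protected';
my $lwp_realm = realm_as_seen_by_lwp($realm);
print "$lwp_realm\n";

# With a real user agent, the call from SETUP would then become:
#   $ua->credentials($netloc, realm_as_seen_by_lwp($realm), $uid, $password);
```

  This ties the calling code to an implementation detail of LWP, so
  it would have to be revisited if the translation ever changes.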

-- 
Cheers,
haj
Harald
1/12/2018 11:54:21 PM

Hi Harald,


> On Jan 12, 2018, at 6:54 PM, Harald Jörg <Harald.Joerg@arcor.de> wrote:
>
> Hello libwww,
>
> I've encountered a situation where browsers "just work" but
> LWP::UserAgent fails.
>
> This is a border case which apparently went unnoticed for decades, and
> for my own problem I've found a ridiculously easy workaround.  So I'm
> not sure whether it is a good idea or worth the effort to change
> things.  I volunteer to create a test case but am hesitating with a
> fix because this might break things in the real world.


Thanks very much for this detailed explanation of what you've been
seeing.  I don't really know this part of the code well enough to be
able to comment on this right now, but there was a recent pull request
which deals with authentication.  Does
https://github.com/libwww-perl/libwww-perl/pull/255/files fix anything
for you?

If it does or it doesn't, it might be worth commenting on the existing
pull request.

Best,

Olaf
olaf
1/23/2018 2:36:36 AM
Hello Olaf,

you write:

>> On Jan 12, 2018, at 6:54 PM, Harald Jörg <Harald.Joerg@arcor.de> wrote:
>>
>> Hello libwww,
>>
>> I've encountered a situation where browsers "just work" but
>> LWP::UserAgent fails.
>> [...]
>
> Thanks very much for this detailed explanation of what you've been
> seeing.  I don't really know this part of the code well enough to be
> able to comment on this right now, but there was a recent pull request
> which deals with authentication.  Does
> https://github.com/libwww-perl/libwww-perl/pull/255/files fix anything
> for you?
>
> If it does or it doesn't, it might be worth commenting on the existing
> pull request.

Thanks for the pointer.  Unfortunately that pull request tries to fix
another issue which isn't closely related to my own.  The pull request
does, however, introduce yet another of these unconditional translations
of commas to semicolons, which is somewhat foolhardy but doesn't do
extra damage.

I think I can prepare a fix to make authentication RFC compliant, but
since I haven't worked in the guts of LWP for 10+ years this would
also be foolhardy :)

Some more details on the handling of auth headers:

If I have a header like this:

   WWW-Authenticate: Basic realm="Hello, world"

...then LWP::UA converts this value to 'Basic realm="Hello; world"'.
This can't be right. Quoted strings should be retained as they are.
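
To illustrate what "retained as they are" could mean, here is a
minimal sketch that translates only the commas outside of quoted
strings.  The function name translate_unquoted_commas is hypothetical
and this is not what LWP::UA does today; it also addresses only the
quoted-string problem, not the question of multiple challenges in one
header.

```perl
use strict;
use warnings;

# Hypothetical alternative to the unconditional tr/,/;/: copy quoted
# strings (including backslash escapes) verbatim and translate only
# the commas found between them.
sub translate_unquoted_commas {
    my ($value) = @_;
    $value =~ s{ ( " (?: [^"\\] | \\. )* " ) | , }
               { defined $1 ? $1 : ';' }gex;
    return $value;
}

print translate_unquoted_commas('Basic realm="Hello, world"'), "\n";
print translate_unquoted_commas('Newauth realm="apps", type=1'), "\n";
```

The first value comes back unchanged, while in the second one the
comma between the two parameters is still turned into a semicolon.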

The conversion is done with the intent to fit the specs of
HTTP::Headers::Util::split_header_words, which works quite fine for
headers which aren't WWW-Authenticate.  But WWW-Authenticate is
"special", to put it politely.  The example in
https://tools.ietf.org/html/rfc7235#section-4.1 reads:

    WWW-Authenticate: Newauth realm="apps", type=1,
                      title="Login to \"apps\"", Basic realm="simple"

So, the comma is not only used to separate auth-params within one
authentication scheme, it also separates two different authentication
schemes.  The RFC says, encouragingly,

   User agents are advised to take special care in parsing the field
   value, as it might contain more than one challenge, and each
   challenge can contain a comma-separated list of authentication
   parameters.  Furthermore, the header field itself can occur multiple
   times.

Today, LWP::UA wouldn't be able to process the RFC example correctly.
The params of the header are parsed into a hash, so that the second
realm clobbers the first.  With the pull request it would be able to
process the following equivalent headers quite fine:

   WWW-Authenticate: Newauth realm="apps", type=1, title="Login to \"apps\""
   WWW-Authenticate: Basic realm="simple"

The options are: Either we take special care in parsing the field value,
or we just live with the fact that a comma in the realm might cause
issues, as we have for the last 20 years.  Ignorance is bliss :)
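
For the record, "special care" might look something like the sketch
below.  It is only an illustration under simplifying assumptions, not
a proposed patch: parse_challenges is a hypothetical name, the token
pattern is a rough approximation of the RFC's token, and token68
credentials are skipped.  The idea is to cut the value at commas
outside of quoted strings and to start a new challenge whenever an
item begins with a scheme name, so that each challenge keeps its own
parameters.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LWP: split a WWW-Authenticate field
# value into challenges, each with its own parameter hash.
sub parse_challenges {
    my ($value) = @_;
    my $token  = qr/[^\s=,"]+/;            # rough approximation of an RFC token
    my $quoted = qr/"(?:[^"\\]|\\.)*"/;    # quoted-string with backslash escapes
    my $param  = qr/($token)\s*=\s*($quoted|$token)/;

    # Cut the value at commas that are not inside a quoted string.
    my @items = $value =~ /((?:$quoted|[^",])+)/g;

    my @challenges;
    for my $item (@items) {
        $item =~ s/^\s+|\s+$//g;
        next unless length $item;

        my ( $scheme, $name, $val );
        if ( $item =~ /^($token)\s+$param\z/ ) {    # "Scheme name=value"
            ( $scheme, $name, $val ) = ( $1, $2, $3 );
        }
        elsif ( $item =~ /^$param\z/ ) {            # "name=value"
            ( $name, $val ) = ( $1, $2 );
        }
        elsif ( $item =~ /^($token)\z/ ) {          # "Scheme" on its own
            $scheme = $1;
        }
        else {
            next;    # token68 data and malformed items are skipped
        }

        push @challenges, { scheme => $scheme, params => {} }
            if defined $scheme;
        next unless defined $name && @challenges;

        # Remove the surrounding quotes and undo backslash escapes.
        if ( $val =~ s/^"(.*)"\z/$1/s ) {
            $val =~ s/\\(.)/$1/gs;
        }
        $challenges[-1]{params}{ lc $name } = $val;
    }
    return @challenges;
}

my $header = 'Newauth realm="apps", type=1, '
           . 'title="Login to \"apps\"", Basic realm="simple"';
for my $c ( parse_challenges($header) ) {
    print "$c->{scheme}: realm=$c->{params}{realm}\n";
}
```

With the RFC example this yields two challenges, Newauth with realm
"apps" and Basic with realm "simple", and the second realm no longer
clobbers the first.  A realm like "Hello, world" stays intact as well.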
-- 
Cheers,
haj
Harald
1/23/2018 4:21:21 PM