Friday, April 27, 2012

URL encoding in Java

Encoding URLs in Java is quite trivial. However, too often I see people using the URLEncoder class for this. This is WRONG.

The URLEncoder class is meant for encoding form data into the application/x-www-form-urlencoded format. The class is very poorly named, but the Javadocs do say this ;)
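For contrast, here is a minimal sketch of what URLEncoder is actually meant for: encoding one name/value pair of form data. The class name, the method name and the sample field are made up for illustration, and the Charset overload of URLEncoder.encode assumes Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FormEncodingDemo {

    // Encodes a single form field as name=value in the
    // application/x-www-form-urlencoded format (hypothetical helper).
    static String formField(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "="
                + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Spaces become "+" and "&" becomes "%26", which is right for form data.
        System.out.println(formField("q", "Blue Widgets & more"));
    }
}
```

In a form body or query string, "+" for a space is exactly what the receiving side expects, so here the class does its job.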

The proper class for URL encoding is the URI class.

Let's assume we have a URL such as http://www.somedomain.com/Blue Widgets

If you encode this with the URLEncoder class, you get:

http%3A%2F%2Fwww.somedomain.com%2FBlue+Widgets

Unreadable and incorrectly encoded. Reserved characters such as : and / should not be encoded when they act as delimiters. Also, URLEncoder encodes a space as "+", even though in a URL path it should be encoded as "%20".
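If you want to see the problem for yourself, a small sketch along these lines reproduces it. The class and method names are just for illustration, the input is the example URL with /Blue Widgets as its path, and the Charset overload assumes Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncoderMisuseDemo {

    // Runs a whole URL through URLEncoder, which is the mistake described above.
    static String misencode(String url) {
        return URLEncoder.encode(url, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The delimiters ":" and "/" get percent-encoded and the space turns into "+".
        System.out.println(misencode("http://www.somedomain.com/Blue Widgets"));
    }
}
```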

With the URI class, you get:

http://www.somedomain.com/Blue%20Widgets

Which is correct.

To construct a simple URL like the example above with the URI class:

URI myUri = new URI("http", "www.somedomain.com", "/Blue Widgets", null);

And to get the URL in an encoded format:

String url = myUri.toASCIIString();
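Put together, the whole thing might look like the sketch below. It assumes the four-argument URI(scheme, host, path, fragment) constructor with a null fragment. The class and method names are hypothetical, and wrapping URISyntaxException in an unchecked exception is just one way to handle it.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriEncodingDemo {

    // Builds a URI from its parts and returns it in encoded, ASCII-only form.
    static String encodeUrl(String scheme, String host, String path) {
        try {
            // The multi-argument constructors quote illegal characters,
            // such as the space in the path.
            URI uri = new URI(scheme, host, path, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Cannot build URI", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeUrl("http", "www.somedomain.com", "/Blue Widgets"));
    }
}
```

Note that the parts are passed in separately. Handing a complete unencoded string to the single-argument new URI(String) would throw a URISyntaxException because of the space.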


1 comment:

  1. This helped me, thanks. JavaScript's decodeURI() function will decode this URL fine if anyone else is calling a Node.js server.
