Discussion:
Spaces in mailto links
Russell May
2004-02-27 19:05:47 UTC
Here is an example of a format I have used for personalized mailto links since
year 2000, using my name and a fictitious address: <a href="mailto:Russell&nbsp;May&lt;***@nowhere.com&gt;"></a>
It uses character entity names for non-break space, less-than, greater-than
symbols.

Mouseover of the mailto link should not show any strange characters in the
browser status window. Clicking on the link should bring up a mail program with
the email address (including the person's name but with no strange characters)
in the "To:" window.

In year 2000 this format worked completely for nearly all browsers and mail
programs I tried then. (AOL 5.0 and Compuserve 4.0 were the exceptions.)

In year 2004 it still works properly for Netscape 7.0 and Internet Explorer 6.0
browsers used with Netscape Mail, Outlook, Outlook Express, Eudora Lite 3.05,
and Free Agent 1.93 mail programs. I have been told that it works properly with
a combination of the latest Mozilla browser and Mozilla Mail but I have not
tried that combination.

It does NOT work properly with Mozilla 1.6 or 1.7a or Netscape 7.1 browsers and
any of the mail programs that I have tried. Clicking on the link brings up the
mail program with an A with ^ over it (A circumflex, &Acirc;) before the space
in the "To:" window. The message is sent with the extra character in the header.
It looks bad.

The February 2000 version of Tidy HTML validator shows no errors, warnings, or
changes for this format in an HTML file. The February 2004 version of Tidy warns
of a malformed URI link; and changes the non-break space to %C2%A0, the
less-than symbol to %3C, and greater-than symbol to %3E. I presume that is what
Netscape 7.1 and Mozilla are doing. %C2%A0 causes the problem for all of the
mail programs I have tried recently. A space, %20, %80, &#x20; or &#x80; works
in all of the combinations I tried recently. They caused minor problems with
some browsers or mail programs I tried in 2000.

I cannot find anything which says that a non-break space should be converted to
%C2%A0 in a URI. The HTML 4.01 spec seems to say that anything in a URI outside
the range of %20 to %7F is non-compliant.

Can anyone point me toward something that says whether or why the conversion
from non-break space to %C2%A0 happens?
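
For anyone who wants to reproduce the symptom outside a browser, here is a minimal Python sketch. It assumes, without confirmation from any browser or mail program source, that the browser percent-encodes the no-break space as UTF-8 and that the mail program then reads the unescaped octets as ISO-8859-1. The variable names are made up for the illustration.

```python
from urllib.parse import quote, unquote_to_bytes

NBSP = "\u00a0"  # the character that &nbsp; and &#xA0; both denote

# Percent-encoding the no-break space via UTF-8 yields two escaped octets.
utf8_escaped = quote(NBSP, encoding="utf-8")

# Percent-encoding it via ISO-8859-1 yields a single escaped octet.
latin1_escaped = quote(NBSP, encoding="iso-8859-1")

# Assumption: the receiving program unescapes the UTF-8 octets but
# interprets them as ISO-8859-1, so it sees two characters instead of one.
misread = unquote_to_bytes(utf8_escaped).decode("iso-8859-1")

print(utf8_escaped, latin1_escaped, repr(misread))
```

Under that assumption the first misread character is A circumflex (U+00C2), which matches the stray character described above.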
Russell May
2004-02-27 19:15:03 UTC
On Fri, 27 Feb 2004 19:05:47 GMT, ***@ditmcoNotThis.com (Russell May) wrote:

Ooops,
&#xA0; misoperates like &nbsp;
%A0 works okay
Andreas Prilop
2004-02-27 19:24:03 UTC
Post by Russell May
Here is an example of a format I have used for personalized mailto links since
It uses character entity names for non-break space, less-than, greater-than
symbols.
Russell May<***@nowhere.com>
with char xA0 between "Russell" and "May" is an illegal address.
Correct are
Russell May <***@nowhere.com>
***@nowhere.com (Russell May)
with space between "Russell" and "May".

The second form can be written in HTML as
<a href="mailto:***@nowhere.com%20(Russell%20May)">
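
Both forms can also be generated rather than typed by hand. The following is only a sketch: the function names and the address russell@example.com are hypothetical, and leaving "@" and the parentheses unescaped is a choice made here to match the href above, not something required by this thread.

```python
from email.utils import formataddr
from urllib.parse import quote


def mailto_href_angle_form(display_name: str, address: str) -> str:
    """Build 'Name <address>' and percent-encode it for an href."""
    mailbox = formataddr((display_name, address))
    # Space becomes %20, '<' becomes %3C, '>' becomes %3E; '@' is kept.
    return "mailto:" + quote(mailbox, safe="@")


def mailto_href_comment_form(display_name: str, address: str) -> str:
    """Build 'address (Name)' and percent-encode only the spaces."""
    mailbox = f"{address} ({display_name})"
    return "mailto:" + quote(mailbox, safe="@()")


angle = mailto_href_angle_form("Russell May", "russell@example.com")
comment = mailto_href_comment_form("Russell May", "russell@example.com")
print(angle)
print(comment)
```

Because only an ordinary space (x20) is ever put between the names, neither result can contain %A0 or %C2%A0.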
--
Top-posting.
What's the most irritating thing on Usenet?
Russell May
2004-02-27 20:26:02 UTC
On Fri, 27 Feb 2004 20:24:03 +0100, Andreas Prilop
Post by Andreas Prilop
Post by Russell May
Here is an example of a format I have used for personalized mailto links since
It uses character entity names for non-break space, less-than, greater-than
symbols.
with char xA0 between "Russell" and "May" is an illegal address.
It does seem illegal because it is outside of the %20 to %7F range,
but %A0 works with Netscape 7.1 and Outlook.
Post by Andreas Prilop
Correct are
with space between "Russell" and "May".
I presume you are referring to what appears in the mail program "To:" window.
Post by Andreas Prilop
The second form can be written in HTML as
It worked for me today using Netscape 7.1 and Outlook,
but so does my original format if I use %20 or %A0 instead of &nbsp;

What I am most interested in is the last sentence of my original post:
Can anyone point me toward something that says whether or why the conversion
from non-break space to %C2%A0 happens?
Andreas Prilop
2004-02-27 20:42:01 UTC
Post by Russell May
It worked for me today using Netscape 7.1 and Outlook,
but so does my original format if I use %20 or %A0 instead of &nbsp;
"%A0" represents character xA0 and this character would be illegal in
an e-mail address.
Post by Russell May
Can anyone point me toward something that says whether or why the conversion
from non-break space to %C2%A0 happens?
The no-break space is xC2A0 in UTF-8.
<http://www.w3.org/TR/html4/appendix/notes.html#non-ascii-chars>
Again: You cannot have xA0 in an e-mail address.

BTW: A more appropriate group is <news:comp.mail.mime> .
--
Top-posting.
What's the most irritating thing on Usenet?
Russell May
2004-02-27 21:15:29 UTC
On Fri, 27 Feb 2004 21:42:01 +0100, Andreas Prilop
Post by Andreas Prilop
Post by Russell May
It worked for me today using Netscape 7.1 and Outlook,
but so does my original format if I use %20 or %A0 instead of &nbsp;
"%A0" represents character xA0 and this character would be illegal in
an e-mail address.
That is what I thought too. But any of the mail programs
that I tried recently used it okay. Sometimes things work
in practice even if they should not work in theory.

Is there anything theoretically wrong with this format?
<a href="mailto:Russell%20May%***@nowhere.com%3E">Russ May</a>

I could change my mailto links to the format that you suggest,
but changing to this one would be simpler. Just search-and-replace.
Post by Andreas Prilop
Post by Russell May
Can anyone point me toward something that says whether or why the conversion
from non-break space to %C2%A0 happens?
The no-break space is xC2A0 in UTF-8.
<http://www.w3.org/TR/html4/appendix/notes.html#non-ascii-chars>
Again: You cannot have xA0 in an e-mail address.
That almost got me to the right spot. I had looked in
http://www.ietf.org/rfc/rfc2279.txt previously but I had
not recognized the following as the source of the conversion:
UCS-4 range (hex.)     UTF-8 octet sequence (binary)
0000 0000-0000 007F    0xxxxxxx
0000 0080-0000 07FF    110xxxxx 10xxxxxx
:
:
partly because I had not realized for a while that the
character entity number for &nbsp; is actually &#xA0;.
I was especially confused by the fact that %C2 and %A0
individually appear to be invalid.
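
The second row of that table can be worked through by hand for xA0. This is a small sketch of the two-octet case only; the function name is made up for the illustration.

```python
def utf8_two_octets(code_point: int) -> bytes:
    """Encode a code point in the range 0x80-0x7FF as 110xxxxx 10xxxxxx."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-octet range is handled here")
    first = 0b11000000 | (code_point >> 6)           # upper five bits
    second = 0b10000000 | (code_point & 0b00111111)  # lower six bits
    return bytes([first, second])


encoded = utf8_two_octets(0xA0)
print(encoded.hex())
```

The two octets only make sense as a pair: the first announces a two-octet sequence and the second is a continuation octet that happens to have the same value as the original code point.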
Post by Andreas Prilop
BTW: A more appropriate group is <news:comp.mail.mime> .
Maybe true, but I came across this one first.

Thanks for your help.
Andreas Prilop
2004-02-27 21:23:35 UTC
Post by Russell May
Is there anything theoretically wrong with this format?
Another "%20" is missing:
<a href="mailto:Russell%20May%20%***@nowhere.com%3E">
--
Top-posting.
What's the most irritating thing on Usenet?
Russell May
2004-02-27 21:35:16 UTC
On Fri, 27 Feb 2004 22:23:35 +0100, Andreas Prilop
Post by Russell May
Is there anything theoretically wrong with this format?
I can do that :)
(although I am not sure why the extra character is needed)

I omitted the space (%20) before the less-than symbol because
some browsers or mail programs in 2000 inserted a space there,
producing two contiguous spaces. I don't know whether this happens
with current browsers and mail programs, but it is a minor problem
at worst.

Thanks again.
Andreas Prilop
2004-02-27 21:39:02 UTC
Post by Russell May
I omitted the space (%20) before the less-than symbol because
some browsers or mail programs in 2000 inserted a space there,
producing two contiguous spaces.
No problem with that. You could write
Russell May <***@nowhere.com>
with even more spaces.
--
Top-posting.
What's the most irritating thing on Usenet?
SpaceGirl
2004-02-27 23:29:22 UTC
Post by Andreas Prilop
Post by Russell May
I omitted the space (%20) before the less-than symbol because
some browsers or mail programs in 2000 inserted a space there,
producing two contiguous spaces.
No problem with that. You could write
with even more spaces.
--
Top-posting.
What's the most irritating thing on Usenet?
My worry with all of this is... would it work even if you managed to get the
damn thing to send? I'm fairly sure most mail programs would think it was an
invalid address. Even if mail READING software doesn't dump the mail as an
invalid address, what about mail routers? I'm fairly sure they would choke
too. Exchange (for example) doesn't seem to like spaces in addresses... it
just won't accept them. Not sure about other mail server software.
Toby A Inkster
2004-02-28 10:54:50 UTC
Post by SpaceGirl
Post by Andreas Prilop
No problem with that. You could write
with even more spaces.
My worry with all of this is... would it work even if you managed to get the
damn thing to send? I'm fairly sure most mail programs would think it was an
invalid address. Even if mail READING software doesn't dump the mail as an
invalid address, what about mail routers? I'm fairly sure they would choke
too. Exchange (for example) doesn't seem to like spaces in addresses... it
just won't accept them. Not sure about other mail server software.
The servers don't come into it. Mail clients don't[1] include the Real
Name part of the 'To:' field in the SMTP envelope, so mail servers don't
even see it.

---
[1] Well, there's always a chance that some stupid mail client might, but
it would be so severely borked that no-one would use it.
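
The split between the Real Name and the address can be illustrated with Python's standard library. This is only a sketch of the idea, using a hypothetical address; it does not claim to show how any particular mail client builds its envelope.

```python
from email.utils import parseaddr

# Hypothetical 'To:' header value as a mail client might hold it.
to_header = "Russell May <russell@example.com>"

# parseaddr separates the display name from the bare address.
real_name, envelope_recipient = parseaddr(to_header)

# Only the bare address would be handed to the server as the recipient.
print(repr(real_name), repr(envelope_recipient))
```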
--
Toby A Inkster BSc (Hons) ARCS
Contact Me - http://www.goddamn.co.uk/tobyink/?page=132
SpaceGirl
2004-02-27 19:34:12 UTC
Post by Russell May
Here is an example of a format I have used for personalized mailto links since
It uses character entity names for non-break space, less-than, greater-than
symbols.
<snip>
You can't have spaces in email addresses. "Miranda ***@somewhere.net" is
not a valid address, but "***@somewhere.net" is.
Russell May
2004-02-27 20:43:22 UTC
Post by Russell May
Here is an example of a format I have used for personalized mailto links since
It uses character entity names for non-break space, less-than, greater-than
symbols.
<snip>
I cannot find anything which says that a non-break space should be converted to
%C2%A0 in a URI. The HTML 4.01 spec seems to say that anything in a URI outside
the range of %20 to %7F is non-compliant.
Can anyone point me toward something that says whether or why the conversion
from non-break space to %C2%A0 happens?
I agree in principle, even though spaces can actually be used.
I have tried it with Netscape 7.0 and 7.1, Mozilla 1.6 and 1.7a;
and in years 2000-2003 with Netscape 3.01, 4.0x and 4.7x.

But notice: There are no spaces in my mailto address format.
A character entity name &nbsp; is used instead.
A character entity number has the same effect.
Andreas Prilop
2004-02-27 20:59:12 UTC
Post by Russell May
But notice: There are no spaces in my mailto address format.
A character entity name &nbsp; is used instead.
This is a misunderstanding!
If you write *in your HTML source*
<a href="mailto:***@nowhere.com&nbsp;(Russell&nbsp;May)">
then the mailto URL would be
mailto:***@nowhere.com (Russell May)
                      ^        ^
with character xA0 at the indicated positions (assuming
charset=ISO-8859-1).

See? Your HTML source has no spaces, but your mailto address has.
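
The same point can be checked mechanically. In this sketch the href value and the address user@example.com are hypothetical; html.unescape stands in for the entity decoding an HTML parser applies to attribute values.

```python
import html

# What is typed in the HTML source: no literal space anywhere.
href_in_source = "mailto:user@example.com&nbsp;(Russell&nbsp;May)"

# What the attribute value is once the entities have been resolved.
href_value = html.unescape(href_in_source)

print(repr(href_value))
```

The resulting string contains the character xA0 twice and no ordinary space (x20), which is exactly the character that is not allowed in the address.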
--
Top-posting.
What's the most irritating thing on Usenet?