Manfred Stienstra
manfred.stienstra at dwerg.net
Wed Nov 26 08:30:52 EST 2003
On Wed, 2003-11-26 at 03:56, Gregory (Grisha) Trubetskoy wrote:
> Interesting question. I don't know the answer. Is content-type really
> supposed to accept unicode? I thought all HTTP headers are ASCII only (but
> I may be wrong). If anyone knows and has RFC references, etc - please
> pitch in.
RFC 1945 (HTTP/1.0) states:

    HTTP-header   = field-name ":" [ field-value ] CRLF
    field-name    = token
    field-value   = *( field-content | LWS )
    field-content = <the OCTETs making up the field-value
                    and consisting of either *TEXT or combinations
                    of token, tspecials, and quoted-string>

    TEXT          = <any OCTET except CTLs,
                    but including LWS>
    OCTET         = <any 8-bit sequence of data>
    CHAR          = <any US-ASCII character (octets 0 - 127)>

    Content-Type  = "Content-Type" ":" media-type
    media-type    = type "/" subtype *( ";" parameter )
    type          = token
    subtype       = token
    parameter     = attribute "=" value
    attribute     = token
    value         = token | quoted-string

    token         = 1*<any CHAR except CTLs or tspecials>
    tspecials     = "(" | ")" | "<" | ">" | "@"
                  | "," | ";" | ":" | "\" | <">
                  | "/" | "[" | "]" | "?" | "="
                  | "{" | "}" | SP | HT

This means the media-type may contain only US-ASCII characters, while
field-content in general may contain any octet except control characters.
For more information:
http://www.faqs.org/rfcs/rfc1945.html
http://www.faqs.org/rfcs/rfc2616.html
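
To make the difference concrete, here is a rough Python sketch of the token and TEXT rules quoted from RFC 1945. The function names (is_token, is_bare_media_type, is_text) are made up for illustration and come from neither the RFC nor any library. It makes two simplifying assumptions: parameters and quoted-strings are not parsed, and header line folding (CRLF followed by SP or HT) is not handled.

```python
# tspecials as listed in the RFC 1945 excerpt (SP and HT included).
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(s):
    """token = 1*<any CHAR except CTLs or tspecials>.

    CHAR is octets 0-127; CTLs are octets 0-31 and DEL (127),
    so only code points 32-126 minus tspecials are allowed.
    """
    return len(s) > 0 and all(
        32 <= ord(c) < 127 and c not in TSPECIALS for c in s
    )


def is_bare_media_type(s):
    """Check type "/" subtype only (hypothetical helper).

    Parameters such as "; charset=..." are deliberately not parsed,
    so a value carrying parameters is rejected here.
    """
    type_, sep, subtype = s.partition("/")
    return sep == "/" and is_token(type_) and is_token(subtype)


def is_text(data):
    """TEXT = <any OCTET except CTLs, but including LWS>.

    Takes bytes. HT (9) is accepted as part of LWS; folded lines
    (CRLF followed by SP or HT) are not handled in this sketch.
    """
    return all(b == 9 or (b >= 32 and b != 127) for b in data)
```

With these, is_bare_media_type("text/html") passes, a type containing a non-ASCII character fails, and is_text() accepts a Latin-1 encoded octet such as 0xE9, which matches the reading above: US-ASCII only in the media-type, 8-bit octets allowed in field-content generally.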
Manfred