---
c: Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
Title: curl_url_set
Section: 3
Source: libcurl
See-also:
  - CURLOPT_CURLU (3)
  - curl_url (3)
  - curl_url_cleanup (3)
  - curl_url_dup (3)
  - curl_url_get (3)
  - curl_url_strerror (3)
Protocol:
  - All
17---
18
19# NAME
20
21curl_url_set - set a URL part
22
23# SYNOPSIS
24
25~~~c
26#include <curl/curl.h>
27
28CURLUcode curl_url_set(CURLU *url,
29                       CURLUPart part,
30                       const char *content,
31                       unsigned int flags);
32~~~
33
# DESCRIPTION

The *url* handle to work on, passed in as the first argument, must be a
handle previously created by curl_url(3) or curl_url_dup(3).

This function sets or updates individual URL components, or parts, held by the
URL object the handle identifies.

The *part* argument should identify the particular URL part (see list below)
to set or change, with *content* pointing to a null-terminated string with the
new contents for that URL part. The contents should be in the form and
encoding they would use in a URL: URL encoded.

Setting a part that is already set in the URL object replaces the previously
stored data for that part with the new *content*.

The caller does not have to keep *content* around after a successful call
as this function copies the content.

Setting a part to a NULL pointer removes that part's contents from the *CURLU*
handle.

This function has an 8 MB maximum length limit for all provided input strings.
In the real world, excessively long fields in URLs cause problems even if this
function accepts them.

When setting or updating contents of individual URL parts, curl_url_set(3)
might accept data that could not end up in that part as the result of a full
URL parse. Beware: extracting a full URL from such components later on might
produce an invalid URL.

The *flags* argument is a bitmask with independent features.

# PARTS

## CURLUPART_URL

Allows the full URL of the handle to be replaced. If the handle already is
populated with a URL, the new URL can be relative to the previous.

When successfully setting a new URL, relative or absolute, the handle contents
are replaced with the components of the newly set URL.

Pass a pointer to a null-terminated string in the *content* parameter. The
string must be a correctly formatted "RFC 3986+" URL, or the pointer must be
NULL. The URL parser only understands and parses the subset of URLs that are
"hierarchical" and therefore contain a :// separator - not the ones that are
normally specified with only a colon separator.

By default this API only parses URLs using schemes for protocols that are
supported built-in. To make libcurl parse URLs generically even for schemes it
does not know about, the **CURLU_NON_SUPPORT_SCHEME** flags bit must be set.
Otherwise, this function returns *CURLUE_UNSUPPORTED_SCHEME* for URL schemes
it does not recognize.

Unless *CURLU_NO_AUTHORITY* is set, a blank hostname is not allowed in
the URL.

When a full URL is set (parsed), the hostname component is stored URL decoded.

## CURLUPART_SCHEME

The scheme cannot be URL encoded on set. libcurl only accepts setting schemes
up to 40 bytes long.

## CURLUPART_USER

If only the user part is set and not the password, the URL is represented with
a blank password.

## CURLUPART_PASSWORD

If only the password part is set and not the user, the URL is represented with
a blank user.

## CURLUPART_OPTIONS

The options field is an optional field that might follow the password in the
userinfo part. It is only recognized/used when parsing URLs for the following
schemes: pop3, smtp and imap. This function however allows users to
independently set this field.

## CURLUPART_HOST

The hostname. If it is an International Domain Name (IDN), the string must be
encoded as your locale says or UTF-8 (when WinIDN is used). If it is a
bracketed IPv6 numeric address it may contain a zone id (or you can use
*CURLUPART_ZONEID*).

Note that if you set an IPv6 address, it gets ruined and causes an error if
you also set the *CURLU_URLENCODE* flag.

Unless *CURLU_NO_AUTHORITY* is set, setting a blank hostname is not allowed.

## CURLUPART_ZONEID

If the hostname is a numeric IPv6 address, this field can also be set.

## CURLUPART_PORT

The port number cannot be URL encoded on set. The given port number is
provided as a string and the decimal number in it must be between 0 and 65535.
Anything else returns an error.

## CURLUPART_PATH

If a path is set in the URL without a leading slash, a slash is prepended
automatically.

## CURLUPART_QUERY

The query part gets spaces converted to pluses when asked to URL encode on set
with the *CURLU_URLENCODE* bit.

If used together with the *CURLU_APPENDQUERY* bit, the provided part is
appended to the end of the existing query.

The question mark in the URL is not part of the actual query contents.

## CURLUPART_FRAGMENT

The hash sign in the URL is not part of the actual fragment contents.

# FLAGS

The flags argument is zero, one or more bits set in a bitmask.

## CURLU_APPENDQUERY

Can be used when setting the *CURLUPART_QUERY* component. The provided new
part is then appended at the end of the existing query - and if the previous
part did not end with an ampersand (&), an ampersand gets inserted before the
new appended part.

When *CURLU_APPENDQUERY* is used together with *CURLU_URLENCODE*, the
first '=' symbol is not URL encoded.

## CURLU_NON_SUPPORT_SCHEME

If set, allows curl_url_set(3) to set a non-supported scheme. libcurl then
cannot know whether the provided scheme is a valid one or not.

## CURLU_URLENCODE

When set, curl_url_set(3) URL encodes the part on entry, except for
**scheme**, **port** and **URL**.

When setting the path component with URL encoding enabled, the slash character
is left as-is and not encoded.

The query part gets space-to-plus converted before the URL conversion is
applied.

This URL encoding is charset unaware and converts the input in a byte-by-byte
manner.

## CURLU_DEFAULT_SCHEME

If set, allows the URL to be set without a scheme and then sets that to the
default scheme: HTTPS. Overrides the *CURLU_GUESS_SCHEME* option if both are
set.

## CURLU_GUESS_SCHEME

If set, allows the URL to be set without a scheme and it instead "guesses"
which scheme was intended based on the hostname. If the outermost
subdomain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then that scheme is
used, otherwise it picks HTTP. Conflicts with the *CURLU_DEFAULT_SCHEME*
option which takes precedence if both are set.

If guessing is not allowed and there is no default scheme set, trying to parse
a URL without a scheme returns an error.

If the scheme ends up set as a result of guessing, i.e. it is not actually
present in the parsed URL, it can later be figured out by using the
**CURLU_NO_GUESS_SCHEME** flag when subsequently getting the URL or the scheme
with curl_url_get(3).

## CURLU_NO_AUTHORITY

If set, skips authority checks. The RFC allows individual schemes to omit the
host part (normally the only mandatory part of the authority), but libcurl
cannot know whether this is permitted for custom schemes. Specifying the flag
permits empty authority sections, similar to how the file scheme is handled.

## CURLU_PATH_AS_IS

When set for **CURLUPART_URL**, this skips the normalization of the
path. That is the procedure where libcurl otherwise removes sequences of
dot-slash and dot-dot etc. The same option used for transfers is called
CURLOPT_PATH_AS_IS(3).

## CURLU_ALLOW_SPACE

If set, the URL parser allows space (ASCII 32) where possible. The URL syntax
normally does not allow spaces anywhere; they should instead be encoded as %20
or '+'. Even when spaces are allowed, they are still not allowed in the scheme.
When space is used and allowed in a URL, it is stored as-is unless
*CURLU_URLENCODE* is also set, which then makes libcurl URL encode the
space before it is stored. This affects how the URL is constructed when
curl_url_get(3) is subsequently used to extract the full URL or
individual parts. (Added in 7.78.0)

## CURLU_DISALLOW_USER

If set, the URL parser does not accept embedded credentials for the
**CURLUPART_URL**, and instead returns **CURLUE_USER_NOT_ALLOWED** for
such URLs.

# %PROTOCOLS%

# EXAMPLE

~~~c
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  if(!url)
    return 1; /* out of memory */
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    /* change it to an FTP URL */
    rc = curl_url_set(url, CURLUPART_SCHEME, "ftp", 0);
  }
  curl_url_cleanup(url);
  return (int)rc;
}
~~~

# %AVAILABILITY%

# RETURN VALUE

Returns a *CURLUcode* error value, which is CURLUE_OK (0) if everything
went fine. See the libcurl-errors(3) man page for the full list with
descriptions.

The input string passed to curl_url_set(3) must be shorter than eight
million bytes. Otherwise this function returns **CURLUE_MALFORMED_INPUT**.

If this function returns an error, no URL part is set.
