Fossil

Check-in [e514eeea]
Login

Check-in [e514eeea]

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Back out check-in [5bb921dd0893a548] which was wrong - the REQUEST_URI CGI parameter should include the query string. Improve the CGI variable documentation in comments. Improve robustness to malformed CGI variables.
Downloads: Tarball | ZIP archive | SQL archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA3-256: e514eeea8f8598a899a6c0e1f76fd9bb9c736f96fc1165e0dedf1fdb37f5a3e6
User & Date: drh 2022-02-13 19:16:15
Context
2022-02-15
21:46
Return a 404-Not Found error on any attempt to access a "draft" skin that is not defined. ... (check-in: de320cc3 user: drh tags: trunk)
2022-02-13
19:16
Back out check-in [5bb921dd0893a548] which was wrong - the REQUEST_URI CGI parameter should include the query string. Improve the CGI variable documentation in comments. Improve robustness to malformed CGI variables. ... (check-in: e514eeea user: drh tags: trunk)
19:14
Improved robustness in CGI variable parsing. ... (Closed-Leaf check-in: b8973500 user: drh tags: cgi-compliance)
2022-02-12
20:53
Update the defense-against-robots documentation to align with current behavior. ... (check-in: c9082b29 user: drh tags: trunk)
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to src/cgi.c.

1172
1173
1174
1175
1176
1177
1178




1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204


1205
1206
1207
1208
1209
1210
1211
** the QUERY_STRING environment variable (if it exists), from standard
** input if there is POST data, and from HTTP_COOKIE.
**
** REQUEST_URI, PATH_INFO, and SCRIPT_NAME are related as follows:
**
**      REQUEST_URI == SCRIPT_NAME + PATH_INFO
**




** Where "+" means concatenate.  Fossil requires SCRIPT_NAME.  If
** REQUEST_URI is provided but PATH_INFO is not, then PATH_INFO is
** computed from REQUEST_URI and SCRIPT_NAME.  If PATH_INFO is provided
** but REQUEST_URI is not, then compute REQUEST_URI from PATH_INFO and
** SCRIPT_NAME.  If neither REQUEST_URI nor PATH_INFO are provided, then
** assume that PATH_INFO is an empty string and set REQUEST_URI equal
** to PATH_INFO.
**
** Sometimes PATH_INFO is missing and SCRIPT_NAME is not a prefix of
** REQUEST_URI.  (See https://fossil-scm.org/forum/forumpost/049e8650ed)
** In that case, truncate SCRIPT_NAME so that it is a proper prefix
** of REQUEST_URI.
**
** SCGI typically omits PATH_INFO.  CGI sometimes omits REQUEST_URI and
** PATH_INFO when it is empty.
**
** CGI Parameter quick reference:
**
**                                      REQUEST_URI
**                               _____________|____________
**                              /                          \
**    https://www.fossil-scm.org/forum/info/12736b30c072551a?t=c
**            \________________/\____/\____________________/ \_/
**                    |            |             |            |
**               HTTP_HOST         |        PATH_INFO     QUERY_STRING
**                            SCRIPT_NAME


*/
void cgi_init(void){
  char *z;
  const char *zType;
  char *zSemi;
  int len;
  const char *zRequestUri = cgi_parameter("REQUEST_URI",0);







>
>
>
>


















|
|
|
|
|
|
|
|
>
>







1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
** the QUERY_STRING environment variable (if it exists), from standard
** input if there is POST data, and from HTTP_COOKIE.
**
** REQUEST_URI, PATH_INFO, and SCRIPT_NAME are related as follows:
**
**      REQUEST_URI == SCRIPT_NAME + PATH_INFO
**
** Or if QUERY_STRING is not empty:
**
**      REQUEST_URI == SCRIPT_NAME + PATH_INFO + '?' + QUERY_STRING
**
** Where "+" means concatenate.  Fossil requires SCRIPT_NAME.  If
** REQUEST_URI is provided but PATH_INFO is not, then PATH_INFO is
** computed from REQUEST_URI and SCRIPT_NAME.  If PATH_INFO is provided
** but REQUEST_URI is not, then compute REQUEST_URI from PATH_INFO and
** SCRIPT_NAME.  If neither REQUEST_URI nor PATH_INFO are provided, then
** assume that PATH_INFO is an empty string and set REQUEST_URI equal
** to PATH_INFO.
**
** Sometimes PATH_INFO is missing and SCRIPT_NAME is not a prefix of
** REQUEST_URI.  (See https://fossil-scm.org/forum/forumpost/049e8650ed)
** In that case, truncate SCRIPT_NAME so that it is a proper prefix
** of REQUEST_URI.
**
** SCGI typically omits PATH_INFO.  CGI sometimes omits REQUEST_URI and
** PATH_INFO when it is empty.
**
** CGI Parameter quick reference:
**
**                                   REQUEST_URI
**                           _____________|________________
**                          /                              \
**    https://fossil-scm.org/forum/info/12736b30c072551a?t=c
**    \___/   \____________/\____/\____________________/ \_/
**      |           |          |             |            |
**      |       HTTP_HOST      |        PATH_INFO     QUERY_STRING
**      |                      |
**    REQUEST_SCHEMA         SCRIPT_NAME
**               
*/
void cgi_init(void){
  char *z;
  const char *zType;
  char *zSemi;
  int len;
  const char *zRequestUri = cgi_parameter("REQUEST_URI",0);
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255

1256
1257
1258
1259



1260

1261
1262
1263
1264
1265
1266
1267
#endif
  g.isHTTP = 1;
  cgi_destination(CGI_BODY);

  /* We must have SCRIPT_NAME. If the web server did not supply it, try
  ** to compute it from REQUEST_URI and PATH_INFO. */
  if( zScriptName==0 ){
    size_t nRU, nPI;
    if( zRequestUri==0 || zPathInfo==0 ){
      malformed_request("missing SCRIPT_NAME");  /* Does not return */
    }
    nRU = strlen(zRequestUri);
    nPI = strlen(zPathInfo);
    if( nRU<nPI ){
      malformed_request("PATH_INFO is longer than REQUEST_URI");
    }
    zScriptName = fossil_strndup(zRequestUri,(int)(nRU-nPI));
    cgi_set_parameter("SCRIPT_NAME", zScriptName);
  }

#ifdef _WIN32
  /* The Microsoft IIS web server does not define REQUEST_URI, instead it uses
  ** PATH_INFO for virtually the same purpose.  Define REQUEST_URI the same as
  ** PATH_INFO and redefine PATH_INFO with SCRIPT_NAME removed from the 
  ** beginning. */
  if( zServerSoftware && strstr(zServerSoftware, "Microsoft-IIS") ){
    int i, j;
    cgi_set_parameter("REQUEST_URI", zPathInfo);
    for(i=0; zPathInfo[i]==zScriptName[i] && zPathInfo[i]; i++){}
    for(j=i; zPathInfo[j] && zPathInfo[j]!='?'; j++){}
    zPathInfo = fossil_strndup(zPathInfo+i, j-i);
    cgi_replace_parameter("PATH_INFO", zPathInfo);
  }
#endif
  if( zRequestUri==0 ){
    const char *z = zPathInfo;

    if( zPathInfo==0 ){
      malformed_request("missing PATH_INFO and/or REQUEST_URI");
    }
    if( z[0]=='/' ) z++;



    zRequestUri = mprintf("%s/%s", zScriptName, z);

    cgi_set_parameter("REQUEST_URI", zRequestUri);
  }
  if( zPathInfo==0 ){
    int i, j;
    for(i=0; zRequestUri[i]==zScriptName[i] && zRequestUri[i]; i++){}
    for(j=i; zRequestUri[j] && zRequestUri[j]!='?'; j++){}
    zPathInfo = fossil_strndup(zRequestUri+i, j-i);







<



|
<
|
|

|



















>




>
>
>
|
>







1226
1227
1228
1229
1230
1231
1232

1233
1234
1235
1236

1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
#endif
  g.isHTTP = 1;
  cgi_destination(CGI_BODY);

  /* We must have SCRIPT_NAME. If the web server did not supply it, try
  ** to compute it from REQUEST_URI and PATH_INFO. */
  if( zScriptName==0 ){

    if( zRequestUri==0 || zPathInfo==0 ){
      malformed_request("missing SCRIPT_NAME");  /* Does not return */
    }
    z = strstr(zRequestUri,zPathInfo);

    if( z==0 ){
      malformed_request("PATH_INFO not found in REQUEST_URI");
    }
    zScriptName = fossil_strndup(zRequestUri,(int)(z-zRequestUri));
    cgi_set_parameter("SCRIPT_NAME", zScriptName);
  }

#ifdef _WIN32
  /* The Microsoft IIS web server does not define REQUEST_URI, instead it uses
  ** PATH_INFO for virtually the same purpose.  Define REQUEST_URI the same as
  ** PATH_INFO and redefine PATH_INFO with SCRIPT_NAME removed from the 
  ** beginning. */
  if( zServerSoftware && strstr(zServerSoftware, "Microsoft-IIS") ){
    int i, j;
    cgi_set_parameter("REQUEST_URI", zPathInfo);
    for(i=0; zPathInfo[i]==zScriptName[i] && zPathInfo[i]; i++){}
    for(j=i; zPathInfo[j] && zPathInfo[j]!='?'; j++){}
    zPathInfo = fossil_strndup(zPathInfo+i, j-i);
    cgi_replace_parameter("PATH_INFO", zPathInfo);
  }
#endif
  if( zRequestUri==0 ){
    const char *z = zPathInfo;
    const char *zQS = cgi_parameter("QUERY_STRING",0);
    if( zPathInfo==0 ){
      malformed_request("missing PATH_INFO and/or REQUEST_URI");
    }
    if( z[0]=='/' ) z++;
    if( zQS && zQS[0] ){
      zRequestUri = mprintf("%s/%s?%s", zScriptName, z, zQS);
    }else{
      zRequestUri = mprintf("%s/%s", zScriptName, z);
    }
    cgi_set_parameter("REQUEST_URI", zRequestUri);
  }
  if( zPathInfo==0 ){
    int i, j;
    for(i=0; zRequestUri[i]==zScriptName[i] && zRequestUri[i]; i++){}
    for(j=i; zRequestUri[j] && zRequestUri[j]!='?'; j++){}
    zPathInfo = fossil_strndup(zRequestUri+i, j-i);
1864
1865
1866
1867
1868
1869
1870

1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
  }
  cgi_setenv("GATEWAY_INTERFACE","CGI/1.0");
  cgi_setenv("REQUEST_METHOD",zToken);
  zToken = extract_token(z, &z);
  if( zToken==0 ){
    malformed_request("malformed URL in HTTP header");
  }

  cgi_setenv("SCRIPT_NAME", "");
  for(i=0; zToken[i] && zToken[i]!='?'; i++){}
  if( zToken[i] ) zToken[i++] = 0;
  cgi_setenv("REQUEST_URI", zToken);
  cgi_setenv("PATH_INFO", zToken);
  cgi_setenv("QUERY_STRING", &zToken[i]);
  if( zIpAddr==0 ){
    zIpAddr = cgi_remote_ip(fileno(g.httpIn));
  }
  if( zIpAddr ){
    cgi_setenv("REMOTE_ADDR", zIpAddr);







>



<







1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883

1884
1885
1886
1887
1888
1889
1890
  }
  cgi_setenv("GATEWAY_INTERFACE","CGI/1.0");
  cgi_setenv("REQUEST_METHOD",zToken);
  zToken = extract_token(z, &z);
  if( zToken==0 ){
    malformed_request("malformed URL in HTTP header");
  }
  cgi_setenv("REQUEST_URI", zToken);
  cgi_setenv("SCRIPT_NAME", "");
  for(i=0; zToken[i] && zToken[i]!='?'; i++){}
  if( zToken[i] ) zToken[i++] = 0;

  cgi_setenv("PATH_INFO", zToken);
  cgi_setenv("QUERY_STRING", &zToken[i]);
  if( zIpAddr==0 ){
    zIpAddr = cgi_remote_ip(fileno(g.httpIn));
  }
  if( zIpAddr ){
    cgi_setenv("REMOTE_ADDR", zIpAddr);