The first ID_Encoding test caught me by surprise, since it does not appear to match the RFC:
  // /foo/bar?baz=http://foo.bar stays unencoded.
  {
    const absl::string_view robotstxt =
        "User-agent: FooBot\n"
        "Disallow: /\n"
        "Allow: /foo/bar?qux=taz&baz=http://foo.bar?tar&par\n";
    EXPECT_TRUE(IsUserAgentAllowed(
        robotstxt, "FooBot",
        "http://foo.bar/foo/bar?qux=taz&baz=http://foo.bar?tar&par"));
  }
However, section 2.2.2 of the REP RFC seems to indicate that /foo/bar?baz=http://foo.bar should be encoded as /foo/bar?baz=http%3A%2F%2Ffoo.bar.
I can't decide whether I'm misreading the RFC or whether the test intentionally deviates from the RFC in this case.
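To make the encoding I expected concrete, here is a minimal sketch of percent-encoding a query value the way I read that section. `PercentEncodeComponent` is a hypothetical helper written for this question, not a function from the robots.txt library, and encoding every byte outside the RFC 3986 "unreserved" set is my assumption about what the section requires.

```cpp
#include <string>

// Hypothetical helper (not part of the robots.txt library): percent-encodes
// every byte that is not an RFC 3986 "unreserved" character
// (ALPHA / DIGIT / "-" / "." / "_" / "~").
std::string PercentEncodeComponent(const std::string& in) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                            (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                            c == '_' || c == '~';
    if (unreserved) {
      out += static_cast<char>(c);
    } else {
      // Emit the byte as %XX using uppercase hex digits.
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}
```

With this helper, `"/foo/bar?baz=" + PercentEncodeComponent("http://foo.bar")` yields `/foo/bar?baz=http%3A%2F%2Ffoo.bar`, which is the form I expected the matcher to compare against.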
Thanks!