Conversation
```python
rsp = session.get(url, allow_redirects=follow_redirects)

if rsp.status_code // 100 == 3:
    continue
```
I think we should still yield a response, even if it is a redirect. For instance, a user of the library might want to know about all requests that are redirected. We should also extract any URLs from the Location header of a 3xx response.
Please could you:
- remove these two lines, so that we continue to yield all responses;
- add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.
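For illustration, something along these lines (just a sketch: the loop, queue, and function name here are made up for the example, not the library's actual code):

```python
from urllib.parse import urljoin

import requests

def crawl(start_url, follow_redirects=False):
    """Sketch: yield every response, including redirects, and queue Location targets."""
    session = requests.Session()
    to_visit = [start_url]
    seen = set()

    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)

        rsp = session.get(url, allow_redirects=follow_redirects)
        yield rsp  # yield redirect responses too, not just final ones

        if rsp.status_code // 100 == 3:
            # Try to extract the redirect target from the Location header
            location = rsp.headers.get("Location")
            if location:
                to_visit.append(urljoin(url, location))
```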
Does that make sense?
Once you've done that, you should also add yourself to AUTHORS.txt
> add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.
@inglesp By doing this, wouldn't we essentially be following redirects? I am confused.
If we are proceeding in this manner, I think it would be a better approach to also scrape the body of the 3xx HTTP response, in addition to adding the URL from the Location header.
Thinking about this a bit more, follow_redirects is not the right name for the new argument, since setting it to False suggests that 3xx responses shouldn't be followed at all.
Can you think of a better name for it?
@inglesp How about pause_at_redirects or scrap_redirects?
Any thoughts on this?
> I think it would be a better approach to also scrape the body of the 3xx HTTP response, in addition to adding the URL from the Location header.
Hi @rkrp. I'm so sorry it's taken me so long to reply to you.
What about auto_follow_redirects?
And yes, we should also scrape the body of the 3xx response.
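Putting both together, something like this (again just a sketch; parse_links and the fetch wrapper are placeholders I've made up here, not the real implementation):

```python
import re
from urllib.parse import urljoin

import requests

def parse_links(html, base_url):
    """Crude placeholder for the library's real link extraction."""
    return [urljoin(base_url, href) for href in re.findall(r'href="([^"]+)"', html)]

def fetch(session, url, auto_follow_redirects=False):
    """Sketch: fetch one URL, returning the response plus any URLs to visit next."""
    rsp = session.get(url, allow_redirects=auto_follow_redirects)

    # Scrape the body of every response, including 3xx responses
    new_urls = parse_links(rsp.text, base_url=url)

    if not auto_follow_redirects and rsp.status_code // 100 == 3:
        # Also queue the redirect target from the Location header, if present
        location = rsp.headers.get("Location")
        if location:
            new_urls.append(urljoin(url, location))

    return rsp, new_urls
```

The caller would then yield rsp as usual and add new_urls to its queue; when auto_follow_redirects is True, requests follows the redirect chain itself and only the final response is seen.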
Added a feature to choose whether or not to follow redirects.