-
-
Notifications
You must be signed in to change notification settings - Fork 35.5k
src: refactor EndsInANumber in node_url.cc and adds IsIPv4NumberValid #46227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
ff9c478
f076935
b1f4c83
95d3081
e9282ad
a683d34
d165e39
573dd4a
b52bbd4
3074383
de4f065
9058eb5
b65769f
9534455
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -407,29 +407,62 @@ int64_t ParseIPv4Number(const char* start, const char* end) { | |
| return strtoll(start, nullptr, R); | ||
| } | ||
|
|
||
| // https://url.spec.whatwg.org/#ipv4-number-parser | ||
| bool IsIPv4NumberValid(std::string_view input) { | ||
| if (input.empty()) { | ||
| return false; | ||
| } | ||
|
|
||
| if (input.size() >= 2 && input[0] == '0') { | ||
|
anonrig marked this conversation as resolved.
|
||
| if (input[1] == 'X' || input[1] == 'x') { | ||
| if (input.size() == 2) { | ||
| return true; | ||
| } | ||
|
|
||
| return input.find_first_not_of("0123456789abcdefABCDEF", 2) == | ||
|
anonrig marked this conversation as resolved.
Outdated
|
||
| std::string_view::npos; | ||
| } else { | ||
|
anonrig marked this conversation as resolved.
Outdated
|
||
| return input.find_first_not_of("01234567", 1) == std::string_view::npos; | ||
| } | ||
| } | ||
|
|
||
| return std::all_of(input.begin(), input.end(), ::isdigit); | ||
|
anonrig marked this conversation as resolved.
Outdated
|
||
| } | ||
|
|
||
| // https://url.spec.whatwg.org/#ends-in-a-number-checker | ||
| bool EndsInANumber(const std::string& input) { | ||
| std::vector<std::string> parts = SplitString(input, '.', false); | ||
| bool EndsInANumber(const std::string_view input) { | ||
|
anonrig marked this conversation as resolved.
Outdated
|
||
| if (input.empty()) { | ||
| return false; | ||
| } | ||
|
|
||
| char delimiter = '.'; | ||
| auto pointer_start = input.begin(); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done!
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Windows build is still failing : /
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would like to know if I can go with &(*it); again
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @addaleax any recommendations?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it'd look like this:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What i understand is that iterators don't need to be pointers although they could be implemented as one. And some implementations of the STL (MSVC++, Mingw) implement interators with special classes (with no pre-difined casts to pointer), so that we cannot use them directly in the string_view constructor that we are using, which needs a charT*. then we could use * which is overloaded to do what logically mean: convert the iterator type to its pointed-to object, then pass the address of this &(*it). https://en.cppreference.com/w/cpp/named_req/RandomAccessIterator |
||
| auto pointer_end = input.end(); | ||
|
|
||
| if (parts.empty()) return false; | ||
| uint8_t parts_size = std::count(pointer_start, pointer_end, delimiter); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why uint8_t?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because it's an IPv4, so there shouldn't be more parts than, lets say, 4. In short.. it would always be a small number (I think uint8_t still quite big for this)
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. uint8_t is correct. referencing spec:
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think that’s a good idea, yeah. Reading this code it’s really not obvious that anything in the code guarantees that the return value fits into an |
||
| ++parts_size; | ||
|
|
||
| if (parts.back().empty()) { | ||
| if (parts.size() == 1) return false; | ||
| parts.pop_back(); | ||
| if (input.back() == delimiter) { | ||
| --pointer_end; | ||
| --parts_size; | ||
| } | ||
|
|
||
| const std::string& last = parts.back(); | ||
| if (parts_size > 1) { | ||
| pointer_start += | ||
| input.rfind(delimiter, std::distance(pointer_start, pointer_end) - 1) + | ||
| 1; | ||
| } | ||
|
|
||
| // If last is non-empty and contains only ASCII digits, then return true | ||
| if (!last.empty() && std::all_of(last.begin(), last.end(), ::isdigit)) { | ||
| return true; | ||
| if (std::distance(pointer_start, pointer_end) == 0) { | ||
| return false; | ||
| } | ||
|
|
||
| const char* last_str = last.c_str(); | ||
| int64_t num = ParseIPv4Number(last_str, last_str + last.size()); | ||
| if (num >= 0) return true; | ||
| if (std::all_of(pointer_start, pointer_end, ::isdigit)) { | ||
| return true; | ||
| } | ||
|
|
||
| return false; | ||
| return IsIPv4NumberValid( | ||
| std::string_view(&*pointer_start, pointer_end - pointer_start)); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is too hacky (even if it's guaranteed safe).
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. removed!
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll try a
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @nodejs/cpp-reviewers any recommendations?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah: I would definitely recommeng to go back to the original "hacky" implementation. It works and looks correct, even if it is not visually pretty. (I'd also not call this
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I think it only works on the other compilers because their implementations for the iterator are: |
||
| } | ||
|
|
||
| void URLHost::ParseIPv4Host(const char* input, size_t length) { | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.