summaryrefslogtreecommitdiff
path: root/ext/socket
AgeCommit message (Collapse)Author
2024-03-03Drop support for old ERBNobuyoshi Nakada
2024-02-26Revise 9ec342e07df6aa5e2c2e9003517753a2f1b508fdNobuyoshi Nakada
2024-02-26[Bug #20296] Fix the default assertion messageNobuyoshi Nakada
2024-02-26Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp (#9374)Misaki Shioi
* Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp. [Background] Currently, `Socket.tcp` synchronously resolves names and makes connection attempts with `Addrinfo::foreach.` This implementation has the following two problems. 1. In name resolution, the program stops until the DNS server responds to all DNS queries. 2. In a connection attempt, while an IP address is trying to connect to the destination host and is taking time, the program stops, and other resolved IP addresses cannot try to connect. [Proposal] "Happy Eyeballs" ([RFC 8305](https://datatracker.ietf.org/doc/html/rfc8305)) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially. I implemented it into `Socket.tcp` by using `Addrinfo.getaddrinfo` in each thread spawned per address family to resolve the hostname asynchronously, and using `Socket::connect_nonblock` to try to connect with multiple addrinfo in parallel. [Outcome] This change eliminates a fatal defect in the following cases. Case 1. One of the A or AAAA DNS queries does not return --- require 'socket' class Addrinfo class << self # Current Socket.tcp depends on foreach def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block) getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout) .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout)) .each(&block) end def getaddrinfo(_, _, family, *_) case family when Socket::AF_INET6 then sleep when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)] end end end end Socket.tcp("localhost", 4567) --- Because the current `Socket.tcp` cannot resolve IPv6 names, the program stops in this case. It cannot start to connect with IPv4 address. Though `Socket.tcp` with HEv2 can promptly start a connection attempt with IPv4 address in this case. Case 2. Server does not promptly return ack for syn of either IPv4 / IPv6 address family --- require 'socket' fork do socket = Socket.new(Socket::AF_INET6, :STREAM) socket.setsockopt(:SOCKET, :REUSEADDR, true) socket.bind(Socket.pack_sockaddr_in(4567, '::1')) sleep socket.listen(1) connection, _ = socket.accept connection.close socket.close end fork do socket = Socket.new(Socket::AF_INET, :STREAM) socket.setsockopt(:SOCKET, :REUSEADDR, true) socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1')) socket.listen(1) connection, _ = socket.accept connection.close socket.close end Socket.tcp("localhost", 4567) --- The current `Socket.tcp` tries to connect serially, so when its first name resolves an IPv6 address and initiates a connection to an IPv6 server, this server does not return an ACK, and the program stops. Though `Socket.tcp` with HEv2 starts to connect sequentially and in parallel so a connection can be established promptly at the socket that attempted to connect to the IPv4 server. In exchange, the performance of `Socket.tcp` with HEv2 will be degraded. --- 100.times { Socket.tcp("www.ruby-lang.org", 80) } --- This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to `IO::select` in the implementation. * Avoid NameError of Socket::EAI_ADDRFAMILY in MinGW * Support Windows with SO_CONNECT_TIME * Improve performance I have additionally implemented the following patterns: - If the host is single-stack, name resolution is performed in the main thread. This reduces the cost of creating threads. - If an IP address is specified, name resolution is performed in the main thread. This also reduces the cost of creating threads. - If only one IP address is resolved, connect is executed in blocking mode. This reduces the cost of calling IO::select. Also, I have added a fast_fallback option for users who wish not to use HE. Here are the results of each performance test. ```ruby require 'socket' require 'benchmark' HOSTNAME = "www.ruby-lang.org" PORT = 80 ai = Addrinfo.tcp(HOSTNAME, PORT) Benchmark.bmbm do |x| x.report("Domain name") do 30.times { Socket.tcp(HOSTNAME, PORT).close } end x.report("IP Address") do 30.times { Socket.tcp(ai.ip_address, PORT).close } end x.report("fast_fallback: false") do 30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close } end end ``` ``` user system total real Domain name 0.015567 0.032511 0.048078 ( 0.325284) IP Address 0.004458 0.014219 0.018677 ( 0.284361) fast_fallback: false 0.005869 0.021511 0.027380 ( 0.321891) ```` And this is the measurement result when executed in a single stack environment. ``` user system total real Domain name 0.007062 0.019276 0.026338 ( 1.905775) IP Address 0.004527 0.012176 0.016703 ( 3.051192) fast_fallback: false 0.005546 0.019426 0.024972 ( 1.775798) ``` The following is the result of the run on Ruby 3.3.0. (on Dual stack environment) ``` user system total real Ruby 3.3.0 0.007271 0.027410 0.034681 ( 0.472510) ``` (on Single stack environment) ``` user system total real Ruby 3.3.0 0.005353 0.018898 0.024251 ( 1.774535) ``` * Do not cache `Socket.ip_address_list` As mentioned in the comment at https://github.com/ruby/ruby/pull/9374#discussion_r1482269186, caching Socket.ip_address_list does not follow changes in network configuration. But if we stop caching, it becomes necessary to check every time `Socket.tcp` is called whether it's a single stack or not, which could further degrade performance in the case of a dual stack. From this, I've changed the approach so that when a domain name is passed, it doesn't check whether it's a single stack or not and resolves names in parallel each time. The performance measurement results are as follows. require 'socket' require 'benchmark' HOSTNAME = "www.ruby-lang.org" PORT = 80 ai = Addrinfo.tcp(HOSTNAME, PORT) Benchmark.bmbm do |x| x.report("Domain name") do 30.times { Socket.tcp(HOSTNAME, PORT).close } end x.report("IP Address") do 30.times { Socket.tcp(ai.ip_address, PORT).close } end x.report("fast_fallback: false") do 30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close } end end user system total real Domain name 0.004085 0.011873 0.015958 ( 0.330097) IP Address 0.000993 0.004400 0.005393 ( 0.257286) fast_fallback: false 0.001348 0.008266 0.009614 ( 0.298626) * Wait forever if fallback addresses are unresolved, unless resolv_timeout Changed from waiting only 3 seconds for name resolution when there is no fallback address available, to waiting as long as there is no resolv_timeout. This is in accordance with the current `Socket.tcp` specification. * Use exact pattern to match IPv6 address format for specify address family
2024-02-23Stop using rb_str_locktmp_ensure publiclyPeter Zhu
rb_str_locktmp_ensure is a private API.
2024-02-23Add option for mtu discovery flagMarek Küthe
Signed-off-by: Marek Küthe <m.k@mk16.de>
2024-02-23Fixes [Bug #20258]Marek Küthe
Signed-off-by: Marek Küthe <m.k@mk16.de>
2024-02-01Revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns"KJ Tsanaktsidis
This reverts commit 673ed41c81cf5a6951bcb2c3dec82d7bd6ea7440.
2024-01-24Initialize errno variables and fix maybe-uninitialized warningsNobuyoshi Nakada
2024-01-22Make sure the correct error is raised for EAI_SYSTEM resolver failKJ Tsanaktsidis
In case of EAI_SYSTEM, getaddrinfo is supposed to set more detail in errno; however, because we call getaddrinfo on a thread now, and errno is threadlocal, that information is being lost. Instead, we just raise whatever errno happens to be on the calling thread (which can be something very confusing, like `ECHILD`). Fix it by explicitly propagating errno back to the calling thread through the getaddrinfo_arg structure. [Bug #20198]
2024-01-19Mark asan fake stacks during machine stack markingKJ Tsanaktsidis
ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]
2024-01-12Revert "Mark asan fake stacks during machine stack marking"KJ Tsanaktsidis
This reverts commit d10bc3a2b8300cffc383e10c3730871e851be24c.
2024-01-12Mark asan fake stacks during machine stack markingKJ Tsanaktsidis
ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]
2024-01-11Remove setaffinity of pthread for getaddrinfoYusuke Endoh
It looks like `sched_getcpu(3)` returns a strange number on some (virtual?) environments. I decided to remove the setaffinity mechanism because the performance does not appear to degrade on a quick benchmark even if removed. [Bug #20172]
2024-01-05Free pthread_attr after setting up the threadAdam Hess
[bug #20149]
2023-12-18Update `BasicSocket#recv` documentation on return valueJean Boussier
Ref: https://github.com/ruby/ruby/pull/6407 [Bug #19012] `0` is now interpreted as closed connection an not an empty packet, as these are very rare and pretty much useless.
2023-12-18[DOC] Add Socket::ResolutionError documentationNobuyoshi Nakada
2023-12-18[DOC] Correct the location of Addrinfo documentNobuyoshi Nakada
The document must be placed immediately before the class definition. No other statements can be placed in between.
2023-12-18[DOC] Stop unintentional references to builtin or standard namesNobuyoshi Nakada
2023-12-17Revert "[DOC] Make undocumented socket constans nodoc"Nobuyoshi Nakada
This reverts commit cbda94edd80b0f664eda927f9ce9405b2074633a, because `:nodoc:` does not work for constants. In the case of `rb_define_const`, RDoc parses the preceeding comment as in `"/* definition: comment */"` form.
2023-12-17[DOC] Make undocumented socket constans nodocNobuyoshi Nakada
2023-12-17[DOC] Utilize COMMENTS.default_proc to add fallback documentsNobuyoshi Nakada
2023-12-12Partially revert "Set AI_ADDRCONFIG when making getaddrinfo(3) calls"KJ Tsanaktsidis
This _partially_ reverts commit d2ba8ea54a4089959afdeecdd963e3c4ff391748, but for UDP sockets only. With TCP sockets (and other things which use `rsock_init_inetsock`), the order of operations is to call `getaddrinfo(3)` with AF_UNSPEC, look at the returned addresses, pick one, and then call `socket(2)` with the family for that address (i.e. AF_INET or AF_INET6). With UDP sockets, however, this is reversed; `UDPSocket.new` takes an address family as an argument, and then calls `socket(2)` with that family. A subsequent call to UDPSocket#connect will then call `getaddrinfo(3)` with that family. The problem here is that... * If you are in a networking situation that _only_ has loopback addrs, * And you want to look up a name like "localhost" (or NULL) * And you pass AF_INET or AF_INET6 as the ai_family argument to getaddrinfo(3), * And you pass AI_ADDRCONFIG to the hints argument as well, then glibc on Linux will not return an address. This is because AI_ADDRCONFIG is supposed to return addresses for families we actually have an address for and could conceivably connect to, but also is documented to explicitly ignore localhost in that situation. It honestly doesn't make a ton of sense to pass AI_ADDRCONFIG if you're explicitly passing the address family anyway, because you're not looking for "an address for this name we can connect to"; you're looking for "an IPv(4|6) address for this name". And the original glibc bug that d2ba8ea5 was supposed to work around was related to parallel issuance of A and AAAA queries, which of course won't happen if an address family is explicitly specified. So, we fix this by not passing AI_ADDRCONFIG for calls to `rsock_addrinfo` that we also pass an explicit family to (i.e. for UDPsocket). [Bug #20048]
2023-12-10Change the semantics of rb_postponed_job_registerKJ Tsanaktsidis
Our current implementation of rb_postponed_job_register suffers from some safety issues that can lead to interpreter crashes (see bug #1991). Essentially, the issue is that jobs can be called with the wrong arguments. We made two attempts to fix this whilst keeping the promised semantics, but: * The first one involved masking/unmasking when flushing jobs, which was believed to be too expensive * The second one involved a lock-free, multi-producer, single-consumer ringbuffer, which was too complex The critical insight behind this third solution is that essentially the only user of these APIs are a) internal, or b) profiling gems. For a), none of the usages actually require variable data; they will work just fine with the preregistration interface. For b), generally profiling gems only call a single callback with a single piece of data (which is actually usually just zero) for the life of the program. The ringbuffer is complex because it needs to support multi-word inserts of job & data (which can't be atomic); but nobody actually even needs that functionality, really. So, this comit: * Introduces a pre-registration API for jobs, with a GVL-requiring rb_postponed_job_prereigster, which returns a handle which can be used with an async-signal-safe rb_postponed_job_trigger. * Deprecates rb_postponed_job_register (and re-implements it on top of the preregister function for compatability) * Moves all the internal usages of postponed job register pre-registration
2023-12-07Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns (#7295)KJ Tsanaktsidis
When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the hints we send to getaddrinfo(3) (if supported). This will prompt the resolver to _NOT_ issue A or AAAA queries if the system does not actually have an IPv4 or IPv6 address (respectively). This makes outgoing connections marginally more efficient on non-dual-stack systems, since we don't have to try connecting to an address which can't possibly work. More importantly, however, this works around a race condition present in some older versions of glibc on aarch64 where it could accidently send the two outgoing DNS queries with the same DNS txnid, and get confused when receiving the responses. This manifests as outgoing connections sometimes taking 5 seconds (the DNS timeout before retry) to be made. Fixes #19144
2023-11-30Adjust indent [ci skip]Nobuyoshi Nakada
2023-11-30Rename rsock_raise_socket_error to rsock_raise_resolution_errorMisaki Shioi
Again, rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail
2023-11-30Replace SocketError with Socket::ResolutionError in rsock_raise_socket_errorMisaki Shioi
rsock_raise_socket_error is called only when getaddrinfo and getaddrname fail
2023-11-30Add Socket::ResolutionError & Socket::ResolutionError#error_codeMisaki Shioi
Socket::ResolutionError#error_code returns Socket::EAI_XXX
2023-11-28Retry pthread_create a few timesYusuke Endoh
According to https://bugs.openjdk.org/browse/JDK-8268605, pthread_create may fail spuriously. This change implements a simple retry as a modest measure, which is also used by JDK.
2023-11-07Prevent cpu_set_t overflow even if there are more than 63 coresYusuke Endoh
Do not use `pthread_attr_setaffinity_np` if `sched_getcpu()` exceeds `CPU_SETSIZE`. (Using `CPU_ALLOC()` would be more appropriate.)
2023-11-07Fix a memory leakYusuke Endoh
pointed by @nobu
2023-11-07Use pthread_attr_setaffinity_np instead of pthread_setaffinity_npYusuke Endoh
2023-11-07Detach a pthread after pthread_setaffinity_npYusuke Endoh
After a pthread for getaddrinfo is detached, we cannot predict when the thread will exit. It would lead to a segfault by setting pthread_setaffinity to the terminated pthread. I guess this problem would be more likely to occur in high-load environments. This change detaches the pthread after pthread_setaffinity is called. [Feature #19965]
2023-11-07Revert "Do not use pthread_setaffinity_np on s390x"Yusuke Endoh
This reverts commit de82439215dd2770ef9a3a2cf5798bdadb788533.
2023-10-25Do not use pthread_setaffinity_np on s390xYusuke Endoh
Looks like it randomly causes a segfault https://rubyci.s3.amazonaws.com/rhel_zlinux/ruby-master/log/20231025T093302Z.fail.html.gz ``` [11186/26148] TestNetHTTP_v1_2#test_set_form/home/chkbuild/build/20231025T093302Z/ruby/tool/lib/webrick/httprequest.rb:197: [BUG] Segmentation fault at 0x000003ff1ffff000 ruby 3.3.0dev (2023-10-25T07:50:00Z master 526292d9fe) [s390x-linux] ```
2023-10-24rb_getaddrinfo should return EAI_AGAIN instead of EAGAINYusuke Endoh
2023-10-24Indent critical regions with blocksYusuke Endoh
Cosmetic change per ko1's preference
2023-10-24Do not use pthread on mingwYusuke Endoh
2023-10-24Make rb_getnameinfo interruptibleYusuke Endoh
Same as previous commit for rb_getnameinfo.
2023-10-24Make rb_getaddrinfo interruptibleYusuke Endoh
When pthread_create is available, rb_getaddrinfo creates a pthread and executes getaddrinfo(3) in it. The caller thread waits for the pthread to complete, but detaches it if interrupted. This allows name resolution to be interuppted by Timeout.timeout, etc. even if it takes a long time (for example, when the DNS server does not respond). [Feature #19965]
2023-10-24Expand macro branches to make them plainYusuke Endoh
2023-10-24Refactor GETADDRINFO_IMPL instead of GETADDRINFO_EMUYusuke Endoh
This is a preparation for introducing cancellable getaddrinfo/getnameinfo.
2023-10-24refactor a call to getaddrinfoYusuke Endoh
2023-10-17Use rb_getnameinfo instead of directly using getnameinfoYusuke Endoh
2023-08-30BasicSocket#recv* return `nil` rather than an empty packetJean Boussier
[Bug #19012] man recvmsg(2) states: > Return Value > These calls return the number of bytes received, or -1 if an error occurred. > The return value will be 0 when the peer has performed an orderly shutdown. Not too sure how one is supposed to make the difference between a packet of size 0 and a closed connection. Notes: Merged: https://github.com/ruby/ruby/pull/6407
2023-06-30Don't check for null pointer in calls to freePeter Zhu
According to the C99 specification section 7.20.3.2 paragraph 2: > If ptr is a null pointer, no action occurs. So we do not need to check that the pointer is a null pointer. Notes: Merged: https://github.com/ruby/ruby/pull/8004
2023-06-12[Feature #19719] Universal Parseryui-knk
Introduce Universal Parser mode for the parser. This commit includes these changes: * Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled. * Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu. Notes: Merged: https://github.com/ruby/ruby/pull/7927
2023-04-06Update VPATH for socket, & dependenciesMatt Valentine-House
The socket extensions rubysocket.h pulls in the "private" include/gc.h, which now depends on vm_core.h. vm_core.h pulls in id.h when tool/update-deps generates the dependencies for the makefiles, it generates the line for id.h to be based on VPATH, which is configured in the extconf.rb for each of the extensions. By default VPATH does not include the actual source directory of the current Ruby so the dependency fails to resolve and linking fails. We need to append the topdir and top_srcdir to VPATH to have the dependancy picked up correctly (and I believe we need both of these to cope with in-tree and out-of-tree builds). I copied this from the approach taken in https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3 Notes: Merged: https://github.com/ruby/ruby/pull/7393
2023-02-28Update the depend filesMatt Valentine-House
Notes: Merged: https://github.com/ruby/ruby/pull/7310