module RDF::Util::File
Wrapper for retrieving RDF resources from HTTP(S) and file: scheme locations.
By default, HTTP(S) resources are retrieved using Net::HTTP. However, if the [Rest Client](rubygems.org/gems/rest-client) gem is loaded, it is used for retrieving resources instead. This allows for sophisticated HTTP caching: [REST Client Components](rubygems.org/gems/rest-client-components) enables the use of `Rack::Cache` to avoid network access.
To use other HTTP clients, consumers can subclass {RDF::Util::File::HttpAdapter} and set the {RDF::Util::File.}.
Also supports the file: scheme for access to local files.
@since 0.2.4
Public Class Methods
Source
# File lib/rdf/util/file.rb, line 244
def http_adapter(use_net_http = false)
  if use_net_http
    NetHttpAdapter
  else
    @http_adapter ||= begin
      # Otherwise, fallback to Net::HTTP
      if defined?(RestClient)
        RestClientAdapter
      else
        NetHttpAdapter
      end
    end
  end
end
Get current HTTP adapter. If no adapter has been explicitly set, use RestClientAdapter (if RestClient is loaded), or the NetHttpAdapter.
@param [Boolean] use_net_http use the NetHttpAdapter, even if other adapters have been configured
@return [HttpAdapter]
@since 1.2
Source
# File lib/rdf/util/file.rb, line 232
def http_adapter= http_adapter
  @http_adapter = http_adapter
end
Set the HTTP adapter.
@see .http_adapter
@param [HttpAdapter] http_adapter
@return [HttpAdapter]
@since 1.2
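The interplay between the getter and the setter can be illustrated with a small stand-in that does not require the `rdf` gem. This is only a sketch: `AdapterRegistry` and the `*Stub` classes are hypothetical names, and the stand-in omits the `RestClient` check. The `open_url` keyword signature mirrors the call made by `open_file`.

```ruby
# Stand-ins for the real adapter classes (hypothetical names, not part of RDF.rb).
class HttpAdapterStub
  def self.open_url(base_uri, proxy: nil, headers: {}, verify_none: false, **options)
    raise NotImplementedError, "#{name} must implement open_url"
  end
end

class NetHttpStub < HttpAdapterStub; end

# A custom adapter subclasses the base adapter and overrides open_url.
class CachingStub < HttpAdapterStub
  def self.open_url(base_uri, proxy: nil, headers: {}, verify_none: false, **options)
    "fetched #{base_uri} via #{name}"
  end
end

# Mirrors the memoized getter and the plain setter shown above.
module AdapterRegistry
  class << self
    attr_writer :http_adapter

    def http_adapter(use_net_http = false)
      return NetHttpStub if use_net_http # forced override; does not touch the stored adapter

      @http_adapter ||= NetHttpStub
    end
  end
end

AdapterRegistry.http_adapter = CachingStub
puts AdapterRegistry.http_adapter        # CachingStub
puts AdapterRegistry.http_adapter(true)  # NetHttpStub
puts AdapterRegistry.http_adapter        # CachingStub
```

Passing `true` bypasses the configured adapter for a single lookup; the stored adapter is left unchanged.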
Source
# File lib/rdf/util/file.rb, line 299
def self.open_file(filename_or_url, proxy: nil, headers: {}, verify_none: false, **options, &block)
  remote_document = nil

  if filename_or_url.to_s.match?(/^https?/)
    base_uri = filename_or_url.to_s

    remote_document = self.http_adapter(!!options[:use_net_http]).
      open_url(base_uri,
               proxy:       proxy,
               headers:     headers,
               verify_none: verify_none,
               **options)
  else
    # Fake content type based on found format
    format = RDF::Format.for(filename_or_url.to_s)
    content_type = format ? format.content_type.first : 'text/plain'
    # Open as a file, passing any options
    begin
      url_no_frag_or_query = RDF::URI(filename_or_url).dup
      url_no_frag_or_query.query = nil
      url_no_frag_or_query.fragment = nil
      options[:encoding] ||= Encoding::UTF_8

      # Just use path if there's a file scheme. This leaves out a potential host, which isn't supported anyway.
      if url_no_frag_or_query.scheme == 'file'
        url_no_frag_or_query = url_no_frag_or_query.path
        if url_no_frag_or_query.match?(/^\/[A-Za-z]:/) && Gem.win_platform?
          # Turns "/D:foo" into "D:foo"
          url_no_frag_or_query = url_no_frag_or_query[1..-1]
        end
      end

      Kernel.open(url_no_frag_or_query, "r", **options) do |file|
        document_options = {
          base_uri:      filename_or_url.to_s,
          charset:       file.external_encoding.to_s,
          code:          200,
          content_type:  content_type,
          last_modified: file.mtime,
          headers:       {content_type: content_type, last_modified: file.mtime.xmlschema}
        }

        remote_document = RemoteDocument.new(file.read, document_options)
      end
    rescue Errno::ENOENT => e
      raise IOError, e.message
    end
  end

  if block_given?
    yield remote_document
  else
    remote_document
  end
end
Open the file, returning or yielding {RemoteDocument}.
Input received in a non-Unicode encoding is transcoded to UTF-8. With Ruby >= 2.2, all UTF input is normalized to [Unicode Normalization Form C (NFC)](unicode.org/reports/tr15/#Norm_Forms).
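The transcoding and normalization described here can be sketched with core Ruby string methods. `to_utf8_nfc` is a hypothetical helper written for illustration; the library's own implementation may differ in detail.

```ruby
# Hypothetical helper: transcode input to UTF-8, then normalize to NFC.
def to_utf8_nfc(input)
  input.encode(Encoding::UTF_8).unicode_normalize(:nfc)
end

# "café" as ISO-8859-1 bytes (0xE9 is "é" in Latin-1).
latin1 = "caf\xE9".force_encoding(Encoding::ISO_8859_1)

# "café" in decomposed form: "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "cafe\u0301"

puts to_utf8_nfc(latin1) == "caf\u00E9"     # true
puts to_utf8_nfc(decomposed) == "caf\u00E9" # true
```

Both inputs end up as the same precomposed UTF-8 string, so downstream comparisons do not depend on the source encoding or on how accents were composed.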
HTTP resources may be retrieved via a proxy using the `proxy` option. If `RestClient` is loaded, the proxy is configured globally instead, by setting something like the following:
`RestClient.proxy = "http://proxy.example.com/"`.
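For the Net::HTTP case, a proxy URL can be handed to the standard library roughly as follows. This is a sketch using only `net/http` and `uri`; `connection_via_proxy` is a hypothetical helper, the proxy port is an assumed value, and the library's NetHttpAdapter may wire the `proxy` option differently. No network connection is opened here.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not start) a Net::HTTP connection
# that would route requests for target_url through proxy_url.
def connection_via_proxy(target_url, proxy_url)
  target = URI.parse(target_url)
  proxy  = URI.parse(proxy_url)

  http = Net::HTTP.new(target.host, target.port, proxy.host, proxy.port)
  http.use_ssl = (target.scheme == 'https')
  http
end

http = connection_via_proxy('http://example.org/some/resource',
                            'http://proxy.example.com:8080/')
puts http.proxy_address # proxy.example.com
puts http.proxy_port    # 8080
```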
When retrieving documents over HTTP(S), use the mechanism described in [Providing and Discovering URI Documentation](www.w3.org/2001/tag/awwsw/issue57/latest/) to pass the appropriate `base_uri` to the block or as the return value.
Applications needing HTTP caching may consider [Rest Client](rubygems.org/gems/rest-client) and [REST Client Components](rubygems.org/gems/rest-client-components), which allow the use of `Rack::Cache` as a local file cache.
@example using a local HTTP cache
  require 'restclient/components'
  require 'rack/cache'
  RestClient.enable Rack::Cache
  RDF::Util::File.open_file("http://example.org/some/resource")
    # => Cached resource if current, otherwise returned resource
@param [String] filename_or_url to open
@param [String] proxy
  HTTP proxy to use for requests.
@param [Array, String] headers ({})
  HTTP request headers. Defaults the `Accept` header based on available reader content types, to allow for content negotiation. Defaults the `User-Agent` header, unless one is specified.
@param [Boolean] verify_none (false)
  Don't verify SSL certificates.
@param [Hash{Symbol => Object}] options
  Options are ignored in this implementation. Applications are encouraged to override this implementation to provide more control over HTTP headers and redirect following. If opening as a file, options are passed to `Kernel.open`.
@return [RemoteDocument, Object] A {RemoteDocument}. If a block is given, the result of evaluating the block is returned.
@yield [RemoteDocument] A {RemoteDocument} for local files
@yieldreturn [Object] returned from open_file
@raise [IOError] if not found
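The local-file branch of `open_file` can be approximated without the `rdf` gem, which makes the returned metadata and the `IOError` conversion easier to see. This is a sketch only: `describe_local_file` is a hypothetical helper that returns a plain Hash rather than a {RemoteDocument}, and the content type is passed in by hand instead of being derived from `RDF::Format`.

```ruby
require 'tempfile'
require 'time' # provides Time#xmlschema

# Hypothetical helper mirroring the local-file branch of open_file.
def describe_local_file(path, content_type: 'text/plain', encoding: Encoding::UTF_8)
  File.open(path, 'r', encoding: encoding) do |file|
    {
      base_uri:      path.to_s,
      charset:       file.external_encoding.to_s,
      code:          200,
      content_type:  content_type,
      last_modified: file.mtime,
      headers:       {content_type: content_type, last_modified: file.mtime.xmlschema},
      body:          file.read
    }
  end
rescue Errno::ENOENT => e
  # A missing file surfaces as IOError, as in open_file.
  raise IOError, e.message
end

Tempfile.create(['sample', '.nt']) do |tmp|
  tmp.write(%(<http://example.org/s> <http://example.org/p> "o" .\n))
  tmp.flush

  doc = describe_local_file(tmp.path, content_type: 'application/n-triples')
  puts doc[:charset] # UTF-8
  puts doc[:code]    # 200
end
```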