class UrlPrivacy
Usage:
Constants
- TRACKING_PARAMS
Parameters to remove from URLs. Taken from Neat URL and ClearURLs, plus some others found manually.
@see {github.com/Smile4ever/Neat-URL}
@see {gitlab.com/anti-tracking/ClearURLs/rules/-/blob/master/data.json}
@see {github.com/Smile4ever/Neat-URL/issues/235}
Public Class Methods
clean(url)
Removes tracking parameters from the given URL. If the URL can't be parsed, returns it unmodified.
Results are cached per URL, so repeated calls with the same URL are not cleaned again.
@param url [String]
@return [String]
# File lib/url_privacy.rb, line 81
def clean(url)
  @cleaned_urls ||= {}
  @cleaned_urls[url] ||= begin
    uri = URI(url)

    if uri.query
      hostname = uri.hostname.sub(/\Awww\./, '')
      params = URI.decode_www_form(uri.query).to_h

      # Remove params by name first
      params.reject! do |param, _|
        TRACKING_PARAMS.include? param
      end

      # Remove params with globs
      params.reject! do |param, _|
        simple_tracking_params.any? do |pattern_param|
          File.fnmatch(pattern_param, param)
        end
      end

      # Remove params matching by hostname and then param
      params.reject! do |param, _|
        complex_tracking_params.any? do |pattern_hostname, pattern_params|
          next false unless File.fnmatch(pattern_hostname, hostname)

          pattern_params.any? do |pattern_param|
            File.fnmatch(pattern_param, param)
          end
        end
      end

      uri.query = URI.encode_www_form(params)
    end

    uri.to_s
  end
rescue URI::Error
  @cleaned_urls[url] ||= url
end
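The three filtering passes above can be tried in isolation with a small standalone sketch. This is not the library's code: `sketch_clean`, `SKETCH_SIMPLE`, and `SKETCH_BY_HOST` are hypothetical names, and the pattern lists are made-up samples rather than the real `TRACKING_PARAMS`. The sketch also makes two choices that differ from the source: it keeps repeated parameters instead of collapsing them into a hash, and it drops the query entirely when nothing is left.

```ruby
require 'uri'

# Hypothetical sample patterns, not the real TRACKING_PARAMS.
SKETCH_SIMPLE  = %w[utm_* fbclid].freeze
SKETCH_BY_HOST = { 'example.*' => %w[ref] }.freeze

# Strip query parameters whose names match a global glob pattern or a
# pattern scoped to the URL's hostname. Returns the input unchanged when
# it cannot be parsed or has no query or hostname.
def sketch_clean(url)
  uri = URI(url)
  return url unless uri.query && uri.hostname

  host = uri.hostname.sub(/\Awww\./, '')

  # Global patterns plus the patterns of every host rule matching this host.
  host_patterns = SKETCH_BY_HOST.select { |pattern, _| File.fnmatch(pattern, host) }
  patterns = SKETCH_SIMPLE + host_patterns.values.flatten

  kept = URI.decode_www_form(uri.query).reject do |name, _|
    patterns.any? { |pattern| File.fnmatch(pattern, name) }
  end

  # Assumption of this sketch: an empty result removes the "?" altogether.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
rescue URI::Error
  url
end

sketch_clean('https://www.example.com/item?id=7&utm_source=mail&ref=home')
# => "https://www.example.com/item?id=7"
sketch_clean('https://other.org/?ref=home&fbclid=abc')
# => "https://other.org/?ref=home"
```

In the second call `ref` survives because the sample rule for it is scoped to hosts matching `example.*`.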
Private Class Methods
complex_tracking_params()
This exists so that entries can be copied and pasted straight from the Neat URL source code. It produces a hash of hostname => [ params ] in which both the hostname and the params can be glob-matched.
@return [Hash]
# File lib/url_privacy.rb, line 129
def complex_tracking_params
  @complex_tracking_params ||= TRACKING_PARAMS.map do |param|
    next unless param.include? '@'

    Hash[*param.split('@', 2).reverse]
  end.compact.reduce({}) do |hash, pairs|
    pairs.each do |key, value|
      (hash[key] ||= []) << value
    end

    hash
  end
end
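To make the resulting shape concrete, here is a minimal sketch of the same grouping written with `each_with_object`. The entries in `SAMPLE_ENTRIES` are hypothetical examples in the `param@host` notation, and `group_by_host` is an illustrative name, not part of the class.

```ruby
# Hypothetical entries; the real list lives in TRACKING_PARAMS.
SAMPLE_ENTRIES = %w[utm_* fbclid ref@amazon.* tag@amazon.* pf_rd_*@imdb.com].freeze

# Group "param@host" entries into { host_pattern => [param_pattern, ...] }.
# Entries without an "@" are global patterns and are skipped here.
def group_by_host(entries)
  entries.each_with_object({}) do |entry, rules|
    next unless entry.include?('@')

    param, host = entry.split('@', 2)
    (rules[host] ||= []) << param
  end
end

group_by_host(SAMPLE_ENTRIES)
# => {"amazon.*"=>["ref", "tag"], "imdb.com"=>["pf_rd_*"]}
```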
simple_tracking_params()
# File lib/url_privacy.rb, line 143
def simple_tracking_params
  @simple_tracking_params ||= TRACKING_PARAMS.select do |param|
    !param.include?('@')
  end
end
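The entries returned here are the ones `clean` matches against parameter names with `File.fnmatch`, regardless of hostname. A short sketch of that matching, assuming a hypothetical two-entry list (`SIMPLE_SAMPLE`) and an illustrative helper name (`tracking_param?`):

```ruby
# Hypothetical global patterns; no "@host" suffix.
SIMPLE_SAMPLE = %w[utm_* fbclid].freeze

# True when the parameter name matches any glob pattern in the list.
def tracking_param?(name, patterns = SIMPLE_SAMPLE)
  patterns.any? { |pattern| File.fnmatch(pattern, name) }
end

tracking_param?('utm_source') # => true, matches "utm_*"
tracking_param?('fbclid')     # => true, exact match
tracking_param?('q')          # => false
```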