class Moab::FileGroupDifference
Performs analysis and reports the differences between two matching {FileGroup} objects. The descendant elements of the report hold a detailed breakdown of file-level differences, organized by change type. This stanza is a child element of {FileInventoryDifference}, the documentation of which contains a full example.
In order to determine the detailed nature of the differences that are present between the two manifests, this algorithm first compares the sets of file signatures present in the groups being compared, then uses the result of that operation for subsequent analysis of filename correspondences.
For the first step, a Ruby Hash is extracted from each of the two groups, with {FileSignature} objects used as hash keys and the corresponding {FileInstance} arrays as the hash values. The set of keys from the basis hash can be compared against the keys from the other hash using {Array} operators:
- matching = basis_array & other_array
- basis_only = basis_array - other_array
- other_only = other_array - basis_array
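The three set operations can be tried out on plain hashes. This is a minimal sketch, assuming strings as stand-ins for {FileSignature} keys and arrays of paths as stand-ins for {FileInstance} arrays; the sample values are hypothetical. Ruby's Array#& and Array#- compare elements with hash and eql?, so real key objects need to implement both.

```ruby
# Hypothetical stand-ins: strings play the role of FileSignature keys,
# and arrays of paths play the role of FileInstance arrays.
basis_hash = { 'sig-a' => ['page-1.jpg'], 'sig-b' => ['page-2.jpg'] }
other_hash = { 'sig-b' => ['page-2.jpg'], 'sig-c' => ['page-3.jpg'] }

basis_array = basis_hash.keys
other_array = other_hash.keys

matching   = basis_array & other_array # signatures present in both groups
basis_only = basis_array - other_array # signatures present only in the basis group
other_only = other_array - basis_array # signatures present only in the other group
```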
For the second step of the comparison, the matching and non-matching sets of hash entries are further categorized as follows:
- identical = signature and file path are the same in both the basis and other file groups
- renamed = signature is unchanged, but the path has changed
- copyadded = a duplicate copy of the file was added
- copydeleted = a duplicate copy of the file was deleted
- modified = path is the same in both groups, but the signature has changed
- added = signature and path are only in the other inventory
- deleted = signature and path are only in the basis inventory
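The two-step categorization can be sketched without the Moab classes. The example below is an illustration only, assuming each group is a plain hash that maps a file path to a signature string; the method name classify and the sample data are hypothetical and not part of the library.

```ruby
# Sketch only: both inputs map a file path to a signature string (hypothetical stand-ins).
def classify(basis, other)
  result = Hash.new { |hash, key| hash[key] = [] }
  basis_by_sig = basis.keys.group_by { |path| basis[path] }
  other_by_sig = other.keys.group_by { |path| other[path] }

  # Signatures present in both groups: identical, renamed, copyadded, copydeleted.
  (basis_by_sig.keys & other_by_sig.keys).each do |sig|
    basis_paths = basis_by_sig[sig]
    other_paths = other_by_sig[sig]
    result[:identical].concat(basis_paths & other_paths)
    basis_only = basis_paths - other_paths
    other_only = other_paths - basis_paths
    [basis_only.size, other_only.size].max.times do |n|
      if basis_only[n].nil?
        result[:copyadded] << [basis_paths[0], other_only[n]]
      elsif other_only[n].nil?
        result[:copydeleted] << basis_only[n]
      else
        result[:renamed] << [basis_only[n], other_only[n]]
      end
    end
  end

  # Signatures present in only one group: modified, added, deleted.
  basis_rest = basis.reject { |_path, sig| other_by_sig.key?(sig) }
  other_rest = other.reject { |_path, sig| basis_by_sig.key?(sig) }
  result[:modified].concat(basis_rest.keys & other_rest.keys)
  result[:added].concat(other_rest.keys - basis_rest.keys)
  result[:deleted].concat(basis_rest.keys - other_rest.keys)
  result
end

basis = { 'a.txt' => 's1', 'b.txt' => 's2', 'c.txt' => 's3', 'd.txt' => 's4', 'e.txt' => 's5' }
other = { 'a.txt' => 's1', 'b2.txt' => 's2', 'c.txt' => 's3x', 'f.txt' => 's6',
          'e.txt' => 's5', 'e-copy.txt' => 's5' }
changes = classify(basis, other)
```

In this sample, b.txt and b2.txt share a signature, so the pair is reported as renamed, while c.txt keeps its path but changes signature, so it is reported as modified.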
Data Model
- {FileInventoryDifference} = compares two {FileInventory} instances based on file signatures and pathnames
  - {FileGroupDifference} [1..*] = performs analysis and reports differences between two matching {FileGroup} objects
    - {FileGroupDifferenceSubset} [1..5] = collects a set of file-level differences of a given change type
      - {FileInstanceDifference} [1..*] = contains difference information at the file level
        - {FileSignature} [1..2] = contains the file signature(s) of two file instances being compared
@note Copyright © 2012 by The Board of Trustees of the Leland Stanford
Junior University.
All rights reserved. See {file:LICENSE.rdoc} for details.
Attributes
@return [Hash<Symbol,FileGroupDifferenceSubset>] A set of containers (one for each change type),
each of which contains a collection of file-level differences having that change type.
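The container-per-change-type idea can be illustrated with a Hash that builds its values on first access. This is a minimal sketch, assuming a hypothetical SubsetSketch struct in place of {FileGroupDifferenceSubset}; the file names are invented for illustration.

```ruby
# Hypothetical stand-in for FileGroupDifferenceSubset: a change label plus a list of files.
SubsetSketch = Struct.new(:change, :files) do
  def count
    files.size
  end
end

# The default block creates a subset the first time a change type is requested.
subset_hash = Hash.new { |hash, key| hash[key] = SubsetSketch.new(key.to_s, []) }

subset_hash[:identical].files << 'page-1.jpg'
subset_hash[:renamed].files << %w[page-2.jpg page-3.jpg]
subset_hash[:added].files << 'page-2.jpg'

# Every subset except :identical contributes to the difference count.
difference_count = subset_hash.sum { |type, subset| type == :identical ? 0 : subset.count }
```

Because the subsets are created lazily, asking for a change type that has no entries yet returns an empty container rather than nil.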
Public Class Methods
Source
# File lib/moab/file_group_difference.rb, line 55
def initialize(opts = {})
  @subset_hash = Hash.new { |hash, key| hash[key] = FileGroupDifferenceSubset.new(change: key.to_s) }
  super(opts)
end
(see Serializable#initialize)
Serializer::Serializable::new
Public Instance Methods
Source
# File lib/moab/file_group_difference.rb, line 115
def added
  subset_hash[:added].count
end
Source
# File lib/moab/file_group_difference.rb, line 172
def basis_only_keys(basis_hash, other_hash)
  basis_hash.keys - other_hash.keys
end
@api internal
@param (see matching_keys)
@return [Array] Compare the keys of two hashes and return the keys unique to the first hash
Source
# File lib/moab/file_group_difference.rb, line 187
def compare_file_groups(basis_group, other_group)
  @group_id = basis_group.group_id
  compare_matching_signatures(basis_group, other_group)
  compare_non_matching_signatures(basis_group, other_group)
  self
end
@api internal
@param basis_group [FileGroup] The file group that is the basis of the comparison
@param other_group [FileGroup] The file group that is compared against the basis group
@return [FileGroupDifference] Compare two file groups and return a differences report
Source
# File lib/moab/file_group_difference.rb, line 198
def compare_matching_signatures(basis_group, other_group)
  matching_signatures = matching_keys(basis_group.signature_hash, other_group.signature_hash)
  tabulate_unchanged_files(matching_signatures, basis_group.signature_hash, other_group.signature_hash)
  tabulate_renamed_files(matching_signatures, basis_group.signature_hash, other_group.signature_hash)
  self
end
@api internal
@param (see compare_file_groups)
@return [FileGroupDifference] For signatures that are present in both groups,
  report which file instances are identical or renamed
Source
# File lib/moab/file_group_difference.rb, line 209
def compare_non_matching_signatures(basis_group, other_group)
  basis_only_signatures = basis_only_keys(basis_group.signature_hash, other_group.signature_hash)
  other_only_signatures = other_only_keys(basis_group.signature_hash, other_group.signature_hash)
  basis_path_hash = basis_group.path_hash_subset(basis_only_signatures)
  other_path_hash = other_group.path_hash_subset(other_only_signatures)
  tabulate_modified_files(basis_path_hash, other_path_hash)
  tabulate_added_files(basis_path_hash, other_path_hash)
  tabulate_deleted_files(basis_path_hash, other_path_hash)
  self
end
@api internal
@param (see compare_file_groups)
@return [FileGroupDifference] For signatures that are present in only one or the other group,
  report which file instances are modified, deleted, or added
Source
# File lib/moab/file_group_difference.rb, line 87
def copyadded
  subset_hash[:copyadded].count
end
Source
# File lib/moab/file_group_difference.rb, line 94
def copydeleted
  subset_hash[:copydeleted].count
end
Source
# File lib/moab/file_group_difference.rb, line 122
def deleted
  subset_hash[:deleted].count
end
Source
# File lib/moab/file_group_difference.rb, line 69
def difference_count
  count = 0
  subset_hash.each do |type, subset|
    count += subset.count if type != :identical
  end
  count
end
Source
# File lib/moab/file_group_difference.rb, line 334
def file_deltas
  # The hash to be returned
  deltas = Hash.new { |hash, key| hash[key] = [] }
  # case where other_path is empty or 'same'. (create array of strings)
  %i[identical modified deleted copydeleted].each do |change|
    deltas[change].concat(subset_hash[change].files.collect(&:basis_path))
  end
  # case where basis_path and other_path are both present. (create array of arrays)
  %i[copyadded renamed].each do |change|
    deltas[change].concat(subset_hash[change].files.collect { |file| [file.basis_path, file.other_path] })
  end
  # case where basis_path is empty. (create array of strings)
  [:added].each do |change|
    deltas[change].concat(subset_hash[change].files.collect(&:other_path))
  end
  deltas
end
@return [Hash<Symbol,Array>] Sets of filenames grouped by change type for use in performing file or metadata operations
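The returned hash mixes two value shapes: plain path strings for change types that involve one path, and two-element arrays for change types that involve both a basis path and an other path. The sketch below uses a hand-written hash with hypothetical file names to show how a caller might consume each shape; it does not call the library.

```ruby
# Hypothetical return value, shaped like the hash described above.
deltas = {
  identical: ['page-1.jpg'],
  modified: ['title.jpg'],
  deleted: ['intro.txt'],
  copydeleted: [],
  copyadded: [['page-1.jpg', 'cover.jpg']],
  renamed: [['page-2.jpg', 'page-3.jpg']],
  added: ['page-2.jpg']
}

# Single-path change types yield plain strings.
removals = deltas[:deleted] + deltas[:copydeleted]

# Two-path change types yield [basis_path, other_path] pairs.
moves = deltas[:renamed].map { |basis_path, other_path| "#{basis_path} -> #{other_path}" }
```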
Source
# File lib/moab/file_group_difference.rb, line 80
def identical
  subset_hash[:identical].count
end
Source
# File lib/moab/file_group_difference.rb, line 165
def matching_keys(basis_hash, other_hash)
  basis_hash.keys & other_hash.keys
end
@api internal
@param basis_hash [Hash] The first hash being compared
@param other_hash [Hash] The second hash being compared
@return [Array] Compare the keys of two hashes and return the intersection
Source
# File lib/moab/file_group_difference.rb, line 108
def modified
  subset_hash[:modified].count
end
Source
# File lib/moab/file_group_difference.rb, line 179
def other_only_keys(basis_hash, other_hash)
  other_hash.keys - basis_hash.keys
end
@api internal
@param (see matching_keys)
@return [Array] Compare the keys of two hashes and return the keys unique to the second hash
Source
# File lib/moab/file_group_difference.rb, line 357
def rename_require_temp_files(filepairs)
  # Split the filepairs into two arrays
  oldnames = []
  newnames = []
  filepairs.each do |old, new|
    oldnames << old
    newnames << new
  end
  # Are any of the filenames the same in set of oldnames and set of newnames?
  intersection = oldnames & newnames
  intersection.count > 0
end
@param [Array<Array<String>>] filepairs The set of oldname, newname pairs for all files being renamed
@return [Boolean] Test whether any of the new names are the same as one of the old names,
  such as would be true for insertion of a new file into a page sequence, or a circular rename.
  In such a case, return true, indicating that use of intermediate temporary files would be required
  when updating a copy of an object's files at a given location.
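The collision test can be reproduced on its own. The sketch below is a standalone illustration with a hypothetical method name (needs_temp_files?) and invented file names; it covers the page-insertion and circular-rename cases mentioned above.

```ruby
# Standalone sketch: true when any new name is also one of the old names.
def needs_temp_files?(filepairs)
  oldnames = filepairs.map(&:first)
  newnames = filepairs.map(&:last)
  (oldnames & newnames).any?
end

# Inserting a new page 2 shifts the later pages up by one.
page_shift = [['page-2.jpg', 'page-3.jpg'], ['page-3.jpg', 'page-4.jpg']]
# Two files swap names.
circular = [['a.txt', 'b.txt'], ['b.txt', 'a.txt']]
# A rename whose target name is unused.
simple = [['draft.txt', 'final.txt']]

shift_collides    = needs_temp_files?(page_shift)
circular_collides = needs_temp_files?(circular)
simple_collides   = needs_temp_files?(simple)
```

In the page-shift case, renaming page-2.jpg directly to page-3.jpg would overwrite a file that still has to be renamed, which is why an intermediate name is needed.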
Source
# File lib/moab/file_group_difference.rb, line 372
def rename_tempfile_triplets(filepairs)
  filepairs.collect { |old, new| [old, new, "#{new}-#{Time.now.strftime('%Y%m%d%H%H%S')}-tmp"] }
end
@param [Array<Array<String>>] filepairs The set of oldname, newname pairs for all files being renamed
@return [Array<Array<String>>] a set of file triples containing oldname, newname, tempname
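One way such triples might be used is a two-phase rename: move every file to its temporary name first, then move each temporary file to its final name. The sketch below is an assumption about usage, not code from the library; the helper two_phase_rename is hypothetical, the temporary names use a fixed "-tmp" suffix instead of a timestamp, and the triples are ordered [old, new, temp] as produced by the source above.

```ruby
require 'tmpdir'

# Hypothetical helper: rename via temporary names so that no rename
# overwrites a file that is still waiting to be renamed.
def two_phase_rename(dir, triplets)
  triplets.each { |old, _new, temp| File.rename(File.join(dir, old), File.join(dir, temp)) }
  triplets.each { |_old, new, temp| File.rename(File.join(dir, temp), File.join(dir, new)) }
end

contents = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), 'A')
  File.write(File.join(dir, 'b.txt'), 'B')

  # A circular rename: the two files swap names.
  pairs = [['a.txt', 'b.txt'], ['b.txt', 'a.txt']]
  triplets = pairs.map { |old, new| [old, new, "#{new}-tmp"] }
  two_phase_rename(dir, triplets)

  [File.read(File.join(dir, 'a.txt')), File.read(File.join(dir, 'b.txt'))]
end
```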
Source
# File lib/moab/file_group_difference.rb, line 101
def renamed
  subset_hash[:renamed].count
end
Source
# File lib/moab/file_group_difference.rb, line 50
def subset(change)
  subset_hash[change.to_sym]
end
@param change [String] the change type to search for
@return [FileGroupDifferenceSubset] Find a specified subset of changes
Source
# File lib/moab/file_group_difference.rb, line 131
def subsets
  subset_hash.values
end
Source
# File lib/moab/file_group_difference.rb, line 135
def subsets=(array)
  return unless array

  array.each { |subset| subset_hash[subset.change.to_sym] = subset }
end
Source
# File lib/moab/file_group_difference.rb, line 148
def summary
  FileGroupDifference.new(
    group_id: group_id,
    identical: identical,
    copyadded: copyadded,
    copydeleted: copydeleted,
    renamed: renamed,
    modified: modified,
    added: added,
    deleted: deleted
  )
end
@api internal
@return [FileGroupDifference] Clone just this element for inclusion in a versionMetadata structure
Source
# File lib/moab/file_group_difference.rb, line 142
def summary_fields
  %w[group_id difference_count identical copyadded copydeleted renamed modified deleted added]
end
@return [Array<String>] The data fields to include in summary reports
Source
# File lib/moab/file_group_difference.rb, line 304
def tabulate_added_files(basis_path_hash, other_path_hash)
  other_only_keys(basis_path_hash, other_path_hash).each do |path|
    fid = FileInstanceDifference.new(change: 'added')
    fid.basis_path = ''
    fid.other_path = path
    fid.signatures << other_path_hash[path]
    subset_hash[:added].files << fid
  end
  self
end
@api internal
@param basis_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the basis group
@param other_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the other group
@return [FileGroupDifference]
  Container for reporting the set of file-level differences of type 'added'
Source
# File lib/moab/file_group_difference.rb, line 322
def tabulate_deleted_files(basis_path_hash, other_path_hash)
  basis_only_keys(basis_path_hash, other_path_hash).each do |path|
    fid = FileInstanceDifference.new(change: 'deleted')
    fid.basis_path = path
    fid.other_path = ''
    fid.signatures << basis_path_hash[path]
    subset_hash[:deleted].files << fid
  end
  self
end
@api internal
@param basis_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the basis group
@param other_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the other group
@return [FileGroupDifference]
  Container for reporting the set of file-level differences of type 'deleted'
Source
# File lib/moab/file_group_difference.rb, line 285
def tabulate_modified_files(basis_path_hash, other_path_hash)
  matching_keys(basis_path_hash, other_path_hash).each do |path|
    fid = FileInstanceDifference.new(change: 'modified')
    fid.basis_path = path
    fid.other_path = 'same'
    fid.signatures << basis_path_hash[path]
    fid.signatures << other_path_hash[path]
    subset_hash[:modified].files << fid
  end
  self
end
@api internal
@param basis_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the basis group
@param other_path_hash [Hash<String,FileSignature>]
  The file paths and associated signatures for manifestations appearing only in the other group
@return [FileGroupDifference]
  Container for reporting the set of file-level differences of type 'modified'
Source
# File lib/moab/file_group_difference.rb, line 252
def tabulate_renamed_files(matching_signatures, basis_signature_hash, other_signature_hash)
  matching_signatures.each do |signature|
    basis_paths = basis_signature_hash[signature].paths
    other_paths = other_signature_hash[signature].paths
    basis_only_paths = basis_paths - other_paths
    other_only_paths = other_paths - basis_paths
    maxsize = [basis_only_paths.size, other_only_paths.size].max
    (0..maxsize - 1).each do |n| # rubocop:disable Lint/AmbiguousRange
      fid = FileInstanceDifference.new
      fid.basis_path = basis_only_paths[n]
      fid.other_path = other_only_paths[n]
      fid.signatures << signature
      if fid.basis_path.nil?
        fid.change = 'copyadded'
        fid.basis_path = basis_paths[0]
      elsif fid.other_path.nil?
        fid.change = 'copydeleted'
      else
        fid.change = 'renamed'
      end
      subset_hash[fid.change.to_sym].files << fid
    end
  end
  self
end
@api internal
@param matching_signatures [Array<FileSignature>] The file signatures of the file manifestations being compared
@param basis_signature_hash [Hash<FileSignature, FileManifestation>]
  Signature to file path mapping from the file group that is the basis of the comparison
@param other_signature_hash [Hash<FileSignature, FileManifestation>]
  Signature to file path mapping from the file group that is being compared to the basis group
@return [FileGroupDifference]
  Container for reporting the set of file-level differences of type 'renamed', 'copyadded', or 'copydeleted'
Source
# File lib/moab/file_group_difference.rb, line 228
def tabulate_unchanged_files(matching_signatures, basis_signature_hash, other_signature_hash)
  matching_signatures.each do |signature|
    basis_paths = basis_signature_hash[signature].paths
    other_paths = other_signature_hash[signature].paths
    matching_paths = basis_paths & other_paths
    matching_paths.each do |path|
      fid = FileInstanceDifference.new(change: 'identical')
      fid.basis_path = path
      fid.other_path = 'same'
      fid.signatures << signature
      subset_hash[:identical].files << fid
    end
  end
  self
end
@api internal
@param matching_signatures [Array<FileSignature>] The file signatures of the file manifestations being compared
@param basis_signature_hash [Hash<FileSignature, FileManifestation>]
  Signature to file path mapping from the file group that is the basis of the comparison
@param other_signature_hash [Hash<FileSignature, FileManifestation>]
  Signature to file path mapping from the file group that is being compared to the basis group
@return [FileGroupDifference]
  Container for reporting the set of file-level differences of type 'identical'