Two list comparison in Python -
i have 2 lists:
list1=[[['b', 10], 1], [['c', 15], 1], [['f', 30], 1]] list2=[[['g', 20], 2], [['d', 25], 1]]
now [['b', 10], 1]]
in list1
matches [['d', 25], 1]]
in list2
because have equal second element ([1]
, [1]
, first match).
i want ['b', 10]
, ['d', 25]
modified , [['c', 15], 1]
, [['f', 30], 1]
deleted list1
[['g', 20], 2]
added list2
.
can me here? i've tried sets , iteration of lists , comparing both doesn't work.
note: not homework, wondering how list works.
you can try this:
list1=[[['b', 10], 1], [['c', 15], 1], [['f', 30], 1]] list2=[[['g', 20], 2], [['d', 25], 1]] list2_iter = iter(list2) item1 in list1: in_list2 = false item2 in list2_iter: if item1[1] == item2[1]: print "match" if item1 == item2 else "modified", item1, item2 in_list2 = true break else: print "inserted", item2 if not in_list2: print "deleted", item1
note inner loop list2
using iterator, not start beginning of list2
each time, last element stopped in previous iteration of outer loop. else rather straightforward.
output:
inserted [['g', 20], 2] modified [['b', 10], 1] [['d', 25], 1] deleted [['c', 15], 1] deleted [['f', 30], 1]
note not find best match, 1 pass through both, list1
, list2
. more complete algorithm, take @ diff , longest common subsequence.
update: realized how problem in fact virtually same finding minimum edit distance of 2 strings, happened have code lying around. after generalization:
import operator def diff(s1, s2, match=operator.eq, neutral="*", subst=2): """s1, s2: 2 strings or lists match match: function determine whether 2 elements match neutral: 'neutral' element padding subst: substitution costs """ s1, s2 = neutral + s1, neutral + s2 # calculate edit distance / sequence match dp = [[0] * len(s2) in range(len(s1))] in range(len(s1)): k in range(len(s2)): if min(i, k) == 0: a[i][k] = max(i, k) else: diag = 0 if match(s1[i], s2[k]) else subst a[i][k] = min(a[i-1][k-1] + diag, a[i ][k-1] + 1, a[i-1][k ] + 1) # reconstruct path path, i, k = [], len(s1)-1, len(s2)-1 while or k: if a[i][k] == a[i-1][k] + 1: path, = [-1] + path, i-1 elif a[i][k] == a[i][k-1] + 1: path, k = [1] + path, k-1 else: path, i, k = [0] + path, i-1, k-1 return a[len(s1)-1][len(s2)-1], path def print_match(list1, list2, path): i1, i2 = iter(list1), iter(list2) p in path: if p == -1: print "del %20r" % next(i1) if p == 0: print "eq %20r %20r" % (next(i1), next(i2)) if p == +1: print "ins %41r" % next(i2) # strings word1, word2 = "intention", "execution" x, path = diff(word1, word2) print_match(word1, word2, path) # lists of lists list1 = [[['b', 10], 1], [['c', 15], 1], [['f', 30], 1]] list2 = [[['g', 20], 2], [['d', 25], 1]] x, path = diff(list1, list2, match=lambda x, y: x[1] == y[1], neutral=[[none, -1]]) print_match(list1, list2, path)
note resulting match can still different expect, since there many equally optimal ways match elements, should optimal nonetheless.
Comments
Post a Comment