Python - Counting number of unique items from a dictionary


My program reads in a large log file and searches it for the IP address and the time (whatever is in brackets).

    5.63.145.71 - - [30/jun/2013:08:04:46 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"
    5.63.145.71 - - [30/jun/2013:08:04:49 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"
    5.63.145.71 - - [30/jun/2013:08:04:51 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"

I want to read the whole file and summarize the entries as follows:

    num 3 ip 5.63.145.71 time [30/jun/2013:08:04:46 -0500]

That is: number of entries, IP, time and date.

What I have so far:

    import re

    x = open("logssss.txt")
    dic = {}

    for line in x:
        m = re.search(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b", line).group().split()
        c = re.search(r"\[(.+)\]", line).group().split()
        for i in range(len(m)):
            try:
                dic[m[i]] += 1
            except:
                dic[m[i]] = 1
            k = dic.keys()

    for i in range(len(k)):
        print dic[k[i]], k[i]

The above code displays correctly now! Thanks.

6 199.21.99.83

1 5.63.145.71

Edit: How about adding c to the output now? The timestamps are going to differ, obviously, but is getting just one of the values, on the same line, possible?

Move the print statement outside of the main loop:

    import re

    x = open("logssss.txt")
    dic = {}

    for line in x:
        m = re.search(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b", line).group().split()
        c = re.search(r"\[(.+)\]", line).group().split()
        for i in range(len(m)):
            try:
                dic[m[i]] += 1
            except KeyError:  # first time this IP is seen
                dic[m[i]] = 1

    # Print once, after the whole file has been read
    for k, v in dic.iteritems():  # or items() if Python 3.x
        print k, v  # print(k, v) if Python 3.x

As a tip, take advantage of Python's Counter class to replace the try/except block:

    from collections import Counter

    dic = Counter()
    for line in x:
        m = re.search(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b", line).group().split()
        c = re.search(r"\[(.+)\]", line).group().split()
        for i in range(len(m)):
            dic[m[i]] += 1  # missing keys count as 0, so no try/except is needed

    for k, v in dic.iteritems():  # or items() if Python 3.x
        print k, v  # print(k, v) if Python 3.x
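For anyone on Python 3, the Counter idea can be tried without a log file. The following is only a sketch: `sample_lines`, `IP_RE` and `count_ips` are names made up for this illustration, the data is the three sample lines from the question, and skipping lines that contain no IP is an assumption the code above does not make.

```python
import re
from collections import Counter

# Same IPv4-looking pattern as in the answer above.
IP_RE = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")

# Hypothetical stand-in for the log file: the sample lines from the question.
sample_lines = [
    '5.63.145.71 - - [30/jun/2013:08:04:46 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
    '5.63.145.71 - - [30/jun/2013:08:04:49 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
    '5.63.145.71 - - [30/jun/2013:08:04:51 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
]


def count_ips(lines):
    """Count lines per IP, using the first IP-like token found on each line."""
    counts = Counter()
    for line in lines:
        match = IP_RE.search(line)
        if match is None:
            continue  # assumption: silently skip lines without an IP
        counts[match.group()] += 1
    return counts


counts = count_ips(sample_lines)
# most_common() yields (ip, count) pairs, highest count first.
for ip, n in counts.most_common():
    print(n, ip)  # prints: 3 5.63.145.71
```

Passing an open file object instead of `sample_lines` should behave the same way, since both are iterated line by line.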

From your comment: use a dictionary of lists. The count for each IP address can then be extracted as the length of its list:

    dic = {}
    for line in x:
        m = re.search(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b", line).group().split()
        c = re.search(r"\[(.+)\]", line).group().split()
        for i in range(len(m)):
            # Collect every timestamp seen for this IP
            dic.setdefault(m[i], []).append(c)

    for k, v in dic.iteritems():  # or items() if Python 3.x
        print k, len(v), v  # print(k, len(v), v) if Python 3.x
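To get the one-line summary asked for in the question's edit, something along these lines may work on Python 3. It is a sketch under a few assumptions: the first timestamp seen for each IP is the one to show, the output format simply mimics the question's example, and `summarize`, `sample_lines`, `IP_RE` and `TIME_RE` are hypothetical names. The timestamp pattern is a non-greedy variant of the one above, so a second pair of brackets later on a line would not be swallowed.

```python
import re

IP_RE = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")
# Non-greedy variant of the answer's pattern: stops at the first "]".
TIME_RE = re.compile(r"\[(.+?)\]")

# Hypothetical stand-in for the log file: the sample lines from the question.
sample_lines = [
    '5.63.145.71 - - [30/jun/2013:08:04:46 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
    '5.63.145.71 - - [30/jun/2013:08:04:49 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
    '5.63.145.71 - - [30/jun/2013:08:04:51 -0500] "head / http/1.1" 200 - "-" "checks.panopta.com"',
]


def summarize(lines):
    """Return (count, ip, first_timestamp) for each IP, in first-seen order."""
    seen = {}
    for line in lines:
        ip = IP_RE.search(line)
        ts = TIME_RE.search(line)
        if ip is None or ts is None:
            continue  # assumption: skip lines missing either field
        # group() keeps the surrounding brackets, as in the question's example
        seen.setdefault(ip.group(), []).append(ts.group())
    return [(len(times), ip, times[0]) for ip, times in seen.items()]


for n, ip, first in summarize(sample_lines):
    print("num", n, "ip", ip, "time", first)
# prints: num 3 ip 5.63.145.71 time [30/jun/2013:08:04:46 -0500]
```

Swapping `times[0]` for `times[-1]` would show the most recent timestamp instead of the first.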
