regex - Creating a keyword-based search in Python


I have a giant CSV file with close to 6k entries. The file looks like this:

PDB ID   NDB ID   Structure Title    Citation Title       Abstract
1et4     1et4     structure of       solution structure   research performed,
                  haemoglobin mrna   of mrna aptamer      structure of mrna obtained
                  aptamer.

My end goal is to display output for a given keyword, like so:

keyword: mrna
PDB ID   NDB ID   Structure Title   Citation Title   Abstract   Location of first hit
                                                                struc/citation/abstract

What would be a good starting point for me? Also, do I have to use something called regex for this?

Disclaimer: this is part of a research project, not school homework.

Pseudocode or a template would be great for me.

You can parse the CSV file and create two data structures, both dictionaries.

One dictionary will contain each line, keyed on the PDB ID. The other dictionary will store sets of PDB IDs, keyed on keywords.

The code below is only an example, because I'm ignoring the headers. You will want to parse the CSV properly...

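    from collections import defaultdict

    entries = {}                 # PDB ID -> full line
    keywords = defaultdict(set)  # keyword -> set of PDB IDs

    with open('my_csv.csv') as f:
        for line in f:
            entries[line.split()[0]] = line  # keying on PDB ID

    with open('my_csv.csv') as f:
        for line in f:
            # every token after the first is treated as a keyword
            for kw in line.split()[1:]:
                keywords[kw].add(line.split()[0])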
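For parsing the CSV properly, a minimal sketch using the standard `csv` module could look like the following. It assumes the file is comma-delimited and has a header row; the column names (`PDB ID`, `Structure Title`, and so on) and the helper name `build_index` are guesses based on the sample in the question, not something the original confirms. Keywords are lowercased and stripped of surrounding punctuation so that "aptamer." and "aptamer" land under the same key.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; adjust to match the real header row.
ID_COLUMN = "PDB ID"
TEXT_COLUMNS = ["Structure Title", "Citation Title", "Abstract"]


def build_index(csv_file):
    """Return (entries, keywords) built from an open CSV file object."""
    entries = {}                 # PDB ID -> full row (as a dict)
    keywords = defaultdict(set)  # lowercase word -> set of PDB IDs
    for row in csv.DictReader(csv_file):
        pdb_id = row[ID_COLUMN]
        entries[pdb_id] = row
        for column in TEXT_COLUMNS:
            for word in row[column].lower().split():
                # Strip surrounding punctuation so "aptamer." matches "aptamer".
                cleaned = word.strip(".,;:()")
                if cleaned:
                    keywords[cleaned].add(pdb_id)
    return entries, keywords


# Tiny in-memory sample standing in for the real file.
sample = io.StringIO(
    "PDB ID,NDB ID,Structure Title,Citation Title,Abstract\n"
    "1et4,1et4,Structure of haemoglobin mRNA aptamer.,"
    "Solution structure of mRNA aptamer,"
    "\"Research performed, structure of mRNA obtained\"\n"
)
entries, keywords = build_index(sample)
print(sorted(keywords["mrna"]))
```

With a real file, the same function could be called as `build_index(f)` inside `with open('my_csv.csv', newline='') as f:`. If the file turns out to be tab-separated, passing `delimiter='\t'` to `csv.DictReader` would be the change to make.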

Once you have these two data structures, it should be trivial to look up a keyword in the keywords dict, iterate over the set, and print out each line for the relevant PDB ID.
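The lookup step described above might be sketched like this, including the "location of first hit" column the question asks for. The function names `first_hit` and `search`, the column names, and the sample data are all hypothetical; the sketch assumes each entry is stored as a dict of column name to text rather than as a raw line.

```python
# Assumed column order for reporting where the keyword first appears.
TEXT_COLUMNS = ["Structure Title", "Citation Title", "Abstract"]


def first_hit(row, keyword):
    """Return the first column containing the keyword as a whole word, or None."""
    for column in TEXT_COLUMNS:
        words = [w.strip(".,;:()") for w in row[column].lower().split()]
        if keyword in words:
            return column
    return None


def search(keyword, entries, keywords):
    """Return (PDB ID, first-hit column) pairs for entries matching the keyword."""
    keyword = keyword.lower()
    results = []
    # .get avoids creating an empty set when keywords is a defaultdict.
    for pdb_id in sorted(keywords.get(keyword, ())):
        results.append((pdb_id, first_hit(entries[pdb_id], keyword)))
    return results


# Hand-built sample data in the shape described above (hypothetical values).
entries = {
    "1et4": {
        "Structure Title": "Structure of haemoglobin aptamer",
        "Citation Title": "Solution structure of mRNA aptamer",
        "Abstract": "Research performed, structure of mRNA obtained",
    }
}
keywords = {"mrna": {"1et4"}, "aptamer": {"1et4"}}

for pdb_id, location in search("mRNA", entries, keywords):
    print(pdb_id, location)
```

Whole-word lookups like this need no regular expressions; plain dictionary access and `str.split` are enough. A regex would only become useful if partial or pattern-based matches were wanted.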

