Sunday, February 4, 2018

The collections python module .

This module named collections implements some nice data structures which will help you to solve various real life problems.
Let's start to see the content of this python module:

C:\Users\catafest>cd C:\Python27\

Python 2.7 (r27:82525, Jul  4 2010, 07:43:08) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import collections
>>> from collections import *
>>> dir(collections)
['Callable', 'Container', 'Counter', 'Hashable', 'ItemsView', 'Iterable', 'Iterator', 'KeysView',
 'Mapping', 'MappingView', 'MutableMapping', 'MutableSequence', 'MutableSet', 'OrderedDict', 'Sequence',
 'Set', 'Sized', 'ValuesView', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__'
, '_abcoll', '_chain', '_eq', '_heapq', '_ifilter', '_imap', '_iskeyword', '_itemgetter', '_repeat', 
'_starmap', '_sys', 'defaultdict', 'deque', 'namedtuple']
Now I will tell you about some
First is Counter and is a dict subclass which helps to count hashable objects.
The elements are stored as dictionary keys and counts are stored as values which can be zero or negative.
Next is defaultdict and is a dictionary object which provides all methods provided by dictionary.
This takes first argument (default_factory) as default data type for the dictionary.
The namedtuple helps to have meaning of each position in a tuple.
This allow us to code with better readability and self-documenting code.
Let's try some examples:
>>> from collections import Counter
>>> from collections import defaultdict
>>> from collections import namedtuple
>>> import re
>>> path = 'C:/yara_reg_rundll32.txt'
>>> output = re.findall('\w+', open(path).read().lower())
>>> Counter(output).most_common(5)
[('a', 2), ('nocase', 2), ('javascript', 2), ('b', 2), ('rundll32', 2)]
>>> d = defaultdict(list)
>>> colors = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> for k, v in colors:
...     d[k].append(v)
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
>>> Vertex = namedtuple('vertex', ['x', 'y'])
>>> v = Vertex(5,y = 9)
>>> v
vertex(x=5, y=9)
>>> v.x*v.y
>>> v[0]
>>> v[0]+v[1]
>>> x,y = v
>>> v
vertex(x=5, y=9)
>>> x
>>> y
The content of the yara_reg_rundll32.txt file is:
rule poweliks_rundll32_exe_javascript
description = "detect Poweliks' autorun rundll32.exe javascript:..."
$a = "rundll32.exe" nocase
$b = "javascript" nocase
$a and $b

I used vertex variables into my example because can be used with Blender 3D.
You can see many examples at official documentation website.