Learning Python for Forensics

Understanding scripting flow logic

Flow control logic allows us to create dynamic scripts by specifying different routes of program execution based on a series of conditions. Any script worth its salt needs some manner of flow control. For example, flow logic is required for a script that returns different results based on options selected by the user. Unaccounted-for possibilities are a common cause of errors and are another reason to test your code thoroughly. Dynamic code is important in forensics because we sometimes encounter fragments of data, such as those within slack or unallocated space, whose incomplete nature may interfere with our scripts or cause errors in them. In Python, there are two basic sets of flow logic: conditionals and loops.

Flow logic is frequently accompanied by flow operators. These operators can be strung together to create more complicated logic. The table below is a "truth table" that illustrates the value of various flow operators based on the Boolean state of the variables "A" and "B".

The logical AND and OR operators are the third and fourth columns in the table. Both A and B must be True for the AND operator to return True. Only one of the variables needs to be True for the OR operator to return True. The not operator simply switches the Boolean value of a variable to its opposite (for example, True becomes False and vice versa).
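As a minimal sketch, the same truth table can be generated with a few lines of Python. The variable names (rows, a, b) are chosen here for illustration, and print() is called with a single string so the snippet behaves the same under Python 2 and Python 3.

```python
# Build every combination of two Boolean variables, A and B.
rows = []
for a in (True, False):
    for b in (True, False):
        # Each row holds: A, B, A and B, A or B, not A
        rows.append((a, b, a and b, a or b, not a))

print('A      B      A and B  A or B  not A')
for row in rows:
    # !s converts each Boolean to its string form before padding.
    print('{0!s:<6} {1!s:<6} {2!s:<8} {3!s:<7} {4!s}'.format(*row))
```

Each row lists A, B, A and B, A or B, and not A, so the AND and OR results sit in the third and fourth columns, as described above.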

Mastering conditionals and loops will take our scripts to another level. At its core, flow logic relies on only two values, True or False. As noted earlier, in Python these are represented by the Boolean values True and False.

Conditionals

When a script hits a conditional, it's much like standing at a fork in the road. Depending on some factor, say a more promising horizon, you may decide to go east rather than west. Computer logic is less arbitrary: if something is true, the script proceeds one way; if it is false, it goes another. These junctions are critical. If the program strays from the path we've developed for it, we'll be in serious trouble. There are three statements used to form a conditional block: if, elif, and else.

The conditional block refers to the conditional statements, their flow logic, and code. A conditional block starts with an if statement followed by flow logic, a colon, and indented line(s) of code. If the flow logic evaluates to True, then the indented code below the if statement will be executed. If it does not evaluate to True the Python Virtual Machine (PVM) will skip those lines of code and go to the next line on the same level of indentation as the if statement. This is usually a corresponding elif (else-if) or else statement.

Indentation is very important in Python. It is used to demarcate the code to be executed within a conditional statement or loop. A standard of 4 spaces per indentation level is used in this book, though you may encounter code that uses 2 spaces or tab characters. While all three of these practices are allowed in Python, 4 spaces are preferred and easier to read.

In a conditional block, once one of the statements evaluates to True, its code is executed and the PVM exits the block without evaluating the remaining statements. Please review Appendix B, Python Technical Details, for a description of the Python Virtual Machine.

# Conditional Block Pseudocode
if [logic]:
    # Line(s) of indented code to execute if logic evaluates to True.
elif [logic]:
    # Line(s) of indented code to execute if the 'if' statement is false and this logic is True.
else:
    # Line(s) of indented code to catch all other possibilities if the 'if' and 'elif' statements are all False.

Until we define functions, we will stick to simple if statement examples.

>>> a = 5
>>> b = 22
>>>
>>> a > 0
True
>>> a > b
False
>>> if a > 0:
...     print str(a) + ' is greater than zero!'
...
5 is greater than zero!
>>> if a > b:
...     print str(a) + ' beats ' + str(b)
...
>>>

Notice that when the flow logic evaluates to True, the code indented below the if statement is executed. When it evaluates to False, the code is skipped. Typically, when the if statement is False, a secondary statement such as an elif or else catches other possibilities, such as when "a" is less than or equal to "b". However, it is important to note that an if statement can be used on its own, without any elif or else statements.

The difference between if and elif is subtle. We can only functionally notice a difference when we use multiple if statements. The elif statement allows for a second condition to be evaluated in the case that the first isn't successful. A second if statement will be evaluated regardless of the outcome of the first if statement.
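The difference can be sketched as follows. The list names (independent, chained) are chosen here for illustration and simply record which branches ran; print() takes a single argument so the snippet behaves the same under Python 2 and Python 3.

```python
a = 5
independent = []
chained = []

# Two separate if statements: both conditions are evaluated.
if a > 0:
    independent.append('a > 0')
if a > 3:
    independent.append('a > 3')

# An if/elif pair: the elif is skipped because the if already succeeded.
if a > 0:
    chained.append('a > 0')
elif a > 3:
    chained.append('a > 3')

print(independent)  # ['a > 0', 'a > 3']
print(chained)      # ['a > 0']
```

Both conditions are true for a = 5, yet only the first branch of the if/elif pair runs.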

The else statement does not require any flow logic and can be treated as a catch-all for any remaining or unaccounted-for cases. This does not mean, however, that errors will not occur when the code in the else statement is executed. Do not rely on else statements to handle errors.
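As a forensics-flavored sketch, the block below labels the first bytes of a file by signature. The variable names and the sample header are made up for illustration; the JPEG and PNG signatures are the standard ones.

```python
# Hypothetical sample: four null bytes, such as a fragment from slack space.
header = b'\x00\x00\x00\x00'

if header.startswith(b'\xff\xd8\xff'):
    label = 'JPEG'
elif header.startswith(b'\x89PNG\r\n\x1a\n'):
    label = 'PNG'
else:
    # Catch-all: anything unrecognized, including empty or truncated data.
    label = 'unknown'

print(label)  # unknown
```

The else branch only catches values that fail the earlier tests. If header were not a byte string at all, the first startswith() call would raise an exception before the else was ever reached.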

Conditional statements can be made more comprehensive by using the logical "and" and "or" operators. These allow for more complex logic in a single conditional statement.

>>> a = 5
>>> b = 22
>>> 
>>> if a > 4 and a < b:
...     print 'Both statements must be true to print this'
...
Both statements must be true to print this
>>> if a > 10 or a < b:
...     print 'One of these statements must be true to print this'
...
One of these statements must be true to print this

The following table can be helpful to understand how common operators work.

Loops

Loops provide another method of flow control and are suited to iterative tasks. A loop repeats the code it contains until the provided condition is no longer True or an exit signal is given. There are two kinds of loops: for and while. For most iterative tasks, a for loop will be the best option.

For

For loops are the most common and, in most cases, the preferred way to perform a task over and over again. Imagine a factory line: for each object on the conveyor belt, a for loop could perform some task on it, such as placing a label on the object. In this manner, multiple for loops can come together like an assembly line, processing each object until it is ready to be presented to the user.

Much like the rest of Python, the for loop is syntactically simple and yet powerful. In some languages, a for loop needs to be initialized, have a counter of sorts, and a termination case. Python's for loop is much more dynamic and handles these tasks on its own. These loops contain indented code that is executed line by line. If the object being iterated over still has elements (that is, more items to process) at the end of the indented block, the PVM will position itself at the beginning of the loop and repeat the code again.

The for loop syntax will specify the object to iterate over and what to call each of the elements within the object. Note, the object must be iterable. For example, strings and lists are iterable, but an integer is not. In the example below, we can see how a for loop treats strings and lists and helps us iterate over each element in iterable objects.

>>> for character in 'Python':
...     print character
...
P
y
t
h
o
n
>>> cars = ['Volkswagen', 'Audi', 'BMW']
>>> for car in cars:
...     print car
...
Volkswagen
Audi
BMW
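The requirement that the object be iterable can be checked with a small sketch. The try and except statements used here trap the TypeError that Python raises for a non-iterable object; the names candidates and outcomes are illustrative only.

```python
candidates = ['abc', [1, 2, 3], 42]
outcomes = []

for candidate in candidates:
    try:
        for element in candidate:
            pass  # Iterating works; nothing else to do here.
        outcomes.append('iterable')
    except TypeError:
        # Raised when Python cannot iterate over the object.
        outcomes.append('not iterable')

print(outcomes)  # ['iterable', 'iterable', 'not iterable']
```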

There are additional, more advanced ways to use a for loop. The enumerate() function pairs each element with an index, which comes in handy when you need to keep track of the position of the current object. The index is incremented on each pass through the loop: the first object has an index of 0, the second has an index of 1, and so on. The range() and xrange() functions can execute a loop a certain number of times and provide an index. The difference between them is somewhat subtle: in Python 2, range() builds the entire list of numbers in memory before the loop starts, whereas xrange() produces each number only as it is needed, which makes it quicker and more memory efficient for large ranges.

>>> numbers = [5, 25, 35]
>>> for i, x in enumerate(numbers):
...     print 'Item', i, 'from the list is:', x
...
Item 0 from the list is: 5
Item 1 from the list is: 25
Item 2 from the list is: 35
>>> for x in xrange(0, 100): # prints 0 to 99 (not shown below in an effort to save trees)
...     print x
...
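Two further variations, sketched under a few assumptions: the file names are hypothetical, enumerate() is given its optional start value, and range() is used with a step. range() is chosen over xrange() here only because xrange() exists solely in Python 2, so this version also runs under Python 3.

```python
evidence = ['disk.E01', 'memory.raw', 'notes.txt']  # hypothetical file names

# enumerate() accepts an optional start value for the index.
labels = []
for number, name in enumerate(evidence, start=1):
    labels.append('Item {0}: {1}'.format(number, name))

# range(start, stop, step): stop is excluded, so this yields 0, 2, 4, 6, 8.
evens = []
for x in range(0, 10, 2):
    evens.append(x)

print(labels)
print(evens)  # [0, 2, 4, 6, 8]
```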

While

While loops are not encountered as frequently as for loops in Python. A while loop executes as long as a statement is true. The simplest while loop would be a while True statement. This kind of loop would execute forever, as the Boolean object True is always True, and so the indented code would execute continually.

If you are not careful, you can inadvertently create an infinite loop, which will wreak havoc on your script's intended functionality. It is imperative to use conditionals, such as if, elif, and else statements, to cover all your bases. If you fail to do so, your script can enter an unaccounted-for situation and crash. This is not to say that while loops are not worth using. They are quite powerful and have their own place in Python.
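A minimal sketch of a while loop that ends on its own is shown below. The condition is re-checked before every pass, so the body must change something the condition depends on; otherwise the loop never ends. The names remaining and passes are illustrative.

```python
remaining = 5
passes = 0

while remaining > 0:
    remaining -= 1  # Progress toward making the condition False.
    passes += 1

print('Loop finished after ' + str(passes) + ' passes.')
```

Removing the line that decrements remaining would leave the condition permanently True, which is exactly the kind of infinite loop described above.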

>>> guess = 0
>>> answer = 42
>>> while True:
...     if guess == answer:
...         print 'You\'ve found the answer to this loop: ' + str(answer) + '.'
...         break
...     else:
...         print guess, 'is not the answer.'
...         guess += 1

The break, continue, and pass statements are used in conjunction with for and while loops to create more dynamic loops. The break statement exits the current loop, while the continue statement causes the PVM to jump back to the beginning of the loop, skipping any indented code after the continue. The pass statement literally does nothing and acts as a placeholder. If you're feeling brave or bored, or worse, both, remove the break statement from the while True example above and note what happens.
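The three statements can be seen side by side in a short sketch. The list of numbers and the name processed are arbitrary choices for illustration.

```python
processed = []

for number in [1, 2, 3, 4, 5, 6]:
    if number == 2:
        pass  # Placeholder: does nothing, execution carries on below.
    if number == 3:
        continue  # Skip the rest of this pass and move on to 4.
    if number == 5:
        break  # Leave the loop entirely; 5 and 6 are never processed.
    processed.append(number)

print(processed)  # [1, 2, 4]
```

The value 2 is still appended because pass has no effect, 3 is skipped by continue, and break stops the loop before 5 or 6 are reached.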