Difference between revisions of "Locally optimal optimisation of code fragments"

Revision as of 16:36, 6 September 2013

If we assume a code fragment of 4 instructions of 32 bit each and the two input registers to be 32 bit then we have: 2^32⁴ * 2³² * 2³² possible combinations of code and data. This number is so large that there is no hope of testing all possible combinations to verify that two different code fragments give the same result for all possible input. The search space is practically infinite.

Boolean logic is different from arithmetic in that every bit is computed completely independently from the others. So it makes no difference if we use the full 32 bit or just one bit, changing the value of one bit will never affect the others. That means we now have 2^32⁴ * 2¹ * 2¹ possible combinations of code and data, a huge improvement but still impossibly large. So it is clear that using brute force alone will not work.

Since we now have reduced the input to two bits then there are just 2² = 4 possible inputs.

since the result of each input is either 0 or 1 then all possible outputs from the function fits in 4 bits and we can write down the result of the function in a separate column and we have what is called a truth table. For the Eor function (T = A EOR B) it would be:

The truth column T contains omly 4 bits so we have exactly 2⁴ = 16 possible boolean functions with two inputs. So out of all those trillions of code fragments there are only 16 unique functions and from that it follows that it must exist 16 unique code fragments that that have no shorter equivalents.

If we can find those 16 optimal code fragments and find a mapping between all possible code fragments and those optimal code fragments we can use a small look up table to optimise all code fragments.

The truth table is that mapping, it tells exactly what the code does without any influence by how it is done. SO top optimise a function we generate the truth table for it and use that as the index into a table that holds the shortest code fragment for that truth table.

Generating the truth table

As an example we use a function with 4 inputs. To generate the truth table we simply feed the function with all possibl einputs and record the output. With 4 inputs we have 2⁴ = 16 different combinations so we run the function 16 times to get the complete truth table.

Generate the truth table in parallel

If we turn the truth table by 90 degrees we can do all 16 iterations in one operation, making the process more than 16 times as fast.

 A 0000000011111111
 B 0000111100001111
 C 0011001100110011
 D 0101010101010101
 T 1111010011110000

Discovering the optimal code fragments

Since the search space is too large to even imagine we have to be efficient and search where it is most likely we will find the code fragments.

To find the shortest code we start the search with the shortest code fragments and make them longer and longer, that way the first one we find for a given truth table is the shortest or that there is no shorter ones.

v1 |v2 |v3 |v4 | ¬((¬v4||v1||v3)&&v2)

1 | 1 | 1 | 1 | 0
1 | 1 | 1 | 0 | 0
1 | 1 | 0 | 1 | 0
1 | 1 | 0 | 0 | 0
1 | 0 | 1 | 1 | 1
1 | 0 | 1 | 0 | 1
1 | 0 | 0 | 1 | 1
1 | 0 | 0 | 0 | 1
0 | 1 | 1 | 1 | 0
0 | 1 | 1 | 0 | 0
0 | 1 | 0 | 1 | 1
0 | 1 | 0 | 0 | 0
0 | 0 | 1 | 1 | 1
0 | 0 | 1 | 0 | 1
0 | 0 | 0 | 1 | 1
0 | 0 | 0 | 0 | 1

@@ Line 21: / Line 21: @@
 If we can find those 16 optimal code fragments and find a mapping between all possible code fragments and those optimal code fragments we can use a small look up table to optimise all code fragments.
-The truth table is that mapping, it tells exaclty what the code does without any influence by how it is done.
+The truth table is that mapping, it tells exactly what the code does without any influence by how it is done. SO top optimise a function we generate the truth table for it and use that as the index into a table that holds the shortest code fragment for that truth table.
-1111010011110000  B C A D not or or and not
 ==Generating the truth table==
- T = (not ((((not D) or A) or C) and B))
+As an example we use a function with 4 inputs. To generate the truth table we simply feed the function with all possibl einputs and record the output. With 4 inputs we have 2<sup>4</sup> = 16 different combinations so we run the function 16 times to get the complete truth table.
   '''A B C D  T'''
 0 0 0  1
@@ Line 48: / Line 44: @@
 ===Generate the truth table in parallel===
+If we turn the truth table by 90 degrees we can do all 16 iterations in one operation, making the process more than 16 times as fast.
    '''A''' 0000000011111111
    '''B''' 0000111100001111

Difference between revisions of "Locally optimal optimisation of code fragments"

Revision as of 16:36, 6 September 2013

Generating the truth table

Generate the truth table in parallel

Discovering the optimal code fragments

Navigation menu

Views

Personal tools

Contents

Science Zero

Search

Tools