Retrieve access information of instruction operands

1. Get access info of registers

Now available in the "next" branch on GitHub, Capstone provides a new API named cs_regs_access(). This function retrieves the list of all registers read or modified, whether implicitly or explicitly, by an instruction.


The C sample code below demonstrates how to use cs_regs_access on X86 input.

#include <stdio.h>

#include <capstone/capstone.h>

#define CODE "\x8d\x4c\x32\x08\x01\xd8"

int main(void)
{
  csh handle;
  cs_insn *insn;
  size_t count, j;
  cs_regs regs_read, regs_write;
  uint8_t read_count, write_count, i;
  
  if (cs_open(CS_ARCH_X86, CS_MODE_32, &handle) != CS_ERR_OK)
    return -1;
  
  cs_option(handle, CS_OPT_DETAIL, CS_OPT_ON);
  
  // cs_disasm() expects a const uint8_t* buffer, so cast the string literal.
  count = cs_disasm(handle, (const uint8_t *)CODE, sizeof(CODE)-1, 0x1000, 0, &insn);
  if (count > 0) {
    for (j = 0; j < count; j++) {
      // Print assembly
      printf("%s\t%s\n", insn[j].mnemonic, insn[j].op_str);

      // Print all registers accessed by this instruction.
      if (cs_regs_access(handle, &insn[j],
            regs_read, &read_count,
            regs_write, &write_count) == CS_ERR_OK) {
        if (read_count > 0) {
          printf("\n\tRegisters read:");
          for (i = 0; i < read_count; i++) {
            printf(" %s", cs_reg_name(handle, regs_read[i]));
          }
          printf("\n");
        }

        if (write_count > 0) {
          printf("\n\tRegisters modified:");
          for (i = 0; i < write_count; i++) {
            printf(" %s", cs_reg_name(handle, regs_write[i]));
          }
          printf("\n");
        }
      }
    }

    cs_free(insn, count);
  } else {
    printf("ERROR: Failed to disassemble given code!\n");
  }

  cs_close(&handle);

  return 0;
}


Compiling and running this sample produces the following output.

lea	ecx, [edx + esi + 8]

	Registers read: edx esi
	Registers modified: ecx

add	eax, ebx

	Registers read: eax ebx
	Registers modified: eflags eax


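The important points of the C sample above are as follows. The variables regs_read and regs_write, of type cs_regs, receive the lists of registers read and modified, while read_count and write_count receive the number of entries in each list. The engine must be put in DETAIL mode with cs_option() before disassembling, because access information is only available in that mode. For each instruction, cs_regs_access() fills in the two lists and returns 0 (CS_ERR_OK) on success, and cs_reg_name() converts each register ID into a printable name.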


For readers more familiar with Python, the code below does the same thing as the C sample above.

from capstone import *

CODE = b"\x8d\x4c\x32\x08\x01\xd8"

md = Cs(CS_ARCH_X86, CS_MODE_32)
md.detail = True

for insn in md.disasm(CODE, 0x1000):
	print("%s\t%s" % (insn.mnemonic, insn.op_str))

	(regs_read, regs_write) = insn.regs_access()

	if len(regs_read) > 0:
		print("\n\tRegisters read:", end="")
		for r in regs_read:
			print(" %s" %(insn.reg_name(r)), end="")
		print()

	if len(regs_write) > 0:
		print("\n\tRegisters modified:", end="")
		for r in regs_write:
			print(" %s" %(insn.reg_name(r)), end="")
		print()


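The important points of this Python sample are as follows. Setting md.detail = True turns on DETAIL mode, which is required for access information. The method insn.regs_access() returns a pair of lists, the registers read and the registers modified, and insn.reg_name() converts each register ID into a printable name.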
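
In the output shown earlier, eflags is listed as modified by the add instruction even though it does not appear in the operand string: it is an implicit write. The sketch below shows one way to separate implicit from explicit registers. It assumes the Python binding's insn.regs_read and insn.regs_write properties, which list only the implicitly accessed registers when DETAIL mode is on. The helper name split_implicit is hypothetical and not part of Capstone.

```python
def split_implicit(all_regs, implicit_regs):
    """Split register IDs into (explicit, implicit) lists, preserving order.

    all_regs      -- every register accessed, e.g. one list from insn.regs_access()
    implicit_regs -- implicitly accessed registers, e.g. insn.regs_write
                     (assumed to be available in DETAIL mode)
    """
    implicit_set = set(implicit_regs)
    explicit = [r for r in all_regs if r not in implicit_set]
    implicit = [r for r in all_regs if r in implicit_set]
    return explicit, implicit
```

Inside the disassembly loop of the Python sample, this could be called as split_implicit(regs_write, insn.regs_write), after which each returned ID can be passed to insn.reg_name() for printing.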


2. Get access info of operands

For instruction operands, in addition to information such as size and type, we can now retrieve access information. This is possible thanks to the new field cs_x86_op.access in x86.h.


With the help of cs_x86_op.access, we can find out how each instruction operand is accessed, as shown below.

lea	ecx, [edx + esi + 8]
	Number of operands: 2
		operands[0].type: REG = ecx
		operands[0].access: WRITE

add	eax, ebx
	Number of operands: 2
		operands[0].type: REG = eax
		operands[0].access: READ | WRITE

		operands[1].type: REG = ebx
		operands[1].access: READ


Note that the LEA instruction does not actually access its second operand, hence this operand is ignored.
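
As a minimal sketch of how such a listing might be produced from Python, the code below reads the access attribute of each operand and renders it as text. It assumes the Python binding exposes the field as op.access on the entries of insn.operands, and that the access bits are CS_AC_READ and CS_AC_WRITE as in capstone.h. The helper names format_access and dump_operand_access are hypothetical.

```python
# Values mirror CS_AC_READ / CS_AC_WRITE from capstone.h; they are redefined
# here so the formatting helper can be used without importing capstone.
CS_AC_READ = 1 << 0
CS_AC_WRITE = 1 << 1


def format_access(access):
    """Render an operand access bitmask as text, e.g. 'READ | WRITE'."""
    parts = []
    if access & CS_AC_READ:
        parts.append("READ")
    if access & CS_AC_WRITE:
        parts.append("WRITE")
    return " | ".join(parts) if parts else "NONE"


def dump_operand_access(code, address=0x1000):
    """Print the access mode of every operand (requires the capstone module)."""
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32

    md = Cs(CS_ARCH_X86, CS_MODE_32)
    md.detail = True  # operand details are only filled in DETAIL mode
    for insn in md.disasm(code, address):
        print("%s\t%s" % (insn.mnemonic, insn.op_str))
        print("\tNumber of operands: %u" % len(insn.operands))
        for idx, op in enumerate(insn.operands):
            if not op.access:
                # Skip operands that are neither read nor written, which the
                # text above describes for the second operand of LEA.
                continue
            print("\t\toperands[%u].access: %s" % (idx, format_access(op.access)))
```

Calling dump_operand_access(b"\x8d\x4c\x32\x08\x01\xd8") runs it on the same input bytes as the earlier samples.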


3. Status register update

Arithmetic instructions may update status flags. On X86, these flags live in the EFLAGS register. Capstone not only tells you that EFLAGS is modified, but can also provide details on the individual bits inside EFLAGS, such as the CF, ZF, OF and SF flags.

On X86, this information is available in the field cs_x86.eflags, which is a bitwise OR of X86_EFLAGS_* values. Again, this requires the engine to be configured in DETAIL mode.
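
Because the field is a bitwise OR, it can be decoded by testing each X86_EFLAGS_* mask in turn. The sketch below does this from Python. It assumes the binding exposes the field as insn.eflags and the constants through the capstone.x86_const module. The helper names decode_flags and capstone_eflags_table are hypothetical.

```python
def decode_flags(value, flag_table):
    """Return the names in flag_table whose bits are all set in value."""
    return [
        name
        for name, mask in flag_table.items()
        if mask and (value & mask) == mask
    ]


def capstone_eflags_table():
    """Collect the X86_EFLAGS_* constants exported by the capstone binding."""
    from capstone import x86_const  # needs the capstone Python module

    return {
        name: mask
        for name, mask in vars(x86_const).items()
        if name.startswith("X86_EFLAGS_") and isinstance(mask, int)
    }
```

With an instruction insn disassembled in DETAIL mode, decode_flags(insn.eflags, capstone_eflags_table()) would return the names of the flag updates recorded for that instruction.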


See the screenshot below for what this feature can provide.


4. More examples

The full sample showing how to retrieve operand access information can be found in the source of test_x86.c or test_x86.py.