Function Selector and Argument Encoding

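See the official documentation for details.

Adapted from the author's own blog post, Function Selector and Argument Encoding.

In the Ethereum ecosystem, the ABI (Application Binary Interface) is the standard way to interact with contracts, both from outside the blockchain and between contracts. Data is encoded according to its type, following the rules laid out in the specification.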

Function Selector

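Principle

The first 4 bytes of the Keccak-256 hash of a function's signature identify the function to call (this is the original Keccak, which differs from the finalized NIST SHA3-256). For example, bytes4(keccak256('balanceOf(address)')) == 0x70a08231, so 0x70a08231 is the Function Selector of balanceOf(address).

  • The signature is the function name followed by the parenthesized list of parameter types, separated by single commas with no spaces
  • uint must be written in its canonical form uint256 before hashing; for example, ownerOf(uint256) has Function Selector = bytes4(keccak256('ownerOf(uint256)')) == 0x6352211e
  • When a parameter is a struct, the struct is expanded into its member types, and those types are wrapped in (); see the example below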
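
The three rules above can be sketched in code. The following is a minimal Python sketch, not part of the original write-up: `canonical_type` and `signature` are hypothetical helper names, and a struct is modelled as a (member types, array suffix) pair. It only builds the signature string. The hashing step is left out because Python's standard library has no Keccak-256 (`hashlib.sha3_256` is the finalized NIST SHA-3 and gives a different digest), so a third-party Keccak implementation would be needed for the final bytes4 step.

```python
import re


def canonical_type(t):
    """Return the canonical ABI spelling of a parameter type.

    `t` is either a plain type name such as "uint" or "bytes3[2]", or a
    (members, suffix) pair describing a struct, e.g. (["string", "uint"], "[]").
    """
    if isinstance(t, tuple):
        members, suffix = t
        inner = ",".join(canonical_type(m) for m in members)
        return "(" + inner + ")" + suffix
    # "uint" / "int" are aliases of "uint256" / "int256"; array suffixes are kept.
    return re.sub(r"^(u?int)(?=\[|$)", r"\g<1>256", t.strip())


def signature(name, params):
    """Function name + parenthesized, comma-separated canonical types, no spaces."""
    return name + "(" + ",".join(canonical_type(p) for p in params) + ")"


print(signature("ownerOf", ["uint"]))
print(signature("test6", ["uint", (["string", "string", "uint"], "[]")]))
```

The resulting string is what gets hashed; only the first 4 bytes of the hash are kept as the selector.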

Example

pragma solidity >=0.4.16 <0.9.0;
pragma experimental ABIEncoderV2;

contract Demo {
    struct Test {
        string name;
        string policies;
        uint num;
    }

    uint public x;
    function test1(bytes3) public {x = 1;}
    function test2(bytes3[2] memory) public  { x = 1; }
    function test3(uint32, bool) public  { x = 1; }
    function test4(uint, uint32[] memory, bytes10, bytes memory) public { x = 1; }
    function test5(uint, Test memory test) public { x = 1; }
    function test6(uint, Test[] memory tests) public { x = 1; }
    function test7(uint[][] memory,string[] memory) public { x = 1; }
}

/* Function selectors
{
    "0d2032f1": "test1(bytes3)",
    "2b231dad": "test2(bytes3[2])",
    "92e92919": "test3(uint32,bool)",
    "4d189ce2": "test4(uint256,uint32[],bytes10,bytes)",
    "4ca373dc": "test5(uint256,(string,string,uint256))",
    "ccc5bdd2": "test6(uint256,(string,string,uint256)[])",
    "cc80bc65": "test7(uint256[][],string[])",
    "0c55699c": "x()"
}
*/

Argument Encoding

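Principle

  • Dynamic data, such as dynamic arrays, structs with dynamic members, and variable-length bytes, is encoded as an offset, a length, and the data (a struct has no length word)
    • First, the parameters are stored in order: a fixed-size type stores its data directly, while a dynamic type stores its offset
    • Offsets are assigned in order. For the first dynamic parameter, offset = 0x20 * number (number is the number of function parameters, assuming each one takes a single 32-byte slot). For the next dynamic parameter, offset = offset_of_prev + 0x20 + 0x20 * number (the first 0x20 is the slot holding the previous dynamic value's length, and number is the number of 32-byte slots its data occupies, which equals the element count for an array of fixed-size elements)
    • Once all offsets are stored, each dynamic value is visited in order and its length and data are stored
    • (Note: for a struct, treat its members as the parameters of a new function. The first dynamic member of the struct then has offset = 0x20 * num, where num is the number of struct members)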
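
The offset rule above can be illustrated with a minimal Python sketch. This is an illustration under simplifying assumptions, not a full ABI encoder: `encode_flat` is a hypothetical name, every argument is assumed to take exactly one head slot, and only three kinds of value are modelled (an int as a one-word static value, `bytes` as dynamic bytes, and a list of ints as a dynamic array of one-word elements).

```python
WORD = 32


def word(n):
    """One 32-byte big-endian word holding the integer n."""
    return n.to_bytes(WORD, "big")


def pad_right(b):
    """Right-pad raw bytes with zeros up to a multiple of 32 bytes."""
    return b + b"\x00" * (-len(b) % WORD)


def encode_flat(args):
    """Encode a flat argument list (no selector).

    int         -> static: its word goes straight into the head
    bytes       -> dynamic: offset in the head, length + padded data in the tail
    list of int -> dynamic: offset in the head, count + one word per item in the tail
    """
    head_size = WORD * len(args)  # one head slot per argument in this sketch
    head, tail = b"", b""
    for a in args:
        if isinstance(a, int):
            head += word(a)
            continue
        # Offsets are measured from the start of the argument block.
        head += word(head_size + len(tail))
        if isinstance(a, bytes):
            tail += word(len(a)) + pad_right(a)
        else:
            tail += word(len(a)) + b"".join(word(x) for x in a)
    return head + tail


encoded = encode_flat([7, [1, 2], b"hi"])
for i in range(0, len(encoded), WORD):
    print(i // WORD, "-", "0x" + encoded[i:i + WORD].hex())
```

With three arguments the first dynamic offset is 0x20 * 3 = 0x60, and the second is 0x60 + 0x20 + 0x20 * 2 = 0xc0, which is what the formula above predicts.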

Example

For the 7 functions of the contract above, the final encoded calls are as follows

  • test1("0x112233")
0x0d2032f1                                                             // function selector
0 - 0x1122330000000000000000000000000000000000000000000000000000000000 // data of first parameter
  • test2(["0x112233","0x445566"])
0x2b231dad                                                             // function selector
0 - 0x1122330000000000000000000000000000000000000000000000000000000000 // first data of first parameter
1 - 0x4455660000000000000000000000000000000000000000000000000000000000 // second data of first parameter
  • test3(0x123,1)
0x92e92919                                                             // function selector
0 - 0x0000000000000000000000000000000000000000000000000000000000000123 // data of first parameter
1 - 0x0000000000000000000000000000000000000000000000000000000000000001 // data of second parameter
  • test4(0x123,["0x11221122","0x33443344"],"0x31323334353637383930","0x3132333435")
0x4d189ce2                                                             // function selector
0 - 0x0000000000000000000000000000000000000000000000000000000000000123 // data of first parameter
1 - 0x0000000000000000000000000000000000000000000000000000000000000080 // offset of second parameter
2 - 0x3132333435363738393000000000000000000000000000000000000000000000 // data of third parameter
3 - 0x00000000000000000000000000000000000000000000000000000000000000e0 // offset of fourth parameter
4 - 0x0000000000000000000000000000000000000000000000000000000000000002 // length of second parameter
5 - 0x0000000000000000000000000000000000000000000000000000000011221122 // first data of second parameter
6 - 0x0000000000000000000000000000000000000000000000000000000033443344 // second data of second parameter
7 - 0x0000000000000000000000000000000000000000000000000000000000000005 // length of fourth parameter
8 - 0x3132333435000000000000000000000000000000000000000000000000000000 // data of fourth parameter

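/* Some explanation
data of first parameter: uint is a fixed-size type, so its data is stored directly
offset of second parameter: uint32[] is a dynamic array, so its offset is stored first, offset=0x20*4 (4 is the number of function parameters)
data of third parameter: bytes10 is a fixed-size type, so its data is stored directly
offset of fourth parameter: bytes is a variable-length type, so its offset is stored first, offset=0x80+0x20*3=0xe0 (0x80 is the offset of the previous dynamic value, 3 is the number of slots that value uses for its length and its two elements)
length of second parameter: once every data or offset slot has been stored, the length and data of each dynamic value follow; this is the length of the second parameter
first data of second parameter: the first element of the second parameter
second data of second parameter: the second element of the second parameter
length of fourth parameter: the second parameter is now complete, so this is the length of the next dynamic value
data of fourth parameter: the data of the fourth parameter
*/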
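
When reading dumps like the one above, it can help to split the calldata into 32-byte words and follow each offset to the slot it points at. The following Python sketch does that for the test4 call; `split_calldata` and `word_at` are hypothetical helper names, and the calldata is rebuilt from the values listed above rather than captured from a real transaction.

```python
def split_calldata(calldata_hex):
    """Split hex calldata into (selector, list of 32-byte words as hex strings)."""
    if calldata_hex.startswith("0x"):
        calldata_hex = calldata_hex[2:]
    raw = bytes.fromhex(calldata_hex)
    selector, body = raw[:4], raw[4:]
    if len(body) % 32:
        raise ValueError("argument data is not a whole number of 32-byte words")
    words = [body[i:i + 32].hex() for i in range(0, len(body), 32)]
    return selector.hex(), words


def word_at(words, offset):
    """Return the word located `offset` bytes after the start of the arguments."""
    return words[offset // 32]


# Calldata of the test4 call, assembled from the words shown above.
TEST4 = "0x4d189ce2" + "".join([
    "%064x" % 0x123,                        # 0: first parameter
    "%064x" % 0x80,                         # 1: offset of second parameter
    "31323334353637383930".ljust(64, "0"),  # 2: third parameter (bytes10)
    "%064x" % 0xe0,                         # 3: offset of fourth parameter
    "%064x" % 2,                            # 4: length of second parameter
    "%064x" % 0x11221122,                   # 5
    "%064x" % 0x33443344,                   # 6
    "%064x" % 5,                            # 7: length of fourth parameter
    "3132333435".ljust(64, "0"),            # 8
])

selector, words = split_calldata(TEST4)
for i, w in enumerate(words):
    print(i, "-", "0x" + w)

# Following an offset lands on the length slot of the matching dynamic value.
print(int(word_at(words, int(words[1], 16)), 16))  # length of the uint32[] -> 2
print(int(word_at(words, int(words[3], 16)), 16))  # length of the bytes    -> 5
```

Offset 0x80 divided by 0x20 gives slot 4, and 0xe0 gives slot 7, matching the slot numbers in the listing.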
  • test5(0x123,["cxy","pika",123])
0x4ca373dc                                                             // function selector
0 - 0x0000000000000000000000000000000000000000000000000000000000000123 // data of first parameter
1 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of second parameter
2 - 0x0000000000000000000000000000000000000000000000000000000000000060 // first data offset of second parameter
3 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // second data offset of second parameter
4 - 0x000000000000000000000000000000000000000000000000000000000000007b // third data of second parameter
5 - 0x0000000000000000000000000000000000000000000000000000000000000003 // first data length of second parameter
6 - 0x6378790000000000000000000000000000000000000000000000000000000000 // first data of second parameter
7 - 0x0000000000000000000000000000000000000000000000000000000000000004 // second data length of second parameter
8 - 0x70696b6100000000000000000000000000000000000000000000000000000000 // second data of second parameter

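/* Some explanation
data of first parameter: uint is a fixed-size type, so its data is stored directly
offset of second parameter: a struct, so its offset is stored first, offset=0x20*2 (2 is the number of function parameters)
first data offset of second parameter: the struct's members can be treated like function parameters; there are three members, and since the first one is a string its offset is stored first, offset=0x20*3=0x60
second data offset of second parameter: the second member of the struct is a string, so its offset is stored first, offset=0x60+0x20+0x20=0xa0 (the first 0x20 is the slot holding the first string's length, the second 0x20 is the slot holding the first string's data)
third data of second parameter: the third member of the struct is a uint, a fixed-size type, so its data is stored directly
first data length of second parameter: the length of the struct's first member
first data of second parameter: the data of the struct's first member
second data length of second parameter: the length of the struct's second member
second data of second parameter: the data of the struct's second member
*/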
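
The "treat the struct's members as the parameters of a new function" idea is naturally recursive. The following Python sketch is one way to express it, under stated assumptions: the names `encode` and `encode_sequence` are hypothetical, a Python int stands for uint256, a str for string, a tuple for a struct and a list for a dynamic array. Fixed-size bytesN and fixed-size arrays are not modelled.

```python
WORD = 32


def word(n):
    """One 32-byte big-endian word holding the integer n."""
    return n.to_bytes(WORD, "big")


def is_dynamic(v):
    """str and list are dynamic; a tuple (struct) is dynamic if any member is."""
    if isinstance(v, (str, list)):
        return True
    if isinstance(v, tuple):
        return any(is_dynamic(m) for m in v)
    return False  # int


def encode(v):
    """Encode a single value without its own offset."""
    if isinstance(v, int):
        return word(v)
    if isinstance(v, str):
        data = v.encode("utf-8")
        return word(len(data)) + data + b"\x00" * (-len(data) % WORD)
    if isinstance(v, list):
        return word(len(v)) + encode_sequence(v)  # count, then the elements
    return encode_sequence(v)  # struct: same layout as an argument list


def encode_sequence(items):
    """Head/tail layout shared by argument lists, structs and array bodies."""
    heads = [None if is_dynamic(i) else encode(i) for i in items]
    head_size = sum(WORD if h is None else len(h) for h in heads)
    head, tail = b"", b""
    for item, h in zip(items, heads):
        if h is None:
            head += word(head_size + len(tail))  # offset relative to this sequence
            tail += encode(item)
        else:
            head += h
    return head + tail


# Arguments of test5(0x123, ["cxy", "pika", 123]), without the selector.
args = encode_sequence([0x123, ("cxy", "pika", 123)])
for i in range(0, len(args), WORD):
    print(i // WORD, "-", "0x" + args[i:i + WORD].hex())
```

Because offsets are always relative to the sequence being encoded, the struct's inner offsets (0x60, 0xa0) do not depend on where the struct ends up in the outer call. Passing a list of tuples as the second argument applies the same layout to an array of structs.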
  • test6(0x123,[["cxy1","pika1",123], ["cxy2","pika2",456]])
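Because this is an array of structs, it has to be broken down from the inside out. There are two structs inside; look at the encoding of each in turn

The struct ["cxy1","pika1",123] is encoded as follows (exactly as if its members were function parameters)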
0 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy1"
1 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika1"
2 - 0x000000000000000000000000000000000000000000000000000000000000007b // encoding of 123
3 - 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy1"
4 - 0x6378793100000000000000000000000000000000000000000000000000000000 // encoding of "cxy1"
5 - 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika1"
6 - 0x70696b6131000000000000000000000000000000000000000000000000000000 // encoding of "pika1"

The struct ["cxy2","pika2",456] is encoded as follows (exactly as if its members were function parameters)
0 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy2"
1 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika2"
2 - 0x00000000000000000000000000000000000000000000000000000000000001c8 // encoding of 456
3 - 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy2"
4 - 0x6378793200000000000000000000000000000000000000000000000000000000 // encoding of "cxy2"
5 - 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika2"
6 - 0x70696b6132000000000000000000000000000000000000000000000000000000 // encoding of "pika2"

Because both structs are dynamic, the offset of ["cxy1","pika1",123] and the offset of ["cxy2","pika2",456] are also needed, as follows
0 - a                                                                  // offset of ["cxy1","pika1",123]
1 - b                                                                  // offset of ["cxy2","pika2",456]
2 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy1"
3 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika1"
4 - 0x000000000000000000000000000000000000000000000000000000000000007b // encoding of 123
5 - 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy1"
6 - 0x6378793100000000000000000000000000000000000000000000000000000000 // encoding of "cxy1"
7 - 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika1"
8 - 0x70696b6131000000000000000000000000000000000000000000000000000000 // encoding of "pika1"
9 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy2"
10- 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika2"
11- 0x00000000000000000000000000000000000000000000000000000000000001c8 // encoding of 456
12- 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy2"
13- 0x6378793200000000000000000000000000000000000000000000000000000000 // encoding of "cxy2"
14- 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika2"
15- 0x70696b6132000000000000000000000000000000000000000000000000000000 // encoding of "pika2"
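a points to the slot "offset of "cxy1"", so a=0x20*2=0x40
b points to the slot "offset of "cxy2"", so b=0x20*9=0x120

Because the structs sit inside an array, the whole value is encoded the way a dynamic array is, as follows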
0 - c                                                                  // offset of [["cxy1","pika1",123],["cxy2","pika2",456]]
1 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count of second parameter
2 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of ["cxy1","pika1",123]
3 - 0x0000000000000000000000000000000000000000000000000000000000000120 // offset of ["cxy2","pika2",456]
4 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy1"
5 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika1"
6 - 0x000000000000000000000000000000000000000000000000000000000000007b // encoding of 123
7 - 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy1"
8 - 0x6378793100000000000000000000000000000000000000000000000000000000 // encoding of "cxy1"
9 - 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika1"
10- 0x70696b6131000000000000000000000000000000000000000000000000000000 // encoding of "pika1"
11- 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy2"
12- 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika2"
13- 0x00000000000000000000000000000000000000000000000000000000000001c8 // encoding of 456
14- 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy2"
15- 0x6378793200000000000000000000000000000000000000000000000000000000 // encoding of "cxy2"
16- 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika2"
17- 0x70696b6132000000000000000000000000000000000000000000000000000000 // encoding of "pika2"
c belongs to the second function parameter, which is a dynamic type, so offset c = 0x20*2 = 0x40

So the complete encoding is as follows
0xccc5bdd2                                                             // function selector
0 - 0x0000000000000000000000000000000000000000000000000000000000000123 // encoding of 0x123
1 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of second parameter
2 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count of second parameter
3 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of ["cxy1","pika1",123]
4 - 0x0000000000000000000000000000000000000000000000000000000000000120 // offset of ["cxy2","pika2",456]
5 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy1"
6 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika1"
7 - 0x000000000000000000000000000000000000000000000000000000000000007b // encoding of 123
8 - 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy1"
9 - 0x6378793100000000000000000000000000000000000000000000000000000000 // encoding of "cxy1"
10- 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika1"
11- 0x70696b6131000000000000000000000000000000000000000000000000000000 // encoding of "pika1"
12- 0x0000000000000000000000000000000000000000000000000000000000000060 // offset of "cxy2"
13- 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of "pika2"
14- 0x00000000000000000000000000000000000000000000000000000000000001c8 // encoding of 456
15- 0x0000000000000000000000000000000000000000000000000000000000000004 // length of "cxy2"
16- 0x6378793200000000000000000000000000000000000000000000000000000000 // encoding of "cxy2"
17- 0x0000000000000000000000000000000000000000000000000000000000000005 // length of "pika2"
18- 0x70696b6132000000000000000000000000000000000000000000000000000000 // encoding of "pika2"
  • test7([[1,2],[3]],["one","two","three"])
In the same way, break it down from the inside out. First come [1,2] and [3], the two dynamic arrays inside the dynamic array [[1,2],[3]]
0 - a                                                                  // offset of [1,2]
1 - b                                                                  // offset of [3]
2 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [1,2]
3 - 0x0000000000000000000000000000000000000000000000000000000000000001 // encoding of 1
4 - 0x0000000000000000000000000000000000000000000000000000000000000002 // encoding of 2
5 - 0x0000000000000000000000000000000000000000000000000000000000000001 // count for [3]
6 - 0x0000000000000000000000000000000000000000000000000000000000000003 // encoding of 3
a points to the start of [1,2], so a=0x20*2=0x40
b points to the start of [3], so b=0x20*5=0xa0

Next is the encoding of the dynamic array [[1,2],[3]] itself
0 - c                                                                  // offset of [[1,2],[3]]
1 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [[1,2],[3]]
2 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of [1,2]
3 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of [3]
4 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [1,2]
5 - 0x0000000000000000000000000000000000000000000000000000000000000001 // encoding of 1
6 - 0x0000000000000000000000000000000000000000000000000000000000000002 // encoding of 2
7 - 0x0000000000000000000000000000000000000000000000000000000000000001 // count for [3]
8 - 0x0000000000000000000000000000000000000000000000000000000000000003 // encoding of 3
c points to the start of [[1,2],[3]], so c=0x20*2=0x40

Next is the encoding of each string in the dynamic array ["one","two","three"]
0 - d                                                                  // offset for "one"
1 - e                                                                  // offset for "two"
2 - f                                                                  // offset for "three"
3 - 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "one"
4 - 0x6f6e650000000000000000000000000000000000000000000000000000000000 // encoding of "one"
5 - 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "two"
6 - 0x74776f0000000000000000000000000000000000000000000000000000000000 // encoding of "two"
7 - 0x0000000000000000000000000000000000000000000000000000000000000005 // count for "three"
8 - 0x7468726565000000000000000000000000000000000000000000000000000000 // encoding of "three"
d points to the start of "one", so d=0x20*3=0x60
e points to the start of "two", so e=0x20*5=0xa0
f points to the start of "three", so f=0x20*7=0xe0

Next is the encoding of the dynamic array ["one","two","three"] itself
0 - g                                                                  // offset of ["one","two","three"]
1 - 0x0000000000000000000000000000000000000000000000000000000000000003 // count for ["one","two","three"]
2 - 0x0000000000000000000000000000000000000000000000000000000000000060 // offset for "one"
3 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset for "two"
4 - 0x00000000000000000000000000000000000000000000000000000000000000e0 // offset for "three"
5 - 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "one"
6 - 0x6f6e650000000000000000000000000000000000000000000000000000000000 // encoding of "one"
7 - 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "two"
8 - 0x74776f0000000000000000000000000000000000000000000000000000000000 // encoding of "two"
9 - 0x0000000000000000000000000000000000000000000000000000000000000005 // count for "three"
10- 0x7468726565000000000000000000000000000000000000000000000000000000 // encoding of "three"
g is not computed yet, because it depends on the encoding of the function parameters as a whole

With [[1,2],[3]] and ["one","two","three"] both analyzed, the last step is to encode them together as a whole
0 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of [[1,2],[3]]
1 - g                                                                  // offset of ["one","two","three"]
2 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [[1,2],[3]]
3 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of [1,2]
4 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of [3]
5 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [1,2]
6 - 0x0000000000000000000000000000000000000000000000000000000000000001 // encoding of 1
7 - 0x0000000000000000000000000000000000000000000000000000000000000002 // encoding of 2
8 - 0x0000000000000000000000000000000000000000000000000000000000000001 // count for [3]
9 - 0x0000000000000000000000000000000000000000000000000000000000000003 // encoding of 3
10- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for ["one","two","three"]
11- 0x0000000000000000000000000000000000000000000000000000000000000060 // offset for "one"
12- 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset for "two"
13- 0x00000000000000000000000000000000000000000000000000000000000000e0 // offset for "three"
14- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "one"
15- 0x6f6e650000000000000000000000000000000000000000000000000000000000 // encoding of "one"
16- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "two"
17- 0x74776f0000000000000000000000000000000000000000000000000000000000 // encoding of "two"
18- 0x0000000000000000000000000000000000000000000000000000000000000005 // count for "three"
19- 0x7468726565000000000000000000000000000000000000000000000000000000 // encoding of "three"
g points to the start of the string array, so g=0x20*10=0x140

So the complete selector + encoding is as follows
0xcc80bc65                                                             // function selector
0 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of [[1,2],[3]]
1 - 0x0000000000000000000000000000000000000000000000000000000000000140 // offset of ["one","two","three"]
2 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [[1,2],[3]]
3 - 0x0000000000000000000000000000000000000000000000000000000000000040 // offset of [1,2]
4 - 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset of [3]
5 - 0x0000000000000000000000000000000000000000000000000000000000000002 // count for [1,2]
6 - 0x0000000000000000000000000000000000000000000000000000000000000001 // encoding of 1
7 - 0x0000000000000000000000000000000000000000000000000000000000000002 // encoding of 2
8 - 0x0000000000000000000000000000000000000000000000000000000000000001 // count for [3]
9 - 0x0000000000000000000000000000000000000000000000000000000000000003 // encoding of 3
10- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for ["one","two","three"]
11- 0x0000000000000000000000000000000000000000000000000000000000000060 // offset for "one"
12- 0x00000000000000000000000000000000000000000000000000000000000000a0 // offset for "two"
13- 0x00000000000000000000000000000000000000000000000000000000000000e0 // offset for "three"
14- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "one"
15- 0x6f6e650000000000000000000000000000000000000000000000000000000000 // encoding of "one"
16- 0x0000000000000000000000000000000000000000000000000000000000000003 // count for "two"
17- 0x74776f0000000000000000000000000000000000000000000000000000000000 // encoding of "two"
18- 0x0000000000000000000000000000000000000000000000000000000000000005 // count for "three"
19- 0x7468726565000000000000000000000000000000000000000000000000000000 // encoding of "three"
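
Reading the listing back is a matter of following the offsets in the opposite direction. The Python sketch below decodes the test7 arguments that way. It is only an illustration: the helper names are hypothetical, the argument bytes are rebuilt from the words listed above, and no bounds checking is done, which a decoder for untrusted calldata would need.

```python
WORD = 32


def read_uint(data, pos):
    """Read the 32-byte big-endian integer stored at byte position pos."""
    return int.from_bytes(data[pos:pos + WORD], "big")


def decode_uint_array(data, pos):
    """uint256[] at pos: a count word followed by that many one-word elements."""
    n = read_uint(data, pos)
    return [read_uint(data, pos + WORD * (1 + i)) for i in range(n)]


def decode_string(data, pos):
    """string at pos: a byte-length word followed by right-padded UTF-8 data."""
    n = read_uint(data, pos)
    return data[pos + WORD:pos + WORD + n].decode("utf-8")


def decode_dynamic_array(data, pos, decode_item):
    """T[] with dynamic T: count, then one offset per element, then the elements.

    Element offsets are relative to the first offset slot (pos + 32).
    """
    n = read_uint(data, pos)
    base = pos + WORD
    return [decode_item(data, base + read_uint(data, base + WORD * i)) for i in range(n)]


def decode_test7(args):
    """Decode the arguments (calldata minus selector) of test7(uint256[][],string[])."""
    nested = decode_dynamic_array(args, read_uint(args, 0), decode_uint_array)
    strings = decode_dynamic_array(args, read_uint(args, WORD), decode_string)
    return nested, strings


# The 20 argument words of the test7 call shown above.
ARGS = bytes.fromhex("".join(
    ["%064x" % n for n in (0x40, 0x140, 2, 0x40, 0xa0, 2, 1, 2, 1, 3, 3, 0x60, 0xa0, 0xe0)]
    + ["%064x" % 3, "6f6e65".ljust(64, "0"),
       "%064x" % 3, "74776f".ljust(64, "0"),
       "%064x" % 5, "7468726565".ljust(64, "0")]
))

print(decode_test7(ARGS))  # ([[1, 2], [3]], ['one', 'two', 'three'])
```

The two top-level offsets (0x40 and 0x140) are relative to the start of the arguments, while the inner offsets are relative to the slot right after each array's count word, which is why 0x40 appears at two different nesting levels.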

Challenges

balsn 2020

  • Challenge name: Election

Note

The challenge attachments can be found in the ctf-challenges/blockchain repository.