## 了解数据

```[Event "F/S Return Match"]
[Date "1992.11.04"]
[Round "29"]
[White "Fischer, Robert J."]
[Black "Spassky, Boris V."]
[Result "1/2-1/2"]
(moves from the game follow...)

```

### 建立一个处理管道

shell命令对于数据处理管道很好，因为你免费获得并行。为了证明这一点，在你的终端尝试一个简单的例子。

```sleep 3 | echo "Hello world."
```

```cat *.pgn
```

```cat *.pgn | grep "Result"
```

```cat *.pgn | grep "Result" | sort | uniq -c

```

```cat *.pgn | grep "Result" | awk '{ split(\$0, a, "-"); res = substr(a[1], length(a[1]), 1); if (res == 1) white++; if (res == 0) black++; if (res == 2) draw++;} END { print white+black+draw, white, black, draw }'

```

### 并行化瓶颈

```find . -type f -name '*.pgn' -print0 | xargs -0 -n1 -P4 grep -F "Result" | gawk '{ split(\$0, a, "-"); res = substr(a[1], length(a[1]), 1); if (res == 1) white++; if (res == 0) black++; if (res == 2) draw++;} END { print NR, white, black, draw }'
```

```find . -type f -name '*.pgn' -print0 | xargs -0 -n1 -P4 awk '/Result/ { split(\$0, a, "-"); res = substr(a[1], length(a[1]), 1); if (res == 1) white++; if (res == 0) black++; if (res == 2) draw++;} END { print white+black+draw, white, black, draw }'

```

```time find . -type f -name '*.pgn' -print0 | xargs -0 -n4 -P4 awk '/Result/ { split(\$0, a, "-"); res = substr(a[1], length(a[1]), 1); if (res == 1) white++; if (res == 0) black++; if (res == 2) draw++ } END { print white+black+draw, white, black, draw }' | awk '{games += \$1; white += \$2; black += \$3; draw += \$4; } END { print games, white, black, draw }'

```

```find . -type f -name '*.pgn' -print0 | xargs -0 -n4 -P4 mawk '/Result/ { split(\$0, a, "-"); res = substr(a[1], length(a[1]), 1); if (res == 1) white++; if (res == 0) black++; if (res == 2) draw++ } END { print white+black+draw, white, black, draw }' | mawk '{games += \$1; white += \$2; black += \$3; draw += \$4; } END { print games, white, black, draw }'

```