2018-10-05

ななめに移動してほしい

その他

グリッド上の移動経路を求めるためのアルゴリズムには、ダイクストラとかA*とかがある。けど、これらは45°単位で移動方向を制限する。もうちょっとななめに効率的に移動してほしい。

f:id:YaaMaa:20181004194134p:plain — 45°制限のある移動と、ななめの効率的な移動

f:id:YaaMaa:20181004194145p:plain — 45°制限のある移動と、ななめの効率的な移動

ここでやりたいことは、any-angle path planningと呼ぶらしい。

JavaScriptでの経路の話ならPathFInding.jsがあるけど、ここにはまだany-angle path planning用のアルゴリズムは入ってないし（プルリクはあるけど未マージ）、セルごとのweight設定もできなさそうなので（forkはあるらしい）、今は使わない。

Theta*

Any-angle path planningの一つ。この記事がわかりやすい。A*をベースとしていて、A*での「セルの親はneighboring cellでなければならない」というルールをなくすことによっていろんな角度に移動できるようにしている。

以下は、Theta*を実装しようとして考えたことのメモ。

Neighboring cells

セルのneighborとして巡回するセルは、4方向（North, East, West, South）にするべきか、8方向（4方向 + Northeast, Northwest, Southeast, Southwest）にするべきか。

初め、「どうせTheta*だったら子を親の親とまでくっつけようとするんだから、4方向を登録しとくだけで、最終的にななめに移動できるのでは？」と思っていた。だいたいの場合はそうなんだけど、weightが絡んでくると頭が悪い経路ができてしまう。

f:id:YaaMaa:20181005005438p:plain — グレーのはブロックされてるセル

この経路は、本来ならweightが1000もあるセルは避けて(0, 0) -> (1, 1) -> (2, 0)とななめ移動するべき。それができないのは、(1, 1) -> (2, 2)への経路を作るためには、

(1, 1)を親とする(1, 0)をpriority queueに入れる
(1, 0)がpriority queueから取り出されて、(2, 0)に行ってゴール

という手順を踏まないといけないけど、当然(1, 0)の親としては(0, 0)の方が優秀（コストが低い）なのでそっちが優先で保存されてしまい、理想的なななめの経路は残らなくなってしまうから。やっぱりちゃんと8方向分をneighboring cellとしてチェックしないといけない。

Line of sight

離れているセルに移動するときは、その移動パスが通るセル全てが通行可能である（ブロックされていない）ことを確かめないといけない。直線が通るセルをチェックする方法は、直線をラスターディスプレイに表示するときにどこの画素を塗りつぶすかっていう話に似たようなものになる。

上記のTheta*の記事で書かれているpseudocodeは、移動体はvertexからvertexへと移動することを前提としているので、セルからセルに移動させたい場合はそのままでは使えない。なので、二つのセルの中心を線で繋ぐような方法を探した。

Bresenham's Line Drawing Algorithm

グラフィック分野での古典的なアルゴリズムらしい。これなら、セルの中心から別のセルの中心へのline of sightチェックができそう。

問題は、例えばΔx > Δyのとき、x軸に沿って一列に1セルしか取り上げられないこと（xとyが逆でも同じ）。すると、直線が微妙に通ってるところにブロックされたセルがあってもすり抜けられてしまう。

f:id:YaaMaa:20181005010222p:plain

ピンクのセルがこのアルゴリズムでピックアップされるセルで、濃いグレーがブロックされてるとこ。(2, 2)にあるセルは本当は通り抜けできないんだけど、line of sightチェックでx = 2のとき(2, 1)しか取り上げられないため、(2, 2)は無視されている。

Supercover

直線が通る全てのセルが欲しいので、上記のアルゴリズムをそのままは使えない。探してたら、Bresenham-based supercover line algorithmってやつが見つかった。やりたいことは私と全く同じだけど、前のerrorを保持しなければいけなかったり複雑なので、自分で書くことにした。

Bresenhamさんのやつはセルの中央でのerrorによってどちらのセルに描画するかを分けている。私はセルの端っこでのerrorから次どこに行くかを判断したい。そこでのerrorは

< 0.5: secondary axisの座標はそのまま
= 0.5: secondary axisの座標を進めるけど、行った先で描画はしない
> 0.5: secondary axisの座標を進め、行った先で描画する

(secondary axisは、xとyのうち移動量が少ない方の軸)

という3つのケースがある。

座標を進める場合（ケース2と3）のみ、errorの基準（error = 0の場所）が一セル分進むのでerror--する。

この方法で、さっきみたいなすり抜けは発生しなくなった。

f:id:YaaMaa:20181005010943p:plain — セル(0, 0), (1, 1)はケース1、(2, 1)はケース2、(1, 0)はケース3にあたる

Diagonal movements

対角線上の移動を許可するかについては、道の片方だけブロックされてるときは許可して、両方ブロックされてたら許可しないというふうにした。これは好みの問題なので、ブロックされたセルのきわきわを通るのは嫌なときもあるかもしれないけど。

f:id:YaaMaa:20181005012430p:plain — 左は片方ブロックされているだけなので移動可、右は両方なので通れない。

f:id:YaaMaa:20181004200816p:plain — 左は片方ブロックされているだけなので移動可、右は両方なので通れない。

両側ブロックされてるときに対角線上の移動を却下するには、

Line of sightチェックで、セルの端っこでのerror = 0.5のとき、両側のセルの少なくともどちらか一方が空いている場合のみline of sightがあると判定する
セルのneighborリストのイテレーションのときに、そのneighborがななめ方向にあるとき、両側のブロックが空いている場合のみそのneighborを考慮する

という方法が考えられる。上の方法だと、親の親と子を繋げる経路（Path 2って呼ばれてるやつ）にしか適用されなくて、A*でも存在する普通のneighborへの経路（Path 1）がいけるかどうか判断するときには個別で条件分岐をつけないといけなくなるので、下の方法の方が便利。

Gの計算

Gというのは、A*・Theta*での

F (最小化するやつ) = G (cumulative cost) + H (heuristic distance to the goal)

っていう計算に使うGの話。ここでは、

G of child = G of parent
        + parentとchild間の距離
        + parentとchild間にある全てのセルのweightの合計

にすることにした。

Weightとは、そのセルを通るためにかかるコスト。たとえば、さらさらなセル（歩きやすい）とねちゃねちゃのセル（歩きにくい）があったとき、ねちゃねちゃのセルの方が大きなweightを持つ。

ここで、parentとchild間にある全てのセルっていうのをどうやって決めるか考えないといけなかった。

さっき書いたsupercoverのセルのweightを全部足すと、以下のような2×3のグリッド（セルは均一のweight 10を持っている）で、右のようなスムーズな移動よりも左のかくかくした移動の方が有利になる。

f:id:YaaMaa:20181005004445p:plain — 左: weightの合計は10 * 3, 右: weightの合計は10 * 4

f:id:YaaMaa:20181005004456p:plain — 左: weightの合計は10 * 3, 右: weightの合計は10 * 4

けど、普通に考えたら右のほうが選ばれるべき。

なので、Gにweightを加算するセルたちは、supercoverではない方のBresenham's algorithmで求められるやつだけにすることにした。

それはそれでany-angle pathの方が有利になりすぎることもあるけど、どうするのがいいのかよくわからない。アルゴリズム名が検索しづらいせいか、あんまり似たような使用例を見つけられなかった。

ためす

とりあえず上記の方法で実装したやつの動作を試してみるために、小さいツールを作った。進路妨害するの楽しい。

Optimization

Lazy Theta*は、line-of-sightのチェックを本当に必要なときにしかしないことで効率化してくれる。

2018-09-12

はてなインターン2018に参加した

その他

2018年の8月中旬から9月初旬にかけて、はてなのサマーインターンに参加させてもらった。

developer.hatenastaff.com

めちゃくちゃ良かったので、雰囲気が伝わるように記録したい。

応募

5月くらいから深刻に「インターンしないといけない」と考え始めた。まず自分が行きたい業界がわからんし、インターン生募集サイトに行っても「アットホームな環境で社員と親密な関係に！」みたいなやつは怖いし、かといって「学生にも手加減なし！厳しい環境で圧倒的成長を！」みたいなやつも怖いしで、ずっと鬱々としていた。

就活サイトの口コミを見ていると精神を病みそうだったので、個人のインターン体験記を読みたいなーと思って「インターン　感想　はてなブログ」で検索した*1ら、過去のはてなインターン生の感想ブログ記事がいっぱい出てきた。読んでみると異常に楽しそうだったので、ダメ元で応募することにした。

まず応募フォームを提出して、その後面接を受けた。私はアピールできるほどの技術力がないので、応募フォームではちょっとでも目に止まるようにプライベートで作ったものを並べて、面接では「学ぶことが好きです」とアピールしようとした。そしたら受け入れてもらえることになった。

前半過程: 講義

前半2週間は講義パートで、webアプリを作るにあたって必要な知識を網羅的に教えてくれる。今年はサーバーサイドの言語がGoにリニューアルされた上、GraphQLなども新たに導入されて、制作物の雛形なども全部それに合わせて作られていたので、このインターンにものすごく力を入れて準備してくださってるんだなぁと実感してありがたい気持ちになった。

メンターのid:miki_beneさんが、私が生み出したアホみたいなバグの原因を解明したり、課題に直接関係ないところでもちょくちょく効率化の方法や便利機能を教えてくださったおかげで、あまり残業することなく過ごすことができた。でも、私の体力がなさすぎたせいで終業後の飲み会などのイベントにあまり参加できず、ちょっと勿体なかった気がする。

後半過程: チーム配属

フロントエンド寄りの知識しかなかったことと、多くのユーザーを持つサービスに携われたら楽しそうという気持ちから、コンテンツプラットフォームコースを希望した。希望通り、Giga Viewer*2を開発するマンガチームに配属してもらえた。何にでも「優勝」や「最高便利」という接頭辞をつけるid:mizdraさんというインターン生と同じチームだった。メンターと見紛うほどの勢いで私にいろいろ教えてくれて、心から感謝しています。

マンガチームの中では主にid:hitode909さん、id:onkさんにメンタリング&コードレビューをしていただきながら開発した。本業やミーティングで忙しそうなのに、Slackで質問したらすごい速さで答えをくれるし、丁寧かつ的確なコードレビューをしてくれるし、木魚を叩けばすぐに助けに来てくれるなど、全面的にサポートしてくださった。

2週間で、id:mizdraさんの「お気に入り機能みたいなものがほしい」というアイデアから更新通知機能を作り、私の「前後のエピソードとかにシームレスに移動したい」という希望が発端でviewer画面のSPA化を行った（どちらもとなりのヤングジャンプのみで利用可能）。開発は最高に楽しかったし、作った機能はめっちゃいいと確信しているが、リリースがかなりバタバタしてしまった。最終日ぎりぎりに出せたものの、その場しのぎの実装をいくつか残してしまうし、最終日の懇親会中に東京オフィスの方々にデバッグさせてしまうしで、反省している。あと一週間くらい続けたかった。

f:id:YaaMaa:20180910221611p:plain — 作った更新通知機能

f:id:YaaMaa:20180910221909p:plain — 作った更新通知機能

チームごとの社内投票の結果、マンガチームは優勝は逃したけど2位になることができて、驚いたし本当に嬉しかった。

人

他のインターン生・社員さんともに、皆とんでもなく頭が良いけど、決して人間性を捨てているわけではなくて、むしろすごく気遣いができる親切な人しかいなかった。インターン生で一人だけ女だから大変そう、と慮って飲みに誘ってくれたり、「進路悩んでるみたいだけど、エンジニアとしてでも生きていけると思うから自信持って」と励ましてくれたり、優しさがすごすぎて感動した。

普段エンジニアや理系の人と関わる機会が少ないので、今回のインターンでそういう人たちの文化に触れたり他人と比べて自分に何が足りないのかを認識できたりしたことは、貴重な経験となった。

ただ、現社員さんたちの就活体験談は、「バイトしてたら誘われてそのまま入社」とか「サークルの先輩から続く伝統に則ってインターン→入社」みたいなのばっかりで、全然参考にならなかった。

ホテル

手配していただいたホテルは、快適で素晴らしいホテルだった。空調を設定したら秒で反応してくれるし、部屋の電気のスイッチは寝たまま手が届く位置にあるし、トイレは迅速かつ力強く流れるし、非の打ちどころがない。

ただ、シャワーの出し方が結構難しかった。湯の出口としてシャワーヘッドと蛇口があって、初め「出口を選択→湯を出す」のが正解かと思って10分くらい頑張ったけどできなくて、結局「湯をどこからでもいいから出す→出口を切り替える」という方法で成功した。この方法、「ペットボトル飲料を飲むときにボトルを逆さまにしてからキャップを開ける」くらいの違和感がある。

あと、一人部屋なのに枕が二つある。調子に乗って二つ重ねて使ったら、頭部の傾斜が急になったことで奥歯に大きな力が加わり、朝起きたら顎の側面がめっちゃ腫れていた。一日中インターンに集中できないほど痛かったので、これから泊まられる方は気をつけてほしい。

感謝

私の進度に合わせて丁寧に教えてくださったメンターの方々、いろいろ調整してインターン生がやりたいことを最大限やらせてくださったマンガチームの皆さん、そして、インターンの企画運営に関わった全ての方、本当にありがとうございました！

他のインターン生の記事

存在に気づいたら順次貼ります。

e-ntyo.hatenablog.com

blog.meteors.me

tomoyaf.hatenablog.com

turtar-fms.hatenablog.com

noahorberg.hatenablog.com

*1:普通のテンションで詳らかに書いてあるレポート系記事ははてなブログに多いっていうイメージがあったから

*2:Web上でマンガviewer機能や管理機能を提供しているサービス。となりのヤングジャンプ、少年ジャンプ+、くらげバンチ、コミックDAYSなどで使われている

2018-06-24

Bitcoin and Cryptocurrency Technologies

その他

Courseraで、Bitcoin and Cryptocurrency Technologiesっていう講義を受けた。みんなめっちゃビットコインって口にするから何なのか知りたかったものの、「ビットコインとは」とか検索しても妙にテンション高い記事しか出てこなくてイラっとしたので、淡々と教えてくれるCourseraで学んだ。

このコース受けるときの注意

私みたいな完全初心者は、week 3ぐらいまで進んでからweek 1の宿題解いた方がいい。講義に一言も「UTXO」とか出てきてないのに「UTXOクラス使ってね」って言われてもよくわからんし。Instructionと実際のコードから何となくわからなくはないけど、それならweek 3まで先にやっちゃう方が効率的。
宿題するにはJava書けないといけない。特にAssignment 2はいろいろ試して試行錯誤する系なので、思いついたことを素早く書けないとめんどくさくなる。

ただ、graderがエラーとかちゃんと見せてくれるので、自分とこで実行環境まで作らなくても大丈夫だった。
Certificateは出ない。

以下は講義ノート。1.1はノート取らなかったので、1.2から。

1.2. Hash Pointers and Data Structures

H(obj): hash of obj

n = the number of leaf nodes (not the number of all the nodes in the tree)

Going down the entire tree takes O(log n) because the height of the tree is log n.

このページがわかりやすい。

1.3. Digital Signatures

2 primitives of cryptocurrency technologies:

Hash function
Digital signature

Rules of digital signatures

Only you can sign, but anyone can verify
A signature is tied to a particular document (cannot be cut&pasted)

API for digital signatures

(sk, pk) = generateKeys(keySize)

sk: secret signing key (used to make a signature)

pk: public verification key (used by anyone to verify)

sig = sign(sk, message)

sig (a string of bits): signature

isValid = verify(pk, message, sig)

Requirements for signature

valid signatures should verify

verify(pk, message, sign(sk, message)) == true
signatures cannot be forged

Consider a game between a challenger and an attacker, where the attacker tries to forge a signature for a particular message. The probability that the attacker wins should be negligible.

Algorithms are randomized -> We need good source of randomness

Limit on message size -> use Hash(message) instead of the raw message

By signing a hash pointer, we can protect all the nodes that the pointer points at.

Bitcoin uses ECDSA standard

ECDSA = Elliptic Curve Digital Signature Algorithm, a digital signature scheme

When using this algorithm, we need to use good randomness; otherwise, we'll leak our private keys

1.4. Public Keys as Identities

public key == an identity

That is, if verify(pk, msg, sig) == true, it's like "pk says [msg]" (= the pk is the actor or the party that states the message).

To make the statement that the pk states, we need the matching sk.

How to make a new identity

create a new, random key pair (sk, pk) by executing generateKeys

pk: the name of that identity. Since the raw pk is big, it's usually better to use Hash(pk)

sk: lets you (the creator of the keys) "speak for" the identity

Decentralized identity management: No central point of control

Rather than making users register in order to use the system, make anyone create identities at anytime.

-> This is the way bitcoin works

In bitcoin jargon, "address" is the name of the identity--either pk or Hash(pk).

Privacy

Addresses are not directly connected to the real-world identity.

But observers can link multiple activities done by the same address over time and make inferences.

1.5. A Simple Cryptocurrency

Simplest possible cryptocurrency: GoofyCoin

Goofy can create new coins whenever he wants
These new coins belong to Goofy

a coin: the result of CreateCoin operation with a uniqueCoinID, signed by pk (= Goofy)

Anyone can verify that the sign is valid and that the sign is put on this statement.
A coin's owner can spend it

spend: to pass the coin to someone else

Goofy makes a new statement saying "Pay this to pk (Alice): H(the original coin object)", which is signed by pk (Goofy)
The recipient can pass on the coin again to another person

Alice: "Pay this to pk (Bob): H(the second coin object)" and sign it herself

Anyone can verify the validity of the coin by following the chain and checking all the signatures along the way.

A security problem

double-spending attack: Alice can make two parallel statement saying that she passes the coin to two different people

ScroogeCoin: designed to solve the double-spending problem

Scrooge publishes a history of all transactions (a block chain, signed by Scrooge)

f:id:YaaMaa:20180624035300p:plain

Anyone can follow the entire structure, which is endorsed by Scrooge, and can notice the attempted double-spending by viewing the transaction history of the coin.

Sometimes, for the sake of optimization, multiple transactions are put into the same block.

Scrooge coin allows two types of transactions:

CreateCoins: can create multiple coins at a time

f:id:YaaMaa:20180624035328p:plain

num: serial number

value: the worth

recipient: pk of the person who becomes the owner of the created coin

PayCoins: can consume multiple coins owned by multiple people at a time

f:id:YaaMaa:20180624035340p:plain

This transaction is valid if:

consumed coins are valid (legitimately created)
consumed coins have not been already consumed
total value out = total value in
signed by every owners of all consumed coins (The same person need to sign multiple times if he has multiple coins in the transaction (judging from the quiz at the end of the video))

Immutable coins

Coins can't be transferred, subdivided, or combined.

But you can get the same effect by consuming your coins and create new coins that have, in total, the same amount of value.

The problem: What if Scrooge starts to misbehave?

-> We need to decentralize (descroogify) this currency

Programming Assignment: Scrooge Coin

week 1で見たときは意味不明だったのに、week 3から戻ってきたらわりと単純に見える！

疑問点

"unordered"ってことは、時系列にすらなってないってこと？"a transaction can reference another in the same block"って書いてあるけど、arrayの中で自分より後のtransactionを参照したりする…？

→そういうの考慮しなくてもpassした。けど、一回目のループでacceptされんかったやつを後でもっかい試してみるとかやった方が安全だったかも。
addInput()とかできるって説明あったけど、そんなことしたらTransaction全体のhashが変わっちゃうから、これのoutputをspendしようとするやつの参照がわからなくなっちゃうのでは？てか自分でaddInput()とかする意味ある？

→Extra creditしない限り別に使いどころなかった。
maximumとmaximalって何が違うん…

→この答えが最高にわかりやすい。

2.1. Centralization vs. decentralization

The problem with ScroogeCoin: it has a central authority

Centralization & decentralization is not all-or-nothing.

e.g. E-mail has decentralized protocol, but is dominated by centralized webmail services.

Aspects of decentralization in Bitcoin:

Who maintains the ledger?
Who has authority over which transactions are valid?
Who creates new bitcoins?
Who determines how the rules of the system change?
How do bitcoins acquire exchange value?

Peer-to-peer network (almost purely decentralized): open to anyone, low barrier to entry

Mining: open to anyone, but inevitable concentration of power is often seen as undesirable

Updates to software (how and when the rules change): core developers trusted by community have great power

2.2. Distributed consensus

Key technical challenge of decentralized e-cash: distributed consensus

Key problem: inconsistent database (where an event is recorded in one database but not in other databases)

Distributed key-value store enables various applications like DNS (domain name server), public key directory (which pairs users' email or something with their public keys) etc.

The requirements of distributed consensus

The protocol terminates and all correct nodes decide on the same value

correct nodes: excludes faulty or malicious nodes
This value (that they agree upon) must have been proposed as input by some correct node

What does it mean in the context of Bitcoin?

When Alice wants to pay Bob, she broadcasts the transaction (which has Alice's signature and transaction content, and the hash to the coin she wants to pay) to all Bitcoin nodes.

Bob's computer doesn't need to receive this transaction. As long as all the nodes received that transaction, the coin is Bob's now.

How consensus could work in Bitcoin:

All nodes have a sequence of blocks of transactions they've reached consensus on We do consensus on block-by-block basis for optimization
Each node has a set of outstanding transactions it's heard about

8:00くらいのとこよくわからんかった。inputとは何なのか。outstanding transactions it's heard aboutとは。

Problems:

Nodes may crash
Nodes may be malicious
Network is imperfect
- Not all pairs of nodes are connected
- Faults in network
- Latency -> we cannot use timestamp

There are some impossibility results.

Some well-known protocols

e.g. Paxos (Never produces inconsistent result. In exchange, it can get stuck and fail to make any progress)

Bitcoin works well because it uses different assumptions than the models that produced impossibility results.

Things that Bitcoin does differently:

Introduces incentives for users to work together (Works only because it's a currency)
Embraces randomness

Consensus do not have specific ending time. Rather, the probability that the transaction has made it into the consensus blockchain rises as time goes, and the probability that you're wrong in making an assumption about the transaction goes down at the same time.

2.3. Consensus without identity: the block chain

Nodes do not have any persistent identity.

Why is it difficult to make consensus without identity?

Pragmatic reason: some protocols need node IDs
Security reason: if nodes have identities, we can assume something like "less than 50% of nodes are malicious"

Why don't Bitcoin nodes have identities?

Identity is hard in a P2P (peer-to-peer) system (We need a central authority to manage identities)

-> cannot detect Sybil attack
Bitcoin aims at making pseudonymity

Users can participate without registering their personal information

Weak assumption: There is a way to pick a random node

How to: Give nodes tokens or something, and later, choose randomly from the published tokens and call the corresponding node.

Adversary that attempts Sybil attack only gets one token, thus not being able to increase his power.

What becomes available by this assumption: implicit consensus

In each round, random node is picked. This node proposes the next block in the chain. Other nodes implicitly accept/reject this block by either extending it or ignoring it and extending chain from the last block in the block chain (= skip the proposed block).

Every block contains hash of the block it extends -> Then the nodes can signal which nodes they are extending

Simplified consensus algorithm

New transaction are broadcast to all nodes

When Alice wants to pay to Bob, she broadcast that transaction to all nodes.
Each node collects new transactions into a block

Nodes are constantly listening to the transactions that yet have not made it to the consensus block chain.
In each round a random node gets to broadcast its block

The chosen node can propose a block that contains all the outstanding transactions it has heard about.
Other nodes accept the block only if all transactions in the block are valid (not double-spending, have valid signatures)

Therefore, blocks proposed by malicious nodes can be rejected.
Nodes express their acceptance of the block by including its hash in the next block they create

じゃあnodeごとに違うblock chainを持つことになってしまわないのか？

→なる。でもそんな感じで動いている。

What can a malicious node do?

Can it simply steal coins that belong to other people?

No. It cannot forge the owner's signature, and transactions that contain invalid signature are rejected by other nodes.
Suppose it hates Bob. Then it can decide not to include transactions originated in Bob's address in the block that proposes. Can it prevent Bob's transactions?

No. Other honest node can propose blocks that contain Bob's transactions in other rounds.
Can it double-spend?

Which of the two blocks end up in the long-term consensus? From the technological point of view, these two transactions are identical.

Policy for honest nodes: Extend the longest valid branch.

Due to network latency, there is not guarantee that the "legitimate" transaction is the first transaction heard by nodes and thus it is proposed first.

If the double-spending block is accepted, then the "legitimate" block gets completely ignored by the network, becoming an orphan block.

How do the merchant (the person who receives money from the double-spending attacker) decide whether the money is legitimately payed?

f:id:YaaMaa:20180624035454p:plain

The more confirmation the block gets, the more likely the block get into the long-term consensus block chain.

Most common heuristic: 6 confirmations is enough to believe that the block ends up in the consensus chain.

2.4. Incentives and proof of work

Are the majority of nodes really honest?

-> We need to give nodes incentives for behaving honestly.

Can we reward nodes that created the blocks that end up in the long-term consensus chain? (The nodes contained in the first branch in the picture above)

-> Use Bitcoin itself!

Incentive 1: Block reward

Creators of each block gets to

include special coin-creation transaction in the block (The value is fixed: currently 25 BTC, halves every 4 years -> total supply ends at 21 million in 2140)
choose recipient address of this transaction (typically itself)

Block creator can "collect" the reward only if the block ends up on long-term consensus branch!

This is the only way BTC is created.

Creatorとはproposerなのか？ →いえす。

Incentive 2: Transaction fees

Creator of transaction can choose to make output value less than input value

Remainder is a transaction fee and goes to block creator

Purely voluntary, like a tip

誰がそんなことするの？

→priorityにもよるけど、十分なtransaction feeがないとblockになかなか入れてもらえない。それに、transactionの目安も制定されてる。

Remaining problems

How to pick a random node?
How to avoid a free-for-all (fight, dispute without no rules) due to rewards?
How to prevent Sybil attacks (Adversary creates a lot of Sybil nodes to manipulate the consensus)?

Solution: Proof of work

Approximate selecting a random node by selecting nodes in proportion to a resource that no one can monopolize (hopefully)

Two kinds of resources:

In proportion to computing power: proof-of-work (used by Bitcoin)
In proportion to ownership of currency: proof-of-stake (used by some of alternative currency)

Equivalent views of proof of work

Select nodes in proportion to computing power
Let nodes compete for right to create block
Make it moderately hard to create new identities

Hash puzzles

To create a block, a node need to find a nonce (a number) such that H(nonce || prev_hash || tx || ... || tx) is smaller than certain threshold. (|| stands for concatenation)

Given that the hash function is secure, only way to succeed is to try many nonces until you get lucky.

output space: all the possible hashed value

target space: hashed value that is smaller than the threshold

target space / output space is very small in order to make the hash puzzle difficult

Nodes have to constantly compete to solve the puzzle.

PoW (Proof of Work) properties:

The problem has to be quite difficult to compute

Only some nodes bother to compete. These nodes are called "miners".
Parameterizable cost

Nodes automatically re-calculate the target value (the size of the target space in the output space) every two weeks

Goal: average time between blocks = 10 mins (good interval for optimization)

Because the overall power of all the nodes increases, the amount of work a single node need to do also increases.

P(Alice wins next block) = fraction of global hash power she controls

For an individual miner, the mean time it takes to find a nonce: 10 mins / fraction of hash power (derived from the graph on the slide)

Key security assumption

Attacks are infeasible if majority of miners weighted by hash power follow the protocol
Trivial (?) to verify

Nonce must be published as part of block

Then other miners can simply verify that the hash is smaller than the target value

2.5. Putting it all together

It costs to invest in hardware to become a successful miner. But the income one gets from creating a block is also high.

mining reward (block reward + tx fees) - hardware and electricity cost = profit (can be negative)

They cannot be simply compared

fixed (hardware) vs. variable costs (electricity)
reward depends on global hash rate
exchange rate of BTC and real money is variable
miners can deploy different strategies

Bitcoin has three types of consensus

Value: how much BTC one has depends on consensus
State
Rules

Bitcoin is bootstrapped

Security of block chain -> value of currency -> health of mining ecosystem -> security ...

What can a "51% attacker" do?

Steal coins from existing address? -> No. They can create a longest branch, but merchants who runs honest nodes do not accept that invalid branch.
Suppress some transactions?
- From the block chain -> Yes. They can just refuse the block containing those transactions.
- From the P2P network -> No. The network still receives those transactions.
Change the block reward? -> No. The attacker needs to change the rule, but doesn't control the copies of the Bitcoin software.
Destroy confidence in Bitcoin? -> Yes. The main practical threat.

3.1. Bitcoin Transactions

Bitcoin consensus gives us:

Append-only ledger
Decentralized consensus
Miners to validate transactions

assuming a currency exists to motivate miners!

How to make that currency?

The account-based ledger system is simple, but requires us to calculate the account balance by going backward to evaluate the validity each time a transaction happens.

f:id:YaaMaa:20180624035703p:plain

Instead, Bitcoin uses a transaction-based ledger, which explicitly specifies the number of inputs and outputs.

f:id:YaaMaa:20180624035712p:plain

In this case, it's easy to prove that the transaction 4 is not valid because the amount of input and output don't match.

Here, each transaction has its own identifier.

The first transaction's input is ∅ because the coins are newly created.

"1[0]" means "output index 0 of transaction 1".

Why does Alice send money back to herself (Change address) in transaction 2?

-> Everyone has to consume all the output from the previous transaction.

We implement this with hash pointers, so we have direct access to the output which the input of the transaction refers to.

What becomes easy in this transaction-based ledger system?

Merging values: You can set two outputs directed to you as the two inputs of the transaction, and direct the total value to one person.
Joint payments: You and someone else can jointly pay to one person in a single transaction. (The transaction requires two separate signatures)

A Bitcoin transaction data structure

{
    "hash": "abcdefg...",
    "ver": 1,
    "vin_sz": 2,
    "vout_sz": 1,
    "lock_time": 0,
    "size": 404,
    "in": [
        {
            "prev_out": {
                "hash": "qwerty...",
                "n": 0
            },
            "scriptSig": "asdfgh..."
        },
        { one more input }
    ],
    "out": [
        {
            "value": 10.00,
            "scriptPubKey": "OP_DUP OP_HASH160 ..."
        }
    ]
}

The first 6 properties are metadata, followed by the series of inputs and outputs.

hash: the hash of this entire transaction
housekeeping information
- ver (version number, which is currently 1)
- vin_sz (number of inputs)
- vout_sz (number of outputs)
housekeeping: size

We'll come back to "lock_time" property later.

In the input data:

n: the index of the output that is consumed
scriptSig: signature of the owner of that output

In the output data:

scriptPubKey: the recipient address?? script?? We'll come back to this in the next lecture!

3.2. Bitcoin Scripts

The output scriptPubKey is actually a script, not just an address. The input address is also a script. To verify the transaction, the concatenated script must execute completely with no errors.

scriptSig

30440220...
0467d2c9...

scriptPubKey

OP_DUP
OP_HASH160
69e02e18...
OP_EQUALVERIFY
OP_CHECKSIG

Bitcoin's scripting language

Built specifically for Bitcoin (inspired by Forth)
Simple, compact
Has support for cryptography (like calculating hash, verifying signature)
Stack-based (Last In First Out like a stack of books)
Limits on time/memory
No looping (everything is linearly executed) -> No possibility of infinite loop

Reverse-polishとは何か、そしてその特徴がどのようにスクリプトの書き方に影響してるかについては、The Best Step-by-Step Bitcoin Script Guide Part 1のHow Does the Script Work?の項がわかりやすい。OP_EQUALとOP_EQUALVERIFYの違いも載ってる。

The sender specifies the public keys of the recipients, and recipients need to specify the signature to redeem the coins.

Redeemっていうのは、そのコインを実際使うってことでいいのか？コインを受け取るとき（recipientになるとき）にはsignature要らんよね？

What is written in the script?

The first two lines are data instructions which specifies the signature and the public key that is used to generate the signature of the recipient of the input component.

Recipient of the input componentって言うけど、inputsは分かれて別々のrecipientに行くかもしれないんだから、一対一で対応してなくない？それとも、inputは実はprevious outputだから、そのrecipient（今回のsender）ってことなのか？

→それだったらhash of <pubKey>と<pubKeyHash?>が一緒であるべきっていうのが意味わからなくなる。<pubKeyHash?>っていうのは受取人のものだから、明らかに今回のsenderのものじゃないし。

→後で判明するからちょっと待って。

In stack-based language, if you see a data, you push it onto the stack. (No need to declare variables)

<sig>: Signature of the input
<pubKey>: Recipient address
OP_DUP: Duplicate operation, which adds a clone of the top element of the stack
OP_HASH160: Compute the hash of the top element on the stack. Replaces the element with the computed result.
<pubKeyHash?>: Specified by the sender
OP_EQUALVERIFY: Compares <pubKeyHash> to the hash derived from OP_HASH160. If they aren't the same, the script stops executing.
OP_CHECKSIG: Check if <sig> is the signature by <pubKey>. The message is the entire transaction.

<pubKeyHash>: "Hash of the public key that was actually used by the recipient, when trying to claim the coins". でもclaimするのってrecipientがそれを使うとき、つまりもっと後のことじゃないの？受け取るときに必要なの？Senderがこのtransactionを作ってるから、recipientはその作成に介入できないと思ってたんだけど、何でrecipientが情報載せることができてるの？二人が共同で作ってるの？

→上で挙げた疑問も合わせて、この辺の混乱は、Discussion forumのこの質問が述べてくれている。つまり下図みたいなことだよね？やっとわかった。

f:id:YaaMaa:20180624035734p:plain

この辺の捉え方については、The Best Step-by-Step Bitcoin Script Guide Part 1のThe Game of Locking and Unlockingの項がわかりやすかった。

なんでsenderはpublic keyじゃなくてpublic keyのhashを載せてるの？

→Since the raw pk is big, it's usually better to use Hash(pk) (week1より)だからかな？

Bitcoin script instructions

256 opcodes total

Arithmetic
If / then
Logic / data handling
Crypto
- Hash function
- Signature verification
- Multi-signature verification

OP_CHECKMULTISIG

Built-in support for join signatures
Specify n public keys
Specify t (threshold. There must be at least t out of n signatures that are valid for this transaction to be valid)
Verification requires t signatures

There's a bug: Extra data value popped from the stack and ignored

Avoid this bug by putting extra dummy values on the stack

Bitcoin scripts in practice (as of 2014)

Most nodes whitelist known scripts (Refuse to execute scripts that aren't considered to be standard)
99.9% are simple signature checks
~0.01% are MULTISIG
~0.01% are Pay-to-Script-Hash
Remainder are errors, proof-of-burn

Proof-of-burn

It proves that the coins have been destroyed and cannot be spent in any ways.

OP_RETURN
<arbitrary data>

This script throws an error no matter what is written above.

When do people use it?

When they want to put arbitrary data on blockchain
When forcing people to destroy Bitcoin in order to get coins in new system

Pay to Script Hash (P2SH)

Senders can specify the hash to the script they want to use.

Then, the sender's output becomes simple, while the recipient needs to specify the entire redemption script and let the sender know the hash of the script.

Input's scriptSig by the recipient

<signature>
<<pubKey> OP_CHECKSIG>

Output's scriptPubKey by the sender

OP_HASH160
<hash of redemption script>
OP_EQUAL

The first validation: Is the <hash of redemption script> really the hash of the redemption script specified by the recipient?

The second validation: Does the signature correspond to the pubKey?

Redemption scriptはいつ実行されるの？Stackに入ったら自動的に実行されたりするのか？OP_EQUALがhashになったscriptを消しちゃうと思うから、それまでに実行されないといけない？そもそも、<<pubKey> OP_CHECKSIG>がredemption scriptだと思ってるんだけど、合ってるよね？

3.3. Applications of Bitcoin Scripts

Example 1: Escrow transactions

Alice is a buyer, and Bob is a seller. Alice doesn't want to pay until Bob sends the goods to her, and Bob doesn't want to send the goods until Alice pays him.

In this case, they will involve a trusted third-party, Judy.

f:id:YaaMaa:20180624035754p:plain

（上図は私の理解なんだけど、合ってるかどうかわからん。2つtransaction起こるよね？あと、「Multisig address」って書いたところが「escrow address」って言ってる人もいた。1つ目のtransactionの送り先は、2つ目のtransactionのinputで書かれるredemption scriptのhash値？？）

If Alice pays and Bob sends the goods: Alice and Bob sign to send the coin to Bob

If Alice pays but Bob doesn't send the goods: Alice and Judy sign to send the coin back to Alice

If Bob sends the goods but Alice refuses to sign: Bob and Judy sign to send the coin to Bob

Example 2: Green addresses

Alice is a buyer, and Bob is a seller. Bob is either offline or too busy to check the blockchain.

In this case, Alice asks a bank to pay Bob, and bank pays to Bob with its sign. If Bob believes the bank doesn't do double spending, then he can trust the payment and give the goods to Alice without waiting for 6 confirmations.

This trust comes from the real-world trust, not from the Bitcoin system.

This isn't used so much in practice because there were some companies that implemented green addresses and ended up collapsing.

Bobがofflineだったらどうやってtransactionが発行されたかどうか確認するの？

Example 3: Efficient micro-payments

Alice wants to pay Bob a small amount of Bitcoin every minute for the service Bob provides. Alice doesn't want to incur a transaction fee every minute.

How can Alice combine these small payments into one big payment at the end?

f:id:YaaMaa:20180624035808p:plain

What if Bob never signs?

Alice demands a timed refund transaction between process (1) and (2) in the illustration. "Pay 100 to Alice, LOCK until time (signed by both Alice and Bob)"

The lock time is specified as the transaction property "lock_time".

More advanced scripts

Multiplayer lotteries
Hash pre-image challenges
Coin-swapping protocols (-> It becomes difficult to trace who owns which coin)

These contracts are called "Smart contracts".

The contracts that can enforce some features which were traditionally enforced by laws or authority.

3.4. Bitcoin Blocks

Why bundle transactions together?

Single unit of work for miners
Keep the length of hash-chain of blocks shorter

Faster to verify history

Bitcoin block structure is a combination of two different hash-based structures.

Hash chain of blocks
Hash tree (Merkle tree) of transactions in each block

f:id:YaaMaa:20180624035820p:plain

The structure of a block

{
    "hash": "0000...",
    "ver": 2,
    "prev_block": "0001...",
    "time": 1391,
    "bits": 4195,
    "nonce": 4594,
    "mrkl_root": "89776...",

    "n_tx": 354,
    "size": 181520,
    "tx": [
        ...
    ],
    "mrkl_tree": [
        "6bd5eb25...",
        ...
        "89776cdb..."
    ]
}

The first 7 lines are block header, and the rest is transaction data.

1:52あたりのスライドでは"mrkl_root"はheaderに含まれてなかったけど、他のソースによると含まれてるっぽい？

Among those header properties, "hash", "bits", and "nonce" are mining puzzle related information.

Header is the only thing that's hashed during mining.

"mrkl_root"はheaderに含まれているのでhash化される。すると、treeの中の値を一つでも変えるとblock全体のhashが変わるので、改竄できなくなる。

n_txはnumber of transactionsだと思うけど、じゃあsizeって何？

→実際の例を見たら1.496KBとか書いてあったから、データとしてのサイズかな。

Merkle treeのrootへの参照だけ持っていればいいんだと思ってたけど、"tx"とか"mrkl_tree"は何なの？

Coinbase transaction

A special transaction in the Merkle tree, which creates a new coin.

"in": [
    {
        "prev_out": {
            "hash": "0000...0000",
            "n": 42429696
        },
        "coinbase": "..."
    }
],
"out": [
    {
        "value": "25.033...",
        "scriptPubKey": "OP_DUP OP_HASH 160"
    }
]

動画でのスライドの括弧の対応とかインデントとかなんかおかしかったから適当に直したけど、もしかしたら間違ってるかも。

"value" = block reward + sum of transaction fees
"prev_out"["hash"]: Null hash pointer, hash of all zeros
"prev_out"["n"]: arbitrary
"coinbase": completely arbitrary

You can see the real block and transaction data at blockchain.info.

3.5. The Bitcoin Network

Ad-hoc protocol (runs on TCP port 8333)
Ad-hoc network with random topology
All nodes are equal
New nodes can join at any time
Forget non-responding nodes after 3 hours

How to join the Bitcoin P2P network?

Send message "getaddr()" to one node that you know (the seed node).

Then the seed node let you know the address of the nodes that the seed node is connected to.
Send the same message to the nodes that you got to know in process 1.
Repeat this procedure as many times as you like and randomly select the nodes to connect to.

How to publish a transaction?

-> Use flooding algorithm.

A node tells about the transactions to its neighbors.
The neighbors store the transaction to their pending transaction pool, and then tells about the transaction to their neighbors.
When a node has already heard about the transaction and gets notified about the same transaction, it simply doesn't propagate the information any more.

Conditions under which you should propagate the proposed transaction

Valid with current block chain
Script is whitelisted (by default)

Avoid unusual scripts
Haven't seen before

Avoid infinite loops
Doesn't conflict with other transactions you've relayed

Avoid double-spends

They are just sanity checks; some nodes may ignore these conditions.

Nodes may differ on transaction pool because of the network delay.

That's fine because they've not yet published as a block.

Race condition

Transactions or blocks may conflict.

Default behavior: accept what you hear first

But if the conflicting transaction made it to the block, the node just forget about the first transaction it heard about.
Network position matters
Miners may implement other logic

Block propagation is nearly identical

Relay a new block when you hear about it if:

Block meets the hash target (the nonce meets the standard)
Block has all valid transactions

Run all scripts even if you wouldn't relay
Block builds on current longest chain

Avoid forks

They are also just sanity checks, too.

This protocol is not so efficient, but Bitcoin prioritizes the simplicity and decentralization.

How big is the network?

Impossible to measure exactly
Estimates: up to 1M IP addresses/month
Only about 5-10k "full nodes", the permanently connected and fully-validating nodes
This number may be dropping

Fully-validating node

Permanently connected
Stores entire block chain
Hear and forward every transaction
Tracks the full UTXO set

Thin/SPV clients (not fully-validating)

Store block headers only -> cannot check for the transactions' validity
Request transactions as needed to verify incoming payment
Trust fully-validating nodes

3.6. Limitations & Improvements

Hard-coded limits in Bitcoin

10 minutes average to create a block
1 MB in a block
20,000 signature operations per block
100 M satoshis per BTC
21M total BTC maximum
50, 25, 12.5... Bitcoin mining reward

The maximum amount of the coin and the reward structure cannot be changed because they affect the economic balance too much.

Throughput limits in Bitcoin

1 MB / block (10 min)
more than 250 bytes / transaction
7 transactions / sec

Inefficient compared to VISA or PayPal

Cryptographic limits in Bitcoin

Only 1 signature algorithm (ECDSA/P256)
Hard-coded hash functions

Crypto primitives might break by 2040

"Hard-forking" changes to Bitcoin

It's impossible to assume that every node will upgrade.

-> Nodes that still have old version will reject valid transactions or blocks sent from nodes with the new version of the software.

PROBLEM: Old nodes will never catch up

Soft forking

Observation: We can add new features which only limit the set of valid transactions. We can just make the validation rules stricter.

Need majority of nodes to enforce new rules

Old nodes will approve the blocks proposed by new nodes.

RISK: Old nodes might mine now-invalid blocks

Example of soft fork: Pay to script hash

Old nodes will just approve the hash without running the embedded script.

Soft fork possibilities

New signature schemes
Extra per-block metadata
- Restrict the content of coinbase parameter to, for example, the root of the UTXO tree

4.1. How to Store and Use Bitcoins

To spend a Bitcoin, you need to know:

some information from the public blockchain: don't need to worry about how to store it because you can always get it
the owner's (your) secret signing key

So, it's all about key management.

Goals

availability: you can spend your coins
security: nobody else can spend your coins
convenience: relatively easy to use

Simplest approach: store key in a file on your computer or phone

as available as your device
as secure as your device
very convenient

Wallet software

Keeps track of your keys and coins, provides nice UI.

Nice trick: use a separate address/key for each coin

benefits privacy (looks like separate owners)
wallet can do the bookkeeping, so users don't need to know

Encoding addresses

Encode as text string: base 58 notation or use QR code

4.2. Hot and Cold Storage

Hot storage (online): convenient but risky
Cold storage (offline): not as convenient but safer

Store separate keys in each of these storage.

To transform money to each other, each storage needs to know the address of the other.

Even though the cold storage is offline, hot storage can send money to it.

Problem: want to use a new address (and key) for each coin sent to cold storage. How can hot wallet learn new addresses if cold wallet is offline?

Solution: Hierarchical key generation

f:id:YaaMaa:20180624051518p:plain

Important properties:

ith address and ith key correspond to each other.
address gen info doesn't leak keys -> address gen info can be public

Some digital signature schemes cannot support hierarchical key generation, but the one Bitcoin uses does.

At the beginning, cold side does generateKeysHier, and pass the address gen info to the hot side.

Hot sideがどんなiを使ったかってどうやったらcold sideが知れるの？

How to store cold info

Info stored in device, device locked in a safe
"brain wallet": encrypt info under passphrase that user remembers
paper wallet: print info on paper, lock up the paper
in "tamperproof" device: device will sign things for you, but won't divulge keys

4.3. Splitting and Sharing Keys

Secret sharing

Split secret into N pieces, such that:

given any K pieces, one can reconstruct the secret key
given fewer than K pieces, one cannot learn anything

Example: N = 2, K = 2

P = a large prime number
S = secret in [0, P - 1]
R = random in [0, P - 1]
X1 = (S + R) mod P
X2 = (S + 2R) mod P
reconstruction: S = (2X1 - X2) mod P (because S < P)

f:id:YaaMaa:20180624051625p:plain

As k becomes 3, 4, ... you can just increase the number of random parameters R1, R2... and make the function quadratic, cubic, ... and so on.

Good:

Store share,es separately, adversary must compromise several shares to get the key.
You can lose (N - K) shares.

Bad: To sign, one needs to bring shares together and reconstruct the key -> vulnerable

Multi-sig

Lets you keep shares apart, approve transaction without reconstructing key at any point.

If there are 4 people, each of them generates a key-pair and puts the secret key in a secure and offline place, and if they stipulate that three of the four keys are needed to release their coins, then the coins will be safer.

じゃあ結局S+Rとかのやつは使わんってこと？めっちゃ賢いのに何かもったいないね。

4.4. Online Wallets and Exchanges

Online wallet

like a local wallet but "in the cloud"
runs in your browser, sends code to you and stores keys. You log in to access the wallet

Good: convenient. Nothing to install, works on multiple devices

Bad: vulnerable if the site is malicious or compromised

Bank-like services

You give the bank money (a "deposit")

The bank promises to pay you back later, on demand

The bank doesn't actually keep your money - they invest the money

Bitcoin exchanges

accept deposits of Bitcoins and fiat currency, promise to pay back on demand

lets customers:

make and receive Bitcoin payments
buy/sell Bitcoins for fiat currency

typically, match up BTC buyer with BTC seller

What happens when you buy BTC

No BTC transaction appears on the blockchain

The effect is only that Exchange is now making a different promise (the kind of currency your account holds has changed)

Good:

connects BTC economy to fiat currency and economy
easy to transfer value back and forth

Bad: the same kind of risks as banks, cyber attack

For traditional banks, government typically

imposes minimum reserve requirements
regulates behavior and investments
insures depositors against losses
acts as lender of last resort

But for Bitcoin banks and exchange services, these regulations do not apply.

Proof of Reserve

Bitcoin exchange can prove it has fractional (or 100%) reserve.

Prove how much reserve you're holding

Publish valid payment-to-self of that amount. Sign a challenge string (generated by some impartial party) with the same private key -> Proves that someone who knows the private key is participating in this proof of reserve
Prove how much demand deposit you hold

In Merkle tree, each hashpointer includes total value in its subtree.

Construct the tree and sign the root pointer.

If a customer demands to prove that he is included in the tree, just show log n items from the root to the customer's account. He can only check if the sum is correctly added.

Reserve fraction >= how much reserve you're holding / how much demand deposit you hold

4.5. Payment Services

Scenario: merchant accepts BTC

customer wants: to pay with BItcoin

merchant wants: to receive dollars, simple deployment, and low risk

この辺の話には興味がなかったのであんまりノート取ってない。

customer, merchant: do what they wanted to do

payment service: gets Bitcoins, pays dollars, absorbs risks related to security and exchange rate

4.6. Transaction Fees

transaction fee = value of inputs - value of outputs

fee goes to miner who makes the block that contains this transaction

How are transaction fees set today?

Transaction fee exists to compensate for (some of) the cost for

peers to relay the transaction
miner to record the transaction in a block

Current consensus fees:

No fee if all these conditions apply:

tx is less than 1000 bytes
all outputs are 0.01 BTC or larger
priority is large enough

priority = (sum of (inputAge * inputValue)) / txSize

Otherwise: 0.0001 BTC per 1000 bytes

Most miners enforce the consensus fee structure.

Transaction without consensus fee will take longer to be recorded.

Miners prioritize transactions based on fees and the priority formula.

4.7. Currency Exchange Markets

Supply of Bitcoins

supply = coins in circulation (+ demand deposits?)

coins in circulation: fixed number

When to include demand deposits?

-> When they can actually be sold in the market

Supplyとdemandのあたりあまりよくわからなかった。経済がわかってなさすぎる。興味が出たらちゃんと考える。

Programming Assignment: Consensus from Trust

今週の講義内容とあんまり関係なくない？笑

疑問点

Do malicious nodes make up fake transactions?

→"Your CompliantNode code can assume that all Transactions it sees are valid"らしい！
How should I use the four arguments passed to CompliantNode constructor? Individual nodes are not supposed to know the structure of the entire network...???

→According to a mentor's post in the discussion forum, these parameters can be used to construct our own malicious nodes for testing.

ここ見て、ああああなるほど！！！ってなって"handshaking" transaction実装しようとしたけど、Simulation.java見たらvalidTxIdsに入ってるidを持つTransactionしか伝達されてないので詰んでいた。でも、どっちにしてもmaliciousだってそのfolloweeから送られてきたやつをそのまま伝達する（ことがある）んだから、maliciousからhandshaking transactionも送られて来るんじゃない？

結局、何も送ってこないfolloweeをガン無視するって方法で90%まで行ったので、そこで諦めることにした。

これの逆バージョンで、malicious nodeをデザインしてどれだけネットワークを攪乱させられるかみたいなのあったら楽しそう。

5.1. The Task of Bitcoin Miners

Bitcoin miners

store and broadcast the blockchain
validate new transactions
vote (by hash power) on consensus
Join the network, listen for transactions

validate all proposed transactions
Listen for new blocks, maintain blockchain

when a new block is proposed, validate it
Assemble a new valid block
Find the nonce to make your block valid
Hope everybody accepts your new block
Get profit

How to look for nonce?

Try every single possible value for both the block's nonce and the coinbase parameter

When the coinbase parameter changes, the change in hash propagate all the way up to the entire block's hash.

Mining difficulty (the difficulty to find nonce) is reset every two weeks.

next_difficulty = previous_difficulty * (2 weeks) / (time to mine last 2016 blocks)

2016 is the ideal number of blocks mined in 2 weeks because a block should be mined every 10 minutes. (14 * 24 * 60 / 10 = 2016)

5.2. Mining Hardware

SHA-256

general purpose hash function

part of SHA-2 family
Remains unbroken cryptographically (weaknesses known)
SHA-3 (replacement) under standardization

256 bits -> split into 32 bits * 8

Then applied some tweaks and rotated.

SHA-256のdiagramはノートに取ってない。

Actually, SHA-256 is applied twice to get the hash of the block.

Nonceを見つけるためにどれくらいのマシンパワーが必要かとかいう話。マイニングに興味はないのでノート取ってない。Bitcoin miningがどれくらい熾烈な争いになってるかにびっくりした。

"Goodput" = throughput * success rate

5.3. Energy Consumption & Ecology

この人Bitcoinの話してたかと思ったらハードウェアに移って、更に熱力学みたいな話まで始めたけど…専門の幅がすごすぎる。あんまり話聞いてなかった。

5.4. Mining Pools

Mutual insurance of miners

How to prove how much work you did as a miner?

Show "near-valid blocks" (The hash is relatively small, but not enough to be valid) that you mined to the pool manager

Pros

Make mining more predictable
Allow small miners to participate
More miners use updated validation software

Cons

Lead to centralization
Discourage miners from running fully-validating nodes

5.5. Mining Incentives and Strategies

Game-theoretic analysis of mining

Which transactions to include in a block

Default: any above minimum transaction fee
Which block to mine on top of

Default: longest valid chain
How to choose between colliding blocks with the same length of chain

Default: first block heard
When to announce new blocks

Default: immediately after finding them

Can you profit from a non-default strategy?

assuming that you control 0 < α < 1 of mining power

-> For some α, yes. (Analysis is still ongoing)

Forking attacks

f:id:YaaMaa:20180624052202p:plain

Certainly possible if α > 0.5 or via bribery
Attack is detectable
Might be reversed
Might crash exchange rate because people no longer believes in Bitcoin

Checkpointing

Avoid deep forkings

Block-withholding attacks or "selfish mining"

Don't announce blocks right away. Try to get ahead!

Find, for example, two blocks ahead of the currently longest blockchain. Keep them secret.
The rest of the network finds one block.
Release your secret blocks. The block found in 2 immediately becomes an orphan block.

Even if you've found only one block, if you win the race, the block goes in to the accepted blockchain.

How to win the race?

Try to peer with as many nodes as possible
Bribery

Punitive forking

Blacklist transactions from address X = freeze X's money forever

Announce that you will refuse to mine on any chain with a transaction from X

With α < 0.5, you'll soon fall behind

Feather-forking

To blacklist transactions from X, announce that you will refuse to mine directly on any block with a transaction from X. But you'll concede after n confirming blocks

-> You can cancel the block containing transactions from X with probability α²

-> For other miners, including a transaction from X induces an α² chance of losing a block -> Might be safer to join in on the blacklist

6.1. Anonymity Basics

Bitcoin addresses are public key hashes (that act kind of like pseudo-identities) rather than real identities = pseudonymity

Anonymity = pseudonymity + unlinkability

Unlinkability: Different interactions of the same user with the system should not be linkable to each other

Why is unlinkability needed?

Many Bitcoin services require real identity
If you buy something with Bitcoin at a real store, the stuff will know your physical identity
Linked profiles can be deanonymized by a variety of side channels

Example: One might be able to connect your address to your Twitter account by the similarity between when you make Bitcoin transactions and when you post tweets

-> Pseudonymity is quite fragile, so unlinkability is needed

Defining unlinkability in Bitcoin

Hard to link different addresses of the same user
Hard to link different transactions of the same user
Hard to link sender of a payment to its recipient

Indirect payment that goes through a bunch of transactions does not easily reveal the initial sender and the ultimate recipient

Complete unlinkability (among all addresses/transactions) is hard

Quantifying anonymity

Try to maximize the size of anonymity set

Anonymity set: the crowd that one attempts to blend into

To calculate anonymity set:

Define adversary model
Reason carefully about: what the adversary knows, does not know, and cannot know

Why we need anonymous cryptocurrencies?

Blockchain-based currencies are totally, publicly, and permanently traceable

-> Without anonymity, privacy is much worse than traditional banking

Money laundering

Bottleneck: moving large flows into and out of Bitcoin ("cashing out")

Similar dilemma: Tor

Anonymous communication network

Sender and receiver of message unlinkable

Used by many kinds of people including:

Journalists -> should be protected from government pressure
People who spread malware -> should be identified

Using blind signature, traditional bank can let users withdraw and deposit their money without being able to identify the user's connection.

Anonymity & decentralization: in conflict

Interactive protocols with bank are hard to decentralize
Decentralization often achieved via public traceability to enforce security

6.2. How to de-anonymize Bitcoin

Bitcoin's best practice: always receive payment at fresh address

f:id:YaaMaa:20180624052549p:plain

Shared spending is evidence of joint control -> leads to linking addresses

Also, change address can be connected to the payer's address though people cannot see which address in the output is the change address

"Idioms of use"

Idiosyncratic features of wallet software

e.g. each address is used only once as a change address

Shared spending + idioms of use

You can draw a diagram of a user's addresses and transactions and guess the correspondence between address and service provider.

Instead of just guessing, you can also make transaction with the service and see how the address you interact with is connected to other addresses.

From services to users

High centralization in service providers

Most flows pass through one of these in a traceable way
Address: identity links in forums

Users sometimes post their addresses online to get donation

Network-layer de-anonymization

"The first node to inform you of a transaction is probably the source of it"

-> You can identify the IP address of the source node

Solution: use Tor (not specifically for Bitcoin, but robust and functioning) or Mix nets

6.3. Mixing

To protect anonymity, use an intermediary

f:id:YaaMaa:20180624052632p:plain

By just looking at the blockchain, one cannot identify who deposited specific coin

There exist dedicated mixing services

Promise not to keep records
Don't ask for your identity

Wallet services requires your identity to connect you to your account. But if you trust these services, it's good to use them for mixing because they have bigger anonymity set.

Principles for mixing services

Use a series of mixes

If at least one mixing service deletes records, your transaction becomes untraceable
Uniform transactions (make the transactions have the same value)
Client side must be automated
Fees must be all-or-nothing

Remaining problem: trusting mixes

Stay in business and build up reputation
Users can test for themselves
Cryptographic "warranties"

最後の解決法がよくわからんかった。

6.4. Decentralized Mixing

Pros:

No bootstrapping problem
Theft impossible
Possibly better anonymity
More philosophically aligned with Bitcoin

Coinjoin

Find peers who want to mix
Exchange input/output addresses
One of the member construct the transaction
Send it around and collect signatures

Before signing, each peer checks if his output is present
Broadcast the transaction

Problems:

How to find peers

-> Solution: Use an untrusted server
Peers (or at least one of them) know your input-output mapping

-> Solution: special-purpose anonymous routing mechanism
Betrayal

After learning the other's address and signature, one can transfer the coin to his own address

-> Proposed solutions: proof of work, proof of burn, server kicks out malicious participant, cryptographic "blame" protocol

3つ目の問題があるなら、theft不可能ってわけでもなくない？

High-level flows could be identified

For example, if you receive specific amount of coin every week and immediately transfer specific portion to your other account, the pattern in the amount of money and time is identifiable

Heuristic: merge avoidance

Instead of a single payment transaction,

receiver provides multiple output addresses
sender avoids combining different inputs

6.5. Zerocoin and Zerocash

Zerocoin

protocol-level mixing (Mixing capability baked into protocol)

Advantage: cryptographic guarantee of mixing -> You don't need to trust others to achieve anonymity

Disadvantage: not currently compatible with Bitcoin

Basecoin: Bitcoin-like Altcoin

Zerocoin: Extension of Basecoin

Basecoins can be converted into zerocoins and back

-> Breaks line between original and new Basecoin

Zerocoins: cryptographic proof that you owned a Basecoin and made it unspendable

Miners can verify these proofs

Gives you the right to redeem a new Basecoin and exchange it for Zerocoin

Challenges

How to construct these proofs?
How to make sure each proof can only be "spent" once?

Zero-knowledge proofs: a way to prove a statement without revealing any other information Example:

"I know an input that hashes to xxxx"
"I know an input that hashes to some hash in the following set ..."

-> Others can't learn what the input is

Zerocoin comes with a standardized denomination

-> Any zerocoin in the block chain has the same value

Minting zerocoins

Generate serial number S (eventually made public)
Generate random secret r (never becomes public, ensures unlinkability)
Compute H(S, r) = cryptographic "commitment"
Create a transaction which have base coin as input and the computed H(S, r) as the output address

-> The basecoin in the input becomes unspendable

Spending zercoins

Reveal S

-> Miners will verify S hasn't been spent before
Create zero-knowledge proof that "I know a number r such that H(S, r) is one of the zerocoins in the block chain"

-> This gives you the right to redeem basecoin
Pick arbitrary zerocoin in block chain (to ensure anonymity) and use it as input to your new transaction

Zerocoin is anonymous because people cannot learn which zerocoin you put in the block chain as long as you keep your random number r secret.

Zerocash

Zerocoin without Basecoin

Differences

Different (and more efficient) crypto for proofs
Proposal to run system without Basecoin

All transactions are zerocoins

Supports splitting and merging

Puts transaction value inside the "commitments"

Ledger merely records existence of transactions

Random, secret inputs are required generate public parameters. These secret inputs must then be securely destroyed. No one can know them (anyone who does can break the system).

System	Type	Anonymity attacks	Deployability
Bitcoin	Pseudonymous	Tx graph analysis	Default
Single mix	Mix	Tx graph analysis, bad mix	Usable today
Mix chain	Mix	Side channels, bad mixes/peers	Bitcoin-compatible
Zerocoin	Cryptographic mix	Side channels (possibly)	Altcoin
Zerocash	Untraceable	None	Altcoin, tricky setup

6.6. Tor and the Silk Road

f:id:YaaMaa:20180624052923p:plain

In Tor, a sender picks three Tor nodes at random to route her message. If at least one of them is honest node, it's safe(ish).

Key challenge: hiding routing information

Individual router shouldn't learn the destination's IP address

Solution: layered encryption

f:id:YaaMaa:20180624052935p:plain

What is encrypted in each layer is the destination of the message at that layer.

Hidden services

What if the server wants to hide its address?

Connect to "rendezvous point" through Tor
Publish {domain name -> rendezvous} point mapping

Domain name is not the default DNS address; the domain name used here is called "onion address"
Client connects to rendezvous point

Silk Road

Communication: Tor hidden service
Payment: Bitcoin
Security?
Anonymous shipping?

7. Community, Politics, and Regulation

ここからは、後で復習することはあまりなさそうなので、ノート取らない。でも勉強になるので聞くだけ聞いた。

Consensus about rules: depends on developers

Consensus about history: depends on miners

Consensus that coins are valuable: depends on investors, merchants, and customers

Satoshi Nakamotoさんがミステリアスなのは何となく知ってたけど、addressがわかってるから何かtransaction起こしたらブロックチェーンに公開されるっていうの聞いて「ロマンがある！！！！」ってなった。

7.7の2つ目のクイズの解答がめっちゃ謎。どうしてそうなったん？？ →Mentorさんが"I thought the answers to the second question on market failure were exactly opposite to the ones given."って言ってはるからやっぱり何かの間違いかな。

Programming Assignment: Blockchain

疑問点

heightとは？？そのblockがgenesisからどれくらい伸びたところにあるかってこと？

→たぶん。heightは相対的にしか使われないから、genesisを0としても1000としても何でもいい。
UTXOPoolとTransactionPoolは、forkあたり一つずつなのか、blockchainあたり一つずつなのか？

→どっちもinstructionに書いてあった笑

TransactionPool: "Maintain only one global Transaction Pool for the block chain and ..."

UTXOPool: "You have to maintain a UTXO pool corresponding to every block on top of which a new block might be created."

なぜかgenesisBlockは空っぽだろうと信じてたのでtransactionのvalidationがうまくいかなかった。genesisBlockのcoinbaseを考慮したらちゃんとできた。

2018-06-21

Three.jsでオブジェクトが表示されない時のチェックリスト

Three.jsで何か作ってるとき、画面にオブジェクトが表示されなくてしばらく時間を費やしてしまうことがちょくちょくある。わかってみたら単純でアホみたいなミスばかりなんだけど、実際画面に何も表示されないって状況に直面すると「どこからチェックしよ…」ってなってしまうので、ありえる原因を書き出して今後の参考にしようと思う。

Canvas

renderer.domElementであるcanvasはちゃんと正しい場所に正しい大きさで存在してる？

ウェブページ自体の背景をsceneの背景色と同じにしてたら気づきにくいので、ちゃんとinspectorで見る。

Canvas element - Renderer - Scene - Cameraの繋がり

renderer.domElementは意図したcanvas elementになってる？

Renderer.render()に正しいsceneとcameraが渡されてる？

Material

オブジェクトのマテリアルはちゃんとsceneの背景色と識別できる色？

opacityが0になってたりせん？visibilityはtrue？

material.sideは意図した方向になってる？

Position

デフォルトではカメラは原点で-Zの方向を向いてるので、オブジェクトはそこから見えるところに置いといた方がいい。

カメラとオブジェクトが重なってたり、オブジェクトの内側にカメラがいたりしたら（materialを背面に設定していない限り）見えないから気をつけて。

Add

ちゃんとscene.add(object)した？sceneにちゃんと入ってるか、scene.childrenで確認して。

Render

ちゃんとrenderしてる？requestAnimationFrame()で毎フレームrenderしてる？

完全に静止してるやつやったら一回だけのrenderでもいいけど、それならオブジェクトのセッティングとか全部render前に済ませよう。texture画像の読み込みみたいな非同期処理はrenderが実行される前に終わらんこともあるし。

Fog

Fogが濃すぎてオブジェクト見えないとかない？とりあえずfog無効にしよ。

Camera depth

camera.nearとcamera.farは適切？範囲が狭すぎたらオブジェクトがクリップされてしまう。

Controls

FlyControlsとかがちゃんと動いてる前提でカーソルぐるぐる動かしてても、実はカメラ全然動いてないとかありえるから、camera.positionとかcamera.rotationが変動してるか確認しよ。

おまけ: tweenしても動かんとき

tween.startしてる？
tween.updateしてる？
update毎にちゃんと値を代入してる？
needsUpdateをtrueにしなければいけないとかない？ちゃんとドキュメント見て。

今思いつくんはこれくらいかな。何かめちゃくちゃうざいチェックリストになってしまった。

2018-06-14

タイムゾーンを変換するLine botを作る

留学から帰ってから、現地にいる友達とビデオチャットする時間を設定するのが厄介になってしまった。「○○時は空いてる？」って一言聞くためにも、世界の時間一覧アプリを開けて当該時刻までスクロールして、対応する相手の時間を探して打ち込まなければならない…っていう一連の手間がめんどい。しかも私は「今ここは○○時やから、あっちは○○時かな」って計算できるほど頭がよくない。Daylight saving timeとか考えだしたら死ぬ。

だからそういうのを自動でしてくれるbotが欲しいと思った。

作ったbotはここにある↓

https://github.com/shio-yaamaa/timezone-line-bot

目標

私はAsia/Tokyo 18:00からできるよー

とか打ったらbotが

私は
Asia/Tokyo 18:00
America/Los_Angeles 2:00
からできるよー

みたいな感じに変換してしゃべってほしい。シンプルな「JST」ではなく「Asia/Tokyo」とするのは、略称だと一意に決まらないことがあるみたいだから。それに、友達のいるところだと季節によってPSTになったりPDTになったりして頭おかしくなりそうだし。

けど毎回「Asia/Tokyo」とか書くなんてめんどくさすぎるので、チャットメンバーが「Asia/Tokyoのことは"jst"と呼ぶ」とでも決めておいて、

私はjst18:00からできるよーでbotに反応してもらいたい。

できるようにすること

現在時刻

「今そっち何時？」「○○時だよ」って会話も頻発するので、

timezone nowって言ったら登録されてるタイムゾーン全ての現在時刻を表示するようにする。

全てのタイムゾーン

「Asia/Tokyo」みたいなやつ全部覚えてないので、timezone allって言ったら全てのタイムゾーンのリストを返すようにする。ただし、全部のタイムゾーンは数が多すぎるので、一覧ページへのリンクのみ返す。

タイムゾーンの登録

timezone add Asia/Tokyo as jstみたいな感じにする。いちいち「timezone」って打つのめんどくさいけど、「add」だけで反応するようにしたら要らんときまで反応してきそう。

asの後は、そのタイムゾーンのエイリアスで、適当に指定できる。

消すときは、timezone delete jstで消える。

Typoがあればsuggestする機能つけたかったけど、とりあえず今は一部を入力したら全部を返す("london"って入れたら「もしかしてEurope/London?」って聞いてくる)くらいしかできてない。

時刻のparse

Javascriptの時刻parserはいろいろ見つかったけど、Chronoが一番すごいと思ったので使わせてもらうことにした。Datejsも「tomorrow」とかをparseできるのはすごいんだけど、「tomorrow 5:00」とかになるとChronoしか処理できなくなる。

他にChronoがすごいところ

文字列が時刻以外の情報を含んでいても時刻だけを読み取り、どこから読み取ったかのインデックスを教えてくれる
年や月などのコンポーネントがそれぞれspecifyされてるのかimplyされてるだけなのか教えてくれる
期間のデータだったら始まりと終わりどちらも取ってくれる
デフォルトで多言語対応（さすがに英語の方が広く対応してるけど）

めっちゃありがたかった。

サーバーとか

Amazon Lambdaを使うことにした。データベースはDynamoDB。どっちもよくわかってないし、AWS自体が謎の塊だから、このチョイスで大丈夫なのか知らんけど。

一番躓いたところは、Lambdaでzipをアップロードして実行したときの「Cannot find module 'index'」っていうエラー。私の場合たぶん原因は二つあって、一つはzip圧縮するときにディレクトリの中身だけではなくディレクトリごと圧縮してしまったこと、もう一つはnode_modulesの中に必要なモジュールがちゃんと入ってなかった（npm iし忘れた）ことだった。

あと、LambdaからDynamoDBにアクセスできない！！と思ったら原因は全然違うところにあった。DBの読み込みとかが非同期なのに、それを無視してcontext.succeed()やっちゃってたから途中でDBの処理を中断させられてたみたい。

LambdaとDynamoDBしか使わんはずやったのに、いざやり始めたら他に必要なサービスがどんどん出てきて圧倒された。AWSってすごいな…。もうちょっと全体像を把握したい。

いつまでこのbotを使う用事があるかはわからんけど、ずっとLine botに興味あったから作れてよかった。

2018-02-25

React+Reduxで脱出ゲームを作る

React

服部平次の部屋からの脱出ゲーム作りたい。今までの原作と映画でのトリックをうまく組み合わせて何かできひんかなって考えてる。それと、ちょうどReactをやってみたいと思ってたので、Reactで脱出ゲームがちゃんと作れるのか確かめるために小さいのを作ってみた。

GitHub Pagesで公開してる。

https://shio-yaamaa.github.io/react-escape-room/

ソース: https://github.com/shio-yaamaa/react-escape-room

作った感想とか、作り替えるときにどこを変更したらいいかとかメモしておく。

ゲームの中身

中学生の時くらいにNeutralのMyaさんに激しく憧れてて（今ももちろん憧れてるけど）、何でもいいから一個脱出ゲーム作りたい！ってなってシナリオ考えたんだった気がする。見返してみると、物理法則めっちゃ無視してるしアイテムの鍵率高すぎだし酷いけど、なんか頑張って考えたのは覚えている。当時は吉里吉里のKAGで作ろうとしたけど、書き方が悪かったのか超遅くなったので、諦めてしまったんだった。部屋とかアイテムはそのときにモデリング&レンダリングしたので、その画像は今回そのまま使えた。

Componentの構成

<Provider>
    <App>
        <ScreenSwitch container> manages which screen to display and calls AssetsLoader
            <StartScreen>
            <EndScreen>
            <LoadScreen>
            <div> shows fade effect when switching between screens

            <Game container> manages functions related to the whole game, like save, hint, and cursor
                <div> aligns MainScreen and Sidebar horizontally
                    <MainScreen container> handles click events on MainViewMap, ArrowArea, and ItemDetailWindow
                        <MainView> displays background images
                        <MainViewOverlay> additional images on the MainView like dial numbers
                        <MainViewMap> clickable map
                        <ArrowArea> (x4) fire events when they (not their children!) are clicked
                        <ItemDetailWindow> zoomed view of items
                            <ItemDetailView> displays item images
                            <ItemDetailViewMap> clickable map
                        <Sidebar>
                            <ItemFrameContainer container>
                                <ItemFrame> (x10)
                            <div> aligns SaveButton and HintButton horizontally
                                <SaveButton>
                                <HintButton>
                <Hint>
                <div> // shows fade effect when save button is clicked

どれをcontainerにするかの判断についてはあんまり自信がない。

ゲームごとに書き換えるところ

なるべく後で使えるように作ったつもりなので、他の脱出ゲームを作るときは下記を変更したらたぶん動く。

Assets

assets/images/

items: アイテム欄でのアイテムの画像
itemDetails: アイテム詳細ウィンドウでのアイテムの画像
itemDetailMaps: アイテム詳細ウィンドウのクリッカブルマップ
mainViews: 部屋の画像
mainViewOverlays: 部屋の画像の上に載せるやつ
mainViewMaps: 部屋の画像のクリッカブルマップ
endScreenBackground: ゲームクリア画面

assets/sounds/: 全般的に

Program

redux/modules/

items: ゲーム内のアイテム
itemStatus: アイテム詳細ウィンドウ内でのアイテムの状態
status: 部屋の状態

scenario/: stateから背景画像を選択したり、クリック場所から動作を決めたりするスクリプトが入ってるので、全部変える。

Reactを使ってよかったこと

JSXが好き。（けどたまにスタイルをJSXかCSSファイルどっちに書こうかめっちゃ迷う。pseudo classとかはCSSでしかできないみたいだから）

「これがあれやったらそれ表示して」って書いとくだけで初期化も再描画も勝手にやってくれるん嬉しい。

初めCanvasとどうやって折り合いをつけるかわからんくて戸惑ったけど、Componentのライフサイクルに合わせてちゃんと書いたらそこまで大変でもなかった。

アイテム欄みたいな同じコンポーネントいっぱい並べる所がすごい楽。

Reactを使って苦労したところ

Assetsを事前に全部requireしておく

たぶん私がWebpackとかの技術に慣れてなさすぎるんだと思うけど、なんか難しかった。結局このページのmih-kopylovさんの最後のコメントを参考にしたらできた。
フェード

普通は、actionをdispatchしたらstateが更新されて、即座に画面が更新される。だから、「画面を黒くフェードしていって、一番フェードが濃い状態の時に画面を切り替えて、それからフェードを徐々に消していく」みたいなやつはちょっとめんどくさかった。何かいい方法あるんかな。

Reduxを使ってよかったこと

データの矛盾が起きない感じがめっちゃいいし、一方的に決まった方法でしか状態を変更できないのもありがたい。

save/load機能を作るのがめちゃくちゃ簡単だった。saveはstateをlocalStorageにそのままばーんって入れて、loadではそれをまるごと取ってくればいいだけ。

服部の脱出ゲームにもほぼそのまま使えると思うんだけど、まだ部屋のモデリングとか全く進んでないからいつになるかわからなすぎる。

今のところはこんなん。

f:id:YaaMaa:20180609222004p:plain

実際作ってみると、紫だと思ってたベッドが実は彩度の低い赤だったとわかったり、ポスターのバイクの種類を特定できるようになってきたりと、服部の部屋についての学びが多い。

2017-12-17

音響モデルから音素間の距離を求める

Python

しばらく前に、「"工藤"を誤魔化すための単語」みたいなツイートをしたら、予想外に伸びた。

うっかり「工藤」って言っちゃったときに誤魔化すのに使える言葉を調べました pic.twitter.com/wtZQaS4Etr
— やました (@yaa_maa_520) 2017年11月12日

服部の「工藤の人」としての知名度ってすごいんやな。うれしい。

いろいろ指摘をもらったんだけど、その中でも「Juliusで提供されている音響モデル（たくさんの人の声を統計的に学習したもの）で音素間の距離を定義できるとPanPhonよりもよくなるかもしれない」っていうアドバイスをめっちゃ試したいと思ったので、やってみることにした。知らないことばかりだったので、やったことをメモする。

Juliusの音響モデル

dictation kitに入ってるhmmdefsって拡張子のやつが、モデルの定義ファイル。monophone用とtriphone用がある。

monophone用: "m", "n"みたいな音素一つ一つに対してモデルを定義したやつ
triphone用: "a-m+a", "a-n+a"みたいに3つ並んでる音素に対してモデルを定義したやつ

たぶんtriphoneの方がちゃんと音素環境を考慮してるからいいと思うし、アドバイスしてくれた方もtriphone勧めてくださってたけど、とりあえずわかりやすそうなmonophoneを使った。

HMM

「音響モデル」って聞いて、何となく音声ファイルみたいなのを想像してたんだけど、めちゃくちゃ見当外れだった。実際の音響モデルはHidden Markov Model (HMM)として定義されていた。HMMっていうのは、stateがある確率で変わっていって、変わるごとにある確率に基づいた何かを出力するってやつらしい。

音響モデルは後戻りしないので、HMMの中でもleft-to-rightのものとして分類される。確かに遷移確率見たら、リピートするか次に進むかの二択しかなかった。

hmmdefsの構造は、HTK Bookの7章読んだら何となくわかった。一つの音素に対して、

~h "音素"
<BEGINHMM>
    <NUMSTATES> stateの数
        <STATE> 2
            <NUMMIXES> mixtureの数
                <MIXTURE> 1 mixture1の重み
                    <MEAN> meanの数
                        meanが並んでる
                    <VARIANCE> varianceの数
                        varianceが並んでる
                    <GCONST>
                <MIXTURE> 2 mixture2の重み
                    (略)
        <STATE> 3
            (略)
        <STATE> 4
            (略)
    <TRANSP> stateの数
        stateの数×stateの数の行列
<ENDHMM>

って感じで定義されている。stateは5個、mixtureは16個、meanやvarianceは25個だった。

HMM同士の距離

どうやってHMM間の距離測るんやろって思っていろいろ読んだけど、mixtureを含んでいるためそんな単純に計算できない（たぶん）。Music Similarity Measuresによると、片方のHMMから出したサンプルがもう片方のHMMから出力される確率がどれくらいかを測って、それを二つのHMMの近さと見なしたらいいっぽい。どれくらいばらつきがあるかわからんけど、一つの組み合わせにつき1000回やって、その平均を取ることにした。

そのためには、HMMからサンプルを出す機能と、尤度を計算する機能が必要。hmmlearnがそういうことをやってくれるので、そのhmm.GMMHMMっていうモデルを使った。

GMM→GaussianMixture

GMMHMMのプロパティのgmms_には、scikit-learnのGMMっていうオブジェクトが入っている。それがもう古いらしくて、GMMの代わりに新しいGaussianMixtureを使えってwarningが出る。

github.com

hmmlearnのissueで話し合ってくれてはいるけど、途中で終わってる。なので自分で勝手にhmm.py内のGMMを全部GaussianMixtureに書き換えてしまった。それに伴って、sample()の引数とかscore()の戻り値の形式など変更があったところに対応できるようにちょっと直した。

GaussianMixtureのパラメータの設定

個々のMixtureの分のパラメータ（weights_とかmeans_）をGMMHMMから設定する方法がわからんかったので、gmms_に入ってるGaussianMixtureオブジェクトに対して直接_set_parameters()してしまった。そのときにprecisions_choleskyというものが欲しいみたいなんだけど、それが何なのか理解すらできなかったので、GaussianMixture._compute_precision_cholesky()を使って出してもらった。

調節とか

left-to-rightのモデルだから、最後のstateから行くとこがないので、遷移確率の行列の最後の行は全て0になる。でも、遷移確率の行ごとの合計が全て1にならないとscore()でエラーが出る。どうしたらいいかわからんけど、とりあえずリピートするところを1にした。
遷移確率や分散に0があるとエラーになるので、めちゃくちゃ小さい値に置き換えた。
普通のHMMは延々と続くので、sample()するときには何回繰り返すかを指定するんだけど、音響モデルはleft-to-rightだから最後のstateに辿り着いたら止めるようにしてほしい。なのでちょっと処理を書き換えた。

NotFittedError

パラメータをセットして、やっとsample()できる！と思ったら、NotFittedErrorとかいうものが出た。fitした後じゃないとsample出してくれへんらしい。

stackoverflow.com

↑初めにちっちゃいデータセットでっち上げてfitさせてからパラメータをセットし、その後sample()したらいいって回答がある（ちっちゃいって言っても、サンプルの数 >= Mixtureの数じゃないと無理だけど）。だから、適当にfitする→ちゃんとしたパラメータをセットする→sample()するっていう流れになった。

GMMHMMとGaussianMixtureでの引数名が指すもの

なんかcomponentとか微妙に意味が違うからややこしい。

	GMMHMM	GaussianMixture
the number of states	n_components
the number of mixtures	n_mix	n_components
the number of features		n_features

それぞれの音素がHMMとして定義されてて、そのHMMの中にstateがいくつかあって、それぞれのstateの出力確率がいくつかのmixtureから成ってて、それぞれのmixtureにfeatureが決められている。

log likelihoodはsampleの長さに影響される？

log likelihoodの行列（左がサンプルを出したもとのHMM、上がそのサンプルを出す尤度を測ったHMM）を可視化すると、以下のようになった。

f:id:YaaMaa:20171202193318p:plain

かろうじて対角線は見えてるけど、それより行ごとの違いが大きすぎる。左のHMMから取ったサンプルの長さの影響が大きいのかなと思って、x軸をサンプルの長さ、y軸をscore (= log likelihood)としたグラフを描いてみた。

f:id:YaaMaa:20171202032558p:plain

すごいあからさまに影響されてる。濃い青色のところは、行列では対角線上に当たる部分、つまり同じ音素同士を比べたときのスコア。これがだいたい横一直線になってもらわないといけないと思う。いろいろめちゃくちゃな方法で値を操作した結果、次のような行列になった。

f:id:YaaMaa:20171202200958p:plain

さっきに比べたらわりと納得できる感じになったと思う（適当）。

距離の可視化

ツイートに書いたのと同じ方法で、音素間の距離を可視化した。

f:id:YaaMaa:20171202201138p:plain

雑に分類してみると、

f:id:YaaMaa:20171202201245p:plain

何となく同じような属性を持つ音素同士で固まってるようなので満足！

あの意味わからん数字の羅列だったHMM定義ファイルの中身が、処理してみたらちゃんと人間の知覚に沿った感じになってるの、何かすごいな。

もうちょっとやりたいことがあった気がするけど、期末テストやってるうちに自分が何してたのか忘れてしまったので、終わる。