I tried solving a simple deduction problem with machine learning (HTM, TM layer)! (Natural language processing)


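Most of the machine learning in practical use today works by induction.

So I wondered whether HTM (Hierarchical Temporal Memory), a machine learning approach modeled on the neocortex, could also solve deductive problems, and decided to run this experiment.

Notes

Deduction: a method that chains together statements of the form "X, therefore Y" to reach a conclusion.

Induction: a way of thinking that runs in the opposite direction to deduction. Instead of deriving a conclusion from a rule and an observation, it looks at what several observed cases have in common and derives a rule, or a conclusion that can reasonably be stated.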

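Experiment

The experiment uses the following three-step syllogism:

Japanese: 動物は生きている → 犬は動物である → 犬は生きている

English: animal is aliving → dog is animal → dog is aliving

("aliving" is the token used in the code to mean "alive".)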

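Method

First, the TM layer of HTM learns the word sequence "animal is aliving".

"dog is animal" is not learned as a sequence; it is expressed through the encoding.

Specifically:

dog = [0000001110 0000000000 0000000]

animal = [1111111111 1100000000 0000000]

Every active bit of dog is also active in animal, so the representation of animal contains dog,

and "dog is animal" holds as a subset relation between the two bit patterns.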
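The containment described above can be checked directly on the two bit patterns. This is a minimal sketch in plain Python that does not use NuPIC; the helper name active_bits is my own and does not appear in the article's code.

```python
def active_bits(pattern):
    """Return the set of indices whose bit is '1' (spaces are ignored)."""
    return {i for i, bit in enumerate(pattern.replace(" ", "")) if bit == "1"}

# Bit patterns copied from the article.
dog_bits = active_bits("0000001110 0000000000 0000000")
animal_bits = active_bits("1111111111 1100000000 0000000")

print(sorted(dog_bits))          # [6, 7, 8]
print(dog_bits <= animal_bits)   # True: every dog bit is also an animal bit
print(animal_bits <= dog_bits)   # False: the relation only goes one way
```

The relation is one-directional, which is what "dog is animal" needs: animal covers dog, but dog does not cover animal.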

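Next, learning is switched off in HTM and the test is run.

dog is fed into HTM, which predicts the next word.

That prediction is then fed back into HTM to predict the final result.

If everything works, the output should be "dog is aliving".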
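The two-step procedure above can be sketched with a toy stand-in for the TM layer. This is not NuPIC's TemporalMemory: the functions learn and predict are hypothetical, and the model simply stores each pattern as a "segment" that predicts the following pattern once enough of its bits are active. The threshold of 2 is an assumption taken from the activationThreshold value used in the source code further down.

```python
ACTIVATION_THRESHOLD = 2  # assumed to mirror activationThreshold=2 in the article's TM

def learn(sequence):
    """Toy learning: remember each pattern as a 'segment' predicting the next one."""
    return [(frozenset(prev), frozenset(nxt))
            for prev, nxt in zip(sequence, sequence[1:])]

def predict(segments, active):
    """Return all bits predicted by segments that overlap enough with `active`."""
    predicted = set()
    for presynaptic, target in segments:
        if len(presynaptic & active) >= ACTIVATION_THRESHOLD:
            predicted |= target
    return predicted

# Active bit indices, matching the encodings printed in the article.
animal = set(range(12))
is_ = {12, 13, 14}
aliving = {24, 25, 26}
dog = {6, 7, 8}

segments = learn([animal, is_, aliving])   # training on "animal is aliving"
first = predict(segments, dog)             # test step 1: feed dog
second = predict(segments, first)          # test step 2: feed the prediction back

print(sorted(first))    # [12, 13, 14] -> the bits of "is"
print(sorted(second))   # [24, 25, 26] -> the bits of "aliving"
```

In this toy version, dog triggers the segment learned for animal because three of animal's bits are active, which is above the threshold of two.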

Let's run the experiment!

Source code



import numpy
import sys
from itertools import izip as zip, count
from nupic.algorithms.temporal_memory import TemporalMemory as TM
from nupic.encoders.category import CategoryEncoder

# Utility routine for printing an input vector
def formatRow(x):
  s = ''
  for c in range(len(x)):
    if c > 0 and c % 10 == 0:
      s += ' '
    s += str(x[c])
  s += ' '
  return s


# Step 1: encode the words, then create the Temporal Memory instance with appropriate parameters

categories = ("cat", "dog", "fish", "is", "car", "bike", "airplane", "aliving")
encoder = CategoryEncoder(w=3, categoryList=categories, forced=True)
cat = encoder.encode("cat")
dog = encoder.encode("dog")
fish = encoder.encode("fish")
Is = encoder.encode("is")
car = encoder.encode("car")
bike = encoder.encode("bike")
airplane = encoder.encode("airplane")
aliving = encoder.encode("aliving")

enLen = len(cat)
animal = numpy.zeros(enLen, dtype = int)
for i in range(12):
    animal[i] = 1
print "cat =       ", cat
print "dog =       ", dog
print "fish =    ", fish
print "is =        ", Is
print "car =        ", car
print "bike =        ", bike
print "airplane =        ", airplane
print "animal =        ", animal
print "aliving =        ", aliving
animalIsAliving = []
animalIsAliving.append(animal)
animalIsAliving.append(Is)
animalIsAliving.append(aliving)

tm = TM(columnDimensions = (enLen,),
        cellsPerColumn=1,
        initialPermanence=0.5,
        connectedPermanence=0.5,
        minThreshold=2,
        maxNewSynapseCount=20,
        permanenceIncrement=0.1,
        permanenceDecrement=0.0,
        activationThreshold=2,
        )


# Step 2: feed this simple sequence to the Temporal Memory for learning
# Repeat the sequence 5 times
print "---------------"*3, "\n"

for i in range(5):

  for j in animalIsAliving:
    #activeColumns = set([i for i, j in zip(count(), x[j]) if j == 1])
    #activeColumns = set([k for k in range(len(j)) if j[k]==1])
    activeColumns = set([k for k, data in zip(count(), j) if data==1])
    print "activeColumns", activeColumns

    tm.compute(activeColumns, learn = True)


    print("active cells " + str(tm.getActiveCells()))
    print("predictive cells " + str(tm.getPredictiveCells()))
    print("winner cells " + str(tm.getWinnerCells()))
    print("# of active segments " + str(tm.connections.numSegments()))
    print("\n")
  tm.reset()

##########################################################
# Step 3: send the same sequence of vectors and look at the
# Temporal Memory's output
print "========================================================="
for j in animalIsAliving:
  activeColumns = set([k for k, data in zip(count(), j) if data==1])
  # Send each vector to the TM with learning turned off
  tm.compute(activeColumns, learn = False)

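  # The following print statements output the active cells, predictive
  # cells, active segments and winner cells.
  #
  # Note that the columns whose active state is 1 represent
  # the SDR of the current input pattern, and the columns whose
  # predicted state is 1 represent the SDR of the next expected pattern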
  print "\nAll the active and predicted cells:"

  print("active cells " + str(tm.getActiveCells()))
  print("predictive cells " + str(tm.getPredictiveCells()))
  print("winner cells " + str(tm.getWinnerCells()))
  print("# of active segments " + str(tm.connections.numSegments()))

  activeColumnsIndeces = [tm.columnForCell(i) for i in tm.getActiveCells()]
  predictedColumnIndeces = [tm.columnForCell(i) for i in tm.getPredictiveCells()]


  # Reconstruct the active and inactive columns, with 1 as the active
  # and 0 as the inactive representation.

  actColState = ['1' if i in activeColumnsIndeces else '0' for i in range(tm.numberOfColumns())]
  actColStr = ("".join(actColState))
  predColState = ['1' if i in predictedColumnIndeces else '0' for i in range(tm.numberOfColumns())]
  predColStr = ("".join(predColState))


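  # For convenience the cells are printed in groups of
  # 10. When there are multiple cells per column, the printout
  # is arranged so that the cells in a column are stacked together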
  print "Active columns:    " + formatRow(actColStr)
  print "Predicted columns: " + formatRow(predColStr)

  # predictedCells[c][i] represents the state of the i'th cell in the c'th
  # column. To see whether a column is predicted, take the OR across all the cells in that column. In numpy this is
  # the max along axis 1.


### Test: feed dog
# Step 4: send the dog vector on its own and
# read the Temporal Memory's prediction
print "========================================================="

activeColumns = set([k for k, data in zip(count(), dog) if data==1])
# Send the vector to the TM with learning turned off
tm.compute(activeColumns, learn = False)

activeColumnsIndeces = [tm.columnForCell(i) for i in tm.getActiveCells()]
predictedColumnIndeces = [tm.columnForCell(i) for i in tm.getPredictiveCells()]

actColState = ['1' if i in activeColumnsIndeces else '0' for i in range(tm.numberOfColumns())]
actColStr = ("".join(actColState))
predColState = ['1' if i in predictedColumnIndeces else '0' for i in range(tm.numberOfColumns())]
predColStr = ("".join(predColState))
predCol = numpy.array([1 if i in predictedColumnIndeces else 0 for i in range(tm.numberOfColumns())])

print "Input: Active columns:    " + formatRow(actColStr) + "  dog"
print "Prediction: Predicted columns: " + formatRow(predColStr) + "  is"

### Test 2: feed the first prediction back in
activeColumns = set([k for k, data in zip(count(), predCol) if data==1])
# Send the vector to the TM with learning turned off
tm.compute(activeColumns, learn = False)

activeColumnsIndeces = [tm.columnForCell(i) for i in tm.getActiveCells()]
predictedColumnIndeces = [tm.columnForCell(i) for i in tm.getPredictiveCells()]

actColState = ['1' if i in activeColumnsIndeces else '0' for i in range(tm.numberOfColumns())]
actColStr = ("".join(actColState))
predColState = ['1' if i in predictedColumnIndeces else '0' for i in range(tm.numberOfColumns())]
predColStr = ("".join(predColState))
predCol = numpy.array([1 if i in predictedColumnIndeces else 0 for i in range(tm.numberOfColumns())])

print "Input: Active columns:    " + formatRow(actColStr) + "  is"
print "Prediction: Predicted columns: " + formatRow(predColStr) + "  living"

Output

### Encoder output

cat = [0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
dog = [0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
fish = [0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
is = [0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0]
car = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0]
bike = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0]

airplane = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0]
animal = [1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
aliving = [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1]
---------------------------------------------
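The bit layout printed above can be reproduced with a few lines of plain Python. This is a sketch of the layout as it appears in the output, not NuPIC's CategoryEncoder itself: the function category_bits is hypothetical, and the rule that the first block of w bits is reserved for unknown values is an assumption inferred from the fact that cat starts at bit 3 and that 8 categories produce 27 bits.

```python
def category_bits(categories, name, w=3):
    """Hypothetical stand-in for the printed layout.

    Slot 0 is assumed to be reserved for unknown values, so the k-th
    category occupies slot k + 1, and each slot is w bits wide.
    """
    n_slots = len(categories) + 1
    bits = [0] * (n_slots * w)
    slot = categories.index(name) + 1
    for i in range(slot * w, (slot + 1) * w):
        bits[i] = 1
    return bits

categories = ("cat", "dog", "fish", "is", "car", "bike", "airplane", "aliving")

dog = category_bits(categories, "dog")
print(len(dog))                                  # 27
print([i for i, b in enumerate(dog) if b == 1])  # [6, 7, 8]

# animal is built by hand in the article: the first 12 bits are set,
# which covers the unknown slot plus cat, dog and fish.
animal = [1 if i < 12 else 0 for i in range(len(dog))]
print(animal[:12] == [1] * 12)                   # True
```

Under this layout animal covers exactly cat, dog and fish, plus the three bits of the assumed unknown slot.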

### Intermediate training output is omitted

### Results after training

==============================================================-

All the active and predicted cells:
active cells [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predictive cells [12, 13, 14]
winner cells [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
# of active segments 6
Active columns: 1111111111 1100000000 0000000

Predicted columns: 0000000000 0011100000 0000000

All the active and predicted cells:
active cells [12, 13, 14]
predictive cells [24, 25, 26]
winner cells [12, 13, 14]
# of active segments 6
Active columns: 0000000000 0011100000 0000000
Predicted columns: 0000000000 0000000000 0000111

All the active and predicted cells:
active cells [24, 25, 26]
predictive cells []
winner cells [24, 25, 26]
# of active segments 6
Active columns: 0000000000 0000000000 0000111
Predicted columns: 0000000000 0000000000 0000000
==============================================================-

### Test results after training
Input: Active columns: 0000001110 0000000000 0000000 dog
Prediction: Predicted columns: 0000000000 0011100000 0000000 is
Input: Active columns: 0000000000 0011100000 0000000 is
Prediction: Predicted columns: 0000000000 0000000000 0000111 living

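First, looking at the results after training, the sequence is

animal→is→living

so training worked.

Next, looking at the test results after training, the sequence is

dog→is→living

so the deduction was solved correctly.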
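One way to see why dog leads to the same prediction as animal is to count how many bits each word shares with animal and compare that count with the threshold. The sketch below only counts bits from the encodings printed above. It does not run the TM, and the article did not test any word other than dog, so the rows for the other words are an expectation under this simplified view, not a measured result. The names word_bits and overlap_with_animal are my own.

```python
W = 3                      # bits per category, as in CategoryEncoder(w=3, ...)
ACTIVATION_THRESHOLD = 2   # assumed to mirror activationThreshold=2 in the article
WORDS = ("cat", "dog", "fish", "is", "car", "bike", "airplane", "aliving")
ANIMAL = set(range(12))    # the hand-built animal pattern: first 12 bits

def word_bits(word):
    """Active bit indices of a word, following the layout in the printed output."""
    slot = WORDS.index(word) + 1   # slot 0 is assumed to be the unknown bucket
    return set(range(slot * W, (slot + 1) * W))

def overlap_with_animal(word):
    """Number of active bits the word shares with animal."""
    return len(word_bits(word) & ANIMAL)

for word in WORDS:
    overlap = overlap_with_animal(word)
    print("%-8s overlap=%d reaches_threshold=%s"
          % (word, overlap, overlap >= ACTIVATION_THRESHOLD))
```

By this count cat, dog and fish each share three bits with animal, while the remaining words share none.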

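This time I solved a simple problem, a three-step syllogism.

Next I would like to test whether HTM can also handle more complex problems with four or five steps.

I plan to keep running small experiments like this one, steadily working toward artificial intelligence (人工知性), which is also the title of this blog.

Thank you for reading!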