Creating Environment for Chatbot

  • Create a conda environment for the chatbot:

    conda create -n chatbot python=3.6 -y

  • Activate the environment:

    conda activate chatbot

  • Install all the dependencies:

    pip install nltk

    pip install numpy

    pip install tensorflow-gpu   # for GPU only (use pip install tensorflow for CPU)

    pip install tflearn

You can download the intents.json file from here

Now import the required libraries.

import nltk
nltk.download('punkt')
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()

# Libraries needed for Tensorflow processing
import tensorflow as tf
import numpy as np
import tflearn
import random
import json
import pickle
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\DIPESH\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
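  • As a quick check (an illustrative snippet, not part of the original notebook), the Lancaster stemmer aggressively shortens words; this is why the vocabulary printed later contains stems such as 'loc' and 'op'.
# see how the Lancaster stemmer reduces a few sample words
for word in ['running', 'location', 'open']:
    print(word, '->', stemmer.stem(word.lower()))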
  • Here we load our intents.json file, where we have defined our intents, patterns, responses, etc.
with open('intents.json') as json_data:
    intents = json.load(json_data)
  • This is the JSON file we created for this purpose.
intents
{'intents': [{'tag': 'greeting',
   'patterns': ['Hi', 'How are you', 'Is anyone there?', 'Hello', 'Good day'],
   'responses': ['Hello, thanks for visiting',
    'Good to see you again',
    'Hi there, how can I help?'],
   'context_set': ''},
  {'tag': 'goodbye',
   'patterns': ['Bye', 'See you later', 'Goodbye'],
   'responses': ['See you later, thanks for visiting',
    'Have a nice day',
    'Bye! Come back again soon.']},
  {'tag': 'thanks',
   'patterns': ['Thanks', 'Thank you', "That's helpful"],
   'responses': ['Happy to help!', 'Any time!', 'My pleasure']},
  {'tag': 'hours',
   'patterns': ['What hours are you open?',
    'What are your hours?',
    'When are you open?'],
   'responses': ["We're open every day 9am-9pm",
    'Our hours are 9am-9pm every day']},
  {'tag': 'machinelearning',
   'patterns': ['What is machine learning?', 'What is machine learning for?'],
   'responses': ['Machine learning is a branch of AI',
    'For predicting using data']},
  {'tag': 'location',
   'patterns': ['What is your location?',
    'Where are you located?',
    'What is your address?',
    'Where is your restaurant situated?'],
   'responses': ['We are on Lalitpur, Godawori.',
    'We are situated at Godawori Buspark',
    'Our Address is: Godawori Buspark, Lalitpur, Nepal']},
  {'tag': 'payments',
   'patterns': ['Do you take credit cards?',
    'Do you accept Mastercard?',
    'Are you cash only?'],
   'responses': ['We accept VISA, Mastercard and AMEX',
    'We accept most major credit cards']},
  {'tag': 'todaysmenu',
   'patterns': ['What is your menu for today?',
    'What are you serving today?',
    "What is today's special?"],
   'responses': ["Today's special is Chicken Tikka",
    'Our speciality for today is Chicken Tikka']},
  {'tag': 'deliveryoption',
   'patterns': ['Do you provide home delivery?',
    'Do you deliver the food?',
    'What are the home delivery options?'],
   'responses': ['Yes, we provide home delivery through UBER Eats and Zomato?',
    'We have home delivery options through UBER Eats and Zomato'],
   'context_set': 'food'},
  {'tag': 'menu',
   'patterns': ['What is your Menu?',
    'What are the main course options?',
    'Can you tell me the most delicious dish from the menu?',
    "What is the today's special?"],
   'responses': ['You can visit www.foodmandu.com for menu options',
    'You can check out the food menu at www.foodmandu.com',
    'You can check various delicacies given in the food menu at www.foodmandu.com'],
   'context_filter': 'food'}]}

Preprocessing the json data

  • We tokenize the words in each pattern.
  • Append those words to the words list.
  • Append the tokenized words along with the intent tag to an empty documents list.
  • Append each intent tag to an empty classes list.
words = []
classes = []
documents = []
ignore = ['?']
# loop through each sentence in the intent's patterns
for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenize each and every word in the sentence
        w = nltk.word_tokenize(pattern)
        # add word to the words list
        words.extend(w)
        # add word(s) to documents
        documents.append((w, intent['tag']))
        # add tags to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])
  • In this cell we stem and lowercase all the collected words, then remove the duplicates from the words and classes lists.
words = [stemmer.stem(w.lower()) for w in words if w not in ignore]
words = sorted(list(set(words)))

# remove duplicate classes
classes = sorted(list(set(classes)))

print (len(documents), "documents")
print (len(classes), "classes", classes)
print (len(words), "unique stemmed words", words)
33 documents
10 classes ['deliveryoption', 'goodbye', 'greeting', 'hours', 'location', 'machinelearning', 'menu', 'payments', 'thanks', 'todaysmenu']
59 unique stemmed words ["'s", 'acceiv', 'address', 'anyon', 'ar', 'bye', 'can', 'card', 'cash', 'cours', 'credit', 'day', 'del', 'delicy', 'delivery', 'dish', 'do', 'food', 'for', 'from', 'good', 'goodby', 'hello', 'help', 'hi', 'hom', 'hour', 'how', 'is', 'lat', 'learn', 'loc', 'machin', 'main', 'mastercard', 'me', 'menu', 'most', 'on', 'op', 'opt', 'provid', 'resta', 'see', 'serv', 'situ', 'spec', 'tak', 'tel', 'thank', 'that', 'the', 'ther', 'today', 'what', 'when', 'wher', 'yo', 'you']
  • In this cell we prepare the data for training:
training = []
output = []
# create an empty array for output
output_empty = [0] * len(classes)

# create training set, bag of words for each sentence
for doc in documents:
    # initialize bag of words
    bag = []
    # list of tokenized words for the pattern
    pattern_words = doc[0]
    # stemming each word
    pattern_words = [stemmer.stem(word.lower()) for word in pattern_words]
    # create bag of words array
    for w in words:
        bag.append(1 if w in pattern_words else 0)

    # output is '1' for current tag and '0' for rest of other tags
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1

    training.append([bag, output_row])

# shuffling features and turning the list into an np.array
random.shuffle(training)
# dtype=object avoids the VisibleDeprecationWarning about ragged nested sequences
training = np.array(training, dtype=object)

# creating training lists
train_x = list(training[:,0])
train_y = list(training[:,1])
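  • As a quick sanity check (an illustrative snippet, not part of the original notebook), each training example pairs a bag-of-words vector with one slot per vocabulary word and a one-hot label with one slot per class, i.e. lengths 59 and 10 for this dataset.
# each feature vector has len(words) entries, each label vector has len(classes) entries
print(len(train_x[0]), len(train_y[0]))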

In this cell we use the tflearn wrapper, which makes the network quicker to define and train than writing it directly in Keras.

  • We build a Deep Neural Network on top of tflearn.
  • To learn more about tflearn visit: Tflearn.
  • We give the input to tflearn as a placeholder (tflearn.input_data).
  • Then we add two fully connected layers with 10 nodes each.
  • The output layer has one node per class with a softmax activation.
  • The output of the softmax layer is the probability of each class; it is passed to the tflearn.regression function, which performs a regression (linear or logistic) on the provided input.
  • Finally, we pass the output of tflearn.regression to the tflearn Deep Neural Network model (tflearn.DNN).
tf.compat.v1.reset_default_graph()

# Building neural network
net = tflearn.input_data(shape=[None, len(train_x[0])])
net = tflearn.fully_connected(net, 10)
net = tflearn.fully_connected(net, 10)
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

# Defining model and setting up tensorboard
model = tflearn.DNN(net, tensorboard_dir='tflearn_logs')

# Start training
model.fit(train_x, train_y, n_epoch=1000, batch_size=8, show_metric=True)
model.save('model.tflearn')
Training Step: 4999  | total loss: 0.00410 | time: 0.054s
| Adam | epoch: 1000 | loss: 0.00410 - acc: 1.0000 -- iter: 32/33
Training Step: 5000  | total loss: 0.00425 | time: 0.070s
| Adam | epoch: 1000 | loss: 0.00425 - acc: 1.0000 -- iter: 33/33
--
INFO:tensorflow:d:\AI_University\Deep_and_Machine_Learning_Projects\Build_ChatBot_using_Neural_Network\model.tflearn is not in all_model_checkpoint_paths. Manually adding it.
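  • Optionally (not part of the original notebook), since we set tensorboard_dir='tflearn_logs' above, you can inspect the training curves by launching TensorBoard from the terminal:
    tensorboard --logdir tflearn_logs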
  • In this step we store the extracted words, classes, train_x and train_y into a training_data file using the pickle.dump function.
pickle.dump( {'words':words, 'classes':classes, 'train_x':train_x, 'train_y':train_y}, open( "training_data", "wb" ) )

Now Prediction Time

  • Here we restore all the saved data from the training_data file using the pickle.load function.
data = pickle.load( open( "training_data", "rb" ) )
words = data['words']
classes = data['classes']
train_x = data['train_x']
train_y = data['train_y']
  • Loading the intents.json file.
with open('intents.json') as json_data:
    intents = json.load(json_data)

Loading the trained model.

model.load('./model.tflearn')
INFO:tensorflow:Restoring parameters from d:\AI_University\Deep_and_Machine_Learning_Projects\Build_ChatBot_using_Neural_Network\model.tflearn
  • clean_up_sentence: this function takes the input sentence from the user and cleans it up; it performs word tokenization and stemming and returns the stemmed words.

  • bow: this function takes a sentence and the extracted words list as input and returns the bag-of-words array.

def clean_up_sentence(sentence):
    # tokenizing the pattern
    sentence_words = nltk.word_tokenize(sentence)
    # stemming each word
    sentence_words = [stemmer.stem(word.lower()) for word in sentence_words]
    return sentence_words

# returning bag of words array: 0 or 1 for each word in the bag that exists in the sentence
def bow(sentence, words, show_details=False):
    # tokenizing the pattern
    sentence_words = clean_up_sentence(sentence)
    # generating bag of words
    bag = [0]*len(words)  
    for s in sentence_words:
        for i,w in enumerate(words):
            if w == s: 
                bag[i] = 1
                if show_details:
                    print ("found in bag: %s" % w)

    return np.array(bag)
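  • For example (an illustrative call, not from the original notebook), you can inspect which stemmed words of a query were matched against the vocabulary.
# prints "found in bag: ..." for every vocabulary word present in the sentence
bow('What hours are you open?', words, show_details=True)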
  • classify: this function takes a sentence as input, predicts the intent of the sentence, and returns the matching intent tag(s) with their probabilities.

  • response: this function takes a sentence as input and prints a randomly chosen response from the matching intent.

ERROR_THRESHOLD = 0.30
def classify(sentence):
    # generate probabilities from the model
    results = model.predict([bow(sentence, words)])[0]
    # filter out predictions below a threshold
    results = [[i,r] for i,r in enumerate(results) if r>ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], r[1]))
    # return tuple of intent and probability
    return return_list

def response(sentence, userID='123', show_details=False):
    results = classify(sentence)
    # if we have a classification then find the matching intent tag
    if results:
        # loop as long as there are matches to process
        while results:
            for i in intents['intents']:
                # find a tag matching the first result
                if i['tag'] == results[0][0]:
                    # a random response from the intent
                    return print(random.choice(i['responses']))

            results.pop(0)

Look at the Result

classify('What are you hours of operation?')
[('hours', 0.9996307)]
response('What is machine learning?')
Machine learning is a branch of AI
response('What is menu for today?')
Today's special is Chicken Tikka
response('Do you accept Credit Card?')
We accept most major credit cards
response('Where can we locate you?')
We are on Lalitpur, Godawori.
response('That is helpful')
Happy to help!
response('Bye')
Have a nice day

Adding some context to the conversation, i.e. contextualization for filtering intents based on earlier questions.

  • The classify function is the same as above.
  • In the response function we add the contextualization part, which uses context_set and context_filter to decide which intents apply to the user's conversation.
context = {}

ERROR_THRESHOLD = 0.25
def classify(sentence):
    # generate probabilities from the model
    results = model.predict([bow(sentence, words)])[0]
    # filter out predictions below a threshold
    results = [[i,r] for i,r in enumerate(results) if r>ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append((classes[r[0]], r[1]))
    # return tuple of intent and probability
    return return_list

def response(sentence, userID='123', show_details=False):
    results = classify(sentence)
    # if we have a classification then find the matching intent tag
    if results:
        # loop as long as there are matches to process
        while results:
            for i in intents['intents']:
                # find a tag matching the first result
                if i['tag'] == results[0][0]:
                    # set context for this intent if necessary
                    if 'context_set' in i:
                        if show_details: print ('context:', i['context_set'])
                        context[userID] = i['context_set']

                    # check if this intent is contextual and applies to this user's conversation
                    if not 'context_filter' in i or \
                        (userID in context and 'context_filter' in i and i['context_filter'] == context[userID]):
                        if show_details: print ('tag:', i['tag'])
                        # a random response from the intent
                        return print(random.choice(i['responses']))

            results.pop(0)

See the result after adding contextualization

response('Can you please let me know the delivery options?')
Bye! Come back again soon.
response('What is menu for today?')
Today's special is Chicken Tikka
response("Hi there!", show_details=True)
context: 
tag: greeting
Hello, thanks for visiting
context
{'123': ''}
response('What is menu for today?')
Our speciality for today is Chicken Tikka
  • Here we ask questions whose exact wording is not in the intents.json file, and the chatbot still gives pretty good results. In this way we can add new intents and questions to the chatbot.
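  • For example (a hypothetical entry, not part of the original intents.json), a new intent follows the same structure; after adding it, re-run the preprocessing and training cells so the model learns the new patterns.
{'tag': 'reservation',
 'patterns': ['Can I book a table?', 'Do you take reservations?'],
 'responses': ['Yes, you can reserve a table by calling us.'],
 'context_set': ''}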