What is a chatbot?
A chatbot, or chatterbot, is a computer program aiming at simulating a written conversation with a human user.
Why making one?
Well, first… because it’s fun! Since Alan Turing, chatbot programming has been a way to test computer’s ability to pretend like they are human (see Turing test).
Also, chatbots can be have very useful applications, such as helping users on a website, teaching a language, etc.
The question this program will answer is: given a user input sentence, which output should we produce?
For this simple example, we are not going to try to extract the meaning of the sentences written by the user. That would be a lot of work, and it is not needed for what we want to achieve.
The program will have two distinct parts:
- Learning – when the user types a message, it is understood as an answer to previous statement made by the chatbot. The sentence typed by the human will then be associated with the words present in the previous message.
- Answering – the human message is decomposed in words. The program will try to identify which sentences correspond best to those words, according to its previous “experience”.
import re import sqlite3 from collections import Counter from string import punctuation from math import sqrt # initialize the connection to the database connection = sqlite3.connect('chatbot.sqlite') cursor = connection.cursor() # create the tables needed by the program create_table_request_list = [ 'CREATE TABLE words(word TEXT UNIQUE)', 'CREATE TABLE sentences(sentence TEXT UNIQUE, used INT NOT NULL DEFAULT 0)', 'CREATE TABLE associations (word_id INT NOT NULL, sentence_id INT NOT NULL, weight REAL NOT NULL)', ] for create_table_request in create_table_request_list: try: cursor.execute(create_table_request) except: pass def get_id(entityName, text): """Retrieve an entity's unique ID from the database, given its associated text. If the row is not already present, it is inserted. The entity can either be a sentence or a word.""" tableName = entityName + 's' columnName = entityName cursor.execute('SELECT rowid FROM ' + tableName + ' WHERE ' + columnName + ' = ?', (text,)) row = cursor.fetchone() if row: return row else: cursor.execute('INSERT INTO ' + tableName + ' (' + columnName + ') VALUES (?)', (text,)) return cursor.lastrowid def get_words(text): """Retrieve the words present in a given string of text. The return value is a list of tuples where the first member is a lowercase word, and the second member the number of time it is present in the text.""" wordsRegexpString = '(?:\w+|[' + re.escape(punctuation) + ']+)' wordsRegexp = re.compile(wordsRegexpString) wordsList = wordsRegexp.findall(text.lower()) return Counter(wordsList).items() B = 'Hello!' while True: # output bot's message print('B: ' + B) # ask for user input; if blank line, exit the loop H = raw_input('H: ').strip() if H == '': break # store the association between the bot's message words and the user's response words = get_words(B) words_length = sum([n * len(word) for word, n in words]) sentence_id = get_id('sentence', H) for word, n in words: word_id = get_id('word', word) weight = sqrt(n / float(words_length)) cursor.execute('INSERT INTO associations VALUES (?, ?, ?)', (word_id, sentence_id, weight)) connection.commit() # retrieve the most likely answer from the database cursor.execute('CREATE TEMPORARY TABLE results(sentence_id INT, sentence TEXT, weight REAL)') words = get_words(H) words_length = sum([n * len(word) for word, n in words]) for word, n in words: weight = sqrt(n / float(words_length)) cursor.execute('INSERT INTO results SELECT associations.sentence_id, sentences.sentence, ?*associations.weight/(4+sentences.used) FROM words INNER JOIN associations ON associations.word_id=words.rowid INNER JOIN sentences ON sentences.rowid=associations.sentence_id WHERE words.word=?', (weight, word,)) # if matches were found, give the best one cursor.execute('SELECT sentence_id, sentence, SUM(weight) AS sum_weight FROM results GROUP BY sentence_id ORDER BY sum_weight DESC LIMIT 1') row = cursor.fetchone() cursor.execute('DROP TABLE results') # otherwise, just randomly pick one of the least used sentences if row is None: cursor.execute('SELECT rowid, sentence FROM sentences WHERE used = (SELECT MIN(used) FROM sentences) ORDER BY RANDOM() LIMIT 1') row = cursor.fetchone() # tell the database the sentence has been used once more, and prepare the sentence B = row cursor.execute('UPDATE sentences SET used=used+1 WHERE rowid=?', (row,))
Starting with a “naïve” bot (i.e., on an empty database), here is the first “conversation” I had with M. Chatterbot:
B: Hello! H: Hi! B: Hi! H: How are you? B: How are you? H: Very well, thank you. B: Very well, thank you. H: Do you like music? B: Very well, thank you. H: Good. B: Do you like music? H: Yes, a lot! B: How are you? H: In a very good mood. B: Good. H: Indeed. B: Indeed. H: What is the weather like? B: Yes, a lot! H: Okay... B: What is the weather like? H: Quite rainy today. B: Indeed. H: I have to go. Goodbye! B: How are you? H: In a good mood. B: Indeed. H: See you later! B: Okay... H: Bye, I'll talk to you later :) B: Do you like music? H: I like it. B: I have to go. Goodbye! H: Bye! B: How are you? H:
Okay, this is not very exciting. But, right after this short phase of learning, here is what we can get:
B: Hello! H: Hi! B: How are you? H: Very well, thank you. How are you? B: In a very good mood. H: Glad to hear that. B: Bye! H: Goodbye! B: Hi! H:
B: Hello! H: Hi! How are you? B: In a good mood. H: Great! How do like chocolate? B: I like it. H: Really? Me too! What is the weather like at your place? B: Quite rainy today. H: Oh. B: See you later! H: Bye! B: Goodbye! H:
42 thoughts on “Make a chatbot with Python and SQLite”
does all chatbots based on quite simple pattern matching?
No, this is a very simple one. Some try to use syntax analysis, but the methods used are far beyond what is explained on this page…
Hi Mathieu, thank you very much for sharing your code, the idea behind it is very nice and simple!
But I have a problem when I try to execute it, my shell returns :
cursor.execute('SELECT rowid FROM ' + tableName + ' WHERE ' + columnName + ' = ?', (value,))
NameError: global name 'value' is not defined
Sorry but I’m a bit new to databases in general, I’m sure the solution is simple but I can’t find it. Would you help?
Oops, my mistake… it should have been
value. Corrected it in the above code, thanks!
Works perfectly now, thank you!
Hi, I hope I don’t bother you but I have another problem : I’m trying to build an online app using this script. I use Flask on server side and an AJAX function to get messages sent by users. When I post message on my page, the shell shows a ProgrammingError saying “SQLite objects created in a thread can only be used in that same thread.The object was created in thread id 140735192474384 and this is thread id 4492210176”
I described my problem on StackOverflow : http://stackoverflow.com/questions/27781207/chatbot-with-python-and-sqlite-multiple-thread-programmingerror
Any ideas on how to fix this?
Thanks a lot
Hi, I would recommend using SQLAlchemy to manage a connection pool (see http://stackoverflow.com/a/22739924/734335). Some day, I would need to rewrite the above code with a proper ORM… I didn’t use any in the above example because the contest I originally wrote it for (http://codegolf.stackexchange.com/a/24711/18794) didn’t allow external libraries.
How can i setup the sqlite with the bot? I mean, for make it run…
Just create a Python script with the code above. Name it (let’s be creative)
You can now execute it just by running
python chatbot.py(there is no dependency to install, as SQLite is included in modern versions of Python).
The SQLite database will automatically be populated if the tables are not already present.
I tried your program in french language. So, I modified it because french caracters like “ç” for example crash the bot. Moreother, using espeak, I make it speack in french.
Good job, that’s really fun. I wanted to create me too a chatbot using a SQL database but my knowledge in SQL interfaces is limited.
Thank you for this.
Hello, I don’t know how to word this, because I am new to programming, but I will try my best. I have used your code and it works fantastically, however, I want to improve upon it, but I do not know how too. My question is, how would you alter the code, to run this to be similar to a chat messenger app? By this I mean, that you have a pop up box where you type your answer, as the chatbot supplies its answer in the line below. Is it possible to not have the code visible at all, as a different window pops up as you chat?
Well, to do that you’d probably need something more than just python, maybe connecting it to HTML and CSS in a simple webpage. Here is how to use python on the web https://docs.python.org/2/howto/webservers.html. You can make a simple webpage without a server, just for you to see it, with a text editor, ask for more if you are interested. I am new too, so I might be wrong, but using HTML and CSS looks like the only logical way to do what you are asking for.
I used this code to make a bot for http://www.chatango.com(it’s a website that hosts chatrooms). I wanted the bot to be able to talk, not just execute commands with ch.py, so I used your code (which I hope is okay with you). The bot connects to the chatroom, and to respond, it waits for a message that will start with it’s name, then answers. I thought it would work fine, but the problem is, that when I talk to the bot, it doesn’t seem to actually understand what I am saying and store my input and answer in the same way later etc. It just picks a random answer, from the input it has already stored. (If you talk to it, it stores your input, but it always answers randomly). Since the user has to be the first to say something, in my program, the bot does not talk first, it waits for you to say anything, then prints what var B is set in the beginning of the program (“Hello!” for example) and then goes on with H.strip() and the rest of the code. Could that be my error? Can anyone here help me? Sorry for the long question.
Thanks in advance.
Hey, great tutorial. Can you please help me make the change from sqllite3 to mysql, as i am going to be using this on a raspberry pi. I would like it to be in mysql so then i can just copy the sql code to a new server on a new raspberry pi. Once, i have got the mysql working i am going to add eSpeak so then the bot can speak its responses.
Awesome tutorial, but when I tried to run it I got a ValueError: too many values to unpack…..this error was on line words_length = sum([n * len(word) for word, n in words])
Okay, my bad I found out that it was the second line stating words_length = sum([n * len(word) for word, n in words]) that was giving the error and it was because I had left out the n on n in words.
Hello Mathieu Rodic gtreat job with this tutorial!
Ypu said “this is a very simple one. Some try to use syntax analysis,” on another question and i want to know if you have or known a tutorial who cover that part? i’m very interested on it.
sorry my terrible english and keep doing that great job
Thank you for your feedback 🙂
Sorry, I did not have time to make a tutorial to cover more complex aspects of chatbots…
Hello, I think your code is very good. If I understood correctly, in this code the bot learn from what humans say, am I right? I mean that the bot will grow its database every time when someone speaks with it and then when it is spoken to, it will search the matching phrase. Isn’t this how the machine learning work? What are the things that doesn’t qualify this as machine learning bot?
It is some kind of machine learning, because the machine learns something…
…but what people call machine learning nowadays either involves neural networks, or genetic algorithm. In that sense, this program doesn’t really qualify.
Thank you Mathieu, it is very funny and usefull. I think with little patience I will have a very smart chatbot.
You’re welcome! I would be curious to see the result 🙂
Hi Mathieu! That’s a cool bot! i learn it but i still consufed with the code. i’ll try to understand your code, hmm can i using postgreSQL with this code? should i modify the code to make the AI more understand with my question?
Btw, Thanks! it really help me!
Hello! You can use PostgreSQL instead of SQLite for database.
For this kind of chatbot, the only way to make it “cleverer” is to give it a lot of examples. This is just a toy program, not some professional stuff 🙂
You should seek another kind of bot if you have more specific application in mind.
Iz it possible to make the bot to some sort of server that can handle many clients
Hello mathieu, in interested to know if there is a counter file for this, I.E when it I put in
Is that another file?
This is the Python regex module: https://docs.python.org/2/library/re.html
Good morning, Mathieu!
Thanks for sharing this code 🙂
I am interested in building a…
Rather silly chatbot.
In fact I’d like the chatbot just to understand a few specific keywords and enable / disable devices in my house:
“Turn on kitchen light”
“Turn off computer”
“What is the temperature”
Do you have any suggestion to build such a bot?
Thanks for your time 🙂
I can see two options for you:
But first you would need to treat the input as a list of words (e.g. with
re.findall(r'\w+', input_text)), to facilitate further treatment.
Mathieu, thanks for this, however when I tried to execute this, I receive the following error when I run it:
Traceback (most recent call last) :
File “chatbot.py”, line 64 in
H = raw_input (‘H: ‘).strip()
NameError: name ‘raw_input’ is not defined.
Which version of Python are you using? 2 or 3?
Thanks a lot! I have been looking for something like this to play around with for quite some time now, so I’m glad I found this!
Hello rodic, your program is really nice :).
I just want to know the chatbot can print hour and date.
in primary lines i import
mtn = datetime.datetime.now()
and after break break
HUMAIN = input(‘\033[1;32;40mHUMAIN:’).strip() #input ou raw_input selon la version
if HUMAIN == ”:
elif “date” in HUMAIN :
but this doesn’t work in the program can you yelp me thank you 🙂 and other time very nice work.
Hello Rodic, I followed your code, and try to put in action, and it works, but I want to know, how do you decide to calculate the weight? Why did you use:
> words_length = sum([n * len(word) for word, n in words])
> weight = sqrt(n / float(words_length))
Where did you take the formulas? I hope you can answer the question, because I didn’t understand, the reason. Thank 🙂
Hello, this formula takes the number of times a word is present in what the user wrote, and counterbalances that number by the total number of words to give a score for that word.
Thanks a lot for this very good basis !
I am making some little arrangements such as :
– encapsulating the whole script into a Chatbot class to be used elsewhere,
– replacing the ‘print’ and ‘input’ functions by a ‘sey’ and ‘get_response’ methods that can be anything more than just the console.
Now I am looking for another method to make it learn previous conversations to accelerate the process. Any idea ?
Other methods would be more complex, and might involve either machine learning (Theano with RNN could be an option), syntax analysis (see CONLL-X), or both.
I am very curious about the encapsulation, did you provide a public a example?
iI would like to take your code and have it call my wiki app, how would I do that?
I don’t really put a licence on this snippet… but please feel free to reuse it however you like 🙂
I am still getting this error
cursor.execute(‘SELECT rowid FROM ‘ + tableName + ‘ WHERE ‘ + columnName + ‘ = ?’, (text,))
sqlite3.OperationalError: no such table: sentences
Any suggestion how to get around it?
Hello, I corrected my code, in particular the part about creating the tables. Does it work for you now?