Developing Alexa Skills with MongoDB and Golang
There is no question that Amazon Alexa, and virtual assistants in general, are hugely popular. Having a web application and a mobile application isn't enough for most organizations anymore; now you need to start supporting voice-operated applications.
So what does it take to create something for Alexa? How different is it from creating a web application?
In this tutorial, we're going to see how to create an Amazon Alexa Skill, also referred to as an Alexa application, that interacts with a MongoDB cluster using the Go programming language (Golang) and AWS Lambda.
A few requirements must be met prior to starting this tutorial:
- Golang must be installed and configured
- A MongoDB Atlas cluster
If you don't have a MongoDB Atlas cluster, you can configure one for free. For this example an M0 cluster is more than sufficient.
Make sure the Atlas cluster has the proper IP addresses on the Network Access List for AWS services. If AWS Lambda cannot reach your cluster then requests made by Alexa will fail.
Having an Amazon Echo or other Amazon Alexa enabled device is not necessary to be successful with this tutorial. Amazon offers a really great simulator that can be used directly in the web browser.
When it comes to building an Alexa Skill, it doesn't matter if you start with the code or the design. For this tutorial we're going to start with the design, directly in the Amazon Developer Portal for Alexa.
Sign into the portal and choose to create a new custom Skill. After creating the Skill, you'll be brought to a dashboard with several checklist items.
In the checklist, you should take note of the following:
- Invocation Name
- Intents, Samples, and Slots
- Endpoint
There are other items, one being optional and the other being checked naturally as the others complete.
The first step is to define the invocation name. This is the name that users will use when they speak to their virtual assistant. It should not be confused with the Skill name because the two do not need to match. The Skill name is what would appear in the online marketplace.
For our invocation name, let's use recipe manager, something that is easy to remember and easy to pronounce. With the invocation name in place, we can anticipate using our Skill like the following:

    Alexa, ask Recipe Manager to INTENT

The user would not literally speak INTENT in the command. The intent
is the command that will be defined through sample utterances, also
known as sample phrases or data. You can, and probably should, have
multiple intents for your Skill.
Let's start by creating an intent titled GetIngredientsForRecipeIntent with the following sample utterances:

    what ingredients do i need for {recipe}
    what do i need to cook {recipe}
    to cook {recipe} what ingredients do i need

There are a few things to note about the above phrases:
- The {recipe} tag is a slot variable which is going to be user defined when spoken.
- Every possible spoken phrase to execute the command should be listed.
Alexa operates from machine learning, so the more sample data the better. When defining the {recipe} variable, it should be assigned a type of AMAZON.Food.
When all is said and done, you could execute the intent by doing something like:

    Alexa, ask Recipe Manager what do I need to cook Chocolate Chip Cookies

Having one intent in your Alexa Skill is no fun, so let's create another intent with its own set of sample phrases. Choose to create a new intent titled GetRecipeFromIngredientsIntent with the following sample utterances:

    what can i cook with {ingredientone} and {ingredienttwo}
    what are some recipes with {ingredientone} and {ingredienttwo}
    if i have {ingredientone} and {ingredienttwo} what can i cook

This time around we're using two slot variables instead of one. As previously mentioned, it is probably a good idea to add significantly more sample utterances to get the best results. Alexa needs to be able to process the spoken data in order to send it to your Lambda function.
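To make the hand-off to Lambda more concrete, here is a minimal sketch of how an intent and its slot values might look once they arrive as JSON. The types below are a simplified, hypothetical subset written for illustration (they are not the SDK's own types), and the sample payload is hand-written rather than captured from a real device.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Slot, Intent, and Request are a simplified, hypothetical subset of the
// request payload. A real request carries many more fields.
type Slot struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type Intent struct {
	Name  string          `json:"name"`
	Slots map[string]Slot `json:"slots"`
}

type Request struct {
	Body struct {
		Type   string `json:"type"`
		Intent Intent `json:"intent"`
	} `json:"request"`
}

// samplePayload is an illustrative, hand-written example.
const samplePayload = `{
  "version": "1.0",
  "request": {
    "type": "IntentRequest",
    "intent": {
      "name": "GetRecipeFromIngredientsIntent",
      "slots": {
        "ingredientone": {"name": "ingredientone", "value": "flour"},
        "ingredienttwo": {"name": "ingredienttwo", "value": "sugar"}
      }
    }
  }
}`

// parseRequest decodes a JSON payload into the simplified Request type.
func parseRequest(payload []byte) (Request, error) {
	var req Request
	err := json.Unmarshal(payload, &req)
	return req, err
}

func main() {
	req, err := parseRequest([]byte(samplePayload))
	if err != nil {
		panic(err)
	}
	fmt.Println("intent:", req.Body.Intent.Name)
	fmt.Println("first ingredient:", req.Body.Intent.Slots["ingredientone"].Value)
	fmt.Println("second ingredient:", req.Body.Intent.Slots["ingredienttwo"].Value)
}
```

The slot names in the payload match the names used inside the curly braces of the sample utterances, which is why they must be spelled consistently in the portal and in the code.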
At this point, the configuration in the Alexa Developer Portal is nearly complete. The exception is the endpoint, which doesn't exist yet.
For the most part, Alexa should now be able to direct requests, so we need to create our backend to receive and process them. This is where Lambda, Go, and MongoDB come into play.
Assuming Golang has been properly installed and configured, create a new project within your $GOPATH and within that project, create a main.go file. As boilerplate to get the ball rolling, this file should contain the following:

    package main

    func main() { }

With the boilerplate code added, now we can install the MongoDB Go driver. To do this, you could in theory do a go get, but the preferred approach as of now is to use the dep package management tool for Golang. To do this, after having installed the tool, execute the following:

    dep init
    dep ensure -add "go.mongodb.org/mongo-driver/mongo"

We're using dep so that the version of the driver that we're using in our project is version locked.
In addition to the MongoDB Go driver, we're also going to need to get the AWS Lambda SDK for Go as well as an unofficial SDK for Alexa, since no official SDK exists. To do this, we can execute:

    dep ensure -add "github.com/arienmalec/alexa-go"
    dep ensure -add "github.com/aws/aws-lambda-go/lambda"

With the dependencies available to us, we can modify the project's main.go file. Open the file and add the following code:

    package main

    import (
        "context"
        "os"

        "go.mongodb.org/mongo-driver/mongo"
        "go.mongodb.org/mongo-driver/mongo/options"
    )

    // Stores a handle to the collection being used by the Lambda function
    type Connection struct {
        collection *mongo.Collection
    }

    func main() {
        ctx := context.Background()
        client, err := mongo.Connect(ctx, options.Client().ApplyURI(os.Getenv("ATLAS_URI")))
        if err != nil {
            panic(err)
        }

        defer client.Disconnect(ctx)

        connection := Connection{
            collection: client.Database("alexa").Collection("recipes"),
        }

        // Placeholder so this intermediate step compiles; the handle is put to use in the next step
        _ = connection
    }

In the main function we are creating a client using the connection string of our cluster. In this case, I'm using an environment variable on my computer that points to my MongoDB Atlas cluster. Feel free to configure that connection string however you feel the most confident.
Upon connecting, we are getting a handle of a recipes collection for an alexa database and storing it in a Connection data structure. Because we won't be writing any data in this example, both the alexa database and the recipes collection should exist prior to running this application.
You can check out more information about connecting to MongoDB with the Go programming language in a previous tutorial I wrote.
So why are we storing the collection handle in a Connection data structure?
AWS Lambda behaves a little differently than a typical web application. Instead of running the main function and then remaining alive for as long as your server remains alive, Lambda functions tend to suspend or shut down when they are not used. For this reason, we cannot rely on our connection being available, and we also don't want to establish too many connections to our database in the scenario where our function hasn't shut down. To handle this, we can pass the connection from our main function to our logic function.
Let's make a change to see this in action:

    package main

    import (
        "context"
        "os"

        "github.com/arienmalec/alexa-go"
        "github.com/aws/aws-lambda-go/lambda"
        "go.mongodb.org/mongo-driver/mongo"
        "go.mongodb.org/mongo-driver/mongo/options"
    )

    // Stores a handle to the collection being used by the Lambda function
    type Connection struct {
        collection *mongo.Collection
    }

    func (connection Connection) IntentDispatcher(ctx context.Context, request alexa.Request) (alexa.Response, error) {
        // Alexa logic here...
        // Placeholder return so this step compiles; the real logic is added later
        return alexa.Response{}, nil
    }

    func main() {
        ctx := context.Background()
        client, err := mongo.Connect(ctx, options.Client().ApplyURI(os.Getenv("ATLAS_URI")))
        if err != nil {
            panic(err)
        }

        defer client.Disconnect(ctx)

        connection := Connection{
            collection: client.Database("alexa").Collection("recipes"),
        }

        lambda.Start(connection.IntentDispatcher)
    }

Notice in the above code that we've added a lambda.Start call in our main function that points to an IntentDispatcher function. We're designing this function to use the connection information established in the main function, which, based on our Lambda knowledge, may not run every time the function is executed.
So we've got the foundation to our Alexa Skill in place. Now we need to design the logic for each of our intents that were previously defined in the Alexa Developer Portal.
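The idea of setting up the connection once and reusing it across invocations can be sketched without any AWS or MongoDB dependencies. In the example below, fakeCollection, connect, and the connects counter are hypothetical stand-ins for the real collection handle and driver call, and the handler signature is simplified to a plain string.

```go
package main

import "fmt"

// fakeCollection is a hypothetical stand-in for *mongo.Collection.
type fakeCollection struct {
	name string
}

// connects counts how many times the (simulated) connection was opened.
var connects int

// connect simulates the one-time setup that main performs.
func connect() *fakeCollection {
	connects++
	return &fakeCollection{name: "recipes"}
}

// Connection mirrors the tutorial's data structure, using the stand-in type.
type Connection struct {
	collection *fakeCollection
}

// IntentDispatcher is a simplified handler that reuses the stored handle.
func (connection Connection) IntentDispatcher(intent string) string {
	return fmt.Sprintf("handled %s using %s", intent, connection.collection.name)
}

func main() {
	// Setup happens once, as it would when the function starts cold.
	connection := Connection{collection: connect()}

	// The method value is bound to the connection, similar in spirit to
	// passing connection.IntentDispatcher to lambda.Start.
	handler := connection.IntentDispatcher

	// Several invocations reuse the same handle without reconnecting.
	for _, intent := range []string{"GetIngredientsForRecipeIntent", "GetRecipeFromIngredientsIntent"} {
		fmt.Println(handler(intent))
	}
	fmt.Println("connections opened:", connects)
}
```

Because the handler is a method value, it carries the Connection with it, so no global variable is needed to share the handle between invocations.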
Since this is going to be a recipe related Skill, let's model our MongoDB documents like the following:

    {
        "_id": ObjectID("234232358943"),
        "name": "chocolate chip cookies",
        "ingredients": [
            "flour",
            "egg",
            "sugar",
            "chocolate"
        ]
    }

There is no doubt that our documents could be more extravagant, but for this example it will work out fine. Within the MongoDB Atlas cluster, create the alexa database if it doesn't already exist and add a document modeled like the above in a recipes collection.
In the main.go file of the project, add the following data structure:

    // A data structure representation of the collection schema
    // Requires importing "go.mongodb.org/mongo-driver/bson/primitive"
    type Recipe struct {
        ID          primitive.ObjectID `bson:"_id"`
        Name        string             `bson:"name"`
        Ingredients []string           `bson:"ingredients"`
    }

With the MongoDB Go driver, we can annotate Go data structures with BSON
so that way we can easily map between the two. It essentially makes our
lives a lot easier when working with MongoDB and Go.
Let's circle back to the IntentDispatcher function:

    func (connection Connection) IntentDispatcher(ctx context.Context, request alexa.Request) (alexa.Response, error) {
        var response alexa.Response
        switch request.Body.Intent.Name {
        case "GetIngredientsForRecipeIntent":
        case "GetRecipeFromIngredientsIntent":
        default:
            response = alexa.NewSimpleResponse("Unknown Request", "The intent was unrecognized")
        }
        return response, nil
    }

Remember the two intents from the Alexa Developer Portal? We need to assign logic to them.
Essentially, we're going to do some database logic and then use the NewSimpleResponse function to create a response from the results.
Let's start with the GetIngredientsForRecipeIntent logic:

    // Requires the "errors" and "strings" packages as well as "go.mongodb.org/mongo-driver/bson"
    case "GetIngredientsForRecipeIntent":
        var recipe Recipe
        recipeName := request.Body.Intent.Slots["recipe"].Value
        if recipeName == "" {
            return alexa.Response{}, errors.New("Recipe name is not present in the request")
        }
        if err := connection.collection.FindOne(ctx, bson.M{"name": recipeName}).Decode(&recipe); err != nil {
            return alexa.Response{}, err
        }
        response = alexa.NewSimpleResponse("Ingredients", strings.Join(recipe.Ingredients, ", "))

In the above snippet, we are getting the slot variable that was passed and are issuing a FindOne query against the collection. The filter for the query says that the name field of the document must match the recipe that was passed in as a slot variable.
If there was a match, we are serializing the array of ingredients into a string and are returning it back to Alexa. In theory, Alexa should then read back the comma separated list of ingredients.
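Because the filter is an exact match on the name field, and the sample document stores its name in lowercase, it may be worth normalizing the spoken value before querying. The sketch below assumes slot values can arrive with inconsistent casing or spacing, which is an assumption rather than something the tutorial states. Both normalizeRecipeName and ingredientsSpeech are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeRecipeName is a hypothetical helper that lowercases the spoken
// value and collapses extra whitespace so it can match a lowercase name
// stored in the collection.
func normalizeRecipeName(spoken string) string {
	return strings.ToLower(strings.Join(strings.Fields(spoken), " "))
}

// ingredientsSpeech is a hypothetical helper that turns the ingredients
// array into the comma separated text Alexa would read back.
func ingredientsSpeech(ingredients []string) string {
	if len(ingredients) == 0 {
		return "I could not find any ingredients for that recipe"
	}
	return strings.Join(ingredients, ", ")
}

func main() {
	fmt.Println(normalizeRecipeName("  Chocolate  Chip Cookies "))
	fmt.Println(ingredientsSpeech([]string{"flour", "egg", "sugar", "chocolate"}))
}
```

With helpers like these, the normalized name would be used in the FindOne filter and the joined string would be handed to NewSimpleResponse.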
Now let's take a look at the GetRecipeFromIngredientsIntent intent logic:

    case "GetRecipeFromIngredientsIntent":
        var recipes []Recipe
        ingredient1 := request.Body.Intent.Slots["ingredientone"].Value
        ingredient2 := request.Body.Intent.Slots["ingredienttwo"].Value
        cursor, err := connection.collection.Find(ctx, bson.M{
            "ingredients": bson.D{
                {"$all", bson.A{ingredient1, ingredient2}},
            },
        })
        if err != nil {
            return alexa.Response{}, err
        }
        if err = cursor.All(ctx, &recipes); err != nil {
            return alexa.Response{}, err
        }
        var recipeList []string
        for _, recipe := range recipes {
            recipeList = append(recipeList, recipe.Name)
        }
        response = alexa.NewSimpleResponse("Recipes", strings.Join(recipeList, ", "))

In the above snippet, we are taking both slot variables that represent ingredients and are using them in a Find query on the collection. This time around we are using the $all operator because we want to filter for all recipes that contain both ingredients anywhere in the array.
With the results of the Find, we can create an array of the recipe names and serialize it to a string to be returned as part of the Alexa response.
If you'd like more information on the Find and FindOne commands for Go and MongoDB, check out my how to read documents tutorial on the subject.
While it might seem simple, the code for the Alexa Skill is actually complete. We've coded scenarios for each of the two intents that we've set up in the Alexa Developer Portal. We could improve upon what we've done or create more intents, but it is out of the scope of what we want to accomplish.
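To illustrate what the $all filter in the GetRecipeFromIngredientsIntent logic is asking for, here is an in-memory sketch of the same matching rule: a recipe qualifies only when every requested ingredient appears somewhere in its ingredients array. This is not how MongoDB evaluates the query internally. The Recipe type is trimmed down, containsAll and recipeNamesWithAll are hypothetical names, and the omelette entry is made-up sample data.

```go
package main

import (
	"fmt"
	"strings"
)

// Recipe is a trimmed-down version of the tutorial's data structure.
type Recipe struct {
	Name        string
	Ingredients []string
}

// containsAll reports whether every wanted value appears in have,
// regardless of position.
func containsAll(have, wanted []string) bool {
	set := make(map[string]struct{}, len(have))
	for _, h := range have {
		set[h] = struct{}{}
	}
	for _, w := range wanted {
		if _, ok := set[w]; !ok {
			return false
		}
	}
	return true
}

// recipeNamesWithAll returns the names of the recipes that contain all of
// the wanted ingredients.
func recipeNamesWithAll(recipes []Recipe, wanted ...string) []string {
	var names []string
	for _, r := range recipes {
		if containsAll(r.Ingredients, wanted) {
			names = append(names, r.Name)
		}
	}
	return names
}

func main() {
	// Illustrative data; only the first entry comes from the tutorial.
	recipes := []Recipe{
		{Name: "chocolate chip cookies", Ingredients: []string{"flour", "egg", "sugar", "chocolate"}},
		{Name: "omelette", Ingredients: []string{"egg", "butter"}},
	}
	fmt.Println(strings.Join(recipeNamesWithAll(recipes, "flour", "sugar"), ", "))
}
```

Note that when nothing matches, the resulting list is empty, so the joined string is empty as well. A friendlier Skill might return a short "no recipes found" message in that case.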
Now that we have our application, we need to build it for Lambda.
Execute the following commands:

    GOOS=linux go build
    zip handler.zip ./project-name

So what's happening in the above commands? First we are building a Linux compatible binary. We're doing this because if you're developing on Mac or Windows, you're going to end up with a binary that is incompatible. By defining the operating system, we're telling Go what to build for.
For more information on cross-compiling with Go, check out my Cross Compiling Golang Applications For Use On A Raspberry Pi post.
Next, we are creating an archive of our binary. It is important to replace project-name with the name of your actual binary. Remember the name of the file, because it is used in the Lambda dashboard.
When you choose to create a new Lambda function within AWS, make sure Go is the development technology. Choose to upload the ZIP file and add the name of the binary as the handler.
Now it comes down to linking Alexa with Lambda.
Take note of the ARN value of your Lambda function. This will be added in the Alexa Portal. Also, make sure you add the Alexa Skills Kit as a trigger to the function. It is as simple as selecting it from the list.
Navigate back to the Alexa Developer Portal and choose the Endpoint checklist item. Add the ARN value to the default region and choose to build the Skill using the Build Model button.
When the Skill is done building, you can test it using the simulator that Amazon offers as part of the Alexa Developer Portal. This simulator can be accessed using the Test tab within the portal.
If you've used the same sample utterances that I have, you can try entering something like this:

    ask recipe manager what can i cook with flour and sugar
    ask recipe manager what chocolate chip cookies requires

Of course the assumption is that you also have collection entries for chocolate chip cookies and the various ingredients that I used above. Feel free to modify the variable terms with those of your own data.
You just saw how to build an Alexa Skill with MongoDB, Golang, and AWS Lambda. Knowing how to develop applications for voice assistants like Alexa is great because they are becoming increasingly popular, and the good news is that they aren't any more difficult than writing standard applications.
As previously mentioned, MongoDB Atlas makes pairing MongoDB with Lambda and Alexa very convenient. You can use the free tier or upgrade to something better.
If you'd like to expand your Alexa with Go knowledge and get more practice, check out a previous tutorial I wrote titled Build an Alexa Skill with Golang and AWS Lambda.