Create a custom Alexa Skill with AWS Lambda - Pt 2 (Alexa Skill)

This is part 2 of a series on creating an Alexa Skill using Lambda. Read part 1 for an overview.

Before Alexa can understand the commands you speak to it, you need to create a custom "skill" in the Alexa Developer Console. A skill is a set of configuration settings that determine which voice commands are parsed and which lambda function handles them.

Note that skills do not contain any code themselves; they are just collections of config settings. Check out my overview article to see where skills fit into a serverless architecture.

Invocation name

As you create a new skill, one of the first decisions to make is your skill's invocation name. This is the keyword or phrase which tells Alexa that the commands which immediately follow are for your skill, eg Alexa, open `My Fancy Skill` (where `My Fancy Skill` is your invocation name). Note that the invocation name does not have to match your skill's actual name. It just has to be a fairly distinctive and short keyword.

Try a few invocation names. The first few I tried did not work too well: Alexa would have difficulty understanding the words and reply with an error message like "unable to find a skill with that name". Distinctive words that are short but multi-syllable seemed to work the best.

Intent Schema

Intents are entry-points to your code. Your skill is configured with a list of "sample utterances", which are sample voice commands. These utterances are then mapped to "intents". When Alexa hears your voice command, it sends a dictionary object to your lambda function, and the matched intent is found at event['request']['intent'], with its name under the 'name' key. Your lambda function can then use the intent name to handle the command accordingly.

Something like this:
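As a minimal sketch in Python, assuming the intent names from the sample schema in this article: the helper `build_speech_response` and all the reply texts are made up for illustration, and a real skill would do actual work in each branch.

```python
def build_speech_response(text):
    """Wrap plain text in the JSON envelope an Alexa skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def lambda_handler(event, context):
    request = event["request"]

    # Only IntentRequests carry an intent; a LaunchRequest, for example, does not.
    if request["type"] != "IntentRequest":
        return build_speech_response("Welcome.")

    intent_name = request["intent"]["name"]

    # Dispatch on the intent name (reply texts are placeholders).
    if intent_name == "RemainingIntent":
        return build_speech_response("Here is what you have remaining.")
    if intent_name == "MacroIntent":
        return build_speech_response("Here is that macro.")
    if intent_name in ("AMAZON.CancelIntent", "AMAZON.StopIntent"):
        return build_speech_response("Goodbye.")
    return build_speech_response("Sorry, I did not understand that.")
```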

Here is what a sample intent schema looks like:


        {
          "intents": [
            {
              "intent": "RemainingIntent"
            },
            {
              "intent": "MacroIntent",
              "slots": [
                {
                  "name": "Macro",
                  "type": "LIST_OF_MACROS"
                }
              ]
            },
            {
              "intent": "AMAZON.CancelIntent"
            },
            {
              "intent": "AMAZON.StopIntent"
            }
          ]
        }

As you can see, the schema is a dictionary with "intents" as the only key. Its value is an array of one or more intent objects. Some notes:

An intent can either be a custom one which you create, or one of the built-in intents like "AMAZON.StopIntent". These built-in intents come with pre-defined utterances, so that common keywords ("stop", "help", etc) are already included.
Each intent can have one or more slots, which are variables in your utterances. For example, an utterance for "What time is it" does not require any slots since there are no variables, while another utterance for "What is the price of bananas" will probably need a slot for "Item" (eg bananas).
Each slot has a type, which is either a custom slot type (a list of values you configure with your skill) or a built-in slot type, which makes it easier to parse the variables in the utterance.
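To show how a slot reaches your code, here is a small sketch of reading the "Macro" slot from the request in Python. The helper name `get_slot_value` and the spoken value "protein" are assumptions for illustration; the event is trimmed down to the keys used here.

```python
def get_slot_value(event, slot_name):
    """Return the spoken value for a slot, or None if it was not filled."""
    intent = event["request"].get("intent", {})
    slot = intent.get("slots", {}).get(slot_name, {})
    return slot.get("value")


# Trimmed-down example request for MacroIntent (slot value is hypothetical).
sample_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "MacroIntent",
            "slots": {"Macro": {"name": "Macro", "value": "protein"}},
        },
    }
}

macro = get_slot_value(sample_event, "Macro")
```

Returning None for a missing slot lets the handler ask the user to repeat the command instead of raising a KeyError.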

Sample Utterances

Utterances are sample voice commands which your skill will handle. Each utterance must map to one intent from your intent schema. Some notes:

Alexa is not quite at the stage where it uses NLP to understand commands. It relies on you to provide a fairly comprehensive list of possible utterances, which must match the user's voice commands quite closely. So try to include a number of variations for how users may request the same thing, and map it to the same intent.
If you get an error when saving utterances, chances are good there is an error in either your intent schema or custom slot types. So check that all your intents are valid, and all slot types have been saved successfully.
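One way to catch such mismatches before saving is a quick local check. The sketch below assumes utterances are written one per line, starting with the intent name and followed by the phrase, with slots in curly braces. The function `check_utterances` and the example phrases are hypothetical, and the check only covers intent and slot names, not slot types.

```python
import re


def check_utterances(schema, utterances):
    """Return a list of problems found in the sample utterance lines."""
    # Map each intent name to the set of slot names it declares.
    slots_by_intent = {
        item["intent"]: {slot["name"] for slot in item.get("slots", [])}
        for item in schema["intents"]
    }
    problems = []
    for line in utterances:
        intent_name, _, phrase = line.strip().partition(" ")
        if intent_name not in slots_by_intent:
            problems.append(f"unknown intent: {intent_name}")
            continue
        # Every {Slot} in the phrase must be declared on that intent.
        for slot_name in re.findall(r"\{(\w+)\}", phrase):
            if slot_name not in slots_by_intent[intent_name]:
                problems.append(f"unknown slot {slot_name} in {intent_name}")
    return problems


schema = {
    "intents": [
        {"intent": "RemainingIntent"},
        {"intent": "MacroIntent",
         "slots": [{"name": "Macro", "type": "LIST_OF_MACROS"}]},
    ]
}
utterances = [
    "RemainingIntent what do I have left",
    "MacroIntent how much {Macro} is left",
    "MacroIntent how many {Item} are left",
    "PriceIntent what is the price",
]
problems = check_utterances(schema, utterances)
```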

Endpoint

In the configuration section, you set which endpoint handles your Alexa voice commands. This could be any normal URL handled by a web server, but I chose to link this to a lambda function.

If you choose a lambda function, ensure it is in an AWS region which supports "Alexa Skills Kit" as a trigger. The trigger is set in the Lambda configuration. At the time of writing, one of the limited options was 'us-east-1'.

Account Linking

Account linking is the process of linking your Alexa skill with some 3rd party application. This is a bit more complicated, and is only required if you use functions that require logging in to another application.

The main setting here is the "Authorization URL", which is the URL the user is redirected to in order to link their account when they first add your skill. I created a separate lambda function sitting behind an API Gateway for this, with more detail to come in another article. See how it fits in the overview.

There are 2 grant types you can choose, either implicit grant or auth code grant. I implemented the implicit grant, so the flow looks like this:

When a user first enables your skill, they are prompted to link accounts by being redirected to the "Authorization URL" you listed.
The user is redirected here with some additional URL parameters, namely state, redirect_uri, response_type, client_id.
Of these, only "state" and "redirect_uri" are needed in your lambda function. Your function should grant a new access_token for the user, and then redirect back to the same "redirect_uri" with some extra parameters at the end of the URL, including "state".
The URL you redirect to should include 4 parameters: vendorId (already included in the "redirect_uri" given to your function), state, access_token, and token_type. Note that the vendorId parameter is followed by a hash (ie "#") and not an ampersand (ie "&") as it would be for most URL parameters, so the remaining parameters are passed in the URL fragment.

So after your lambda function grants an access_token, you should be redirecting to a URL like:

https://pitangui.amazon.com/spa/skill/account-linking-status.html?vendorId=ABC123#state=def123&access_token=123456&token_type=Bearer
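A rough sketch of that redirect step in Python follows. It assumes the lambda function sits behind API Gateway with proxy integration (so query parameters arrive in `event["queryStringParameters"]` and a 302 is returned as a dictionary). The function name `build_redirect_url` is hypothetical, and the token is just a random placeholder: a real function would log the user in first and store the token against their account.

```python
import secrets
from urllib.parse import urlencode


def build_redirect_url(redirect_uri, state, access_token):
    """Append the implicit grant parameters after a '#', not an '&'."""
    fragment = urlencode({
        "state": state,
        "access_token": access_token,
        "token_type": "Bearer",
    })
    return f"{redirect_uri}#{fragment}"


def lambda_handler(event, context):
    params = event.get("queryStringParameters") or {}

    # Placeholder token; persisting it and authenticating the user are omitted.
    access_token = secrets.token_urlsafe(32)

    location = build_redirect_url(
        params["redirect_uri"], params["state"], access_token
    )
    # Send the browser back to Alexa with the token in the URL fragment.
    return {"statusCode": 302, "headers": {"Location": location}}
```

Echoing "state" back unchanged matters, since Alexa uses it to match the response to the linking request it started.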

Wrapping Up

This should cover the main tricky bits. I found the other sections straightforward. Setting up the lambda functions was a bit trickier though, and I'll expand on those shortly.