Learn how to obtain Twitter API credentials, interact with the Twitter API using Python, and retrieve information from public Twitter accounts, like tweets.

In this unit, we'll begin writing code for the **Celebrity Match** application.

We'll start by using the Twitter API to retrieve tweets from two different Twitter users and compare them.

Twitter is one of the most popular social media tools, which makes it easy to retrieve authored content, but you can also use other authored content, like blogs, papers, articles, and more. 

Overview

Before we can interact with the Twitter API, we'll need to import the following Python packages:

* `sys` - a standard library module that provides access to variables and functions maintained by the Python interpreter, such as command-line arguments.

* `operator` - a standard library module that exposes Python's built-in operators as functions, such as comparing two strings or multiplying two numbers.

* `requests` - a package that makes it easy to make HTTP requests, so we don't have to code the HTTP interactions by hand.

* `json` - a package that makes it easier to work with <a href="http://www.json.org/" target="_blank">JSON</a> objects in Python. JSON is one of the most widely used formats for data exchange on the Internet today.

* `twitter` - the package needed to interact with the Twitter API.

* `watson_developer_cloud` - the package needed to interact with the Personality Insights API.

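The `twitter` and `watson_developer_cloud` modules come from the `python-twitter` and `watson-developer-cloud` packages we manually installed earlier, so these imports will fail if those packages are missing. `sys`, `operator`, and `json` ship with Python itself, and `requests` is a third-party package that `pip` typically installs alongside the other two as a dependency.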

The packages will also help us directly communicate with the Twitter API and format the results we'll display to users.

Import Packages

Now that we have the basic setup in our file, it's time to start retrieving some data from Twitter. 

Before we can interact with the Twitter API, we need to authenticate with the Twitter API. To do this, make sure you have your Twitter credentials handy:

1. Start by navigating to <a href="https://apps.twitter.com" target="_blank">https://apps.twitter.com</a> (make sure you are logged in to Twitter, or create an account first)
2. Click on "Create New App"
3. Fill out the "Application Details"
4. Navigate to the "Keys and Access Tokens" tab
5. Take note of your `Consumer Key (API Key)` and `Consumer Secret (API Secret)`
6. Click on "Create my access token" under "Token Actions" at the bottom of the page
7. Take note of your `Access Token` and `Access Token Secret`

Once you have the credentials, add them to the correct variables in the following lines of code:

```py
twitter_consumer_key = ''
twitter_consumer_secret = ''
twitter_access_token = ''
twitter_access_secret = ''
```

These credentials will allow us to call the Twitter API using the Twitter package we imported earlier.
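
Hard-coding secrets in a source file makes them easy to leak by accident. As a minimal sketch, assuming you would rather keep the four values in environment variables, the helper below reads them and reports which ones are missing. The environment variable names and the `load_twitter_credentials` helper are hypothetical; they are not part of this course or of the `python-twitter` package.

```python
import os

# Hypothetical environment variable names; pick whatever names suit your setup.
ENV_NAMES = {
    "twitter_consumer_key": "TWITTER_CONSUMER_KEY",
    "twitter_consumer_secret": "TWITTER_CONSUMER_SECRET",
    "twitter_access_token": "TWITTER_ACCESS_TOKEN",
    "twitter_access_secret": "TWITTER_ACCESS_SECRET",
}


def load_twitter_credentials(env=None):
    """Return (credentials, missing) read from a mapping such as os.environ."""
    if env is None:
        env = os.environ
    credentials = {name: env.get(var, "") for name, var in ENV_NAMES.items()}
    # List the environment variables that are unset or empty.
    missing = sorted(var for name, var in ENV_NAMES.items() if not credentials[name])
    return credentials, missing


# Example with a stand-in mapping instead of the real environment.
credentials, missing = load_twitter_credentials({"TWITTER_CONSUMER_KEY": "abc"})
```

If `missing` is not empty, the application can stop early with a clear message instead of failing later during authentication.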

Authenticating with Twitter

To interact with the Twitter API, we first need to create an <a href="https://en.wikipedia.org/wiki/Instance_(computer_science)" target="_blank">instance</a> of the `Api` class provided by the `twitter` package we imported earlier.

We can do this by calling `twitter.Api()` and assigning the result to a variable, like `twitter_api`. The call takes the four credentials you retrieved earlier as keyword arguments, like so:

```py
twitter_api = twitter.Api(
    consumer_key=twitter_consumer_key,
    consumer_secret=twitter_consumer_secret,
    access_token_key=twitter_access_token,
    access_token_secret=twitter_access_secret
)
```

Querying Twitter I

Great! But to retrieve data from Twitter, we need to make an actual call to the API. So far, we've only created the object (`twitter_api`) that _represents_ the Twitter API we'll interact with.

To make the call, we'll do the following:

1. Create a `handle` variable (will be set to a Twitter username)
2. Call the `GetUserTimeline()` method on the `twitter_api` object we created in the last exercise and pass in the following three arguments to the method:

* Twitter handle of the celebrity
* The number of desired Tweets (`count`)
* A retweet flag (`include_rts` is set to `False` in order to avoid retrieving retweets)

3\. Create a `statuses` variable set to the code from Step 2 above

Here's an example:

```py
# This example uses Codecademy's twitter handle
handle = "@Codecademy"

statuses = twitter_api.GetUserTimeline(screen_name=handle, count=200, include_rts=False)
```

Querying Twitter II

We've done a lot of work, but we still haven't seen any data from Twitter. Let's fix that!

The `GetUserTimeline()` method returns data from Twitter in the form of a <a href="https://docs.python.org/2.7/tutorial/datastructures.html" target="_blank">Python list</a>, so we can use a <a href="https://www.codecademy.com/en/courses/python-beginner-en-cxMGf/1/5?curriculum_id=4f89dab3d788890003000096" target="_blank">for loop</a> to print out the tweets from the data we retrieved.

For example:

```py
statuses = twitter_api.GetUserTimeline(screen_name=handle, count=200, include_rts=False)

for status in statuses:
    print(status)
```

Understanding Twitter Results I 

The data we retrieve from Twitter doesn't include just tweets, it also contains a lot of <a href="https://en.wikipedia.org/wiki/Metadata" target="_blank">metadata</a>.

Most of the metadata is information that we won't use (e.g., when the tweet was created, how many people "favorited" it, what hashtags were used, and what other Twitter users were mentioned). The same data also includes the language of each tweet and its actual text.

We're really only interested in one specific piece of data: the actual _text_ of the tweets. 

We can modify our code to print only the text we need, like so: 

```py
for status in statuses:
    print(status.text)
```

Understanding Twitter Results II

Great! Now that we have the text we need, it's time to prepare it so that we can send it to the Personality Insights (PI) API for analysis.

First, we'll <a href="https://www.codecademy.com/en/courses/python-beginner-sRXwR/3/1?curriculum_id=4f89dab3d788890003000096" target="_blank">concatenate</a> the text into one long string and then send it off to PI to be analyzed. We'll save the long string into a variable called `text`.

We also only want the tweets that are in English, so we'll need to filter by language.

**Note:** The text retrieved from Twitter is in <a href="https://en.wikipedia.org/wiki/Unicode" target="_blank">Unicode</a> format, but we need <a href="https://en.wikipedia.org/wiki/UTF-8" target="_blank">UTF-8</a> format, so we'll need to encode it. Thankfully, the `encode()` method in Python solves that problem, so we'll use that. 

Here's how the modified code looks:

```py
statuses = twitter_api.GetUserTimeline(screen_name=handle, count=200, include_rts=False)

text = ""

for status in statuses:
    if status.lang == 'en':  # keep only English tweets
        text += status.text.encode('utf-8')
```

The code above does the following:

1. Creates an empty `text` variable we'll save tweets into
2. Keeps only the tweets written in English
3. Appends UTF-8 encoded tweets to the `text` variable using the `+=` operator
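
To see the filtering and concatenation steps in isolation, here is a minimal sketch that runs without calling Twitter. `FakeStatus` is a hypothetical stand-in that only mimics the `lang` and `text` attributes used above. The sketch targets Python 3, where strings are already Unicode, so it joins the text first and encodes once at the end. Joining with a space is a choice made here so that adjacent tweets don't run together.

```python
from collections import namedtuple

# Hypothetical stand-in for the status objects returned by the Twitter package.
FakeStatus = namedtuple("FakeStatus", ["lang", "text"])


def english_text(statuses):
    """Join the text of the English-language statuses into one string."""
    return " ".join(status.text for status in statuses if status.lang == "en")


statuses = [
    FakeStatus(lang="en", text="Learning Python"),
    FakeStatus(lang="es", text="Hola"),
    FakeStatus(lang="en", text="at a caf\u00e9"),
]

text = english_text(statuses)
encoded = text.encode("utf-8")  # bytes; the accented character takes two bytes
```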

Preparing Results for Analysis

Using the Twitter API

In this unit, you'll learn how to set up and authenticate the Twitter API in order to retrieve content from public Twitter accounts.

Retrieving Content from Twitter

Our friends at IBM updated the Watson API used in this course so this content is now out of date. Interested in similar content? Check out the Build Chatbots with Python Skill Path!

In this lesson, you'll demo Celebrity Match, an application that's powered by one of the many Watson APIs: the Personality Insights API.

Bluemix is IBM's cloud platform that enables developers and organizations to quickly and easily create, deploy, and manage applications on the cloud. 

Bluemix also provides developers with enterprise-level services that can easily integrate with your own cloud applications, without needing to know how to install or configure them. You can learn more about the services <a href="https://console.ng.bluemix.net/docs/overview/index.html" target="_blank">here</a>.

In the next lesson, we'll review the variety of services available in Bluemix.

Overview: Bluemix

The Watson Developer Cloud is a _collection_ of <a href="https://en.wikipedia.org/wiki/Application_programming_interface" target="_blank">APIs</a> that enables developers to add cognitive capabilities to their applications. All of the APIs are available for use by applications built in Bluemix.

Specifically, the APIs offer cognitive capabilities in the following four main categories:

* Language - use unstructured bodies of text to retrieve insights from that text. Insights can span a wide range, from the personality traits of the author who created the content to specific details on what the content is about and what it may be related to.

* Speech - interact with your users in many different languages using speech-to-text and text-to-speech.

* Vision - understand the world around you by retrieving insights on the users who created certain images. 

* Data Insight - access data pipelines and help make decisions on competing criteria.
  
Learn more about the services <a href="https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/" target="_blank">here</a>.

Overview: Watson Developer Cloud

Let's demo a working application that showcases one of the 19 APIs available within the Watson Developer Cloud: the **Personality Insights (PI)** API. 

The name of this application is **Celebrity Match**.

The application does the following:

1. Accepts two Twitter usernames as variables
2. Retrieves the last 200 Tweets from each Twitter username 
3. Sends the text of the 200 Tweets to the Personality Insights (PI) API (covered in more detail later in this course) to gain insights on the two users
4. Compares the users to one another 
5. Displays the results of the comparison

The final results displayed will be the top 5 traits shared between the two Twitter users (for example: you and a celebrity of your choice). 

The results are displayed in the following format:

Matched Personality Trait `->` Probability of your profile exhibiting given trait `->` Probability of Celebrity's profile exhibiting given trait. 
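
As a rough illustration of that format, the hypothetical `format_match` helper below builds one output line. The two-decimal rounding is an assumption made for readability, not something the application specifies.

```python
def format_match(trait, user_probability, celebrity_probability):
    """Build one 'trait -> user -> celebrity' result line."""
    return "{} -> {:.2f} -> {:.2f}".format(
        trait, user_probability, celebrity_probability
    )


line = format_match("Openness", 0.9, 0.8)
print(line)
```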


**Note:** To successfully run this demo, a Twitter account and Twitter API credentials are _required_. <a href="https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api" target="_blank">The Twitter API documentation</a> describes how to obtain API credentials.

A Working App: Celebrity Match

Let's review what actually happened in the demo:

1. Both Twitter usernames (handles) are checked to make sure they're valid.
2. A call to the Twitter API retrieves up to 200 of the most recent tweets from your Twitter feed (excluding any retweets).
3. A call to the Twitter API retrieves up to 200 of the most recent tweets from the celebrity's Twitter feed (excluding any retweets).
4. Your 200 tweets are sent as a single body of text to the Watson Personality Insights (PI) API to be analyzed.
5. The celebrity's 200 tweets are sent as a single body of text to the Watson Personality Insights (PI) API to be analyzed.
6. The application sorts and compares the two sets of Watson PI results to find the traits that both bodies of text share most closely.
7. The application prints the results to the user.
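
The flow above can be sketched as a single function. This is only an outline under stated assumptions: `fetch_text` and `get_traits` are hypothetical callables standing in for the Twitter and Personality Insights calls, handle validation (step 1) is left out, and closeness is measured as the absolute difference between trait scores.

```python
def celebrity_match(user_handle, celebrity_handle, fetch_text, get_traits, top_n=5):
    """Return the top_n closest traits as (trait, user_score, celebrity_score)."""
    # Steps 2-3: fetch each user's tweets as one body of text.
    user_text = fetch_text(user_handle)
    celebrity_text = fetch_text(celebrity_handle)

    # Steps 4-5: turn each body of text into a {trait: score} dictionary.
    user_traits = get_traits(user_text)
    celebrity_traits = get_traits(celebrity_text)

    # Step 6: rank the traits both users have by how close their scores are.
    shared = set(user_traits) & set(celebrity_traits)
    ranked = sorted(
        shared, key=lambda t: (abs(user_traits[t] - celebrity_traits[t]), t)
    )

    # Step 7: hand back the closest matches for display.
    return [(t, user_traits[t], celebrity_traits[t]) for t in ranked[:top_n]]


# Stand-in data so the sketch runs without network access.
fake_texts = {"@you": "text a", "@celebrity": "text b"}
fake_traits = {
    "text a": {"Openness": 0.75, "Trust": 0.25, "Modesty": 0.5},
    "text b": {"Openness": 0.5, "Trust": 1.0, "Modesty": 0.5},
}

matches = celebrity_match(
    "@you", "@celebrity", fake_texts.get, fake_traits.get, top_n=2
)
```

Passing the two calls in as arguments keeps the outline testable without credentials; the real application wires in the Twitter and PI code built later in the course.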

In this course, we'll build the **Celebrity Match** application from the ground up, beginning by installing the necessary Python packages, setting up Bluemix and Watson, and using the Twitter API and the PI API to display the results.

You'll need the following to build the application:

1. A <a href="https://twitter.com/signup?lang=en" target="_blank">Twitter account</a> (create an account if you don't already have one)

2. An IBM Bluemix account (you can create an account later in the course)

Let's get started!

The Application Flow

A Watson-Powered Application

Use PIP and the command line to install the required Twitter and Watson Developer Cloud Python packages.

The **Celebrity Match** application we're building requires a couple of Python packages to be installed. We'll use these packages in our code.

To install the packages required for this project, we will need to use a package manager.

<a href="https://en.wikipedia.org/wiki/Pip_(package_manager)" target="_blank">Pip</a> is the package manager used to install Python packages. It looks up packages in the Python Package Index (PyPI), the official repository for third-party Python packages.

In the terminal, we'll use the `pip` command to install the following necessary packages for this project:

1. `python-twitter`
2. `watson-developer-cloud`

Installing Packages

Let's make sure that the packages installed properly. 

The easiest way to check is by running the following command in the terminal:

```shell
pip freeze
``` 

This command displays the installed Python packages.

If everything installed correctly, you should see a list of all the packages that are installed, which should include `python-twitter` and `watson-developer-cloud`.
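
The same check can be done from Python. This is a minimal sketch, assuming Python 3.8 or later (where `importlib.metadata` is part of the standard library); the `installed_version` helper is hypothetical.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages this project needs.
for name in ("python-twitter", "watson-developer-cloud"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```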

Verifying Installs

Using PIP

Sign up for a Bluemix account, learn how to obtain Personality Insights API credentials, interact with Personality Insights using Python, and analyze authored content from Twitter.

We'll use Personality Insights (PI) to analyze the data (tweets) that the **Celebrity Match** application retrieves from Twitter.

PI is just one of many services available in the Watson Developer Cloud and is part of the Bluemix Services Catalog. 

PI uses linguistic analytics to infer personality and social characteristics from unstructured text (like the Twitter data we retrieve). This service infers personality characteristics by using three models: 

* Big Five - the most generally used model which describes 5 dimensions of the personality (Agreeableness, Conscientiousness, Extraversion, Emotional Range, and Openness).
* Needs - describes what product attributes will resonate well with the person.
* Values - describes the factors that will motivate a user's decision making.
 
You can learn more about the models and their dimensions <a href="https://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/personality-insights/models.shtml" target="_blank">here</a>.

To get inspired and see use cases with Personality Insights and other Watson APIs, check out IBM's <a href="https://www.ibm.com/marketplace/learning-lab/use-cases/us/en-us" target="_blank">Learning Lab</a>.

Overview: Personality Insights

Personality Insights is a service that is part of the Watson Developer Cloud. The Watson Developer Cloud has a single software development kit (SDK) that allows users to access all of the available services. 

To use the SDK, you'll need an IBM Bluemix username and password. After creating a Bluemix account, you'll be able to use Personality Insights in the **Celebrity Match** application (and beyond Codecademy):

1. <a href="https://console.ng.bluemix.net/registration/?cm_mc_uid=41462758644914551453431&cm_mc_sid_50200000=1455145343" target="_blank">Log in to Bluemix (or create an account)</a> (**Note:** Bluemix credentials are required to successfully complete the application we're building)
2. Create an instance of the Personality Insights service 
3. Click on “Service Credentials” in the navigation area to the left to access the username and password for the Personality Insights service.
 
For details on how to add a service instance to your account, follow the steps in the <a href="https://www.ng.bluemix.net/docs/services/reqnsi.html#add_service" target="_blank">IBM Bluemix documentation</a>.


IBM Bluemix

Before we can send the content we retrieve from Twitter to Personality Insights to be analyzed, we need to set up the credentials for Personality Insights.

We can do this by adding two variables that will store the username and password that are _unique to the Personality Insights instance_ you created in the previous exercise.

For example:

```py
text = "" 

for s in statuses:
    if (s.lang =='en'):
        text+=s.text.encode('utf-8')

#The IBM Bluemix credentials for Personality Insights!
pi_username = ''
pi_password = ''
```

Configuring Personality Insights Package

Next, we'll make use of the `watson_developer_cloud` Python package we imported earlier.

Similar to how we interacted with the Twitter API, we'll create an instance, initialize it with a username and password, and set it equal to a variable called `personality_insights`, like so:

```py
personality_insights = PersonalityInsights(username=pi_username, password=pi_password) 
```

The variable `personality_insights` _represents_ the Personality Insights API that we'll interact with.

Using the Personality Insights Package

Using the Personality Insights API

Finalize the Personality Insights results by computing the shared personality traits between the two Twitter users.

In this unit, we'll be adding the code that will analyze the Twitter data we retrieve from Twitter users.

At the moment, however, our **Celebrity Match** application can be tough to read or understand for other developers. Let's start by making our code easier to use (and easier to read). 

Code Consolidation

The `analyze()` function we just created will return the results from the Personality Insights (PI) API in <a href="http://www.json.org/" target="_blank">JSON</a> format. Whenever we use the `analyze()` function, we'll need to parse and flatten the JSON results to make use of them later.

By default, PI creates a tree structure for various categories. These categories are broken into Personality, Values, and Needs. These are then broken into subcategories, and finally, broken into the actual traits.

For this application, we'll compare the traits by flattening the structure and then extracting only the traits.
	
The `flatten()` function below will flatten the JSON structure that the `analyze()` function returns from PI.

```py
def flatten(orig):
    """Flatten the PI result tree into a {trait id: percentage} dictionary."""
    data = {}
    for c in orig['tree']['children']:
        if 'children' in c:
            for c2 in c['children']:
                if 'children' in c2:
                    for c3 in c2['children']:
                        if 'children' in c3:
                            # Record each child trait in the personality category
                            for c4 in c3['children']:
                                if c4['category'] == 'personality':
                                    data[c4['id']] = c4['percentage']
                        elif c3['category'] == 'personality':
                            # c3 has no children, so record c3 itself
                            data[c3['id']] = c3['percentage']
    return data
```

It's not important to understand exactly how the `flatten()` function works. What's important is to make sure we include it as part of our application so that we can properly process PI results.
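
For readers who do want to see the idea in a smaller form, here is a hedged sketch of the same traversal written recursively. It is an alternative illustration, not the function used in this course: `flatten_recursive` is a hypothetical name, and the sample tree is made up, mirroring only the keys that `flatten()` reads (`tree`, `children`, `id`, `category`, `percentage`).

```python
def flatten_recursive(node, data=None):
    """Collect {id: percentage} for childless nodes in the 'personality' category."""
    if data is None:
        data = {}
    children = node.get("children")
    if children:
        # Keep walking down until we reach nodes without children.
        for child in children:
            flatten_recursive(child, data)
    elif node.get("category") == "personality" and "percentage" in node:
        data[node["id"]] = node["percentage"]
    return data


# Made-up sample shaped like the tree that flatten() expects.
sample = {
    "tree": {
        "children": [
            {"id": "personality", "children": [
                {"id": "Openness_parent", "children": [
                    {"id": "Openness", "category": "personality",
                     "percentage": 0.9, "children": [
                         {"id": "Adventurousness", "category": "personality",
                          "percentage": 0.7},
                         {"id": "Imagination", "category": "personality",
                          "percentage": 0.4},
                     ]},
                ]},
            ]},
            {"id": "needs", "children": [
                {"id": "Liberty_parent", "children": [
                    {"id": "Liberty", "category": "needs", "percentage": 0.2},
                ]},
            ]},
        ]
    }
}

traits = flatten_recursive(sample["tree"])
```

Only the two childless personality nodes end up in `traits`; the `needs` entry is skipped because of its category.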

The flatten() Function

The `flatten()` function flattens the results from a user and stores the results in a <a href="https://docs.python.org/2.7/tutorial/datastructures.html#dictionaries" target="_blank">dictionary</a>. The next step is to write a function that can compare two dictionaries (the user's and the celebrity's).

To compare them we will start with one of the returned results (let's say your traits) and then compare them with celebrity's traits. Then, we'll compute the "distance" between the traits and finally create a new dictionary that stores the trait and distance. 
	
```py
def compare(dict1, dict2):
    # Map each trait to the distance between the two users' scores
    compared_data = {}
    for keys in dict1:
        if dict1[keys] != dict2[keys]:
            compared_data[keys] = abs(dict1[keys] - dict2[keys])
    return compared_data
```

Again, it's not important to understand exactly how the `compare()` function works. What's important is to make sure we include it as part of our application so that we can view PI results.
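
A small worked example may still help. The sketch below repeats the comparison on two made-up trait dictionaries; the `key in dict2` guard is an addition made here so that a trait missing from the second dictionary is skipped instead of raising a `KeyError`.

```python
def compare(dict1, dict2):
    """Map each differing shared trait to the distance between the two scores."""
    compared = {}
    for key in dict1:
        # The membership check is an assumption added for this sketch.
        if key in dict2 and dict1[key] != dict2[key]:
            compared[key] = abs(dict1[key] - dict2[key])
    return compared


# Made-up scores for illustration only.
user = {"Openness": 0.75, "Altruism": 0.5, "Trust": 0.25}
celebrity = {"Openness": 0.5, "Altruism": 0.5, "Trust": 1.0}

distances = compare(user, celebrity)
```

Note that `Altruism` does not appear in `distances`: traits with identical scores are left out by the `!=` check.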

The compare() Function

With all of our functionality in place, we now need two Twitter users to compare. We'll manually enter the Twitter usernames by creating two variables that you can set to any handles you like:

```py
user_handle = "@Codecademy"
celebrity_handle = "@IBM"
```

Twitter Handles

Now that we have two Twitter usernames, we can use the `analyze()` function that we created earlier to retrieve some data from Twitter, like so:

```py
user_result = analyze(user_handle)
celebrity_result = analyze(celebrity_handle)
```

Analyze

Now let's use the two functions we added earlier to format our results.

First, we'll use the `flatten()` function to flatten the JSON structure that the `analyze()` function returns, like so:

```py
#First, flatten the results from the Watson PI API
user = flatten(user_result)
celebrity = flatten(celebrity_result)
```

Then, we'll use the `compare()` function to compare the user and celebrity in order to gain insights on their personalities.

```py
#Then, compare the results of the Watson PI API by calculating the distance between traits
compared_results = compare(user,celebrity)
```

Flattening & Comparing

Finally, to view the results, we'll have to sort them and then display the top 5 traits from the sorted results.

These will be the top 5 traits you share with the celebrity you selected for the application.

```py
sorted_result = sorted(compared_results.items(), key=operator.itemgetter(1))

for keys, value in sorted_result[:5]:
    # Trait, user score -> celebrity score -> distance between the two
    print("%s %s -> %s -> %s" % (keys, user[keys], celebrity[keys], compared_results[keys]))
```
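
To make the sorting step concrete, here is a minimal sketch on made-up distances. `operator.itemgetter(1)` picks the second element of each `(trait, distance)` pair, so the pairs are ordered from the smallest distance (closest match) to the largest.

```python
import operator

# Made-up distances for illustration only.
compared_results = {"Openness": 0.25, "Trust": 0.75, "Modesty": 0.125}

# Sort the (trait, distance) pairs by distance, smallest first.
sorted_result = sorted(compared_results.items(), key=operator.itemgetter(1))

# Keep the two closest traits (the application keeps five).
top_two = sorted_result[:2]
```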

Sorting and Outputting the Results

Congratulations! You built a fully functional, Watson-powered application that does the following:

* Queries the Twitter API
* Queries the Watson Personality Insights API
* Compares two bodies of text (tweets) from two Twitter users
* Displays the top 5 personality traits shared between the two Twitter users

In this course, you used the Personality Insights API &mdash; one of the many APIs available in the Watson Developer Cloud. Be sure to check out the additional APIs and learn about how they can serve your application's needs.

If you'd like to learn more about Bluemix, additional Watson Developer Cloud services, and view more courses and Watson use cases, please visit IBM's <a href="https://www.ibm.com/marketplace/learning-lab/us/en-us" target="_blank">Learning Lab</a>.

Conclusion

Gaining Insights

### Why Learn the Watson API?

IBM Watson is one of the most powerful AI systems in the world. Learn how to plug your code into the Watson API to use its amazing functionality.

### Take-Away Skills:
In this course, you'll use Python to interact with the Twitter API and IBM's Personality Insights API in order to analyze traits shared between two Twitter users.

In this unit, you'll learn about two powerful IBM products: Bluemix, a cloud development platform, and Watson, a collection of powerful, cognitive APIs.

Introduction to Bluemix and Watson

The Celebrity Match application requires a couple of third-party Python packages. In this unit, you'll install the packages using PIP, Python's package manager.

PIP: The Python Package Manager

In this unit, learn how to set up and use Watson's Personality Insights API to analyze bodies of authored content.

Configuring Personality Insights

In this unit you'll add critical Python functions to analyze and display the final personality results for two Twitter users.

Final Analysis

Use IBM's Personality Insights API to analyze traits shared between two Twitter users.

Learn the Watson API
