[Tutorial] [Python] Analytics on Teradata Database with Vantage

Hey team Teradata,

I come across a lot of people who have trouble connecting to Teradata Database from the platform of their choice.

For Python users who want to inspect a database or its data and run analytics on either, I have put together a Jupyter Notebook to help new users get started.

 


Re: [Tutorial] [Python] Analytics on Teradata Database with Vantage

@ahmadmansoor  This is really great.

I am not able to download the notebook from the link: it asks for SharePoint login details, so it may be a Teradata internal site.

Would it be possible to share it in the download section so everybody can benefit? Also, do you have any documentation for it?

Thanks again,

Re: [Tutorial] [Python] Analytics on Teradata Database with Vantage

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Sciences on Teradata Database by ~A7 [AM250152]\n",
    "Requirements: Python 3.5+, teradatasql, teradataml and a Vantage Machine."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "┌┬┐┌─┐┌┬┐┌─┐  ┌─┐┌─┐┬┌─┐┌┐┌┌─┐┌─┐\n",
      " ││├─┤ │ ├─┤  └─┐│  │├┤ ││││  ├┤ \n",
      "─┴┘┴ ┴ ┴ ┴ ┴  └─┘└─┘┴└─┘┘└┘└─┘└─┘\n",
      "Welcome to Data Sciences with A7™\n",
      "using Pandas version: 0.23.4\n",
      "using SciKit-Learn version: 0.19.2\n",
      "using Python version: 3.5.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:11:22) \n",
      "[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]\n"
     ]
    }
   ],
   "source": [
    "print(\"┌┬┐┌─┐┌┬┐┌─┐  ┌─┐┌─┐┬┌─┐┌┐┌┌─┐┌─┐\")\n",
    "print(\" ││├─┤ │ ├─┤  └─┐│  │├┤ ││││  ├┤ \")\n",
    "print(\"─┴┘┴ ┴ ┴ ┴ ┴  └─┘└─┘┴└─┘┘└┘└─┘└─┘\")\n",
    "\n",
    "import os, time,sys\n",
    "for char in \"Welcome to Data Sciences with A7™\":\n",
    "    time.sleep(0.05)\n",
    "    print(char, end='', flush=True)\n",
    "\n",
    "# os.system(\"pip install libraries/teradatasql-16.20.0.39-py3-none-any.whl\")\n",
    "# os.system(\"pip install libraries/teradataml-16.20.0.0-py3-none-any.whl\")\n",
    "\n",
    "import teradatasql\n",
    "import teradataml as tdml\n",
    "\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import sklearn # classic ML\n",
    "#import keras # deep learning\n",
    "\n",
    "print(\"\\nusing Pandas version:\",pd.__version__)\n",
    "print(\"using SciKit-Learn version:\", sklearn.__version__)\n",
    "print(\"using Python version:\", sys.version)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this demo, we are going to run some analytics on the Iris dataset. This Notebook is intended to give Python data scientists a jump start on analyzing client-side data existing in Teradata Databases."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Configurations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "host = \"sdt19085.labs.teradata.com\"\n",
    "username = \"user1\"\n",
    "password = \"user1\"\n",
    "database_name = \"demo\"\n",
    "table_name = \"iris\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Connect to SQL Database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "con = teradatasql.connect(None, host=host, user=username, password=password)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load Data\n",
    "We are going to work with two tables: iris and iris_large"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2315912</td>\n",
       "      <td>6.283448</td>\n",
       "      <td>8.236815</td>\n",
       "      <td>10.391486</td>\n",
       "      <td>20.720223</td>\n",
       "      <td>versicolor</td>\n",
       "      <td>FF</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3984584</td>\n",
       "      <td>5.605049</td>\n",
       "      <td>4.246290</td>\n",
       "      <td>5.814689</td>\n",
       "      <td>4.570939</td>\n",
       "      <td>setosa</td>\n",
       "      <td>BF</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5653256</td>\n",
       "      <td>5.312167</td>\n",
       "      <td>6.365951</td>\n",
       "      <td>7.940895</td>\n",
       "      <td>6.101106</td>\n",
       "      <td>setosa</td>\n",
       "      <td>EB</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1145313</td>\n",
       "      <td>6.949874</td>\n",
       "      <td>10.257845</td>\n",
       "      <td>4.598636</td>\n",
       "      <td>5.890818</td>\n",
       "      <td>virginica</td>\n",
       "      <td>CI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2813985</td>\n",
       "      <td>5.086145</td>\n",
       "      <td>4.033024</td>\n",
       "      <td>9.856735</td>\n",
       "      <td>7.284636</td>\n",
       "      <td>setosa</td>\n",
       "      <td>DD</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         0         1          2          3          4           5   6  7\n",
       "0  2315912  6.283448   8.236815  10.391486  20.720223  versicolor  FF  0\n",
       "1  3984584  5.605049   4.246290   5.814689   4.570939      setosa  BF  1\n",
       "2  5653256  5.312167   6.365951   7.940895   6.101106      setosa  EB  0\n",
       "3  1145313  6.949874  10.257845   4.598636   5.890818   virginica  CI  0\n",
       "4  2813985  5.086145   4.033024   9.856735   7.284636      setosa  DD  0"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table_name = \"iris_large\"\n",
    "cursor = con.cursor()\n",
    "cursor.execute(\"SELECT TOP 300 * FROM {}\".format(database_name+\".\"+table_name))\n",
    "raw_data = pd.DataFrame(cursor.fetchall())\n",
    "raw_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Modify Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sepal-length</th>\n",
       "      <th>sepal-width</th>\n",
       "      <th>petal-length</th>\n",
       "      <th>petal-width</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6.283448</td>\n",
       "      <td>8.236815</td>\n",
       "      <td>10.391486</td>\n",
       "      <td>20.720223</td>\n",
       "      <td>versicolor</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.605049</td>\n",
       "      <td>4.246290</td>\n",
       "      <td>5.814689</td>\n",
       "      <td>4.570939</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.312167</td>\n",
       "      <td>6.365951</td>\n",
       "      <td>7.940895</td>\n",
       "      <td>6.101106</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6.949874</td>\n",
       "      <td>10.257845</td>\n",
       "      <td>4.598636</td>\n",
       "      <td>5.890818</td>\n",
       "      <td>virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.086145</td>\n",
       "      <td>4.033024</td>\n",
       "      <td>9.856735</td>\n",
       "      <td>7.284636</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   sepal-length  sepal-width  petal-length  petal-width       class\n",
       "0      6.283448     8.236815     10.391486    20.720223  versicolor\n",
       "1      5.605049     4.246290      5.814689     4.570939      setosa\n",
       "2      5.312167     6.365951      7.940895     6.101106      setosa\n",
       "3      6.949874    10.257845      4.598636     5.890818   virginica\n",
       "4      5.086145     4.033024      9.856735     7.284636      setosa"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_data = raw_data.drop([0,6,7],axis=1)\n",
    "cleaned_data.columns = [\"sepal-length\", \"sepal-width\",\"petal-length\",\"petal-width\",\"class\"]\n",
    "cleaned_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Initial Analysis of Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       sepal-length  sepal-width  petal-length  petal-width\n",
      "count    300.000000   300.000000    300.000000   300.000000\n",
      "mean       5.727166     8.056860      7.350811     9.512028\n",
      "std        0.811897     2.811425      2.108194     8.420680\n",
      "min        4.172506     1.209392      1.935968    -8.943831\n",
      "25%        5.068811     5.930002      5.900518     4.808855\n",
      "50%        5.616010     7.794138      7.406402     6.256710\n",
      "75%        6.348974     9.884348      8.665233    12.063914\n",
      "max        7.919086    16.079922     12.967613    46.799579 \n",
      "\n",
      "Class Distribution: \n",
      " class\n",
      "setosa        120\n",
      "versicolor     94\n",
      "virginica      86\n",
      "dtype: int64\n"
     ]
    }
   ],
   "source": [
    "print(cleaned_data.describe(), \"\\n\")\n",
    "print(\"Class Distribution: \\n\", cleaned_data.groupby('class').size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Visual Analysis of Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_data.plot(kind='box', subplots=True, layout=(2,2), sharex=False, sharey=False, title=\"Iris Dataset: Box and Whiskers Plot\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_data.hist()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.plotting.scatter_matrix(cleaned_data)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prediction: Training Models (SkLearn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import model_selection\n",
    "from sklearn.metrics import classification_report\n",
    "from sklearn.metrics import confusion_matrix\n",
    "from sklearn.metrics import accuracy_score\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n",
    "from sklearn.naive_bayes import GaussianNB\n",
    "from sklearn.svm import SVC\n",
    "\n",
    "array = cleaned_data.values\n",
    "X = array[:,0:4]\n",
    "Y = array[:,4]\n",
    "validation_size = 0.20\n",
    "seed = 7\n",
    "X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)\n",
    "scoring = 'accuracy'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "LR: 0.762500 (0.064684)\n",
      "LDA: 0.783333 (0.078617)\n",
      "KNN: 0.829167 (0.070833)\n",
      "CART: 0.795833 (0.092139)\n",
      "NB: 0.858333 (0.053359)\n",
      "SVM: 0.808333 (0.046398)\n"
     ]
    }
   ],
   "source": [
    "# Create a list of different models\n",
    "\n",
    "models = []\n",
    "models.append(('LR', LogisticRegression()))\n",
    "models.append(('LDA', LinearDiscriminantAnalysis()))\n",
    "models.append(('KNN', KNeighborsClassifier()))\n",
    "models.append(('CART', DecisionTreeClassifier()))\n",
    "models.append(('NB', GaussianNB()))\n",
    "models.append(('SVM', SVC()))\n",
    "\n",
    "# evaluate each model in a loop\n",
    "results = []\n",
    "names = []\n",
    "for name, model in models:\n",
    "\tkfold = model_selection.KFold(n_splits=10)  # unshuffled folds; random_state only has an effect with shuffle=True and newer scikit-learn rejects it otherwise\n",
    "\tcv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)\n",
    "\tresults.append(cv_results)\n",
    "\tnames.append(name)\n",
    "\tmsg = \"%s: %f (%f)\" % (name, cv_results.mean(), cv_results.std())\n",
    "\tprint(msg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compare Algorithms\n",
    "\n",
    "fig = plt.figure()\n",
    "fig.suptitle('Algorithm Comparison')\n",
    "ax = fig.add_subplot(111)\n",
    "plt.boxplot(results)\n",
    "ax.set_xticklabels(names)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Accuracy: \n",
      " 0.85\n",
      "Confusion Matrix: \n",
      " [[26  0  0]\n",
      " [ 7 13  0]\n",
      " [ 1  1 12]]\n",
      "Report: \n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "     setosa       0.76      1.00      0.87        26\n",
      " versicolor       0.93      0.65      0.76        20\n",
      "  virginica       1.00      0.86      0.92        14\n",
      "\n",
      "avg / total       0.87      0.85      0.85        60\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Make predictions on validation dataset using LDA\n",
    "lda = LinearDiscriminantAnalysis()\n",
    "lda.fit(X_train, Y_train)\n",
    "predictions = lda.predict(X_validation)\n",
    "print(\"Accuracy: \\n\", accuracy_score(Y_validation, predictions))\n",
    "print(\"Confusion Matrix: \\n\", confusion_matrix(Y_validation, predictions))\n",
    "print(\"Report: \\n\", classification_report(Y_validation, predictions))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using Teradata ML Library on Vantage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create Connection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "conn_ml = tdml.create_context(host = host, username=username, password = password)\n",
    "\n",
    "from teradataml.analytics.NaiveBayes import NaiveBayes\n",
    "from teradataml.analytics.NaiveBayesPredict import NaiveBayesPredict\n",
    "from teradataml.dataframe.dataframe import DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "table_name = \"iris\"\n",
    "iris_input_train = DataFrame.from_query(\"SELECT * FROM {} WHERE id MOD 5 <> 0\".format(database_name+\".\"+table_name))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Analyze data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "    id  sepal_length  sepal_width  petal_length  petal_width     species\n",
       "0  138           6.4          3.1           5.5          1.8   virginica\n",
       "1   32           5.4          3.4           1.5          0.4      setosa\n",
       "2  124           6.3          2.7           4.9          1.8   virginica\n",
       "3    8           5.0          3.4           1.5          0.2      setosa\n",
       "4   56           5.7          2.8           4.5          1.3  versicolor\n",
       "5  137           6.3          3.4           5.6          2.4   virginica\n",
       "6  136           7.7          3.0           6.1          2.3   virginica\n",
       "7   62           5.9          3.0           4.2          1.5  versicolor\n",
       "8  117           6.5          3.0           5.5          1.8   virginica\n",
       "9   76           6.6          3.0           4.4          1.4  versicolor"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "iris_input_train"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Train model and load test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/am250152/anaconda3/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.\n",
      "  return _compile(pattern, flags).split(string, maxsplit)\n"
     ]
    }
   ],
   "source": [
    "naivebayes_train = NaiveBayes(formula=\"species ~ petal_length + sepal_width + petal_width + sepal_length\", data=iris_input_train)\n",
    "iris_input_test = DataFrame.from_query(\"SELECT * FROM {} WHERE id MOD 5 = 0\".format(database_name+\".\"+table_name))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Predict on test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/am250152/anaconda3/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.\n",
      "  return _compile(pattern, flags).split(string, maxsplit)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "############ STDOUT Output ############\n",
       "\n",
       "    id  prediction  loglik_virginica  loglik_setosa  loglik_versicolor\n",
       "0   70  versicolor        -15.236845    -152.472574          -2.353846\n",
       "1   85  versicolor         -7.002832    -249.656534          -2.004556\n",
       "2   40      setosa        -58.353886       0.976841         -35.442558\n",
       "3  105   virginica         -1.583216    -540.563571         -14.859641\n",
       "4   95  versicolor        -10.180244    -198.037172          -1.105673\n",
       "5  100  versicolor        -10.131539    -187.295006          -1.028853\n",
       "6  110   virginica         -6.113021    -654.802111         -28.838515\n",
       "7   35      setosa        -58.198028       0.660203         -34.933600\n",
       "8   15      setosa        -64.716957      -3.554763         -42.613273\n",
       "9   65  versicolor        -12.649648    -138.435759          -2.189800"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "naivebayes_predict_result = NaiveBayesPredict(newdata=iris_input_test,\n",
    "                                       modeldata = naivebayes_train,\n",
    "                                       id_col = \"id\",\n",
    "                                       responses = [\"virginica\",\"setosa\",\"versicolor\"])\n",
    "naivebayes_predict_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extra Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
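    "def get_table_cols(db, table):\n",
    "    \"\"\"Return the column names of db.table, read from the dbc.columns view.\"\"\"\n",
    "    cursor = con.cursor()\n",
    "    # Bind the names as parameters instead of formatting them into the SQL text\n",
    "    cursor.execute(\"select columnname from dbc.columns where databasename = ? and tablename = ?\", [db, table])\n",
    "    col_names = cursor.fetchall()\n",
    "    # Each row is a one-item sequence; flatten the rows and trim the padded names\n",
    "    col_names = [item.strip(\" \") for sublist in col_names for item in sublist]\n",
    "    return col_names"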
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
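
A note on the Load Data cell: `pd.DataFrame(cursor.fetchall())` produces integer column labels (0 to 7), which is why the Modify Data cell has to drop columns by position. DB-API cursors also report the result's column names through `cursor.description`, so the names can be carried over instead. Below is a minimal sketch of that idea. It uses the standard library's `sqlite3` as a stand-in for a Teradata connection (an assumption, so that it runs anywhere), and the `fetch_with_columns` helper and the toy `iris` table are hypothetical, not part of the notebook.

```python
import sqlite3


def fetch_with_columns(cursor, query):
    """Run a query and return (column_names, rows).

    Relies only on the DB-API 2.0 `description` attribute, where the
    first item of each entry is the column name.
    """
    cursor.execute(query)
    names = [entry[0] for entry in cursor.description]
    rows = cursor.fetchall()
    return names, rows


# Stand-in database: an in-memory SQLite table with made-up rows.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE iris (id INTEGER, sepal_length REAL, species TEXT)")
cur.executemany(
    "INSERT INTO iris VALUES (?, ?, ?)",
    [(1, 5.1, "setosa"), (2, 7.0, "versicolor")],
)

names, rows = fetch_with_columns(cur, "SELECT * FROM iris ORDER BY id")
print(names)
print(rows)
con.close()
```

With pandas, the two return values can be passed as `pd.DataFrame(rows, columns=names)`. I have not checked how teradatasql cases or pads the names it reports, so that part is worth verifying on your own system.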

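The accuracy and the per-class figures in the LDA report follow directly from the confusion matrix printed just before it, where rows are the true classes and columns are the predicted classes, in the order setosa, versicolor, virginica. Here is a short sketch of that arithmetic in plain Python, with the matrix values copied from the notebook output. The helper name `per_class_metrics` is hypothetical.

```python
# Confusion matrix from the notebook output: rows = true class,
# columns = predicted class, order = setosa, versicolor, virginica.
labels = ["setosa", "versicolor", "virginica"]
matrix = [
    [26, 0, 0],
    [7, 13, 0],
    [1, 1, 12],
]


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} from a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted_as_label = sum(row[i] for row in matrix)  # column sum
        actually_label = sum(matrix[i])                     # row sum
        metrics[label] = (
            true_positive / predicted_as_label,
            true_positive / actually_label,
        )
    return metrics


# Accuracy is the diagonal (correct predictions) over all predictions.
correct = sum(matrix[i][i] for i in range(len(labels)))
total = sum(sum(row) for row in matrix)
accuracy = correct / total

print("accuracy:", round(accuracy, 2))
for label, (precision, recall) in per_class_metrics(matrix, labels).items():
    print(label, round(precision, 2), round(recall, 2))
```

Read this way, most of the error comes from the 7 versicolor rows that were predicted as setosa, which is what pulls setosa precision and versicolor recall down.
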
Re: [Tutorial] [Python] Analytics on Teradata Database with Vantage

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Sciences on Teradata Database by ~A7 [AM250152]\n",
    "Requirements: Python 3.5+, teradatasql, teradataml and a Vantage Machine."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "┌┬┐┌─┐┌┬┐┌─┐  ┌─┐┌─┐┬┌─┐┌┐┌┌─┐┌─┐\n",
      " ││├─┤ │ ├─┤  └─┐│  │├┤ ││││  ├┤ \n",
      "─┴┘┴ ┴ ┴ ┴ ┴  └─┘└─┘┴└─┘┘└┘└─┘└─┘\n",
      "Welcome to Data Sciences with A7™\n",
      "using Pandas version: 0.23.4\n",
      "using SciKit-Learn version: 0.19.2\n",
      "using Python version: 3.5.5 |Anaconda, Inc.| (default, Apr 26 2018, 08:11:22) \n",
      "[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]\n"
     ]
    }
   ],
   "source": [
    "print(\"┌┬┐┌─┐┌┬┐┌─┐  ┌─┐┌─┐┬┌─┐┌┐┌┌─┐┌─┐\")\n",
    "print(\" ││├─┤ │ ├─┤  └─┐│  │├┤ ││││  ├┤ \")\n",
    "print(\"─┴┘┴ ┴ ┴ ┴ ┴  └─┘└─┘┴└─┘┘└┘└─┘└─┘\")\n",
    "\n",
    "import os, time,sys\n",
    "for char in \"Welcome to Data Sciences with A7™\":\n",
    "    time.sleep(0.05)\n",
    "    print(char, end='', flush=True)\n",
    "\n",
    "# os.system(\"pip install libraries/teradatasql-16.20.0.39-py3-none-any.whl\")\n",
    "# os.system(\"pip install libraries/teradataml-16.20.0.0-py3-none-any.whl\")\n",
    "\n",
    "import teradatasql\n",
    "import teradataml as tdml\n",
    "\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import sklearn # classic ML\n",
    "#import keras # deep learning\n",
    "\n",
    "print(\"\\nusing Pandas version:\",pd.__version__)\n",
    "print(\"using SciKit-Learn version:\", sklearn.__version__)\n",
    "print(\"using Python version:\", sys.version)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this demo, we are going to run some analytics on the Iris dataset. This Notebook is intended to give Python data scientists a jump start on analyzing client-side data existing in Teradata Databases."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Configurations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "host = \"sdt19085.labs.teradata.com\"\n",
    "username = \"user1\"\n",
    "password = \"user1\"\n",
    "database_name = \"demo\"\n",
    "table_name = \"iris\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Connect to SQL Database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "con = teradatasql.connect(None, host=host, user=username, password=password)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load Data\n",
    "We are going to work with two tables: iris and iris_large"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2315912</td>\n",
       "      <td>6.283448</td>\n",
       "      <td>8.236815</td>\n",
       "      <td>10.391486</td>\n",
       "      <td>20.720223</td>\n",
       "      <td>versicolor</td>\n",
       "      <td>FF</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3984584</td>\n",
       "      <td>5.605049</td>\n",
       "      <td>4.246290</td>\n",
       "      <td>5.814689</td>\n",
       "      <td>4.570939</td>\n",
       "      <td>setosa</td>\n",
       "      <td>BF</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5653256</td>\n",
       "      <td>5.312167</td>\n",
       "      <td>6.365951</td>\n",
       "      <td>7.940895</td>\n",
       "      <td>6.101106</td>\n",
       "      <td>setosa</td>\n",
       "      <td>EB</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1145313</td>\n",
       "      <td>6.949874</td>\n",
       "      <td>10.257845</td>\n",
       "      <td>4.598636</td>\n",
       "      <td>5.890818</td>\n",
       "      <td>virginica</td>\n",
       "      <td>CI</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2813985</td>\n",
       "      <td>5.086145</td>\n",
       "      <td>4.033024</td>\n",
       "      <td>9.856735</td>\n",
       "      <td>7.284636</td>\n",
       "      <td>setosa</td>\n",
       "      <td>DD</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         0         1          2          3          4           5   6  7\n",
       "0  2315912  6.283448   8.236815  10.391486  20.720223  versicolor  FF  0\n",
       "1  3984584  5.605049   4.246290   5.814689   4.570939      setosa  BF  1\n",
       "2  5653256  5.312167   6.365951   7.940895   6.101106      setosa  EB  0\n",
       "3  1145313  6.949874  10.257845   4.598636   5.890818   virginica  CI  0\n",
       "4  2813985  5.086145   4.033024   9.856735   7.284636      setosa  DD  0"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table_name = \"iris_large\"\n",
    "cursor = con.cursor()\n",
    "cursor.execute(\"SELECT TOP 300 * FROM {}\".format(database_name+\".\"+table_name))\n",
    "raw_data = pd.DataFrame(cursor.fetchall())\n",
    "raw_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Modify Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sepal-length</th>\n",
       "      <th>sepal-width</th>\n",
       "      <th>petal-length</th>\n",
       "      <th>petal-width</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6.283448</td>\n",
       "      <td>8.236815</td>\n",
       "      <td>10.391486</td>\n",
       "      <td>20.720223</td>\n",
       "      <td>versicolor</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.605049</td>\n",
       "      <td>4.246290</td>\n",
       "      <td>5.814689</td>\n",
       "      <td>4.570939</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.312167</td>\n",
       "      <td>6.365951</td>\n",
       "      <td>7.940895</td>\n",
       "      <td>6.101106</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6.949874</td>\n",
       "      <td>10.257845</td>\n",
       "      <td>4.598636</td>\n",
       "      <td>5.890818</td>\n",
       "      <td>virginica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.086145</td>\n",
       "      <td>4.033024</td>\n",
       "      <td>9.856735</td>\n",
       "      <td>7.284636</td>\n",
       "      <td>setosa</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   sepal-length  sepal-width  petal-length  petal-width       class\n",
       "0      6.283448     8.236815     10.391486    20.720223  versicolor\n",
       "1      5.605049     4.246290      5.814689     4.570939      setosa\n",
       "2      5.312167     6.365951      7.940895     6.101106      setosa\n",
       "3      6.949874    10.257845      4.598636     5.890818   virginica\n",
       "4      5.086145     4.033024      9.856735     7.284636      setosa"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_data = raw_data.drop([0,6,7],axis=1)\n",
    "cleaned_data.columns = [\"sepal-length\", \"sepal-width\",\"petal-length\",\"petal-width\",\"class\"]\n",
    "cleaned_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Initial Analysis of Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       sepal-length  sepal-width  petal-length  petal-width\n",
      "count    300.000000   300.000000    300.000000   300.000000\n",
      "mean       5.727166     8.056860      7.350811     9.512028\n",
      "std        0.811897     2.811425      2.108194     8.420680\n",
      "min        4.172506     1.209392      1.935968    -8.943831\n",
      "25%        5.068811     5.930002      5.900518     4.808855\n",
      "50%        5.616010     7.794138      7.406402     6.256710\n",
      "75%        6.348974     9.884348      8.665233    12.063914\n",
      "max        7.919086    16.079922     12.967613    46.799579 \n",
      "\n",
      "Class Distribution: \n",
      " class\n",
      "setosa        120\n",
      "versicolor     94\n",
      "virginica      86\n",
      "dtype: int64\n"
     ]
    }
   ],
   "source": [
    "print(cleaned_data.describe(), \"\\n\")\n",
    "print(\"Class Distribution: \\n\", cleaned_data.groupby('class').size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Visual Analysis of Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_data.plot(kind='box', subplots=True, layout=(2,2), sharex=False, sharey=False, title=\"Iris Dataset: Box and Whiskers Plot\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_data.hist()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.plotting.scatter_matrix(cleaned_data)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prediction: Training Models (scikit-learn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import model_selection\n",
    "from sklearn.metrics import classification_report\n",
    "from sklearn.metrics import confusion_matrix\n",
    "from sklearn.metrics import accuracy_score\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n",
    "from sklearn.naive_bayes import GaussianNB\n",
    "from sklearn.svm import SVC\n",
    "\n",
    "array = cleaned_data.values\n",
    "X = array[:,0:4]\n",
    "Y = array[:,4]\n",
    "validation_size = 0.20\n",
    "seed = 7\n",
    "X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)\n",
    "scoring = 'accuracy'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "LR: 0.762500 (0.064684)\n",
      "LDA: 0.783333 (0.078617)\n",
      "KNN: 0.829167 (0.070833)\n",
      "CART: 0.795833 (0.092139)\n",
      "NB: 0.858333 (0.053359)\n",
      "SVM: 0.808333 (0.046398)\n"
     ]
    }
   ],
   "source": [
    "# Create a list of different models\n",
    "\n",
    "models = []\n",
    "models.append(('LR', LogisticRegression()))\n",
    "models.append(('LDA', LinearDiscriminantAnalysis()))\n",
    "models.append(('KNN', KNeighborsClassifier()))\n",
    "models.append(('CART', DecisionTreeClassifier()))\n",
    "models.append(('NB', GaussianNB()))\n",
    "models.append(('SVM', SVC()))\n",
    "\n",
    "# Evaluate each model with the same 10-fold split\n",
    "results = []\n",
    "names = []\n",
    "# random_state is omitted: it has no effect while shuffle is False,\n",
    "# and newer scikit-learn releases reject that combination\n",
    "kfold = model_selection.KFold(n_splits=10)\n",
    "for name, model in models:\n",
    "    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)\n",
    "    results.append(cv_results)\n",
    "    names.append(name)\n",
    "    # mean accuracy across folds, with the standard deviation in parentheses\n",
    "    msg = \"%s: %f (%f)\" % (name, cv_results.mean(), cv_results.std())\n",
    "    print(msg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compare Algorithms\n",
    "\n",
    "fig = plt.figure()\n",
    "fig.suptitle('Algorithm Comparison')\n",
    "ax = fig.add_subplot(111)\n",
    "plt.boxplot(results)\n",
    "ax.set_xticklabels(names)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Accuracy: \n",
      " 0.85\n",
      "Confusion Matrix: \n",
      " [[26  0  0]\n",
      " [ 7 13  0]\n",
      " [ 1  1 12]]\n",
      "Report: \n",
      "              precision    recall  f1-score   support\n",
      "\n",
      "     setosa       0.76      1.00      0.87        26\n",
      " versicolor       0.93      0.65      0.76        20\n",
      "  virginica       1.00      0.86      0.92        14\n",
      "\n",
      "avg / total       0.87      0.85      0.85        60\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Make predictions on validation dataset using LDA\n",
    "lda = LinearDiscriminantAnalysis()\n",
    "lda.fit(X_train, Y_train)\n",
    "predictions = lda.predict(X_validation)\n",
    "print(\"Accuracy: \\n\", accuracy_score(Y_validation, predictions))\n",
    "print(\"Confusion Matrix: \\n\", confusion_matrix(Y_validation, predictions))\n",
    "print(\"Report: \\n\", classification_report(Y_validation, predictions))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using Teradata ML Library on Vantage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Create Connection"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "conn_ml = tdml.create_context(host=host, username=username, password=password)\n",
    "\n",
    "from teradataml.analytics.NaiveBayes import NaiveBayes\n",
    "from teradataml.analytics.NaiveBayesPredict import NaiveBayesPredict\n",
    "from teradataml.dataframe.dataframe import DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "table_name = \"iris\"\n",
    "iris_input_train = DataFrame.from_query(\"SELECT * FROM {} WHERE id MOD 5 <> 0\".format(database_name+\".\"+table_name))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Analyze data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "    id  sepal_length  sepal_width  petal_length  petal_width     species\n",
       "0  138           6.4          3.1           5.5          1.8   virginica\n",
       "1   32           5.4          3.4           1.5          0.4      setosa\n",
       "2  124           6.3          2.7           4.9          1.8   virginica\n",
       "3    8           5.0          3.4           1.5          0.2      setosa\n",
       "4   56           5.7          2.8           4.5          1.3  versicolor\n",
       "5  137           6.3          3.4           5.6          2.4   virginica\n",
       "6  136           7.7          3.0           6.1          2.3   virginica\n",
       "7   62           5.9          3.0           4.2          1.5  versicolor\n",
       "8  117           6.5          3.0           5.5          1.8   virginica\n",
       "9   76           6.6          3.0           4.4          1.4  versicolor"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "iris_input_train"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Train model and load test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/am250152/anaconda3/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.\n",
      "  return _compile(pattern, flags).split(string, maxsplit)\n"
     ]
    }
   ],
   "source": [
    "naivebayes_train = NaiveBayes(formula=\"species ~ petal_length + sepal_width + petal_width + sepal_length\", data=iris_input_train)\n",
    "iris_input_test = DataFrame.from_query(\"SELECT * FROM {} WHERE id MOD 5 = 0\".format(database_name+\".\"+table_name))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Predict on test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/am250152/anaconda3/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.\n",
      "  return _compile(pattern, flags).split(string, maxsplit)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "############ STDOUT Output ############\n",
       "\n",
       "    id  prediction  loglik_virginica  loglik_setosa  loglik_versicolor\n",
       "0   70  versicolor        -15.236845    -152.472574          -2.353846\n",
       "1   85  versicolor         -7.002832    -249.656534          -2.004556\n",
       "2   40      setosa        -58.353886       0.976841         -35.442558\n",
       "3  105   virginica         -1.583216    -540.563571         -14.859641\n",
       "4   95  versicolor        -10.180244    -198.037172          -1.105673\n",
       "5  100  versicolor        -10.131539    -187.295006          -1.028853\n",
       "6  110   virginica         -6.113021    -654.802111         -28.838515\n",
       "7   35      setosa        -58.198028       0.660203         -34.933600\n",
       "8   15      setosa        -64.716957      -3.554763         -42.613273\n",
       "9   65  versicolor        -12.649648    -138.435759          -2.189800"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "naivebayes_predict_result = NaiveBayesPredict(newdata=iris_input_test,\n",
    "                                       modeldata = naivebayes_train,\n",
    "                                       id_col = \"id\",\n",
    "                                       responses = [\"virginica\",\"setosa\",\"versicolor\"])\n",
    "naivebayes_predict_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extra Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
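    "def get_table_cols(db, table):\n",
    "    \"\"\"Return the column names of db.table as a list of strings.\"\"\"\n",
    "    # Relies on the global connection `con` opened earlier in the notebook\n",
    "    cursor = con.cursor()\n",
    "    # ? placeholders let the driver bind the values instead of formatting them into the SQL text\n",
    "    cursor.execute(\"select columnname from dbc.columns where databasename = ? and tablename = ?\", [db, table])\n",
    "    col_names = cursor.fetchall()\n",
    "    # Flatten the one-column rows and trim any space padding\n",
    "    col_names = [item.strip() for sublist in col_names for item in sublist]\n",
    "    return col_names"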
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
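
A note on the train/test split in the Vantage section of the notebook: the training DataFrame is loaded with `id MOD 5 <> 0` and the test DataFrame with `id MOD 5 = 0`, so every fifth row is held out. Here is a minimal plain-Python sketch of that rule. It assumes the `iris` table has consecutive integer ids from 1 to 150, which the post does not confirm, so treat the counts as illustrative only.

```python
# Plain-Python sketch of the "id MOD 5" split used in the notebook's SQL.
# Assumption (not confirmed by the post): ids are the consecutive integers 1..150.
ids = list(range(1, 151))

# Mirrors: SELECT * FROM iris WHERE id MOD 5 <> 0
train_ids = [i for i in ids if i % 5 != 0]

# Mirrors: SELECT * FROM iris WHERE id MOD 5 = 0
test_ids = [i for i in ids if i % 5 == 0]

# The two sets never overlap and together cover every id
print(len(train_ids), len(test_ids))  # prints: 120 30
```

Under that assumption the split is 80/20 and fully deterministic, so it can be reproduced on the database side without a random seed. The ids in the notebook's prediction output (15, 35, 40, 65, ...) are all multiples of 5, which is consistent with this rule.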