{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Finding Best ML Algorithm for House Price Prediction using k Cross Validation and GridSearchCV. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In machine learning, we couldn’t fit the model on the training data and can’t say that the model will work accurately for the real data. For this, we must assure that our model got the correct patterns from the data, and it is not getting up too much noise. For this purpose, we use the cross-validation technique.Cross-validation is a technique in which we train our model using the subset of the data-set and then evaluate using the complementary subset of the data-set."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dataset is downloaded from here: https://www.kaggle.com/amitabhajoy/bengaluru-house-price-data"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"from matplotlib import pyplot as plt\n",
"%matplotlib inline \n",
"import matplotlib\n",
"matplotlib.rcParams[\"figure.figsize\"]=(20,10)"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [],
"source": [
"df1 = pd.read_csv(\"Bengaluru_House_Data.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>area_type</th>\n",
" <th>availability</th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>society</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>balcony</th>\n",
" <th>price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Super built-up Area</td>\n",
" <td>19-Dec</td>\n",
" <td>Electronic City Phase II</td>\n",
" <td>2 BHK</td>\n",
" <td>Coomee</td>\n",
" <td>1056</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" <td>39.07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Plot Area</td>\n",
" <td>Ready To Move</td>\n",
" <td>Chikka Tirupathi</td>\n",
" <td>4 Bedroom</td>\n",
" <td>Theanmp</td>\n",
" <td>2600</td>\n",
" <td>5.0</td>\n",
" <td>3.0</td>\n",
" <td>120.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Built-up Area</td>\n",
" <td>Ready To Move</td>\n",
" <td>Uttarahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>NaN</td>\n",
" <td>1440</td>\n",
" <td>2.0</td>\n",
" <td>3.0</td>\n",
" <td>62.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Super built-up Area</td>\n",
" <td>Ready To Move</td>\n",
" <td>Lingadheeranahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>Soiewre</td>\n",
" <td>1521</td>\n",
" <td>3.0</td>\n",
" <td>1.0</td>\n",
" <td>95.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Super built-up Area</td>\n",
" <td>Ready To Move</td>\n",
" <td>Kothanur</td>\n",
" <td>2 BHK</td>\n",
" <td>NaN</td>\n",
" <td>1200</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" <td>51.00</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" area_type availability location size \\\n",
"0 Super built-up Area 19-Dec Electronic City Phase II 2 BHK \n",
"1 Plot Area Ready To Move Chikka Tirupathi 4 Bedroom \n",
"2 Built-up Area Ready To Move Uttarahalli 3 BHK \n",
"3 Super built-up Area Ready To Move Lingadheeranahalli 3 BHK \n",
"4 Super built-up Area Ready To Move Kothanur 2 BHK \n",
"\n",
" society total_sqft bath balcony price \n",
"0 Coomee 1056 2.0 1.0 39.07 \n",
"1 Theanmp 2600 5.0 3.0 120.00 \n",
"2 NaN 1440 2.0 3.0 62.00 \n",
"3 Soiewre 1521 3.0 1.0 95.00 \n",
"4 NaN 1200 2.0 1.0 51.00 "
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"area_type\n",
"Built-up Area 2418\n",
"Carpet Area 87\n",
"Plot Area 2025\n",
"Super built-up Area 8790\n",
"Name: area_type, dtype: int64"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1.groupby('area_type')['area_type'].agg('count')"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [],
"source": [
"df2 = df1.drop(['area_type','society','balcony','availability'] , axis=\"columns\")"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Electronic City Phase II</td>\n",
" <td>2 BHK</td>\n",
" <td>1056</td>\n",
" <td>2.0</td>\n",
" <td>39.07</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Chikka Tirupathi</td>\n",
" <td>4 Bedroom</td>\n",
" <td>2600</td>\n",
" <td>5.0</td>\n",
" <td>120.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Uttarahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>1440</td>\n",
" <td>2.0</td>\n",
" <td>62.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Lingadheeranahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>1521</td>\n",
" <td>3.0</td>\n",
" <td>95.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Kothanur</td>\n",
" <td>2 BHK</td>\n",
" <td>1200</td>\n",
" <td>2.0</td>\n",
" <td>51.00</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price\n",
"0 Electronic City Phase II 2 BHK 1056 2.0 39.07\n",
"1 Chikka Tirupathi 4 Bedroom 2600 5.0 120.00\n",
"2 Uttarahalli 3 BHK 1440 2.0 62.00\n",
"3 Lingadheeranahalli 3 BHK 1521 3.0 95.00\n",
"4 Kothanur 2 BHK 1200 2.0 51.00"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Cleaning: Handling NA/Null values"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"location 1\n",
"size 16\n",
"total_sqft 0\n",
"bath 73\n",
"price 0\n",
"dtype: int64"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"location 0\n",
"size 0\n",
"total_sqft 0\n",
"bath 0\n",
"price 0\n",
"dtype: int64"
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df3 = df2.dropna()\n",
"df3.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['2 BHK', '4 Bedroom', '3 BHK', '4 BHK', '6 Bedroom', '3 Bedroom',\n",
" '1 BHK', '1 RK', '1 Bedroom', '8 Bedroom', '2 Bedroom',\n",
" '7 Bedroom', '5 BHK', '7 BHK', '6 BHK', '5 Bedroom', '11 BHK',\n",
" '9 BHK', nan, '9 Bedroom', '27 BHK', '10 Bedroom', '11 Bedroom',\n",
" '10 BHK', '19 BHK', '16 BHK', '43 Bedroom', '14 BHK', '8 BHK',\n",
" '12 Bedroom', '13 BHK', '18 Bedroom'], dtype=object)"
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2['size'].unique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature Engineering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself "
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"<ipython-input-81-4c4c73fbe7f4>:1: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df3['bhk'] = df3['size'].apply(lambda x: int(x.split(' ')[0]))\n"
]
}
],
"source": [
"df3['bhk'] = df3['size'].apply(lambda x: int(x.split(' ')[0]))"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2, 4, 3, 6, 1, 8, 7, 5, 11, 9, 27, 10, 19, 16, 43, 14, 12,\n",
" 13, 18], dtype=int64)"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df3['bhk'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1718</th>\n",
" <td>2Electronic City Phase II</td>\n",
" <td>27 BHK</td>\n",
" <td>8000</td>\n",
" <td>27.0</td>\n",
" <td>230.0</td>\n",
" <td>27</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4684</th>\n",
" <td>Munnekollal</td>\n",
" <td>43 Bedroom</td>\n",
" <td>2400</td>\n",
" <td>40.0</td>\n",
" <td>660.0</td>\n",
" <td>43</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk\n",
"1718 2Electronic City Phase II 27 BHK 8000 27.0 230.0 27\n",
"4684 Munnekollal 43 Bedroom 2400 40.0 660.0 43"
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df3[df3.bhk>20]"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['1056', '2600', '1440', ..., '1133 - 1384', '774', '4689'],\n",
" dtype=object)"
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df3.total_sqft.unique()"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [],
"source": [
"def is_float(x):\n",
" try:\n",
" float(x)\n",
" except:\n",
" return False\n",
" return True"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>Yelahanka</td>\n",
" <td>4 BHK</td>\n",
" <td>2100 - 2850</td>\n",
" <td>4.0</td>\n",
" <td>186.000</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>122</th>\n",
" <td>Hebbal</td>\n",
" <td>4 BHK</td>\n",
" <td>3067 - 8156</td>\n",
" <td>4.0</td>\n",
" <td>477.000</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>137</th>\n",
" <td>8th Phase JP Nagar</td>\n",
" <td>2 BHK</td>\n",
" <td>1042 - 1105</td>\n",
" <td>2.0</td>\n",
" <td>54.005</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>165</th>\n",
" <td>Sarjapur</td>\n",
" <td>2 BHK</td>\n",
" <td>1145 - 1340</td>\n",
" <td>2.0</td>\n",
" <td>43.490</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>188</th>\n",
" <td>KR Puram</td>\n",
" <td>2 BHK</td>\n",
" <td>1015 - 1540</td>\n",
" <td>2.0</td>\n",
" <td>56.800</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>410</th>\n",
" <td>Kengeri</td>\n",
" <td>1 BHK</td>\n",
" <td>34.46Sq. Meter</td>\n",
" <td>1.0</td>\n",
" <td>18.500</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>549</th>\n",
" <td>Hennur Road</td>\n",
" <td>2 BHK</td>\n",
" <td>1195 - 1440</td>\n",
" <td>2.0</td>\n",
" <td>63.770</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>648</th>\n",
" <td>Arekere</td>\n",
" <td>9 Bedroom</td>\n",
" <td>4125Perch</td>\n",
" <td>9.0</td>\n",
" <td>265.000</td>\n",
" <td>9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>661</th>\n",
" <td>Yelahanka</td>\n",
" <td>2 BHK</td>\n",
" <td>1120 - 1145</td>\n",
" <td>2.0</td>\n",
" <td>48.130</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>672</th>\n",
" <td>Bettahalsoor</td>\n",
" <td>4 Bedroom</td>\n",
" <td>3090 - 5002</td>\n",
" <td>4.0</td>\n",
" <td>445.000</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk\n",
"30 Yelahanka 4 BHK 2100 - 2850 4.0 186.000 4\n",
"122 Hebbal 4 BHK 3067 - 8156 4.0 477.000 4\n",
"137 8th Phase JP Nagar 2 BHK 1042 - 1105 2.0 54.005 2\n",
"165 Sarjapur 2 BHK 1145 - 1340 2.0 43.490 2\n",
"188 KR Puram 2 BHK 1015 - 1540 2.0 56.800 2\n",
"410 Kengeri 1 BHK 34.46Sq. Meter 1.0 18.500 1\n",
"549 Hennur Road 2 BHK 1195 - 1440 2.0 63.770 2\n",
"648 Arekere 9 Bedroom 4125Perch 9.0 265.000 9\n",
"661 Yelahanka 2 BHK 1120 - 1145 2.0 48.130 2\n",
"672 Bettahalsoor 4 Bedroom 3090 - 5002 4.0 445.000 4"
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df3[~df3['total_sqft'].apply(is_float)].head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Above shows that total_sqft can be a range (e.g. 2100-2850). For such case we can just take average of min and max value in the range. There are other cases such as 34.46Sq. Meter which one can convert to square ft using unit conversion. I am going to just drop such corner cases to keep things simple "
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {},
"outputs": [],
"source": [
"def convert_sqft_to_num(x):\n",
" tokens = x.split('-')\n",
" if len(tokens) == 2:\n",
" return (float(tokens[0]) + float(tokens[1]))/2\n",
" try:\n",
" return float(x)\n",
" except:\n",
" return None\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {},
"outputs": [],
"source": [
"df4 = df3.copy()\n",
"df4['total_sqft'] = df4['total_sqft'].apply(convert_sqft_to_num)"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [],
"source": [
"df5 = df4.copy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Add new feature called price per square feet"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {},
"outputs": [],
"source": [
"df5['price_per_sqft'] = df5['price']*100000/df5['total_sqft']"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1304"
]
},
"execution_count": 91,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(df5.location.unique())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Examine locations which is a categorical variable. We need to apply dimensionality reduction technique here to reduce number of locations"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"df5.location = df5.location.apply(lambda x: x.strip())"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [],
"source": [
"location_stats = df5.groupby('location')['location'].agg('count').sort_values(ascending=False)"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"location\n",
"Whitefield 535\n",
"Sarjapur Road 392\n",
"Electronic City 304\n",
"Kanakpura Road 266\n",
"Thanisandra 236\n",
" ... \n",
"LIC Colony 1\n",
"Kuvempu Layout 1\n",
"Kumbhena Agrahara 1\n",
"Kudlu Village, 1\n",
"1 Annasandrapalya 1\n",
"Name: location, Length: 1293, dtype: int64"
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"location_stats"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1052"
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(location_stats[location_stats<=10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dimensionality Reduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Any location having less than 10 data points should be tagged as \"other\" location. This way number of categories can be reduced by huge amount. Later on when we do one hot encoding, it will help us with having fewer dummy columns"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [],
"source": [
"location_stats_less_10 = location_stats[location_stats<=10]"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"location\n",
"BTM 1st Stage 10\n",
"Basapura 10\n",
"Sector 1 HSR Layout 10\n",
"Naganathapura 10\n",
"Kalkere 10\n",
" ..\n",
"LIC Colony 1\n",
"Kuvempu Layout 1\n",
"Kumbhena Agrahara 1\n",
"Kudlu Village, 1\n",
"1 Annasandrapalya 1\n",
"Name: location, Length: 1052, dtype: int64"
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"location_stats_less_10"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1293"
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(df5.location.unique())"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [],
"source": [
"df5.location = df5.location.apply(lambda x: 'other' if x in location_stats_less_10 else x)"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"242"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(df5.location.unique())"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>price_per_sqft</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Electronic City Phase II</td>\n",
" <td>2 BHK</td>\n",
" <td>1056.0</td>\n",
" <td>2.0</td>\n",
" <td>39.07</td>\n",
" <td>2</td>\n",
" <td>3699.810606</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Chikka Tirupathi</td>\n",
" <td>4 Bedroom</td>\n",
" <td>2600.0</td>\n",
" <td>5.0</td>\n",
" <td>120.00</td>\n",
" <td>4</td>\n",
" <td>4615.384615</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Uttarahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>1440.0</td>\n",
" <td>2.0</td>\n",
" <td>62.00</td>\n",
" <td>3</td>\n",
" <td>4305.555556</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Lingadheeranahalli</td>\n",
" <td>3 BHK</td>\n",
" <td>1521.0</td>\n",
" <td>3.0</td>\n",
" <td>95.00</td>\n",
" <td>3</td>\n",
" <td>6245.890861</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Kothanur</td>\n",
" <td>2 BHK</td>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>51.00</td>\n",
" <td>2</td>\n",
" <td>4250.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk \\\n",
"0 Electronic City Phase II 2 BHK 1056.0 2.0 39.07 2 \n",
"1 Chikka Tirupathi 4 Bedroom 2600.0 5.0 120.00 4 \n",
"2 Uttarahalli 3 BHK 1440.0 2.0 62.00 3 \n",
"3 Lingadheeranahalli 3 BHK 1521.0 3.0 95.00 3 \n",
"4 Kothanur 2 BHK 1200.0 2.0 51.00 2 \n",
"\n",
" price_per_sqft \n",
"0 3699.810606 \n",
"1 4615.384615 \n",
"2 4305.555556 \n",
"3 6245.890861 \n",
"4 4250.000000 "
]
},
"execution_count": 101,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df5.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Outlier Removal Using Business Logic"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>price_per_sqft</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>other</td>\n",
" <td>6 Bedroom</td>\n",
" <td>1020.0</td>\n",
" <td>6.0</td>\n",
" <td>370.0</td>\n",
" <td>6</td>\n",
" <td>36274.509804</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>HSR Layout</td>\n",
" <td>8 Bedroom</td>\n",
" <td>600.0</td>\n",
" <td>9.0</td>\n",
" <td>200.0</td>\n",
" <td>8</td>\n",
" <td>33333.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>58</th>\n",
" <td>Murugeshpalya</td>\n",
" <td>6 Bedroom</td>\n",
" <td>1407.0</td>\n",
" <td>4.0</td>\n",
" <td>150.0</td>\n",
" <td>6</td>\n",
" <td>10660.980810</td>\n",
" </tr>\n",
" <tr>\n",
" <th>68</th>\n",
" <td>Devarachikkanahalli</td>\n",
" <td>8 Bedroom</td>\n",
" <td>1350.0</td>\n",
" <td>7.0</td>\n",
" <td>85.0</td>\n",
" <td>8</td>\n",
" <td>6296.296296</td>\n",
" </tr>\n",
" <tr>\n",
" <th>70</th>\n",
" <td>other</td>\n",
" <td>3 Bedroom</td>\n",
" <td>500.0</td>\n",
" <td>3.0</td>\n",
" <td>100.0</td>\n",
" <td>3</td>\n",
" <td>20000.000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk \\\n",
"9 other 6 Bedroom 1020.0 6.0 370.0 6 \n",
"45 HSR Layout 8 Bedroom 600.0 9.0 200.0 8 \n",
"58 Murugeshpalya 6 Bedroom 1407.0 4.0 150.0 6 \n",
"68 Devarachikkanahalli 8 Bedroom 1350.0 7.0 85.0 8 \n",
"70 other 3 Bedroom 500.0 3.0 100.0 3 \n",
"\n",
" price_per_sqft \n",
"9 36274.509804 \n",
"45 33333.333333 \n",
"58 10660.980810 \n",
"68 6296.296296 \n",
"70 20000.000000 "
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df5[df5.total_sqft/df5.bhk<300].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Check above data points. We have 6 bhk apartment with 1020 sqft. Another one is 8 bhk and total sqft is 600. These are clear data errors that can be removed safely"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(13246, 7)"
]
},
"execution_count": 103,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df5.shape"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [],
"source": [
"df6 = df5[~(df5.total_sqft/df5.bhk<300)]"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(12502, 7)"
]
},
"execution_count": 105,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df6.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Outlier Removal Using Standard Deviation and Mean"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 12456.000000\n",
"mean 6308.502826\n",
"std 4168.127339\n",
"min 267.829813\n",
"25% 4210.526316\n",
"50% 5294.117647\n",
"75% 6916.666667\n",
"max 176470.588235\n",
"Name: price_per_sqft, dtype: float64"
]
},
"execution_count": 106,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df6.price_per_sqft.describe()"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {},
"outputs": [],
"source": [
"def remove_pps_outliers(df):\n",
" df_out = pd.DataFrame()\n",
" for key, subdf in df.groupby('location'):\n",
" m = np.mean(subdf.price_per_sqft)\n",
" st = np.std(subdf.price_per_sqft)\n",
" reduced_df = subdf[(subdf.price_per_sqft>(m-st)) & (subdf.price_per_sqft<=(m+st))]\n",
" df_out = pd.concat([df_out,reduced_df],ignore_index=True)\n",
" return df_out "
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [],
"source": [
"df7 = remove_pps_outliers(df6)"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(10241, 7)"
]
},
"execution_count": 109,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df7.shape"
]
},
{
"cell_type": "code",
"execution_count": 110,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA3sAAAJcCAYAAABAE73ZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdf5zdd10n+tc7baUyGWyhRWIK22KCFLqa5Y7AXbM6qDwouQqL4J3u6l27zS4uP1ahygKuu6CuPnpBRC+PVW7RvUHAJRV1qWyK/DLhRhBuqpUf7bLJLo2EsBAr4GTWlrbzuX+cM82QTCYnyZw5c77zfD4eeZw5n+/3e857grZ9Pd7v7+dbrbUAAADQLRtGXQAAAAArT9gDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhD4B1o6p+pKreN+C5P1NVv9n/+XFVdbyqLhhuhQCwcspz9gAYJ1V1d5JvTvJgkuNJ3pvkpa214yOu6RuTPL61Ntdf+2dJfrS1Nj2qugBY33T2ABhHP9ha25hkW5K/l+TVI64nSS5M8pOjLqKqLhx1DQCsDcIeAGOrtfY/kvxReqEvSVJVr6qq/1ZVs1V1Z1U9b9Gx66tq/6L3v1ZVn6uqv6mq26vqHyw69tqqenv/5yurqp0hSL0+yU9X1SVLHTzDd31jVb21qr5cVXdV1b+qqiNn8Tv9SVW9sar+OslrB/vbA6DrhD0AxlZVXZHk2UkOLVr+b0n+QZJvSvJzSd5eVZtO8xH/X3pB8ZFJfifJ71bVxedYzoEke5P89Dl812uSXJnk8UmemeRHT7r2TL/T05L89ySPTvKL51g/AB0j7AEwjv5TVc0m+VySL6UXlpIkrbXfba0dba3Nt9Z2JzmY5KlLfUhr7e2ttXtaaw+01t6Q5GFJvu086vq3Sf5lVV1+lt/1vyf5pdbal1trR5L8Xydde6bf6Whr7U39z/7b86gfgA4R9gAYR/+wtTaZZDrJE5NctnCgqv5JVd1RVV+pqq8kuWbx8cWq6qf6Y5Nf7Z/7Tac7dxCttU8leU+SV53ld31LesF1wedOuvZMv9PXnQ8AibAHwBhrre1LsivJLydJVf2dJG9J8tIkj2qtXZLkU0nq5Gv798y9Mr2u2qX9c7+61Lln6TVJ/nmSzWfxXV9IcsWiz3jsomsH+Z1srQ3AKYQ9AMbdryZ5ZlVtSzKRXvA5liRV9U/T64ItZTLJA/1zL6yqf5vkEedbTGvtUJLdSX7iLL7rliSvrqpLq2pzesFuwdn8TgDwEGEPgLHWWjuW5LeT/JvW2p1J3pDko0m+mOTvJvmT01z6R0luS/JfkxxOcm9Wbhzy59MLaYN+188nOZLks0k+kORdSe5LkrP8nQDgIR6qDsC6UVU3pPeg8+8d4NyfT3JFa+2G4Vd2yne/KMl1rbXvWe3vBqA7dPYAWE+enF73bFlVVUmeNMi5K6GqNlXVd1XVhqr6tiQ/leQPVuO7Aeiu5R4OCwCdUVX/KcnWJD88wOl/lt4Y5UvPdOIK+YYk/3eSq5J8Jck7k/z6Kn03AB1ljBMAAKCDjHECAAB00FiPcV522WXtyiuvHHUZAAAAI3H77bf/VWvt8qWOjXXYu/LKK3PgwIFRlwEAADASVXX4dMeMcQIAAHSQsAcAANBBwh4AAEAHjfU9e0u5//77c+TIkdx7772jLmWkLr744lxxxRW56KKLRl0KAAAwAp0Le0eOHMnk5GSuvPLKVNWoyxmJ1lruueeeHDlyJFddddWoywEAAEagc2Oc9957bx71qEet26CXJFWVRz3qUeu+uwkAAOtZ58JeknUd9Bb4OwAAgPWtk2EPAABgvRP2VtjnPve5POMZz8jVV1+dJz/5yfm1X/u1Jc977Wtfm82bN2fbtm154hOfmBe96EWZn59Pklx//fV517ve9XXnb9y4MUly991355prrnlo/S1veUue8pSn5Mtf/vKQfiMAAGAcrfuwNzub/OZvJq98Ze91dvb8Pu/CCy/MG97whtx111350z/90/z7f//vc+eddy557stf/vLccccdufPOO/PJT34y+/btO6vvetvb3pY3velNed/73pdLL730/AoHAAA6pXO7cZ6N/fuTHTuS+flkbi6ZmEhuvDHZsyfZvv3cPnPTpk3ZtGlTkmRycjJXX311Pv/5z+dJT3rSaa/52te+lnvvvfesAtstt9ySm266KR/84Adz2WWXnVuxAABAZ63bzt7sbC/ozc72gl7Se11YP378/L/j7rvvzp//+Z/naU972pLH3/jGN2bbtm3ZtGlTnvCEJ2Tbtm0PHXvFK16Rbdu2PfRnscOHD+elL31p3ve+9+Uxj3nM+RcKAAB0zroNe7t39zp6S5mf7x0/H8ePH8/zn//8/Oqv/moe8YhHLHnOwhjnl770pczNzeWd73znQ8de//rX54477njoz2KXX355Hve4x+WWW245vyIBAIDOWrdh7+DBEx29k83NJYcOnftn33///Xn+85+fH/mRH8kP/dAPnfH8iy66KNdee20+/OEPD/T5D3/4w3PbbbflzW9+c97xjnece6EAAEBnrdt79rZu7d2jt1Tgm5hItmw5t89trWXnzp25+uqrc+ONNw58zUc+8pFTxjWXc/nll+e9731vpqenc9lll+VZz3rWuRUMAAB00rrt7M3MJBtO89tv2NA7fi7+5E/+JG9729vyoQ996KH77fbs2bPkuQv37F1zzTV54IEH8uIXv/isvuuqq67KrbfemhtuuCEf+9jHzq1gAACgk6q1NuoaztnU1FQ7cODA163dddddufrqqwe6fqndODdsOL/dONeSs/m7AAAAxk9V3d5am1rq2Lod40x6ge7o0d5mLIcO9UY3Z2aS/vPLAQAAxta6DntJL9jt3DnqKgAAAFbWur1nDwAAWF+md01netf0qMtYNcIeAABABwl7AAAAHbTu79kDAAC6a/HY5r7D+05Z23v93tUtaBXp7K2we++9N0996lPzHd/xHXnyk5+c17zmNUue99rXvjabN2/Otm3b8sQnPjEvetGLMj8/nyS5/vrr8653vevrzt/Y3yL07rvvzjXXXPPQ+lve8pY85SlPyZe//OUh/UYAAMA40tnLiWS/Eqn+YQ97WD70oQ9l48aNuf/++7N9+/Y8+9nPztOf/vRTzn35y1+en/7pn878/Hy++7u/O/v27csznvGMgb/rbW97W970pjflQx/6UC699NLzrh0AALpm8X/jr+R/948DYW+FVdVDXbj7778/999/f6pq2Wu+9rWv5d577z2rwHbLLbfkpptuygc/+MFcdtll51UzAADQPcY4h+DBBx/Mtm3b8uhHPzrPfOYz87SnPW3J8974xjdm27Zt2bRpU57whCdk27ZtDx17xStekW3btj30Z7HDhw/npS99ad73vvflMY95zFB/FwAAYDyt287eMG/UvOCCC3LHHXfkK1/5Sp73vOflU5/61NfdZ7dgYYzz/vvvzwte8IK8853vzHXXXZckef3rX58XvOAFD5270C1MkssvvzyPfOQjc8stt+TlL3/5OdcJAADryXoZ31ygszdEl1xySaanp/Pe97532fMuuuiiXHvttfnwhz880Oc+/OEPz2233ZY3v/nNecc73rESpQIAAB2zbjt7w7pR89ixY7noootyySWX5G//9m/zgQ98IK985SuXvaa1lo985COnjGsu5/LLL8973/veTE9P57LLLsuznvWs8y0dAADoEJ29FfaFL3whz3jGM/Lt3/7t+c7v/M4885nPzA/8wA8see7CPXvXXHNNHnjggbz4xS8+q++66qqrcuutt+aGG27Ixz72sZUoHwAA6IhqrY26hnM2NTXVDhw48HVrd911V66++uqz+pyubsF6Ln8XAADA+Kiq21trU0sdW7djnIt1LeQBAAAY4wQAAOigToa9cR5NXSn+DgAAYH3rXNi7+OKLc88996zrsNNayz333JOLL7541KUAADAE07umv+4Z0bCUzt2zd8UVV+TIkSM5duzYqEsZqYsvvjhXXHHFqMsAAABGpHNh76KLLspVV1016jIAAABGqnNhDwAAumjx2Oa+w/tOWbPDPCfr3D17AAAA6OwBAMBYWNy5W+jo6eaxHJ09AACADhL2AAAAOsgYJwAAjBnjmwxCZw8AAMaMh6ozCGEPAACgg4Q9AACADnLPHgAAjAEPVeds6ewBAAB0kM4eAACMAQ9V52zp7AEAAHSQsAcAANBBxjgBAGDMGN9kEDp7AAAAHSTsAQAAdJCwBwAAsIzpXdNf90zDcSHsAQAAdJCwBwAA0EF24wQAADjJ4rHNfYf3nbI2Djui6uwBAAB0kM4eAABwVhY6XOPQ3TpXi3+3cf19h97Zq6oLqurPq+o9/fePrKr3V9XB/uuli859dVUdqqrPVNWzhl0bAABAV63GGOdPJrlr0ftXJflga21rkg/236eqnpTkuiRPTnJtkl+vqgtWoT4AAFgXxvURApyboY5xVtUVSf63JL+Y5Mb+8nOTTPd/fmuSvUle2V9/Z2vtviSfrapDSZ6a5KPDrBEAADizLmxYcq7G9XcbdmfvV5P8qyTzi9a+ubX2hSTpvz66v745yecWnXekv/Z1quqFVXWgqg4cO3ZsOFUDAACMuaF19qrqB5J8qbV2e1VND3LJEmvtlIXWbk5yc5JMTU2dchwAADhhpTpyXdiwZL0Z5hjndyV5TlXtSHJxkkdU1duTfLGqNrXWvlBVm5J8qX/+kSSPXXT9FUmODrE+AACAzhpa2GutvTrJq5Ok39n76dbaj1bV65P8WJKb+q/v7l9ya5LfqapfSfItSbYm+fiw6gMAgPVAR279GsVz9m5KcktV7Uzyl0l+OElaa5+uqluS3JnkgSQvaa09OIL6AACAZQiL42FVwl5rbW96u26mtXZPku87zXm/mN7OnQAAwArQzVu/RtHZAwAARkDgW19W46HqAAAArDKdPQAA6Jj1/AB0TtDZAwAA6CCdPQAA6BiPWyDR2QMAAOgkYQ8AAKCDjHECAECHGd9cv3T2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AABgxKZ3TWd61/Soy6BjhD0AAIAOEvYAAAA66MJRFwAAAOvR4rHNfYf3nbK29/q9q1sQnaOzBwAA0EE6ewAAMAKLO3cLHb1Bu3lnez7rk84eAABABwl7AAAAHWSMEwAARmyQcUwbunC2dPYAAAA6SGcPAIB1ZVw3NzmfDV1Yn3T2AAAAOkjYAwAA6CBjnAAAdF7XNjcZt3oZDZ09AACADtLZAwCg82xuwnqkswcAANBBwh4AAEAHGeMEAGBdMb7JeqGzBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAANBh07umM71retRlMALCHgAAQAcJewAAAB104agLAAAAVtbisc19h/edsrb3+r2rWxAjobMHAADQQTp7AADQMYs7dwsdPd289UdnDwAAoIOEPQAAgA4yxgkAAB1mfHP90tkDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAgDVtetd0pndNj7qMsTO0sFdVF1fVx6vqL6rq01X1c/3111bV56vqjv6fHYuueXVVHaqqz1TVs4ZVGwAAQNddOMTPvi/J97bWjlfVRUn2V9Vt/WNvbK398uKTq+pJSa5L8uQk35LkA1X1hNbag0OsEQAAoJOGFvZaay3J8f7bi/p/2jKXPDfJO1tr9yX5bFUdSvLUJB8dVo0AAMDatHhsc9/hfaes7b1+7+oWNIaGes9eVV1QVXck+VKS97fWPtY/9NKq+kRV/YequrS/tjnJ5xZdfqS/dvJnvrCqDlTVgWPHjg2zfAAAgLE1zDHO9Ecwt1XVJUn+oKquSfIbSX4hvS7fLyR5Q5IbktRSH7HEZ96c5OYkmZqaWq5TCAAAjKnFnbuFjp5u3tlZld04W2tfSbI3ybWttS+21h5src0neUt6o5pJr5P32EWXXZHk6GrUBwAA0DXD3I3z8n5HL1X1jUm+P8l/qapNi057XpJP9X++Ncl1VfWwqroqydYkHx9WfQAAAF02zDHOTUneWlUXpBcqb2mtvaeq3lZV29Ib0bw7yY8nSWvt01V1S5I7kzyQ5CV24gQAAIxvnpvqbZo5nqamptqBAwdGXQYAAMBIVNXtrbWppY6tyj17AAAArC5hDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAGDEpndNZ3rX9KjLWLP8/ZwbYQ8AAKCDhD0AAIAOunDUBQAAwHq0eCxx3+F9p6ztvX7v6ha0xvj7OX86ewAAAB1UrbVR13DOpqam2oEDB0ZdBgAAnJeFjpVu1dL8/ZxeVd3eWpta6pjOHgAAQAcJewAAAB1kjBMAAGBMGeMEAABYZ4Q9AAAYE9O7pr/u8QOwHGEPAACgg4Q9AACADrpw1AUAAACnt3hsc9/hfaesefYcp6OzBwAA0EE6ewAAsIYt7twtdPR08xiEzh4AAHDW7Ay69gl7AAAAHWSMEwAAxoTxTc6GsAcAAAzEzqDjxRgnAABAB+nsAQAAA7Ez6HjR2QMAAOggYQ8AAKCDjHECAABnzfjm2qezBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAAAjNr1rOtO7pkddBh0j7AEAAHSQsAcAANBBF466AAAAWI8Wj23uO7zvlLW91+9d3YLoHJ09AACADtLZAwCAEVjcuVvo6OnmsZJ09gAAADpI2AMAAOggY5wAADBixjcZBp09AACADhL2AAAAOkjYAwAA6KCBw15VTVTVBcMsBgAAgJVx2rBXVRuq6h9X1X+uqi8l+S9JvlBVn66q11fV1tUrEwAAgLOxXGfvj5N8a5JXJ3lMa+2xrbVHJ/kHSf40yU1V9aOrUCMAAABnablHL3x/a+3+kxdba3+d5PeS/F5VXTS0ygAAADhnp+3sLQS9qvrWqnpY/+fpqvqJqrpk8TkAAACsLYNs0PJ7SR6sqi1JfivJVUl+Z6hVAQAAcF4GCXvzrbUHkjwvya+21l6eZNNwywIAAOB8DBL27q+qf5Tkx5K8p7/mXj0AAIA1bJCw90+T/K9JfrG19tmquirJ24dbFgAAAOdjud04kySttTur6pVJHtd//9kkNw27MAAAAM7dGTt7VfWDSe5I8t7++21VdeuwCwMAAODcDTLG+dokT03ylSRprd2R3o6cAAAArFGDhL0HWmtfPWmtDaMYAAAAVsYZ79lL8qmq+sdJLqiqrUl+IslHhlsWAAAA52OQzt6/TPLkJPel9zD1ryZ52TCLAgAA4PwMshvn/0zyr6vql1prc6tQEwAAAOdpkN04/35V3Znkrv7776iqXx96ZQAAAJyzQcY435jkWUnuSZLW2l8k+e5hFgUAAMD5GSTspbX2uZOWHhxCLQAAAKyQQXbj/FxV/f0kraq+Ib3dOO8ablkAAACcj0E6e/8iyUuSbE7y+STb+u8BAABYowbZjfOvkvzIKtQCAADAChlkN87HV9UfVtWxqvpSVb27qh4/wHUXV9XHq+ovqurTVfVz/fVHVtX7q+pg//XSRde8uqoOVdVnqupZ5/erAQAArF+DjHH+TpJbkmxK8i1JfjfJfxzguvuSfG9r7TvSG/28tqqenuRVST7YWtua5IP996mqJyW5Lr0HuF+b5Ner6oKz+3UAAABIBgt71Vp7W2vtgf6ftydpZ7qo9Rzvv72o/6cleW6St/bX35rkH/Z/fm6Sd7bW7mutfTbJoSRPPYvfBQAAgL5Bwt4fV9WrqurKqvo7VfWvkvzn/jjmI5e7sKouqKo7knwpyftbax9L8s2ttS8kSf/10f3TNydZ/IiHI/21kz/zhVV1oKoOHDt2bIDyAQAA1p9BHr0w03/98ZPWb0ivU3fa+/daaw8m2VZVlyT5g6q6ZpnvqaU+YonPvDnJzUkyNTV1xg4jAADAejTIbpxXne+XtNa+UlV707sX74tVtam19oWq2pRe1y/pdfIeu+iyK5IcPd/vBgAAWI/OGPaq6p8std5a++0zXHd5kvv7Qe8bk3x/kv8zya1JfizJTf3Xd/cvuTXJ71TVr6S3EczWJB8f8PcAAABgkUHGOL9z0c8XJ/m+JH+WZNmwl97unW/t76i5IcktrbX3VNVHk9xSVTuT/GWSH06S1tqnq+qWJHcmeSDJS/pjoAAAAJylau3sbnurqm9K8rbW2nOGU9Lgpqam2oEDB0ZdBgAAwEhU1e2ttamljg2yG+fJ/md6I5YAAACsUYPcs/eHObEr5oYkT0rvweoAAACsUYPcs/fLi35+IMnh1tqRIdUDAADAClg27PU3V/l0a+2v+u+/Icn1VfXy1trVq1EgAAAAZ++09+xV1XVJ/jrJJ6pqX1U9I8l/T/LsJD+ySvUBAABwDpbr7P1skv+ltXaoqp6S5KNJrmut/cHqlAYAAMC5Wm43zq+11g4lSWvtz5J8VtADAAAYD8t19h5dVTcuer9x8fvW2q8MrywAAADOx3Jh7y1JJpd5DwAAwBp12rDXWvu51SwEAACAlTPIc/YA6LjZ2WT37uTgwWTr1mRmJpk0ywEAY03YA1jn9u9PduxI5ueTublkYiK58cZkz55k+/ZRVwcAnKvlduMEoONmZ3tBb3a2F/SS3uvC+vHjo60PADh3Z+zsVdXDkjw/yZWLz2+t/fzwygJgNeze3evoLWV+vnd8587VrQkAWBmDjHG+O8lXk9ye5L7hlgPAajp48ERH72Rzc8mhQ6tbDwCwcgYJe1e01q4deiUArLqtW3v36C0V+CYmki1bVr8mAGBlDHLP3keq6u8OvRIAVt3MTLLhNP8m2LChdxwAGE+DhL3tSW6vqs9U1Seq6pNV9YlhFwbA8E1O9nbdnJzsdfKS3uvC+saNo60PADh3g4xxPnvoVQAwMtu3J0eP9jZjOXSoN7o5MyPoAcC4O2PYa60dTpKqenSSi4deEQCrbuNGu24CQNeccYyzqp5TVQeTfDbJviR3J7ltyHUBAABwHga5Z+8Xkjw9yX9trV2V5PuS/MlQqwIAAOC8DBL27m+t3ZNkQ1VtaK39cZJtQ64LAACA8zDIBi1fqaqNST6c5B1V9aUkDwy3LAAAAM7HIJ295yb52yQvT/LeJP8tyQ8OsygAAADOzyC7cc4tevvWIdYCAADACjlt2Kuq/a217VU1m6QtPpSktdYeMfTqAAAAOCenDXutte3918nVKwcAAICVsFxn75HLXdha++uVLwcAAICVsNw9e7enN75ZSR6X5Mv9ny9J8pdJrhp6dQAAAJyT0+7G2Vq7qrX2+CR/lOQHW2uXtdYeleQHkvz+ahUIAADA2Rvk0Qvf2Vrbs/CmtXZbku8ZXkkAAACcr0Eeqv5XVfWzSd6e3ljnjya5Z6hVAbCqZmeT3buTgweTrVuTmZlk0vZcADDWBgl7/yjJa5L8Qf/9h/trAHTA/v3Jjh3J/HwyN5dMTCQ33pjs2ZNs3z7q6gCAczXIQ9X/OslPrkItAKyy2dle0JudPbE2N9d73bEjOXo02bhxNLUBAOfnjPfsVdUTqurmqnpfVX1o4c9qFAfAcO3e3evoLWV+vnccABhPg4xx/m6SNyf5zSQPDrccAFbTwYMnOnknm5tLDh1a3XoAgJUzSNh7oLX2G0OvBIBVt3Vr7x69pQLfxESyZcvq1wQArIxBHr3wh1X14qraVFWPXPgz9MoAGLqZmWTDaf5NsGFD7zgAMJ4G6ez9WP/1FYvWWpLHr3w5AKymycnerpsn78a5YUNv3eYsADC+BtmN86rVKASA0di+vbfr5u7dvXv0tmzpdfQEPQAYb6cNe1X1Q8td2Fr7/ZUvB4BR2Lgx2blz1FUAACtpuc7eDy5zrCUR9gAAANao04a91to/Xc1CALpoetd0kmTv9XtHWgcAsP4MshsnAAAAY0bYAwAA6KBBHr0AwFlYGN1Mkn2H952yZqQTAFgNA4W9qvr7Sa5cfH5r7beHVBMAAADn6Yxhr6reluRbk9yR5MH+cksi7AEsYXHnzgYtAMCoDNLZm0rypNZaG3YxAAAArIxBNmj5VJLHDLsQAAAAVs5pO3tV9YfpjWtOJrmzqj6e5L6F46215wy/PIDxZnwTABiV5cY4f3nVqgAAAGBFnTbstdb2JUlVPbu1dtviY1X1L5LsG3JtAGPPBi09s7PJ7t3JwYPJ1q3JzEwyOTnqqgCg2wbZoOXfVNV9rbUPJUlVvTLJdJI3D7MwALph//5kx45kfj6Zm0smJpIbb0z27Em2bx91dQDQXYOEveckeU9VvSLJtUme2F8DgGXNzvaC3uzsibW5ud7rjh3J0aPJxo2jqQ0Auu6MYa+19ldV9ZwkH0hye5IXeAwDwOktjG4myb7D+05ZW08jnbt39zp6S5mf7x3fuXN1awKA9WK53Thn09uNc8E3JHl8khdUVWutPWLYxQEw3g4ePNHJO9ncXHLo0Ll/tvshAWB5y23Q4tZ5gHOwOHys90CydWvvHr2lAt/ERLJly+rXBADrxSAPVU9VXVpVT62q7174M+zCABh/MzPJhtP8m2bDht5xAGA4znjPXlX9syQ/meSKJHckeXqSjyb53uGWBsC4m5zs7bp58m6cGzb01s92cxb3QwLA4AbZjfMnk3xnkj9trT2jqp6Y5OeGWxZANwgfvccrHD3a24zl0KHe6ObMjF04AWDYBgl797bW7q2qVNXDWmv/paq+beiVAdAZGzeuzK6b7ocEgMENEvaOVNUlSf5TkvdX1ZeTHB1uWQAAAJyPQZ6z97z+j6+tqj9O8k1JbhtqVQAAAJyXQTp7D2mt7UuSqvrLJI8bSkUAMADjmwCwvIEevbCEWvgjisgAAB2WSURBVNEqAAAAWFHnGvbailYBAADAijrtGGdV3Xi6Q0lsmA0AALCGLXfP3uQyx35tpQsBAABg5Zw27LXWPDgdAABgTJ32nr2q+tmqunSZ499bVT8wnLIAumF61/RDD/8GAFhNy41xfjLJe6rq3iR/luRYkouTbE2yLckHkvzS0CsEAADgrC03xvnuJO+uqq1JvivJpiR/k+TtSV7YWvvb1SkRAE610DH1vD0AWNoZH6reWjuY5ODZfnBVPTbJbyd5TJL5JDe31n6tql6b5J+n1ylMkp9pre3pX/PqJDuTPJjkJ1prf3S23wswaovHNvcd3nfKmnACAKyGM4a98/BAkp9qrf1ZVU0mub2q3t8/9sbW2i8vPrmqnpTkuiRPTvItST5QVU9orT04xBoBAAA6aWhhr7X2hSRf6P88W1V3Jdm8zCXPTfLO1tp9ST5bVYeSPDXJR4dVI8AwLO7cGTVcWbqmADC40+7GuZKq6sokfy/Jx/pLL62qT1TVf1i04+fmJJ9bdNmRLBEOq+qFVXWgqg4cO3bs5MMAAABkgM5eVT0hyW8k+ebW2jVV9e1JntNa+3eDfEFVbUzye0le1lr7m6r6jSS/kKT1X9+Q5IYktcTl7ZSF1m5OcnOSTE1NnXIcgO7SNQWAwQ3S2XtLklcnuT9JWmufSO/eujOqqovSC3rvaK39fv/6L7bWHmytzfc/+6n9048keeyiy69IcnSQ7wFYq/Zev1cYAQBGYpCw9/DW2sdPWnvgTBdVVSX5rSR3tdZ+ZdH6pkWnPS/Jp/o/35rkuqp6WFVdld7z/E7+XgAAAAYwyAYtf1VV35r+SGVVvSD9jVfO4LuS/B9JPllVd/TXfibJP6qqbf3PuzvJjydJa+3TVXVLkjvTC5MvsRMnMO6MGg6Pv1MAWN4gYe8l6d0j98Sq+nySzyb50TNd1Frbn6Xvw9uzzDW/mOQXB6gJAACAZQzyUPX/nuT7q2oiyYbW2uzwywIAAOB8DLIb5y8leV1r7Sv995em97D0nx12cQDjyLPgAIC1YJANWp69EPSSpLX25SQ7hlcSAAAA52uQe/YuqKqHtdbuS5Kq+sYkDxtuWQDjy7PgAIC1YJCw9/YkH6yq/ye9HTRvSPLWoVYFAADAeRlkg5bXVdUnk3xfertr/kJr7Y+GXhlAB9zxP+4480kAAEMwSGcvrbXbktw25FoAOmfbY7aNugQAYJ06bdirqv2tte1VNZv+A9UXDiVprbVHDL06ADphdjbZvTs5eDDZujWZmUkmJ8/vM90PCZwr//xgvTht2Gutbe+/nue/jgHWF49e+Hr79yc7diTz88ncXDIxkdx4Y7JnT7J9+6irA4DuWvbRC1W1oao+tVrFANAts7O9oDc72wt6Se91Yf348dHWBwBdtuw9e621+ar6i6p6XGvtL1erKIBx5tELJ+ze3evoLWV+vnd8587BP0/XFDhX/vnBejTIBi2bkny6qj6eZG5hsbX2nKFVBUAnHDx4oqN3srm55NCh1a0HANaTQcLezw29CgA6aevW3j16SwW+iYlky5az+zxdU+Bc+ecH69Fyu3FenORfJNmS5JNJfqu19sBqFQbQBev9PyRmZnqbsSxlw4becQBgOJbboOWtSabSC3rPTvKGVakIgM6YnOztujk52evkJb3XhfWNG0dbHwB0WbXWlj5Q9cnW2t/t/3xhko+31p6ymsWdydTUVDtw4MCoywDgDI4f723GcuhQb3RzZkbQA4CVUFW3t9amljq23D179y/80Fp7oKpWvDAA1oeNG89u100A4PwtF/a+o6r+pv9zJfnG/vtK0lprjxh6dQAAAJyT04a91toFq1kIAAAAK2e5DVoAAAAYU8IeAABABwl7AAAAHSTsAQAAdJCwBzBE07umM71retRlAADrkLAHAADQQcs9Zw+A83TH/7hj1CUAAOuUsAewwhaPbX71vq+esrb3+r2rWxAAsC4Z4wQAAOggYQ8AAKCDhD0AAIAOcs8ewApbfE/eJTddcsoaAMBqEPYAhmjbY7aNugQAYJ0S9lizZmeT3buTgweTrVuTmZlkcnLUVQEAwHio1tqoazhnU1NT7cCBA6MugyHYvz/ZsSOZn0/m5pKJiWTDhmTPnmT79lFXB9218IgIY6cAMB6q6vbW2tRSx2zQwpozO9sLerOzvaCX9F4X1o8fH219AAAwDoQ91pzdu3sdvaXMz/eOw7iY3jX9dQ9UBwBYLe7ZY805ePBER+9kc3PJoUOrWw903eIwuu/wvlPWjHQCwHjS2WPN2bq1d4/eUiYmki1bVrceAAAYRzZoYc2ZnU02b+69nmxyMjl6NNm4cfXrgkEt1Sn7nr/zPQ+treVOmQ1aAGC82KCFsTI52dt1c3LyRIdvYuLEuqAHAABn5p491qTt23sdvN27e/fobdnSe86eoMc4WNwVu/DnLzxlDQBgNQh7rFkbNyY7d466ClhfhFIA6A5jnAAAAB2kswewwi656ZKHfn6wPXjK2lde9ZVVrwkAWH909gAAADpIZw9ghS3u3C109HTzAIDVprMHAADQQcIeAABABxnjBBgi45sAwKjo7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB104agL4MxmZ5Pdu5ODB5OtW5OZmWRyctRVAeNietd0kmTv9XtX5ToAYG0YWmevqh5bVX9cVXdV1aer6if764+sqvdX1cH+66WLrnl1VR2qqs9U1bOGVds42b8/2bw5ednLkte9rve6eXNvHQAA4HSGOcb5QJKfaq1dneTpSV5SVU9K8qokH2ytbU3ywf779I9dl+TJSa5N8utVdcEQ61vzZmeTHTt6r3NzvbW5uRPrx4+Ptj4AAGDtGtoYZ2vtC0m+0P95tqruSrI5yXOTTPdPe2uSvUle2V9/Z2vtviSfrapDSZ6a5KPDqnGt2707mZ9f+tj8fO/4zp2rWxMwHhZGMJNk3+F9p6ydbjTzXK8DANaeVdmgpaquTPL3knwsyTf3g+BCIHx0/7TNST636LIj/bWTP+uFVXWgqg4cO3ZsmGWP3MGDJzp6J5ubSw4dWt16AACA8TH0DVqqamOS30vystba31TVaU9dYq2dstDazUluTpKpqalTjnfJ1q3JxMTSgW9iItmyZfVrAsbD4g7c2Wy0cq7XAQBrz1A7e1V1UXpB7x2ttd/vL3+xqjb1j29K8qX++pEkj110+RVJjg6zvrVuZibZcJr/hTZs6B0HAABYyjB346wkv5Xkrtbaryw6dGuSH+v//GNJ3r1o/bqqelhVXZVka5KPD6u+cTA5mezZ03udmOitTUycWN+4cbT1AQAAa1e1NpxJyKranuT/TfLJJAvbjPxMevft3ZLkcUn+MskPt9b+un/Nv05yQ3o7eb6stXbbct8xNTXVDhw4MJT615Ljx3ubsRw61BvdnJkR9AAAgKSqbm+tTS15bFhhbzWsl7AHAACwlOXC3qrsxgkAAMDqEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhD4CRm941neld06MuAwA6RdgDAADoIGEPAACggy4cdQEArE+Lxzb3Hd53ytre6/eubkEA0DE6ewAAAB2kswfASCzu3C109HTzAGDl6OwBAAB0kLAHAADQQcY4YYzMzia7dycHDyZbtyYzM8nk5KirgvNnfBMAVp6wB2Ni//5kx45kfj6Zm0smJpIbb0z27Em2bx91dQAArDXGOGEMzM72gt7sbC/oJb3XhfXjx0dbHwAAa4+wB2Ng9+5eR28p8/O94wAAsJiwB2Pg4METHb2Tzc0lhw6tbj0AAKx9wh6Mga1be/foLWViItmyZXXrAQBg7RP2YAzMzCQbTvP/rRs29I4DAMBiwh6MgcnJ3q6bk5MnOnwTEyfWN24cbX0AAKw9Hr0AY2L79uTo0d5mLIcO9UY3Z2YEPQAAlibswRjZuDHZuXPUVQAAMA6McQIAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIewICmd01netf0qMsAABiIsAcAANBBwh4AAEAHXTjqAgDWssVjm/sO7ztlbe/1e1e3IACAAensAQAAdJDOHsAyFnfuFjp6unkAwDjQ2QMAAOggYQ8AAKCDjHECDMj4JgAwToQ9WMLsbLJ7d3LwYLJ1azIzk0xOjrqqtVsXAABrT7XWRl3DOZuammoHDhwYdRl0zP79yY4dyfx8MjeXTEwkGzYke/Yk27erCwCAtaOqbm+tTS15TNiDE2Znk82be68nm5xMjh5NNm5UFwAAa8NyYc8GLbDI7t29ztlS5ud7x0dhrdYFAMDaJezBIgcP9kYklzI3lxw6tLr1LFirdQEAsHYJe7DI1q29e+GWMjGRbNmyuvUsWKt1AQCwdgl7sMjMTG/Tk6Vs2NA7PgprtS4AANYuYQ8WmZzs7W45OXmikzYxcWJ9VJugrNW6AABYuzxnD06yfXtvd8vdu3v3wm3Z0uucjTpQbd+efOYzyate1Xv9tm9Lbrop2bRptHUBALA2efQCjAnP2QMA4GQevQBjbna2F/RmZ0/syjk3d2L9+PHR1gcAwNoj7K2g2dnkN38zeeUre69LPQAbzoXn7AEAcLbcs7dClhqxu/FGI3asDM/ZAwDgbOnsrQAjdgyb5+wBAHC2hL0VYMSOYfOcPQAAzpawtwKM2DFsnrMHAMDZcs/eClgYsVsq8BmxY6Ws1ef/AQCwNnnO3gqYnU02b156983Jyd5/oPsPcgAAYKV5zt6QGbEDAADWGmOcK8SIHQAAsJYIeyto48Zk585RVwEAAGCMEwAAoJOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6KALR10AZzY7m+zenRw8mGzdmszMJJOTo64K1pfpXdNJkr3X7x1pHQAAgxpaZ6+q/kNVfamqPrVo7bVV9fmquqP/Z8eiY6+uqkNV9Zmqetaw6ho3+/cnmzcnL3tZ8rrX9V43b+6tAwAAnM4wxzh3Jbl2ifU3tta29f/sSZKqelKS65I8uX/Nr1fVBUOsbSzMziY7dvRe5+Z6a3NzJ9aPHx9tfQAAwNo1tDHO1tqHq+rKAU9/bpJ3ttbuS/LZqjqU5KlJPjqk8sbC7t3J/PzSx+bne8d37lzdmmA9WRjdTJJ9h/edsmakEwBYy0axQctLq+oT/THPS/trm5N8btE5R/prp6iqF1bVgao6cOzYsWHXOlIHD57o6J1sbi45dGh16wEAAMbHam/Q8htJfiFJ67++IckNSWqJc9tSH9BauznJzUkyNTW15DldsXVrMjGxdOCbmEi2bFn9mmA9Wdy5s0ELADBuVrWz11r7YmvtwdbafJK3pDeqmfQ6eY9ddOoVSY6uZm1r0cxMsuE0/wtt2NA7DgAAsJRVDXtVtWnR2+clWdip89Yk11XVw6rqqiRbk3x8NWtbiyYnkz17eq8TE721iYkT6xs3jrY+AABg7RraGGdV/cck00kuq6ojSV6TZLqqtqU3onl3kh9Pktbap6vqliR3JnkgyUtaaw8Oq7Zxsn17cvRobzOWQ4d6o5szM4IerDbjmwDAuKnWxve2t6mpqXbgwIFRlwEAADASVXV7a21qqWOj2I0TAACAIRP2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADooAtHXQCMyuxssnt3cvBgsnVrMjOTTE4uf830rukkyd7r9w69PgAAOB/CHuvS/v3Jjh3J/HwyN5dMTCQ33pjs2ZNs3z7q6gAA4PwZ42TdmZ3tBb3Z2V7QS3qvC+vHj4+2PgAAWAk6e6w7u3f3OnpLmZ/vHd+588Tawuhmkuw7vO+UNSOdAACsRTp7rDsHD57o6J1sbi45dGh16wEAgGHQ2eOcnMvmJmvF1q29e/SWCnwTE8mWLV+/trhzZ4MWAADGhc4eZ23//mTz5uRlL0te97re6+bNvfVxMDOTbDjN/+Vv2NA7DgAA407Y46x0YXOTycnerpuTk71OXtJ7XVjfuHG09QEAwEowxslZOdvNTdaq7duTo0d79R461BvdnJk5c9AzvgkAwLgQ9jgrXdrcZOPG8QimAABwLoxxclYWNjdZylKbmwAAAKMh7HFWbG4CAADjQdjjrNjcBAAAxoN79jhr57q5CQAAsHqEPc6JzU0AAGBtM8YJAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTs8f+3d/+xV9V1HMefr/EVyFBUECNkfsk0Q3PEr+ky9YsN0zl/bMw0S8lqjTZDNzQdrdFapmJZxPJH6fyRP9JhaW6GyoA2pyB++fGFFH8EDBRTskwqTerdH+dzx/F6z+VKfu899/J6bGffcz/nfM/5nPvifr+fN+fH18zMzMzMOpCLPTMzMzMzsw7kYs/MzMzMzKwDudgzMzMzMzPrQIqIVvdht0l6DdjU6n400XBgW6s7Ye+LM2s/zqy9OK/248zajzNrP86s/fw/mR0SEQfWWtDWxd6eRtKKiJjY6n5Y45xZ+3Fm7cV5tR9n1n6cWftxZu2nvzLzZZxmZmZmZmYdyMWemZmZmZlZB3Kx115uanUH7H1zZu3HmbUX59V+nFn7cWbtx5m1n37JzPfsmZmZmZmZdSCf2TMzMzMzM+tALvbMzMzMzMw6kIu9FpN0i6RXJa3Ntc2V9KykNZJ+I2m/3LIrJL0gab2kk3PtEyT1pWXzJKnZx7InqJVXbtksSSFpeK7NebVYUWaSLkq5rJN0Ta7dmbVYwc/FcZKelLRK0gpJk3PLnFkLSRotabGkZ9LnaWZqP0DSo5KeT1/3z32PM2uhOpl5/FFSRZnllnsMUjL1MmvqGCQiPLVwAo4HxgNrc21Tga40fzVwdZofC6wGBgFjgBeBAWnZcuBYQMDDwCmtPrZOnGrlldpHAwuBTcBw51WeqeAz1gM8BgxKr0c4s/JMBZk9UnnPgVOBJc6sHBMwEhif5vcBnku5XANcntov9++y8kx1MvP4o6RTUWbptccgJZzqfM6aOgbxmb0Wi4g/AK9XtT0SETvSyyeBg9P8GcA9EfF2RGwAXgAmSxoJ7BsRT0T2L+J24MzmHMGepVZeyXXAZUD+iUfOqwQKMpsBXBURb6d1Xk3tzqwECjILYN80PxR4Oc07sxaLiK0R0Zvm3wSeAUaRZXNbWu02dr7/zqzFijLz+KO86nzOwGOQUqqTWVPHIC72yu9Csgoesn8gm3PLtqS2UWm+ut2aQNLpwEsRsbpqkfMqr8OBz0paJmmppEmp3ZmV18XAXEmbgWuBK1K7MysRSd3Ap4FlwEERsRWyQQ8wIq3mzEqkKrM8jz9KKp+ZxyDtoepz1tQxSNfud9v6m6TZwA7gzkpTjdWiTrv1M0l7A7PJLn15z+Iabc6rHLqA/YFjgEnAvZI+hjMrsxnAJRGxQNLZwM3A53BmpSFpCLAAuDgi/l7nlhJnVhLVmeXaPf4oqXxmZBl5DFJyNX42NnUM4jN7JSXpAuA04Lx0yhaySn50brWDyS5l2sLOSy3y7db/DiW7rnq1pI1k732vpI/gvMpsC3B/ZJYD/wWG48zK7ALg/jR/H1B5QIszKwFJe5ENZu6MiEpOf06XH5G+Vi5VcmYlUJCZxx8lViMzj0FKruBz1tQxiIu9EpL0eeDbwOkR8c/cogeBcyQNkjQGOAxYni6PeVPSMenpPOcDDzS943ugiOiLiBER0R0R3WQfyPER8QrOq8x+C0wBkHQ4MBDYhjMrs5eBE9L8FOD5NO/MWiy9vzcDz0TEj3OLHiQr0klfH8i1O7MWKsrM44/yqpWZxyDlVudnY3PHII0+ycVTvz2p525gK/AO2Yf0q2Q3ZG4GVqXphtz6s8mezrOe3JN4gInA2rRsPqBWH1snTrXyqlq+kfQkLOdVjqngMzYQ+FXKoBeY4szKMxVkdhzwNNmTypYBE5xZOaaUTQBrcr+3TgWGAYvICvNFwAHOrBxTncw8/ijpVJRZ1Toeg5RoqvM5a+oYRGkDZmZmZmZm1kF8GaeZmZmZmVkHcrFnZmZmZmbWgVzsmZmZmZmZdSAXe2ZmZmZmZh3IxZ6ZmZmZmVkHcrFnZmb9StIwSavS9Iqkl3KvB1ate7GkvRvY5hJJE2u0nyZppaTVkv4o6Rsf5LHsLklzqo77qt3Yxn6SvrmLdc6SFJKO2P3emplZp/CfXjAzs6aRNAfYHhHXFizfCEyMiG272M4SYFZErMi17QVsAiZHxBZJg4DuiFj/AXW/Vj+6ImJHA+vNoc5xN7ivbuChiDiqzjr3AiOBRRExp8byARHxn93tg5mZtRef2TMzs6aTdFI6A9cn6RZJgyR9C/gosFjS4rTe9ZJWSFon6Xu72Ow+QBfwF4CIeLtS6EkaI+kJSU9J+r6k7an9REkP5fo1X9L0NP/dtP5aSTdJUmpfIulKSUuBmZImSFoq6WlJCyWNbPA9GCBpbtrHmvxZSEmX5torx30VcGg6Mzi3xvaGAJ8h+yP05+TaT5S0WNJdQF/RfiUNkbRIUm/K5YxGjsPMzMrLxZ6ZmTXbYOBW4AsR8SmyAm1GRMwDXgZ6IqInrTs7IiYCRwMnSDq6aKMR8TrwILBJ0t2SzpNU+T33U+D6iJgEvNJgP+dHxKR0Ju1DwGm5ZftFxAnAPOBnwLSImADcAvygYHuX5C7jPJmsKHsj9WkS8PVUlE4FDgMmA+OACZKOBy4HXoyIcRFxaY3tnwn8PiKeA16XND63bDLZezm2aL/AW8BZETEe6AF+VClwzcysPbnYMzOzZhsAbEhFCcBtwPEF654tqRdYCRwJjK234Yj4GnASsByYRVZ8QXbG6+40f0eD/eyRtExSHzAl7b/i1+nrJ4CjgEclrQK+AxxcsL3rUqE2LiIWAlOB89P3LQOGkRV5U9O0EugFjkjtu3IucE+avye9rlgeERvSfNF+BVwpaQ3wGDAKOKiB/ZqZWUl1tboDZma2x/lHIyuls02zgEkR8VdJt5KdFawrIvrILle8A9gATK8sqrH6Dt79H5+D074HAz8nu39wc7rnLr/vyjEIWBcRxzZyTFUEXJQKv52N2Vm/H0bEjVXt3YUbkoaRFaRHSQqygjokXVbV33r7nQ4cCEyIiHfS/ZO7fL/NzKy8fGbPzMyabTDQLenj6fWXgaVp/k2ye+8A9iUrUt6QdBBwSr2NpnvOTsw1jSN7YAvA4+y8j+283DqbgLHpnsGhZGcFK30E2JbuhZtWsNv1wIGSjk192EvSkQXrVlsIzEgPlkHS4ZI+nNovTPtF0ihJI3j3e1NtGnB7RBwSEd0RMZqs0D3ufex3KPBqKvR6gEMaPA4zMyspn9kzM7Nmewv4CnCfpC7gKeCGtOwm4GFJWyOiR9JKYB3wJ7KCrR4Bl0m6EfgXWaE4PS2bCdwlaSawoPIN6azdvcAa4HmySyeJiL9J+gXQB2xMfXyPiPi3pGnAvFQsdgE/SX3elV8C3UBvujfuNeDMiHhE0ieBJ9Itc9uBL0XEi5Iel7QWeLjqvr1zyR7gkrcA+CI7Lzmtu1/gTuB3klYAq4BnGzgGMzMrMf/pBTMz2+NI2h4RQ1rdDzMzs/7kyzjNzMzMzMw6kM/smZmZmZmZdSCf2TMzMzMzM+tALvbMzMzMzMw6kIs9MzMzMzOzDuRiz8zMzMzMrAO52DMzMzMzM+tA/wNaUCIIeIILEwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 1080x720 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def plot_scatter_chart(df,location):\n",
" bhk2 = df[(df.location==location) & (df.bhk==2)]\n",
" bhk3 = df[(df.location==location) & (df.bhk==3)]\n",
" matplotlib.rcParams['figure.figsize'] = (15,10)\n",
" plt.scatter(bhk2.total_sqft,bhk2.price,color='blue',label='2 BHK', s=50)\n",
" plt.scatter(bhk3.total_sqft,bhk3.price,marker='+', color='green',label='3 BHK', s=50)\n",
" plt.xlabel(\"Total Square Feet Area\")\n",
" plt.ylabel(\"Price (Lakh Indian Rupees)\")\n",
" plt.title(location)\n",
" plt.legend()\n",
" \n",
"plot_scatter_chart(df7,\"Rajaji Nagar\")"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA3sAAAJcCAYAAABAE73ZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdfZDld10n+vdnkpCQmdYkJJgwgU0kw5IHYcTm4a5ztSNCIIWiPDhx2b1kyRaswCqJUsCu94K67KZARQtxuYC7EyMuMwJKZEkEgjPsLAh3glEg0Zopk5ghkYSQQGfIw0zme/843ZnOTHfP6Zlz+nT/+vWq6jp9vr/z8OkpoOrN5/P7fqu1FgAAALpl1agLAAAAYPCEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAoE9V9Y6q+qN5rt9aVT95hJ+9tar+7ZFXBwCPJewBsKLMFsiq6tKq2j6qmgBgGIQ9AACADhL2AGCGqnpSVX2squ6uqluq6hcPeskJVbW5qiar6itV9cyDrj+7qm6qqnur6r9X1QlTn3tyVX1y6nPvnfr9zMX5qwBYiYQ9AJhSVauS/HmSv0myNsnzk7ypqi6a8bKXJvmTJKck+eMkf1ZVx824/qokFyV5apKnJfnVqfVVSf57kn+W5ClJHkjye0P7YwBY8YQ9AFaiP6uq+6Z/kvz+1Pqzk5zWWvv11trDrbV/SPLBJJfMeO8NrbWPttb2JvntJCcked6M67/XWru9tfbtJO9M8vNJ0lq7p7X2sdba91prk1PXfny4fyYAK9mxoy4AAEbgZ1prn51+UlWXJvm36XXdnjQVAKcdk+R/zXh++/QvrbX9VbU7yZNmu57ktulrVXVikvckeVGSk6euj1XVMa21R476LwKAgwh7AHDA7Uluaa2tm+c1T57+ZWrs88wkd8x2Pb1xzelrv5zknyd5bmvtn6pqfZK/TlKDKBwADmaMEwAO+HKS71bVW6rq8VV1TFVdUFXPnvGaH6mql1XVsUnelOShJH814/obqurMqjolyX9IsnlqfSy9+/Tum7r29uH/OQCsZMIeAEyZGqf8qSTrk9yS5FtJPpTk+2e87BNJNia5N8m/TvKyqfv3pv1xkk8n+Yepn/80tf47SR4/9Zl/leS6of0hAJCkWmujrgEAAIAB09kDAADoIGEPAACgg4Q9AACADhL2AAAAOmhZn7N36qmntrPOOmvUZQAAAIzEDTfc8K3W2mmzXVvWYe+ss87Kjh07Rl0GAADASFTVbXNdM8YJAADQQcIeAABABwl7AAAAHbSs79mbzd69e7N79+48+OCDoy5lpE444YSceeaZOe6440ZdCgAAMAKdC3u7d+/O2NhYzjrrrFTVqMsZidZa7rnnnuzevTtnn332qMsBAABGoHNjnA8++GCe8IQnrNiglyRVlSc84QkrvrsJAAArWefCXpIVHfSm+TcAAICVrZNhDwAAYKUT9gbs9ttvz4UXXphzzz03559/fn73d3931te94x3vyNq1a7N+/fo8/elPzy/8wi9k//79SZJLL700H/3oRx/z+jVr1iRJbr311lxwwQWPrn/wgx/Ms571rNx7771D+osAAIDlaMWHvcnJ5EMfSt7ylt7j5OTRfd6xxx6b3/qt38rNN9+cv/qrv8r73ve+3HTTTbO+9vLLL8+NN96Ym266KV/96lezbdu2BX3X1Vdfnfe+97359Kc/nZNPPvnoCgcAADqlc7txLsT27cnFFyf79yd79iSrVydXXJF86lPJhg1H9plnnHFGzjjjjCTJ2NhYzj333HzjG9/IeeedN+d7Hn744Tz44IMLCmxbtmzJlVdemeuvvz6nnnrqkRULAAB01ort7E1O9oLe5GQv6CW9x+n1++8/+u+49dZb89d//dd57nOfO+v197znPVm/fn3OOOOMPO1pT8v69esfvfbmN78569evf/Rnpttuuy1vfOMb8+lPfzqnn3760RcKAAB0zooNe5s39zp6s9m/v3f9aNx///15+ctfnt/5nd/J933f9836mukxzrvuuit79uzJRz7ykUevvfvd786NN9746M9Mp512Wp7ylKdky5YtR1ckAADQWSs27O3ceaCjd7A9e5Jdu478s/fu3ZuXv/zledWrXpWXvexlh339cccdlxe96EX5/Oc/39fnn3jiibn22mvz/ve/Px/+8IePvFAAAKCzVuw9e+vW9e7Rmy3wrV6dnHPOkX1uay2XXXZZzj333FxxxRV9v+cLX/jCIeOa8znttNNy3XXXZWJiIqeeemouuuiiIysYAADopBXb2du4MVk1x1+/alXv+pH43//7f+fqq6/O5z73uUfvt/vUpz4162un79m74IILsm/fvrz+9a9f0HedffbZueaaa/Ka17wmX/rSl46sYAAAoJOqtTbqGo7Y+Ph427Fjx2PWbr755px77rl9vX+23ThXrTq63TiXkoX8WwAAAMtPVd3QWhuf7dqKHeNMeoHujjt6m7Hs2tUb3dy4MZk6vxwAAGDZWtFhL+kFu8suG3UVAAAAg7Vi79kDAADox8SmiUxsmhh1GQsm7AEAAHSQsAcAANBBK/6ePQAAgIPNHNvcdtu2Q9a2Xrp1cQs6Ajp7A/bggw/mOc95Tp75zGfm/PPPz9vf/vZZX/eOd7wja9euzfr16/P0pz89v/ALv5D9+/cnSS699NJ89KMffczr10xtEXrrrbfmggsueHT9gx/8YJ71rGfl3nvvHdJfBAAALEc6ezmQ0AeRzo8//vh87nOfy5o1a7J3795s2LAhL37xi/O85z3vkNdefvnl+ZVf+ZXs378/P/ZjP5Zt27blwgsv7Pu7rr766rz3ve/N5z73uZx88slHXTsAANAzMxsMMi8sJmFvwKrq0S7c3r17s3fv3lTVvO95+OGH8+CDDy4osG3ZsiVXXnllrr/++px66qlHVTMAANA9xjiH4JFHHsn69evzxCc+MS94wQvy3Oc+d9bXvec978n69etzxhln5GlPe1rWr1//6LU3v/nNWb9+/aM/M91222154xvfmE9/+tM5/fTTh/q3AAAAy9OK7ewN84bLY445JjfeeGPuu+++/OzP/my+9rWvPeY+u2nTY5x79+7NK17xinzkIx/JJZdckiR597vfnVe84hWPvna6W5gkp512Wk455ZRs2bIll19++RHXCQAAHN5yG9+cprM3RCeddFImJiZy3XXXzfu64447Li960Yvy+c9/vq/PPfHEE3Pttdfm/e9/fz784Q8PolQAAKBjVmxnb1g3XN5999057rjjctJJJ+WBBx7IZz/72bzlLW+Z9z2ttXzhC184ZFxzPqeddlquu+66TExM5NRTT81FF110tKUDAAAdorM3YHfeeWcuvPDCPOMZz8izn/3svOAFL8hLXvKSWV87fc/eBRdckH379uX1r3/9gr7r7LPPzjXXXJPXvOY1+dKXvjSI8gEAgI6o1tqoazhi4+PjbceOHY9Zu/nmm3Puuecu6HOW61aqh3Mk/xYAAMDyUVU3tNbGZ7u2Ysc4Z+payAMAADDGCQAA0EGdDHvLeTR1UPwbAADAyta5sHfCCSfknnvuWdFhp7WWe+65JyeccMKoSwEAAEakc/fsnXnmmdm9e3fuvvvuUZcyUieccELOPPPMUZcBAACMSOfC3nHHHZezzz571GUAAACMVOfGOAEAABD2AAAAOknYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpoaGGvqk6oqi9X1d9U1der6tem1k+pqs9U1c6px5NnvOdtVbWrqv6+qi4aVm0AAABdN8zO3kNJfqK19swk65O8qKqel+StSa5vra1Lcv3U81TVeUkuSXJ+khcl+f2qOmaI9QEAAHTW0MJe67l/6ulxUz8tyUuTXDW1flWSn5n6/aVJPtJae6i1dkuSXUmeM6z6AAAAumyo9+xV1TFVdWOSu5J8prX2pSQ/0Fq7M0mmHp849fK1SW6f8fbdU2sHf+Zrq2pHVe24++67h1k+AADAsjXUsNdae6S1tj7JmUmeU1UXzPPymu0jZvnMD7TWxltr46eddtqgSgUAAOiURdmNs7V2X5Kt6d2L982qOiNJph7vmnrZ7iRPnvG2M5PcsRj1AQAAdM0wd+M8rapOmvr98Ul+MsnfJbkmyaunXvbqJJ+Y+v2aJJdU1fFVdXaSdUm+PKz6AAAAuuzYIX72GUmumtpRc1WSLa21T1bVF5NsqarLkvxjklcmSWvt61W1JclNSfYleUNr7ZEh1gcAANBZ1doht8UtG+Pj423Hjh2jLgMAAGAkquqG1tr4bNcW5Z49AAAAFpewBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAAdM7EpolMbJoYdRkjJewBAAB0kLAHAADQQceOugAAAIBBmDm2ue22bYesbb106+IWNGI6ewAAAB2kswcAAHTCzM7ddEdvpXXzZtLZAwAA6CBhDwAAoIOMcQIAAJ2zksc3p+nsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAKxYE5smMrFpYtRlAAzF0MJeVT25qv6yqm6uqq9X1S9Nrb+jqr5RVTdO/Vw84z1vq6pdVfX3VXXRsGoDAADoumOH+Nn7kvxya+0rVTWW5Iaq+szUtfe01n5z5our6rwklyQ5P8mTkny2qp7WWntkiDUCAAB00tDCXmvtziR3Tv0+WVU3J1k7z1temuQjrbWHktxSVbuSPCfJF4dVIwCw8swc29x227ZD1rZeunVxCwIYkkW5Z6+qzkryw0m+NLX0xqr626r6b1V18tTa2iS3z3jb7swSDqvqtVW1o6p23H333UOsGgAAYPka5hhnkqSq1iT5WJI3tda+W1X/NclvJGlTj7+V5DVJapa3t0MWWvtAkg8kyfj4+CHXAQDmM7NzN93R080Dumionb2qOi69oPfh1trHk6S19s3W2iOttf1JPpjeqGbS6+Q9ecbbz0xyxzDrAwAA6Kph7sZZSf4gyc2ttd+esX7GjJf9bJKvTf1+TZJLqur4qjo7ybokXx5WfQAAAF02zDHOH03yr5N8tapunFr7D0l+vqrWpzeieWuS1yVJa+3rVbUlyU3p7eT5BjtxAgDDZHwT6LJh7sa5PbPfh/eped7zziTvHFZNAAAAK8Wi7MYJAADA4hL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAGAeE5smMrFpYtRlLJiwBwAA0EHCHgAAQAcdO+oCAAAAlpqZY5vbbtt2yNrWS7cubkFHQGcPAACgg3T2AAAADjKzczfd0VsO3byZdPYAAAA6SNgDAADoIGOcAAAA81hu45vTdPYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYAwAA6CBhDwAAoIOEPQAAgA4S9gAAADpI2AMAAOggYQ8AAKCDhD0AAIAOEvYAAAA6SNgDAADoIGEPAACgg4Q9AACADhL2AAAAOkjYA4AOmtg0kYlNE6MuA4AREvYAAAA6SNgDAADooGNHXQAAMBgzxza33bbtkLWtl25d3IIAGCmdPQAAgA7S2QOAjpjZuZvu6OnmAaxcfXf2qmp1VR0zzGIAAAAYjDnDXlWtqqp/WVX/s6ruSvJ3Se6sqq9X1burat3ilQkAAMBCzDfG+ZdJPpvkbUm+1lrbnyRVdUqSC5NcWVV/2lr7o+GXCQAshPFNAOYLez/ZWtt78GJr7dtJPpbkY1V13NAqAwAA4IjNOcY5HfSq6qlVdfzU7xNV9YtVddLM1wAAALC09LNBy8eSPFJV5yT5gyRnJ/njoVYFAADAUekn7O1vre1L8rNJfqe1dnmSM4ZbFgAAAEejn7C3t6p+Psmrk3xyas29egAAAEtYP2Hv3yT5P5K8s7V2S1WdncQOnAAAAEvYfLtxJklaazdV1VuSPGXq+S1Jrhx2YQAAABy5w3b2quqnktyY5Lqp5+ur6pphFwYAAMCR62eM8x1JnpPkviRprd2Y3o6cAAAALFH9hL19rbXvHLTWhlEMAAAAg3HYe/aSfK2q/mWSY6pqXZJfTPKF4ZYFAADA0eins/fvk5yf5KH0DlP/TpI3DbMoAAAAjk4/u3F+L8l/rKr/3Frbswg1AQAAcJT62Y3zX1TVTUlunnr+zKr6/aFXBgAAwBHrZ4zzPUkuSnJPkrTW/ibJjw2zKAAAAI5OP2EvrbXbD1p6ZAi1AAAAMCD97MZ5e1X9iyStqh6X3m6cNw+3LAAAAI5GP529f5fkDUnWJvlGkvVTzwEAAFii+tmN81tJXrUItQAAADAg/ezG+YNV9edVdXdV3VVVn6iqH1yM4gAAADgy/Yxx/nGSLUnOSPKkJH+S5H8MsygAAACOTj9hr1prV7fW9k39/FGSNuzCAAAAOHL97Mb5l1X11iQfSS/kbUzyP6vqlCRprX17iPUBAABwBPoJexunHl930Ppr0gt/7t8DAABYYvrZjfPsxSgEAACAwTls2Kuq/2u29dbaHw6+HAAAAAahnw1anj3j5/9M8o4kP324N1XVk6vqL6vq5qr6elX90tT6KVX1maraOfV48oz3vK2qdlXV31fVRUf0FwEAANDXGOe/n/m8qr4/ydV9fPa+JL/cWvtKVY0luaGqPpPk0iTXt9aunNr45a1J3lJV5yW5JMn56R3x8Nmqelpr7ZEF/UUAACMysWkiSbL10q2L8j6A+fTT2TvY95KsO9yLWmt3tta+MvX7ZJKbk6xN8tIkV0297KokPzP1+0uTfKS19lBr7ZYku5I85wjqAwAAWPH6uWfvz3PgXL1VSc5L72D1vlXVWUl+OMmXkvxAa+3OpBcIq+qJUy9bm+SvZrxt99TawZ/12iSvTZKnPOUpCykDAABgxejn6IXfnPH7viS3tdZ29/sFVbUmyceSvKm19t2qmvOls6wdcnh7a+0DST6QJOPj4w53BwBGanoEM0m23bbtkLW5RjOP9H0A/Zp3jLOqjkny9dbattbatiT/X5KLq+rmfj68qo5LL+h9uLX28anlb1bVGVPXz0hy19T67iRPnvH2M5Pc0fdfAgAAwKPm7OxV1SVJ/t8ke6pqZ3q7cF6dXuB71eE+uHotvD9IcnNr7bdnXLomyauTXDn1+IkZ639cVb+d3gYt65J8eYF/DwDAoprZgVvIRitH+j6Afs03xvmrSX6ktbarqp6V5ItJLmmt/Wmfn/2jSf51kq9W1Y1Ta/8hvZC3paouS/KPSV6ZJK21r1fVliQ3pTcu+gY7cQIAAByZ+cLew621XUkydXzCLQsIemmtbc/s9+ElyfPneM87k7yz3+8AAJaPldC92v6P20ddAsCj5gt7T6yqK2Y8XzPz+UGjmQAAK96ax605ovd1OQADozNf2PtgkrF5ngMAALBEzRn2Wmu/tpiFAADdsxKOFzjpypMe/f07D33nkLX73nrfotcEkBzm6AUAAACWp34OVQcAOCIr4XiBmZ276Y6ebh6wFOjsAQAAdNBhO3tVdXySlyc5a+brW2u/PryyAAAAOBr9jHF+Isl3ktyQ5KHhlgMAdFXXxjdnY3wTWEr6CXtnttZeNPRKAAAAGJh+7tn7QlX90NArAYAFmtg08Zht/AGAA/rp7G1IcmlV3ZLeGGclaa21Zwy1MgAAAI5YP2HvxUOvAgAAgIE6bNhrrd2WJFX1xCQnDL0iAJjHzLHNbbdtO2RtJWwCAgD9OOw9e1X101W1M8ktSbYluTXJtUOuCwAAgKPQzxjnbyR5XpLPttZ+uKouTPLzwy0LAGY3s3M33dHTzQOAQ/WzG+fe1to9SVZV1arW2l8mWT/kugAAADgK/XT27quqNUk+n+TDVXVXkn3DLQsAAICj0U/Ye2mSB5NcnuRVSb4/ya8PsygA6IfxTQCYWz+7ce6Z8fSqIdYCAADAgMwZ9qpqe2ttQ1VNJmkzL6V3qPr3Db06AAAAjsicYa+1tmHqcWzxygEAAGAQ5uvsnTLfG1tr3x58OQAAAAzCfPfs3ZDe+GYleUqSe6d+PynJPyY5e+jVAQAAcETmPGevtXZ2a+0Hk/xFkp9qrZ3aWntCkpck+fhiFQgAAMDC9XOo+rNba5+aftJauzbJjw+vJAAAAI5WP+fsfauqfjXJH6U31vmvktwz1KoAgE6b2DSRxFmJAMPUT2fv55OcluRPk/xZkidOrQEAALBE9XOo+reT/NIi1AIAAMCAHDbsVdXTkvxKkrNmvr619hPDKwsA6Jrp0c0k2XbbtkPWjHQCDFY/9+z9SZL3J/lQkkeGWw4AAACD0E/Y29da+69DrwQA6LSZnTsbtAAMXz8btPx5Vb2+qs6oqlOmf4ZeGQAAAEesn87eq6ce3zxjrSX5wcGXAwAAwCD0sxvn2YtRCACwchjfBBi+OcNeVb1svje21j4++HIAYGlxbxkAy9V8nb2fmudaSyLsAQAALFFzhr3W2r9ZzEIAAAAYnH42aAGAFcXh3wB0QT9HLwAAALDM6OwBwEEc/g1AF/QV9qrqXyQ5a+brW2t/OKSaAAAAOEqHDXtVdXWSpya5MckjU8stibAHAACwRPXT2RtPcl5rrQ27GABYaoxvArBc9bNBy9eSnD7sQgAAABicOTt7VfXn6Y1rjiW5qaq+nOSh6euttZ8efnkAAAAcifnGOH9z0aoAAABgoOYMe621bUlSVS9urV0781pV/bsk24ZcGwAAAEeon3v2/u+q+onpJ1X1liQvHV5JAAAAHK1+duP86SSfrKo3J3lRkqdPrQEAALBEHTbstda+VVU/neSzSW5I8grHMAAAACxt8+3GOZnebpzTHpfkB5O8oqpaa+37hl0cAAAAR2a+DVrGFrMQAAAABqefe/ZSVScnWZfkhOm11trnh1UUAPRjYtNEkmTrpVtHWgcALEWHDXtV9W+T/FKSM5PcmOR5Sb6Y5Cfmex8AAACj08/RC7+U5NlJbmutXZjkh5PcPdSqAAAAOCr9jHE+2Fp7sKpSVce31v6uqv750CsDgFlMj24mybbbth2yZqQTAHr6CXu7q+qkJH+W5DNVdW+SO4ZbFgAAAEejFnJkXlX9eJLvT3Jta23v0Krq0/j4eNuxY8eoywBgRGzQAsBKV1U3tNbGZ7vW126c01pr26Y+8B+TPGUAtQEAADAE/WzQMpsaaBUAAAAM1II6ezP0P/sJAENifBMA5jZn2KuqK+a6lGTNcMoBAABgEObr7I3Nc+13B10IAAAAgzNn2Gut/dpiFgIAAMDgzLlBS1X9alWdPM/1n6iqlwynLAAAAI7GfGOcX03yyap6MMlXktyd5IQk65KsT/LZJP956BUCAACwYPONcX4iySeqal2SH01yRpLvJvmjJK9trT2wOCUCAACwUIc9eqG1tjPJzkWoBQAAgAE50kPVAQAAWMKEPQAAgA4S9gCGbGLTRCY2TYy6DABghTls2Kuqp1XV9VX1tannz6iqXx1+aQAAABypfjp7H0zytiR7k6S19rdJLhlmUQAAABydw+7GmeTE1tqXq2rm2r4h1QPQCTPHNrfdtu2Qta2Xbl3cggCAFaefzt63quqpSVqSVNUrktw51KoAAAA4Kv109t6Q5ANJnl5V30hyS5J/NdSqAJa5mZ276Y6ebh4AsJj6OVT9H5L8ZFWtTrKqtTY5/LIAAAA4Gv3sxvmfq+qk1tqe1tpkVZ1cVf9pMYoDAADgyPRzz96LW2v3TT9prd2b5OLhlQTQLVsv3WqEEwBYdP2EvWOq6vjpJ1X1+CTHz/N6AJiXg+YBYPj6CXt/lOT6qrqsql6T5DNJrjrcm6rqv1XVXdOHsU+tvaOqvlFVN079XDzj2tuqaldV/X1VXXQkfwwAAAA9/WzQ8q6q+mqS5yepJL/RWvuLPj57U5LfS/KHB62/p7X2mzMXquq89A5qPz/Jk5J8tqqe1lp7pI/vAQAA4CD9HL2Q1tq1Sa5dyAe31j5fVWf1+fKXJvlIa+2hJLdU1a4kz0nyxYV8JwBLl4PmAWBxzTnGWVXbpx4nq+q7M34mq+q7R/Gdb6yqv50a8zx5am1tkttnvGb31Npsdb22qnZU1Y677777KMoAAADorjk7e621DVOPYwP8vv+a5DeStKnH30rymvTGQw8pYY66PpDeIe8ZHx+f9TUALD0OmgeAxTXvBi1VtWrmBitHq7X2zdbaI621/Uk+mN6oZtLr5D15xkvPTHLHoL4XAABgpZk37E2Fsr+pqqcM4suq6owZT382yXSQvCbJJVV1fFWdnWRdki8P4jsBAABWon42aDkjyder6stJ9kwvttZ+er43VdX/SDKR5NSq2p3k7Ukmqmp9eiOatyZ53dRnfb2qtiS5Kcm+JG+wEydAdxnfBIDhq9bmv+2tqn58tvXW2rahVLQA4+PjbceOHaMuA5SUBVAAAB+gSURBVAAAYCSq6obW2vhs1+bs7FXVCUn+XZJzknw1yR+01vYNp0QAAAAGab579q5KMp5e0HtxejtnAgAAsAzMd8/eea21H0qSqvqD2DAFAABg2Zivs7d3+hfjmwAAAMvLfJ29Z1bVd6d+rySPn3peSVpr7fuGXh0AAABHZM6w11o7ZjELAQAAYHDmPVQdAACA5UnYAwAA6CBhD2AFmtg0kYlNE6MuAwAYImEPAACgg4Q9AACADprv6AUAOmTm2Oa227Ydsrb10q2LWxAAMFQ6ewAAAB2kswewDE135BbSjZv52iN5PwCwvOjsAQAAdJCwBwAA0EHGOAGWiUFusGJ8EwC6T2cPYMAcWA4ALAU6ewDLRJc2WDnpypOSJPe99b4RVwIA3aWzBwAA0EE6ewAD4MByAGCpEfYAlqHlGB6nRzeT5DsPfeeQNSOdADBYwh7AAHTpfjoAoBuEPQAWxczOnQ1aAGD4bNACAADQQTp7AANmfBMAWAqEPYAOWS73CxrfBIDhM8YJAADQQcIeAABABxnjBFjmHOgOAMxGZw8AAKCDdPYAlrmFHOi+XDZwAQCOns4eAABABwl7AAAAHWSME6BDZhvPtIELAKxMOnsAAAAdpLMHsEQMa/OUhWzgAgB0h84eAABAB+nsAUMzOZls3pzs3JmsW5ds3JiMjY26qm476cqTkiT3vfW+EVcCAIyasAcMxfbtycUXJ/v3J3v2JKtXJ1dckXzqU8mGDaOubulY7M1TjG8CwMphjBMYuMnJXtCbnOwFvaT3OL1+//2jrQ8AYCXQ2QMGbvPmXkdvNvv3965fdtni1rRUDWLzlOnRzST5zkPfOWTNSCcArEw6e8DA7dx5oKN3sD17kl27FreexTaxaeIxo5gAAKOgswcM3Lp1vXv0Zgt8q1cn55yz+DUtphv/6cZF/b6ZnTsbtAAA03T2gIHbuDFZNcf/uqxa1bvOobZeutUGKgDAwOjsAQM3NtbbdfPg3ThXreqtr1kz6goHb7qjtv709Y/eNzfMXTWXEwe5A8BoCHvAUGzYkNxxR28zll27eqObGzd2M+glyf0P97YYnTnCudjjnInxTQDgAGEPGJo1a1bOrptrHtdLsetPX//oeXnrT18/ypIAgBVO2AM4QjPHNKdHN5PkmDomycoeW1zsw+IBgEMJewAD9kh7ZNQlAAAIewBHaq4D0evXajQFLSGDOCweADg6jl4AAADoIJ096KDJyd4umDt39g4437ixdxzCUrHU6zsS0/elzezqzfy9vb0tek0AwMom7EHHbN9+6Pl2V1zRO99uw4ZRV7f062PwjG8CwGhUa8v3/20eHx9vO3bsGHUZsGRMTiZr1/YeDzY21jv3bpTn3C31+gZluqOnmwcADFtV3dBaG5/tmnv2oEM2b+51zGazf3/v+igt9foAALpE2IMO2bmzNxo5mz17kl27Freegy31+gAAusQ9e9Ah69b17oGbLVCtXp2cc87i1zTTUq9vUH78n/34qEsAANDZgy7ZuDFZNcd/q1et6l0fpaVeHwBAl+jsQYeMjfV2tTx4t8tVq3rro978ZKnXdzSmDw5PDhzDMHPNjpQAwGIT9qBjNmzo7Wq5eXPvHrhzzul1zJZKkFrq9QEAdIWjF2ABungYOIM33dHTzQMAhm2+oxd09qBPDgMHAGA5sUEL9GFyshf0JicP7CS5Z8+B9fvvH219AABwMJ096EM/h4Ffdtni1sTSZXwTAFgKhD3ow0o5DNw9iQAA3SHsQR9WwmHg7kkEAOgWu3FCHyYnk7Vre48HGxvrHSWwnI8OGNbfp1MIADBc8+3GaYMW6MP0YeBjY72OV9J7nF5fzkEv6e+exIXavr0XIN/0puRd7+o9rl3bWwcAYPiMcUKfunwY+KDvSZy5e+nMz0l668u9EwoAsBwIe7AAa9Z0c9fNQd+TaPdSAIDRM8YJZOPGZNUc/2uwalXv+kKslN1LAQCWMmEPVrjpTVR+6qeS449PTjyxt3409yROdwpn05XdSwEAljpjnLCCzXbcwiOPJK96VXLhhUd+T+LGjb1jG2ZzJJ1CAAAWTmcPVqiZm6hMj1zu2ZM8+GByzTVHt/lM13cvBQBYDnT2YIUa9iYqXd69FABgORD2YIVajE1Uurp7KQDAcmCME1Yom6gAAHSbsAcr1KCPWwAAYGkxxgkdNn2sws6dvU7exo29TVKSA5ulvPjFyd69yUMP9Y5eOO44m6gAAHTB0Dp7VfXfququqvrajLVTquozVbVz6vHkGdfeVlW7qurvq+qiYdUFK8X27cnatcmb3pS86129x7Vre+sHa+2xj4Pyla8kT31qbyz0qU/tPQcAYHEMc4xzU5IXHbT21iTXt9bWJbl+6nmq6rwklyQ5f+o9v19VxwyxNui0uY5VmF6///7H/v7ww73XPPxw7/n0+tHYuDH5kR9J/uEfku99r/f4Iz9iPBQAYLEMLey11j6f5NsHLb80yVVTv1+V5GdmrH+ktfZQa+2WJLuSPGdYtUHX9XOswnyvefjh5JWvTD70oV4oXKivfCXZsmX2a1u2JH/7twv/TAAAFmaxN2j5gdbanUky9fjEqfW1SW6f8brdU2uHqKrXVtWOqtpx9913D7VYWK76OVZhvtc89FBy3XXJa1+bnH767KOf83nlK+e//rKXLezzAABYuKWyG2fNsjbr3UOttQ+01sZba+OnnXbakMuC5amfYxXme8201nojmBdeuLCxzn/6p6O7DgDA0VvssPfNqjojSaYe75pa353kyTNed2aSOxa5NuiMjRvnH+PcuHH+oxcOtm9f8t739v/9p59+dNcBADh6ix32rkny6qnfX53kEzPWL6mq46vq7CTrknx5kWuDTqnZ+uUz1qePXhgbS0488fCf93u/1/93/8mfzH/94x/v/7NGZWLTRCY2TYy6DACAIzbMoxf+R5IvJvnnVbW7qi5LcmWSF1TVziQvmHqe1trXk2xJclOS65K8obX2yLBqg67bvHn+sLd5c+/3DRuSO+5IXvGKw3/mQsY4n/Ws5Od+bvZrP/dzyTOe0f9nAQBwZIZ2qHpr7efnuPT8OV7/ziTvHFY90BXH/nrvv7b7/p99c76mnw1apq1Z099Y5ROecOjafIe2b96c/Mf/2NuM5Z/+qfcdH/+4oNcV013PrZduHWkdAMDchhb2gNGZ3nxltsA3vUHLwa8/5pjkkXn66S984WOfb9/eO49v//7e96xenVxxRW80dMOG3mue8YzHBsulbubY5rbbth2yJtgAAMvJUtmNExig+TZfWbXq0IPNN25Mjp3n//o59tjk2c8+8LyfQ9sBABgtnT1YBqZHN5PkkanbWWeuHTzSOb35ysGdt1Wreutr1uSQ119zTXLRRbN//+Mf/9iA2M+h7Zdd1v/ft1TM7NwZUzyUzicALC/CHnTU9OYrmzf3RinPOacX2A4OetNe+MLkL/4ieelLe+Oce/fOHRAXck8gAACjIezBMrDv/9n36GYor919bKqSey/f9+hmKHNZs+bQDtt8m6q88IXJ3XcfPiAu9J5AukHnEwCWl2qtjbqGIzY+Pt527Ngx6jJg6B6zGcov9/4/mrH37HvMZigL/pyDRjsX8jmTk8natb3Hg42N9TqKc3UQ6QZhDwCWhqq6obU2Pts1G7TAEjfbZigz1/vdDGWQm6rMPJB99ere2urVB9YFPQCA0TPGCUvcIZuh/PqBzVgWshnK5s29+/Bms3fvwjdVWeg9gXSLjh4ALH3CHixxh2yG8taTeo9X3regzVC+9rXkwQdnv/bgg8lNNy28ttnuCewCI4oAQBcY44QlbnozlNksZDOUe++d//o99yysLgAAljZhD5a4hR6QPpdTTpn/+hOesLC6AABY2oxxwhI3Npbs+5WTkgemFk74Tu/xrSdl3+OTM38vue+t9x32c84/PznhhNlHOU84ITnvvMHVvBw5MBwA6BqdPVjiJieTBx6Y/doDDyT9np6ycWNy3HGzXzvuuP47hAAALA86e7DEbd6crH7vfQc2aZmxQcvq1clv/25/nzN9LMJc5+yt9F00HRgOAHSNsAdL3CG7cc6wkN04E8clAACsJMIeLHHTu3HOFvgWshvntK4elwAAwGNV6/eGnyVofHy87dixY9RlwFBNTiZr1/YeDzY21uvU6cwBAKxMVXVDa218tms2aIElbvpeu7GxA+ftrV59YH05Bb2JTROP2eESAIDhMcYJy4B77QAAWChhD5aJUdxrV79WSZL29uU77g0AsFIJe8BQOawcAGA03LMHAADQQTp7QCYne/cD7tyZvOvEOuT69DhnsvCRToeVAwCMhrAHK9z27cnFFyf790+d5ff2qQuHZj4AAJYRYQ9WsMnJXtB7zBl+v9br3I2NJZO/bIMWAIDlStiDFWzz5l5HbzZzrR8N45sAAIvHBi2wgu3cOTW6OYs9e5Jo6AEALFvCHqxg69Ylq1fPfm316uRDT25GOAEAlilhD1awjRuTVXP8r8CqVb3rAAAsT8IerGBjY8mnPtV7nO7wrV59YH3NmtHWBwDAkbNBCywz02feDWq8csOG5I47epu17NqVnHNOr6Mn6AEALG/CHpA1a5LLLht1FQAADJIxTgAAgA7S2YNlYHp0c641O2YCAHAwnT0AAIAO0tmDZWBm527QG7SsJBObJpIkWy/dOtI6AAAWg84eAABABwl7AAAAHWSME5aZg8c3jSbOb/rfJ0m23bbtkDX/bgBAV+nsAQAAdJDOHtBpMzt3uqAAwEoi7MEyZDQRAIDDMcYJAADQQdXa8j2ra3x8vO3YsWPUZcBIGU0EAFi5quqG1tr4bNd09gAAADrIPXswIJOTyebNyc6dybp1ycaNydjYqKsCAGClMsYJA7B9e3Lxxcn+/cmePcnq1cmqVcmnPpVs2DDq6gAA6CpjnDBEk5O9oDc52Qt6Se9xev3++0dbHwAAK5OwB0dp8+ZeR282+/f3rgMAwGIT9uAo7dx5oKN3sD17kl27FrceAABIhD04auvW9e7Rm83q1ck55yxuPQAAkAh7cNQ2buxtxjKbVat61wEAYLEJe6wYE5smHj2AfJDGxnq7bo6NHejwrV59YH3NmoF/5VEZ1r8DAABLi3P2YAA2bEjuuKO3GcuuXb3RzY0bl17QAwBg5RD2YEDWrEkuu2zUVQAAQI+wR6fNHFfcdtu2Q9a2Xrp1cQsaEf8OAAArj3v2AAAAOqhaa6Ou4YiNj4+3HTt2jLoMlonpTtZK72L5dwAA6I6quqG1Nj7bNWOcLBmTk70NTnbu7J1dt3Fjb0fLrnwfAAAsJmGPJWH79uTii5P9+5M9e3pHF1xxRe/ogg0bhvN9L35xsndv8tBDyfHHJ5dfnlx77XC+DwAAFpsxTkZucjJZu7b3eLCxsd6RBoM8wmByMjn99OR73zv02oknJt/8piMTAABYHuYb47RBCyO3eXOvozeb/ft71wfpqqtmD3pJb/11r5s9eA7K5GTyoQ8lb3lL73GY3wUAwMol7DFyO3f2Rjdns2dP75DyQfrkJ+e/vnlzr9O4fftgvzfpfebatcmb3pS86129x2F9FwAAK5t79hi5det69+jNFvhWr07OOWdx63nkkWTy5ROZ2JTct35r3yOdh9vwZXKyd1/izE7e9N988cWDH1cFAGBl09lj5DZuTFbN8Z/EVat61wfpJS/p73Wt9T9C2k/HbrHHVQEAWNmEPUZubKy36+bYWK+Tl/Qep9cH3e169auTxz/+8K/bv7+/EdKZHbvpTt2ePQfW77+/t7bY46oAAKxsxjhZEjZs6I0xbt7cCz3nnNPr6A1jrHFsLPn0p3tHLzzwQG9sM0ly6cSBF521LUny8cdN5IubektzHULeT8fussuW3rgqAADdJuyxZKxZ0wtFi2HDhuTOO3s7c15xRfLww7O/7olPnP9zJieTj360v47dxo2975rNMMZVAQBY2YQ9Vqw1a5I3vCF55jOnDnT/k62PHuj+4CUT+aEfSv7XZVvnfP/0QfBzBcXksR276bHUgw+PX7VqOOOqAACsbMIeK95sI6SbkhxzzNzvmW1nzdkc3LFbzHFVAABWNmGvgw53BEAXHe3ffPAI6dWb5n/9fPfpJcnjHpccf/zsHbvFHFcFAGDlEvY6Znq0cOaY4BVX9ELHhg2jrm44hvE3z7UZy7T5dtZMkuc/P9myRccOAIDREfY6ZCUe2n0kf/NsXcBkYZ3Bw+2s+fKXd+/fGgCA5UXY65B+jwDokoX+zbN1AX/xF5Oq3k+/nUE7awIAsNQ5VL1DVuKh3Qv5m+c6/PyBB5LvfW/+A9EPttgHwQMAwELp7HXISjy0eyF/8+E2VTnY4bqhdtYEAGApE/Y6ZCWOFi7kbz7cpioH66cbamdNAACWKmOcHbISRwsX8jfv3r2wz+5qNxQAgJWhWmujruGIjY+Ptx07doy6jCXn/vsPjBaeeWbSWnL77d0+c2/m3zzbOOUddyRr1y7sM8fGurmDKQAA3VFVN7TWxme9Jux112w7T65a1e0z9+by6lcnf/iHc18/5pjkhBP8OwEAsLzMF/ZGcs9eVd2aZDLJI0n2tdbGq+qUJJuTnJXk1iQ/11q7dxT1dcFKPHNvPn/3d/Nff9azkte9zkYrAAB0xyg3aLmwtfatGc/fmuT61tqVVfXWqedvGU1py99in7k320HlS2lc9OlPT7785bmvn3eejVYAAOiWpbRBy0uTXDX1+1VJfmaEtSx7i3nm3vbtvfvh3vSm5F3v6j2uXdtbXyr+y3+Z//qVVy5OHQAAsFhGFfZakk9X1Q1V9dqptR9ord2ZJFOPT5ztjVX12qraUVU77r777kUqd/mZPn9uNoPcZXKug8oPdyj5YnvSk5L3vW/2a+97X3L66YtbDwAADNuowt6PttaeleTFSd5QVT/W7xtbax9orY231sZPO+204VW4zG3c2NtkZDaDPHOvn3HRpeL1r0/uvLO3Wcvzntd7vPPO3joAAHTNSO7Za63dMfV4V1X9aZLnJPlmVZ3RWruzqs5IctcoauuK6XPm5tqNc1CbjyzmuOggnH56smnTqKsAAIDhW/SwV1Wrk6xqrU1O/f7CJL+e5Jokr05y5dTjJxa7tq7ZsKG36+Z8588drelx0dkCn0PJAQBgdBb9nL2q+sEkfzr19Ngkf9xae2dVPSHJliRPSfKPSV7ZWvv2fJ/lnL3Rm5zsbcYy84iHaQ4lBwCA4VpS5+y11v4hyTNnWb8nyfMXux6OzmKNiwIAAAszynP26IjFGBcFAAAWRthjINascSg5AAAsJUvpUHUAAAAGRGdvgCYne6OMO3f2dqncuLF3TxsAAMBiE/YGZPv2QzcpueKK3iYlGzaMujoAAGClMcY5AJOTvaA3OXngvLk9ew6s33//aOsDAABWHmFvADZv7nX0ZrN/f+86AADAYhL2BmDnzgMdvYPt2dM7jgAAAGAxCXsDsG5d7x692axe3Tt3DgAAYDEJewOwcWOyao5/yVWretcBAAAWk7A3AGNjvV03x8YOdPhWrz6wvmbNaOsDAABWHkcvDMiGDckdd/Q2Y9m1qze6uXGjoAcAAIyGsDdAa9Ykl1026ioAAACMcQIAAHSSsAcAANBBwh4AAEAHCXsAAAAdJOwBAAB0kLAHAADQQcIeAABABwl7AAAAHSTsAQAAdJCwBwAA0EHCHgAAQAcJewAAAB0k7AEAAHSQsAcAANBBwh4AAEAHCXsAAAAdJOzx/7d357FylWUcx7+/tJWiZVFEgoVQorgAmkqhgWigFYNiTIAEF0QFtxjc0AQNLtEao+KCCxpRVMISBGswUUkQkQAmhlCxFEpFVAQERBFxoUZU9PGP894wXmemV1LuvT3z/SQn98xz3jnnPTNP39vnvufMSJIkSeohiz1JkiRJ6iGLPUmSJEnqoVTVXPfhEUvye+COue6HHnVPBO6b605oXjI3NIx5oWHMC41ibmiYbSkv9qqqXYdt2KaLPU2GJNdV1YFz3Q/NP+aGhjEvNIx5oVHMDQ3Tl7zwMk5JkiRJ6iGLPUmSJEnqIYs9bQvOmusOaN4yNzSMeaFhzAuNYm5omF7khffsSZIkSVIPObMnSZIkST1ksSdJkiRJPWSxpzmR5Owk9ya5aSD2hCSXJ/lF+/n4gW3vSfLLJLckeeFAfEWSjW3bGUky2+eirWdEXqxJcneSDW158cA282ICJNkzyZVJbk6yKcnJLe6YMcHG5IVjxoRLsjjJuiQ3tNz4UIs7ZkywMXnR7zGjqlxcZn0BDgUOAG4aiH0COLWtnwp8vK3vC9wAbAfsDdwKLGjb1gGHAAEuBY6c63Nz2ep5sQY4ZUhb82JCFmB34IC2vgPw8/b+O2ZM8DImLxwzJnxp7+OStr4IuBY42DFjspcxedHrMcOZPc2JqvohcP+08FHAuW39XODogfhFVfX3qroN+CWwMsnuwI5VdU11//LOG3iOtkEj8mIU82JCVNU9VbW+rT8A3AwsxTFjoo3Ji1HMiwlRnc3t4aK2FI4ZE21MXozSi7yw2NN8sltV3QPdL3HgSS2+FLhzoN1dLba0rU+Pq3/emuTGdpnn1GU35sUESrIMeA7dX2QdMwT8T16AY8bES7IgyQbgXuDyqnLM0Ki8gB6PGRZ72hYMuw66xsTVL2cCTwGWA/cAp7e4eTFhkiwBLgbeUVV/Gdd0SMzc6KkheeGYIarqX1W1HNiDbjZm/zHNzY0JMSIvej1mWOxpPvldmxqn/by3xe8C9hxotwfwmxbfY0hcPVJVv2uD87+BrwAr2ybzYoIkWUT3H/oLqupbLeyYMeGG5YVjhgZV1Z+Aq4AX4ZihZjAv+j5mWOxpPvkOcEJbPwH49kD8FUm2S7I3sA+wrl2C8UCSg9unIL1m4DnqialfzM0xwNQndZoXE6K9j18Dbq6qTw9scsyYYKPywjFDSXZNsnNb3x54AfAzHDMm2qi86PuYsXCuO6DJlORCYBXwxCR3AR8ETgPWJnk98GvgpQBVtSnJWuCnwEPAW6rqX21XJwHnANvTfRrSpbN4GtrKRuTFqiTL6S6RuB14E5gXE+a5wKuBje1eC4D34pgx6UblxXGOGRNvd+DcJAvoJjbWVtUlSa7BMWOSjcqL8/s8ZqT7EBlJkiRJUp94GackSZIk9ZDFniRJkiT1kMWeJEmSJPWQxZ4kSZIk9ZDFniRJkiT1kMWeJOlRlWSXJBva8tskdw88fsy0tu9I8tgZ7POqJAcOib8kyfVJbkjy0yRv2prn8kglWTPtvE97BPvYOcmbt9DmmCSV5BmPvLeSpL7wqxckSbMmyRpgc1V9asT224EDq+q+LeznKuCUqrpuILYIuANYWVV3JdkOWFZVt2yl7g/rx8KqemgG7dYw5rxneKxlwCVVtf+YNmvpvkvqiqpaM2T7goHviZIk9Zwze5KkWZfk8DYDtzHJ2Um2S/J24MnAlUmubO3OTHJdkk1JPrSF3e4ALAT+AFBVf58q9JLsneSaJD9O8uEkm1t8VZJLBvr1hSQntvUPtPY3JTkrSVr8qiQfTXI1cHKSFUmuTvKTJJcl2X2Gr8GCJJ9sx7hxcBYyybsG4lPnfRrwlDYz+Mkh+1tC90XjrwdeMRBfleTKJF+n+wLyocdNsiTJFUnWt/flqJmchyRp/rLYkyTNtsXAOcDLq+pZdAXaSVV1BvAbYHVVrW5t31dVBwLPBg5L8uxRO62q+4HvAHckuTDJ8Ummfs99Djizqg4CfjvDfn6hqg5qM2nbAy8Z2LZzVR0GnAF8Hji2qlYAZwMfGbG/dw5cxvlCuqLsz61PBwFvbEXpEcA+wEpgObAiyaHAqcCtVbW8qt41ZP9HA9+rqp8D9yc5YGDbSrrXct9RxwUeBI6pqgOA1cDpUwWuJGnbZLEnSZptC4DbWlECcC5w6Ii2L0uyHrge2A/Yd9yOq+oNwOHAOuAUuuILuhmvC9v6+TPs5+ok1ybZCDy/HX/KN9rPpwP7A5cn2QC8H9hjxP4+0wq15VV1GXAE8Jr2vGuBXeiKvCPacj2wHnhGi2/JccBFbf2i9njKuqq6ra2POm6Ajya5EfgBsBTYbQbHlSTNUwvnugOSpInz15k0arNNpwAHVdUfk5xDNys4VlVtpLtc8XzgNuDEqU1Dmj/Ef//hc3E79mLgi3T3D97Z7rkbPPbUOQTYVFWHzOScpgnwtlb4PRzsZv0+VlVfnhZfNnJHyS50Ben+SYquoK4k757W33HHPRHYFVhRVf9s909u8fWWJM1fzuxJkmbbYmBZkqe2x68Grm7rD9DdewewI12R8uckuwFHjttpu+ds1UBoOd0HtgD8iIfvYzt+oM0dwL7tnsGd6GYFp/oIcF+7F+7YEYe9Bdg1ySGtD4uS7Dei7XSXASe1D5YhydOSPK7FX9eOS5KlSZ7Ef7820x0LnFdVe1XVsqrak67Qfd7/cdydgHtbobca2GuG5yFJmqec2ZMkzbYHgdcC30yyEPgx8KW27Szg0iT3VNXqJNcDm4Bf0RVs4wR4d5IvA3+jKxRPbNtOBr6e5GTg4qkntFm7tcCNwC/oLp2kqv6U5CvARuD21sf/UVX/SHIscEYrFhcCn2193pKvAsuA9e3euN8DR1fV95M8E7im3TK3GXhVVd2a5EdJbgIunXbf3nF0H+Ay6GLglTx8yenY4wIXAN9Nch2wAfjZDM5BkjSP+dULkqSJk2RzVS2Z635IkvRo8jJOSZIkSeohZ/YkSZIkqYec2ZMkSZKkHrLYkyRJkqQestiTJEmSpB6y2JMkSZKkHrLYkyRJkqQe+g99gK1VnVHCgwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 1080x720 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plot_scatter_chart(df7,\"Hebbal\")"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [],
"source": [
"def remove_bhk_outliers(df):\n",
" exclude_indices = np.array([])\n",
" for location,location_df in df.groupby('location'):\n",
" bhk_stats = {}\n",
" for bhk,bhk_df in location_df.groupby('bhk'):\n",
" bhk_stats[bhk] = {\n",
" 'mean':np.mean(bhk_df.price_per_sqft),\n",
" 'std':np.std(bhk_df.price_per_sqft),\n",
" 'count':bhk_df.shape[0]\n",
" }\n",
" for bhk,bhk_df in location_df.groupby(\"bhk\"):\n",
" stats = bhk_stats.get(bhk-1)\n",
" if stats and stats['count']>5:\n",
" exclude_indices = np.append(exclude_indices,bhk_df[bhk_df.price_per_sqft < (stats['mean'])].index.values)\n",
" return df.drop(exclude_indices,axis='index') "
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {},
"outputs": [],
"source": [
"df8 = remove_bhk_outliers(df7)"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(7329, 7)"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df8.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Outlier Removal Using Bathrooms Feature"
]
},
{
"cell_type": "code",
"execution_count": 115,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 4., 3., 2., 5., 8., 1., 6., 7., 9., 12., 16., 13.])"
]
},
"execution_count": 115,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df8.bath.unique()"
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0, 0.5, 'Count')"
]
},
"execution_count": 116,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA4EAAAJQCAYAAAAwv2HyAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3dfbRldX3f8c9Xxgc0EiGMlDCkgykmQeJDHCiJiYliIo0uoVkhIcvotKWhJcRoYkygrpU0XSXFmgdrG7HUWKAxsiZGC/EpUoLYdKE4+ISABKoGJ1CZJI2SZBUFv/3jbMLxzp3hjpkz5878Xq+17jrn/O7e537vbJiZ9+xz9q3uDgAAAGN4xLIHAAAAYP8RgQAAAAMRgQAAAAMRgQAAAAMRgQAAAAMRgQAAAANZaARW1Wer6qaq+lhVbZ/Wjqiqq6vq9un28LntL6iqO6rqtqp6/tz6M6fnuaOqXl9Vtci5AQAADlb740zgc7r76d29ZXp8fpJruvv4JNdMj1NVJyQ5K8lTkpyW5A1Vdci0z8VJzkly/PRx2n6YGwAA4KCzjJeDnp7ksun+ZUnOmFu/orvv6+7PJLkjyclVdXSSw7r7+p79ZPvL5/YBAABgL2xY8PN3kvdVVSf5z919SZKjuvvuJOnuu6vqidO2xyT54Ny+O6a1L0/3V67v0ZFHHtmbN2/+u38HAAAAB6Abb7zxz7p748r1RUfgs7r7rin0rq6qT+1h29Xe59d7WN/1CarOyexlo/mmb/qmbN++fW/nBQAAOChU1Z+str7Ql4N2913T7T1J3pHk5CSfn17imen2nmnzHUmOndt9U5K7pvVNq6yv9vUu6e4t3b1l48ZdghcAAGB4C4vAqnpcVT3+wftJfiDJJ5NclWTrtNnWJFdO969KclZVPbqqjsvsAjA3TC8dvbeqTpmuCvrSuX0AAADYC4t8OehRSd4x/TSHDUl+p7vfW1UfTrKtqs5OcmeSM5Oku2+uqm1Jbklyf5LzuvuB6bnOTXJpkkOTvGf6AAAAYC/V7IKbB58tW7a09wQCAACjqqob535U399axo+IAAAAYElEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEBEIAAAwEA2LHuA0Ww+/13LHmEpPnvRC5Y9AgAAEGcCAQAAhiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABrLwCKyqQ6rqo1X1zunxEVV1dVXdPt0ePrftBVV1R1XdVlXPn1t/ZlXdNH3u9VVVi54bAADgYLQ/zgS+PMmtc4/PT3JNdx+f5JrpcarqhCRnJXlKktOSvKGqDpn2uTjJOUmOnz5O2w9zAwAAHHQWGoFVtSnJC5K8aW759CSXTfcvS3LG3PoV3X1fd38myR1JTq6qo5Mc1t3Xd3cnuXxuHwAAAPbCos8Evi7Jzyf5ytzaUd19d5JMt0+c1o9J8rm57XZMa8dM91eu76Kqzqmq7VW1fefOnfvmOwAAADiILCwCq+qFSe7p7hvXussqa72H9V0Xuy/p7i3dvWXjxo1r/LIAAADj2LDA535WkhdV1Q8meUySw6rqt5N8vqqO7u67p5d63jNtvyPJsXP7b0py17S+aZV1AAAA9tLCzgR29wXdvam7N2d2wZc/7O4fT3JVkq3TZluTXDndvyrJWVX16Ko6LrMLwNwwvWT03qo6Zboq6Evn9gEAAGAvLPJM4O5clGRbVZ2d5M4kZyZJd99cVduS3JLk/iTndfcD0z7nJrk0yaFJ3jN9AAAAsJf2SwR29/uTvH+6/+dJTt3NdhcmuXCV9e1JTlzchAAAAGPYHz8nEAAAgHVCBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxkYRFYVY+pqhuq6uNVdXNV/fK0fkRVXV1Vt0+3h8/tc0FV3VFVt1XV8+fWn1lVN02fe31V1aLmBgAAOJgt8kzgfUme291PS/L0JKdV1SlJzk9yTXcfn+Sa6XGq6oQkZyV5SpLTkryhqg6ZnuviJOckOX76OG2BcwMAABy0FhaBPfNX08NHTh+d5PQkl03rlyU5Y7p/epIruvu+7v5MkjuSnFxVRyc5rLuv7+5OcvncPgAAAOyFhb4nsKoOqaqPJbknydXd/aEkR3X33Uky3T5x2vyYJJ+b233HtHbMdH/lOgAAAHtpoRHY3Q9099OTbMrsrN6Je9h8tff59R7Wd32CqnOqantVbd+5c+feDwwAAHCQ2y9XB+3uv0zy/szey/f56SWemW7vmTbbkeTYud02JblrWt+0yvpqX+eS7t7S3Vs2bty4T78HAACAg8Eirw66saqeMN0/NMnzknwqyVVJtk6bbU1y5XT/qiRnVdWjq+q4zC4Ac8P0ktF7q+qU6aqgL53bBwAAgL2wYYHPfXSSy6YrfD4iybbufmdVXZ9kW1WdneTOJGcmSXffXFXbktyS5P4k53X3A9NznZvk0iSHJnnP9AEAAMBeWlgEdvcnkjxjlfU/T3Lqbva5MMmFq6xvT7Kn9xMCAACwBvvlPYEAAACsDyIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgICIQAABgIGuKwKp61lrWAAAAWN/WeibwP65xDQAAgHVsw54+WVXfmeS7kmysqp+d+9RhSQ5Z5GAAAADse3uMwCSPSvJ103aPn1v/YpIfXtRQAAAALMYeI7C7r0tyXVVd2t1/sp9mAgAAYEEe7kzggx5dVZck2Ty/T3c/dxFDAQAAsBhrjcDfTfLGJG9K8sDixgEAAGCR1hqB93f3xQudBAAAgIVb64+I+P2q+smqOrqqjnjwY6GTAQAAsM+t9Uzg1un2VXNrneRJ+3YcAAAAFmlNEdjdxy16EAAAABZvTRFYVS9dbb27L9+34wAAALBIa3056Elz9x+T5NQkH0kiAgEAAA4ga3056MvmH1fV1yf5bwuZCAAAgIVZ69VBV/qbJMfvy0EAAABYvLW+J/D3M7saaJIckuTbkmxb1FAAAAAsxlrfE/irc/fvT/In3b1jAfMAAACwQGt6OWh3X5fkU0ken+TwJF9a5FAAAAAsxpoisKp+JMkNSc5M8iNJPlRVP7zIwQAAANj31vpy0FcnOam770mSqtqY5H8keduiBgMAAGDfW+vVQR/xYABO/nwv9gUAAGCdWOuZwPdW1R8keev0+EeTvHsxIwEAALAoe4zAqvoHSY7q7ldV1Q8l+e4kleT6JG/ZD/MBAACwDz3cSzpfl+TeJOnut3f3z3b3z2R2FvB1ix4OAACAfevhInBzd39i5WJ3b0+yeSETAQAAsDAPF4GP2cPnDt2XgwAAALB4DxeBH66qn1i5WFVnJ7lxMSMBAACwKA93ddBXJHlHVb04D0XfliSPSvKPFzkYAAAA+94eI7C7P5/ku6rqOUlOnJbf1d1/uPDJAAAA2OfW9HMCu/vaJNcueBYAAAAW7OHeEwgAAMBBRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMRAQCAAAMZGERWFXHVtW1VXVrVd1cVS+f1o+oqqur6vbp9vC5fS6oqjuq6raqev7c+jOr6qbpc6+vqlrU3AAAAAezRZ4JvD/JK7v725KckuS8qjohyflJrunu45NcMz3O9LmzkjwlyWlJ3lBVh0zPdXGSc5IcP32ctsC5AQAADloLi8Duvru7PzLdvzfJrUmOSXJ6ksumzS5LcsZ0//QkV3T3fd39mSR3JDm5qo5Oclh3X9/dneTyuX0AAADYC/vlPYFVtTnJM5J8KMlR3X13MgvFJE+cNjsmyefmdtsxrR0z3V+5DgAAwF5aeARW1dcl+b0kr+juL+5p01XWeg/rq32tc6pqe1Vt37lz594PCwAAcJBbaARW1SMzC8C3dPfbp+XPTy/xzHR7z7S+I8mxc7tvSnLXtL5plfVddPcl3b2lu7ds3Lhx330jAAAAB4lFXh20kvxWklu7+9fnPnVVkq3T/a1JrpxbP6uqHl1Vx2V2AZgbppeM3ltVp0zP+dK5fQAAANgLGxb43M9K8pIkN1XVx6a1f5XkoiTbqursJHcmOTNJuvvmqtqW5JbMrix6Xnc/MO13bpJLkxya5D3TBwAAAHtpYRHY3X+U1d/PlySn7mafC5NcuMr69iQn7rvpAAAAxrRfrg4KAADA+iACAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABiICAQAABrJh2QPAw9l8/ruWPcLSfPaiFyx7BAAADjLOBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxkYRFYVW+uqnuq6pNza0dU1dVVdft0e/jc5y6oqjuq6raqev7c+jOr6qbpc6+vqlrUzAAAAAe7RZ4JvDTJaSvWzk9yTXcfn+Sa6XGq6oQkZyV5yrTPG6rqkGmfi5Ock+T46WPlcwIAALBGC4vA7v5Akr9YsXx6ksum+5clOWNu/Yruvq+7P5PkjiQnV9XRSQ7r7uu7u5NcPrcPAAAAe2l/vyfwqO6+O0mm2ydO68ck+dzcdjumtWOm+yvXV1VV51TV9qravnPnzn06OAAAwMFgvVwYZrX3+fUe1lfV3Zd095bu3rJx48Z9NhwAAMDBYn9H4Oenl3hmur1nWt+R5Ni57TYluWta37TKOgAAAF+D/R2BVyXZOt3fmuTKufWzqurRVXVcZheAuWF6yei9VXXKdFXQl87tAwAAwF7asKgnrqq3Jvm+JEdW1Y4kv5TkoiTbqursJHcmOTNJuvvmqtqW5JYk9yc5r7sfmJ7q3MyuNHpokvdMHwAAAHwNFhaB3f1ju/nUqbvZ/sIkF66yvj3JiftwNAAAgGGtlwvDAAAAsB+IQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIGIQAAAgIFsWPYAwGJsPv9dyx5hKT570QuWPQIAwLrmTCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBARCAAAMBANix7AID1ZPP571r2CEvx2YtesOwRAID9xJlAAACAgYhAAACAgYhAAACAgYhAAACAgRwwEVhVp1XVbVV1R1Wdv+x5AAAADkQHxNVBq+qQJL+Z5PuT7Ejy4aq6qrtvWe5kALiiKgAcWA6UM4EnJ7mjuz/d3V9KckWS05c8EwAAwAHngDgTmOSYJJ+be7wjyT9c0iwA8HfmDCr7i//WgJWqu5c9w8OqqjOTPL+7//n0+CVJTu7ul63Y7pwk50wPvyXJbft10IPTkUn+bNlDsCrHZn1zfNYvx2Z9c3zWL8dmfXN81q9lHpu/390bVy4eKGcCdyQ5du7xpiR3rdyouy9Jcsn+GmoEVbW9u7csew525disb47P+uXYrG+Oz/rl2Kxvjs/6tR6PzYHynsAPJzm+qo6rqkclOSvJVUueCQAA4IBzQJwJ7O77q+qnkvxBkkOSvLm7b17yWAAAAAecAyICk6S7353k3cueY0BeXrt+OTbrm+Ozfjk265vjs345Nuub47N+rbtjc0BcGAYAAIB940B5TyAAAAD7gAhkF1V1bFVdW1W3VtXNVfXyZc/ErqrqkKr6aFW9c9mz8JCqekJVva2qPjX9P/Sdy56Jh1TVz0y/r32yqt5aVY9Z9kwjq6o3V9U9VfXJubUjqurqqrp9uj18mTOOajfH5rXT722fqKp3VNUTljnjyFY7PnOf+7mq6qo6chmzjW53x6aqXlZVt01/Bv37Zc33IBHIau5P8sru/rYkpyQ5r6pOWPJM7OrlSW5d9hDs4j8keW93f2uSp8UxWjeq6pgkP51kS3efmNmFxs5a7lTDuzTJaSvWzk9yTXcfn+Sa6TH736XZ9dhcneTE7n5qkj9OcsH+Hoq/dWl2PT6pqmOTfH+SO/f3QPytS7Pi2FTVc5KcnuSp3f2UJL+6hLm+ighkF919d3d/ZLp/b2Z/iT1muVMxr6o2JXlBkjctexYeUlWHJXl2kt9Kku7+Unf/5XKnYoUNSQ6tqg1JHptVfuYs+093fyDJX6xYPj3JZdP9y5KcsV+HIsnqx6a739fd908PP5jZz21mCXbz/06S/EaSn0/ioh9Lsptjc26Si7r7vmmbe/b7YCuIQPaoqjYneUaSDy13ElZ4XWa/yX9l2YPwVZ6UZGeS/zq9VPdNVfW4ZQ/FTHf/aWb/+npnkruTfKG737fcqVjFUd19dzL7R8kkT1zyPKzunyV5z7KH4CFV9aIkf9rdH1/2LOziyUm+p6o+VFXXVdVJyx5IBLJbVfV1SX4vySu6+4vLnoeZqnphknu6+8Zlz8IuNiT5jiQXd/czkvx1vJRt3ZjeW3Z6kuOSfGOSx1XVjy93KjjwVNWrM3vryFuWPQszVfXYJK9O8ovLnoVVbUhyeGZvs3pVkm1VVcscSASyqqp6ZGYB+Jbufvuy5+GrPCvJi6rqs0muSPLcqvrt5Y7EZEeSHd394Jnzt2UWhawPz0vyme7e2d1fTvL2JN+15JnY1eer6ugkmW6X/rIpHlJVW5O8MMmL288ZW0++ObN/4Pr49PeDTUk+UlV/b6lT8aAdSd7eMzdk9kqupV64RwSyi+lfJn4rya3d/evLnoev1t0XdPem7t6c2UUt/rC7nc1YB7r7/yT5XFV9y7R0apJbljgSX+3OJKdU1WOn3+dOjQv3rEdXJdk63d+a5MolzsKcqjotyS8keVF3/82y5+Eh3X1Tdz+xuzdPfz/YkeQ7pj+XWL7/nuS5SVJVT07yqCR/tsyBRCCreVaSl2R2hulj08cPLnsoOEC8LMlbquoTSZ6e5FeWPA+T6Qzt25J8JMlNmf0ZeMlShxpcVb01yfVJvqWqdlTV2UkuSvL9VXV7Zlc5vGiZM45qN8fmPyV5fJKrp78bvHGpQw5sN8eHdWA3x+bNSZ40/diIK5JsXfaZ9HImHwAAYBzOBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAIAAAxEBAKwdFXVVfVrc49/rqr+9T567kur6of3xXM9zNc5s6puraprV6x/X1W9cy+f6xVV9di5x3+1r+YEABEIwHpwX5Ifqqojlz3IvKo6ZC82PzvJT3b3c/bBl35Fksc+7FZzqmrDPvi6AAxABAKwHtyf2Q9u/5mVn1h5Ju/Bs2LTGbbrqmpbVf1xVV1UVS+uqhuq6qaq+ua5p3leVf3PabsXTvsfUlWvraoPV9UnqupfzD3vtVX1O5n9UPmV8/zY9PyfrKrXTGu/mOS7k7yxql67yvd3WFW9o6puqao3VtUjpv0urqrtVXVzVf3ytPbTSb4xybXzZxWr6sKq+nhVfbCqjpr7tfn1abvXVNXTp89/Yvp6h0/b7W79/VX1G1X1geks5klV9faqur2q/u20zeOq6l3T1/5kVf3oWg4oAOuXCARgvfjNJC+uqq/fi32eluTlSb49yUuSPLm7T07ypiQvm9tuc5LvTfKCzELtMZmduftCd5+U5KQkP1FVx03bn5zk1d19wvwXq6pvTPKaJM9N8vQkJ1XVGd39b5JsT/Li7n7VKnOenOSV05zfnOSHpvVXd/eWJE9N8r1V9dTufn2Su5I8Z+6s4uOSfLC7n5bkA0l+Yu65n5zked39yiSXJ/mF7n5qZgH7S9M2u1tPki9197OTvDHJlUnOS3Jikn9SVd+Q5LQkd3X307r7xCTvXeX7A+AAIgIBWBe6+4uZxcpP78VuH+7uu7v7viT/O8n7pvWbMgu/B23r7q909+1JPp3kW5P8QJKXVtXHknwoyTckOX7a/obu/swqX++kJO/v7p3dfX+StyR59hrmvKG7P93dDyR5a2ZnDZPkR6rqI0k+muQpSU7Yzf5fSvLg+wpvXPG9/W53PzDF8xO6+7pp/bIkz97d+tz+V023NyW5ee7X89NJjp3Wn1dVr6mq7+nuL6zh+wVgHROBAKwnr59ju70AAAHRSURBVMvsDN3j5tbuz/TnVVVVkkfNfe6+uftfmXv8lSTz75HrFV+nk1SSl3X306eP47r7wYj8693MV2v9Rlb5el/1eDrr+HNJTp3O0L0ryWN2s/+Xu/vB53ggX/297W7WtZr/NVv567mhu/84yTMzi8F/N730FYADmAgEYN3o7r9Isi2zEHzQZzOLkCQ5Pckjv4anPrOqHjG9T/BJSW5L8gdJzq2qRyZJVT25qh63pyfJ7Izh91bVkdNFY34syXUPs0+SnFxVx03vBfzRJH+U5LDMAu4L03v8/tHc9vcmefxefH+ZztD936r6nmnpJUmu2936Wp93egns33T3byf51STfsTdzAbD+uJIYAOvNryX5qbnH/yXJlVV1Q5Jr8rWd+bots/A5Ksm/7O7/V1VvyuxllR+ZzjDuTHLGnp6ku++uqguSXJvZWcF3d/eVa/j61ye5KLP3BH4gyTu6+ytV9dEkN2f20sv/Nbf9JUneU1V37+XVRrdm9p7Hx07P+U8fZn0tvj3Ja6vqK0m+nOTcvdgXgHWoHnp1CQAAAAc7LwcFAAAYiAgEAAAYiAgEAAAYiAgEAAAYiAgEAAAYiAgEAAAYiAgEAAAYiAgEAAAYyP8HWwN7/4koG1EAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 1080x720 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.hist(df8.bath,rwidth=0.8)\n",
"plt.xlabel(\"Number of bathrooms\")\n",
"plt.ylabel(\"Count\")"
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>price_per_sqft</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>5277</th>\n",
" <td>Neeladri Nagar</td>\n",
" <td>10 BHK</td>\n",
" <td>4000.0</td>\n",
" <td>12.0</td>\n",
" <td>160.0</td>\n",
" <td>10</td>\n",
" <td>4000.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8486</th>\n",
" <td>other</td>\n",
" <td>10 BHK</td>\n",
" <td>12000.0</td>\n",
" <td>12.0</td>\n",
" <td>525.0</td>\n",
" <td>10</td>\n",
" <td>4375.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8575</th>\n",
" <td>other</td>\n",
" <td>16 BHK</td>\n",
" <td>10000.0</td>\n",
" <td>16.0</td>\n",
" <td>550.0</td>\n",
" <td>16</td>\n",
" <td>5500.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9308</th>\n",
" <td>other</td>\n",
" <td>11 BHK</td>\n",
" <td>6000.0</td>\n",
" <td>12.0</td>\n",
" <td>150.0</td>\n",
" <td>11</td>\n",
" <td>2500.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9639</th>\n",
" <td>other</td>\n",
" <td>13 BHK</td>\n",
" <td>5425.0</td>\n",
" <td>13.0</td>\n",
" <td>275.0</td>\n",
" <td>13</td>\n",
" <td>5069.124424</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk price_per_sqft\n",
"5277 Neeladri Nagar 10 BHK 4000.0 12.0 160.0 10 4000.000000\n",
"8486 other 10 BHK 12000.0 12.0 525.0 10 4375.000000\n",
"8575 other 16 BHK 10000.0 16.0 550.0 16 5500.000000\n",
"9308 other 11 BHK 6000.0 12.0 150.0 11 2500.000000\n",
"9639 other 13 BHK 5425.0 13.0 275.0 13 5069.124424"
]
},
"execution_count": 117,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df8[df8.bath>10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### It is unusual to have 2 more bathrooms than number of bedrooms in a home"
]
},
{
"cell_type": "code",
"execution_count": 118,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>price_per_sqft</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1626</th>\n",
" <td>Chikkabanavar</td>\n",
" <td>4 Bedroom</td>\n",
" <td>2460.0</td>\n",
" <td>7.0</td>\n",
" <td>80.0</td>\n",
" <td>4</td>\n",
" <td>3252.032520</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5238</th>\n",
" <td>Nagasandra</td>\n",
" <td>4 Bedroom</td>\n",
" <td>7000.0</td>\n",
" <td>8.0</td>\n",
" <td>450.0</td>\n",
" <td>4</td>\n",
" <td>6428.571429</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6711</th>\n",
" <td>Thanisandra</td>\n",
" <td>3 BHK</td>\n",
" <td>1806.0</td>\n",
" <td>6.0</td>\n",
" <td>116.0</td>\n",
" <td>3</td>\n",
" <td>6423.034330</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8411</th>\n",
" <td>other</td>\n",
" <td>6 BHK</td>\n",
" <td>11338.0</td>\n",
" <td>9.0</td>\n",
" <td>1000.0</td>\n",
" <td>6</td>\n",
" <td>8819.897689</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk price_per_sqft\n",
"1626 Chikkabanavar 4 Bedroom 2460.0 7.0 80.0 4 3252.032520\n",
"5238 Nagasandra 4 Bedroom 7000.0 8.0 450.0 4 6428.571429\n",
"6711 Thanisandra 3 BHK 1806.0 6.0 116.0 3 6423.034330\n",
"8411 other 6 BHK 11338.0 9.0 1000.0 6 8819.897689"
]
},
"execution_count": 118,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df8[df8.bath>df8.bhk + 2]"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {},
"outputs": [],
"source": [
"df9 = df8[df8.bath < df8.bhk + 2]"
]
},
{
"cell_type": "code",
"execution_count": 120,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(7251, 7)"
]
},
"execution_count": 120,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df9.shape"
]
},
{
"cell_type": "code",
"execution_count": 121,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>size</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>price_per_sqft</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>4 BHK</td>\n",
" <td>2850.0</td>\n",
" <td>4.0</td>\n",
" <td>428.0</td>\n",
" <td>4</td>\n",
" <td>15017.543860</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>3 BHK</td>\n",
" <td>1630.0</td>\n",
" <td>3.0</td>\n",
" <td>194.0</td>\n",
" <td>3</td>\n",
" <td>11901.840491</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>3 BHK</td>\n",
" <td>1875.0</td>\n",
" <td>2.0</td>\n",
" <td>235.0</td>\n",
" <td>3</td>\n",
" <td>12533.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>3 BHK</td>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>130.0</td>\n",
" <td>3</td>\n",
" <td>10833.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>2 BHK</td>\n",
" <td>1235.0</td>\n",
" <td>2.0</td>\n",
" <td>148.0</td>\n",
" <td>2</td>\n",
" <td>11983.805668</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10232</th>\n",
" <td>other</td>\n",
" <td>2 BHK</td>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>70.0</td>\n",
" <td>2</td>\n",
" <td>5833.333333</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10233</th>\n",
" <td>other</td>\n",
" <td>1 BHK</td>\n",
" <td>1800.0</td>\n",
" <td>1.0</td>\n",
" <td>200.0</td>\n",
" <td>1</td>\n",
" <td>11111.111111</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10236</th>\n",
" <td>other</td>\n",
" <td>2 BHK</td>\n",
" <td>1353.0</td>\n",
" <td>2.0</td>\n",
" <td>110.0</td>\n",
" <td>2</td>\n",
" <td>8130.081301</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10237</th>\n",
" <td>other</td>\n",
" <td>1 Bedroom</td>\n",
" <td>812.0</td>\n",
" <td>1.0</td>\n",
" <td>26.0</td>\n",
" <td>1</td>\n",
" <td>3201.970443</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10240</th>\n",
" <td>other</td>\n",
" <td>4 BHK</td>\n",
" <td>3600.0</td>\n",
" <td>5.0</td>\n",
" <td>400.0</td>\n",
" <td>4</td>\n",
" <td>11111.111111</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>7251 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" location size total_sqft bath price bhk \\\n",
"0 1st Block Jayanagar 4 BHK 2850.0 4.0 428.0 4 \n",
"1 1st Block Jayanagar 3 BHK 1630.0 3.0 194.0 3 \n",
"2 1st Block Jayanagar 3 BHK 1875.0 2.0 235.0 3 \n",
"3 1st Block Jayanagar 3 BHK 1200.0 2.0 130.0 3 \n",
"4 1st Block Jayanagar 2 BHK 1235.0 2.0 148.0 2 \n",
"... ... ... ... ... ... ... \n",
"10232 other 2 BHK 1200.0 2.0 70.0 2 \n",
"10233 other 1 BHK 1800.0 1.0 200.0 1 \n",
"10236 other 2 BHK 1353.0 2.0 110.0 2 \n",
"10237 other 1 Bedroom 812.0 1.0 26.0 1 \n",
"10240 other 4 BHK 3600.0 5.0 400.0 4 \n",
"\n",
" price_per_sqft \n",
"0 15017.543860 \n",
"1 11901.840491 \n",
"2 12533.333333 \n",
"3 10833.333333 \n",
"4 11983.805668 \n",
"... ... \n",
"10232 5833.333333 \n",
"10233 11111.111111 \n",
"10236 8130.081301 \n",
"10237 3201.970443 \n",
"10240 11111.111111 \n",
"\n",
"[7251 rows x 7 columns]"
]
},
"execution_count": 121,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df9"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {},
"outputs": [],
"source": [
"df10 = df9.drop(['size','price_per_sqft'],axis = 'columns')"
]
},
{
"cell_type": "code",
"execution_count": 123,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>2850.0</td>\n",
" <td>4.0</td>\n",
" <td>428.0</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1630.0</td>\n",
" <td>3.0</td>\n",
" <td>194.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1875.0</td>\n",
" <td>2.0</td>\n",
" <td>235.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>130.0</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1235.0</td>\n",
" <td>2.0</td>\n",
" <td>148.0</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" location total_sqft bath price bhk\n",
"0 1st Block Jayanagar 2850.0 4.0 428.0 4\n",
"1 1st Block Jayanagar 1630.0 3.0 194.0 3\n",
"2 1st Block Jayanagar 1875.0 2.0 235.0 3\n",
"3 1st Block Jayanagar 1200.0 2.0 130.0 3\n",
"4 1st Block Jayanagar 1235.0 2.0 148.0 2"
]
},
"execution_count": 123,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df10.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using One Hot Encoding For Location"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###### For categorical variables where no such ordinal relationship exists, the integer encoding is not enough.In fact, using this encoding and allowing the model to assume a natural ordering between categories may result in poor performance or unexpected results (predictions halfway between categories).In this case, a one-hot encoding can be applied to the integer representation. This is where the integer encoded variable is removed and a new binary variable is added for each unique integer value."
]
},
{
"cell_type": "code",
"execution_count": 124,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>1st Block Jayanagar</th>\n",
" <th>1st Phase JP Nagar</th>\n",
" <th>2nd Phase Judicial Layout</th>\n",
" <th>2nd Stage Nagarbhavi</th>\n",
" <th>5th Block Hbr Layout</th>\n",
" <th>5th Phase JP Nagar</th>\n",
" <th>6th Phase JP Nagar</th>\n",
" <th>7th Phase JP Nagar</th>\n",
" <th>8th Phase JP Nagar</th>\n",
" <th>9th Phase JP Nagar</th>\n",
" <th>...</th>\n",
" <th>Vishveshwarya Layout</th>\n",
" <th>Vishwapriya Layout</th>\n",
" <th>Vittasandra</th>\n",
" <th>Whitefield</th>\n",
" <th>Yelachenahalli</th>\n",
" <th>Yelahanka</th>\n",
" <th>Yelahanka New Town</th>\n",
" <th>Yelenahalli</th>\n",
" <th>Yeshwanthpur</th>\n",
" <th>other</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 242 columns</p>\n",
"</div>"
],
"text/plain": [
" 1st Block Jayanagar 1st Phase JP Nagar 2nd Phase Judicial Layout \\\n",
"0 1 0 0 \n",
"1 1 0 0 \n",
"2 1 0 0 \n",
"3 1 0 0 \n",
"4 1 0 0 \n",
"\n",
" 2nd Stage Nagarbhavi 5th Block Hbr Layout 5th Phase JP Nagar \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" 6th Phase JP Nagar 7th Phase JP Nagar 8th Phase JP Nagar \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" 9th Phase JP Nagar ... Vishveshwarya Layout Vishwapriya Layout \\\n",
"0 0 ... 0 0 \n",
"1 0 ... 0 0 \n",
"2 0 ... 0 0 \n",
"3 0 ... 0 0 \n",
"4 0 ... 0 0 \n",
"\n",
" Vittasandra Whitefield Yelachenahalli Yelahanka Yelahanka New Town \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Yelenahalli Yeshwanthpur other \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
"[5 rows x 242 columns]"
]
},
"execution_count": 124,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dummies = pd.get_dummies(df10.location)\n",
"dummies.head()"
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {},
"outputs": [],
"source": [
"df11 = pd.concat([df10,dummies],axis = 'columns')"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {},
"outputs": [],
"source": [
"df11 = df11.drop(['other'],axis = 'columns')"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>location</th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>1st Block Jayanagar</th>\n",
" <th>1st Phase JP Nagar</th>\n",
" <th>2nd Phase Judicial Layout</th>\n",
" <th>2nd Stage Nagarbhavi</th>\n",
" <th>5th Block Hbr Layout</th>\n",
" <th>...</th>\n",
" <th>Vijayanagar</th>\n",
" <th>Vishveshwarya Layout</th>\n",
" <th>Vishwapriya Layout</th>\n",
" <th>Vittasandra</th>\n",
" <th>Whitefield</th>\n",
" <th>Yelachenahalli</th>\n",
" <th>Yelahanka</th>\n",
" <th>Yelahanka New Town</th>\n",
" <th>Yelenahalli</th>\n",
" <th>Yeshwanthpur</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>2850.0</td>\n",
" <td>4.0</td>\n",
" <td>428.0</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1630.0</td>\n",
" <td>3.0</td>\n",
" <td>194.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1875.0</td>\n",
" <td>2.0</td>\n",
" <td>235.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>130.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1st Block Jayanagar</td>\n",
" <td>1235.0</td>\n",
" <td>2.0</td>\n",
" <td>148.0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 246 columns</p>\n",
"</div>"
],
"text/plain": [
" location total_sqft bath price bhk 1st Block Jayanagar \\\n",
"0 1st Block Jayanagar 2850.0 4.0 428.0 4 1 \n",
"1 1st Block Jayanagar 1630.0 3.0 194.0 3 1 \n",
"2 1st Block Jayanagar 1875.0 2.0 235.0 3 1 \n",
"3 1st Block Jayanagar 1200.0 2.0 130.0 3 1 \n",
"4 1st Block Jayanagar 1235.0 2.0 148.0 2 1 \n",
"\n",
" 1st Phase JP Nagar 2nd Phase Judicial Layout 2nd Stage Nagarbhavi \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" 5th Block Hbr Layout ... Vijayanagar Vishveshwarya Layout \\\n",
"0 0 ... 0 0 \n",
"1 0 ... 0 0 \n",
"2 0 ... 0 0 \n",
"3 0 ... 0 0 \n",
"4 0 ... 0 0 \n",
"\n",
" Vishwapriya Layout Vittasandra Whitefield Yelachenahalli Yelahanka \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Yelahanka New Town Yelenahalli Yeshwanthpur \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
"[5 rows x 246 columns]"
]
},
"execution_count": 127,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df11.head()"
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {},
"outputs": [],
"source": [
"df12 = df11.drop(['location'],axis = 'columns')"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>price</th>\n",
" <th>bhk</th>\n",
" <th>1st Block Jayanagar</th>\n",
" <th>1st Phase JP Nagar</th>\n",
" <th>2nd Phase Judicial Layout</th>\n",
" <th>2nd Stage Nagarbhavi</th>\n",
" <th>5th Block Hbr Layout</th>\n",
" <th>5th Phase JP Nagar</th>\n",
" <th>...</th>\n",
" <th>Vijayanagar</th>\n",
" <th>Vishveshwarya Layout</th>\n",
" <th>Vishwapriya Layout</th>\n",
" <th>Vittasandra</th>\n",
" <th>Whitefield</th>\n",
" <th>Yelachenahalli</th>\n",
" <th>Yelahanka</th>\n",
" <th>Yelahanka New Town</th>\n",
" <th>Yelenahalli</th>\n",
" <th>Yeshwanthpur</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2850.0</td>\n",
" <td>4.0</td>\n",
" <td>428.0</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1630.0</td>\n",
" <td>3.0</td>\n",
" <td>194.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1875.0</td>\n",
" <td>2.0</td>\n",
" <td>235.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>130.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1235.0</td>\n",
" <td>2.0</td>\n",
" <td>148.0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 245 columns</p>\n",
"</div>"
],
"text/plain": [
" total_sqft bath price bhk 1st Block Jayanagar 1st Phase JP Nagar \\\n",
"0 2850.0 4.0 428.0 4 1 0 \n",
"1 1630.0 3.0 194.0 3 1 0 \n",
"2 1875.0 2.0 235.0 3 1 0 \n",
"3 1200.0 2.0 130.0 3 1 0 \n",
"4 1235.0 2.0 148.0 2 1 0 \n",
"\n",
" 2nd Phase Judicial Layout 2nd Stage Nagarbhavi 5th Block Hbr Layout \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" 5th Phase JP Nagar ... Vijayanagar Vishveshwarya Layout \\\n",
"0 0 ... 0 0 \n",
"1 0 ... 0 0 \n",
"2 0 ... 0 0 \n",
"3 0 ... 0 0 \n",
"4 0 ... 0 0 \n",
"\n",
" Vishwapriya Layout Vittasandra Whitefield Yelachenahalli Yelahanka \\\n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
" Yelahanka New Town Yelenahalli Yeshwanthpur \n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
"[5 rows x 245 columns]"
]
},
"execution_count": 129,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df12.head()"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(7251, 245)"
]
},
"execution_count": 130,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df12.shape\n"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>total_sqft</th>\n",
" <th>bath</th>\n",
" <th>bhk</th>\n",
" <th>1st Block Jayanagar</th>\n",
" <th>1st Phase JP Nagar</th>\n",
" <th>2nd Phase Judicial Layout</th>\n",
" <th>2nd Stage Nagarbhavi</th>\n",
" <th>5th Block Hbr Layout</th>\n",
" <th>5th Phase JP Nagar</th>\n",
" <th>6th Phase JP Nagar</th>\n",
" <th>...</th>\n",
" <th>Vijayanagar</th>\n",
" <th>Vishveshwarya Layout</th>\n",
" <th>Vishwapriya Layout</th>\n",
" <th>Vittasandra</th>\n",
" <th>Whitefield</th>\n",
" <th>Yelachenahalli</th>\n",
" <th>Yelahanka</th>\n",
" <th>Yelahanka New Town</th>\n",
" <th>Yelenahalli</th>\n",
" <th>Yeshwanthpur</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2850.0</td>\n",
" <td>4.0</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1630.0</td>\n",
" <td>3.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1875.0</td>\n",
" <td>2.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1200.0</td>\n",
" <td>2.0</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1235.0</td>\n",
" <td>2.0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 244 columns</p>\n",
"</div>"
],
"text/plain": [
" total_sqft bath bhk 1st Block Jayanagar 1st Phase JP Nagar \\\n",
"0 2850.0 4.0 4 1 0 \n",
"1 1630.0 3.0 3 1 0 \n",
"2 1875.0 2.0 3 1 0 \n",
"3 1200.0 2.0 3 1 0 \n",
"4 1235.0 2.0 2 1 0 \n",
"\n",
" 2nd Phase Judicial Layout 2nd Stage Nagarbhavi 5th Block Hbr Layout \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"2 0 0 0 \n",
"3 0 0 0 \n",
"4 0 0 0 \n",
"\n",
" 5th Phase JP Nagar 6th Phase JP Nagar ... Vijayanagar \\\n",
"0 0 0 ... 0 \n",
"1 0 0 ... 0 \n",
"2 0 0 ... 0 \n",
"3 0 0 ... 0 \n",
"4 0 0 ... 0 \n",
"\n",
" Vishveshwarya Layout Vishwapriya Layout Vittasandra Whitefield \\\n",
"0 0 0 0 0 \n",
"1 0 0 0 0 \n",
"2 0 0 0 0 \n",
"3 0 0 0 0 \n",
"4 0 0 0 0 \n",
"\n",
" Yelachenahalli Yelahanka Yelahanka New Town Yelenahalli Yeshwanthpur \n",
"0 0 0 0 0 0 \n",
"1 0 0 0 0 0 \n",
"2 0 0 0 0 0 \n",
"3 0 0 0 0 0 \n",
"4 0 0 0 0 0 \n",
"\n",
"[5 rows x 244 columns]"
]
},
"execution_count": 131,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = df12.drop('price',axis='columns')\n",
"X.head()"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 428.0\n",
"1 194.0\n",
"2 235.0\n",
"3 130.0\n",
"4 148.0\n",
" ... \n",
"10232 70.0\n",
"10233 200.0\n",
"10236 110.0\n",
"10237 26.0\n",
"10240 400.0\n",
"Name: price, Length: 7251, dtype: float64"
]
},
"execution_count": 132,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = df12.price\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2,random_state = 10)"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.8452277697873348"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import LinearRegression\n",
"lr_clf = LinearRegression()\n",
"lr_clf.fit(X_train,y_train)\n",
"lr_clf.score(X_test,y_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use K Fold cross validation to measure accuracy of our LinearRegression model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this method, we split the data-set into k number of subsets(known as folds) then we perform training on the all the subsets but leave one(k-1) subset for the evaluation of the trained model. In this method, we iterate k times with a different subset reserved for testing purpose each time.\n",
"Always remember, a lower value of k is more biased, and hence undesirable. On the other hand, a higher value of K is less biased, but can suffer from large variability. It is important to know that a smaller value of k always takes us towards validation set approach, whereas a higher value of k leads to LOOCV approach."
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.82430186, 0.77166234, 0.85089567, 0.80837764, 0.83653286])"
]
},
"execution_count": 135,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.model_selection import ShuffleSplit\n",
"from sklearn.model_selection import cross_val_score\n",
"cv = ShuffleSplit(n_splits = 5, test_size = 0.2, random_state = 0)\n",
"cross_val_score(LinearRegression(),X,y,cv=cv)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Find best model using GridSearchCV "
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>model</th>\n",
" <th>best_score</th>\n",
" <th>best_params</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>linear_regression</td>\n",
" <td>0.818354</td>\n",
" <td>{'normalize': False}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>lasso</td>\n",
" <td>0.687430</td>\n",
" <td>{'alpha': 2, 'selection': 'random'}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>decision_tree</td>\n",
" <td>0.720273</td>\n",
" <td>{'criterion': 'friedman_mse', 'splitter': 'best'}</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" model best_score \\\n",
"0 linear_regression 0.818354 \n",
"1 lasso 0.687430 \n",
"2 decision_tree 0.720273 \n",
"\n",
" best_params \n",
"0 {'normalize': False} \n",
"1 {'alpha': 2, 'selection': 'random'} \n",
"2 {'criterion': 'friedman_mse', 'splitter': 'best'} "
]
},
"execution_count": 136,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.linear_model import Lasso\n",
"from sklearn.tree import DecisionTreeRegressor\n",
"\n",
"def find_best_model_using_gridsearchcv(X,y):\n",
" algos = {\n",
" 'linear_regression' : {\n",
" 'model': LinearRegression(),\n",
" 'params': {\n",
" 'normalize': [True, False]\n",
" }\n",
" },\n",
" 'lasso': {\n",
" 'model': Lasso(),\n",
" 'params': {\n",
" 'alpha': [1,2],\n",
" 'selection': ['random', 'cyclic']\n",
" }\n",
" },\n",
" 'decision_tree': {\n",
" 'model': DecisionTreeRegressor(),\n",
" 'params': {\n",
" 'criterion' : ['mse','friedman_mse'],\n",
" 'splitter': ['best','random']\n",
" }\n",
" }\n",
" }\n",
" scores = []\n",
" cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)\n",
" for algo_name, config in algos.items():\n",
" gs = GridSearchCV(config['model'], config['params'], cv=cv, return_train_score=False)\n",
" gs.fit(X,y)\n",
" scores.append({\n",
" 'model': algo_name,\n",
" 'best_score': gs.best_score_,\n",
" 'best_params': gs.best_params_\n",
" })\n",
"\n",
" return pd.DataFrame(scores,columns=['model','best_score','best_params'])\n",
"\n",
"find_best_model_using_gridsearchcv(X,y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Based on above results we can say that LinearRegression gives the best score. Hence we will use that. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Test the model for few properties "
]
},
{
"cell_type": "code",
"execution_count": 137,
"metadata": {},
"outputs": [],
"source": [
"def predict_price(location,sqft,bath,bhk):\n",
" loc_index = np.where(X.columns==location)[0][0]\n",
" \n",
" x = np.zeros(len(X.columns))\n",
" x[0] = sqft\n",
" x[1] = bath\n",
" x[2] = bhk\n",
" if loc_index >= 0:\n",
" x[loc_index] = 1\n",
" return lr_clf.predict([x])[0] \n",
" "
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"83.49904676591962"
]
},
"execution_count": 138,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predict_price('1st Phase JP Nagar',1000,2,2)"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"184.58430202040012"
]
},
"execution_count": 139,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predict_price('Indira Nagar',1000, 3, 3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}