Commit bd8378b

committed
https://pedroazambuja.medium.com/adaboost-adaptive-boosting-dbbec150fced
1 parent 842e623 commit bd8378b

1 file changed

+273
-0
lines changed

Adaboost.ipynb

@@ -0,0 +1,273 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "code",
5+
"execution_count": null,
6+
"metadata": {
7+
"id": "WLpfCn9XJqu7"
8+
},
9+
"outputs": [],
10+
"source": [
11+
"from sklearn.tree import DecisionTreeClassifier\n",
12+
"from sklearn.svm import SVC\n",
13+
"from sklearn.ensemble import AdaBoostClassifier\n",
14+
"from sklearn import datasets\n",
15+
"from sklearn.model_selection import train_test_split\n",
16+
"from sklearn.metrics import confusion_matrix\n",
17+
"from sklearn import metrics"
18+
]
19+
},
20+
{
21+
"cell_type": "markdown",
22+
"metadata": {
23+
"id": "XnW6ubOJJqu8"
24+
},
25+
"source": [
26+
"First, we load the Iris dataset from scikit-learn. More information about it can be found at:\n",
27+
"\n",
28+
"https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html"
29+
]
30+
},
31+
{
32+
"cell_type": "code",
33+
"execution_count": null,
34+
"metadata": {
35+
"id": "fq-a--cbJqu9"
36+
},
37+
"outputs": [],
38+
"source": [
39+
"iris = datasets.load_iris()"
40+
]
41+
},
42+
{
43+
"cell_type": "markdown",
44+
"metadata": {
45+
"id": "4-Bj4cVsJqu9"
46+
},
47+
"source": [
48+
"Next, we split the loaded data into data (X) and target (y). These are then divided into a training set and a test set, with 60% of the samples in the former and 40% in the latter."
49+
]
50+
},
51+
{
52+
"cell_type": "code",
53+
"execution_count": null,
54+
"metadata": {
55+
"id": "V9n-GkRzJqu9"
56+
},
57+
"outputs": [],
58+
"source": [
59+
"X = iris.data\n",
60+
"y = iris.target\n",
61+
"\n",
62+
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)"
63+
]
64+
},
65+
{
66+
"cell_type": "markdown",
67+
"metadata": {
68+
"id": "ALoRz9SyJqu9"
69+
},
70+
"source": [
71+
"Then we create the AdaBoost classifier. **AdaBoostClassifier** uses a DecisionTreeClassifier of depth 1 as its default base classifier, but other classifiers can be used, as shown further below.\n",
72+
"\n",
73+
"For more information about **AdaBoostClassifier**:\n",
74+
"\n",
75+
"https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html"
76+
]
77+
},
78+
{
79+
"cell_type": "code",
80+
"execution_count": null,
81+
"metadata": {
82+
"id": "lyqqo0fXJqu-"
83+
},
84+
"outputs": [],
85+
"source": [
86+
"ab_classifier = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),\n",
87+
" n_estimators=50,\n",
88+
" learning_rate=1)"
89+
]
90+
},
91+
{
92+
"cell_type": "markdown",
93+
"metadata": {
94+
"id": "2iigVztGJqu-"
95+
},
96+
"source": [
97+
"Finally, we train the classifier on the training set and use it to predict the labels of the test set."
98+
]
99+
},
100+
{
101+
"cell_type": "code",
102+
"execution_count": null,
103+
"metadata": {
104+
"id": "RHBpWTBTJqu-"
105+
},
106+
"outputs": [],
107+
"source": [
108+
"model = ab_classifier.fit(X_train, y_train)\n",
109+
"\n",
110+
"y_pred = model.predict(X_test)"
111+
]
112+
},
113+
{
114+
"cell_type": "markdown",
115+
"metadata": {
116+
"id": "bJgxTIo8Jqu-"
117+
},
118+
"source": [
119+
"For comparison, the whole procedure above is repeated with the same classifier but without AdaBoost:"
120+
]
121+
},
122+
{
123+
"cell_type": "code",
124+
"execution_count": null,
125+
"metadata": {
126+
"id": "1SANz_9cJqu_"
127+
},
128+
"outputs": [],
129+
"source": [
130+
"dt = DecisionTreeClassifier(max_depth=1)\n",
131+
"dt_model = dt.fit(X_train, y_train)\n",
132+
"y_pred_dt = dt_model.predict(X_test)"
133+
]
134+
},
135+
{
136+
"cell_type": "markdown",
137+
"metadata": {
138+
"id": "6N9o3KnxJqu_"
139+
},
140+
"source": [
141+
"Below, we compute the accuracy and the confusion matrix of both models:"
142+
]
143+
},
144+
{
145+
"cell_type": "code",
146+
"execution_count": null,
147+
"metadata": {
148+
"id": "xi2r0G_IJqu_",
149+
"outputId": "86c9dd4c-5007-4701-d705-ddf07749c9c0"
150+
},
151+
"outputs": [
152+
{
153+
"name": "stdout",
154+
"output_type": "stream",
155+
"text": [
156+
"Confusion matrix without AdaBoost:\n",
157+
"[[18 0 0]\n",
158+
" [ 0 0 24]\n",
159+
" [ 0 0 18]]\n",
160+
"Confusion matrix with AdaBoost:\n",
161+
"[[18 0 0]\n",
162+
" [ 0 18 6]\n",
163+
" [ 0 0 18]]\n",
164+
"Accuracy without AdaBoost: 0.6\n",
165+
"Accuracy with AdaBoost: 0.9\n"
166+
]
167+
}
168+
],
169+
"source": [
170+
"from sklearn.model_selection import cross_val_score\n",
171+
"\n",
172+
"print(\"Confusion matrix without AdaBoost:\")\n",
173+
"print(confusion_matrix(y_test, y_pred_dt))\n",
174+
"\n",
175+
"print(\"Confusion matrix with AdaBoost:\")\n",
176+
"print(confusion_matrix(y_test, y_pred))\n",
177+
"\n",
178+
"print(\"Accuracy without AdaBoost:\", metrics.accuracy_score(y_test, y_pred_dt))\n",
179+
"print(\"Accuracy with AdaBoost:\", metrics.accuracy_score(y_test, y_pred))"
180+
]
181+
},
182+
{
183+
"cell_type": "markdown",
184+
"metadata": {
185+
"id": "lKkyRpz3Jqu_"
186+
},
187+
"source": [
188+
"As can be seen, adding AdaBoost to the classifier substantially improves its accuracy on this test split, from 0.6 to 0.9."
189+
]
190+
},
191+
{
192+
"cell_type": "markdown",
193+
"metadata": {
194+
"id": "ZKpY5_KqJqu_"
195+
},
196+
"source": [
197+
"In addition, the cells below show AdaBoost used with a different type of base classifier."
198+
]
199+
},
200+
{
201+
"cell_type": "code",
202+
"execution_count": null,
203+
"metadata": {
204+
"id": "kz1O13DKJqvA"
205+
},
206+
"outputs": [],
207+
"source": [
208+
"svc = SVC(probability=True, kernel='linear')\n",
209+
"\n",
210+
"# Create the AdaBoost classifier object with the SVC as base estimator\n",
211+
"ab_classifier = AdaBoostClassifier(svc,\n",
212+
"                                   n_estimators=50,\n",
213+
"                                   learning_rate=1)\n",
214+
"# Train the AdaBoost classifier\n",
215+
"model = ab_classifier.fit(X_train, y_train)\n",
216+
"\n",
217+
"# Predict the response for the test dataset\n",
218+
"y_pred = model.predict(X_test)"
219+
]
220+
},
221+
{
222+
"cell_type": "code",
223+
"execution_count": null,
224+
"metadata": {
225+
"id": "nVedkP3qJqvA",
226+
"outputId": "56de05f5-6c6a-4148-aa46-f4bb751aa875"
227+
},
228+
"outputs": [
229+
{
230+
"name": "stdout",
231+
"output_type": "stream",
232+
"text": [
233+
"Confusion matrix of the SVC with AdaBoost:\n",
234+
"[[18 0 0]\n",
235+
" [ 0 22 2]\n",
236+
" [ 0 0 18]]\n",
237+
"Accuracy of the SVC with AdaBoost: 0.9666666666666667\n"
238+
]
239+
}
240+
],
241+
"source": [
242+
"print(\"Confusion matrix of the SVC with AdaBoost:\")\n",
243+
"print(confusion_matrix(y_test, y_pred))\n",
244+
"\n",
245+
"print(\"Accuracy of the SVC with AdaBoost:\", metrics.accuracy_score(y_test, y_pred))"
246+
]
247+
}
248+
],
249+
"metadata": {
250+
"kernelspec": {
251+
"display_name": "Python 3",
252+
"language": "python",
253+
"name": "python3"
254+
},
255+
"language_info": {
256+
"codemirror_mode": {
257+
"name": "ipython",
258+
"version": 3
259+
},
260+
"file_extension": ".py",
261+
"mimetype": "text/x-python",
262+
"name": "python",
263+
"nbconvert_exporter": "python",
264+
"pygments_lexer": "ipython3",
265+
"version": "3.7.6"
266+
},
267+
"colab": {
268+
"provenance": []
269+
}
270+
},
271+
"nbformat": 4,
272+
"nbformat_minor": 0
273+
}
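
The notebook's split is unseeded, so the recorded accuracies change from run to run. The following is a minimal sketch, not part of the committed notebook, of how the stump-versus-AdaBoost comparison could be made repeatable. The `random_state=0` and `stratify=y` arguments, and the variable names `stump`, `boosted`, and `staged_accuracy`, are assumptions introduced here for illustration. A depth-1 tree has only two leaves, so it can predict at most two of the three Iris classes, which probably explains why the unboosted model in the notebook never predicts the middle class.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assumption: a fixed seed and a stratified split, neither of which is in the notebook.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y
)

# Baseline: a single decision stump (two leaves, so at most two predicted classes).
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# Boosted version: up to 50 stumps combined by AdaBoost.
boosted = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=0,
).fit(X_train, y_train)

stump_accuracy = stump.score(X_test, y_test)
boosted_accuracy = boosted.score(X_test, y_test)

# Test accuracy after each boosting round, useful for seeing how quickly it improves.
staged_accuracy = list(boosted.staged_score(X_test, y_test))

print("Stump accuracy:", stump_accuracy)
print("AdaBoost accuracy:", boosted_accuracy)
print("Boosting rounds evaluated:", len(staged_accuracy))
```

Printing `staged_accuracy` shows the test accuracy after each round. The exact values depend on the scikit-learn version and the chosen seed.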
