Commit 0b1f3d0

Merge pull request #50 from fuglede/patch-7
Fix a few typos
2 parents c628cd2 + c1c76da commit 0b1f3d0

1 file changed, 2 insertions(+), 2 deletions(-)

Diff for: notebooks/05.04-Feature-Engineering.ipynb

@@ -27,7 +27,7 @@
 "\n",
 "The previous sections outline the fundamental ideas of machine learning, but all of the examples assume that you have numerical data in a tidy, ``[n_samples, n_features]`` format.\n",
 "In the real world, data rarely comes in such a form.\n",
-"With this in mind, one of the more important steps in using machine learning in practice is *feature engineering*: that is, taking whatever information you have about your problem and turning it in to numbers that you can use to build your feature matrix.\n",
+"With this in mind, one of the more important steps in using machine learning in practice is *feature engineering*: that is, taking whatever information you have about your problem and turning it into numbers that you can use to build your feature matrix.\n",
 "\n",
 "In this section, we will cover a few common examples of feature engineering tasks: features for representing *categorical data*, features for representing *text*, and features for representing *images*.\n",
 "Additionally, we will discuss *derived features* for increasing model complexity and *imputation* of missing data.\n",
@@ -83,7 +83,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"It turns out that this is not generally a useful approach in Scikit-Learn: the package's models make the fundamental assumption that numerical features reflect algebraic quanitites.\n",
+"It turns out that this is not generally a useful approach in Scikit-Learn: the package's models make the fundamental assumption that numerical features reflect algebraic quantities.\n",
 "Thus such a mapping would imply, for example, that *Queen Anne < Fremont < Wallingford*, or even that *Wallingford - Queen Anne = Fremont*, which (niche demographic jokes aside) does not make much sense.\n",
 "\n",
 "In this case, one proven technique is to use *one-hot encoding*, which effectively creates extra columns indicating the presence or absence of a category with a value of 1 or 0, respectively.\n",
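
Reviewer note: the one-hot encoding described in the last context line of the second hunk can be illustrated with a minimal sketch in plain Python. This is not the notebook's code and not Scikit-Learn's API; the helper name `one_hot_encode` and the sample list are hypothetical and only reuse the neighborhood names from the text. In practice, Scikit-Learn's `DictVectorizer` or `OneHotEncoder` would normally be used for this.

```python
def one_hot_encode(values):
    """Hypothetical helper: return (categories, rows).

    Each row has one column per category, holding 1 where the value
    belongs to that category and 0 everywhere else.
    """
    # Fix a column order by sorting the distinct categories.
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column_of[value]] = 1  # mark presence of this category
        rows.append(row)
    return categories, rows


# Illustrative data only; the neighborhood names come from the notebook text.
neighborhoods = ["Queen Anne", "Fremont", "Wallingford", "Fremont"]
categories, encoded = one_hot_encode(neighborhoods)

for name, row in zip(neighborhoods, encoded):
    print(name, row)
```

Because every row contains exactly one 1, no ordering or arithmetic relationship between neighborhoods is implied, which is the problem with the integer mapping that the edited paragraph describes.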
