k-Nearest Neighbors and Generalization

I recently played with the digits dataset that ships with scikit-learn (sklearn).

The exercise gave me good insight into how the number of neighbors plays an important role in model complexity. The fewer neighbors the classifier consults, the more complex the model. At the extreme (number of neighbors = 1) the model memorizes the training set and tends to overfit.
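To see why one neighbor means memorization, here is a minimal sketch in plain Python. It is not the code from my notebook: the toy 2-D data, the 20% label noise, and the helper names `knn_predict` and `training_accuracy` are all made up for illustration. With one neighbor, every training point is its own nearest neighbor, so the model reproduces even the noisy labels.

```python
import random
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    # Sort training indices by squared Euclidean distance to the query.
    by_distance = sorted(
        range(len(train_points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_points[i], query)),
    )
    votes = Counter(train_labels[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]

def training_accuracy(points, labels, k):
    """Fraction of training points whose own label is predicted back."""
    hits = sum(knn_predict(points, labels, p, k) == y for p, y in zip(points, labels))
    return hits / len(points)

# Hypothetical toy data: class 1 when the first coordinate is positive,
# then roughly 20% of the labels are flipped at random to act as noise.
random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
labels = [int(x > 0) for x, _ in points]
labels = [1 - y if random.random() < 0.2 else y for y in labels]

acc_k1 = training_accuracy(points, labels, k=1)    # fits the noise too
acc_k15 = training_accuracy(points, labels, k=15)  # smoother decision rule
print(f"training accuracy  k=1: {acc_k1:.2f}   k=15: {acc_k15:.2f}")
```

A perfect training score at k=1 says nothing about unseen data, which is why the comparison against a held-out test set matters.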

GenerelizationError.JPG

Here is how accuracy compares between the training and test sets.

Accuracy.JPG
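A comparison like this can be reproduced with a few lines of scikit-learn. Treat the following as a sketch under my own assumptions rather than the exact notebook code: the 75/25 stratified split, `random_state=0`, and the particular values of k are choices made here for illustration, so the numbers will differ from the figure above.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the 8x8 handwritten digits and hold out a quarter of them for testing.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target,
    test_size=0.25, random_state=0, stratify=digits.target,
)

# Fit one classifier per neighbor count and record (train, test) accuracy.
results = {}
for k in (1, 3, 5, 11, 21, 51):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    results[k] = (model.score(X_train, y_train), model.score(X_test, y_test))

for k, (train_acc, test_acc) in results.items():
    print(f"k={k:>2}  train={train_acc:.3f}  test={test_acc:.3f}")
```

The thing to look at is the gap between the two columns: a large gap at small k points to overfitting, while both numbers falling together at large k suggests the model has become too simple.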

GitHub code:

 

IPython Notebook Shortcuts.

Blue marks the shortcuts I have tried out.

Command Mode (press Esc to enable)

Enter: enter edit mode
Shift-Enter: run cell, select below
Ctrl-Enter: run cell
Alt-Enter: run cell, insert below
Y: to code
M: to markdown
R: to raw
1: to heading 1
2: to heading 2
3: to heading 3
4: to heading 4
5: to heading 5
6: to heading 6
Up: select cell above
K: select cell above
Down: select cell below
J: select cell below
A: insert cell above
B: insert cell below
X: cut selected cell
C: copy selected cell
Shift-V: paste cell above
V: paste cell below
Z: undo last cell deletion
D,D: delete selected cell
Shift-M: merge cell below
S: Save and Checkpoint
Ctrl-S: Save and Checkpoint
L: toggle line numbers
O: toggle output
Shift-O: toggle output scrolling
Esc: close pager
Q: close pager
H: show keyboard shortcut help dialog
I,I: interrupt kernel
0,0: restart kernel
Space: scroll down
Shift-Space: scroll up
Shift: ignore

Edit Mode (press Enter to enable)

Tab: code completion or indent
Shift-Tab: tooltip
Ctrl-]: indent
Ctrl-[: dedent
Ctrl-A: select all
Ctrl-Z: undo
Ctrl-Shift-Z: redo
Ctrl-Y: redo
Ctrl-Home: go to cell start
Ctrl-Up: go to cell start
Ctrl-End: go to cell end
Ctrl-Down: go to cell end
Ctrl-Left: go one word left
Ctrl-Right: go one word right
Ctrl-Backspace: delete word before
Ctrl-Delete: delete word after
Esc: command mode
Ctrl-M: command mode
Shift-Enter: run cell, select below
Ctrl-Enter: run cell
Alt-Enter: run cell, insert below
Ctrl-Shift-Subtract: split cell
Ctrl-Shift--: split cell
Ctrl-S: Save and Checkpoint
Up: move cursor up or previous cell
Down: move cursor down or next cell
Shift: ignore