
AIL701

From IITD Wiki
Revision as of 10:23, 4 March 2026 by Prashantt492 (talk | contribs) (Creating course page via bot)
AIL701 : Mathematics Behind Machine Learning
Credits: 3
Structure: 3-0-0
Pre-requisites:
Overlaps: MTL101, MTL104, MTL103, MTL106, MTL108, MTL265, MTL502, MTL508, MTL601, MTL628, MCL261, MCL761, ELL701, ELL780, COL756, MSL719

Vector spaces: dimension, basis, matrix rank and nullity, systems of linear equations, eigenvalues and their properties for some special matrices, diagonalization. Normed linear spaces: matrix norms, closed sets, sequences, convergence, Banach spaces, Hahn-Banach theorem (without proof). Inner product spaces: orthonormal bases, Gram-Schmidt process, Hilbert spaces, adjoint operators, SVD. Nonlinear programming: linearizing cone, KKT optimality conditions, convex functions and their properties, Lagrange function, Lagrange dual. Steepest descent for unconstrained problems, momentum and stochastic gradient methods, empirical risk, shattering, VC dimension, Mercer's theorem, reproducing kernels. Probability and statistics: Bayes' theorem, pdf and cdf, moment generating functions, some discrete and continuous distributions, transformation of variables, t-test, chi-square test, F-test, random sampling, central limit theorem, maximum likelihood estimator, hypothesis testing (concepts without proofs).

AIP701 : Machine Learning Lab
Credits: 1
Structure: 0-0-2

About one third of the course is devoted to programming aspects, one third to assignments and experiments, and one third to a term project. Refresher on fundamentals of Python programming. Implementation of linear models (linear, logistic and polynomial regressors) from scratch without using libraries; implementation of naive Bayes; verification of the perceptron convergence algorithm; validation of different regularization techniques; generation of bias-variance curves. Kernel machines: experiments with hyperparameters, effect of kernel functions, margins. Implementation and study of classification and regression trees, back-propagation, dropout, batch normalization; study of the effect of loss-function choice on deep neural network performance. Implementation of a CNN and a GRU on image and sequential tasks.
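As a flavour of the lab's "linear models from scratch" exercise (and of the steepest-descent topic in AIL701), the sketch below fits a one-variable linear model by gradient descent on the mean-squared error. The synthetic data, learning rate, and iteration count are illustrative choices, not part of the course material:

```python
import numpy as np

# Synthetic data: y = 3x + 0.5 plus a little noise (made-up for the demo).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * X + 0.5 + rng.normal(scale=0.1, size=100)

# Steepest descent on MSE(w, b) = mean((w*x + b - y)^2).
w, b = 0.0, 0.0
lr = 0.1                              # step size (assumed)
for _ in range(500):
    err = (w * X + b) - y             # residuals
    grad_w = 2.0 * np.mean(err * X)   # d(MSE)/dw
    grad_b = 2.0 * np.mean(err)       # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w ~ {w:.2f}, b ~ {b:.2f}")    # recovers roughly w = 3, b = 0.5
```

The same loop structure extends directly to the lab's polynomial and logistic regressors by swapping the model and loss, and to the momentum and stochastic variants covered in AIL701 by modifying the update step.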
ML on edge: implementation and optimization of algorithms on edge hardware such as FPGAs. Term project: proposing, solving and implementing a real-world ML problem.