1.4 Principles of Deep Learning
Created Date: 2025-05-10
Use tensorflow/tfjs to animate the MNIST training and prediction process!
1.5.1 Machine learning
The usual way to make a computer do useful work is to have a human programmer write down rules — a computer program — to be followed to turn input data into appropriate answers.
The file classical_program.py reads a user's age and income and decides whether to grant a loan.
def check_loan_eligibility(age, income):
    # Rules written down by a human programmer: reject applicants outside
    # the working-age range or below the income threshold.
    if age < 18 or age > 65:
        return 'NO'
    if income < 30000:
        return 'NO'
    return 'YES'

age = int(input('Please enter age: '))
income = float(input('Please enter income: '))
eligibility = check_loan_eligibility(age, income)
print(eligibility)
Please enter age: 30
Please enter income: 23000
NO
Machine learning turns this around: the machine looks at the input data and the corresponding answers, and figures out what the rules should be:

A machine learning system is trained rather than explicitly programmed. It’s presented with many examples relevant to a task, and it finds statistical structure in these examples that eventually allows the system to come up with rules for automating the task. For instance, if you wished to automate the task of tagging your vacation pictures, you could present a machine learning system with many examples of pictures already tagged by humans, and the system would learn statistical rules for associating specific pictures to specific tags.

1.5.2 Rules and Representations
To define deep learning and understand the difference between deep learning and other machine learning approaches, first we need some idea of what machine learning algorithms do. We just stated that machine learning discovers rules for executing a data processing task, given examples of what’s expected. So, to do machine learning, we need three things:
Input data points - For instance, if the task is speech recognition, these data points could be sound files of people speaking. If the task is image tagging, they could be pictures.
Examples of the expected output - In a speech-recognition task, these could be human-generated transcripts of sound files. In an image task, expected outputs could be tags such as "dog", "cat" and so on.
A way to measure whether the algorithm is doing a good job - This is needed to determine the distance between the algorithm's current output and its expected output. The measurement is used as a feedback signal to adjust how the algorithm works; this adjustment step is what we call learning.
As illustrated in the figure below, the model, composed of layers that are chained together, maps the input data to predictions. The loss function then compares these predictions to the targets, producing a loss value: a measure of how well the model’s predictions match what was expected. The optimizer uses this loss value to update the model’s weights.
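To make this loop concrete, here is a minimal sketch of training and evaluating a small dense network on MNIST with tf.keras (assuming TensorFlow is installed); the layer sizes, optimizer, and epoch count are illustrative choices, not prescriptions.

import tensorflow as tf

# Input data points and expected outputs: MNIST images and their digit labels.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28 * 28).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28 * 28).astype('float32') / 255.0

# The model: layers chained together, mapping input data to predictions.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28 * 28,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# The loss function compares predictions to targets; the optimizer uses the
# resulting loss value to update the model's weights.
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5, batch_size=128)
print(model.evaluate(x_test, y_test))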

1.5.5 Gradient Descent
Gradient descent is a fundamental optimization algorithm used in machine learning and other mathematical fields to find the minimum of a function. Imagine you are on a foggy mountain and want to find the lowest point (the valley). You can't see the whole landscape, but you can feel the slope of the ground beneath your feet. Gradient descent works similarly: it iteratively takes steps in the steepest downward direction to reach a local or global minimum of a function.
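As a bare-bones illustration (not tied to any particular library), the sketch below applies the descent step to the one-dimensional function f(x) = (x - 3)^2; the starting point, learning rate, and number of steps are arbitrary choices for demonstration.

def gradient(x):
    # Derivative of f(x) = (x - 3)^2 is f'(x) = 2 * (x - 3).
    return 2 * (x - 3)

x = 10.0             # start somewhere on the foggy mountain
learning_rate = 0.1  # size of each downhill step

for step in range(50):
    # Move in the direction opposite the gradient (steepest descent).
    x = x - learning_rate * gradient(x)

print(x)  # approaches 3, the minimum of f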
In summary, gradient descent is a powerful and widely used iterative optimization algorithm that forms the backbone of training many machine learning models. By repeatedly moving in the direction opposite to the gradient of a cost function, it efficiently searches for the optimal parameters that minimize the model's error. Understanding its principles, variations, and challenges is crucial for anyone working in machine learning.