# 3. Let P(w, m, c) denote the profit (in dollars) for a company when they have...

3. Let P(w, m, c) denote the profit (in dollars) for a company when they have w workers, m managers, and c computers. Explain what P_c(w, m, c) means in this case. What is the interpretation of P_c(w, m, c) > 0?

4. Under what circumstances might a linear approximation be a "bad" estimate for the value of a function of multiple variables? Give at least two different scenarios.

5. Let f(x, y) = xy. Use linear approximation to estimate the value of f(3.8, 4.1). (Hint: first figure out a nearby point that you can easily compute.)

6. Under what circumstances might gradient descent not work? Could it miss a point? Does your initial guess matter?

3)

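Reading the subscript as c (the number of computers), P_c(w, m, c) is the
partial derivative of the profit with respect to c. It is the rate at which the
profit changes, in dollars per computer, as the number of computers changes
while the numbers of workers and managers are held constant.

P_c(w, m, c) > 0 means that, at the current numbers of workers, managers, and
computers, adding a computer increases the profit. It does not say that the
profit itself is positive.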

5)
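
A minimal sketch of the linear approximation, assuming f(x, y) = xy as the question reads and taking (4, 4) as the easy nearby point. The base point and the helper names `f` and `linear_approx` are assumptions for illustration, not part of the question.

```python
def f(x, y):
    # Assumed form of the function, taken from the question text.
    return x * y


def linear_approx(x, y, a=4.0, b=4.0):
    """Tangent-plane estimate of f at (x, y), built at the base point (a, b)."""
    # For f(x, y) = x * y the partial derivatives are f_x = y and f_y = x,
    # so at (a, b) they equal b and a respectively.
    f_x, f_y = b, a
    return f(a, b) + f_x * (x - a) + f_y * (y - b)


estimate = linear_approx(3.8, 4.1)
exact = f(3.8, 4.1)
print("linear estimate:", estimate)
print("exact value:    ", exact)
```

By hand, the estimate is 16 + 4(-0.2) + 4(0.1) = 15.6, against an exact value of 15.58. The gap of 0.02 is the product of the two displacements, which the tangent plane ignores.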

6)

Gradient descent is an algorithm designed to find optimal points, but the points it finds are not necessarily global optima. And yes, it can miss a point. If the step size (usually called alpha) is too large, an update can overshoot an optimum and move away from it. The iterates may then settle at a different optimum, but that is not very likely; oscillation or divergence is much more likely than convergence. The initial guess matters as well, because it determines which local optimum the iterates head toward.
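
The effect of the step size and of the initial guess can be sketched on two toy one-dimensional functions. Both functions, the starting points, and the alpha values below are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, alpha, steps=100):
    """Plain gradient descent: repeat x <- x - alpha * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x


def grad_bowl(x):
    # Gradient of x**2, which has a single minimum at x = 0.
    return 2 * x


def grad_double_well(x):
    # Gradient of x**4 - 2*x**2, which has minima at x = -1 and x = +1.
    return 4 * x**3 - 4 * x


# Step size: the same start, three different values of alpha.
small_step = gradient_descent(grad_bowl, x0=1.0, alpha=0.1)   # shrinks toward 0
oscillating = gradient_descent(grad_bowl, x0=1.0, alpha=1.0)  # flips sign every step
diverging = gradient_descent(grad_bowl, x0=1.0, alpha=1.1)    # grows every step

# Initial guess: the same alpha, two different starts.
from_left = gradient_descent(grad_double_well, x0=-0.5, alpha=0.05)
from_right = gradient_descent(grad_double_well, x0=0.5, alpha=0.05)

print(small_step, oscillating, diverging)
print(from_left, from_right)
```

On the bowl, each update multiplies x by (1 - 2 * alpha), so alpha = 1.0 bounces between +1 and -1 forever and any larger alpha diverges. On the double well, the two starts end at different minima even though nothing else changes.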

There are two main perspectives on gradient descent, one from the machine learning era and one from the deep learning era. In the machine learning era it was assumed that gradient descent would find a local or global optimum. In the deep learning era, where the number of input dimensions is very large, practice has shown that it is unlikely for every dimension to sit at its optimal value at the same point. As a result, the stationary points of the cost function are mostly saddle points rather than optima. This is one of the reasons that training on lots of data for many epochs lets deep learning models outperform other algorithms. If you keep training, the model finds a detour, a direction that still leads downhill, and does not get stuck at a saddle point, provided the step size is appropriate.
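
A small sketch of the saddle-point behaviour, using an assumed toy function f(x, y) = x^2 + y^4/4 - y^2/2. It has a saddle at (0, 0) and minima at (0, 1) and (0, -1). The function, the step size, and the tiny offset in y (standing in for the noise that real training usually has) are all assumptions for illustration.

```python
def descend(x, y, alpha=0.1, steps=500):
    """Gradient descent on f(x, y) = x**2 + y**4 / 4 - y**2 / 2."""
    for _ in range(steps):
        grad_x = 2 * x        # df/dx
        grad_y = y**3 - y     # df/dy
        x, y = x - alpha * grad_x, y - alpha * grad_y
    return x, y


# Starting exactly on the line y = 0, the y-gradient is always zero,
# so the iterates slide into the saddle at (0, 0) and stay there.
stuck = descend(1.0, 0.0)

# A tiny offset in y is amplified at every step, so the iterates
# leave the saddle and continue downhill to the minimum at (0, 1).
escaped = descend(1.0, 1e-6)

print("start on y = 0:  ", stuck)
print("start off y = 0: ", escaped)
```

Only starts that lie exactly on the line y = 0 end at the saddle. Any small offset away from that line is enough to escape, which is why getting permanently stuck at a saddle is unlikely in practice.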

