One of the most important applications of derivatives is optimization. In some introductory calculus classes these types of problems are called max/min problems: given a function, what is the maximum or minimum output subject to some constraints? This module reviews how derivatives can be used in these problems and gives some of the reasons why these methods work.

Critical points

First, observe that for a differentiable function $f$, if the derivative is not zero at a point, then that point cannot be a maximum or a minimum. For instance, if the derivative is positive, then the output is increasing with respect to the input, so by increasing the input, one can increase the output. Hence, the point is not a maximum. If the derivative is negative, then decreasing the input will increase the output, so that point cannot be a maximum.

Similarly, a point cannot be a minimum if the derivative is not zero. Thus, for a differentiable function, the only possible inputs where a maximum or minimum can occur are those where the derivative is zero. This motivates the following definition:

Critical point

A critical point of a function $f$ is an input $x = a$ where either $f'(a) = 0$ or where the derivative is undefined.

Critical points where the derivative is 0 include maximum and minimum points (called extrema) as well as some inflection points. Other critical points occur at corner points or discontinuities, where the derivative is undefined. The reason for including points where the derivative is not defined is that such a point could be a maximum or minimum. For example, $|x|$ has a minimum at $x = 0$, a corner where the derivative is undefined.


Compute the critical points of .

The derivative is defined everywhere, so the critical points are where $f'(x) = 0$. Since

the critical points are $x = 1$ and $x = 3$.
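
A computation like this can be checked with a few lines of code. The sketch below is illustrative only: it assumes a hypothetical cubic, $f(x) = x^3 - 6x^2 + 9x + 2$, chosen because its derivative is a quadratic that can be solved directly. The names `f_prime` and `quadratic_roots` are made up for this example.

```python
import math

# Hypothetical function (an assumption, for illustration only):
# f(x) = x^3 - 6x^2 + 9x + 2, so f'(x) = 3x^2 - 12x + 9.
def f_prime(x):
    return 3 * x**2 - 12 * x + 9

def quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # no real roots, hence no critical points
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

# The derivative is defined everywhere, so the critical points
# are exactly the roots of f'(x) = 0.
critical_points = quadratic_roots(3, -12, 9)
print(critical_points)  # [1.0, 3.0]
```

For functions whose derivative is not a quadratic, a numerical root finder would be needed instead of the closed-form formula used here.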

Classifying critical points

Once one has computed a critical point $a$, one can classify whether it is a maximum or minimum using the second derivative test:

Second Derivative Test

Suppose $a$ is a critical point of $f$ where $f'(a) = 0$.

  1. If $f''(a) > 0$, then $f$ has a local minimum at $a$.
  2. If $f''(a) < 0$, then $f$ has a local maximum at $a$.
  3. If $f''(a) = 0$, then the test fails.

In the third case, one can use the Taylor expansion about $a$ to determine the behavior of the function. In this case, $a$ could still be a local maximum, minimum, or inflection point.

The second derivative test is justified by considering the Taylor series for $f$ about $x = a$:

$$ f(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + \cdots = f(a) + \frac{1}{2} f''(a)(x - a)^2 + \cdots $$

since $f'(a) = 0$. Thus, when $x$ is close to $a$, $f$ behaves like a parabola centered at $a$. Recall that the sign of the coefficient of the square term in a parabola determines if the parabola opens up or down. A positive coefficient means the parabola opens upward, and a negative coefficient means the parabola opens downward.

Here, the coefficient of the $(x - a)^2$ term is $\frac{1}{2} f''(a)$. So if $f''(a) > 0$, then the parabola opens upward, meaning $a$ is a local minimum of $f$. If $f''(a) < 0$, then the parabola opens downward, meaning $a$ is a local maximum of $f$. If $f''(a) = 0$, then one has to look at more terms of the Taylor series to determine $f$'s behavior at $a$.
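
The test described above can be sketched numerically. This is a minimal illustration, not a robust implementation: it assumes the second derivative can be estimated by a central difference, and the step size `h`, the tolerance `tol`, and the function names are arbitrary choices made for this example.

```python
def second_derivative(f, a, h=1e-4):
    """Central-difference estimate of f''(a)."""
    return (f(a + h) - 2 * f(a) + f(a - h)) / (h * h)

def classify(f, a, tol=1e-6):
    """Apply the second derivative test at a critical point a of f."""
    d2 = second_derivative(f, a)
    if d2 > tol:
        return "local minimum"   # parabola opens upward
    if d2 < -tol:
        return "local maximum"   # parabola opens downward
    return "inconclusive"        # the test fails; look at higher-order terms

# Hypothetical test functions, each with a critical point at 0.
print(classify(lambda x: x**2, 0.0))   # local minimum
print(classify(lambda x: -x**2, 0.0))  # local maximum
print(classify(lambda x: x**4, 0.0))   # inconclusive
```

The last call shows the third case of the test: $x^4$ does have a local minimum at 0, but its second derivative vanishes there, so the test alone cannot detect it.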


Use the Taylor series about for

to determine whether the function has a local maximum, local minimum, or inflection point at . (Take as a given that is a critical point).

Expanding and multiplying the Taylor series gives

Thus, near the function behaves like , which is downward opening (because of the negative coefficient) and U shaped (because it is an even power). Therefore, the function has a local maximum at .

Note that the second derivative test, besides being tricky to apply with all of the product rules and chain rules, would ultimately be inconclusive in this example.
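
The leading-term reasoning can be checked numerically. The sketch below uses a stand-in function chosen for illustration, $g(x) = \sin^2(x) \ln(\cos x)$, which is an assumption and not necessarily the function of the example. Multiplying $\sin^2 x = x^2 + \cdots$ by $\ln \cos x = -\frac{1}{2}x^2 + \cdots$ suggests $g(x) \approx -\frac{1}{2} x^4$ near 0, so the ratio $g(x)/x^4$ should approach $-\frac{1}{2}$.

```python
import math

# Stand-in function (an assumption, for illustration only).
def g(x):
    return math.sin(x) ** 2 * math.log(math.cos(x))

# If g(x) behaves like c * x^4 near 0, then g(x) / x^4 tends to c as x -> 0.
ratios = [g(x) / x**4 for x in (0.1, 0.01)]
print(ratios)

# A negative coefficient on an even power means g stays below g(0) = 0
# on both sides of 0, which is the signature of a local maximum.
print(all(g(x) < 0 for x in (-0.1, -0.01, 0.01, 0.1)))
```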


Consider a square sheet of cardboard of side length $L$. By cutting equal sized squares of side length $x$ from each corner of the sheet and folding up the flaps which are formed, one gets an open box.

Note that as $x$ gets bigger, the box gets taller but the area of the base of the box shrinks. As $x$ gets smaller, the area of the base grows, but the height shrinks. Find the value of $x$ which maximizes the volume of the resulting box.

The volume of the box is the area of the base times the height. The base is a square of side length $L - 2x$ (since $x$ has been cut from both sides). The height of the box is $x$. Thus

$$ V(x) = (L - 2x)^2 x = L^2 x - 4L x^2 + 4x^3. $$

Finding the critical points means taking the derivative with respect to $x$ and setting equal to 0:

$$ V'(x) = L^2 - 8Lx + 12x^2 = 0. $$

This factors as

$$ (L - 2x)(L - 6x) = 0, $$

so the critical points are $x = L/2$ and $x = L/6$. To apply the second derivative test, we compute

$$ V''(x) = -8L + 24x $$

and evaluate at each critical point:

$$ V''(L/2) = 4L > 0, \qquad V''(L/6) = -4L < 0. $$

Thus, $x = L/2$ is a local minimum and $x = L/6$ is a local maximum. (Note also that for $x = L/2$ there is no cardboard left, since the removed corners have consumed the entire square!)

The volume that results from $x = L/6$ is

$$ V(L/6) = \left(\frac{2L}{3}\right)^2 \cdot \frac{L}{6} = \frac{2L^3}{27}. $$
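
The result of this example can be sanity-checked by brute force. The sketch below assumes a hypothetical sheet of side length 6 and simply scans candidate cut sizes on a fine grid; the names `volume`, `side`, and `best_x` are illustrative. The best cut found should agree with one sixth of the side length.

```python
def volume(x, side):
    """Volume of the open box: square base of side (side - 2x), height x."""
    return (side - 2 * x) ** 2 * x

side = 6.0  # hypothetical side length of the cardboard sheet

# Candidate cut sizes strictly between 0 and side / 2.
n = 6000
grid = [side / 2 * k / n for k in range(1, n)]

best_x = max(grid, key=lambda x: volume(x, side))
print(best_x, volume(best_x, side))  # 1.0 16.0
```

A grid scan like this only locates the maximum to within the grid spacing; the derivative-based argument is what pins it down exactly.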


Classify the critical points of .

As found in a previous example, the critical points of $f$ are $x = 1$ and $x = 3$. The second derivative of $f$ is . Thus, $f''(3) > 0$ and $f''(1) < 0$, and it follows from the second derivative test that $x = 3$ is a local minimum of $f$, and $x = 1$ is a local maximum of $f$.


Suppose a firm producing widgets expects to sell units (where is the price of the unit). What price should the firm set to maximize revenue (note that revenue here is just price times quantity sold)?

Revenue is . Taking the derivative gives , and setting equal to 0 gives

So has critical point (ignore since price should be positive). The second derivative is . Thus , and is a local maximum.

At this price, the revenue is
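
The same steps can be traced in code for a made-up demand curve. The sketch below assumes, purely for illustration, that the firm sells $300 - p^2$ units at price $p$; this demand function and the names `units_sold`, `revenue`, and `p_star` are hypothetical.

```python
import math

# Hypothetical demand curve (an assumption): units sold at price p.
def units_sold(p):
    return 300 - p**2

def revenue(p):
    """Revenue is price times quantity sold."""
    return p * units_sold(p)

# Here R(p) = 300p - p^3, so R'(p) = 300 - 3p^2 and R''(p) = -6p.
# R'(p) = 0 has roots +/- sqrt(100); only the positive price is meaningful.
p_star = math.sqrt(300 / 3)

is_local_max = -6 * p_star < 0  # second derivative test at p_star
print(p_star, is_local_max, revenue(p_star))  # 10.0 True 2000.0
```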

Global Extrema

While a local maximum or minimum is sometimes useful information, what is usually more important is the global maximum and minimum values of a function on a closed interval (or subject to some other constraint such as ). These are called the global extrema, or absolute extrema, of a function.

Global extrema on the interval $[a, b]$ either occur at critical points of $f$ or at the endpoints of the interval. So in addition to finding the critical points of $f$ in the interval and checking their values, one must also evaluate $f$ at the endpoints of the interval to find the global extrema.


Find the global extrema of on the interval .

From the prior examples, $x = 1$ and $x = 3$ are the critical points of $f$. But $x = 1$ is not in the interval $[2, 4]$, so disregard it. Then evaluate $f$ at $x = 2, 3, 4$ to find the extreme values:

Thus the absolute maximum is and occurs when . The absolute minimum is and occurs at .
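
The procedure of comparing values at the endpoints and at the interior critical points can be sketched as follows. The cubic used here, $f(x) = x^3 - 6x^2 + 9x + 2$, is a hypothetical choice for illustration (its critical points are 1 and 3), and `global_extrema` is a made-up helper name.

```python
def global_extrema(f, candidates):
    """Return ((x_min, f_min), (x_max, f_max)) over the candidate inputs."""
    values = [(x, f(x)) for x in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest

# Hypothetical function (an assumption, for illustration only).
def f(x):
    return x**3 - 6 * x**2 + 9 * x + 2

# Candidates: the endpoints 2 and 4, plus the one critical point
# that lies inside the interval [2, 4].
print(global_extrema(f, [2, 3, 4]))  # ((3, 2), (4, 6))
```

For this particular cubic the minimum sits at the interior critical point and the maximum at an endpoint, which is why the endpoints cannot be skipped.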

Application: Statistics

In statistics, one often takes experimental data points of the form $(x_i, y_i)$ and looks for a relationship between $x$ and $y$. A very simple relationship is the linear relationship $y = mx$. The data may not follow this relationship perfectly, because of experimental error or other noise, so one tries to find the value of $m$ which best fits the data.

This process, called a linear regression, can be framed as an optimization problem. But what is the quantity being optimized?

There are several different linear regression models, depending on the quantity being minimized. These different quantities yield different best fit lines. One of the most common models is called ordinary least squares. This method seeks to minimize the sum of the squares of the residuals, which are the vertical distances from the points to the line.

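The residual for a given point $(x_i, y_i)$ relative to the line $y = mx$ is $y_i - m x_i$. Thus, the quantity being minimized is

$$ S(m) = \sum_i (y_i - m x_i)^2. $$

Taking the derivative with respect to $m$ gives

$$ \frac{dS}{dm} = \sum_i -2 x_i (y_i - m x_i) = -2 \sum_i x_i y_i + 2m \sum_i x_i^2. $$

Setting this equal to 0 and solving for $m$ gives

$$ m = \frac{\sum_i x_i y_i}{\sum_i x_i^2}. $$

Applying the second derivative test, we compute

$$ \frac{d^2 S}{dm^2} = 2 \sum_i x_i^2 > 0, $$

so the above value of $m$ minimizes the sum of squares (hence the least squares name).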
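
As a small illustration, the minimizing slope can be computed directly. The sketch assumes the one-parameter model of a line through the origin, $y = mx$, for which setting the derivative of the sum of squared residuals to zero gives $m = \sum x_i y_i / \sum x_i^2$. The data values and the names `best_fit_slope` and `sum_of_squares` are made up for this example.

```python
def best_fit_slope(xs, ys):
    """Slope m minimizing sum((y - m*x)^2) for the model y = m*x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx

def sum_of_squares(m, xs, ys):
    """Sum of squared residuals for the line y = m*x."""
    return sum((y - m * x) ** 2 for x, y in zip(xs, ys))

# Made-up data, roughly following y = 2x with some noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

m = best_fit_slope(xs, ys)
print(m)

# Nudging the slope in either direction should only increase the error.
print(sum_of_squares(m, xs, ys) < sum_of_squares(m + 0.01, xs, ys))
print(sum_of_squares(m, xs, ys) < sum_of_squares(m - 0.01, xs, ys))
```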


Finding the line of best fit of the form $y = mx + b$ requires methods of multivariable calculus (because there are two variables, $m$ and $b$, which need to be optimized). Optimization with multiple variables is not much more difficult than for a single variable, but these methods are beyond the scope of this course.


  • Find all the local maxima and minima of the function .
  • Which type of critical point does the function have at zero ?
  • Use a Taylor series about to determine whether the function has a local maximum or local minimum at the origin.
  • Find the location of the global maximum and minimum of on the interval .
  • Consider a stretch of highway in which cars are traveling at an average speed . The "traffic density" is the total number of cars on our stretch of road divided by its length. These two quantities are related: the fewer cars on the road, the faster drivers are able to go. On the other hand, if traffic becomes heavy, drivers will naturally decrease their speed. The so-called "parabolic model" assumes that this relationship is dictated by the equation: where represents the capacity of the road, and the speed limit on it. The number of cars passing through our road is called the "traffic flux" or "throughput", and is given by the product of the traffic density and the average speed: . Using the parabolic model, find out at what average speed the flux through our road is maximized.
  • A manufacturing company wants to know how many workers it should hire. If it employs too many people, the machines in the factory will be overutilized and the workers will have to wait until they are free, thus reducing the number of units each one will produce in a day's work. On the other hand, too few workers would leave the machines idle for long periods of time. A rough model for the relationship between the number of workers and their productivity is given by the equation where is the maximum number of units a worker can produce in a day and is the maximum number of workers the factory can accommodate. The number of units manufactured in the whole factory in one day is equal to the product of the number of workers and the number of units each one produces: . How many workers should the company hire in order to maximize its production?
  • A technology company has just invented a new gadget. In order to maximize the profit derived from its sale, the company must make a critical decision: at what price should it be sold? A market study suggests that the number of units sold would approximately follow the equation , where is the sale price, is the number of units that would saturate the market, and . If it costs to manufacture one of these gadgets, at what price would the profit of the company be maximized?
  • The manufacturing process of a certain chemical substance is exothermic, that is, it releases heat. The amount of heat released, , depends on the temperature at which the process is carried out, and it is given by the equation , where is the room temperature of the manufacturing plant, and and . If the temperature must be maintained above , at what temperature would the heat loss be minimized?
  • Classify the critical point of the function using Taylor series.
  • Construct a box without a top whose base is a square. The material cost for the bottom is $10 per square foot, and the cost for the sides is $5 per square foot. The box must have a volume of 8 cubic feet. Determine the dimensions of the box that will minimize the cost.