integral is just the area under the curve, not the summation of individual p(x) values.

These come to the same thing. Have a look at my integration diagram below.

This shows 22 rectangles whose areas, if added together, would give an approximation to the area under the curve between x1 and x2. Now imagine that the rectangles get thinner and there are more of them. The estimate for the area gets closer to the true value. In integration theory we imagine that the rectangles get infinitesimally thin and that there are infinitely many of them. In that limit the sum gives the exact area, and you may also think of it as a series of lines added together. Strictly speaking, those lines are still rectangles with height p(x) and width dx.
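If it helps to see the rectangles get thinner in numbers, here is a minimal sketch. The curve is an assumption (a standard normal density, since the one in the diagram is not given), the limits 2 and 4 are just example values for x1 and x2, and the rectangle heights are taken at the midpoints, which is one choice among several.

```python
import math

def p(x):
    # Hypothetical curve for illustration: the standard normal density.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rectangle_sum(f, x1, x2, n):
    # Add up n rectangles, each with width dx and height f(midpoint).
    dx = (x2 - x1) / n
    return sum(f(x1 + (i + 0.5) * dx) * dx for i in range(n))

x1, x2 = 2.0, 4.0
# Exact area for this particular curve, via the error function.
exact = 0.5 * (math.erf(x2 / math.sqrt(2.0)) - math.erf(x1 / math.sqrt(2.0)))

errors = []
for n in (22, 220, 2200):
    approx = rectangle_sum(p, x1, x2, n)
    errors.append(abs(approx - exact))
    print(n, approx, errors[-1])
```

Each term in the sum is p(x) times dx, so it is the areas that are being added, not the bare p(x) values.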

Bob

]]>I always thought integration means the summation of all the f(x) values in an interval.

The definite integral can be thought of as the sum of the areas of all the rectangles underneath the curve, not the sum of the f(x) values themselves.

]]>Thank you so much for your help and the images

So what I understand is that the integral is just the area under the curve, not the summation of individual p(x) values. It is still quite confusing, to be honest. I always thought integration means the summation of all the f(x) values in an interval.

For example, in the light transport equation (in computer graphics), part of the outgoing radiance from a point on a surface is the sum of the incident radiance from all the directions on a sphere around that point, and this is calculated by an integral equation.
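As a rough sketch of how such a sum over directions is estimated in practice, the snippet below does a Monte Carlo integration over the sphere. The `incident_radiance` function is made up purely for illustration, and the directions are sampled uniformly, so every sample has the same density 1/(4*pi). Each sample contributes its value divided by that density, which plays the role of "height times width" in the rectangle picture.

```python
import math
import random

def incident_radiance(direction):
    # Hypothetical incident radiance, for illustration only:
    # strongest from straight above (+z), zero from below.
    x, y, z = direction
    return max(0.0, z)

def sample_uniform_sphere(rng):
    # Uniform direction on the unit sphere:
    # z uniform in [-1, 1], azimuth uniform in [0, 2*pi).
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)

def estimate_sphere_integral(f, n, seed=0):
    rng = random.Random(seed)
    pdf = 1.0 / (4.0 * math.pi)  # constant density of uniform directions
    total = sum(f(sample_uniform_sphere(rng)) / pdf for _ in range(n))
    return total / n

estimate = estimate_sphere_integral(incident_radiance, 200_000)
# For this made-up radiance the exact integral over the sphere is pi.
print(estimate, math.pi)
```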

I think I have to revisit my math textbooks and read more about integration.

Again, thank you for your time and your help

]]>Welcome to the forum!

Your understanding of a pdf is good. The bit that is confusing you confuses me too. What does it mean?

I have made a picture below to show a pdf.

The total area under the graph must be 1 so the y scale is adjusted to make that true.

If you wanted the probability that 2<x<4 then you would find the area between the two red lines.

For a continuous variable, the probability of any single exact value of x is zero, because a vertical line contains no area. You can only find the probability that x lies between two values.

Hope that gives you a start. You are welcome to post again on the topic.

EDIT: I've added a second pdf. The shape this time is a triangle. The area is still 1 (half base x height = 0.5 x 1 x 2) so it is valid.

You can see it is possible for the maximum height to be over 1.
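A quick numerical check of the triangle, as a sketch only. The base of 1 and height of 2 come from the figures above, but where the triangle sits on the x axis is an assumption: here it runs from 0 to 1 with its peak at 0.5.

```python
def triangle_pdf(x):
    # Assumed placement: base from 0 to 1, peak of height 2 at x = 0.5.
    if 0.0 <= x <= 0.5:
        return 4.0 * x
    if 0.5 < x <= 1.0:
        return 4.0 * (1.0 - x)
    return 0.0

def area(f, a, b, n=1000):
    # Sum of n thin rectangles, heights taken at the midpoints.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

total_area = area(triangle_pdf, 0.0, 1.0)          # should come out as 1
peak_height = triangle_pdf(0.5)                    # 2, which is more than 1
prob_middle = area(triangle_pdf, 0.25, 0.75)       # probability that 0.25 < x < 0.75
prob_single_point = area(triangle_pdf, 0.5, 0.5)   # zero width, so zero area

print(total_area, peak_height, prob_middle, prob_single_point)
```

The height at a point is a density, not a probability, which is why it can exceed 1 while every area stays between 0 and 1.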

EVEN LATER EDIT: It is possible to have a pdf with p(x1) = 10 and p(x2) = 100. (graph 3) The range of x over which the pdf takes such large values would have to be narrow so that the total area is still 1.

But the probability of being in the vicinity of x1 is definitely lower than being in a similar vicinity of x2. (as you said)
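Here is one made-up pdf with those heights, just as a sketch. The step shape, the interval widths, and the points x1 and x2 are all my assumptions, chosen so that the total area is 1.

```python
# Hypothetical step-shaped pdf given as (start, end, height):
# height 10 on [0, 0.05) and height 100 on [0.05, 0.055), zero elsewhere.
segments = [(0.0, 0.05, 10.0), (0.05, 0.055, 100.0)]

def prob(a, b):
    # Area under the step pdf between a and b (height times overlapping width).
    return sum(h * max(0.0, min(b, hi) - max(a, lo)) for lo, hi, h in segments)

total_area = prob(-1.0, 1.0)   # 10 * 0.05 + 100 * 0.005

x1, x2 = 0.025, 0.0525         # points where the pdf is 10 and 100
half_width = 0.001             # same small vicinity around each point
near_x1 = prob(x1 - half_width, x1 + half_width)
near_x2 = prob(x2 - half_width, x2 + half_width)
ratio = near_x1 / near_x2

print(total_area, near_x1, near_x2, ratio)
```

With equal-width vicinities, the probability near x1 comes out about one tenth of the probability near x2, matching the ratio of the heights.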

Bob

I am studying lecture notes on the Monte Carlo integration method, as I need it for a project I'm working on.

Up until now I thought I knew what a PDF is, but I'm confused by the lecture notes right now.

The notes state that if p(x) is a PDF, it has the following properties:

1) It is always non-negative: p(x) >= 0 for every x.

2) It is normalized, meaning that its integral over the whole real line must equal exactly 1.

The above properties tell me that each individual value of p(x) must be less than 1, if their integral over negative infinity to positive infinity is exactly 1. (correct me if I'm wrong)

Now here is the confusing part:

"... the probability density function describes the relative likelihood of a random variable (or event) having a certain value. For instance, if p(x1) = 10 and p(x2) = 100 , then the random variable with the PDF p is ten times more likely to have a value near x1 than near x2 ..."

How can we have p(x2) = 100, when the summation of all the possible p(x) values must be equal to 1? (property number 2)

I also don't get the part where it says x1 is ten times more likely than x2... shouldn't it be the other way around?

Thanks for your help

]]>