# Page Rank - Intro to Computer Science

Get 300 checks per monthabsolutely FREE!

No credit card needed. No strings attached.

##### So, that idea for how to rank webpages is the same idea as how we measure popularity of people.

**00:00**

So, that idea for how to rank webpages is
the same idea as how we measure popularity of people.
But instead of thinking about friendships as the way
to measure popularity, what we're measuring is links on the
web. When one page has a link to another
page, well that indicates it's more likely that this other
page is popular, just like when someone is a friend,
it indicates that the person they are friends with is
more likely to be popular. So, the goal in ranking web
pages is to get a measure of how popular are pages,
based on the other pages that link to it but we
have the same issue with popularity then not all links are
the same. That a link from a page that's really important
counts for a lot more than a link from a page
that is not important. So, if the New York Times has
a link to your page, that counts for a lot more than
if your mom sets up a web page and puts a
link to your page in it. Unless your mom is Lady
Gaga, in which case her link probably counts for more. So,
another way of thinking about this is what we're trying to model
is a random web surfer. So, our random web surfer has
some set of pages that they know about. And those pages
have links to other pages. Some pages might have a lot
of links. Some pages might just have one link. Some pages might
have no links. So, one way to think about this
is that we're trying to model a random web surfer. The
web surfer starts knowing about some pages. And she picks
one page at random, let's say she picks this page. And
then, when she's on that page, she picks a random
link and follows that links. Whoops, this was a bad starting
page. It actually has no out links. So, then, she picks
another random page. Let's say she picks this one. She follows
the link from that page, and now she got to the
page with no links again. Let's say she picked a better starting
point. Let's say she randomly picked this one. Now she's got
two links to follow. She randomly picks one of those. She follows
it. She gets to a new page. She randomly picks a
link from that page, in this case, it only had one, and
in this case, it seems we have a bit of a

##### Problem, because all of the starting pages eventually lead into this one.

**02:01**

problem, because all of the starting pages eventually lead into this one,
which has no outgoing links. So, we'll think about how to
solve that problem later, but we can think of what our
random web surfer is doing, is picking random pages, following links,
and what we want to measure is the popularity of a page.
And that's the probability that she reaches that given page, starting
from these random pages. So, if you did this over and
over again, and you counted the number of times you reached
each page, that would give you a measure of that page's popularity.
So, this is very similar to the popularity function. We're
going to define a function that we'll call the rank of
the page. And, like the way we defined the popularity
function, it's going to have two inputs. It's going to have a
time stamp, and it's going to have the page which we'll
use a URL for. And the output of rank will
be some number. Except for we'll define for time step
zero. This is our base case, and we're going to find all
the ranks have value one. We'll actually change
that shortly, but we'll start out think of it,
all the ranks having value one, like we
did with the popularity scores. And then we'll define
the value of the rank at time step t for any given URL. Just like we defined
the popularity score. It's going to be the sum of
all the pages that are friends with this page,
and what it means for our webpage to be friends
with another page is that it has a link to it.
So, this is going to be for all the pages that
exist that have some link to that URL, or its friends.
And so, we're going to go through each of those pages.
We'll call them inlinks instead of friends. We're going to go through
those and we're going to sum up the ranks that we got
for those pages at time t minus one. So, this is
our first model of popularity on, of webpages. This is exactly
the same as the model we had of popularity for people.
It's not going to work that well yet and one of the
reasons it's not going to work that well is some pages
might have lots of links. And if a page has lots

##### Of links, the value of each one of its links should be diminished.

**04:03**

of links, the value of each one of its links should
be diminished. It shouldn't have the same value as the page
that only has one link that links to this URL. Maybe that
should be the same case for a friend's popularity, if
someone has lots of friends, each friend is less valuable.
Whereas someone who only has a small number of friends
has lots of time for each friend. So this is the
way we're going to model web popularity. We don't want
to just give the same score to each link. We're
going to change this by dividing by the number of outlinks.
So, if a page has many outgoing links, the value to
the pages that it links to is less for each page. So,
a page that's just a long list of lots of links won't
have that much influence on the rankings. If a page only has
a few outgoing links, well then, they are worth more. So, what we are
going to do, is divide this by the number of out links from
p. So, each of the p pages, right? These will be the
values of p, as they go through the inlinks of URL. We
are going to sum up the rank that we got on the previous time
step and divide that by the number of out links.

## Related Topics

### Get 300 checks per month absolutely FREE!

No credit card needed. No strings attached. 👍