目录

  • 1 Java语言概述
    • 1.1 Java语言的特点
    • 1.2 Java虚拟机
    • 1.3 Java开发环境
    • 1.4 编译执行和解释执行
    • 1.5 第一个java程序1
    • 1.6 第一个java程序2
    • 1.7 第一个java程序3
    • 1.8 章节测验
  • 2 Java基础语法
    • 2.1 基本数据类型
    • 2.2 赋值语句
    • 2.3 表达式
    • 2.4 运算符
    • 2.5 类型转换
    • 2.6 章节测验
  • 3 类与对象
    • 3.1 面向对象的概念
    • 3.2 类和对象的概念
    • 3.3 构造方法
    • 3.4 章节测验
  • 4 继承与多态
    • 4.1 继承
    • 4.2 多态
    • 4.3 final关键字
    • 4.4 static关键字
    • 4.5 抽象类
    • 4.6 接口
    • 4.7 内部类
    • 4.8 章节测验
  • 5 数组和字符串
    • 5.1 数组
    • 5.2 字符串
    • 5.3 章节测验
  • 6 常用类与接口
    • 6.1 课件
    • 6.2 重要知识点
  • 7 异常处理
    • 7.1 课件
    • 7.2 重要知识点
  • 8 文件和流
    • 8.1 课件
    • 8.2 重要知识点
  • 9 Elements of Programming
    • 9.1 You First Program
      • 9.1.1 Exercise
      • 9.1.2 Program
    • 9.2 Built-in Types of Data
      • 9.2.1 Exercise
      • 9.2.2 Program
    • 9.3 Conditionals and Loops
      • 9.3.1 Exercise
      • 9.3.2 Program
    • 9.4 Arrays
      • 9.4.1 Exercise
      • 9.4.2 Program
    • 9.5 Input and Output
      • 9.5.1 Exercise
      • 9.5.2 Program
    • 9.6 Case Study: Random Web Surfer
  • 10 Functions
    • 10.1 Static Methods
      • 10.1.1 Exercise
      • 10.1.2 Program
    • 10.2 Libraries and Clients
      • 10.2.1 Exercise
      • 10.2.2 Program
    • 10.3 Recursion
      • 10.3.1 Exercise
      • 10.3.2 Program
    • 10.4 Case Study: Percolation
  • 11 Object-Oriented Programming
    • 11.1 Using Data Types
    • 11.2 Creating Data Types
    • 11.3 Designing Data Types
    • 11.4 Case Study: N-Body Simulation
  • 12 参考资料
    • 12.1 主要参考书
    • 12.2 慕课资源
Case Study: Random Web Surfer

Communicating across the web has become an integral part of everyday life. This communication is enabled in part by scientific studies of the structure of the web. We consider a simple model, known as the random-surfer model. We consider the web to be a fixed set of pages, with each page containing a fixed set of hyperlinks, and each link a reference to some other page. We study what happens to a person (the random surfer冲浪) who randomly moves from page to page, either by typing a page name into the address bar or by clicking a link on the current page.


The model.

The crux(核心) of the matter is to specify what it means to randomly move from page to page. The following intuitive(直观的) 90–10 rule captures both methods of moving to a new page: Assume that 90 percent of the time the random surfer clicks a random link on the current page (each link chosen with equal probability) and that 10 percent of the time the random surfer goes directly to a random page (all pages on the web chosen with equal probability).

You can immediately see that this model has flaws(缺陷), because you know from your own experience that the behavior of a real web surfer is not quite so simple:

  • No one chooses links or pages with equal probability.

  • There is no real potential to surf directly to each page on the web.

  • The 90–10 (or any fixed) breakdown is just a guess.

  • It does not take the back button or bookmarks into account.

Despite these flaws, the model is sufficiently rich that computer scientists have learned a great deal about properties of the web by studying it.


Input format.

We assume that there are n web pages, numbered from 0 to n−1, and we represent links with ordered pairs of such numbers, the first specifying the page containing the link and the second specifying the page to which it refers. The input format we adopt is an integer (the value of n) followed by a sequence of pairs of integers (the representations of all the links).

The data files tiny.txt and medium.txt are two simple examples.


Transition matrix.

We use a two-dimensional matrix, which we refer to as the transition matrix, to completely specify the behavior of the random surfer. With n web pages, we define an n-by-n matrix such that the entry in row i and column j is the probability that the random surfer moves to page j when on page i.

Transition.java is a filter that reads links from standard input and produces the corresponding transition matrix on standard output.


Simulation.

 RandomSurfer.java simulates the behavior of the random surfer. It reads a transition matrix and surfs according to the rules, starting at page 0 and taking the number of moves as a command-line argument. It counts the number of times that the surfer visits each page. Dividing that count by the number of moves yields(生产) an estimate of the probability that a random surfer winds up on the page. This probability is known as the page's rank.(统计每个网页打开的次数,平均得到访问的概率)

  • One random move. The key to the computation is the random move, which is specified by the transition matrix: each row represents a discrete probability distribution—the entries fully specify the behavior of the random surfer's next move, giving the probability of surfing to each page.



 RandomSurfer.javais an improved version that uses two library methods that we will introduce in Section 2.2.

  • Markov chains. The random process that describes the surfer's behavior is known as a Markov chain. Markov chains are widely applicable, well-studied, and have many remarkable and useful properties. For example, a basic limit theorem(定理) for Markov chains says that our surfer could start anywhere, because the probability that a random surfer eventually winds up on any particular page is the same for all starting pages!

  • Page ranks. The random-surfer simulation is straightforward: it loops for the indicated number of moves, randomly surfing through the graph. Increasing the number of iterations gives increasingly accurate estimates of the probability that the surfer lands on each page—the page ranks.

  • Visualizing the histogram. RandomSurferHistogram.java draws a frequency histogram that eventually stabilizes(稳定) to the page ranks.


Mixing a Markov chain.

Directly simulating the behavior of a random surfer to understand the structure of the web is appealing(激动人心的), but can be too time consuming. Fortunately, we can compute the same quantity more efficiently by using linear algebra.

  • Squaring a Markov chain. What is the probability that the random surfer will move from page i to page j in two moves? The first move goes to an intermediate page k, so we calculate the probability of moving from i to k and then from k to j for all possible k and add up the results. This calculation is one that we have seen before—matrix–matrix multiplication.


  • The power method. We might then calculate the probabilities for three moves by multiplying by p[][] again, and for four moves by multiplying by p[][] yet again, and so forth. However, matrix–matrix multiplication is expensive, and we are actually interested in a vector–matrix calculation.


Markov.java is an implementation that you can use to check convergence(收敛) for our example. For instance, it gets the same results (the page ranks accurate to two decimal places) as RandomSurfer.java, but with just 20 vector–matrix multiplications.